A Pattern Discovery Approach to Retail Fraud Detection
Prasad Gabbur, Sharath Pankanti, Quanfu Fan, and Hoang Trinh
Exploratory Computer Vision Group, IBM Research
** ******* *****, *********, ** 10532
*******@*****.*******.***, {sharat, qfan, trinhh}@us.ibm.com
ABSTRACT 1. INTRODUCTION
A major source of revenue shrink in retail stores is the
A major source of revenue shrink in retail stores is the inten-
intentional or unintentional failure of proper checking out
tional or unintentional failure of proper checking out of items
of items by the cashier. It is estimated to cause billions
by the cashier. More recently, a few automated surveillance
of dollars in revenue losses each year [8]. In the case of
systems have been developed to monitor cashier lanes and
an intentional failure, it is referred to as sweethearting be-
detect non-compliant activities such as fake item checkouts
tween a cashier and a customer as both intend to derive
or scans done with the intention of deriving monetary bene-
monetary bene t from it. More recently a few automated
t. These systems use data from surveillance video cameras
surveillance systems have been developed to monitor cashier
and transaction logs (TLog) recorded at the Point-of-Sale
lanes and detect non-compliant activities. These systems use
(POS). In this paper, we present a pattern discovery based
data from surveillance video cameras and transaction logs
approach to detect fraudulent events at the POS. Our ap-
(TLog) recorded at the Point-of-Sale (POS). Video-based
proach is based on mining time-ordered text streams, rep-
[8, 1, 2] systems visually monitor the activities of a cashier
resenting retail transactions, formed from a combination of
around the Point-of-Sale to detect item checkouts and ver-
visually detected checkout related activities called primitives
ify them using TLog data. Being able to detect as many
and barcodes from TLog data. Patterns representing single
non-compliant events as possible while keeping the number
item checkouts, i.e. anchored around a single barcode, are
of false alarms low is key to the successful deployment of
discovered from these text streams using an e cient pat-
these systems. It is a challenging problem to optimize the
tern discovery technique called Teiresias. The discovered
two con icting objectives due to variations and noise within
patterns are used to build models for true and fake item
the input data streams.
scans by retaining or discarding the anchoring barcodes in
Instead of using the two streams (video and TLog) of data
those patterns respectively. A pattern matching and classi-
separately, we posit that much can be learned about the na-
cation scheme is designed to robustly detect non-compliant
ture of an item scan performed by a cashier by combining
cashier activities in the presence of noise in either the TLog
them into a single stream. This is because most item check-
or the video data. Di erent weighting schemes for quanti-
outs are normal (no fraud) and a barcode is registered in
fying the relative importance of the discovered patterns are
the TLog. By analyzing visual information around the reg-
explored: Frequency, Support Vector Machine (SVM) and
istered barcode events, it is possible to model variations in
Frequency+SVM. Using a large scale dataset recorded from
the cashier s activities for checking out an item. This is
retail stores, our approach discovers semantically meaning-
helpful in detecting non-compliant cashier activities more
ful cashier scan patterns. Our experiments also suggest that
robustly in the presence of noise in either the TLog or the
di erent weighting schemes result in varied false and true
video data.
positive performances on the task of fake scan detection.
Video-based systems usually make use of complex ana-
Categories and Subject Descriptors: I.5.4 [Pattern Recog- lytics to detect and monitor cashier activities [7]. Many of
nition]: Applications computer vision, text processing these approaches make adhoc assumptions such as only a
constrained set of activities being performed by a cashier
General Terms: Security
during retail transactions. Real-world data presents a num-
Keywords: Retail, Security, Sweethearting, Pattern dis- ber of challenges to such approaches due to those assump-
covery, Teiresias tions being violated very often. Computer vision systems are
based on extracting visual features from input video streams.
Recently, there has been some work in transforming videos
into a sequence of discrete features [11] and applying text-
based data mining algorithms on the resulting streams. This
Permission to make digital or hard copies of all or part of this work for has opened up new possibilities for looking at video data in a
personal or classroom use is granted without fee provided that copies are di erent light in order to infer useful knowledge. Text-based
not made or distributed for prof t or commercial advantage and that copies
algorithms are simpler and faster than many sophisticated
bear this notice and the full citation on the f rst page. To copy otherwise, to
video analysis techniques. But their potential in addressing
republish, to post on servers or to redistribute to lists, requires prior specif c
some of the vision challenges have not been explored. In
permission and/or a fee.
KDD 11 August 21 24, 2011, San Diego, California, USA. this work, we propose a text-based approach to analyzing
Copyright 2011 ACM 978-1-4503-0813-7/11/08 10.00.
as another primitive, the alphabet set from which the text
strings are formed has a cardinality of four: {P, S, D, B }.
The output of the pattern discovery algorithm is a set of
patterns within the input strings that are repeated at least
a minimum number of times across those strings. Only those
patterns that have a single barcode B primitive within them
and with particular con gurations of the other three prim-
itives around it are selected and the others are discarded.
Practically most item checkouts in retail transactions are
genuine and fake checkouts are very rare. With this assump-
tion, the selected patterns are representative of single item
checkouts and model the variations in the process of check-
1: A typical checkout involves the pickup (P ), scan (S )
ing out an item represented as a combination of discrete
and drop (D) of the item of interest. Each such action is
primitives. Conceptually, a fake scan or a fraudulent item
detected as a primitive by a motion-based primitive detector
checkout is similar in essence to a genuine checkout except
(See Section 3.1).
that a barcode is not registered in the TLog. So a model
representing fake scans is derived from the model for true
scans by discarding the barcode primitives from the pattern
videos represented as time-ordered discrete features called
set. For the sake of convenience, we term the models cor-
primitives in the context of a retail surveillance application.
responding to true and fake scans as positive and negative
A retail transaction involves the checkout of a set of items
models respectively.
the customer intends to buy. Ideally, each checkout event
The positive and negative models are used to detect the
can be thought of as comprising three sub-events or primi-
presence of any fake scans within a new transaction with the
tives: Pickup (P ), Scan (S ), and Drop (D) in that temporal
help of a pattern matching and classi cation scheme. Each
order, as illustrated in Fig. 1. A visual compliance approach
primitive within the transaction stream is assigned a vote
[7] is designed to detect these sub-events in a video and
by each of the patterns in the positive and negative groups
group them into P SD triplets corresponding to visual scans
when they match a certain neighborhood of that primitive.
of items. A visual scan with an associated barcode con rms
Following a Bayesian approach, these votes are transformed
a valid checkout, otherwise it is a potential candidate for a
into posterior likelihoods of the primitive being a part of
fraud or operational error. We refer to a valid checkout as
a true or a fake scan. A classi cation based on maximum
a true or visual scan and a fraud or operational error as a
posterior likelihood (above a certain minimum con dence
fake scan.
level) is done to assign the primitive to the most likely class.
Usually the output of the primitive and event detection
Finally a grouping of the classi ed primitives based on their
modules is not a single P SD triplet for every checkout.
class labels and temporal adjacency leads to a segmentation
Some of the sub-events may be repeated more than once
of the transaction stream into true and fake scans.
or may not even be detected due to various reasons. One
common situation arises when a cashier has di culty scan-
ning the item in a single attempt and ends up doing mul-
3. CHECKOUT PATTERN DISCOVERY
tiple scans resulting in repeated S primitives for a single
item. Therefore the primitive sequence output by the de- More details of the primitive detection and pattern dis-
tection system is usually noisy with repetitions and/or no covery steps are given in the following subsections.
occurrences of one or more of the P, S and D primitives.
3.1 Primitive Detection
This poses di culties to any approach looking for the pres-
ence of a P SD triplet corresponding to an item checkout.
To detect primitives: Pickup (P ), Scan (S ), and Drop (D)
A more robust method accounting for missing and repeated
from the retail video stream, we implement an e cient vi-
primitives is needed.
sion technique proposed in [14]. The approach is based on
In what follows, a paradigm for visual compliance based
the observation that each primitive P, S, D follows a spe-
on pattern discovery and matching is proposed. A brief de-
ci c motion pattern - each primitive is an in/out process in
scription of the proposed approach is given in Section 2. The
which the cashier s hand enters and exits a region of interest
details of the various steps involved including pattern discov-
(ROI). To detect hand motion, a hand color model is rst
ery and matching, primitive classi cation and true/fake scan
adaptively learned by continuously collecting hand pixel ex-
detection are given in Sections 3 and 4. Section 5 describes
amples. Based on this model, each pixel in the motion map
a few weighting schemes to quantify the relative importance
computed by frame di erencing is classi ed as hand pixel or
of the discovered patterns for classi cation. Experimental
not, to form a hand motion map for each video frame. This
details and results are given in Section 6 followed by a brief
map explicitly captures the hand motion robustly in the
discussion of the results in Section 7 and conclusions in Sec-
presence of motion noise from belt movement, background
tion 8.
changes, and customer interactions. A hierarchical nite
state machine (FSM) is then applied to deterministically
2. OUR APPROACH check whether or not the hand motion follows the prede-
Various patterns of item checkouts are discovered using ned in/out pattern corresponding to each primitive. After
an e cient pattern discovery algorithm called Teiresias [12]. the primitives P, S, D are detected, these are combined with
The input to the pattern discovery algorithm are text strings the barcode primitives (B ) from the TLog data to form a
formed from temporally ordering the output of the primi- temporally ordered text string of {P, S, D, B }, which is used
tive detectors and the TLog data. By treating the barcode as input string to the pattern discovery algorithm.
3.2 TEIRESIAS any wildcard characters so that exact pattern matching can
be done using those patterns. For the experiments in this
Teiresias [12] is a widely used algorithm for discovering re-
work, the L = W values range from 3 to 10. We choose
peated patterns across nite alphabet streams. It was orig-
a conservative value for K (2) so that even patterns that
inally developed for discovering rigid patterns in biological
are repeated only once are captured. From the resulting
sequences, eg., motifs in amino acid sequences. However, it
set, only those patterns that match the regular expression
has been applied to problems in other domains such as spam
P [P SD] B [P SD] D are retained. These are all the sin-
ltering [13] and intrusion detection [18] in UNIX processes.
gle barcode (B ) patterns that start with a pickup (P ) and
Teiresias discovers all unique maximal patterns
end in a drop (D) primitive with any possible number and
having a certain minimum support K within a given set of
arrangement of P, S and D primitives between them.
alphabetic sequences. A pattern is de ned to be
Some of the patterns discovered by Teiresias are represen-
one in which every sub-pattern of length W or more has at
tative of di erent possible ways an item checkout occurs at
most (W L) wildcard characters, the others are alphabets.
the POS. Table 1 lists the rst eight patterns based on their
The term maximal implies that the discovered patterns can-
repeatability across transactions used for pattern discovery
not be made more speci c without reducing their support
along with the number of transactions in which they oc-
sets within the input streams. The computational complex-
cur. Upon visual inspection of the corresponding videos, the
ity of the algorithm is seen to scale quasi-linearly with the
patterns P BSD and P SBD represent idealized item check-
size of the output patterns, hence making it a very e cient
outs involving one each of the P, S, D and B primitives in
pattern discovery tool.
them. P DBSD, P BSP D, and P DBSP D are representa-
3.3 Patterns in retail transactions tive of overlaps between checkouts of two successive items.
P DBSD indicates that the drop of the previous item oc-
The Teiresias pattern discovery algorithm is used to dis-
curs in parallel or after the pickup of the next item, whose
cover patterns that are representative of genuine item scans
barcode is included in the pattern. Similarly, P BSP D rep-
within retail transactions. In our system, a transaction is
resents picking up the next item while or before dropping o
a stream of P, S, D and B primitives, where the ordering
the current item, whose barcode forms a part of the pattern.
is determined by their occurrence in time. Since a typical
These are instances of parallel scanning. Parallel scanning
item checkout involves picking, scanning and then dropping
of the current item with both the previous and the next one
it, some representative patterns that correspond to this pro-
results in the pattern P DBSP D, with the B primitive cor-
cess would be P SBD, P BSD or P SDB, allowing for time
responding to the current item. The other patterns in the
delays in primitive detection or barcode registration. In re-
table result from a missing S primitive in one or more of
ality, there are a number of other patterns that are possible
the cases described above. Similarly, we found that multi-
due to noisy primitive streams rendered by the primitive
ple scanning was represented by the patterns: P DSDSBD
detector and TLog data. Some typical examples include:
and P DSDBSD, although they did not occur in as many
transactions (19 and 53 respectively) as the patterns above.
1. Multiple scan: An item is scanned more than once be-
fore a successful barcode registration. Representative
3.4 Reducing pattern redundancy
patterns in this case might be P SDSBD, P SDSDSBD,
and many more. The output of the Teiresias algorithm is a set of unique
maximal patterns for a particular setting of the
2. Late barcode: A successful scan but a delay in the
parameter values. We capture various levels of noise in the
arrival of barcode might result in patterns such as
representation of a true scan by discovering patterns with
P SDB, P SDP SB, etc. Such patterns might also re-
di erent settings for L = W . This leads to some redun-
sult from items which have to be manually registered
dancy in the discovered pattern set in that a few patterns
at the POS without scanning. Examples include or-
are repeatedly discovered for two di erent settings of L = W
ganic produce such as vegetables and fruits.
and there are overlaps between some patterns. One example
of an overlapping pair of patterns is P SB and SBD, which
3. Parallel scan: Following a successful scan, an item is
form sub-patterns of the canonical true scan pattern P SBD.
dropped after the next item in the pickup queue is
This might be the result of slightly di erent support sets for
picked up. This can be represented using P SP BD,
the two patterns caused by a missing P or a D primitive in
P SBP D and so on.
one of the scans within the input transactions. Repetitions
Variations in the patterns representing item scans are cap- are easily taken care of by retaining only the unique pat-
tured by mining primitive streams, resulting from transac- terns. One possible way to account for overlaps is to nd
tions, for patterns that repeat across transactions. Most of all pairs of patterns with overlaps among them and stitch
the item checkouts in transactions are true scans with a bar- them together into longer and longer patterns in an itera-
code being registered for each scan. Therefore patterns that tive process. But this has a high likelihood of reducing the
are repeated across transactions and have a single barcode support set of the resulting longer patterns relative to the
(B ) primitive within them can be thought of as correspond- component patterns if the component patterns did not come
ing to true scans. We seek such patterns by rst discovering from a single parent pattern. This is not desirable because
unique maximal patterns with support at least each of the overlapping patterns may uniquely represent a
K from a set of transactions, employing the Teiresias algo- true scan with some missing primitives.
rithm. This process is repeated for a few di erent L = W In order to address redundancy due to overlapping pat-
values to capture representative patterns of di erent lengths, terns, we do a pattern-transaction co-occurrence analysis,
i.e., with di erent levels of noise embedded in them. Set- which is similar to Latent Semantic Analysis [6] used in
ting L = W ensures that the resulting patterns do not have the area of information retrieval. A pattern-transaction co-
occurrence matrix C is constructed, where each row in the pattern matching and classi cation procedure using the two
matrix corresponds to a unique discovered pattern and each pattern sets. Each pattern in the positive set contributes a
column to an input transaction used for pattern discovery. positive vote to every primitive of any corresponding sub-
The (i, j )the entry of the matrix is a count of the number of string it matches within the transaction stream. Similarly
times pattern i appears in the j th transaction. Usually C is negative votes are accumulated for each primitive based on
a sparse matrix with a few positive entries scattered among how many patterns in the negative set exactly match a sub-
zeros. Each row of this sparse matrix can be thought of as string containing the primitive. These votes are used for
a representation of the corresponding pattern in a space of classifying the primitives as described below.
dimensionality equal to the number of transactions used for
4.1 Classif cation
pattern discovery. In general, these dimensions are not un-
Each primitive p in the input transaction stream is as-
correlated. So a measure of similarity among patterns such
signed a true (T ) or a fake (F ) scan label depending on the
as the correlation coe cient cannot be reliably computed in
number of votes it has accumulated for the two categories.
this space. Singular Value Decomposition is performed on
In order to do this classi cation, a Bayesian framework is
the co-occurrence matrix C resulting in the following factor-
adopted. Let ST and SF denote the sets of all true and
ization of C
fake scan patterns respectively. Then the likelihood of the
C = U V T (1)
primitive p being a part of a true scan can be written as
where U and V are orthogonal matrices of size m m and
P (T p) P (T, p) (2)
n n, assuming C is of size m n, i.e., there are a total
of m patterns discovered from a set of n transactions. is Similarly, the likelihood of p belonging to a fake scan is given
a rectangular matrix with only non-zero elements along its by
diagonal, which correspond to singular values. The columns
P (F p) P (F, p) (3)
of U and V are the eigenvectors of CC T and C T C respec-
tively. Choosing only the rst k largest singular values and We de ne the joint likelihoods P (T, p) and P (F, p) to be the
the corresponding eigenvectors leads to the best rank k ap- probabilities of the context of p belonging to the set ST and
proximation of C in the least squares sense. Further, each SF respectively.
row of matrix U is a representation of the corresponding
P (T, p) P (p context ST ) (4)
pattern (in C) in an orthogonalized space.
P (F, p) P (p context SF ) (5)
Since the number of transactions n is usually much smaller
than the number of patterns m, there are only n non-zero
where p context is used to denote the context or neighbor-
singular values in the SVD of C . Therefore only the rst n
hood of p. Each of the above likelihoods are proportional
columns of U comprise a full representation of the patterns
to the number of patterns in ST and SF respectively that
in an orthogonal space while allowing a complete reconstruc-
match a sub-string containing p. If we denote the positive
tion of C . Correlation coe cient computed in this space is a
and negative votes accumulated by p with V otesT (p) and
more reliable measure of similarity between patterns based
V otesF (p) respectively, then
on their co-occurrence within the transactions. We use this
P (p context ST ) V otesT (p) (6)
measure for clustering patterns together that have a high
correlation among them. A greedy agglomerative clustering P (p context SF ) V otesF (p) (7)
scheme is adopted, where the patterns are rst ordered in the
Appropriate normalization of the cumulative votes of the
decreasing order of their total number of occurrences. Then
primitive p for the two categories yields the posterior prob-
all the patterns that have a high correlation (> 0.5) with
abilities P (T p) and P (F p). The primitive is assigned to
the most repeated pattern are put into one cluster. This is
the class with the maximum posterior likelihood given it is
repeated with the next available most repeated pattern and
above a certain threshold value.
so on until all patterns are exhausted. Each cluster is then
4.2 True and fake scan detection
represented by the most occurring pattern among its mem-
bers. Only the cluster representatives are used for pattern
Classi cation of a transaction s primitive stream results
matching.
in a class label (T or F ) for those primitives whose poste-
4. PATTERN MATCHING AND CLASSIFI- rior likelihoods for the respective class are above the cho-
sen threshold. Other primitives remain unclassi ed. This
CATION usually results in continuous stretches of T s and F s with
The patterns derived from the above procedure de ne a unclassi ed primitives in between them. One continuous
model for true scans, which we refer to as the positive model stretch of T or F labels need not necessarily correspond to
for the sake of convenience. It can be hypothesized that fake a single true or fake scan respectively. A segmentation like
scans are similar to true scans but without the registration of procedure groups subsets of adjacent primitives that are as-
a barcode. With this assumption, a corresponding negative signed T labels into true scans. This is achieved by choosing
model is constructed for fake scans by eliminating the B a B primitive within each such subset as the anchor for the
primitive from the true scan patterns. Two tables are built, corresponding true scan. Other primitives such as P, S and
one corresponding to true scans (positive) and the other cor- D that form a part of the true scan are searched for locally,
responding to fake scans (negative). The two tables contain i.e., the nearest ones to the anchoring B primitive within
the same set of patterns except for the barcode primitive a small window. The size of the window is limited to the
being deleted from each pattern in the negative set. Given a set of non-B primitives around the anchor, which have been
primitive stream from a new transaction, each primitive in classi ed as belonging to the class T . Only one P, S and D
the stream is assigned a true or fake scan label based on a primitive is assigned to a true scan.
Among the available P s and S s within a window, prefer- a weighting scheme where the weights were derived from a
ence is given to the ones that occur closest to B but earlier classi er trained to distinguish true from fake scans using
in the stream. On the other hand, a D that is closest to B ground truth data.
but occurs later in the stream is preferably assigned to the
5.2 Support Vector Machine (SVM) weights
true scan. In the absence of a P, S or D primitive within
the neighborhood window of an anchoring B, a new such
Support Vector Machine (SVM) [15, 3] is used in a variety
primitive is created and assigned to the true scan. Similar
of classi cation problems because of a number of its favor-
grouping is performed to detect fake scans using S primitives
able properties including superior generalization ability. A
labeled as belonging to class F as anchors and searching for
two-class SVM estimates a hyperplane separating the train-
the nearest P and D primitives. In order to keep the number
ing points in a feature space such that the margin between
of false positives manageable, we assume that at least one
the hyperplane and the two classes is maximized. Assuming
each of P, S and D primitives should be present to provide
a linear kernel SVM, the weights are indicative of how useful
enough evidence for a fake scan and that two successive fake
the corresponding features are for the purpose of classi ca-
scans be at least 3 seconds apart.
tion. This has been used for feature selection [5, 9]. We
exploit this property of SVMs to learn the relative impor-
5. WEIGHTING VOTES tance of the discovered patterns for classifying a primitive
into true or fake scan based on the matches to its context.
Within a transaction stream, a primitive is assigned a vote
In order to achieve this, we sample a set of training points
by each of the patterns matching a sub-string containing the
belonging to true and fake scan classes and map them onto
primitive. A voting scheme where each pattern contributes
a feature space with dimensionality equal to the number of
the same vote to the matching primitives might not be opti-
patterns. The training points are a set of primitive strings
mal. Our goal is to be able to reliably distinguish true scans
centered around the B primitive for a true scan and the S
from fake scans based on the discovered patterns. Not all the
primitive for a fake scan.
patterns contain the same amount of information to distin-
The B primitive locations of true scans are randomly sam-
guish between the two classes. There may still be some re-
pled from the set of transaction streams used for training
dundancies left among patterns ltered out by co-occurrence
leaving out those transactions which have a fake scan in
analysis. Therefore it is not guaranteed that each pattern
them. Note that we hand labeled the locations of the fake
contributes independent information relative to others in the
scans in our transaction streams by a visual inspection of
set. Further some of the patterns may even be noisy, eg.,
the corresponding videos. There are a total of 13 fake scans
resulting from missing or multiple primitive detections for
in our data consisting of 1660 transactions. The number of
single pickup, scan or drop events during transactions. A
randomly sampled true scan points from training data is in
weighting scheme where patterns that carry independent in-
the order of a few hundreds. There is a large imbalance in
formation and are the most important for classi cation are
the number of training points for the two classes. Train-
weighted more relative to others is desirable. We experi-
ing on such data leads to a SVM that is biased towards the
ment with two weighting schemes that assign di erent rela-
class with the larger size [10]. A number of approaches have
tive votes to the patterns.
been proposed for SVM training in the presence of a large
5.1 Frequency weights imbalance in data for the two classes. These methods either
This is a simple weighting scheme where each pattern is undersample the majority class [17], oversample the minor-
assigned a weight inversely proportional to the number of ity class [4] or adjust the SVM margin to reduce the bias
times it occurs across all the transactions used for pattern towards majority class [16, 10].
discovery. This is mostly an empirical weighting scheme Our use of SVM is to learn a weight for each of the fea-
driven by experimentation. We tried di erent heuristic schemes ture dimensions, corresponding to a pattern, that quanti es
that were based on the frequency of patterns and the size the importance of that pattern for classi cation. It is desir-
of their clusters in the co-occurrence analysis (Section 3.4). able to have a non-negative weight for each pattern so that
Among them, this simple weighting scheme led to the best they can be easily interpreted. In order to avoid the data
results in terms of the number of falsely detected fake scans imbalance e ects and to ensure positive weights, we learn a
relative to the total number of detected true scans. zero-bias SVM with a particular mapping of the true and
A possible reason might be that smaller patterns tend to fake scans into a feature space. The mapping is such that
occur more frequently than longer ones among our discov- the true scan points always lie in the positive hyper-quadrant
ered patterns. Using a uniform voting scheme, the smaller and the fake scan points in the negative hyper-quadrant. A
patterns have a higher in uence in the classi cation decision hyper-quadrant refers to the extension of a quadrant in a
of a majority of primitives because smaller patterns match 2D space to a higher dimensional space. So the positive
more sub-strings than longer ones. Further each pattern in hyper-quadrant is the set of all points with only positive
the negative (F ) class is one primitive shorter than its coun- coordinates and similarly for the negative hyper-quadrant.
terpart in the positive class. All these factors contribute The sampled true scan points are matched with the set of
to an increased number of short noisy primitive segments patterns in the positive set to obtain a feature vector for
that are assigned to the negative (F ) class. With the in- each of those points. Only the dimensions corresponding to
verse frequency based voting scheme, longer (less frequent) the patterns with exact matches are set to suitable non-zero
patterns in uence the classi cation decision of many prim- positive values and the others are zero. Similarly the feature
itives leading to longer, less noisy segments of continuous vectors for fake scan points are obtained by pattern match-
T or F labels. However, this scheme causes many genuine ing with the negative set of patterns. However the values for
fake scans to be missed, based on our experiments (See Sec- the dimensions with exact matches are set to suitable nega-
tion 6). In order to circumvent this problem, we designed tive values in this case. Fig. 2 illustrates the mapping and
tern matches regardless of the pattern. Such a mapping
causes true scan points that match to only a few of the pat-
terns occurring less frequently to be moved more towards the
origin in the feature space relative to others. Since the SVM
hyperplane passes through the origin it seems that points
matching to less frequent patterns are most likely to be the
support vectors, which in uence the nal weights. This has
a tendency to favor lesser number of false positives in our
experiments. Table 1 shows the SVM weights obtained by
averaging the learned normalized mean SVM weights across
all the divisions of the data from 6 lanes into 3 training and
3 held-out lanes (See Section 6.1). Interestingly, the two ide-
alized checkout patterns (P BSD and P SBD) have the most
signi cant average SVM weights among the above, implying
2: A zero-bias SVM for classifying true scan points from
their usefulness for classi cation.
fake scans. The mapping of points is such that true scan
5.3 Combining frequency and SVM weights
samples always lie in the positive quadrant and the fake
scan samples in the negative quadrant. There are very few
The SVM-based weights emphasize those patterns that
fake scan samples relative to the number of true scan sam-
are useful for distinguishing true and fake scan points within
ples. The SVM weights are all positive and quantify the
the training data. Interestingly, these weights also lead to
relative importance of the feature dimensions (P1 and P2 )
an increased number of genuine fake scan detections (true
for classi cation.
positives) within the held-out data relative to using the fre-
quency weights (Section 6). However the false positive per-
Mean Normalized Weights
formance of the classi er using SVM weights is worse com-
0.7
0.6
pared to that using the frequency weights. This suggests
Mean weight
0.5
an interesting possibility of combining the two weights such
0.4
0.3
that the combined weights help achieve a better false pos-
0.2
0.1
itive performance relative to the SVM weights and better
0
0-50-100-***-*** 250 300
Pattern index
true positive performance relative to the frequency weights.
Normalized Weights
1
The combined weights are obtained by summing the SVM
0.8
and frequency weights, thereby reinforcing only those fre-
Weight
0.6
quently occurring short patterns that are useful for classi-
0.4
cation. For convenience, these weights are referred to as
0.2
0
(Frequency+SVM) weights.
0-50-100-***-*** 250 300
Pattern index
3: (Below) SVM weights after normalization (maximum
6. EXPERIMENTS
value 1) for each random split of the available data. The
learned weights for the di erent splits are superimposed.
6.1 Data
(Above) Averaged weights across all the random splits. The
Video data recorded from 6 lanes of a retail store during
weights imply relative importance of patterns for the task
one business day (approx. 16 hrs) of transactions was used
of classi cation. Clearly, some patterns are more important
for all the experiments reported in this work. During this pe-
than others.
riod, 18 di erent cashiers checked out a total of about 32,969
registered items within 1660 transactions. The video data
the SVM geometry in a 2D space corresponding to patterns was transformed into discretized and time ordered primi-
P1 and P2 . tives and merged with the barcode events from the TLog
The available data within a training set (3 lanes) (See Sec- data. Primitive streams corresponding to transactions on
tion 6) is split into two equal parts randomly (approx. 250 3 of the lanes were chosen for discovering patterns as de-
true scans and 2-3 fake scans) with one half used for SVM scribed in Section 3.3. The same set of patterns were used
training and the other half held out for testing the learned for all the experiments reported here. The entire data was
classi er. SVMs are trained with 100 di erent such ran- divided into two parts (3 lanes each). These two sets are
dom splits and the weights are averaged to obtain the nal referred to as training and held-out lanes respectively. All
such divisions ( 6 = 20) were considered and the results
`
weights. Considering one training division of all the data, 3
Fig. 3 shows the normalized weights after SVM training us- were averaged across the divisions. Note that the frequency
ing each of the 100 random splits of the data and also the weights remain the same for all the data divisions resulting
averaged weights across those splits. in the same true and fake scan detections unlike the other
The mapping into the feature space can be done in many two weighting schemes. Only the held-out statistics change
di erent ways. Our experiments favored an asymmetric across those divisions. Similarly for the unweighted Teiresias
mapping scheme between the true and fake scan training and all patterns (Section 6.2). The data from all the lanes
points. In this scheme, the positive value assigned to a was manually evaluated by six human subjects for the pres-
match between a true scan and a pattern is proportional to ence of sweethearting activities or operational errors leading
the frequency of the pattern. On the other hand the same to the failure of barcode registration for the corresponding
negative value is assigned to all the fake scan point and pat- items. A total of 13 such events were found.
Pattern #Repeats across transactions Average SVM weight
1584 0.005
P BD
1302 0.593
P BSD
782 0.009
P DBD
774 0.010
P BP D
734 0.127
P BSP D
629 0.127
P DBSD
590 0.597
P SBD
585 0.014
P DBSP D
1: The rst eight most repeated patterns across all transactions used for pattern discovery by Teiresias. Each of those pattern s
number of repeats (transactions in which they occur) and the learned average SVM weight are reported in the second and
third columns respectively. P BSD and P SBD represent idealized item checkouts and also have the most signi cant average
SVM weights. Other patterns represent parallel scans between successive items and scans with a missing S primitive. See
text for more details.
6.2 All patterns from 0.5 to 0.9 in steps of 0.1 and plot the true and fake
scan detection performances. The number of detected true
Our model for true scans is a set of discovered primitive
(or visual) and fake scans for data from the held out lanes
patterns with di erent combinations of P, S and D primi-
of one particular division of all the data is plotted in Fig. 4.
tives anchored around a single B primitive. The discovered
In a
Copyright 2011 ACM 978-1-4503-0813-7/11/08 10.00.