Image classification in retail
Lessons from the real world
Valentina Bono and Paul Klinger
Summary
• Data Science at Tesco
• Image classification
• Dataset generation: data sources
• Dataset generation: unanticipated situations
• Label quality & label refinement
• Imbalanced datasets
• Distribution shift
• Metric learning
We Work End-to-End Across Tesco
[diagram of business areas: WORKFORCE MANAGEMENT, SECURITY, SUPPLY CHAIN, PROPERTY, SUPPORT, FULFILMENT, STORE, STORE OPERATIONS, COMMERCIAL PLANNING, ONLINE, MARKETING, TRANSPORT & LOYALTY, FINANCE]
What? Image Classification
[figure: example IMAGES and the CLASSES they map to]
What? Image Classification
Data sources: checkout data and CCTV videos
Unanticipated situations: Empty checkout
Unanticipated situations: Covered checkout
Unanticipated situations: what class?
Unanticipated situations: loose versus packaged
Unanticipated situations: brown bag
Label quality: how to improve the labels
Look at image → Can you recognise the product?
• No → Occluded/Empty
• Yes → Is class 1 visible? Is class 2 visible?
  • Both visible → Class 1 & 2
  • Only class 1 visible → Class 1
  • Only class 2 visible → Class 2
  • Neither visible → Other
Label quality: how to improve the labels
[image panels showing current labels and the questions they raise]
• Occluded/empty: is this correct?
• Avocado: avocado or onion?
• Avocado: avocado or mango?
• Can: is this a can or a box?
• Carrot: carrot or onion?
• Pepper: pepper or tomato?
Label quality: how to improve the labels
What class is this? Can you recognise it now?
Dealing with imbalanced datasets
Real-world datasets are often very imbalanced. [chart: images per class]
• How do you train on them (oversampling / undersampling)? See the sampler sketch below.
• What distribution will the model encounter once deployed?
• How do you evaluate (against real or corrected distribution)?
• Is there an additional weighting from business requirements?
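A minimal sketch of the oversampling option, assuming PyTorch; the toy `labels` tensor and the commented-out `train_dataset` are placeholders, not the talk's actual pipeline:

```python
import torch
from torch.utils.data import DataLoader, WeightedRandomSampler

# `labels` stands in for the class index of every training image.
labels = torch.tensor([0, 0, 0, 0, 0, 1, 1, 2])           # toy example: class 0 dominates
class_counts = torch.bincount(labels).float()
sample_weights = 1.0 / class_counts[labels]                # rarer classes get larger weights

# Draw training examples with probability proportional to the weights,
# so each class appears roughly equally often per epoch.
sampler = WeightedRandomSampler(sample_weights, num_samples=len(labels), replacement=True)

# loader = DataLoader(train_dataset, batch_size=32, sampler=sampler)
```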
Recalibrating models to a new distribution
Given a model that gives calibrated probabilities for a distribution A, how do we get calibrated probabilities for a different distribution B?
Simply rescale probabilities by the ratio of the distributions and normalise!
[chart: effect of recalibration — class frequencies of apple, banana and pear for the train distribution, the test distribution, the raw model output and the transformed output]
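A minimal sketch of that rescale-and-normalise step in code (NumPy assumed; the class names and numbers below are illustrative, not real data):

```python
import numpy as np

def recalibrate(probs, prior_train, prior_test):
    """Rescale calibrated probabilities from the train prior to a new (test) prior."""
    rescaled = probs * (prior_test / prior_train)       # ratio of the two distributions
    return rescaled / rescaled.sum(axis=1, keepdims=True)

# Three classes: apple, banana, pear.
prior_train = np.array([0.5, 0.3, 0.2])                 # class frequencies seen in training
prior_test = np.array([0.2, 0.3, 0.5])                  # class frequencies expected in deployment
raw_output = np.array([[0.6, 0.3, 0.1]])                # model output, calibrated for prior_train
print(recalibrate(raw_output, prior_train, prior_test))
```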
Recalibrating models to a new distribution
This means that we can train a model on distribution A and then transform it into one adapted to the test distribution. Assuming the model was well calibrated before, this choice of transformation gives the best possible performance on the test set.
But what if it's not? We can just rescale each output probability by an arbitrary factor (and normalise). So we have number_of_classes extra free parameters to tune the model!
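If the model is not well calibrated, those number_of_classes factors can be fitted on held-out data instead of taken from the priors. A minimal sketch, assuming NumPy/SciPy; `val_probs` and `val_labels` are placeholder validation outputs and true classes:

```python
import numpy as np
from scipy.optimize import minimize

def rescale(probs, factors):
    scaled = probs * factors
    return scaled / scaled.sum(axis=1, keepdims=True)

def neg_log_likelihood(log_factors, probs, labels):
    # Optimise in log space so the per-class factors stay positive.
    p = rescale(probs, np.exp(log_factors))
    return -np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12))

val_probs = np.array([[0.7, 0.2, 0.1], [0.2, 0.5, 0.3], [0.1, 0.3, 0.6]])
val_labels = np.array([0, 2, 2])

result = minimize(neg_log_likelihood, x0=np.zeros(3), args=(val_probs, val_labels))
tuned_factors = np.exp(result.x)                         # one free parameter per class
```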
Distribution invariant metrics?
In the binary classification setting, the equivalent to recalibrating is just setting a threshold. The AUC gives a metric that's invariant under setting the threshold.
For multiclass problems there is no obvious way to extend this. There are multiclass versions of AUC, but they are not independent of the rescaling!
(If you know of a metric that is independent, let me know!)
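A quick synthetic illustration of that asymmetry (NumPy and scikit-learn assumed; the data is random, purely to show the effect): binary AUC is unchanged by per-class rescaling, while one-vs-rest multiclass AUC moves with the chosen factors.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def rescale(p, factors):
    q = p * factors
    return q / q.sum(axis=1, keepdims=True)

def fake_probs(n_classes, n=2000):
    y = rng.integers(0, n_classes, size=n)
    logits = rng.normal(size=(n, n_classes)) + 1.5 * np.eye(n_classes)[y]
    return y, np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# Binary: rescaling + renormalising is a monotone transform of the positive-class
# score, so the ranking (and hence the AUC) is identical.
y2, p2 = fake_probs(2)
print(roc_auc_score(y2, p2[:, 1]),
      roc_auc_score(y2, rescale(p2, np.array([1.0, 5.0]))[:, 1]))

# Multiclass (one-vs-rest): each rescaled score also depends on the other classes'
# probabilities, so the AUC shifts with the factors.
y3, p3 = fake_probs(3)
print(roc_auc_score(y3, p3, multi_class="ovr"),
      roc_auc_score(y3, rescale(p3, np.array([1.0, 5.0, 0.2])), multi_class="ovr"))
```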
Distribution shift
The obvious and the not-so-obvious
• Two models can perform similarly on an unseen test set, but differ a lot in their ability to generalise.
• Synthetic transformations (data augmentation) can help a bit, but real-world effects can be subtle (see the sketch below).
• Models that have seen a wider variety of images are better at handling distribution shift (at least if they are not fine-tuned to convergence).
The Evolution of Out-of-Distribution Robustness Throughout Fine-Tuning, Andreassen et al., arXiv:2106.15831
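For reference, a minimal sketch of the kind of synthetic transformations meant here, assuming torchvision; the specific transforms and parameters are illustrative, not the ones used in the talk:

```python
import torchvision.transforms as T

# Illustrative augmentation pipeline: crops, colour changes, flips and blur imitate
# some real-world variation, but subtler shifts (lighting, camera angle, new
# packaging) are hard to reproduce synthetically.
train_transforms = T.Compose([
    T.RandomResizedCrop(224, scale=(0.6, 1.0)),
    T.ColorJitter(brightness=0.3, contrast=0.3, saturation=0.3),
    T.RandomHorizontalFlip(),
    T.GaussianBlur(kernel_size=5),
    T.ToTensor(),
])
```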
Distribution shift
The obvious and the not-so-obvious
When the shift looks similar to what the model is supposed to detect, it gets tricky!
Classify all the products!
Metric learning for scalable product classification
Work by our PhD intern Charles
Goal:
• Classify every product (10s of thousands)
• Handle changes (new products added, old ones removed) without retraining
-> Metric learning
• Embed images into a "product space" such that images of the same (or similar) products cluster together.
• At test time, embed the query and compare to a stored database of embeddings (see the sketch below).
• Can change the product range by changing the database, without touching the model.
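A minimal sketch of that lookup step, assuming NumPy; `embed` (the trained embedding model), the catalogue arrays and the product ids are placeholders:

```python
import numpy as np

def normalise(v):
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def classify(query_embedding, catalogue_ids, catalogue_embeddings):
    """Return the product id whose stored (L2-normalised) embedding is most similar."""
    sims = catalogue_embeddings @ normalise(query_embedding)   # cosine similarity
    return catalogue_ids[int(np.argmax(sims))]

# Changing the product range only touches the catalogue, never the model:
# catalogue_ids = np.append(catalogue_ids, new_product_id)
# catalogue_embeddings = np.vstack([catalogue_embeddings, normalise(embed(new_image))])
```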
Classify all the products!
Metric learning for scalable product classification
Challenges:
• Lots of occlusion, various other objects in view
• Need the model to focus on the product, not anything else (different from many academic datasets)
Training with softmax works surprisingly well; see also Classification is a Strong Baseline for Deep Metric Learning (BMVC '19), Andrew Zhai, Hao-Yu Wu.
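A minimal sketch of that softmax baseline, assuming PyTorch/torchvision; the backbone, class count and input size are assumptions, not the talk's actual model: train an ordinary classifier, then drop the classification head and use the L2-normalised penultimate features as the embedding.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

backbone = resnet50(weights=None)
backbone.fc = nn.Linear(backbone.fc.in_features, 1000)   # 1000 training classes (assumed)
# ...standard cross-entropy training of `backbone` goes here (omitted)...

# At test time, keep everything up to the global pooling layer as the embedder.
embedder = nn.Sequential(*list(backbone.children())[:-1], nn.Flatten())

with torch.no_grad():
    images = torch.randn(4, 3, 224, 224)                  # placeholder batch
    embeddings = nn.functional.normalize(embedder(images), dim=1)
```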
Questions?
We are hiring!
https://www.tesco-careers.com/technology/uk/en/c/data-jobs
• Data Science
  • (Senior) Data Scientist – Machine Learning
  • (Senior) Data Scientist – Time Series Forecasting
  • (Senior) Data Scientist – Computer Vision
  • Data Science Intern
• Data Science Engineering
  • Data Science Software Development Manager
  • Lead Machine Learning Engineer – Computer Vision
  • (Senior) Machine Learning Engineer
• Analytics
  • (Senior) Data Analyst
Contact us:
• [email protected] (Recruiter)
• [email protected] (DS Leadership)
Backup Slides
Rescale probabilities formulas
Rescale from distribution A to distribution B:
$$p_B(x_i \mid x) = \frac{\dfrac{p_B(x_i)}{p_A(x_i)}\, p_A(x_i \mid x)}{\sum_{j=1}^{n} \dfrac{p_B(x_j)}{p_A(x_j)}\, p_A(x_j \mid x)}$$
General rescaling:
$$p_B(x_i \mid x) = \frac{f_i\, p_A(x_i \mid x)}{\sum_{j=1}^{n} f_j\, p_A(x_j \mid x)}$$
(for n classes).
Adjusting the Outputs of a Classifier to New a Priori Probabilities May Significantly Improve Classification Accuracy: Evidence from a Multi-Class Problem in Remote Sensing, Latinne, Saerens, Decaestecker, ICML '01 (2001).