Efficient Large-Scale Product Classification using Machine Learning and Crowdsourcing

 
Chong Sun, Narasimhan Rampalli, Frank Yang, AnHai Doan
@WalmartLabs & UW-Madison
 
Presenter: Jun Xie, @WalmartLabs
 
Chimera: Large-Scale Classification using Machine Learning, Rules, and Crowdsourcing
 
@WalmartLabs
 
Problem Definition
 
Classify tens of millions of products into 5000+ types
Each product: a record of attribute-value pairs
  title: Gerber folding knife 0 KN-Knives
  description: most versatile knife in its category ...
  manufacturer, color, etc.
  many products have just title attribute
Product types
  laptop computers, area rugs, laptop bags & cases, dining chairs, etc.
 
Challenges
 
Very large # of product types (5000+)
  started out having very little training data
  creating training data for 5000+ types is very difficult
Very limited human resources
  1 developer and 1-2 analysts (who can’t write code)
Products often arrive in bursts
  e.g., a batch of 300K items has just come in and must be classified fast
  makes it hard to provision for analysts and outsourcing
Need very high precision (>92%)
  can tolerate lower recall, but want to increase recall over time
 
Current approaches can’t handle these scales/challenges
 
Manually Classifying the Items
 
Using analysts
  can accurately classify about 100 items per day
  must understand the item, navigate through a large space of possible types, decide on the most appropriate one
  e.g., Misses’ Jacket, Pants and Blouse – 14 -16-18-20 Pattern → sewing patterns
  e.g., Gerber folding knife 0 KN-Knives → utility knives? pocket knives? tactical knives? multitools?
  would take 5 analysts 200 days to classify 100K items
Using outsourcing
  very expensive: $770K for 1M items
  outsourcing is not “elastic”
Using crowdsourcing
  crowd workers can’t navigate a complex and large taxonomy of types
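The throughput and cost figures on this slide can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes that analyst throughput and outsourcing cost both scale linearly with the number of items, which the slide does not state.

```python
# Figures quoted on the slide.
ITEMS_PER_ANALYST_PER_DAY = 100
OUTSOURCING_COST_PER_MILLION_USD = 770_000


def analyst_days(num_items: int, num_analysts: int) -> float:
    """Working days a team of analysts needs to classify num_items by hand."""
    return num_items / (num_analysts * ITEMS_PER_ANALYST_PER_DAY)


def outsourcing_cost_usd(num_items: int) -> float:
    """Outsourcing cost in USD, assuming linear pricing (an assumption)."""
    return num_items * OUTSOURCING_COST_PER_MILLION_USD / 1_000_000


print(analyst_days(100_000, 5))         # 200.0 days, matching the slide
print(outsourcing_cost_usd(1_000_000))  # 770000.0 USD, i.e. $0.77 per item
```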
 
Learning-Based Solutions
 
Difficult to generate training data
  too many prod types (5000+)
  to label just 200 items per prod type, must label 1M items
Difficult to generate representative samples
  random sampling would severely under-sample certain types
  analysts and outsourced workers don’t know how to obtain a random sample, e.g., for handbags, computer cables
  new product types appear all the time → the universe of items keeps changing
Difficult to handle corner cases
  items coming from special sources, need to be handled specially
  hard to “go the last mile”, e.g., increasing precision from 90 to 95%
Concept drift and changing distribution
  e.g., smart phone
 
Rule-Based Solutions
 
Analysts & outsourcing workers write rules to classify items
Writing rules to cover 5000+ product types is very slow
  doesn’t scale
 
Our Chimera solution
  combines the above approaches
  uses learning & hand-crafted rules
  uses developers, analysts, and crowdsourcing
  continuously improves over time
  keeps precision high while trying to improve recall
 
Our Chimera Solution
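
The architecture diagram for this slide names a Voting Master that combines the outputs of rule-based and learning-based classifiers (K-NN, Naïve Bayes, Perceptron, and others), followed by a Filter that separates classified from unclassified items. The deck does not describe the voting scheme itself, so the following is only a minimal sketch under assumed behavior: each classifier casts a weighted vote, blacklisted types are vetoed, and an item stays unclassified when no type gathers enough support. The function `vote`, its parameters, and the threshold value are all hypothetical.

```python
from collections import defaultdict
from typing import Iterable, Optional, Tuple


def vote(predictions: Iterable[Tuple[str, float]],
         blacklisted: frozenset = frozenset(),
         threshold: float = 0.5) -> Optional[str]:
    """Combine (product_type, weight) votes; return None to leave an item unclassified.

    The weighting and threshold are assumptions, not the deck's actual scheme.
    """
    predictions = list(predictions)
    total = sum(weight for _, weight in predictions)
    if total <= 0:
        return None
    scores = defaultdict(float)
    for product_type, weight in predictions:
        if product_type not in blacklisted:  # blacklist rules veto a type
            scores[product_type] += weight
    if not scores:
        return None
    best_type, best_score = max(scores.items(), key=lambda kv: kv[1])
    # Declining to answer trades recall for precision.
    return best_type if best_score / total >= threshold else None


votes = [("rings", 0.9), ("rings", 0.6), ("necklaces", 0.4)]
print(vote(votes))                                   # rings
print(vote(votes, blacklisted=frozenset({"rings"})))  # None
```

Returning `None` rather than the best remaining guess reflects the stated priority of keeping precision high while tolerating lower recall.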
 
 
Examples
 
Rules
  rings? → rings
  wedding bands? → rings
  diamond.*trio sets? → rings
  macbook → ! Fruit (a blacklist rule)
Classification evaluation using crowdsourcing
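The example rules above read like regular expressions matched against a product title. A minimal sketch of how such whitelist and blacklist rules might be applied is shown below. The patterns and target types come from the slide; the function `apply_rules`, the word-boundary anchoring, and the case-insensitive matching are assumptions made for illustration, not the deck's actual rule engine.

```python
import re

# (pattern, product type) pairs taken from the slide.
WHITELIST = [
    (r"rings?", "rings"),
    (r"wedding bands?", "rings"),
    (r"diamond.*trio sets?", "rings"),
]
# A blacklist rule says which type a matching title must NOT receive.
BLACKLIST = [
    (r"macbook", "Fruit"),
]


def _compile(rules):
    # Word boundaries and case-insensitivity are assumptions of this sketch.
    return [(re.compile(rf"\b{pattern}\b", re.IGNORECASE), product_type)
            for pattern, product_type in rules]


_WHITELIST = _compile(WHITELIST)
_BLACKLIST = _compile(BLACKLIST)


def apply_rules(title: str):
    """Return (suggested types, forbidden types) for a product title."""
    suggested = {t for rx, t in _WHITELIST if rx.search(title)}
    forbidden = {t for rx, t in _BLACKLIST if rx.search(title)}
    return suggested - forbidden, forbidden


print(apply_rules("Platinum wedding band size 7"))  # ({'rings'}, set())
print(apply_rules("Apple MacBook Pro 13"))          # (set(), {'Fruit'})
```

With the assumed word boundaries, a title such as "Gold Earrings" does not trigger the `rings?` rule, which illustrates why hand-written rules need care to stay precise.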
 
Key Novelties of Our Solution
 
Use both learning and rules extensively
  rules are not “nice to have”, they are critical for high accuracy
Use both crowd and analysts for evaluation/analysis
  using both in-house analysts and crowdsourcing is critical at our scale to achieve an accurate, continuously improving, and cost-effective solution
Scalable in terms of human resources
  taps into crowdsourcing (very elastic) and analysts
Treat humans and machines as first-class citizens
  solution carefully spells out what techniques are used where, who is doing what, and how to coordinate among them
 
Evaluation
 
Chimera has been developed and deployed for 2 years
Applied to 2.5M items from marketplace vendors
  classified more than 90% with 92% precision
Applied to 14M items from walmart.com
  classified 93% with 93% precision
As of March 2014
  has 852K items in training data for 3,663 types
  20,459 rules for 4,930 types
Crowdsourcing
  evaluating 1,000 items takes 1 hour with 15-25 workers
Staffing
  1 developer + 1 dedicated analyst + 1 more analyst when needed
 
Conclusion & Lessons Learned
 
Chimera: classifying millions of items into 5000+ types
At this scale, existing approaches do not work well
We have developed a highly scalable, accurate solution
  using learning, rules, crowdsourcing, analysts
 
Lessons learned
  both learning + rules are critical
  crowdsourcing is critical but must be closely monitored
  crowdsourcing must be coupled with in-house analysts and developers
  outsourcing does not work at a very large scale
  hybrid human-machine systems are here to stay
 
More details in our paper

The project aims to classify tens of millions of products into over 5000 categories efficiently. Challenges include limited training data, scarce human resources, and the need for high precision. Manual classification by analysts is slow and outsourcing is expensive. Learning-based solutions face difficulties in generating training data, handling new product types, corner cases, and concept drift. The goal is to improve precision and recall in product classification using innovative approaches.

  • Product Classification
  • Machine Learning
  • Crowdsourcing
  • Challenges
  • Learning-Based Solutions

Uploaded on Sep 10, 2024 | 0 Views



Presentation Transcript


  1. Chimera: Large-Scale Classification using Machine Learning, Rules, and Crowdsourcing. Chong Sun, Narasimhan Rampalli, Frank Yang, AnHai Doan (@WalmartLabs & UW-Madison). Presenter: Jun Xie, @WalmartLabs.

  2. Problem Definition. Classify tens of millions of products into 5000+ types. Each product: a record of attribute-value pairs (title: Gerber folding knife 0 KN-Knives; description: most versatile knife in its category ...; manufacturer, color, etc.); many products have just title attribute. Product types: laptop computers, area rugs, laptop bags & cases, dining chairs, etc. Example records:

     ID | Title | SCC | PT
     EASW1876 | Eastern Weavers Rugs EYEBALLWH-8x10 Shag Eyeball White 8x10 Rug | Shag Rugs | Area Rugs
     EMLCO655 | Royce Leather 643-RED-4 Ladies Laptop Brief - Red | Notebook Cases | Laptop Bags and Cases
     14968347 | International Concepts Stacking Dining Arm Chair (Set of 2) | | Dining Chairs
     12490924 | SouthCarolina Gamecocks Rectangle Toothfairy Pillow | | Decorative Pillows

  3. Challenges. Very large # of product types (5000+): started out having very little training data; creating training data for 5000+ types is very difficult. Very limited human resources: 1 developer and 1-2 analysts (who can’t write code). Products often arrive in bursts: e.g., a batch of 300K items has just come in and must be classified fast; makes it hard to provision for analysts and outsourcing. Need very high precision (>92%): can tolerate lower recall, but want to increase recall over time. Current approaches can’t handle these scales/challenges.

  4. Manually Classifying the Items. Using analysts: can accurately classify about 100 items per day; must understand the item, navigate through a large space of possible types, decide on the most appropriate one; e.g., Misses’ Jacket, Pants and Blouse – 14 -16-18-20 Pattern → sewing patterns; e.g., Gerber folding knife 0 KN-Knives → utility knives? pocket knives? tactical knives? multitools?; would take 5 analysts 200 days to classify 100K items. Using outsourcing: very expensive ($770K for 1M items); outsourcing is not “elastic”. Using crowdsourcing: crowd workers can’t navigate a complex and large taxonomy of types.

  5. Learning-Based Solutions. Difficult to generate training data: too many prod types (5000+); to label just 200 items per prod type, must label 1M items. Difficult to generate representative samples: random sampling would severely under-sample certain types; analysts and outsourced workers don’t know how to obtain a random sample, e.g., for handbags, computer cables; new product types appear all the time → the universe of items keeps changing. Difficult to handle corner cases: items coming from special sources, need to be handled specially; hard to “go the last mile”, e.g., increasing precision from 90 to 95%. Concept drift and changing distribution: e.g., smart phone.

  6. Rule-Based Solutions. Analysts & outsourcing workers write rules to classify items. Writing rules to cover 5000+ product types is very slow: doesn’t scale. Our Chimera solution: combines the above approaches; uses learning & hand-crafted rules; uses developers, analysts, and crowdsourcing; continuously improves over time; keeps precision high while trying to improve recall.

  7. Our Chimera Solution. Architecture diagram; labels as extracted: Classification Rules Crowd Evaluation Whitelist Rules Blacklist Rules Sample Items to Classify Classified Reports Gatekeeper Rules Voting Master Result Attribute Based Analysis Unclassified Filter K-NN Naïve Bayes Perceptron Regression Training Data SVM

  8. Examples. Rules: rings? → rings; wedding bands? → rings; diamond.*trio sets? → rings; macbook → ! Fruit (a blacklist rule). Classification evaluation using crowdsourcing.

  9. Key Novelties of Our Solution. Use both learning and rules extensively: rules are not “nice to have”, they are critical for high accuracy. Use both crowd and analysts for evaluation/analysis: using both in-house analysts and crowdsourcing is critical at our scale to achieve an accurate, continuously improving, and cost-effective solution. Scalable in terms of human resources: taps into crowdsourcing (very elastic) and analysts. Treat humans and machines as first-class citizens: solution carefully spells out what techniques are used where, who is doing what, and how to coordinate among them.

  10. Evaluation. Chimera has been developed and deployed for 2 years. Applied to 2.5M items from marketplace vendors: classified more than 90% with 92% precision. Applied to 14M items from walmart.com: classified 93% with 93% precision. As of March 2014: has 852K items in training data for 3,663 types; 20,459 rules for 4,930 types. Crowdsourcing: evaluating 1,000 items takes 1 hour with 15-25 workers. Staffing: 1 developer + 1 dedicated analyst + 1 more analyst when needed.

  11. Conclusion & Lessons Learned. Chimera: classifying millions of items into 5000+ types. At this scale, existing approaches do not work well. We have developed a highly scalable, accurate solution using learning, rules, crowdsourcing, analysts. Lessons learned: both learning + rules are critical; crowdsourcing is critical but must be closely monitored; crowdsourcing must be coupled with in-house analysts and developers; outsourcing does not work at a very large scale; hybrid human-machine systems are here to stay. More details in our paper.
