Building a Sentiment Classifier Using Active Learning

 
Active Learning with Unbalanced Classes & Example-Generation Queries
Christopher H. Lin (Microsoft), Mausam (IIT Delhi), Daniel S. Weld (University of Washington)
 
 
 
 
Suppose you want to train a classifier to detect the sentiment of movie reviews.
 
 
How would you build a sentiment classifier?

Step 1: Download a million movie reviews
Step 2: Ask the crowd to label examples
  "This movie sucks!" → Favorable / Unfavorable (Pay: $0.01)
Step 3: Train your favorite classifier
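The three steps above can be sketched end to end. The toy "classifier" below is purely illustrative (a keyword-overlap model we made up, not anything from the talk); crowd labels arrive as (text, label) pairs and "your favorite classifier" trains on them.

```python
# Toy sketch of Steps 2-3: crowd labels arrive as (text, label) pairs,
# and we train a deliberately simple keyword classifier on them.
# Illustrative only -- "your favorite classifier" could be any model.
def train(labeled):
    pos_words, neg_words = set(), set()
    for text, label in labeled:
        words = set(text.lower().split())
        (pos_words if label == "Favorable" else neg_words).update(words)
    return pos_words, neg_words

def predict(model, text):
    pos_words, neg_words = model
    words = set(text.lower().split())
    # Favor whichever class shares strictly more words with the input.
    return "Favorable" if len(words & pos_words) > len(words & neg_words) else "Unfavorable"

model = train([
    ("what a great fun movie", "Favorable"),
    ("this movie sucks!", "Unfavorable"),
])
print(predict(model, "great fun"))   # -> Favorable
print(predict(model, "it sucks!"))   # -> Unfavorable
```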
 
 
Step 2: Ask the crowd to label examples. Which examples?

1. Randomly sample examples
2. Active Learning
 
Active Learning: Uncertainty Sampling [Lewis and Catlett (1994)]

[Figure: classifier hypothesis h with queried examples near the decision boundary]
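A minimal sketch of pool-based uncertainty sampling, assuming a binary classifier that outputs P(positive) for each unlabeled example (the function name and toy numbers are ours, not from the slides):

```python
def uncertainty_sample(probs, k):
    """Return indices of the k unlabeled examples whose predicted
    P(positive) is closest to 0.5, i.e. where the classifier h is
    least certain."""
    ranked = sorted(range(len(probs)), key=lambda i: abs(probs[i] - 0.5))
    return ranked[:k]

# Toy pool: predicted P(positive) for five unlabeled reviews.
pool = [0.95, 0.48, 0.10, 0.55, 0.99]
print(uncertainty_sample(pool, 2))  # -> [1, 3]
```

The examples the crowd labels are then fed back into training, and the pool is re-scored before the next query round.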
 
 
Suppose you want to train a classifier to identify sentences that talk about "climate change" (so you can figure out who to fire).
 
Detect sentences about climate change?

Step 1: Download a million tweets
Step 2: Ask the crowd to label examples
  "Global warming is fake news" → Climate change / Not climate change (Pay: $0.01)
Step 3: Train your favorite classifier
 
The class skew is too high:

P(tweet is positive for "climate change") ≈ 0.0000000000001
 
 
Detect sentences about climate change?

Step 1: Download a million tweets
Step 2: Ask the crowd to generate examples [Attenberg and Provost, 2010]
  "Please write a sentence about climate change" → "Global warming is fake news."
  Pay: $0.15 (REALLY EXPENSIVE)
Step 3: Train your favorite classifier
 
Detect sentences about climate change?

Step 1: Download a million tweets
Step 2: Ask the crowd to generate examples [Attenberg and Provost, 2010] (Pay: $0.15, REALLY EXPENSIVE)
Step 3: Train your favorite classifier
Step 4: Ask the crowd to label examples
Step 5: Train your favorite classifier
 
Detect sentences about climate change?

Step 1: Download a bunch of tweets
Step 2: Switch between labeling and generation.
Step 3: Train your favorite classifier
Step 4: Ask the crowd to label examples
Step 5: Train your favorite classifier
 
 
Our contribution

Question: Given a domain of unknown skew, how do we optimally switch between generation and labeling to train the best classifier at the least cost?
 
Contributions

- We present MB-CB, an algorithm for dynamically switching between generation and labeling, given an arbitrary problem with unknown skew.
- We show MB-CB yields up to a 14.3-point gain in F1 AUC over state-of-the-art baselines on real and synthetic datasets.
 
Best initial strategies:

- low skew → Labeling
- high skew → Generation

What makes these strategies the best in their respective skew settings? Cheap positive examples!
 
A large factor in which strategy works well is how cheaply it can obtain positive examples at any given time.
 
MB-CB (MakeBalanced-CostBound) computes the cost to obtain one positive example, and picks the cheaper method.
 
MB-CB: cost to obtain one positive example

- Generate Positive Example: $0.15 per example → $0.15 for one positive example.
- Label Example: $0.03 per example. At high skew, labeling may take about 50 examples to get one positive ($1.50 per positive); as the classifier improves, it may take only about 2 examples per positive ($0.06 per positive).
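The cost comparison on these slides reduces to one formula. The helper below is our own naming; the prices and hit rates are the ones quoted in the talk.

```python
def cost_per_positive(cost_per_query, hit_rate):
    """Expected cost to obtain one positive example, where hit_rate is
    the fraction of queries that yield a positive."""
    return cost_per_query / hit_rate

# Prices from the slides: generation always yields a positive at $0.15;
# labeling costs $0.03 per example, with a hit rate that improves as the
# classifier gets better at surfacing likely positives.
print(cost_per_positive(0.15, 1.0))     # generation: $0.15 per positive
print(cost_per_positive(0.03, 1 / 50))  # early labeling: $1.50 per positive
print(cost_per_positive(0.03, 1 / 2))   # later labeling: $0.06 per positive
```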
 
This is a reinforcement learning problem.

MB-CB adapts UCB from the multi-armed bandit literature. It treats Generate Positive Example ($0.15/positive) and Label Example (observed anywhere from $1.50/positive down to $0.03/positive) as bandit arms, scoring each by a lower confidence bound on its cost per positive:

    lower confidence bound = average cost − w · √(1 / #observations)

and playing the arm with the lowest bound.
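A sketch of the bound-and-pick step, in UCB1 style. The exact exploration term and the width parameter w below are our assumptions for illustration, not necessarily the paper's formula.

```python
import math

def lower_confidence_bound(avg_cost, n_obs, t, w=1.0):
    # Optimism for cost minimization: subtract an exploration bonus
    # that shrinks as an arm accumulates observations. (Assumed UCB1-style
    # bonus; the paper's term may differ.)
    return avg_cost - w * math.sqrt(math.log(t) / n_obs)

# Two arms, each as (observed average cost-per-positive, observation count).
arms = {"generate": (0.15, 10), "label": (1.50, 10)}
t = sum(n for _, n in arms.values())  # total observations so far
best = min(arms, key=lambda a: lower_confidence_bound(*arms[a], t))
print(best)  # -> generate
```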
 
Small detail: every time MB-CB generates positive examples, it also randomly samples examples from the unlabeled dataset and inserts them into the training set as negative examples.
 
How good is MB-CB, which dynamically switches between generation and labeling?
 
MB-CB’s opponents:

- Round-Robin: Generate Positive Example* + Active Learning
- GL (Guided Learning) [Attenberg and Provost, 2010]: Generate Positive Example*
- GL-Hybrid [Attenberg and Provost, 2010]: Generate Positive Example*, then switch to Active Learning forever once the derivative of the learning curve is small enough
- Active Learning

*Add 3 random negatives (free) per Generate [Weiss and Provost 2003]
 
The Unlabeled Corpus: News Aggregator Dataset (NADS) [UCI ML Repo]

422,937 news headlines:
- 152,746 about Entertainment
- 108,465 about Science and Technology
- 115,920 about Business
- 45,615 about Health

$0.03 per label
 
The Crowd-Generated Examples: NADS-Generate

1000 crowd-generated headlines for each topic: Entertainment, Science and Technology, Business, Health

$0.15 per example
 
Experimental Setup

For each domain:
  For each skew in {1, 9, 99, 199, 499, 999}:
    Set budget = $100
    Construct dataset from unlabeled corpus to target skew
    Compute F1 AUC for each strategy
    Average over 10 trials
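Constructing a dataset at a target skew can be sketched as below. The helper name and sampling scheme are our assumptions; the paper's exact construction may differ.

```python
import random

def build_skewed_pool(positives, negatives, skew, seed=0):
    """Build a pool with `skew` negatives per positive, sampled without
    replacement from the labeled corpus. (Sketch of the setup described
    on the Experimental Setup slide.)"""
    rng = random.Random(seed)
    n_pos = min(len(positives), len(negatives) // skew)
    pool = rng.sample(positives, n_pos) + rng.sample(negatives, n_pos * skew)
    rng.shuffle(pool)
    return pool

pos = [("pos", i) for i in range(100)]
neg = [("neg", i) for i in range(1000)]
pool = build_skewed_pool(pos, neg, skew=9)
print(len(pool))  # -> 1000 (100 positives, 900 negatives)
```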
 
Entertainment

[Result figures: F1 AUC curves for the Entertainment domain across skew settings; further result figures follow]
Extras in the paper: more strategies for obtaining examples.
 
What if, instead of uncertainty sampling, we picked examples that we predict are positive?
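The alternative query strategy posed above can be sketched the same way as uncertainty sampling (names and toy numbers are ours):

```python
def positive_sample(probs, k):
    """Query the k unlabeled examples the current model predicts are MOST
    likely positive -- attractive under high skew, where positives are
    exactly what the training set lacks."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return ranked[:k]

pool = [0.95, 0.48, 0.10, 0.55, 0.99]
print(positive_sample(pool, 2))  # -> [4, 0]
```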
 
 
[Figure: % of training-set examples that are positive vs. number of examples in the training set]
 
Takeaway

Use MB-CB to intelligently switch between positive example generation and labeling.

https://github.com/polarcoconut/thesis-skew
 
Questions?
