Title: Utility data annotation via Amazon Mechanical Turk
1Utility data annotation via Amazon Mechanical Turk
X × 100 000 = $5000
- Alexander Sorokin
- David Forsyth
- University of Illinois at Urbana-Champaign
- http://visionpc.cs.uiuc.edu/largescale/
2Motivation
- Unlabeled data is free
- Labels are useful
- We need large volumes of labeled data
- Different labeling needs
- Is there X in the image?
- Outline X.
- Where is part Y of X?
- Of these 500 images, which belong to category X?
- ... and many more ...
3Amazon Mechanical Turk
- Task: "Dog?"
- Broker: www.mturk.com
- Workers see: "Is this a dog?  o Yes  o No" for $0.01
- Answer: "Yes"
- Pay: $0.01
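The task, broker, worker, pay loop on this slide can be sketched as a toy model. This is only an illustration of the flow: the `Task` and `Broker` classes, the `run` method, and the worker id are hypothetical names, and nothing here calls the real Mechanical Turk service.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    question: str       # what the worker sees, e.g. "Is this a dog?"
    choices: tuple      # allowed answers, e.g. ("Yes", "No")
    reward_usd: float   # payment per completed task


@dataclass
class Broker:
    """Hypothetical stand-in for the marketplace; not the MTurk API."""
    paid: dict = field(default_factory=dict)  # worker id -> total earned

    def run(self, task, worker_id, worker_fn):
        # Show the task to a worker, check the answer, then pay the reward.
        answer = worker_fn(task)
        if answer not in task.choices:
            raise ValueError(f"unexpected answer: {answer!r}")
        self.paid[worker_id] = self.paid.get(worker_id, 0.0) + task.reward_usd
        return answer


task = Task("Is this a dog?", ("Yes", "No"), 0.01)
broker = Broker()
answer = broker.run(task, "worker-1", lambda t: "Yes")
print(answer, broker.paid)
```

In this sketch the worker is just a function that returns an answer; in practice the answer would come from a person filling in a web form.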
4Motivation
X × 100 000 = $5000
Custom annotations
Large scale
Low price
5Annotation protocols
- Type keywords
- Select relevant images
- Click on landmarks
- Outline something
- Detect features
- ... anything else
6Type keywords
http://austinsmoke.com/turk/
$0.01
7Select examples
Joint work with Tamara and Alex Berg
http://visionpc.cs.uiuc.edu/largescale/data/simpleevaluation/html/horse.html
8Select examples
$0.02
Requester: mtlabel
9Click on landmarks
$0.01
http://vision-app1.cs.uiuc.edu/mt/results/people14-batch11/p7/
10Outline something
$0.01
http://visionpc.cs.uiuc.edu/largescale/results/production-3-2/results_page_013.html
Data from Ramanan NIPS06
11Detect features
Measuring molecules. Joint work with Rebecca Schulman (Caltech)
?? 0.1
http://visionpc.cs.uiuc.edu/largescale/all_examples.html
12Motivation
X × 100 000 = $5000
Custom annotations
Large scale
Low price
13Issues
- Quality?
- How good is it?
- How to be sure?
- Price?
- How to price it?
- How does MTurk compare with others?
- How do I sign up?
- sorokin2_at_uiuc.edu
- http://visionpc.cs.uiuc.edu/largescale/
14Annotation quality
- Annotations agree within 5-10 pixels on a 500x500 screen
- Some annotations are bad.
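One way to check the pixel agreement described above is to compare two annotators' clicks landmark by landmark. The sketch below assumes each annotation is a list of (x, y) clicks in the same landmark order. The function names, the 10-pixel tolerance, and the sample coordinates are illustrative assumptions, not values from the talk.

```python
import math


def click_distances(a, b):
    """Per-landmark Euclidean distance between two annotators' clicks."""
    if len(a) != len(b):
        raise ValueError("annotations must cover the same landmarks")
    return [math.hypot(xa - xb, ya - yb) for (xa, ya), (xb, yb) in zip(a, b)]


def flag_disagreements(a, b, tol_px=10.0):
    """Indices of landmarks where the two clicks differ by more than tol_px."""
    return [i for i, d in enumerate(click_distances(a, b)) if d > tol_px]


# Made-up clicks on a 500x500 image: the third landmark is far off.
ann_a = [(100, 120), (250, 300), (400, 410)]
ann_b = [(103, 124), (251, 298), (320, 410)]

print(click_distances(ann_a, ann_b))
print(flag_disagreements(ann_a, ann_b))
```

Flagged landmarks are candidates for the "bad ones"; they still need a human look, since a large distance can also mean an ambiguous landmark.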
15Grading tasks
- Take 10 submitted results
- Create new task to verify the result
- Verification is easy
- Pay the same or slightly higher price
- Total overhead: 10%
- (work in progress)
http://vision-app1.cs.uiuc.edu/mt/grading/people14-batch11-small/p1/
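The grading scheme can be sketched as follows, reading "Take 10 submitted results" as grouping ten results into one verification task paid at the same price as one annotation. That reading, the function names, and the 900-result example are assumptions for illustration. Under it, grading costs roughly one tenth of the annotation cost.

```python
def grading_batches(results, batch_size=10):
    """Split submitted results into groups, one group per grading task."""
    return [results[i:i + batch_size] for i in range(0, len(results), batch_size)]


def grading_overhead(n_results, price, grading_price=None, batch_size=10):
    """Grading cost as a fraction of the annotation cost."""
    if grading_price is None:
        grading_price = price  # "pay the same" case
    n_batches = -(-n_results // batch_size)  # ceiling division
    return (n_batches * grading_price) / (n_results * price)


results = [f"result-{i}" for i in range(900)]
batches = grading_batches(results)
print(len(batches))                   # number of grading tasks
print(grading_overhead(900, 0.01))    # close to 0.1 when prices are equal
```

Paying graders slightly more than annotators raises the overhead proportionally, for example a grading price of 1.5 times the annotation price gives about 15%.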
16Price
- $0.01 per image (16 clicks)
- $1500 / 100 000 images
- >1000 images per day
- <4 months
- Workers suggested $0.03 - $0.05/img
- $3500 - $5500 / 100 000 images
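The arithmetic behind these figures can be checked with a short script, reading the amounts as US dollars. The per-image prices, totals, and throughput come from the slide. The helper names are made up for this sketch, and prices are kept in cents to avoid rounding noise. At each price point the slide total exceeds price times image count by the same 500. The slide does not say what that difference covers, so the script only reports it.

```python
N_IMAGES = 100_000
IMAGES_PER_DAY = 1000  # slide says more than 1000 per day, so this is a lower bound


def base_cost_usd(n_images, price_cents):
    """Cost of paying price_cents per image, with no overhead."""
    return n_images * price_cents / 100


def days_needed(n_images, images_per_day):
    """Days to label n_images at a fixed daily throughput."""
    return n_images / images_per_day


# (price per image in cents, total quoted on the slide)
quoted = [(1, 1500), (3, 3500), (5, 5500)]
extras = []
for price_cents, slide_total in quoted:
    base = base_cost_usd(N_IMAGES, price_cents)
    extras.append(slide_total - base)
    print(price_cents, base, slide_total - base)

days = days_needed(N_IMAGES, IMAGES_PER_DAY)
print(days, days / 30)  # 100 days at most, a bit over 3 months
```

At 1000 images per day the job takes 100 days, which is consistent with the slide's bound of under 4 months.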
17Is the price right?
- $0.01 / 40 clicks: 15 hours, 900 labels
- $0.01 / 14 clicks: 1.6 hours, 900 labels
- $0.01 / 16 clicks: 4 hours, 900 labels
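To compare the three runs, it helps to convert them to labels per hour and price per click. The sketch below assumes each line reports the price per task, the clicks per task, and the time taken to collect 900 labels. The field names are illustrative.

```python
PRICE_USD = 0.01  # same price in all three runs

runs = [
    {"clicks": 40, "hours": 15.0, "labels": 900},
    {"clicks": 14, "hours": 1.6, "labels": 900},
    {"clicks": 16, "hours": 4.0, "labels": 900},
]

for r in runs:
    r["labels_per_hour"] = r["labels"] / r["hours"]
    r["usd_per_click"] = PRICE_USD / r["clicks"]
    print(r["clicks"], round(r["labels_per_hour"], 1), round(r["usd_per_click"], 5))
```

At the same price, the 40-click task ran at about 60 labels per hour while the 14-click task ran at roughly 560, so the effort demanded per cent strongly affects how fast workers take up the task.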
18Annotation Method Comparison
Approach Cost Scale Setup effort Centralized Quality Elastic to
MTurk no /
LabelMe Yes
ImageParsing.com Yes
Games with purpose (ESP) Yes
In house no
19How do I sign up?
- Go to our web page
- http://visionpc.cs.uiuc.edu/largescale/
- Send us an e-mail
- sorokin2_at_uiuc.edu
- Register at Amazon Mechanical Turk
- http://www.mturk.com
20Acknowledgments
- Special thanks to
- David Forsyth
- Tamara Berg
- Rebecca Schulman
- David Martin
- Kobus Barnard
- Mert Dikmen
- All workers at Amazon Mechanical Turk
- This work was funded in part by ONR
21Thank you
X × 100 000 = $5000