Title: School of Computer Science
1A Novel Approach to Labeling Data
Luis von Ahn, Manuel Blum
School of Computer Science, Carnegie Mellon University
2Image Search on The Web
Uses filenames and surrounding text
3Image Search on The Web
- Lots of images named image1.jpg
- Gives you very few (if any) useful keywords for each image
- Doesn't look at the actual image!
4Desiderata
A method for labeling images that
- Actually looks at the images
- For any image, gives several keywords that make sense
- Is very fast (Google has labeled 425,000,000 images)
5According to The Computer Industry Almanac, by
the end of this year there will be 800 million
people using the internet on a regular basis
6The ESP Game
Two-player game
Players don't know each other and can't communicate
7The ESP Game
Player 1
Player 2
Guessing car
Guessing boy
Guessing car
Guessing hat
Success! You both agree on "car"
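The agreement rule on this slide can be sketched in a few lines of Python. This is only an illustration, not the game's actual implementation: the function name first_agreement, the (player, word) stream format, and the order in which the two players type their guesses are all assumptions, since the slide does not say who guessed what.

```python
from typing import Iterable, Optional, Tuple


def first_agreement(guesses: Iterable[Tuple[int, str]]) -> Optional[str]:
    """Return the first word that both players have guessed, or None.

    `guesses` is a time-ordered stream of (player, word) pairs, where
    player is 1 or 2. The players do not need to type the word at the
    same moment; a match happens as soon as one player types a word
    the other player has already typed.
    """
    seen = {1: set(), 2: set()}
    for player, word in guesses:
        word = word.strip().lower()
        other = 2 if player == 1 else 1
        if word in seen[other]:
            return word
        seen[player].add(word)
    return None


# Hypothetical guess order; the slide only lists the four guesses.
stream = [(1, "car"), (2, "boy"), (1, "hat"), (2, "car")]
label = first_agreement(stream)
print(label)
```

Because neither player sees the other's guesses, the only way to match is to type something that describes the image itself, which is what makes the agreed word usable as a label.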
9The ESP Game
- Taboos guarantee that each image will get many different keywords
- It's OK if some keywords are garbage: imagine how many garbage keywords each Google image has!
- Preliminary studies suggest that people find the game fun
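One way taboo words could produce many different keywords per image is sketched below. It assumes that a taboo word is a label already agreed on for that image in an earlier round and that taboo guesses are simply ignored; the slides do not spell out the mechanism, so play_round, collect_labels, and the sample rounds are hypothetical.

```python
from typing import Iterable, List, Optional, Set, Tuple

Guess = Tuple[int, str]  # (player number 1 or 2, guessed word)


def play_round(guesses: Iterable[Guess], taboo: Set[str]) -> Optional[str]:
    """Return the first non-taboo word both players guess, or None."""
    seen = {1: set(), 2: set()}
    for player, word in guesses:
        word = word.strip().lower()
        if word in taboo:
            continue  # assumed rule: taboo guesses do not count
        other = 2 if player == 1 else 1
        if word in seen[other]:
            return word
        seen[player].add(word)
    return None


def collect_labels(rounds: Iterable[Iterable[Guess]]) -> List[str]:
    """Play the same image over several rounds, growing the taboo list."""
    taboo: Set[str] = set()
    labels: List[str] = []
    for guesses in rounds:
        word = play_round(guesses, taboo)
        if word is not None:
            labels.append(word)
            taboo.add(word)  # later pairs must agree on something new
    return labels


# Three hypothetical rounds on the same image, each with a new pair.
rounds = [
    [(1, "car"), (2, "car")],
    [(1, "car"), (2, "road"), (1, "road")],
    [(2, "car"), (1, "red"), (2, "road"), (2, "red")],
]
labels = collect_labels(rounds)
print(labels)
```

Under these assumptions each round can only add a word that is not yet on the image's list, so repeated play yields distinct keywords rather than the same obvious one.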
10The ESP Game
Average rate of classification: slightly over 4 images per minute
5,000 people playing the game throughout the day
would classify all the images on Google in 30
days!
Games on Yahoo!, Pogo.com, or MSN average over
10,000 players throughout the day!
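The 30-day estimate can be checked with a short calculation. The sketch below assumes that the rate of 4 images per minute applies to each pair of players, that the players are active around the clock, and that one agreed label counts as classifying an image; the image count is the 425,000,000 figure quoted earlier in the slides. The function name days_to_label is illustrative.

```python
IMAGES = 425_000_000          # images labeled by Google, per the slides
RATE_PER_PAIR_PER_MIN = 4     # "slightly over 4 images per minute"
MINUTES_PER_DAY = 60 * 24


def days_to_label(images: int, players: int, rate_per_pair_per_min: float) -> float:
    """Days needed to label `images`, assuming continuous play in pairs."""
    pairs = players // 2
    images_per_day = pairs * rate_per_pair_per_min * MINUTES_PER_DAY
    return images / images_per_day


days_5k = days_to_label(IMAGES, 5_000, RATE_PER_PAIR_PER_MIN)
days_10k = days_to_label(IMAGES, 10_000, RATE_PER_PAIR_PER_MIN)
print(f"5,000 players:  {days_5k:.1f} days")
print(f"10,000 players: {days_10k:.1f} days")
```

With 2,500 pairs the throughput is 14.4 million images per day, which puts the full collection just under 30 days; doubling the player count to the 10,000 seen on popular game sites halves that time.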
11Stealing Cycles From Humans is a More General
Idea
- The ESP Game works for other variations of the problem
- www.HotOrNot.com
12www.espgame.org