Title: Image Understanding
1. Image Understanding & Web Security
- Henry Baird
- Joint work with
- Richard Fateman, Allison Coates, Kris Popat,
- Monica Chew, Tom Breuel, Mark Luk
2. A fast-emerging research topic
- Human Interactive Proofs (HIPs; definition later)
- first instance in 1999
- research took hold first in the CS security theory field
- intersects image understanding, cognitive science, etc.
- fast attracting researchers, engineers, users
- This talk
- A brief history of HIPs
- Existing systems, w/ my critiques
- Professional activities so far, incl. the 1st Intl. Workshop
- In detail: PARC's PessimalPrint & BaffleText
H. Baird & K. Popat, "Web Security and Document Image Analysis," in J. Hu & A. Antonacopoulos (Eds.), Web Document Analysis, World Scientific, 2003 (in press).
3. Straws in the wind
- 1990s: spammers trolling for email addresses
- in defense, people disguise them, e.g.
- baird AT parc DOT com
- 1997: abuse of the Add-URL feature at AltaVista
- some write programs to add their URL many times
- skewed the search rankings
- Andrei Broder et al. (then at DEC SRC)
- a user action which is legitimate when performed once
- becomes abusive when repeated many times
- no effective legal recourse
- how to block or slow down these programs?
4. The first known instance: AltaVista's Add-URL filter
An image of text, not ASCII
- 1999: "ransom note" filter
- randomly pick letters, fonts, rotations; render as an image
- every user is required to read it and type it in correctly
- reduced spam add_URL by over 95%
- Weaknesses: isolated chars, filterable noise, affine deformations
M. D. Lillibridge, M. Abadi, K. Bharat, A. Z. Broder, "Method for Selectively Restricting Access to Computer Systems," U.S. Patent No. 6,195,698, Filed April 13, 1998, Issued February 27, 2001.
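The "ransom note" recipe above (random letters, fonts, rotations, then grade what the user types) can be sketched as follows. This is only an illustration, not AltaVista's implementation: the font names, the rotation range, the case-insensitive grading, and the function names `make_challenge` and `grade` are all assumptions, and rendering the glyph specs to an image is left out.

```python
import random
import string

# Hypothetical font names; a real filter would draw from installed typefaces.
FONTS = ["serif", "sans", "mono"]

def make_challenge(rng, length=6, max_rotation_deg=30.0):
    """Pick letters, fonts and rotations at random.

    Returns (answer, glyphs): the string the user must type, and one
    rendering spec per character for a (not shown) image renderer.
    """
    glyphs = []
    for _ in range(length):
        glyphs.append({
            "char": rng.choice(string.ascii_uppercase),
            "font": rng.choice(FONTS),
            "rotation_deg": rng.uniform(-max_rotation_deg, max_rotation_deg),
        })
    answer = "".join(g["char"] for g in glyphs)
    return answer, glyphs

def grade(answer, response):
    """Machine judge: accept only an exact (case-insensitive) transcription."""
    return response.strip().upper() == answer

rng = random.Random()
answer, glyphs = make_challenge(rng)
print(len(glyphs), "glyphs to render; accepted:", grade(answer, answer.lower()))
```

Because each character is chosen and placed independently, an attacker can also segment and recognize each one independently, which is the "isolated chars" weakness listed on the slide.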
5. Yahoo!'s Chat Room Problem
- September 2000
- Udi Manber asked Prof. Manuel Blum's group at CMU:
- programs impersonate people in chat rooms,
- then hand out ads (ugh!)
- how can all machines be denied access to a Web site
- without inconveniencing any human users?
- I.e., how to distinguish between machines and people on-line
- a kind of "Turing test"!
6. Alan Turing (1912-1954)
- 1936: a universal model of computation
- 1940s: helped break the Enigma (U-boat) cipher
- 1949: first serious uses of a working computer
- including plans to read printed text
- (he expected it would be easy)
- 1950: proposed a test for machine intelligence
7. Turing's Test for AI
- How to judge that a machine can "think":
- play an "imitation game" conducted via teletypes
- a human judge & two invisible interlocutors:
- a human
- a machine pretending to be human
- after asking any questions (challenges) he/she wishes,
- the judge decides which is human
- failure to decide correctly would be convincing evidence
- of machine intelligence (Turing asserted)
- Modern GUIs invite richer challenges than teletypes.
A. Turing, "Computing Machinery and Intelligence," Mind, Vol. 59(236), 1950.
8. CAPTCHAs: Completely Automated Public Turing Tests to Tell Computers & Humans Apart
(M. Blum, L. A. von Ahn, J. Langford, et al., CMU-SCS)
- challenges can be generated & graded automatically
- (i.e. the judge is a machine)
- accepts virtually all humans, quickly & easily
- rejects virtually all machines
- resists automatic attack for many years
- (even assuming that its algorithms are known?)
- NOTE: the machine administers, but cannot pass, the test!
L. von Ahn, M. Blum, N. J. Hopper, J. Langford, "CAPTCHA: Using Hard AI Problems For Security," Proc., EuroCrypt 2003, Warsaw, Poland, May 4-8, 2003 (to appear).
9. CMU's Gimpy CAPTCHA
- Randomly pick:
- English words, deformations, occlusions, backgrounds, etc.
- Challenge user to type in any three of the words
- Designed by CMU team; tried out by Yahoo!
- Problem: users hated it; Yahoo! withdrew it
L. von Ahn, M. Blum, N. J. Hopper, J. Langford, The CAPTCHA Web Page, http://www.captcha.net.
10. Yahoo!'s present CAPTCHA: EZ-Gimpy
- Randomly pick:
- one English word, deformations, degradations, occlusions,
- colored backgrounds, etc.
- Better tolerated by users
- Now used on a large scale to protect various services
- Weaknesses: a single typeface, English lexicon
11. PayPal's CAPTCHA
- Nothing published
- Seems to use a single typeface
- Picks, at random:
- letters, overlain pattern
- Weaknesses: single typeface, simple grid,
- no image degradations, characters spaced apart
12. Cropping up everywhere
- In use today, to defend against:
- skewing search-engine rankings (Altavista, 1999)
- infesting chat rooms, etc. (Yahoo!, 2000)
- gaming financial accounts (PayPal, 2001)
- robot spamming (MailBlocks, SpamArrest, 2002)
- In the last few months: Overture, a Chinese website, HotMail,
- CD-rebate, TicketMaster, MailFrontier, Qurb, Madonnarama, ...
- have you seen others?
- On the horizon:
- ballot stuffing, password guessing, denial-of-service attacks
- blunt force attacks (e.g. UT Austin break-in, Mar '03)
- many others
- Similar problems w/ scrapers; also, likely on Intranets.
D. P. Baron, "eBay and Database Protection," Case No. P-33, Case Writing Office, Stanford Graduate School of Business, Stanford Univ., 2001.
13. The Known Limits of Image Understanding Technology
- There remains a large gap in ability
- between human and machine vision systems,
- even when reading printed text
- Performance of OCR machines has been systematically studied
- 7-year-olds can consistently do better!
- This ability gap has been mapped quantitatively
S. Rice, G. Nagy, T. Nartker, OCR: An Illustrated Guide to the Frontier, Kluwer Academic Publishers, 1999.
14. Image Degradation Modeling
- Effects of printing & imaging
[Figure: degraded text samples; panel labels: thrs x blur; blur, thrs, sens]
We can generate challenging images pseudorandomly
H. Baird, "Document Image Defect Models," in H. Baird, H. Bunke, K. Yamamoto (Eds.), Structured Document Image Analysis, Springer-Verlag, New York, 1992.
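A rough sketch of how blur, threshold and sensitivity parameters might act on a bilevel text image. This is not the defect model of the cited chapter: a repeated 3x3 box filter stands in for the blur, per-pixel Gaussian noise stands in for sensitivity, and the function name `degrade` and its argument names are hypothetical.

```python
import random

def degrade(image, blur_passes, thrs, sens, rng):
    """Blur, add per-pixel noise, then threshold.

    image: list of rows of floats in [0, 1], where 1.0 means ink.
    blur_passes: number of 3x3 box-filter passes (assumed stand-in for blur).
    thrs: binarization threshold; sens: std. dev. of per-pixel noise.
    Returns a list of rows of 0/1 ints.
    """
    h, w = len(image), len(image[0])
    img = [row[:] for row in image]
    for _ in range(blur_passes):
        out = [[0.0] * w for _ in range(h)]
        for y in range(h):
            for x in range(w):
                total = 0.0
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        yy, xx = y + dy, x + dx
                        # Pixels outside the image count as white paper (0.0).
                        if 0 <= yy < h and 0 <= xx < w:
                            total += img[yy][xx]
                out[y][x] = total / 9.0
        img = out
    return [[1 if v + rng.gauss(0.0, sens) > thrs else 0 for v in row]
            for row in img]

# A single ink pixel on a 5x5 page: blur plus a low threshold fattens it,
# blur plus a high threshold erases it.
page = [[0.0] * 5 for _ in range(5)]
page[2][2] = 1.0
rng = random.Random()
for thrs in (0.05, 0.5):
    result = degrade(page, blur_passes=1, thrs=thrs, sens=0.0, rng=rng)
    print(thrs, sum(map(sum, result)))
```

Drawing `blur_passes`, `thrs` and `sens` from a seeded generator is one way to produce a reproducible, pseudorandom stream of degraded images.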
15. Machine Accuracy is a Smooth, Monotonic Function of Parameters
T. K. Ho & H. S. Baird, "Large Scale Simulation Studies in Image Pattern Recognition," IEEE Trans. on PAMI, Vol. 19, No. 10, pp. 1067-1079, October 1997.
16. Can You Read These Degraded Images?
Of course you can ... but OCR machines cannot!
17. Experiments by PARC & UCB-CS
- Pick words at random
- 70 words commonly used on the Web
- w/out ascenders or descenders (cf. Spitz)
- Vary physics-based image degradation parameters
- blur, threshold, x-scale, each within certain ranges
- Pick fonts at random from a large set
- Times Roman (TR), Times Italic (TI),
- Palatino Roman (PR), Palatino Italic (PI),
- Courier Roman (CR), Courier Oblique (CO), etc.
- Test legibility on:
- ten human volunteers (UC Berkeley CS Dept grad students)
- three OCR machines:
- Expervision TR (E), ABBYY FineReader (A), IRIS Reader (I)
18. Results: OCR Accuracy, by machine
Each machine has its peculiar blind spots
19. OCR Accuracy: varying blur & threshold
The machines share some blind spots
20. PessimalPrint: exploiting image degradations
- Three OCR machines fail when:
- blur = 0.0
- & threshold in the range 0.02 - 0.08
- threshold = 0.02
- & any value of blur
[Figure: degraded word images alongside the OCR outputs]
... but people find all these easy to read
A. Coates, H. Baird, R. Fateman, "Pessimal Print: A Reverse Turing Test," Proc. 6th IAPR Intl. Conf. on Doc. Anal. & Recogn. (ICDAR'01), Seattle, WA, Sep 10-13, 2001.
21. High Time for a Workshop!
- Manuel Blum proposes it, rounds up some key speakers
- Henry Baird offers PARC as venue; Kris Popat helps run it
- Goals
- Invite all known principals: theory, systems, engineers, users
- Describe the state of the art
- Plan next steps for the field
- Organization
- 30 attendees
- abstracts only, 1-5 pages, no refereeing, no archival publication
- 100% participation: everyone gives a (short) talk
- mixing it up: panel & working group discussions
- 2-1/2 days, lots of breaks for informal socializing
- plenary talk by John McCarthy, "Father of AI"
22. 1st NSF Intl. Workshop on Human Interactive Proofs, PARC, Palo Alto, CA, January 9-11, 2002
23. HIP2002 Participants
- CMU - SCS, Aladdin Center
- Manuel Blum, Lenore Blum, Luis von Ahn, John Langford, Guy Blelloch, Nick Hopper, Ke Yang, Brighten Godfrey, Bartosz Przydatek, Rachel Rue
- PARC - SPIA/Security/Theory
- Henry Baird, Kris Popat, Tom Breuel, Prateek Sarkar, Tom Berson, Dirk Balfanz, David Goldberg
- UCB - CS & SIMS
- Richard Fateman, Allison Coates, Jitendra Malik, Doug Tygar, Alma Whitten, Rachna Dhamija, Monica Chew, Adrian Perrig, Dawn Song
- RPI
- George Nagy
- Stanford
- John McCarthy
- NSF
- Robert Sloan
- Altavista
- Andrei Broder
- Yahoo!
- Udi Manber
- Bell Labs
- Dan Lopresti
- IBM T.J. Watson
- Charles Bennett
- InterTrust Star Labs
- Stuart Haber
- City Univ. of Hong Kong
- Nancy Chan
- Weizmann Institute
- Moni Naor
- RSA Security Laboratories
- Ari Juels
- Document Recognition Techs, Inc
- Larry Spitz
24. Variations & Generalizations
- CAPTCHA
- Completely Automated Public Turing test to tell Computers and Humans Apart
- HUMANOID
- Text-based dialogue which an individual can use to authenticate that he/she is himself/herself ("naked in a glass bubble")
- PHONOID
- Individual authentication using spoken language
- Human Interactive Proof (HIP)
- An automatically administered challenge/response protocol
- allowing a person to authenticate him/herself as belonging to a certain group over a network without the burden of passwords,
- biometrics, mechanical aids, or special training.
25. Highlights of HIP2002
- Theory
- some text-based CAPTCHAs are provably breakable
- Ability Gaps
- vision: gestalt, segmentation, noise immunity, style consistency
- speech: noise of many kinds, clutter (cocktail party effect)
- intelligence: puzzles, analogical reasoning, weak logic
- gestures, reflexes, common knowledge, ...
- Applications
- subtle system-level vulnerabilities
- aggressive arms race with shadowy enemies
http://www.parc.com/istl/groups/did/HIP2002
26. Funding & Partnerships
- NSF
- Robert Sloan, Dir, Theory of Computing Pgm
- strongly supportive of this newborn field
- encouraged grant proposals
- Yahoo!
- willing to run field trials
- user acceptance laboratory
- able to detect intrusion
27. Disciplines
- Participating now
- Cryptography
- Security
- Pattern Recognition
- Computer Vision
- Artificial Intelligence
- eCommerce
- Needed
- Cognitive Science
- Psychophysics (esp. of Reading)
- Biometrics
- Business, Law, ...
- ...?
28. Weaknesses of Existing Reading-Based CAPTCHAs
- English lexicon is too predictable
- dictionaries are too small
- only 1.2 bits of entropy per character (cf. Shannon)
- Physics-based image degradations are vulnerable
- to well-studied image restoration attacks, e.g. ?
- Complex images irritate people
- even when they can read them
- need user-tolerance experiments
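To see why 1.2 bits of entropy per character is a weakness, compare it with letters drawn uniformly at random. The arithmetic below is a back-of-the-envelope sketch: the 8-character length is an arbitrary illustrative choice, and treating entropy as a simple per-character product is a simplification.

```python
import math

def guess_space_bits(n_chars, bits_per_char):
    """Total entropy, in bits, of a string of n_chars characters."""
    return n_chars * bits_per_char

N = 8  # illustrative word length (an assumption, not from the talk)

english_bits = guess_space_bits(N, 1.2)           # English text, per the slide
random_bits = guess_space_bits(N, math.log2(26))  # uniform random a-z

print(f"English word: {english_bits:.1f} bits, about {2 ** english_bits:,.0f} guesses")
print(f"Random string: {random_bits:.1f} bits, about {2 ** random_bits:,.0f} guesses")
```

On this rough estimate an attacker who knows the lexicon faces hundreds of candidates rather than billions, which is the motivation for moving away from dictionary words.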
29. Strengths of Human Reading
- Literature on the psychophysics of reading is relevant
- familiarity helps, e.g. English words
- optimal word-image size (subtended angle)
- is known (0.3-2 degrees)
- optimal contrast conditions known
- other factors measured for the best performance
- to achieve and sustain "critical reading speed"
- BUT gives no answer to:
- where's the optimal comfort zone?
G. E. Legge, D. G. Pelli, G. S. Rubin, M. M. Schleske, "Psychophysics of Reading: I. Normal Vision," Vision Research 25(2), 1985.
A. J. Grainger & J. Segui, "Neighborhood Frequency Effects in Visual Word Recognition," Perception & Psychophysics 47, 1990.
30. Designing a Stronger CAPTCHA: BaffleText principles
- Nonsense words.
- generate pronounceable, not spellable, words
- using a variable-length character n-gram Markov model
- they look familiar, but aren't in any lexicon, e.g.
- ablithan wouquire quasis
- Gestalt perception.
- force inference of a whole word-image
- from fragmentary or occluded characters, e.g. [example image omitted]
- using a single familiar typeface also helps
M. Chew & H. S. Baird, "BaffleText: A Human Interactive Proof," Proc., SPIE/IS&T Conf. on Document Recognition & Retrieval X, Santa Clara, CA, January 23-24, 2003.
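The nonsense-word idea can be sketched with a character Markov model. The talk specifies a variable-length n-gram model; the version below is a simplification that uses a fixed order of 2, a tiny made-up training list, and hypothetical function names, so it illustrates the principle rather than BaffleText itself.

```python
import random
from collections import defaultdict

def train(words, order=2):
    """Map each `order`-character context to the characters that follow it."""
    model = defaultdict(list)
    for w in words:
        padded = "^" * order + w + "$"   # "^" = start padding, "$" = end marker
        for i in range(len(padded) - order):
            model[padded[i:i + order]].append(padded[i + order])
    return model

def generate(model, rng, order=2, max_len=10):
    """Random walk through the model from the start context."""
    state, out = "^" * order, []
    while len(out) < max_len:
        nxt = rng.choice(model[state])
        if nxt == "$":
            break
        out.append(nxt)
        state = state[1:] + nxt
    return "".join(out)

def nonsense_word(model, lexicon, rng, order=2, min_len=5, max_len=10, tries=1000):
    """Return a generated string that is NOT in the lexicon, or None."""
    for _ in range(tries):
        w = generate(model, rng, order, max_len)
        if len(w) >= min_len and w not in lexicon:
            return w
    return None

# Tiny illustrative lexicon (an assumption); a real system would train on
# a large word list.
lexicon = {"require", "acquire", "inquire", "quality", "ability", "blithe",
           "within", "another", "without", "basis", "oasis", "quasi"}
model = train(sorted(lexicon))
print(nonsense_word(model, lexicon, random.Random()))
```

Rejecting any output found in the lexicon is what keeps a dictionary attack from applying, while the n-gram statistics keep the strings looking familiar.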
31. Mask Degradations
- Parameters of pseudorandom mask generator:
- shape type: square, circle, ellipse, mixed
- density: black-area / whole-area
- range of radii of shapes
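A minimal sketch of a mask generator driven by the three parameters listed above. The stopping rule (add shapes until the black fraction reaches the requested density), the integer radii, and the name `make_mask` are assumptions; how the mask is then combined with the word image is not shown.

```python
import random

def make_mask(width, height, shape, density, r_min, r_max, rng):
    """Stamp random shapes until black-area / whole-area >= density.

    shape: "square", "circle", "ellipse" or "mixed".
    r_min, r_max: inclusive range of integer radii (half-widths), r_min >= 1.
    Returns rows of 0/1 ints, where 1 is a black mask pixel.
    """
    if r_min < 1 or not 0.0 <= density <= 1.0:
        raise ValueError("need r_min >= 1 and 0 <= density <= 1")
    mask = [[0] * width for _ in range(height)]
    black, target = 0, density * width * height
    while black < target:
        kind = rng.choice(["square", "circle", "ellipse"]) if shape == "mixed" else shape
        cx, cy = rng.randrange(width), rng.randrange(height)
        rx = rng.randint(r_min, r_max)
        # Only ellipses get an independent vertical radius.
        ry = rng.randint(r_min, r_max) if kind == "ellipse" else rx
        for y in range(max(0, cy - ry), min(height, cy + ry + 1)):
            for x in range(max(0, cx - rx), min(width, cx + rx + 1)):
                inside = (kind == "square"
                          or ((x - cx) / rx) ** 2 + ((y - cy) / ry) ** 2 <= 1.0)
                if inside and not mask[y][x]:
                    mask[y][x] = 1
                    black += 1
    return mask

mask = make_mask(40, 12, "mixed", density=0.3, r_min=1, r_max=3, rng=random.Random())
print("\n".join("".join("#" if v else "." for v in row) for row in mask))
```

Because the last shape can overshoot, the achieved density is at least, not exactly, the requested value.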
32. BaffleText Experiments at PARC
- Goal: map the margins of accurate & comfortable
- human reading on this family of images
- Metrics
- objective difficulty: accuracy
- subjective difficulty: rating
- response time
- exit survey: how tolerable overall
- Participation
- 41 individual sessions
- 1200 challenge/response trials
- 18 exit surveys
33. BaffleText challenge webpage
34. BaffleText user ratings
35. User Acceptance
- Subjects willing to solve a BaffleText:
- 17% every time they send email
- 39% if it cut spam by 10x
- 89% every time they register for an e-commerce site
- 94% if it led to more trustworthy recommendations
- 100% every time they register for an email account
Out of 18 responses to the exit survey.
36. Subjective difficulty tracks objective difficulty
37. How to engineer BaffleText
- When we generate a challenge,
- need to estimate its difficulty
- throw away if too easy or too hard
- Apply an idea from the psychophysics of reading
- image complexity metric: how hard to read
- simple to compute: perimeter^2 / black-area
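The generate-then-filter loop described above can be sketched as below, reading the metric as perimeter squared divided by black area (the exponent is garbled in the slide text, so this reading is an assumption). Counting exposed pixel edges as the perimeter, the default 50-100 acceptance band (taken from the engineering guidelines later in the talk), and the function names are likewise illustrative choices.

```python
def image_complexity(image):
    """perimeter**2 / black_area for a bilevel image (rows of 0/1, 1 = black).

    Perimeter is counted as the number of black-pixel edges that border
    a white pixel or the image boundary.
    """
    h, w = len(image), len(image[0])
    area = perimeter = 0
    for y in range(h):
        for x in range(w):
            if not image[y][x]:
                continue
            area += 1
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                yy, xx = y + dy, x + dx
                if not (0 <= yy < h and 0 <= xx < w) or not image[yy][xx]:
                    perimeter += 1
    return perimeter ** 2 / area if area else 0.0

def keep_challenge(image, lo=50.0, hi=100.0):
    """Throw away challenges whose estimated difficulty is too low or too high."""
    return lo <= image_complexity(image) <= hi

# A solid 3x3 blob is "simple"; the same ink broken into fragments scores higher.
blob = [[0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0]]
specks = [[1, 0, 1, 0, 0],
          [0, 0, 0, 0, 0],
          [1, 0, 1, 0, 0],
          [0, 0, 0, 0, 0],
          [0, 0, 0, 0, 0]]
print(image_complexity(blob), keep_challenge(blob))      # 16.0 False
print(image_complexity(specks), keep_challenge(specks))  # 64.0 True
```

The metric is scale-free for a fixed shape (a solid square scores 16 at any size), so fragmentation and occlusion, not image size, drive the score up.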
38. Image complexity predicts objective difficulty
39. Image complexity predicts subjective difficulty
40. Engineering guidelines
- For high performance, image complexity
- should fall in the range 50-100, e.g. [example image omitted]
- Within this regime, BaffleText performs well:
- 100% of human subjects willing to try to read it
- 89% accuracy by humans
- 0% accuracy by commercial OCR
- 3.3 difficulty rating, out of 10 (on average)
- 8.7 seconds / trial on average
41. The latest serious (known or published) attack
G. Mori & J. Malik, "Recognizing Objects in Adversarial Clutter," submitted to CVPR'03, Madison, WI, June 16-22, 2003.
- Greg Mori & Jitendra Malik (UCB-CS)
- Generalized Shape Context CV method
- requires known lexicon; else, fails completely
- expects known font (or fonts); else, does worse
- Results of Mori-Malik attacks (Dec 2002), given
- perfect foreknowledge of both lexicon and font
42. BaffleText: the strongest known CAPTCHA?
- Resists many known algorithmic attacks
- physics-based image restoration
- recognizing into a lexicon
- known-typeface targeting
- segmenting then recognizing
- Exploits hard-to-automate human cognition powers
- Gestalt perception
- semi-linguistic familiarity
- within-typeface style consistency
43. Recent Microsoft CAPTCHA
- Random strings, local space-warping plus meaningless curving strokes, both black (overlaid) and white (erasing)
- Fielded Dec 2002 on Passport (HotMail, etc.)
- Immediate reduction in new Hotmail accounts, with virtually no user complaints
P. Y. Simard, R. Szeliski, J. Benaloh, J. Couvreur, I. Calinov, "Using Character Recognition and Segmentation to Tell Computer from Humans," Proc., Intl. Conf. on Document Analysis & Recognition, Edinburgh, Scotland, August 2003 (to appear).
44. PARC's Leadership in R&D on Reading-based CAPTCHAs
- First refereed article on CAPTCHAs
- A. L. Coates, H. S. Baird, R. Fateman, "Pessimal Print: a Reverse Turing Test," Proc., 6th IAPR Intl. Conf. on Document Analysis & Recognition, Seattle, WA, Sept. 10-13, 2001.
- First professional HIP event, organized by PARC
- 1st NSF Intl. Workshop on HIPs, Jan. 9-11, 2002, PARC, Palo Alto, CA.
- First to play both offense & defense
- builds high-performance OCR systems & attacks CAPTCHAs
- builds strong CAPTCHAs
- First to validate using human-factors research
- human-subject trials measuring both accuracy & tolerance
- PARC's interdisciplinary tradition: social & computer sciences
45. The Arms Race
- When will serious technical attacks be launched?
- spam kings make millions
- two spam-blocking e-commerce firms now use CAPTCHAs
- How long can a CAPTCHA withstand attack?
- especially if its algorithms are published or guessed
- Strategy: keep a pipeline of defenses in reserve
- continuing partnership between R&D & users
46. Lots of Open Research Questions
- What are the most intractable obstacles to machine vision?
- segmentation, occlusion, degradations, ...?
- Under what conditions is human reading most robust?
- linguistic & semantic context, Gestalt, style consistency, ...?
- Where are ability gaps located?
- quantitatively, not just qualitatively
- How to generate challenges strictly within ability gaps?
- fully automatically
- an indefinitely long sequence of distinct challenges
47. HIP Research Community
- PARC CAPTCHA website
- www.parc.com/istl/projects/captcha
- HIP2002 Workshop
- www.parc.com/istl/groups/did/HIP2002
- HIP Website at Aladdin Center, CMU-SCS
- www.captcha.net
- Volunteers for a PARC CAPTCHA usability test?
- A 2nd HIP Workshop soon?
48. Alan Turing might have enjoyed the irony
- A technical problem: machine reading
- which he thought would be easy,
- has resisted attack for 50 years, and
- now allows the first widespread
- practical use of variants of
- his test for artificial intelligence.
49. Contact
- Henry S. Baird
- baird_at_parc.com
- www.parc.com/baird