Transcript and Presenter's Notes

Title: Image Understanding


1
Image Understanding & Web Security
  • Henry Baird
  • Joint work with
  • Richard Fateman, Allison Coates, Kris Popat,
  • Monica Chew, Tom Breuel, Mark Luk

2
A fast-emerging research topic
  • Human Interactive Proofs (definition later)
  • first instance in 1999
  • research took hold in the CS security & theory
    field first
  • intersects image understanding, cog sci, etc etc
  • fast attracting researchers, engineers, users
  • This talk:
  • A brief history of HIPs
  • Professional activities, so far -- incl. the 1st
    Intl Workshop
  • Existing systems -- w/ my critiques
  • Next steps for the field
  • In detail: PARC's PessimalPrint & BaffleText

H. Baird & K. Popat, Web Security & Document
Image Analysis, in J. Hu & A. Antonacopoulos
(Eds.), Web Document Analysis, World Scientific,
2003 (in press).
3
Early rumblings
  • 90s: spammers trolling for email addresses
  • in defense, people disguise them, e.g.
  • baird at parc dot com
  • 1997: abuse of Add-URL feature at AltaVista
  • some write programs to add their URL many times
  • skewed the popularity rankings
  • Andrei Broder et al (then at DEC SRC)
  • a user action which is legitimate when performed
    once
  • becomes abusive when repeated many times
  • no effective legal recourse
  • how to block or slow down these programs

4
The first known instance: Altavista's Add-URL
filter
An image of text, not ASCII
  • 1999: ransom note filter
  • randomly pick letters, fonts, rotations & render
    as an image (sketched below)
  • every user required to read and type it in
    correctly
  • reduced spam Add-URL submissions by over 95%
  • Weaknesses: isolated chars, filterable noise,
    affine deformations

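A minimal sketch of this kind of challenge generator, assuming Pillow and a couple of system fonts; it illustrates the idea only and is not the patented AltaVista implementation. Font paths, parameter ranges, and the helper name are assumptions.

```python
# Sketch of a "ransom note" challenge: random letters in random fonts and
# rotations, rendered as an image rather than ASCII. Illustrative only.
import random
from PIL import Image, ImageDraw, ImageFont

FONT_PATHS = ["DejaVuSans.ttf", "DejaVuSerif.ttf"]   # assumed to be installed
LETTERS = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"          # skip easily-confused glyphs

def make_challenge(n_chars=6, canvas=(400, 110)):
    answer = "".join(random.choice(LETTERS) for _ in range(n_chars))
    img = Image.new("L", canvas, color=255)            # white page
    x = 10
    for ch in answer:
        font = ImageFont.truetype(random.choice(FONT_PATHS), random.randint(32, 48))
        tile = Image.new("L", (60, 60), color=255)     # render one glyph per tile
        ImageDraw.Draw(tile).text((12, 6), ch, font=font, fill=0)
        tile = tile.rotate(random.uniform(-30, 30), expand=True, fillcolor=255)
        img.paste(tile, (x, random.randint(0, 25)))
        x += random.randint(38, 50)
    return answer, img    # the server keeps `answer`; the user sees only `img`

# answer, img = make_challenge(); img.save("challenge.png")
```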
M. D. Lillibridge, M. Abadi, K. Bharat, A. Z.
Broder, Method for Selectively Restricting
Access to Computer Systems, U.S. Patent No.
6,195,698, Issued February 27, 2001.
5
Yahoo!'s Chat Room Problem
  • September 2000
  • Udi Manber asked Prof. Manuel Blum's group at
    CMU-SCS
  • programs impersonate people in chat rooms,
  • then hand out ads ugh!
  • how can all machines be denied access to a Web
    site
  • without inconveniencing any human users?
  • I.e., how to distinguish between machines and
    people on-line
  • some variation on Turing tests!

6
Alan Turing (1912-1954)
  • 1936: a universal model of computation
  • 1940s: helped break Enigma (U-boat) cipher
  • 1949: first serious uses of a working computer
  • including plans to read printed
    text
  • (he expected it would be easy)
  • 1950: proposed strong test for machine
    intelligence

7
Turing Tests
  • How to judge that a machine can think?
  • play an imitation game conducted via teletypes
  • a human judge & two invisible interlocutors
  • a human
  • a machine pretending to be human
  • after asking any questions (challenges) he/she
  • wishes, the judge decides which is human
  • failure to decide correctly would be convincing
  • evidence of machine intelligence (Turing
    asserted)
  • Modern GUIs invite richer challenges than
    teletypes.

A. Turing, Computing Machinery and Intelligence,
Mind, Vol. 59(236), 1950.
8
CAPTCHAs: Completely Automated Public Turing
Tests to Tell Computers & Humans Apart
(M. Blum, L. A. von Ahn, J. Langford, et al, CMU
SCS)
  • challenges can be generated & graded
    automatically
  • (i.e. the judge is a machine; a minimal grading
    loop is sketched below)
  • accepts virtually all humans, quickly & easily
  • rejects virtually all machines
  • resists automatic attack for many years
  • (even assuming that its algorithms are
    known?)
  • NOTE: the machine administers, but cannot pass
    the test!

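Since the judge is a machine, the whole test reduces to an automated challenge/response loop. A minimal illustrative sketch follows; the function names and the in-memory store are assumptions, not any deployed system.

```python
# Illustrative sketch of the automated judge: the same server generates a
# challenge, stores the expected answer, and later grades the typed response.
import secrets

_pending = {}   # challenge-id -> expected answer (production: an expiring store)

def issue_challenge(render_challenge):
    """render_challenge() returns (answer, image), e.g. make_challenge() above."""
    answer, image = render_challenge()
    cid = secrets.token_hex(8)
    _pending[cid] = answer.lower()
    return cid, image           # ship the image to the user; never the answer

def grade(cid, typed_response):
    """One attempt per challenge: pass only if the response matches."""
    expected = _pending.pop(cid, None)
    return expected is not None and typed_response.strip().lower() == expected
```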
L. von Ahn, M. Blum, N.J. Hopper, J. Langford,
CAPTCHA: Using Hard AI Problems For Security,
Proc. EuroCrypt 2003, Warsaw, Poland, May 4-8,
2003 (to appear).
9
CMU's Gimpy CAPTCHA
  • Randomly pick
  • English words, deformations, occlusions,
    backgrounds, etc
  • Challenge user to type in any three of the words
  • Designed by CMU team, tried out by Yahoo!
  • Problem: users hated it -- it was withdrawn

L. Von Ahn, M. Blum, N. J. Hopper, J. Langford,
The CAPTCHA Web Page, http://www.captcha.net.
10
Yahoo!'s present CAPTCHA: EZ-Gimpy
  • Randomly pick
  • one English word, deformations,
    degradations, occlusions,
  • colored backgrounds
  • Better tolerated by users
  • Now used on a large scale to protect various
    services
  • Well tolerated by users
  • Weaknesses: a single typeface, English lexicon

11
PayPal's CAPTCHA
  • Nothing published
  • Seems to use one typeface
  • Picks, at random
  • letters, overlain pattern
  • Weaknesses: single typeface, simple grid,
  • no image degradations, spaced apart

12
Cropping up everywhere
  • In use today, defending against
  • skewing search-engine rankings (Altavista, 1999)
  • infesting chat rooms, etc (Yahoo!, 2000)
  • gaming financial accounts (PayPal, 2001)
  • robot spamming (SpamArrest, MailBlock, 2002)
  • also Overture, Chinese website, CD-rebate,
    TicketMaster,
  • have you seen others?
  • Coming up over the horizon: they can discourage
  • password guessing
  • denial-of-service attacks
  • ballot stuffing
  • many others
  • Similar problems w/ scrapers also, likely on
    Intranets.

D. P. Baron, eBay and Database Protection, Case
No. P-33, Case Writing Office, Stanford Graduate
School of Business, Stanford Univ., 2001.
13
The Known Limits of Image Understanding
Technology
  • There remains a large gap in ability
  • between human and machine vision systems,
  • even in reading printed text
  • The performance of OCR machines has been
    systematically studied
  • 7-year-olds can consistently do better!
  • Researchers have developed
  • stochastic models of document image
    degradation
  • so we can generate challenging
  • word images pseudorandomly

S. Rice, G. Nagy, T. Nartker, OCR: An Illustrated
Guide to the Frontier, Kluwer Academic
Publishers, 1999.
H. Baird, Document Image Defect Models, in H.
Baird, H. Bunke, K. Yamamoto (Eds.), Structured
Document Image Analysis, Springer-Verlag New
York, 1992.
14
Can You Read These Degraded Images?
Of course you can ... but OCR machines cannot!
15
Experiments by PARC & UCB-CS
  • Pick words at random
  • 70 words commonly used on the Web
  • w/out ascenders or descenders (cf. Spitz)
  • Vary physics-based image degradation parameters:
  • blur, threshold, x-scale -- within certain
    ranges (a rough sketch of these degradations
    appears below)
  • Pick fonts at random from a large set
  • Times Roman (TR), Times Italic (TI),
  • Palatino Roman (PR), Palatino Italic (PI),
  • Courier Roman (CR), Courier Oblique (CO),
    etc
  • Test legibility on
  • ten human volunteers (UC Berkeley CS Dept grad
    students)
  • three OCR machines
  • Expervision TR (E), ABBYY FineReader (A),
    IRIS Reader (I)

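The following is a rough, illustrative sketch (not the PARC/UCB code) of how the three degradation parameters named above can be applied to a rendered word image; the helper name `degrade` and its default values are assumptions.

```python
# Sketch of the three degradation knobs -- blur, threshold, x-scale -- applied
# to a grayscale word image. Parameter values are illustrative only.
import numpy as np
from PIL import Image, ImageFilter

def degrade(word_img, blur=1.0, threshold=0.5, x_scale=0.8):
    w, h = word_img.size
    img = word_img.convert("L").resize((max(1, int(w * x_scale)), h))  # x-scale
    img = img.filter(ImageFilter.GaussianBlur(radius=blur))            # optical blur
    gray = np.asarray(img, dtype=np.float32) / 255.0
    binary = (gray > threshold).astype(np.uint8) * 255                 # global threshold
    return Image.fromarray(binary)

# degraded = degrade(word_image, blur=1.5, threshold=0.4, x_scale=0.7)
```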
16
Results: OCR Accuracy, by machine
Each machine has its peculiar blind spots
17
OCR Accuracy: varying blur & threshold
They share some blind spots
18
PessimalPrint: exploiting image degradations
  • Three OCR machines fail when:
  • blur = 0.0, threshold in 0.02 - 0.08, or
  • threshold = 0.02, any value of blur
  • [figure: sample PessimalPrint word images with the
    corresponding OCR outputs, e.g. ".I", "i1", or no
    output at all (N/A)]

but people find these easy to read
A. Coates, H. Baird, R. Fateman, PessimalPrint:
A Reverse Turing Test, Proc. 6th IAPR Intl
Conf. on Doc. Anal. & Recogn. (ICDAR'01),
Seattle, WA, Sep 10-13, 2001.
19
Jan 2002: High Time for a Workshop!
  • Manuel Blum proposes it, rounds up some key
    speakers
  • Henry Baird offers PARC as venue; Kris Popat
    helps run it
  • Goals
  • Invite known principals: theory, systems,
    engineers, users
  • Describe the state of the art
  • Plan next steps for the field
  • Organization
  • 30 attendees
  • abstracts only, 1-5 pages, no refereeing, no
    archival publication
  • 100% participation: everyone gives a (short)
    talk
  • mixing it up: panel & working group
    discussions
  • 2-1/2 days, lots of breaks for informal
    socializing
  • plenary talk by John McCarthy, Father of AI

20
NSF 1st Intl HIP Workshop, Jan 9-11, 2002, Palo
Alto, CA
21
HIP2002 Participants
  • CMU - SCS, Aladdin Center
  • Manuel Blum, Lenore Blum, Luis von Ahn, John
    Langford, Guy Blelloch, Nick Hopper, Ke Yang,
    Brighten Godfrey, Bartosz Przydatek, Rachel Rue
  • PARC - SPIA/Security/Theory
  • Henry Baird, Kris Popat, Tom Breuel, Prateek
    Sarkar, Tom Berson, Dirk Balfanz, David Goldberg
  • UCB - CS & SIMS
  • Richard Fateman, Allison Coates, Jitendra Malik,
    Doug Tygar, Alma Whitten, Rachna Dhamija, Monica
    Chew, Adrian Perrig, Dawn Song
  • RPI
  • George Nagy
  • Stanford
  • John McCarthy
  • NSF
  • Robert Sloan
  • Altavista
  • Andrei Broder
  • Yahoo!
  • Udi Manber
  • Bell Labs
  • Dan Lopresti
  • IBM T.J. Watson
  • Charles Bennett
  • InterTrust Star Labs
  • Stuart Haber
  • City Univ. of Hong Kong
  • Nancy Chan
  • Weizmann Institute
  • Moni Naor
  • RSA Security Laboratories
  • Ari Juels
  • Document Recognition Techs, Inc
  • Larry Spitz

22
Variations & Generalizations
  • CAPTCHA
  • Completely Automatic Public Turing test to tell
    Computers and Humans Apart
  • HUMANOID
  • Text-based dialogue which an individual can use
    to authenticate that he/she is himself/herself
    (naked in a glass bubble)
  • PHONOID
  • Individual authentication using spoken language
  • Human Interactive Proof (HIP)
  • An automatically administered challenge/response
    protocol
  • allowing a person to authenticate him/herself
    as belonging to a certain group over a network
    without the burden of passwords,
  • biometrics, mechanical aids, or special
    training.

23
Highlights
  • Theory
  • some text-based CAPTCHAs are provably breakable
  • Ability Gaps
  • vision: gestalt, segmentation, noise immunity,
    style consistency
  • speech: noise of many kinds, clutter (cocktail
    party effect)
  • intelligence: puzzles, analogical reasoning,
    weak logic
  • gestures, reflexes, common knowledge, ...
  • Applications
  • subtle system-level vulnerabilities
  • aggressive arms race with shadowy enemies

24
Funding & Partnerships
  • NSF
  • Robert Sloan, Dir, Theory of Computing Pgm
  • strongly supportive of this newborn field
  • encouraged grant proposals
  • Yahoo!
  • willing to run field trials
  • user acceptance laboratory
  • able to detect intrusion

25
Disciplines
  • Participating
  • Cryptography
  • Security
  • Document Image Analysis
  • Computer Vision
  • Artificial Intelligence
  • Needed
  • Cognitive Science
  • Psychophysics (esp. of Reading)
  • Biometrics
  • eCommerce, Business
  • .?

26
Weaknesses of Existing Reading-Based CAPTCHAs
  • English lexicon is too predictable
  • dictionaries are too small
  • only 1.2 bits of entropy per character (cf.
    Shannon) -- see the arithmetic sketch below
  • Physics-based image degradations vulnerable
  • to well-studied image restoration attacks
  • Complex images irritate people
  • even when they can read them
  • need user-tolerance experiments

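The 1.2-bits figure is easy to sanity-check: a word drawn uniformly from a small lexicon carries only log2(lexicon size) bits in total, spread over its letters. A tiny arithmetic sketch, using the 70-word lexicon and roughly 5-letter words from the experiments described earlier (the average length is an assumption):

```python
# Back-of-the-envelope entropy of a challenge word drawn uniformly from a small
# lexicon, versus a random letter string of the same length.
import math

lexicon_size = 70                                            # words used in the experiments
avg_word_len = 5                                             # assumed average length
per_char_lexicon = math.log2(lexicon_size) / avg_word_len    # ~1.2 bits per character
per_char_random = math.log2(26)                              # ~4.7 bits per character
print(f"lexicon word: {per_char_lexicon:.1f} bits/char; "
      f"random string: {per_char_random:.1f} bits/char")
```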
27
Strengths of Human Reading
  • Literature on the psychophysics of reading is
    relevant
  • familiarity helps, e.g. English words
  • optimal word-image size (subtended angle)
  • is known (0.3-2 degrees)
  • optimal contrast conditions known
  • other factors measured for the best performance
  • to achieve and sustain critical reading speed
  • BUT gives no answer to:
  • where's the optimal comfort zone?

G. E. Legge, D. G. Pelli, G. S. Rubin, M. M.
Schleske, Psychophysics of Reading I: Normal
Vision, Vision Research 25(2), 1985.
J. Grainger & J. Segui, Neighborhood Frequency
Effects in Visual Word Recognition, Perception &
Psychophysics 47, 1990.
28
Designing a Stronger CAPTCHA: BaffleText
principles
  • Nonsense words.
  • generate pronounceable, but not spellable,
    words
  • using a variable-length character n-gram
    Markov model (a simplified sketch appears below)
  • they look familiar, but aren't in any lexicon,
    e.g.
  • ablithan, wouquire, quasis
  • Gestalt perception.
  • force inference of a whole word-image
  • from fragmentary or occluded characters,
    e.g.
  • using a single familiar typeface also helps people

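A simplified sketch of the nonsense-word idea: a fixed-order character Markov model trained on a small word list, rejecting anything that is itself a real word. The BaffleText design uses a variable-length n-gram model; the training words here are placeholders.

```python
# Sample "pronounceable non-words" from a character bigram Markov model.
import random
from collections import defaultdict

def train(words, order=2):
    model = defaultdict(list)
    for w in words:
        padded = "^" * order + w.lower() + "$"
        for i in range(len(padded) - order):
            model[padded[i:i + order]].append(padded[i + order])
    return model

def sample_nonword(model, lexicon, order=2, min_len=5, max_len=10):
    while True:
        out, ctx = "", "^" * order
        while len(out) < max_len:
            nxt = random.choice(model[ctx])
            if nxt == "$":
                break
            out += nxt
            ctx = ctx[1:] + nxt
        if len(out) >= min_len and out not in lexicon:
            return out        # looks familiar, but is in no dictionary

lexicon = {"ability", "require", "question", "quality", "answer", "banquet"}
model = train(lexicon)
print(sample_nonword(model, lexicon))   # e.g. something like "requestion"
```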
M. Chew & H. S. Baird, BaffleText: A Human
Interactive Proof, Proc. SPIE/IS&T Conf. on
Document Recognition & Retrieval X, Santa Clara,
CA, January 23-24, 2003.
29
Mask Degradations
  • Parameters of the pseudorandom mask generator
    (sketched below)
  • shape type: square, circle, ellipse, mixed
  • density: black-area / whole-area
  • range of radii of shapes

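A rough sketch of such a mask generator, scattering shapes until the target black-area density is reached; the sizes, density value, and helper name are illustrative assumptions.

```python
# Pseudorandom occlusion mask: scatter circles/squares until the
# black-area / whole-area fraction reaches the requested density.
import random
import numpy as np
from PIL import Image, ImageDraw

def make_mask(size=(300, 100), shape="mixed", density=0.15, radii=(4, 10)):
    mask = Image.new("L", size, color=255)            # white = keep, black = occlude
    draw = ImageDraw.Draw(mask)
    while True:
        if (np.asarray(mask) < 128).mean() >= density:  # reached target density
            return mask
        r = random.randint(*radii)
        x, y = random.randrange(size[0]), random.randrange(size[1])
        box = (x - r, y - r, x + r, y + r)
        s = random.choice(("circle", "square")) if shape == "mixed" else shape
        if s == "circle":
            draw.ellipse(box, fill=0)
        else:
            draw.rectangle(box, fill=0)

# The mask can then be composited with a word image, e.g. by taking the
# pixelwise minimum (black-out) or maximum (white-out) of the two images.
```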
30
BaffleText Experiment at PARC
  • Goal: map the margins of accurate & comfortable
  • human reading on this family of
    images
  • Metrics
  • objective difficulty: accuracy
  • subjective difficulty: rating
  • response time
  • exit survey: how tolerable overall
  • Participation
  • 41 individual sessions
  • over 1,200 challenge/response trials
  • 18 exit surveys

31
BaffleText challenge webpage
32
BaffleText user rating
33
User Acceptance
  • Subjects who say they're willing to solve a
    BaffleText challenge
  • 17%: every time they send email
  • 39%: if it cut spam by 10x
  • 89%: every time they register for an
    e-commerce site
  • 94%: if it led to more trustworthy
    recommendations
  • 100%: every time they register for an email
    account

Out of 18 responses to the exit survey.
34
Subjective difficulty tracks objective
difficulty
35
How to engineer BaffleText
  • When we generate a challenge,
  • need to be able to estimate its difficulty
  • throw away if too easy or too hard
  • Apply an idea from the psychophysics of reading
  • image complexity metric: how hard to read
  • simple to compute: perimeter² / black-area
    (sketched below)

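A minimal sketch of that complexity check on a binarized challenge image, interpreting the metric as perimeter squared over black (ink) area and estimating the perimeter with a simple 4-neighbor boundary count; the boundary estimate and the accept/reject thresholds (the 50-100 guideline given on a later slide) are implementation assumptions.

```python
# Perimeter^2 / black-area complexity metric, used to accept or reject a
# generated challenge before it is shown to a user.
import numpy as np
from PIL import Image

def perimetric_complexity(img, thresh=128):
    ink = np.asarray(img.convert("L")) < thresh       # True where the pixel is black
    area = ink.sum()
    if area == 0:
        return 0.0
    padded = np.pad(ink, 1, constant_values=False)
    boundary = ink & ~(padded[:-2, 1:-1] & padded[2:, 1:-1] &
                       padded[1:-1, :-2] & padded[1:-1, 2:])   # black pixel w/ a white 4-neighbor
    return boundary.sum() ** 2 / area

def acceptable(img, lo=50, hi=100):
    return lo <= perimetric_complexity(img) <= hi     # otherwise regenerate the challenge
```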
36
Image complexity predicts objective difficulty
37
Image complexity predicts subjective difficulty
38
Engineering guidelines
  • For high performance, image complexity
  • should fall in the range 50-100 e.g.
  • Within this regime, BaffleText performs well
  • 100% of human subjects willing to try to read it
  • 89% accuracy by humans
  • 0% accuracy by commercial OCR
  • 3.3 difficulty rating, out of 10 (on average)
  • 8.7 seconds / trial on average

39
The latest serious (known or published) attack
G. Mori & J. Malik, Recognizing Objects in
Adversarial Clutter, submitted to CVPR'03,
Madison, WI, June 16-22, 2003.
  • Greg Mori & Jitendra Malik (UCB-CS)
  • Generalized Shape Context CV method
  • requires known lexicon; else, fails completely
  • expects known font (or fonts); else, does worse
  • Results of Mori-Malik attacks (Dec 2002) with
  • foreknowledge of both lexicon and font

40
BaffleText: the strongest known CAPTCHA?
  • Resists many known attacks
  • physics-based image restoration
  • recognizing into a lexicon
  • typeface targeting
  • segmenting then recognizing
  • Exploits hard-to-automate human cognition powers
  • Gestalt perception
  • semi-linguistic familiarity
  • style consistency

41
PARC's Leadership Role
  • Published 1st refereed paper on CAPTCHAs
  • A. L. Coates, H. S. Baird, R. Fateman,
    Pessimal Print: a Reverse Turing Test, Proc.
    6th IAPR Intl Conf. on Document Analysis &
    Recognition, Seattle, WA, Sept. 10-13, 2001.
  • Hosted first professional event: 1st NSF Intl
    Workshop on HIPs, Jan. 9-11, 2002, Palo Alto, CA.
  • Plays both offense & defense
  • attacks CAPTCHAs & builds high-performance OCR
    systems
  • builds strong CAPTCHAs
  • Validates using human-factors research
  • human-subject trials measuring accuracy &
    tolerance
  • PARC's interdisciplinary tradition: social &
    computer sciences

42
The Arms Race
  • Will serious technical attacks be launched?
  • spam kings make millions
  • two spam-blocking e-commerce firms use CAPTCHAs
  • How long can a CAPTCHA stand against attack?
  • especially if its algorithms are published or
    guessed
  • Keep a pipeline of defenses in reserve
  • a long partnership between research & users

43
Lots of Open Research Questions
  • What are the most intractable obstacles to
    machine vision?
  • segmentation, occlusion, degradations, ?
  • Under what conditions is human reading most
    robust?
  • linguistic & semantic context, Gestalt, style
    consistency?
  • Where are ability gaps located?
  • quantitatively, not just qualitatively
  • How to generate challenges strictly within
    ability gaps?
  • fully automatically
  • an indefinitely long sequence of distinct
    challenges

44
HIP Research Community
  • HIP Website at Aladdin Center, CMU SCS
  • www.captcha.net
  • Volunteers for a CAPTCHA usability test?
  • PARC CAPTCHA experimental software tools
  • FreeType-based, C, C++ for Linux etc. (T. Breuel)
  • Doc. image degradation generator (H. Baird)
  • New Gestalt-inspired degradations (M. Chew, UCB)
  • PHP4 code for CAPTCHA test web site (M. Chew, M.
    Luk)
  • would a free GPL license be acceptable?
  • A 2nd HIP Workshop soon?

45
Alan Turing might have enjoyed the irony
  • A technical problem machine reading
  • which he thought would be easy,
  • has resisted attack for 50 years, and
  • now allows the first widespread
  • practical use of variants of
  • his test for artificial intelligence.

46
Contact
  • Henry S. Baird
  • baird_at_parc.com
  • www.parc.com/baird

47
(No Transcript)
48
(No Transcript)
49
(No Transcript)
50
OCR Accuracy: varying blur
51
OCR Accuracy: varying blur
52
OCR Accuracy: varying blur
53
Yahoo!'s current CAPTCHA
  • Randomly pick
  • one English word, typeface, distortions,
    occlusions, background
  • More tolerable to users
  • Used on a large scale to protect various services