Transcript and Presenter's Notes

Title: Image Understanding


1
Image Understanding & Web Security
  • Henry Baird
  • Joint work with
  • Richard Fateman, Allison Coates, Kris Popat,
  • Monica Chew, Tom Breuel, Mark Luk

2
A fast-emerging research topic
  • Human Interactive Proofs (definition later)
  • first instance in 1999
  • research took hold in the CS security & theory
    field first
  • intersects image understanding, cog sci, etc etc
  • fast attracting researchers, engineers, users
  • This talk:
  • A brief history of HIPs
  • Professional activities, so far -- incl. the 1st
    Intl Workshop
  • Existing systems -- w/ my critiques
  • Next steps for the field
  • In detail: PARC's PessimalPrint & BaffleText

H. Baird & K. Popat, Web Security & Document
Image Analysis, in J. Hu & A. Antonacopoulos
(Eds.), Web Document Analysis, World Scientific,
2003 (in press).
3
Early rumblings
  • 90s: spammers trolling for email addresses
  • in defense, people disguise them, e.g.
  • baird at parc dot com
  • 1997: abuse of Add-URL feature at AltaVista
  • some write programs to add their URL many times
  • skewed the popularity rankings
  • Andrei Broder et al (then at DEC SRC)
  • a user action which is legitimate when performed
    once
  • becomes abusive when repeated many times
  • no effective legal recourse
  • how to block or slow down these programs

4
The first known instance: Altavista's Add-URL
filter
An image of text, not ASCII
  • 1999: ransom note filter
  • randomly pick letters, fonts, rotations & render
    as an image (sketched below)
  • every user required to read and type it in
    correctly
  • reduced spam Add-URL submissions by over 95%
  • Weaknesses: isolated chars, filterable noise,
    affine deformations

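A minimal sketch of this kind of challenge generator, assuming Pillow and a couple of system fonts; it illustrates the idea only and is not the patented AltaVista implementation. Font paths, parameter ranges, and the helper name are assumptions.

```python
# Sketch of a "ransom note" challenge: random letters in random fonts and
# rotations, rendered as an image rather than ASCII. Illustrative only.
import random
from PIL import Image, ImageDraw, ImageFont

FONT_PATHS = ["DejaVuSans.ttf", "DejaVuSerif.ttf"]   # assumed to be installed
LETTERS = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"          # skip easily-confused glyphs

def make_challenge(n_chars=6, canvas=(400, 110)):
    answer = "".join(random.choice(LETTERS) for _ in range(n_chars))
    img = Image.new("L", canvas, color=255)            # white page
    x = 10
    for ch in answer:
        font = ImageFont.truetype(random.choice(FONT_PATHS), random.randint(32, 48))
        tile = Image.new("L", (60, 60), color=255)     # render one glyph per tile
        ImageDraw.Draw(tile).text((12, 6), ch, font=font, fill=0)
        tile = tile.rotate(random.uniform(-30, 30), expand=True, fillcolor=255)
        img.paste(tile, (x, random.randint(0, 25)))
        x += random.randint(38, 50)
    return answer, img    # the server keeps `answer`; the user sees only `img`

# answer, img = make_challenge(); img.save("challenge.png")
```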
M. D. Lillibridge, M. Abadi, K. Bharat, A. Z.
Broder, Method for Selectively Restricting
Access to Computer Systems, U.S. Patent No.
6,195,698, Issued February 27, 2001.
5
Yahoo!'s Chat Room Problem
  • September 2000
  • Udi Manber asked Prof. Manuel Blum's group at
    CMU-SCS
  • programs impersonate people in chat rooms,
  • then hand out ads ugh!
  • how can all machines be denied access to a Web
    site
  • without inconveniencing any human users?
  • I.e., how to distinguish between machines and
    people on-line
  • some variation on Turing tests!

6
Alan Turing (1912-1954)
  • 1936: a universal model of computation
  • 1940s: helped break Enigma (U-boat) cipher
  • 1949: first serious uses of a working computer
  • including plans to read printed
    text
  • (he expected it would be easy)
  • 1950: proposed strong test for machine
    intelligence

7
Turing Tests
  • How to judge that a machine can think?
  • play an imitation game conducted via teletypes
  • a human judge & two invisible interlocutors
  • a human
  • a machine pretending to be human
  • after asking any questions (challenges) he/she
  • wishes, the judge decides which is human
  • failure to decide correctly would be convincing
  • evidence of machine intelligence (Turing
    asserted)
  • Modern GUIs invite richer challenges than
    teletypes.

A. Turing, Computing Machinery and Intelligence,
Mind, Vol. 59(236), 1950.
8
CAPTCHAs: Completely Automated Public Turing
Tests to Tell Computers & Humans Apart
(M. Blum, L. A. von Ahn, J. Langford, et al, CMU
SCS)
  • challenges can be generated & graded
    automatically
  • (i.e. the judge is a machine; a minimal grading
    loop is sketched below)
  • accepts virtually all humans, quickly & easily
  • rejects virtually all machines
  • resists automatic attack for many years
  • (even assuming that its algorithms are
    known?)
  • NOTE: the machine administers, but cannot pass
    the test!

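Since the judge is a machine, the whole test reduces to an automated challenge/response loop. A minimal illustrative sketch follows; the function names and the in-memory store are assumptions, not any deployed system.

```python
# Illustrative sketch of the automated judge: the same server generates a
# challenge, stores the expected answer, and later grades the typed response.
import secrets

_pending = {}   # challenge-id -> expected answer (production: an expiring store)

def issue_challenge(render_challenge):
    """render_challenge() returns (answer, image), e.g. make_challenge() above."""
    answer, image = render_challenge()
    cid = secrets.token_hex(8)
    _pending[cid] = answer.lower()
    return cid, image           # ship the image to the user; never the answer

def grade(cid, typed_response):
    """One attempt per challenge: pass only if the response matches."""
    expected = _pending.pop(cid, None)
    return expected is not None and typed_response.strip().lower() == expected
```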
L. von Ahn, M. Blum, N.J. Hopper, J. Langford,
CAPTCHA: Using Hard AI Problems For Security,
Proc. EuroCrypt 2003, Warsaw, Poland, May 4-8,
2003 (to appear).
9
CMU's Gimpy CAPTCHA
  • Randomly pick
  • English words, deformations, occlusions,
    backgrounds, etc
  • Challenge user to type in any three of the words
  • Designed by CMU team, tried out by Yahoo!
  • Problem: users hated it -- it was withdrawn

L. Von Ahn, M. Blum, N. J. Hopper, J. Langford,
The CAPTCHA Web Page, http://www.captcha.net.
10
Yahoo!'s present CAPTCHA: EZ-Gimpy
  • Randomly pick
  • one English word, deformations,
    degradations, occlusions,
  • colored backgrounds
  • Better tolerated by users
  • Now used on a large scale to protect various
    services
  • Well tolerated by users
  • Weaknesses: a single typeface, English lexicon

11
PayPal's CAPTCHA
  • Nothing published
  • Seems to use one typeface
  • Picks, at random
  • letters, overlain pattern
  • Weaknesses: single typeface, simple grid,
  • no image degradations, spaced apart

12
Cropping up everywhere
  • In use today, defending against
  • skewing search-engine rankings (Altavista, 1999)
  • infesting chat rooms, etc (Yahoo!, 2000)
  • gaming financial accounts (PayPal, 2001)
  • robot spamming (SpamArrest, MailBlock, 2002)
  • also Overture, Chinese website, CD-rebate,
    TicketMaster,
  • have you seen others?
  • Coming up over the horizon: they can discourage
  • password guessing
  • denial-of-service attacks
  • ballot stuffing
  • many others
  • Similar problems w/ scrapers also, likely on
    Intranets.

D. P. Baron, eBay and Database Protection, Case
No. P-33, Case Writing Office, Stanford Graduate
School of Business, Stanford Univ., 2001.
13
The Known Limits of Image Understanding
Technology
  • There remains a large gap in ability
  • between human and machine vision systems,
  • even in reading printed text
  • The performance of OCR machines has been
    systematically studied
  • 7-year-olds can consistently do better!
  • Researchers have developed
  • stochastic models of document image
    degradation
  • so we can generate challenging
  • word images pseudorandomly

S. Rice, G. Nagy, T. Nartker, OCR: An Illustrated
Guide to the Frontier, Kluwer Academic
Publishers, 1999.
H. Baird, Document Image Defect Models, in H.
Baird, H. Bunke, K. Yamamoto (Eds.), Structured
Document Image Analysis, Springer-Verlag New
York, 1992.
14
Can You Read These Degraded Images?
Of course you can ... but OCR machines cannot!
15
Experiments by PARC & UCB-CS
  • Pick words at random
  • 70 words commonly used on the Web
  • w/out ascenders or descenders (cf. Spitz)
  • Vary physics-based image degradation parameters:
  • blur, threshold, x-scale -- within certain
    ranges (a rough sketch of these degradations
    appears below)
  • Pick fonts at random from a large set
  • Times Roman (TR), Times Italic (TI),
  • Palatino Roman (PR), Palatino Italic (PI),
  • Courier Roman (CR), Courier Oblique (CO),
    etc
  • Test legibility on
  • ten human volunteers (UC Berkeley CS Dept grad
    students)
  • three OCR machines
  • Expervision TR (E), ABBYY FineReader (A),
    IRIS Reader (I)

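The following is a rough, illustrative sketch (not the PARC/UCB code) of how the three degradation parameters named above can be applied to a rendered word image; the helper name `degrade` and its default values are assumptions.

```python
# Sketch of the three degradation knobs -- blur, threshold, x-scale -- applied
# to a grayscale word image. Parameter values are illustrative only.
import numpy as np
from PIL import Image, ImageFilter

def degrade(word_img, blur=1.0, threshold=0.5, x_scale=0.8):
    w, h = word_img.size
    img = word_img.convert("L").resize((max(1, int(w * x_scale)), h))  # x-scale
    img = img.filter(ImageFilter.GaussianBlur(radius=blur))            # optical blur
    gray = np.asarray(img, dtype=np.float32) / 255.0
    binary = (gray > threshold).astype(np.uint8) * 255                 # global threshold
    return Image.fromarray(binary)

# degraded = degrade(word_image, blur=1.5, threshold=0.4, x_scale=0.7)
```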
16
Results: OCR Accuracy, by machine
Each machine has its peculiar blind spots
17
OCR Accuracy: varying blur & threshold
They share some blind spots
18
PessimalPrint: exploiting image degradations
  • Three OCR machines fail when:
  • blur = 0.0, threshold in 0.02 - 0.08, or
  • threshold = 0.02, any value of blur
  • [figure: sample PessimalPrint word images with the
    corresponding OCR outputs, e.g. ".I", "i1", or no
    output at all (N/A)]

but people find these easy to read
A. Coates, H. Baird, R. Fateman, PessimalPrint:
A Reverse Turing Test, Proc. 6th IAPR Intl
Conf. on Doc. Anal. & Recogn. (ICDAR'01),
Seattle, WA, Sep 10-13, 2001.
19
Jan 2002: High Time for a Workshop!
  • Manuel Blum proposes it, rounds up some key
    speakers
  • Henry Baird offers PARC as venue; Kris Popat
    helps run it
  • Goals
  • Invite known principals: theory, systems,
    engineers, users
  • Describe the state of the art
  • Plan next steps for the field
  • Organization
  • 30 attendees
  • abstracts only, 1-5 pages, no refereeing, no
    archival publication
  • 100% participation: everyone gives a (short)
    talk
  • mixing it up: panel & working group
    discussions
  • 2-1/2 days, lots of breaks for informal
    socializing
  • plenary talk by John McCarthy, Father of AI

20
NSF 1st Intl HIP Workshop, Jan 9-11, 2002, Palo
Alto, CA
21
HIP2002 Participants
  • CMU - SCS, Aladdin Center
  • Manuel Blum, Lenore Blum, Luis von Ahn, John
    Langford, Guy Blelloch, Nick Hopper, Ke Yang,
    Brighten Godfrey, Bartosz Przydatek, Rachel Rue
  • PARC - SPIA/Security/Theory
  • Henry Baird, Kris Popat, Tom Breuel, Prateek
    Sarkar, Tom Berson, Dirk Balfanz, David Goldberg
  • UCB - CS & SIMS
  • Richard Fateman, Allison Coates, Jitendra Malik,
    Doug Tygar, Alma Whitten, Rachna Dhamija, Monica
    Chew, Adrian Perrig, Dawn Song
  • RPI
  • George Nagy
  • Stanford
  • John McCarthy
  • NSF
  • Robert Sloan
  • Altavista
  • Andrei Broder
  • Yahoo!
  • Udi Manber
  • Bell Labs
  • Dan Lopresti
  • IBM T.J. Watson
  • Charles Bennett
  • InterTrust Star Labs
  • Stuart Haber
  • City Univ. of Hong Kong
  • Nancy Chan
  • Weizmann Institute
  • Moni Naor
  • RSA Security Laboratories
  • Ari Juels
  • Document Recognition Techs, Inc
  • Larry Spitz

22
Variations & Generalizations
  • CAPTCHA
  • Completely Automatic Public Turing test to tell
    Computers and Humans Apart
  • HUMANOID
  • Text-based dialogue which an individual can use
    to authenticate that he/she is himself/herself
    (naked in a glass bubble)
  • PHONOID
  • Individual authentication using spoken language
  • Human Interactive Proof (HIP)
  • An automatically administered challenge/response
    protocol
  • allowing a person to authenticate him/herself
    as belonging to a certain group over a network
    without the burden of passwords,
  • biometrics, mechanical aids, or special
    training.

23
Highlights
  • Theory
  • some text-based CAPTCHAs are provably breakable
  • Ability Gaps
  • vision: gestalt, segmentation, noise immunity,
    style consistency
  • speech: noise of many kinds, clutter (cocktail
    party effect)
  • intelligence: puzzles, analogical reasoning,
    weak logic
  • gestures, reflexes, common knowledge, ...
  • Applications
  • subtle system-level vulnerabilities
  • aggressive arms race with shadowy enemies

24
Funding & Partnerships
  • NSF
  • Robert Sloan, Dir, Theory of Computing Pgm
  • strongly supportive of this newborn field
  • encouraged grant proposals
  • Yahoo!
  • willing to run field trials
  • user acceptance laboratory
  • able to detect intrusion

25
Disciplines
  • Participating
  • Cryptography
  • Security
  • Document Image Analysis
  • Computer Vision
  • Artificial Intelligence
  • Needed
  • Cognitive Science
  • Psychophysics (esp. of Reading)
  • Biometrics
  • eCommerce, Business
  • .?

26
Weaknesses of Existing Reading-Based CAPTCHAs
  • English lexicon is too predictable
  • dictionaries are too small
  • only 1.2 bits of entropy per character (cf.
    Shannon) -- see the arithmetic sketch below
  • Physics-based image degradations vulnerable
  • to well-studied image restoration attacks
  • Complex images irritate people
  • even when they can read them
  • need user-tolerance experiments

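The 1.2-bits figure is easy to sanity-check: a word drawn uniformly from a small lexicon carries only log2(lexicon size) bits in total, spread over its letters. A tiny arithmetic sketch, using the 70-word lexicon and roughly 5-letter words from the experiments described earlier (the average length is an assumption):

```python
# Back-of-the-envelope entropy of a challenge word drawn uniformly from a small
# lexicon, versus a random letter string of the same length.
import math

lexicon_size = 70                                            # words used in the experiments
avg_word_len = 5                                             # assumed average length
per_char_lexicon = math.log2(lexicon_size) / avg_word_len    # ~1.2 bits per character
per_char_random = math.log2(26)                              # ~4.7 bits per character
print(f"lexicon word: {per_char_lexicon:.1f} bits/char; "
      f"random string: {per_char_random:.1f} bits/char")
```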
27
Strengths of Human Reading
  • Literature on the psychophysics of reading is
    relevant
  • familiarity helps, e.g. English words
  • optimal word-image size (subtended angle)
  • is known (0.3-2 degrees)
  • optimal contrast conditions known
  • other factors measured for the best performance
  • to achieve and sustain critical reading speed
  • BUT gives no answer to:
  • where's the optimal comfort zone?

G. E. Legge, D. G. Pelli, G. S. Rubin, M. M.
Schleske, Psychophysics of Reading I: Normal
Vision, Vision Research 25(2), 1985.
J. Grainger & J. Segui, Neighborhood Frequency
Effects in Visual Word Recognition, Perception &
Psychophysics 47, 1990.
28
Designing a Stronger CAPTCHA: BaffleText
principles
  • Nonsense words.
  • generate pronounceable, but not spellable,
    words
  • using a variable-length character n-gram
    Markov model (a simplified sketch appears below)
  • they look familiar, but aren't in any lexicon,
    e.g.
  • ablithan, wouquire, quasis
  • Gestalt perception.
  • force inference of a whole word-image
  • from fragmentary or occluded characters,
    e.g.
  • using a single familiar typeface also helps people

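A simplified sketch of the nonsense-word idea: a fixed-order character Markov model trained on a small word list, rejecting anything that is itself a real word. The BaffleText design uses a variable-length n-gram model; the training words here are placeholders.

```python
# Sample "pronounceable non-words" from a character bigram Markov model.
import random
from collections import defaultdict

def train(words, order=2):
    model = defaultdict(list)
    for w in words:
        padded = "^" * order + w.lower() + "$"
        for i in range(len(padded) - order):
            model[padded[i:i + order]].append(padded[i + order])
    return model

def sample_nonword(model, lexicon, order=2, min_len=5, max_len=10):
    while True:
        out, ctx = "", "^" * order
        while len(out) < max_len:
            nxt = random.choice(model[ctx])
            if nxt == "$":
                break
            out += nxt
            ctx = ctx[1:] + nxt
        if len(out) >= min_len and out not in lexicon:
            return out        # looks familiar, but is in no dictionary

lexicon = {"ability", "require", "question", "quality", "answer", "banquet"}
model = train(lexicon)
print(sample_nonword(model, lexicon))   # e.g. something like "requestion"
```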
M. Chew & H. S. Baird, BaffleText: A Human
Interactive Proof, Proc. SPIE/IS&T Conf. on
Document Recognition & Retrieval X, Santa Clara,
CA, January 23-24, 2003.
29
Mask Degradations
  • Parameters of the pseudorandom mask generator
    (sketched below)
  • shape type: square, circle, ellipse, mixed
  • density: black-area / whole-area
  • range of radii of shapes

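A rough sketch of such a mask generator, scattering shapes until the target black-area density is reached; the sizes, density value, and helper name are illustrative assumptions.

```python
# Pseudorandom occlusion mask: scatter circles/squares until the
# black-area / whole-area fraction reaches the requested density.
import random
import numpy as np
from PIL import Image, ImageDraw

def make_mask(size=(300, 100), shape="mixed", density=0.15, radii=(4, 10)):
    mask = Image.new("L", size, color=255)            # white = keep, black = occlude
    draw = ImageDraw.Draw(mask)
    while True:
        if (np.asarray(mask) < 128).mean() >= density:  # reached target density
            return mask
        r = random.randint(*radii)
        x, y = random.randrange(size[0]), random.randrange(size[1])
        box = (x - r, y - r, x + r, y + r)
        s = random.choice(("circle", "square")) if shape == "mixed" else shape
        if s == "circle":
            draw.ellipse(box, fill=0)
        else:
            draw.rectangle(box, fill=0)

# The mask can then be composited with a word image, e.g. by taking the
# pixelwise minimum (black-out) or maximum (white-out) of the two images.
```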
30
BaffleText Experiment at PARC
  • Goal: map the margins of accurate & comfortable
  • human reading on this family of
    images
  • Metrics
  • objective difficulty: accuracy
  • subjective difficulty: rating
  • response time
  • exit survey: how tolerable overall
  • Participation
  • 41 individual sessions
  • over 1,200 challenge/response trials
  • 18 exit surveys

31
BaffleText challenge webpage
32
BaffleText user rating
33
User Acceptance
  • Subjects who say they're willing to solve a
    BaffleText challenge
  • 17%: every time they send email
  • 39%: if it cut spam by 10x
  • 89%: every time they register for an
    e-commerce site
  • 94%: if it led to more trustworthy
    recommendations
  • 100%: every time they register for an email
    account

Out of 18 responses to the exit survey.
34
Subjective difficulty tracks objective
difficulty
35
How to engineer BaffleText
  • When we generate a challenge,
  • need to be able to estimate its difficulty
  • throw away if too easy or too hard
  • Apply an idea from the psychophysics of reading
  • image complexity metric: how hard to read
  • simple to compute: perimeter² / black-area
    (sketched below)

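A minimal sketch of that complexity check on a binarized challenge image, interpreting the metric as perimeter squared over black (ink) area and estimating the perimeter with a simple 4-neighbor boundary count; the boundary estimate and the accept/reject thresholds (the 50-100 guideline given on a later slide) are implementation assumptions.

```python
# Perimeter^2 / black-area complexity metric, used to accept or reject a
# generated challenge before it is shown to a user.
import numpy as np
from PIL import Image

def perimetric_complexity(img, thresh=128):
    ink = np.asarray(img.convert("L")) < thresh       # True where the pixel is black
    area = ink.sum()
    if area == 0:
        return 0.0
    padded = np.pad(ink, 1, constant_values=False)
    boundary = ink & ~(padded[:-2, 1:-1] & padded[2:, 1:-1] &
                       padded[1:-1, :-2] & padded[1:-1, 2:])   # black pixel w/ a white 4-neighbor
    return boundary.sum() ** 2 / area

def acceptable(img, lo=50, hi=100):
    return lo <= perimetric_complexity(img) <= hi     # otherwise regenerate the challenge
```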
36
Image complexity predicts objective difficulty
37
Image complexity predicts subjective difficulty
38
Engineering guidelines
  • For high performance, image complexity
  • should fall in the range 50-100 e.g.
  • Within this regime, BaffleText performs well
  • 100% of human subjects willing to try to read it
  • 89% accuracy by humans
  • 0% accuracy by commercial OCR
  • 3.3 difficulty rating, out of 10 (on average)
  • 8.7 seconds / trial on average

39
The latest serious (known or published) attack
G. Mori & J. Malik, Recognizing Objects in
Adversarial Clutter, submitted to CVPR'03,
Madison, WI, June 16-22, 2003.
  • Greg Mori & Jitendra Malik (UCB-CS)
  • Generalized Shape Context CV method
  • requires known lexicon; else, fails completely
  • expects known font (or fonts); else, does worse
  • Results of Mori-Malik attacks (Dec 2002) with
  • foreknowledge of both lexicon and font

40
BaffleText: the strongest known CAPTCHA?
  • Resists many known attacks
  • physics-based image restoration
  • recognizing into a lexicon
  • typeface targeting
  • segmenting then recognizing
  • Exploits hard-to-automate human cognition powers
  • Gestalt perception
  • semi-linguistic familiarity
  • style consistency

41
PARC's Leadership Role
  • Published 1st refereed paper on CAPTCHAs
  • A. L. Coates, H. S. Baird, R. Fateman,
    Pessimal Print: a Reverse Turing Test, Proc.
    6th IAPR Intl Conf. on Document Analysis &
    Recognition, Seattle, WA, Sept. 10-13, 2001.
  • Hosted first professional event: 1st NSF Intl
    Workshop on HIPs, Jan. 9-11, 2002, Palo Alto, CA.
  • Plays both offense & defense
  • attacks CAPTCHAs & builds high-performance OCR
    systems
  • builds strong CAPTCHAs
  • Validates using human-factors research
  • human-subject trials measuring accuracy &
    tolerance
  • PARC's interdisciplinary tradition: social &
    computer sciences

42
The Arms Race
  • Will serious technical attacks be launched?
  • spam kings make millions
  • two spam-blocking e-commerce firms use CAPTCHAs
  • How long can a CAPTCHA stand against attack?
  • especially if its algorithms are published or
    guessed
  • Keep a pipeline of defenses in reserve
  • a long partnership between research & users

43
Lots of Open Research Questions
  • What are the most intractable obstacles to
    machine vision?
  • segmentation, occlusion, degradations, ?
  • Under what conditions is human reading most
    robust?
  • linguistic & semantic context, Gestalt, style
    consistency?
  • Where are ability gaps located?
  • quantitatively, not just qualitatively
  • How to generate challenges strictly within
    ability gaps?
  • fully automatically
  • an indefinitely long sequence of distinct
    challenges

44
HIP Research Community
  • HIP Website at Aladdin Center, CMU SCS
  • www.captcha.net
  • Volunteers for a CAPTCHA usability test?
  • PARC CAPTCHA experimental software tools
  • FreeType-based, C, C++ for Linux etc. (T. Breuel)
  • Doc. image degradation generator (H. Baird)
  • New Gestalt-inspired degradations (M. Chew, UCB)
  • PHP4 code for CAPTCHA test web site (M. Chew, M.
    Luk)
  • would a free GPL license be acceptable?
  • A 2nd HIP Workshop soon?

45
Alan Turing might have enjoyed the irony
  • A technical problem machine reading
  • which he thought would be easy,
  • has resisted attack for 50 years, and
  • now allows the first widespread
  • practical use of variants of
  • his test for artificial intelligence.

46
Contact
  • Henry S. Baird
  • baird_at_parc.com
  • www.parc.com/baird

47
(No Transcript)
48
(No Transcript)
49
(No Transcript)
50
OCR Accuracy: varying blur
51
OCR Accuracy: varying blur
52
OCR Accuracy: varying blur
53
Yahoo!'s current CAPTCHA
  • Randomly pick
  • one English word, typeface, distortions,
    occlusions, background
  • More tolerable to users
  • Used on a large scale to protect various services