Image Understanding - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Image Understanding


1
Image Understanding & Web Security
  • Henry Baird
  • Joint work with
  • Richard Fateman, Allison Coates, Kris Popat,
  • Monica Chew, Tom Breuel, Mark Luk

2
A fast-emerging research topic
  • Human Interactive Proofs (HIPs; definition later)
  • first instance in 1999
  • research took hold in CS security theory field
    first
  • intersects image understanding, cognitive science, etc.
  • fast attracting researchers, engineers, users
  • This talk:
  • A brief history of HIPs
  • Existing systems -- w/ my critiques
  • Professional activities, so far -- incl. the 1st
    Intl Workshop
  • In detail: PARC's PessimalPrint & BaffleText

H. Baird & K. Popat, Web Security & Document Image Analysis, in J. Hu & A. Antonacopoulos (Eds.), Web Document Analysis, World Scientific, 2003 (in press).
3
Straws in the wind
  • 1990s: spammers trolling for email addresses
  • in defense, people disguise them, e.g.
  • baird AT parc DOT com
  • 1997: abuse of the Add-URL feature at AltaVista
  • some write programs to add their URL many times
  • skewed the search rankings
  • Andrei Broder et al. (then at DEC SRC):
  • a user action which is legitimate when performed
    once
  • becomes abusive when repeated many times
  • no effective legal recourse
  • how to block or slow down these programs?

4
The first known instance: Altavista's Add-URL filter
An image of text, not ASCII
  • 1999: ransom-note filter
  • randomly pick letters, fonts, rotations; render as an image (see the sketch below)
  • every user is required to read and type it in
    correctly
  • reduced spam add_URL by over 95%
  • Weaknesses: isolated chars, filterable noise, affine deformations

M. D. Lillibridge, M. Abadi, K. Bharat, A. Z.
Broder, Method for Selectively Restricting
Access to Computer Systems, U.S. Patent No.
6,195,698, Filed April 13, 1998, Issued February
27, 2001.
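To make the idea concrete, here is a minimal sketch of a ransom-note renderer, not the patented Altavista implementation: random characters, each in a randomly chosen font, rotated and jittered, flattened into one image. Font file names and parameter ranges are placeholders.

```python
import random
from PIL import Image, ImageDraw, ImageFont, ImageOps

FONT_FILES = ["DejaVuSans.ttf", "DejaVuSerif.ttf"]   # placeholder font paths
ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"        # skip easily-confused glyphs

def ransom_note(n_chars=6, size=48):
    """Return (answer_string, image_of_that_string)."""
    answer = "".join(random.choice(ALPHABET) for _ in range(n_chars))
    canvas = Image.new("L", (n_chars * size + 2 * size, 3 * size), 255)
    x = size // 2
    for ch in answer:
        font = ImageFont.truetype(random.choice(FONT_FILES), size)
        tile = Image.new("L", (2 * size, 2 * size), 255)
        ImageDraw.Draw(tile).text((size // 2, size // 2), ch, font=font, fill=0)
        tile = tile.rotate(random.uniform(-30, 30), fillcolor=255)   # random rotation
        # paste only the dark glyph pixels, at a random vertical offset
        canvas.paste(tile, (x, random.randint(0, size // 2)), ImageOps.invert(tile))
        x += size
    return answer, canvas
```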
5
Yahoo!'s Chat Room Problem
  • September 2000
  • Udi Manber asked Prof. Manuel Blum's group at CMU
  • programs impersonate people in chat rooms,
  • then hand out ads (ugh!)
  • how can all machines be denied access to a Web
    site
  • without inconveniencing any human users?
  • I.e., how to distinguish between machines and
    people on-line
  • a kind of Turing test!

6
Alan Turing (1912-1954)
  • 1936: a universal model of computation
  • 1940s: helped break the Enigma (U-boat) cipher
  • 1949: first serious uses of a working computer
  • including plans to read printed
    text
  • (he expected it would be easy)
  • 1950: proposed a test for machine intelligence

7
Turing's Test for AI
  • How to judge that a machine can think?
  • play an imitation game conducted via teletypes
  • a human judge & two invisible interlocutors
  • a human
  • a machine pretending to be human
  • after asking any questions (challenges) he/she
  • wishes, the judge decides which is human
  • failure to decide correctly would be convincing
  • evidence of machine intelligence (Turing
    asserted)
  • Modern GUIs invite richer challenges than
    teletypes.

A. Turing, Computing Machinery and Intelligence, Mind, Vol. 59(236), 1950.
8
CAPTCHAs: Completely Automated Public Turing Tests to Tell Computers & Humans Apart
(M. Blum, L. A. von Ahn, J. Langford, et al,
CMU-SCS)
  • challenges can be generated & graded automatically
  • (i.e. the judge is a machine)
  • accepts virtually all humans, quickly & easily
  • rejects virtually all machines
  • resists automatic attack for many years
  • (even assuming that its algorithms are
    known?)
  • NOTE: the machine administers, but cannot pass, the test! (a minimal protocol sketch follows the citation below)

L. von Ahn, M. Blum, N. J. Hopper, J. Langford, CAPTCHA: Using Hard AI Problems for Security, Proc. EuroCrypt 2003, Warsaw, Poland, May 4-8, 2003 (to appear).
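As a concrete illustration of "the judge is a machine", here is a minimal server-side sketch of the challenge/response loop. It assumes some image generator such as the ransom-note sketch above; the in-memory store and helper names are illustrative, not part of any cited system.

```python
# Minimal sketch of the automated judge: generate a challenge, remember only a
# salted hash of the answer, and grade the user's reply.
import hashlib, hmac, os, secrets

_pending = {}          # challenge_id -> (salt, answer hash); toy in-memory store

def issue_challenge(render_challenge):
    answer, image = render_challenge()
    challenge_id = secrets.token_hex(16)
    salt = os.urandom(16)
    _pending[challenge_id] = (salt, hashlib.sha256(salt + answer.lower().encode()).digest())
    return challenge_id, image          # the image goes to the user; the answer never does

def grade_response(challenge_id, response):
    salt, expected = _pending.pop(challenge_id, (None, None))
    if expected is None:
        return False                    # unknown or already-used challenge
    digest = hashlib.sha256(salt + response.strip().lower().encode()).digest()
    return hmac.compare_digest(digest, expected)
```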
9
CMU's Gimpy CAPTCHA
  • Randomly pick
  • English words, deformations, occlusions,
    backgrounds, etc
  • Challenge user to type in any three of the words
  • Designed by the CMU team; tried out by Yahoo!
  • Problem: users hated it --- Yahoo! withdrew it

L. von Ahn, M. Blum, N. J. Hopper, J. Langford, The CAPTCHA Web Page, http://www.captcha.net.
10
Yahoo!'s present CAPTCHA: EZ-Gimpy
  • Randomly pick
  • one English word, deformations,
    degradations, occlusions,
  • colored backgrounds, etc
  • Better tolerated by users
  • Now used on a large scale to protect various
    services
  • Weaknesses: a single typeface, English lexicon

11
PayPal's CAPTCHA
  • Nothing published
  • Seems to use a single typeface
  • Picks, at random
  • letters, overlain pattern
  • Weaknesses: single typeface, simple grid, no image degradations, characters spaced apart

12
Cropping up everywhere
  • In use today, to defend against
  • skewing search-engine rankings (Altavista, 1999)
  • infesting chat rooms, etc (Yahoo!, 2000)
  • gaming financial accounts (PayPal, 2001)
  • robot spamming (MailBlocks, SpamArrest 2002)
  • In the last few months: Overture, a Chinese website, HotMail,
  • CD-rebate, TicketMaster, MailFrontier,
    Qurb, Madonnarama,
  • have you seen others?
  • On the horizon
  • ballot stuffing, password guessing,
    denial-of-service attacks
  • blunt force attacks (e.g. UT Austin break-in,
    Mar 03)
  • many others
  • Similar problems w/ scrapers also, likely on
    Intranets.

D. P. Baron, eBay and Database Protection, Case
No. P-33, Case Writing Office, Stanford Graduate
School of Business, Stanford Univ., 2001.
13
The Known Limits of Image Understanding Technology
  • There remains a large gap in ability
  • between human and machine vision systems,
  • even when reading printed text
  • Performance of OCR machines has been
    systematically studied
  • 7-year-olds can consistently do better!
  • This ability gap has been mapped quantitatively

S. Rice, G. Nagy, T. Nartker, OCR: An Illustrated Guide to the Frontier, Kluwer Academic Publishers, 1999.
14
Image Degradation Modeling
  • Effects of printing & imaging
(Figure: sample word images degraded under varying blur, threshold (thrs), and sensitivity (sens) parameters)
We can generate challenging images pseudorandomly (sketched below).
H. Baird, Document Image Defect Models, in H.
Baird, H. Bunke, K. Yamamoto (Eds.), Structured
Document Image Analysis, Springer-Verlag New
York, 1992.
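A rough sketch in the spirit of such a defect model, assuming the three knobs named on the slide (blur, threshold, per-pixel sensitivity); the noise form and parameter values are illustrative, not the published calibration.

```python
# Sketch of a physics-style degradation: Gaussian blur (optics), per-pixel
# sensitivity noise (sensor), then a global threshold (binarization).
import numpy as np
from scipy.ndimage import gaussian_filter

def degrade(ideal, blur=1.0, sens=0.05, thrs=0.5, rng=None):
    """ideal: 2-D float array in [0, 1]; 0 = black ink, 1 = white paper."""
    rng = np.random.default_rng() if rng is None else rng
    blurred = gaussian_filter(ideal.astype(float), sigma=blur)
    noisy = blurred + rng.normal(0.0, sens, size=ideal.shape)
    return (noisy > thrs).astype(np.uint8)        # 1 = white, 0 = black

# Pseudorandom challenges: draw (blur, thrs, sens) from chosen ranges per image.
```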
15
Machine Accuracy is a Smooth, Monotonic Function of Parameters
T. K. Ho & H. S. Baird, Large Scale Simulation
Studies in Image Pattern Recognition, IEEE
Trans. on PAMI, Vol. 19, No. 10, pp. 1067-1079,
October 1997.
16
Can You Read These Degraded Images?
Of course you can ... but OCR machines cannot!
17
Experiments by PARC & UCB-CS
  • Pick words at random
  • 70 words commonly used on the Web
  • w/out ascenders or descenders (cf. Spitz)
  • Vary physics-based image degradation parameters
  • blur, threshold, x-scale -- within certain
    ranges
  • Pick fonts at random from a large set
  • Times Roman (TR), Times Italic (TI),
  • Palatino Roman (PR), Palatino Italic (PI),
  • Courier Roman (CR), Courier Oblique (CO),
    etc
  • Test legibility on
  • ten human volunteers (UC Berkeley CS Dept grad
    students)
  • three OCR machines
  • Expervision TR (E), ABBYY FineReader (A),
    IRIS Reader (I)

18
Results: OCR Accuracy, by machine
Each machine has its peculiar blind spots
19
OCR Accuracy, varying blur & threshold
The machines share some blind spots
20
PessimalPrint: exploiting image degradations
  • Three OCR machines fail when:
  • blur = 0.0 and threshold ≈ 0.02 - 0.08, or
  • threshold = 0.02, at any value of blur
  • (on such images the machines output garbage fragments, e.g. ".I", "i1", "I", or nothing at all)
but people find all these easy to read
A. Coates, H. Baird, R. Fateman, Pessimal Print: A Reverse Turing Test, Proc. 6th IAPR Intl Conf. on Doc. Anal. & Recogn. (ICDAR'01), Seattle, WA, Sep 10-13, 2001.
21
High Time for a Workshop!
  • Manuel Blum proposes it, rounds up some key
    speakers
  • Henry Baird offers PARC as venue; Kris Popat helps run it
  • Goals
  • Invite all known principals: theory, systems, engineers, users
  • Describe the state of the art
  • Plan next steps for the field
  • Organization
  • 30 attendees
  • abstracts only, 1-5 pages, no refereeing, no
    archival publication
  • 100% participation: everyone gives a (short) talk
  • mixing it up: panel & working-group discussions
  • 2-1/2 days, lots of breaks for informal
    socializing
  • plenary talk by John McCarthy, Father of AI

22
1st NSF Intl Workshop on Human Interactive Proofs, PARC, Palo Alto, CA, January 9-11, 2002
23
HIP2002 Participants
  • CMU - SCS, Aladdin Center
  • Manuel Blum, Lenore Blum, Luis von Ahn, John
    Langford, Guy Blelloch, Nick Hopper, Ke Yang,
    Brighten Godfrey, Bartosz Przydatek, Rachel Rue
  • PARC - SPIA/Security/Theory
  • Henry Baird, Kris Popat, Tom Breuel, Prateek
    Sarkar, Tom Berson, Dirk Balfanz, David Goldberg
  • UCB - CS & SIMS
  • Richard Fateman, Allison Coates, Jitendra Malik,
    Doug Tygar, Alma Whitten, Rachna Dhamija, Monica
    Chew, Adrian Perrig, Dawn Song
  • RPI
  • George Nagy
  • Stanford
  • John McCarthy
  • NSF
  • Robert Sloan
  • Altavista
  • Andrei Broder
  • Yahoo!
  • Udi Manber
  • Bell Labs
  • Dan Lopresti
  • IBM T.J. Watson
  • Charles Bennett
  • InterTrust Star Labs
  • Stuart Haber
  • City Univ. of Hong Kong
  • Nancy Chan
  • Weizmann Institute
  • Moni Naor
  • RSA Security Laboratories
  • Ari Juels
  • Document Recognition Techs, Inc
  • Larry Spitz

24
Variations & Generalizations
  • CAPTCHA
  • Completely Automatic Public Turing test to tell
    Computers and Humans Apart
  • HUMANOID
  • Text-based dialogue which an individual can use
    to authenticate that he/she is himself/herself
    (naked in a glass bubble)
  • PHONOID
  • Individual authentication using spoken language
  • Human Interactive Proof (HIP)
  • An automatically administered challenge/response
    protocol
  • allowing a person to authenticate him/herself
    as belonging to a certain group over a network
    without the burden of passwords,
  • biometrics, mechanical aids, or special
    training.

25
Highlights of HIP2002
  • Theory
  • some text-based CAPTCHAs are provably breakable
  • Ability Gaps
  • vision: gestalt, segmentation, noise immunity, style consistency
  • speech: noise of many kinds, clutter (cocktail-party effect)
  • intelligence: puzzles, analogical reasoning, weak logic
  • gestures, reflexes, common knowledge,
  • Applications
  • subtle system-level vulnerabilities
  • aggressive arms race with shadowy enemies

http://www.parc.com/istl/groups/did/HIP2002
26
Funding & Partnerships
  • NSF
  • Robert Sloan, Dir, Theory of Computing Pgm
  • strongly supportive of this newborn field
  • encouraged grant proposals
  • Yahoo!
  • willing to run field trials
  • user acceptance laboratory
  • able to detect intrusion

27
Disciplines
  • Participating now
  • Cryptography
  • Security
  • Pattern Recognition
  • Computer Vision
  • Artificial Intelligence
  • eCommerce
  • Needed
  • Cognitive Science
  • Psychophysics (esp. of Reading)
  • Biometrics
  • Business, Law,
  • ...?

28
Weaknesses of Existing Reading-Based CAPTCHAs
  • English lexicon is too predictable
  • dictionaries are too small
  • only 1.2 bits of entropy per character (cf. Shannon; see the worked example below)
  • Physics-based image degradations vulnerable
  • to well-studied image restoration attacks
  • Complex images irritate people
  • even when they can read them
  • need user-tolerance experiments
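A back-of-the-envelope illustration of the lexicon weakness (the dictionary size here is illustrative): guessing a word drawn from a modest dictionary costs far fewer bits than guessing a random string of the same length.

```python
# Compare the guessing cost of a lexicon word, a random letter string, and
# fluent English at the slide's ~1.2 bits/character estimate.
from math import log2

dictionary_size = 10_000          # illustrative size of a common-word lexicon
word_length = 6

bits_lexicon = log2(dictionary_size)      # ~13.3 bits for the whole word
bits_random = word_length * log2(26)      # ~28.2 bits for 6 random letters
bits_english = word_length * 1.2          # ~7.2 bits for fluent English text

print(f"lexicon word: {bits_lexicon:.1f} bits, "
      f"random string: {bits_random:.1f} bits, "
      f"fluent English: {bits_english:.1f} bits")
```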

29
Strengths of Human Reading
  • Literature on the psychophysics of reading is
    relevant
  • familiarity helps, e.g. English words
  • optimal word-image size (subtended angle)
  • is known (0.3-2 degrees)
  • optimal contrast conditions known
  • other factors measured for the best performance
  • to achieve and sustain critical reading speed
  • BUT gives no answer to
  • where's the optimal comfort zone?

G. E. Legge, D. G. Pelli, G. S. Rubin, M. M. Schleske, Psychophysics of Reading: I. Normal Vision, Vision Research 25(2), 1985.
J. Grainger & J. Segui, Neighborhood Frequency Effects in Visual Word Recognition, Perception & Psychophysics 47, 1990.
30
Designing a Stronger CAPTCHA: BaffleText principles
  • Nonsense words.
  • generate pronounceable but not spellable words
  • using a variable-length character n-gram Markov model (see the sketch below)
  • they look familiar, but aren't in any lexicon, e.g.
  • ablithan wouquire quasis
  • Gestalt perception.
  • force inference of a whole word-image
  • from fragmentary or occluded characters,
    e.g.
  • using a single familiar typeface also helps

M. Chew & H. S. Baird, BaffleText: A Human Interactive Proof, Proc. SPIE/IS&T Conf. on Document Recognition & Retrieval X, Santa Clara,
CA, January 23-24, 2003.
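A small sketch of the nonsense-word idea: learn character statistics from a word list, then sample strings that "sound" like English but are in no lexicon. The published generator is variable-order; this fixed-order trigram version is a simplification, and the training word list is whatever lexicon is at hand.

```python
import random
from collections import defaultdict

def train_trigrams(words):
    model = defaultdict(list)
    for w in words:
        w = "^^" + w.lower() + "$"              # ^ = start padding, $ = end marker
        for i in range(len(w) - 2):
            model[w[i:i+2]].append(w[i+2])
    return model

def sample_word(model, max_len=10):
    ctx, out = "^^", []
    while len(out) < max_len:
        nxt = random.choice(model[ctx])
        if nxt == "$":
            break
        out.append(nxt)
        ctx = ctx[1] + nxt
    return "".join(out)

def nonsense_words(model, lexicon, n=3, min_len=5):
    words = set()
    while len(words) < n:
        w = sample_word(model)
        if len(w) >= min_len and w not in lexicon:   # reject real words
            words.add(w)
    return sorted(words)

# e.g.  model = train_trigrams(common_words)
#       print(nonsense_words(model, set(common_words)))
```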
31
Mask Degradations
  • Parameters of the pseudorandom mask generator (sketched below)
  • shape type: square, circle, ellipse, mixed
  • density: black-area / whole-area
  • range of radii of shapes
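A sketch of such a generator under simplifying assumptions (circles and squares only, density measured over the whole mask):

```python
import numpy as np

def make_mask(h, w, density=0.2, r_min=3, r_max=8, shape="circle", rng=None):
    """Scatter random shapes until the requested black-area density is reached."""
    rng = np.random.default_rng() if rng is None else rng
    mask = np.zeros((h, w), dtype=bool)
    yy, xx = np.mgrid[0:h, 0:w]
    while mask.mean() < density:
        cy, cx = rng.integers(0, h), rng.integers(0, w)
        r = rng.integers(r_min, r_max + 1)
        if shape == "circle":
            mask |= (yy - cy) ** 2 + (xx - cx) ** 2 <= r ** 2
        else:                                   # "square"
            mask |= (np.abs(yy - cy) <= r) & (np.abs(xx - cx) <= r)
    return mask                                 # True where the mask is black

# Typical use: overlay (or XOR) the mask onto the rendered word image.
```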

32
BaffleText Experiments at PARC
  • Goal: map the margins of accurate & comfortable human reading on this family of images
  • Metrics
  • objective difficulty: accuracy
  • subjective difficulty: rating
  • response time
  • exit survey: how tolerable overall
  • Participation
  • 41 individual sessions
  • 1200 challenge/response trials
  • 18 exit surveys

33
BaffleText challenge webpage
34
BaffleText user ratings
35
User Acceptance
  • Subjects willing to solve a BaffleText:
  • 17%: every time they send email
  • 39%: if it cut spam by 10x
  • 89%: every time they register for an e-commerce site
  • 94%: if it led to more trustworthy recommendations
  • 100%: every time they register for an email account

Out of 18 responses to the exit survey.
36
Subjective difficulty tracks objective
difficulty
37
How to engineer BaffleText
  • When we generate a challenge,
  • need to estimate its difficulty
  • throw it away if too easy or too hard
  • Apply an idea from the psychophysics of reading
  • image complexity metric: how hard to read
  • simple to compute: perimeter² / black-area (see the sketch below)
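A minimal sketch of that metric on a binary ink image, approximating the perimeter by counting 4-neighbour ink/background transitions:

```python
import numpy as np

def complexity(binary_ink):
    """perimeter**2 / black-area for a 2-D boolean array (True = black ink)."""
    ink = np.asarray(binary_ink, dtype=bool)
    area = ink.sum()
    if area == 0:
        return 0.0
    # horizontal + vertical transitions between ink and background
    perim = (ink[:, 1:] ^ ink[:, :-1]).sum() + (ink[1:, :] ^ ink[:-1, :]).sum()
    return perim ** 2 / area
```

A generator can then keep only challenges whose score lands in the acceptable band (the 50-100 range given in the engineering guidelines below).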

38
Image complexity predicts objective difficulty
39
Image complexity predicts subjective difficulty
40
Engineering guidelines
  • For high performance, image complexity
  • should fall in the range 50-100 e.g.
  • Within this regime, BaffleText performs well
  • 100% of human subjects willing to try to read it
  • 89% accuracy by humans
  • 0% accuracy by commercial OCR
  • 3.3 difficulty rating, out of 10 (on average)
  • 8.7 seconds / trial on average

41
The latest serious (known or published) attack
G. Mori & J. Malik, Recognizing Objects in Adversarial Clutter, submitted to CVPR'03,
Madison, WI, June 16-22, 2003.
  • Greg Mori & Jitendra Malik (UCB-CS)
  • Generalized Shape Context CV method
  • requires a known lexicon; else, fails completely
  • expects a known font (or fonts); else, does worse
  • Results of Mori-Malik attacks (Dec 2002)
    given
  • perfect foreknowledge of both lexicon and
    font

42
BaffleText: the strongest known CAPTCHA?
  • Resists many known algorithmic attacks
  • physics-based image restoration
  • recognizing into a lexicon
  • known-typeface targeting
  • segmenting then recognizing
  • Exploits hard-to-automate human cognition powers
  • Gestalt perception
  • semi-linguistic familiarity
  • within-typeface style consistency

43
Recent Microsoft CAPTCHA
  • Random strings, local space-warping, plus meaningless curving strokes, both black (overlaid) and white (erasing); see the sketch after the citation below
  • Fielded Dec 2002 on Passport (HotMail, etc)
  • Immediate reduction in new Hotmail accounts, with
    virtually no user complaints

P. Y. Simard, R. Szeliski, J. Benaloh, J.
Couvreur, I. Calinov, Using Character
Recognition and Segmentation to Tell Computer
from Humans, Proc. Intl Conf. on Document Analysis & Recognition, Edinburgh, Scotland, August 2003 (to appear).
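Nothing beyond the description above is reproduced here, so the following is only a rough sketch of those ingredients: a random string, a few black and white arcs drawn over it, and a mild per-column vertical warp. Font files and parameter values are placeholders, not Microsoft's implementation.

```python
import random
import numpy as np
from PIL import Image, ImageDraw, ImageFont

def warped_challenge(text, font_file="DejaVuSans.ttf", size=48):
    img = Image.new("L", (size * len(text) + 40, size * 2), 255)
    draw = ImageDraw.Draw(img)
    draw.text((20, size // 2), text, font=ImageFont.truetype(font_file, size), fill=0)
    for fill in (0, 255):                       # black strokes, then white "erasers"
        for _ in range(3):
            x0, y0 = random.randint(0, img.width), random.randint(0, img.height)
            x1, y1 = random.randint(0, img.width), random.randint(0, img.height)
            draw.arc([min(x0, x1), min(y0, y1), max(x0, x1) + 1, max(y0, y1) + 1],
                     start=random.randint(0, 360), end=random.randint(0, 360),
                     fill=fill, width=3)
    arr = np.array(img)
    phase, amp = random.uniform(0, 2 * np.pi), random.uniform(3, 6)
    for x in range(arr.shape[1]):               # local warp: per-column vertical shift
        shift = int(amp * np.sin(2 * np.pi * x / 60.0 + phase))
        arr[:, x] = np.roll(arr[:, x], shift)
    return Image.fromarray(arr)
```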
44
PARC's Leadership in R&D on Reading-based CAPTCHAs
  • First refereed article on CAPTCHAs
  • A. L. Coates, H. S. Baird, R. Fateman, Pessimal Print: a Reverse Turing Test, Proc. 6th IAPR Intl Conf. on Document Analysis & Recognition, Seattle, WA, Sept. 10-13, 2001.
  • First professional HIP event, organized by PARC:
    1st NSF Intl Workshop on HIPs, Jan. 9-11,
    2002, PARC, Palo Alto, CA.
  • First to play both offense & defense
  • builds high-performance OCR systems & attacks CAPTCHAs
  • builds strong CAPTCHAs
  • First to validate using human-factors research
  • human-subject trials measuring both accuracy & tolerance
  • PARC's interdisciplinary tradition: social & computer sciences

45
The Arms Race
  • When will serious technical attacks be launched?
  • spam kings make millions
  • two spam-blocking e-commerce firms now use
    CAPTCHAs
  • How long can a CAPTCHA withstand attack?
  • especially if its algorithms are published or
    guessed
  • Strategy: keep a pipeline of defenses in reserve
  • continuing partnership between R&D and users

46
Lots of Open Research Questions
  • What are the most intractable obstacles to
    machine vision?
  • segmentation, occlusion, degradations, ...?
  • Under what conditions is human reading most
    robust?
  • linguistic & semantic context, Gestalt, style consistency?
  • Where are ability gaps located?
  • quantitatively, not just qualitatively
  • How to generate challenges strictly within
    ability gaps?
  • fully automatically
  • an indefinitely long sequence of distinct
    challenges

47
HIP Research Community
  • PARC CAPTCHA website
  • www.parc.com/istl/projects/captcha
  • HIP2002 Workshop
  • www.parc.com/istl/groups/did/HIP2002
  • HIP Website at Aladdin Center, CMU-SCS
  • www.captcha.net
  • Volunteers for a PARC CAPTCHA usability test?
  • A 2nd HIP Workshop soon?

48
Alan Turing might have enjoyed the irony
  • A technical problem, machine reading,
  • which he thought would be easy,
  • has resisted attack for 50 years, and
  • now allows the first widespread
  • practical use of variants of
  • his test for artificial intelligence.

49
Contact
  • Henry S. Baird
  • baird_at_parc.com
  • www.parc.com/baird