Handwritten Word Recognition: A New CAPTCHA Challenge

1
Handwritten Word Recognition: A New CAPTCHA
Challenge
  • Amalia Rusu and Venu Govindaraju
  • CEDAR
  • University at Buffalo

2
CAPTCHA
  • Completely Automated Public Turing test to tell
    Computers and Humans Apart
  • An automated test that humans can pass but that
    current computer programs, even state-of-the-art
    ones, fail
  • Exploits the difference in abilities between
    humans and machines
  • (e.g. text, speech or facial feature
    recognition)
  • A new formulation of Alan Turing's test:
    "Can machines think?"

3
Objective
  • Example of interface and handwritten CAPTCHA to
    confirm registration.

4
User Authentication Steps using HCAPTCHA
Automatic Authentication Session for Web Services.
  1. Initialization
  2. Handwritten CAPTCHA Challenge
  3. User Response
  4. Verification
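
The four steps above can be sketched as a small challenge-response session. This is only an illustrative sketch, not the authors' implementation: the function names, the in-memory `_pending` store, and the placeholder image string are all assumptions made for the example.

```python
import secrets

# Hypothetical in-memory store mapping session tokens to the expected word.
_pending = {}

def initialize_session():
    """Step 1 (Initialization): create a fresh, unguessable session token."""
    return secrets.token_hex(16)

def issue_challenge(token, truth_word):
    """Step 2 (Challenge): remember the truth word for this session.
    A real system would return the distorted handwritten image here."""
    _pending[token] = truth_word.lower()
    return f"<H-CAPTCHA image for session {token[:8]}>"

def verify_response(token, user_response):
    """Steps 3 and 4 (User Response, Verification): compare the typed word
    with the truth word. Each challenge can be answered only once."""
    expected = _pending.pop(token, None)
    if expected is None:
        return False
    typed = user_response.strip().lower()
    return secrets.compare_digest(expected.encode(), typed.encode())
```

Removing the entry on the first attempt means a software agent cannot retry the same image, which fits the "no risk of image repetition" property claimed later in the deck.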

5
Desirable Properties
  • CAPTCHA should be automatically generated and
    graded
  • Test can be taken quickly and easily by human
    users
  • Test will accept virtually all human users and
    reject software agents
  • Test will resist automatic attack for many years
    despite advances in technology and prior
    knowledge of the algorithms

6
Previous Work
  • First CAPTCHA designed in 1997 (for AltaVista
    website URL filter)
  • CMU: Gimpy, EZ-Gimpy, Gimpy-R, Bongo, PIX, ECO
  • PARC: BaffleText
  • UCB and PARC: PessimalPrint
  • Microsoft: ARTiFACIAL
  • Bell Labs: Reverse Turing test using speech
  • GIT: Character morphing

7
CAPTCHA Tests
AltaVista URL filter uses isolated random
characters and digits on a cluttered background.
PessimalPrint uses a degradation model simulating
physical defects caused by copying and scanning
of printed text.
BaffleText uses pronounceable character strings
that are not in the English dictionary, renders
the character string into an image using a font
(without physics-based degradations), then
generates a mask image as shown above.
8
CAPTCHA Tests
EZ-Gimpy uses real English words.
Gimpy: type 3 different English words appearing in
the picture above.
Gimpy-R uses nonsense words.
Character morphing: an algorithm that transforms a
string into its graphical form.
9
Why Handwritten CAPTCHA?
  • No handwritten text based CAPTCHA exists so far!
  • Several machine-printed text based CAPTCHAs
    already broken
  • Greg Mori and Jitendra Malik of UCB have
    written a program that can solve EZ-Gimpy with
    83% accuracy
  • Thayananthan, Stenger, Torr, and Cipolla of the
    Cambridge vision group have written a program
    that can achieve a 93% correct recognition rate
    against EZ-Gimpy
  • Gabriel Moy, Nathan Jones, Curt Harkless, and
    Randy Potter of Areté Associates have written a
    program that can achieve 78% accuracy against
    Gimpy-R
  • Machine recognition of handwriting is more
    difficult than printed text
  • Handwriting recognition is a task that humans
    perform easily and reliably
  • Research is in the early stages - a promising
    field
  • Handwritten CAPTCHAs will challenge the KBCS
    community!

10
State-of-the-art
Lexicon size  Lexicon Driven                       Grapheme Model
              time (secs)  Top 1 (%)  Top 2 (%)    time (secs)  Top 1 (%)  Top 2 (%)
10            0.027        96.53      98.73        0.021        96.56      98.77
100           0.044        89.22      94.13        0.031        89.12      94.06
1000          0.144        75.38      86.29        0.089        75.38      86.29
20000         1.827        58.14      66.56        0.994        58.14      66.49
  • Speed and accuracy (Top 1 and Top 2) of a
    handwriting recognizer (HR). Feature extraction
    time is excluded. Testing platform is an
    Ultra-SPARC.

11
Source of Errors for HW Recognizers
  • Image quality: background noise, printing
    surface, writing styles
  • Image features: variable stroke width, slope,
    rotations, stretching, compressing
  • Segmentation errors: over-segmentation, merging,
    fragmentation, ligatures, scrawls
  • Recognition errors: confusion with similar
    lexicon entries, large lexicons

12
Creating H-CAPTCHAs
  • Use handwritten word images that current
    recognizers cannot read
  • Controlled distortion of existing handwritten
    word images
  • Create handwritten images by concatenating
    handwritten character images
  • Use handwritten US city name images (4,000 from
    CEDAR CDROM)
  • Character images were discretely printed to begin
    with
  • Character images are automatically segmented out
    of handwritten word images
  • Use set of 20,000 handwritten character images
    (extracted by program)
  • Synthesize sentence images by gluing together
    isolated upper and lower case handwritten
    characters or word images
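
Gluing isolated character images into a word image can be sketched as follows. This is a minimal illustration under stated assumptions: images are binary lists of rows (1 = ink), characters are aligned on a common bottom edge, and the `gap` parameter and toy glyphs are invented for the example. The slides do not describe the actual synthesis procedure in this detail.

```python
def glue_characters(char_images, gap=1):
    """Concatenate binary character images (lists of rows) left to right.

    Shorter images are padded with blank rows on top so that all
    characters share the same bottom edge; `gap` blank columns are
    inserted between neighbouring characters.
    """
    height = max(len(img) for img in char_images)
    rows = [[] for _ in range(height)]
    for i, img in enumerate(char_images):
        width = len(img[0])
        padded = [[0] * width for _ in range(height - len(img))] + img
        for r in range(height):
            if i > 0:
                rows[r].extend([0] * gap)
            rows[r].extend(padded[r])
    return rows

# Toy glyphs standing in for segmented handwritten characters.
tall = [[1, 1], [1, 0], [1, 1]]
short = [[1], [1]]
word = glue_characters([tall, short, tall])
for row in word:
    print("".join("#" if p else "." for p in row))
```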

13
H-CAPTCHA Generation Algorithm
  • Input:
  • Original (random) handwritten image (existing US
    city name image, synthetic word image of
    length 5 to 8 characters, or meaningful sentence).
  • Lexicon containing the image's truth word.
  • Output:
  • H-CAPTCHA image.
  • Method:
  • Randomly choose a number of transformations.
  • Randomly select that many transformations
    from: added lines, circles, grids, arcs,
    background noise (multiplicative or impulse),
    random convolution masks, blur, wave, spread,
    median filters, thickening or thinning of
    characters vertically or horizontally, etc.
  • An a priori priority is assigned to each
    transformation based on experimental results.
    Sort the chosen transformations by priority
    and apply them in sequence, so that the
    effect is cumulative.
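
The method above can be sketched in a few lines. The three transformations, their priority values, and the binary list-of-rows image format are assumptions made for illustration only; the slides name many more transformations and do not give the actual priorities.

```python
import random

# Hypothetical stand-in transformations on a binary image (list of rows of 0/1).
def add_line(img, rng):
    out = [row[:] for row in img]
    out[rng.randrange(len(out))] = [1] * len(out[0])  # line across a random row
    return out

def add_impulse_noise(img, rng, rate=0.05):
    # Flip each pixel independently with probability `rate`.
    return [[1 - p if rng.random() < rate else p for p in row] for row in img]

def thicken_horizontally(img, rng):
    # A pixel is set if it or its left neighbour is set.
    return [[1 if p or (x > 0 and row[x - 1]) else 0
             for x, p in enumerate(row)] for row in img]

# (priority, name, function): lower priority runs first. Values are illustrative.
TRANSFORMS = [
    (1, "thicken", thicken_horizontally),
    (2, "line", add_line),
    (3, "noise", add_impulse_noise),
]

def generate_h_captcha(img, rng=None):
    rng = rng or random.Random()
    count = rng.randint(1, len(TRANSFORMS))   # randomly choose how many
    chosen = rng.sample(TRANSFORMS, count)    # randomly choose which ones
    chosen.sort(key=lambda t: t[0])           # sort by a priori priority
    for _, _, fn in chosen:                   # apply in sequence; effects accumulate
        img = fn(img, rng)
    return img, [name for _, name, _ in chosen]
```

Sorting by a fixed priority keeps the result independent of the order in which the transformations happened to be drawn, so only the choice of transformations is random.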

14
Handwritten text images
  • Examples of handwritten characters used to
    generate random words.

Examples of handwritten US city name images used
as a base for transformations.
Examples of synthetic handwritten sentence images.
15
H-CAPTCHA by Image Quality Transforms
Add lines, grids, arcs, background noise,
convolution masks and special filters
16
H-CAPTCHA by Image Features Transforms
Variable stroke width, slope, rotations,
stretching, compressing
17
H-CAPTCHA by Segmentation Transform
Delete ligatures, use touching letters/digits,
merge characters to cause over-segmentation or to
make segmentation impossible
18
H-CAPTCHA by Lexicon Transform
  • Lexicon challenges: size, density, availability

19
H-CAPTCHA Evaluation
  • No risk of image repetition
  • Image generation is completely automated: words,
    images and distortions are chosen at random
  • The transformed images cannot be easily
    normalized or rendered noise free by present
    computer programs, although the original images
    must be public knowledge
  • Deformed images do not pose problems to humans
  • Human subjects succeeded on our test images
  • Tested against state-of-the-art recognizers: WMR
    and Accuscript
  • CAPTCHAs unbroken by CEDAR recognizers

20
H-CAPTCHAs
  • Handwritten US city name images that defeat both
    WMR and Accuscript recognizers.

21
H-CAPTCHA Challenge
Word Recognizers  Number of Recognized Images  Accuracy (%)
WMR               383                          9.28
Accuscript        182                          4.41
Low accuracy of handwriting recognizers. The
lexicons are created so as to contain all the
truths of test images. Total number of tested
images is 4,127 (and so is the lexicon size).
Number of Students  Number of Test Images  Humans Accuracy (%)  WMR Accuracy (%)  Accuscript Accuracy (%)
12                  15                     82                   0                 0
Low accuracy of handwriting recognizers vs.
humans on a subset of test images.
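
As a quick check, the accuracy column of the first table follows directly from the recognized-image counts and the 4,127 tested images. The variable names below are just for illustration.

```python
TOTAL_IMAGES = 4127
recognized = {"WMR": 383, "Accuscript": 182}

# Accuracy in percent, rounded to two decimals as in the table.
accuracy = {name: round(100 * n / TOTAL_IMAGES, 2) for name, n in recognized.items()}
print(accuracy)  # {'WMR': 9.28, 'Accuscript': 4.41}
```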
22
CAPTCHA using Gestalt Psychology
  • Gestalt psychology is based on the observation
    that we often experience things that are not a
    part of our simple sensations
  • What we are seeing is an effect of the whole
    event, not contained in the sum of the parts
    (holistic approach)
  • Organizing principles - Gestalt laws
  • law of closure
  • law of similarity
  • law of proximity
  • law of symmetry
  • law of continuity
  • law of familiarity
  • figure and ground
  • Not restricted to perception: also applies to
    memory

OXXXXXX XOXXXXX XXOXXXX XXXOXXX XXXXOXX
XXXXXOX XXXXXXO

23
H-CAPTCHA based on Gestalt Laws
Gestalt laws: law of proximity, symmetry,
familiarity, continuity
Methods: create horizontal or vertical overlaps
- for same words: smaller distance overlaps
- for different words: bigger distance overlaps
24
H-CAPTCHA based on Gestalt Laws
Gestalt laws: law of closure, proximity,
continuity
Methods: create occlusions by circles,
rectangles, lines with random angles
25
H-CAPTCHA based on Gestalt Laws
Gestalt laws: law of closure, proximity,
continuity
Methods: add occlusions by waves from left to
right on the entire image, with various amplitudes /
wavelengths, or rotate them by an angle
26
H-CAPTCHA based on Gestalt Laws
Gestalt laws: law of closure, proximity,
continuity, background
Methods: use empty letters, broken letters, edgy
contour, fragmentation
27
H-CAPTCHA based on Gestalt Laws
Gestalt laws: memory, internal metrics,
familiarity of letters
Vertical mirror: difficult for humans
Horizontal mirror: difficult for humans
Flip-flop: OK for humans!
Methods: change word orientation entirely, or the
orientation of a few letters only
28
Gestalt H-CAPTCHA Results
Transformation              WMR Accuracy (%)  Accuscript Accuracy (%)
Horizontal Overlap (Small)  24.35             2.93
Horizontal Overlap (Large)  12.93             2.42
Vertical Overlap            27.88             12.64
Occlusion by waves          15.43             10.56
Occlusion by circles        35.93             32.34
Empty Letters               0.89              0.06
Less Fragmentation          0                 0.18
More Fragmentation          0.48              0
Old Transforms              9.28              4.41
The number of tested images is 4,127 for each type of
transformation.
29
Future Work
Personalizing Email Addresses
  • Creates transformed alias e-mail addresses to
    prevent mining by software agents

30
Future Work
Adult vs. Child vs. Machine
  • A few methods to differentiate between adult and
    child
  • Asking a question that has the answer in the
    handwritten sentence
  • Giving an incomplete handwritten sentence and
    asking the user to infer the missing word
  • Comparing the handwritten text with a standard
    word list
  • Using longer, more complicated handwritten
    sentences, using advanced topics from technical
    fields such as math, physics, or finance
  • Useful on Internet services due to the expansion
    of websites harmful to minors
  • Delimiting reading abilities
  • Machine vs. 1st grade child
  • Adult vs. 7th grade child

31
Future Work
  • HCAPTCHA based on Handwritten Sentence Reading
    and Understanding
  • Incorporate and adjust the image complexity
    factor as a parameter of error
  • Try out more image transformations and compare
    results against human performance
  • Cognitive aspects of HCAPTCHA for adult vs. child
    protocol
  • HCAPTCHA as a Challenge Response Protocol for
    Security Systems
  • Online-Handwriting CAPTCHA
  • HCAPTCHA as a Biometric?
  • HCAPTCHA normalization concerns based on future
    technology development

32
  • Thank You

33
Handwritten CAPTCHA Applications
  • Wide variety of web applications
  • Suppressing SPAM and worms
  • Only accept an email if I know there is a human
    behind the other computer.
  • Prove you are human before you can get a free
    email account.
  • Search engine bots
  • There is an HTML tag to prevent search engine
    bots from reading web pages, but it only
    serves to say "no bots, please" and does not
    guarantee that bots won't enter a web site.
  • Thwarting password guessing
  • Prevent a computer from being able to iterate
    through the entire space of passwords.
  • Blocking denial-of-service attacks
  • Prevent congestion-based DoS attacks from denying
    any users access to the web servers
    targeted by those attacks.

34
Handwritten CAPTCHA Applications
  • Preventing ballot stuffing
  • Can the result of any online poll be trusted? Not
    unless the poll requires that only
    humans can vote.
  • Protecting databases
  • E.g. eBay protecting its data from auction
    portals that search across auction sites to
    provide listings and price information for their
    users, while prohibiting copying of that
    data
  • Email address personalization
  • You will only be able to read the address and
    send the email if you are a human.

35
CAPTCHA Tests
  • PIX
  • Uses a large database of labeled images. All of
    these images are pictures of concrete objects (a
    horse, a table, a house, a flower, etc). In our
    example an egg. The program picks an object at
    random, finds 4 random images of that object from
    its database, distorts them at random, presents
    them to the user and then asks the question "what
    are these pictures of?"

  • ECO
  • Sounds can be thought of as a sound version
    of Gimpy. The program picks a word or a sequence
    of numbers at random, renders the word or the
    numbers into a sound clip and distorts the clip.
    It then presents the distorted sound clip to its
    user and asks the user to type in the contents of
    the sound clip.
36
CAPTCHA Tests
  • ARTiFACIAL
  • For each user request, it automatically
    synthesizes an image with a distorted face
    embedded in a cluttered background. The user is
    asked to first find the face and then click on 6
    points (4 eye corners and 2 mouth corners) on the
    face.

37
Power of Context
Context
Ranked Lexicon
38
Lexicon Driven Model
Distance between the first character w of the
lexicon entry "word" and the image between
- segments 1 and 4 is 5.0
- segments 1 and 3 is 7.2
- segments 1 and 2 is 7.6
Find the best way of accounting for characters
w, o, r, d by consuming all segments 1
to 8 in the process
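
Finding the best way to account for all characters while consuming all segments is a dynamic programming problem. The sketch below is one possible formulation, not the recognizer's actual code. It assumes "between segments a and b" means the inclusive run of segments a..b, that each character consumes at most `max_span` segments, and it uses a toy distance table in which only the three values for w come from the slide; the rest are invented.

```python
import math

def match_word(word, n_segments, distance, max_span=4):
    """Minimum total distance for matching `word` against segments
    1..n_segments, where each character consumes 1..max_span
    consecutive segments and every segment is consumed exactly once."""
    # best[k][s]: cheapest way to explain the first k characters
    # using exactly the first s segments.
    best = [[math.inf] * (n_segments + 1) for _ in range(len(word) + 1)]
    best[0][0] = 0.0
    for k, ch in enumerate(word, start=1):
        for s in range(1, n_segments + 1):
            for span in range(1, min(max_span, s) + 1):
                prev = best[k - 1][s - span]
                if prev < math.inf:
                    cost = prev + distance(ch, s - span + 1, s)
                    best[k][s] = min(best[k][s], cost)
    return best[len(word)][n_segments]

# Toy distances: the three entries for "w" are from the slide,
# the others are invented for illustration.
TOY = {
    ("w", 1, 4): 5.0, ("w", 1, 3): 7.2, ("w", 1, 2): 7.6,
    ("o", 5, 5): 1.0, ("r", 6, 6): 2.0, ("d", 7, 8): 1.5,
}

def toy_distance(ch, start, end):
    return TOY.get((ch, start, end), 10.0)  # unknown pairings get a large distance

print(match_word("word", 8, toy_distance))  # 9.5
```

In a lexicon-driven recognizer this score would be computed for every lexicon entry and the entries ranked by it, which is why time grows with lexicon size in the state-of-the-art table.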
39
Lexicon Free Model
  • Image from 1 to 3 is a in with 0.5 confidence
  • Image from segment 1 to 4 is a w with 0.7
    confidence
  • Image from segment 1 to 5 is a w with 0.6
    confidence and an m with 0.3 confidence

Graph edge labels (character and confidence):
w.6, m.3; w.7; d.8; o.5; u.5, v.2; i.8, l.8; i.7;
r.4; u.3; m.2; m.1
Find the best path in the graph from segment 1 to 8:
w o r d
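
The best-path search can be sketched as a dynamic program over segmentation points. Everything about the graph layout below is an assumption: the slide's figure is not available, so apart from the edges 1 to 4 and 1 to 5 the edge endpoints are invented, and scoring a path by the product of its confidences is one plausible choice rather than the method the slides specify.

```python
def best_path(edges, start, end):
    """edges: {(a, b): [(char, confidence), ...]} with a < b.
    Returns (score, text) for the path from start to end with the
    highest product of confidences, or None if end is unreachable."""
    best = {start: (1.0, "")}
    # Sorting guarantees every edge into node a is handled before
    # any edge leaving a, because all edges point forward (a < b).
    for (a, b) in sorted(edges):
        if a not in best:
            continue
        char, conf = max(edges[(a, b)], key=lambda h: h[1])
        score, text = best[a]
        candidate = (score * conf, text + char)
        if b not in best or candidate[0] > best[b][0]:
            best[b] = candidate
    return best.get(end)

# Hypothetical graph using some of the slide's labels.
EDGES = {
    (1, 3): [("u", 0.5), ("v", 0.2)],
    (1, 4): [("w", 0.7)],
    (1, 5): [("w", 0.6), ("m", 0.3)],
    (3, 4): [("i", 0.8), ("l", 0.8)],
    (4, 6): [("o", 0.5)],
    (5, 6): [("m", 0.1)],
    (5, 7): [("m", 0.2)],
    (6, 7): [("r", 0.4)],
    (7, 8): [("d", 0.8)],
}

score, text = best_path(EDGES, 1, 8)
print(text)  # word
```

Note that a plain product favours paths with fewer edges, so a real recognizer would likely need some form of length normalization.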
40
Grapheme Model
Structural features labeled in the figure: loops,
ends, junctions, turns
41
Matching - Structural Features
Statistical analysis of the feature attributes
42
Hidden Markov Models
  • The occurrence of the structural features can be
    modeled as an HMM
  • The HMM can be converted to a stochastic
    finite-state automaton (SFSA) by assigning
    observations and probabilities to the transitions
    instead of to the states
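
One common way to move observations from states onto transitions is to multiply each transition probability by the emission probability of the state being entered. The sketch below assumes that construction; the slides do not say which one is used, and the two-state model and feature names are invented for illustration.

```python
def hmm_to_sfsa(transitions, emissions):
    """transitions[i][j] = P(next state j | state i)
    emissions[j][o]   = P(observe o | state j)
    Returns sfsa[(i, o, j)] = P(move from i to j while emitting o),
    assuming the observation is emitted on entering state j."""
    sfsa = {}
    for i, row in transitions.items():
        for j, a_ij in row.items():
            for o, b_jo in emissions[j].items():
                sfsa[(i, o, j)] = a_ij * b_jo
    return sfsa

# Toy two-state model over two structural features.
transitions = {"s0": {"s0": 0.5, "s1": 0.5},
               "s1": {"s0": 0.25, "s1": 0.75}}
emissions = {"s0": {"loop": 0.5, "turn": 0.5},
             "s1": {"loop": 0.25, "turn": 0.75}}

sfsa = hmm_to_sfsa(transitions, emissions)
print(sfsa[("s0", "loop", "s1")])  # 0.125
```

The outgoing probabilities of each state still sum to 1, so the result remains a valid stochastic automaton.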

43
Law of closure

If something is missing in an otherwise complete
figure, we will tend to add it (e.g. a triangle
with a small part of its edge missing will still
be seen as a triangle). We will close the gap. A
set of dots outlining the shape of a B is likely
to be perceived as a B, not as a set of dots. We
tend to complete the figure, make it the way it
should be, finish it.
44
Law of similarity
OXXXXXXXXXX XOXXXXXXXXX XXOXXXXXXXX
XXXOXXXXXXX XXXXOXXXXXX XXXXXOXXXXX
XXXXXXOXXXX XXXXXXXOXXX XXXXXXXXOXX
XXXXXXXXXOX XXXXXXXXXXO

We tend to group similar items together, to see
them as forming a larger form. It is just natural
for us to see the O's as a line within a field of
X's.
45
Law of proximity

Things that are close together are seen as
belonging together. You are much more likely to
see three lines of close-together symbols than 14
vertical collections of 3 symbols each.
46
Law of symmetry
              
Despite the pressure of proximity to group the
brackets nearest each other together, symmetry
overwhelms our perception and makes us see them
as pairs of symmetrical brackets.
47
Law of continuity
We can see a line, for example, as continuing
through another line, rather than stopping and
starting, as in this example, which we see as
composed of two lines, not as a combination of
two angles.
  1. Ambiguous segmentation
  2. Segmentation based on good continuity, follows
    the path of minimal curvature change
  3. Perceptually implausible segmentation

48
Law of familiarity
The elements are grouped together if we are used
to seeing them together, e.g. we are used to
seeing rectangles and squares rather than the
shape in (c).
  1. Ambiguous segmentation
  2. Perceptual segmentation
  3. Segmentation based on good continuity proves to
    be erroneous

49
Figure and ground
We seem to have an innate tendency to perceive
one aspect of an event as the figure or
foreground and the other as the ground or
background. There is only one image here, and
yet, by changing nothing but our attitude, we can
see two different things. It doesn't even seem
to be possible to see them both at the same time!
50
Memory
  • If you see an irregular figure, it is likely that
    your memory will straighten it out for you a
    bit.
  • Or, if you experience something that doesn't
    quite make sense to you, you will tend to
    remember it as having meaning that may not have
    been there. Dreams are a good example: watch
    yourself the next time you tell someone a dream
    and see if you don't notice yourself modifying
    the dream a little to force it to make sense!
  • The world is an outside iconic memory with
    internal metric relations.

After flip-flop (vertical mirror / horizontal
mirror)