Handwritten Word Recognition: A New CAPTCHA Challenge - PowerPoint PPT Presentation

Description:

First CAPTCHA designed in 1997 (for AltaVista website URL filter) CMU ... AltaVista URL filter uses isolated random characters and digits on a cluttered background. ...

Slides: 34

Provided by: cedarB

Learn more at: https://cedar.buffalo.edu

Transcript and Presenter's Notes

1
Handwritten Word Recognition: A New CAPTCHA
Challenge

Amalia Rusu and Venu Govindaraju
CEDAR
University at Buffalo

2
CAPTCHA

Completely Automated Public Turing test to tell
Computers and Humans Apart
An automated test that humans can pass but
current computer programs cannot, because it lies
beyond the state of the art
Exploits the difference in abilities between
humans and machines
(e.g., recognition of text, speech, or facial
features)
A new formulation of Alan Turing's test:
Can machines think?

3
Objective

Example of interface and handwritten CAPTCHA to
confirm registration.

4
User Authentication Steps using HCAPTCHA
Automatic Authentication Session for Web Services.

Initialization
Handwritten CAPTCHA Challenge
User Response
Verification
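
The four steps above can be sketched as a small challenge-response session. This is only an illustration under stated assumptions, not the authors' implementation: the class name `HCaptchaSession`, the one-attempt policy, and the case-insensitive comparison are hypothetical, and the distorted handwritten image is replaced by a placeholder string.

```python
import hmac
import secrets


class HCaptchaSession:
    """Hypothetical server-side session for one handwritten CAPTCHA."""

    def __init__(self, lexicon):
        # Initialization: pick a truth word at random and keep it on the server.
        self._truth = secrets.choice(lexicon)
        self._answered = False

    def challenge(self):
        # Handwritten CAPTCHA challenge: a real system would return a distorted
        # handwritten image of the truth word; a placeholder stands in here.
        return "<distorted handwritten image of the truth word>"

    def verify(self, response):
        # User response and verification: each challenge may be answered once
        # (an assumed policy), and the comparison ignores case and outer spaces.
        if self._answered:
            return False
        self._answered = True
        given = response.strip().lower().encode("utf-8")
        expected = self._truth.lower().encode("utf-8")
        return hmac.compare_digest(given, expected)
```

Allowing a single answer per challenge means a software agent cannot iterate over the lexicon against one image; it has to request a new challenge for every guess.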

5
Desirable Properties

CAPTCHA should be automatically generated and
graded
Test can be taken quickly and easily by human
users
Test will accept virtually all human users and
reject software agents
Test will resist automatic attack for many years
despite advances in technology and prior
knowledge of the algorithms

6
Previous Work

First CAPTCHA designed in 1997 (for AltaVista
website URL filter)
CMU: Gimpy, EZ-Gimpy, Gimpy-R, Bongo, Pix, Eco
PARC: BaffleText
UCB and PARC: PessimalPrint
Microsoft: ARTiFACIAL
Bell Labs: Reverse Turing test using speech
GIT: Character morphing

7
CAPTCHA Tests
AltaVista URL filter uses isolated random
characters and digits on a cluttered background.
PessimalPrint uses a degradation model simulating
physical defects caused by copying and scanning
of printed text.
BaffleText uses pronounceable character strings
that are not in the English dictionary, renders
the character string into an image using a font
(without physics-based degradations), and then
generates a mask image.
8
CAPTCHA Tests
EZ-Gimpy uses real English words.
Gimpy: the user types 3 different English words
appearing in the picture.
Gimpy-R uses nonsense words.
Character morphing: an algorithm that transforms a
string into its graphical form.
9
Why Handwritten CAPTCHA?

No handwritten text based CAPTCHA exists - so
far!!!
Several machine printed text based CAPTCHAs have
already been broken
Greg Mori and Jitendra Malik of UCB have
written a program that can solve EZ-Gimpy with
83% accuracy
Thayananthan, Stenger, Torr, and Cipolla of the
Cambridge vision group have written a program
that can achieve a 93% correct recognition rate
against EZ-Gimpy
Gabriel Moy, Nathan Jones, Curt Harkless, and
Randy Potter of Areté Associates have written a
program that can achieve 78% accuracy against
Gimpy-R
Machine recognition of handwriting is more
difficult than printed text
Handwriting recognition is a task that humans
perform easily and reliably
Research is in the early stages - a promising
field
Handwritten CAPTCHAs will challenge the KBCS
community!

10
State-of-the-art
               Lexicon Driven                     Grapheme Model
Lexicon size   Time (secs)  Top 1 (%)  Top 2 (%)  Time (secs)  Top 1 (%)  Top 2 (%)
10             0.027        96.53      98.73      0.021        96.56      98.77
100            0.044        89.22      94.13      0.031        89.12      94.06
1000           0.144        75.38      86.29      0.089        75.38      86.29
20000          1.827        58.14      66.56      0.994        58.14      66.49

Speed and accuracy of a handwriting recognizer
(HR). Feature extraction time is excluded. The
testing platform is an Ultra-SPARC.

11
Source of Errors for HW Recognizers

Image quality: background noise, printing surface,
writing styles
Image features: variable stroke width, slope,
rotations, stretching, compressing
Segmentation errors: over-segmentation, merging,
fragmentation, ligatures, scrawls
Recognition errors: confusion with similar lexicon
entries, large lexicons

12
Creating H-CAPTCHAS

Use handwritten word images that current
recognizers cannot read
Controlled distortion of existing handwritten
word images
Create handwritten images by concatenating
handwritten character images
Use handwritten US city name images (4,000 from
CEDAR CDROM)
Character images were discretely printed to begin
with
Character images are automatically segmented out
of handwritten word images
Use set of 20,000 handwritten character images
(extracted by program)
Synthesize sentence images by gluing together
isolated upper and lower case handwritten
characters or word images

13
H-CAPTCHA Generation Algorithm

Input.
Original (random) handwritten image (existing US
city name image, synthetic word image with
length 5 to 8 characters, or meaningful sentence).
Lexicon containing the image's truth word.
Output.
H-CAPTCHA image.
Method.
Randomly choose a number of transformations.
Randomly select that many transformations from:
add lines, circles, grids, arcs, background noise
(multiplicative or impulse), random convolution
masks, blur, wave, spread, median filters,
thickening or thinning of characters in the
vertical or horizontal direction, etc.
An a priori order is assigned to each
transformation based on experimental results.
Sort the list of chosen transformations by their
priority order and apply them in sequence, so
that the effect is cumulative.
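
The method above can be sketched in a few lines. This is a minimal illustration, not the authors' code: images are assumed to be lists of rows of 0/1 pixels (1 = ink), only three toy transformations are included, and the priority numbers in `TRANSFORMS` are made up, since the experimentally derived order is not given in the slides.

```python
import random


def thicken_horizontally(img, rng):
    # Toy "thick characters" transform: smear every ink pixel one column right.
    out = [row[:] for row in img]
    for y, row in enumerate(img):
        for x, pixel in enumerate(row):
            if pixel and x + 1 < len(row):
                out[y][x + 1] = 1
    return out


def add_horizontal_line(img, rng):
    # Toy "add lines" transform: draw one full-width line on a random row.
    out = [row[:] for row in img]
    y = rng.randrange(len(out))
    out[y] = [1] * len(out[y])
    return out


def add_impulse_noise(img, rng, rate=0.05):
    # Toy impulse noise: flip each pixel with probability `rate`.
    return [[1 - p if rng.random() < rate else p for p in row] for row in img]


# (priority, transform) pairs; the priorities are hypothetical placeholders.
TRANSFORMS = [
    (1, thicken_horizontally),
    (2, add_horizontal_line),
    (3, add_impulse_noise),
]


def generate_hcaptcha(img, rng):
    # Step 1: randomly choose how many transformations to apply.
    count = rng.randint(1, len(TRANSFORMS))
    # Step 2: randomly choose which transformations.
    chosen = rng.sample(TRANSFORMS, count)
    # Step 3: sort by priority and apply in sequence (cumulative effect).
    chosen.sort(key=lambda pair: pair[0])
    for _, transform in chosen:
        img = transform(img, rng)
    return img, [transform.__name__ for _, transform in chosen]
```

Passing an explicit `random.Random` instance keeps the sketch reproducible in tests; a deployed generator would presumably draw from an unpredictable source.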

14
Handwritten text images

Examples of handwritten characters used to
generate random words.

Examples of handwritten US city name images used
as a base for transformations.
Examples of synthetic handwritten sentence images.
15
H-CAPTCHA by Image Quality Transforms
Add lines, grids, arcs, background noise,
convolution masks and special filters
16
H-CAPTCHA by Image Features Transforms
Variable stroke width, slope, rotations,
stretching, compressing
17
H-CAPTCHA by Segmentation Transform
Delete ligatures, use touching letters/digits,
merge characters to cause over-segmentation or to
make segmentation impossible
18
H-CAPTCHA by Lexicon Transform

Lexicon challenges: size, density, availability

19
H-CAPTCHA Evaluation

No risk of image repetition
Image generation completely automated: words,
images and distortions chosen at random
The transformed images cannot be easily
normalized or rendered noise free by present
computer programs, even though the original
images must be public knowledge
Deformed images do not pose problems to humans
Human subjects succeeded on our test images
Tested against state-of-the-art recognizers: WMR
and Accuscript
CAPTCHAs unbroken by CEDAR recognizers

20
H-CAPTCHAs

Handwritten US city name images that defeat both
WMR and Accuscript recognizers.

21
H-CAPTCHA Challenge
Word Recognizer   Number of Recognized Images   Accuracy (%)
WMR               383                           9.28
Accuscript        182                           4.41

Low accuracy of handwriting recognizers. The
lexicons are created so as to contain all the
truths of test images. Total number of tested
images is 4,127 (and so is the lexicon size).

Number of Students   Number of Test Images   Human Accuracy (%)   WMR Accuracy (%)   Accuscript Accuracy (%)
12                   15                      82                   0                  0

Low accuracy of handwriting recognizers vs.
humans on a subset of test images.
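
As a quick consistency check, the recognizer accuracies follow directly from the recognized-image counts and the 4,127 test images. The helper name `accuracy_percent` is just an illustrative choice.

```python
def accuracy_percent(recognized, total):
    # Accuracy as a percentage, rounded to two decimals as in the table.
    return round(100.0 * recognized / total, 2)


TOTAL_IMAGES = 4127
wmr = accuracy_percent(383, TOTAL_IMAGES)         # WMR row
accuscript = accuracy_percent(182, TOTAL_IMAGES)  # Accuscript row
print(wmr, accuscript)
```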
22
CAPTCHA using Gestalt Psychology

Gestalt psychology is based on the observation
that we often experience things that are not a
part of our simple sensations
What we are seeing is an effect of the whole
event, not contained in the sum of the parts
(holistic approach)
Organizing principles - Gestalt laws
law of closure
law of similarity
law of proximity
law of symmetry
law of continuity
law of familiarity
figure and ground
Not restricted to perception: the laws also apply
to memory

OXXXXXX
XOXXXXX
XXOXXXX
XXXOXXX
XXXXOXX
XXXXXOX
XXXXXXO

23
H-CAPTCHA based on Gestalt Laws
Gestalt laws: proximity, symmetry, familiarity,
continuity
Methods: create horizontal or vertical overlaps
- for same words: smaller distance overlaps
- for different words: bigger distance overlaps
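
A horizontal overlap can be sketched as gluing two images of equal height so that the right one starts a few columns before the left one ends. This is an assumed reading of the method, not the authors' code: images are lists of rows of 0/1 pixels (1 = ink), and overlapping pixels are combined with a logical OR.

```python
def overlap_horizontally(left, right, overlap):
    """Join two equal-height binary images, sharing `overlap` columns."""
    if len(left) != len(right):
        raise ValueError("images must have the same height")
    left_width, right_width = len(left[0]), len(right[0])
    if not 0 <= overlap <= min(left_width, right_width):
        raise ValueError("overlap must fit inside both images")
    start = left_width - overlap  # column where the right image begins
    joined = []
    for left_row, right_row in zip(left, right):
        row = left_row[:] + [0] * (right_width - overlap)
        for x, pixel in enumerate(right_row):
            row[start + x] |= pixel  # ink from either image survives
        joined.append(row)
    return joined
```

With `overlap=0` this is plain concatenation; larger values push the two images into each other, which is the effect the slide describes as smaller or bigger distance overlaps.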
24
H-CAPTCHA based on Gestalt Laws
Gestalt laws: closure, proximity, continuity
Methods: create occlusions by circles,
rectangles, lines with random angles
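
Occlusion by circles might look like the sketch below. It assumes binary images stored as lists of rows (1 = ink) and assumes that an occluding disc overwrites pixels with the background value; the slides do not say whether occluders are drawn in ink or in background colour, so `value` is left as a parameter.

```python
import random


def occlude_with_circles(img, n_circles, radius, rng, value=0):
    """Overwrite `n_circles` randomly placed filled discs with `value`."""
    height, width = len(img), len(img[0])
    out = [row[:] for row in img]
    for _ in range(n_circles):
        cy, cx = rng.randrange(height), rng.randrange(width)
        for y in range(max(0, cy - radius), min(height, cy + radius + 1)):
            for x in range(max(0, cx - radius), min(width, cx + radius + 1)):
                if (y - cy) ** 2 + (x - cx) ** 2 <= radius ** 2:
                    out[y][x] = value
    return out
```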
25
H-CAPTCHA based on Gestalt Laws
Gestalt laws: closure, proximity, continuity
Methods: add occlusions by waves from left to
right on the entire image, with various
amplitudes / wavelengths, or rotate them by an
angle
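
A wave occlusion running from left to right can be sketched with a sine curve. The function below is an illustration only: it assumes binary images as lists of rows (1 = ink), centres the wave vertically, and omits the rotation variant; `thickness` and `value` are hypothetical parameters.

```python
import math


def occlude_with_wave(img, amplitude, wavelength, thickness=1, value=0):
    """Overwrite a sine-shaped band across the whole image width."""
    height, width = len(img), len(img[0])
    out = [row[:] for row in img]
    middle = (height - 1) / 2
    for x in range(width):
        top = round(middle + amplitude * math.sin(2 * math.pi * x / wavelength))
        for y in range(top, top + thickness):
            if 0 <= y < height:  # clip the wave to the image
                out[y][x] = value
    return out
```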
26
H-CAPTCHA based on Gestalt Laws
Gestalt laws: closure, proximity, continuity,
background
Methods: use empty letters, broken letters, edgy
contour, fragmentation
27
H-CAPTCHA based on Gestalt Laws
Gestalt laws: memory, internal metrics,
familiarity of letters
vertical mirror: difficult for humans
horizontal mirror: difficult for humans
flip-flop: OK for humans!!
Methods: change the orientation of the entire
word, or of a few letters only
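
The orientation changes can be sketched on an image stored as a list of rows. The naming is an assumption: here "vertical mirror" reflects about a vertical axis (left and right swap), "horizontal mirror" reflects about a horizontal axis (top and bottom swap), and "flip-flop" applies both, which amounts to a 180-degree rotation.

```python
def vertical_mirror(img):
    # Reflect about a vertical axis: each row is reversed.
    return [row[::-1] for row in img]


def horizontal_mirror(img):
    # Reflect about a horizontal axis: the row order is reversed.
    return [row[:] for row in img[::-1]]


def flip_flop(img):
    # Both mirrors together, i.e. the image turned upside down.
    return horizontal_mirror(vertical_mirror(img))
```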
28
Gestalt H-CAPTCHA Results
Transformation               WMR Accuracy (%)   Accuscript Accuracy (%)
Horizontal Overlap (Small)   24.35              2.93
Horizontal Overlap (Large)   12.93              2.42
Vertical Overlap             27.88              12.64
Occlusion by Waves           15.43              10.56
Occlusion by Circles         35.93              32.34
Empty Letters                0.89               0.06
Less Fragmentation           0                  0.18
More Fragmentation           0.48               0
Old Transforms               9.28               4.41

The number of tested images is 4,127 for each
type of transformation.
29
Future Work
Personalizing Email Addresses

Creates transformed alias e-mail addresses to
prevent mining by software agents

30
Future Work
Adult vs. Child vs. Machine

A few methods to differentiate between adult and
child
Asking a question that has the answer in the
handwritten sentence
Giving an incomplete handwritten sentence and
asking the user to infer the missing word
Comparing the handwritten text with a standard
word list
Using longer, more complicated handwritten
sentences, using advanced topics from technical
fields such as math, physics, or finance
Useful for Internet services due to the growth of
websites harmful to minors

Reading abilities delimitation
Machine vs. 1st grade child
Adult vs. 7th grade child

31
Future Work

HCAPTCHA based on Handwritten Sentence Reading
and Understanding
Incorporate and adjust the image complexity
factor as a parameter of error
Try out more image transformations and compare
results against human performance
Cognitive aspects of HCAPTCHA for adult vs. child
protocol
HCAPTCHA as a Challenge Response Protocol for
Security Systems
Online-Handwriting CAPTCHA
HCAPTCHA as a Biometric?
HCAPTCHA normalization concerns based on future
technology development

Thank You

33
Handwritten CAPTCHA Applications

Wide variety of web applications
Suppressing SPAM and worms
Only accept an email if I know there is a human
behind the other computer.
Prove you are human before you can get a free
email account.
Search engine bots
There is an HTML tag to prevent search engine
bots from reading web pages, but it only
serves to say "no bots, please"; it does not
guarantee that bots won't enter a web site.
Thwarting password guessing
Prevent a computer from being able to iterate
through the entire space of passwords.
Blocking denial-of-service attacks
Prevent congestion based DoS attacks from denying
any users access to web servers
targeted by those attacks.

34
Handwritten CAPTCHA Applications

Preventing ballot stuffing
Can the result of any online poll be trusted? Not
unless the poll requires that only
humans can vote.
Protecting databases
E.g., eBay protecting its data from auction
portals that search across auction sites to
provide listings and price information for their
users, by prohibiting the copying of that
data
Email addresses personalization
You will only be able to read the address and
send the email if you are a human.

35
CAPTCHA Tests

PIX
Uses a large database of labeled images. All of
these images are pictures of concrete objects (a
horse, a table, a house, a flower, etc). In our
example an egg. The program picks an object at
random, finds 4 random images of that object from
its database, distorts them at random, presents
them to the user and then asks the question "what
are these pictures of?"

ECO
Sounds can be thought of as a sound version
of Gimpy. The program picks a word or a sequence
of numbers at random, renders the word or the
numbers into a sound clip and distorts the clip.
It then presents the distorted sound clip to its
user and asks the user to type in the contents of
the sound clip.
36
CAPTCHA Tests

ARTiFACIAL
For each user request, it automatically
synthesizes an image with a distorted face
embedded in a cluttered background. The user is
asked to first find the face and then click on 6
points (4 eye corners and 2 mouth corners) on the
face.

37
Power of Context
Context
Ranked Lexicon
38
Lexicon Driven Model
Distance between the first character w of the
lexicon entry "word" and the image between
- segments 1 and 4 is 5.0
- segments 1 and 3 is 7.2
- segments 1 and 2 is 7.6
Find the best way of accounting for characters
w, o, r, d by consuming all segments 1
to 8 in the process
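
The matching described above can be sketched as a dynamic program over (characters matched, segments consumed). This is an illustration under assumptions, not the recognizer's actual algorithm: the function name `match_word`, the `max_span` limit of 4 segments per character, and the fallback distance of 10.0 are hypothetical. Only the three distances for w come from the slide; the other table entries are made up.

```python
import math


def match_word(word, n_segments, distance, max_span=4):
    """Minimum total distance for matching `word` to segments 1..n_segments.

    `distance(ch, first, last)` scores character `ch` against the image
    between segments `first` and `last` (1-based, inclusive).
    """
    # best[i][j]: cheapest way to explain the first i characters
    # using exactly the first j segments.
    best = [[math.inf] * (n_segments + 1) for _ in range(len(word) + 1)]
    best[0][0] = 0.0
    for i, ch in enumerate(word, start=1):
        for j in range(1, n_segments + 1):
            for span in range(1, min(max_span, j) + 1):
                previous = best[i - 1][j - span]
                if previous < math.inf:
                    cost = previous + distance(ch, j - span + 1, j)
                    best[i][j] = min(best[i][j], cost)
    return best[len(word)][n_segments]


# Distances for "w" are from the slide; the rest are made-up placeholders.
TABLE = {
    ("w", 1, 4): 5.0, ("w", 1, 3): 7.2, ("w", 1, 2): 7.6,
    ("o", 5, 6): 3.0, ("r", 7, 7): 4.0, ("d", 8, 8): 2.0,
}


def table_distance(ch, first, last):
    # Unknown (character, span) pairs get an arbitrary high distance.
    return TABLE.get((ch, first, last), 10.0)
```

Running the same routine for every lexicon entry and sorting the entries by total distance would give a ranked lexicon.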
39
Lexicon Free Model

Image from segment 1 to 3 is a u with 0.5
confidence
Image from segment 1 to 4 is a w with 0.7
confidence
Image from segment 1 to 5 is a w with 0.6
confidence and an m with 0.3 confidence

w.6, m.3
w.7
d.8
o.5
u.5, v.2
i.8, l.8
i.7
r.4
u.3
m.2
m.1
Find the best path in the graph from segment 1
to 8: w o r d
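
The best-path search can be sketched as follows, assuming each hypothesis is an edge (first segment, last segment, character, confidence) and that a path is scored by the product of its confidences. The spans for u, v, w and m over segments 1 to 3, 1 to 4 and 1 to 5 follow the slide; the spans assigned to i, o, r and d are hypothetical, because the transcript does not preserve them.

```python
def best_path(n_segments, edges):
    """Return (score, text) of the best reading covering segments 1..n_segments."""
    # best[k]: best (score, text) among readings that consume segments 1..k.
    best = {0: (1.0, "")}
    for end in range(1, n_segments + 1):
        for first, last, ch, confidence in edges:
            if last == end and (first - 1) in best:
                score, text = best[first - 1]
                candidate = (score * confidence, text + ch)
                if end not in best or candidate[0] > best[end][0]:
                    best[end] = candidate
    return best.get(n_segments)  # None when no path reaches the last segment


EDGES = [
    (1, 3, "u", 0.5), (1, 3, "v", 0.2),  # spans from the slide
    (1, 4, "w", 0.7),
    (1, 5, "w", 0.6), (1, 5, "m", 0.3),
    (4, 4, "i", 0.8),                    # hypothetical spans from here on
    (5, 6, "o", 0.5),
    (7, 7, "r", 0.4),
    (8, 8, "d", 0.8),
]
```

A plain product favours readings with fewer characters, so a practical recognizer would likely normalize the score by path length; that refinement is left out of the sketch.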
40
Grapheme Model
Loops
End
Junction
End
Loop
Turns
41
Matching - Structural Features
Statistical analysis of the feature attributes
42
Hidden Markov Models

The occurrence of the structural features can be
modeled as an HMM
The HMM can be converted to an SFSA by assigning
observations and probabilities to the transitions
instead of to the states
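
One way to picture the conversion: every transition into a state inherits that state's emission distribution, scaled by the transition probability, and a new start state absorbs the initial distribution. The sketch below is a generic construction under these assumptions, not necessarily the one used in the recognizer; the state name "START" and the dictionary layout are illustrative choices.

```python
def hmm_to_sfsa(initial, trans, emit):
    """Move emissions from states onto transitions.

    Returns arcs[(src, dst)][symbol] = P(go to dst and observe symbol | src).
    """
    arcs = {}
    for dst, p_start in initial.items():
        arcs[("START", dst)] = {o: p_start * p for o, p in emit[dst].items()}
    for src, row in trans.items():
        for dst, p_move in row.items():
            arcs[(src, dst)] = {o: p_move * p for o, p in emit[dst].items()}
    return arcs


def hmm_likelihood(initial, trans, emit, obs):
    # Standard forward algorithm with emissions attached to states.
    alpha = {s: initial[s] * emit[s].get(obs[0], 0.0) for s in initial}
    for o in obs[1:]:
        alpha = {
            d: sum(alpha[s] * trans[s].get(d, 0.0) for s in alpha)
            * emit[d].get(o, 0.0)
            for d in trans
        }
    return sum(alpha.values())


def sfsa_likelihood(arcs, obs):
    # Forward algorithm with emissions attached to transitions.
    alpha = {"START": 1.0}
    for o in obs:
        following = {}
        for (src, dst), symbols in arcs.items():
            if src in alpha:
                gain = alpha[src] * symbols.get(o, 0.0)
                following[dst] = following.get(dst, 0.0) + gain
        alpha = following
    return sum(alpha.values())
```

Both forward passes assign the same probability to any observation sequence, which is what makes the two representations interchangeable.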

43
Law of closure

If something is missing in an otherwise complete
figure, we will tend to add it (a triangle,
for example, with a small part of its edge
missing, will still be seen as a triangle). We
will close the gap. A set of dots outlining the
shape of a B is likely to be perceived as a B,
not as a set of dots. We tend to complete the
figure, make it the way it should be, finish
it.
44
Law of similarity
OXXXXXXXXXX
XOXXXXXXXXX
XXOXXXXXXXX
XXXOXXXXXXX
XXXXOXXXXXX
XXXXXOXXXXX
XXXXXXOXXXX
XXXXXXXOXXX
XXXXXXXXOXX
XXXXXXXXXOX
XXXXXXXXXXO

We tend to group similar items together, to see
them as forming a larger form. It is just natural
for us to see the O's as a line within a field of
X's.
45
Law of proximity

Things that are close together are seen as
belonging together. You are much more likely to
see three lines of close-together symbols than 14
vertical collections of 3 symbols each.
46
Law of symmetry

Despite the pressure of proximity to group the
brackets nearest each other together, symmetry
overwhelms our perception and makes us see them
as pairs of symmetrical brackets.
47
Law of continuity
We can see a line, for example, as continuing
through another line, rather than stopping and
starting, as in this example, which we see as
composed of two lines, not as a combination of
two angles.

Ambiguous segmentation
Segmentation based on good continuity, follows
the path of minimal curvature change
Perceptually implausible segmentation

48
Law of familiarity
The elements are grouped together if we are used
to seeing them together, e.g. we are used to
seeing rectangles and squares rather than the
shape in (c).

Ambiguous segmentation
Perceptual segmentation
Segmentation based on good continuity proves to
be erroneous

49
Figure and ground
We seem to have an innate tendency to perceive
one aspect of an event as the figure or
fore-ground and the other as the ground or
back-ground. There is only one image here, and
yet, by changing nothing but our attitude, we can
see two different things. It doesn't even seem
to be possible to see them both at the same time!
50
Memory

If you see an irregular figure, it is likely that
your memory will straighten it out for you a
bit.
Or, if you experience something that doesn't
quite make sense to you, you will tend to
remember it as having meaning that may not have
been there. Dreams are a good example: watch
yourself the next time you tell someone a dream
and see if you don't notice yourself modifying
the dream a little to force it to make sense!
The world is an outside iconic memory with
internal metric relations.

After flip-flop (vertical mirror / horizontal
mirror)
