LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

Slides: 60
Provided by: wongch7


1
LYU0203: Smart Traveller with Visual Translator for OCR and Face Recognition
Department of Computer Science and Engineering, The Chinese University of Hong Kong
Supervised by Prof. LYU, Rung Tsong Michael
Prepared by Wong Chi Hang and Tsang Siu Fung
2
Outline
  • Introduction
  • Overall Design
  • Korean OCR
  • Face Detection
  • Future Work

3
Introduction: What is VTT?
  • Smart Traveller with Visual Translator (VTT)
  • Runs on a mobile device that is convenient for a
    traveller to carry
  • Mobile phone, Pocket PC, Palm, etc.
  • Recognizes foreign text and translates it into the
    traveller's native language
  • Detects a face and recognizes it to recall the person's name

4
Introduction: Motivation
  • More and more people own mobile devices such as
    Pocket PCs, Palms, and mobile phones.
  • Mobile devices are becoming more powerful.
  • Many people travel abroad.

5
Introduction: Motivation (Cont.)
  • Types of programs for mobile devices
  • Communication and network
  • Multimedia
  • Games
  • Personal management
  • System tools
  • Utilities

6
Introduction: Motivation (Cont.)
  • Applications for travellers?
  • Almost none!
  • Travellers very often run into problems with an
    unfamiliar foreign language.
  • Therefore, the demand for a traveller application
    is very large.

7
Introduction: Objective
  • Help travellers overcome language and memory
    problems
  • Two main features
  • Recognize Korean text and translate it to English
    (we cannot read Korean)
  • Detect and recognize a face (sometimes we
    forget the name of a friend)

8
Introduction: Objective (Cont.)
  • Targets of the Korean OCR
  • Signs and guideposts
  • Printed characters
  • Text color that contrasts with the background color
  • Targets of the face recognizer
  • One face per photo
  • Frontal face
  • Limited set of faces

9
Introduction: Objective (Cont.)
  • Real-life examples
  • Sometimes we lose our way and need to know where
    we are.
  • Sometimes we forget somebody we have met before.

10
Overall Design of VTT System
11
KOCR Design
12
KOCR Text Area Detection
  • Edge Detection using Sobel Filter

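$$
G_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix},
\qquad
G_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}
$$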
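
To make the Sobel step concrete, here is a minimal pure-Python sketch that applies the two 3x3 kernels from this slide to a small grayscale image. The function names, the |Gx| + |Gy| magnitude, and the zeroed border are assumptions made for illustration; the slides do not say how the authors combined the two responses.

```python
# Sobel kernels as listed on the slide.
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # responds to horizontal edges
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # responds to vertical edges


def apply_kernel(img, kernel):
    """Correlate a 3x3 kernel with a grayscale image; border pixels stay 0."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(kernel[j][i] * img[y + j - 1][x + i - 1]
                            for j in range(3) for i in range(3))
    return out


def sobel_magnitude(img):
    """Edge strength approximated as |Gx| + |Gy| (an assumed, common choice)."""
    gx = apply_kernel(img, SOBEL_X)
    gy = apply_kernel(img, SOBEL_Y)
    return [[abs(a) + abs(b) for a, b in zip(rx, ry)]
            for rx, ry in zip(gx, gy)]


# Tiny test image with a vertical step edge between columns 1 and 2.
image = [[0, 0, 9, 9] for _ in range(4)]
edges = sobel_magnitude(image)
```

Interior pixels next to the step get a strong response, while the border rows and columns are left at zero because the kernel does not fit there.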
13
KOCR Text Area Detection (Cont.)
  • Horizontal and Vertical Edge Projection
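
The projection step can be sketched as follows: sum the edge map along rows and along columns, then keep the span where each profile is high. The rule "at least half of the peak" and all function names are assumptions for illustration; the slides do not state how the text area is selected from the profiles.

```python
def projections(edge_map):
    """Return (row_sums, col_sums) of a 2-D edge-strength map."""
    row_sums = [sum(row) for row in edge_map]
    col_sums = [sum(col) for col in zip(*edge_map)]
    return row_sums, col_sums


def active_span(profile, fraction=0.5):
    """First and last index whose value reaches `fraction` of the peak.

    The 0.5 default is an assumed rule, not taken from the slides.
    """
    limit = fraction * max(profile)
    hits = [i for i, v in enumerate(profile) if v >= limit and v > 0]
    return (hits[0], hits[-1]) if hits else None


def text_bounding_box(edge_map, fraction=0.5):
    """Return ((top, bottom), (left, right)) of the likely text area."""
    rows, cols = projections(edge_map)
    return active_span(rows, fraction), active_span(cols, fraction)


edge_map = [
    [0, 0, 0, 0, 0],
    [0, 5, 7, 6, 0],
    [0, 6, 8, 5, 0],
    [0, 0, 0, 0, 0],
]
box = text_bounding_box(edge_map)
```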

14
KOCR Binarization
  • Color Segmentation
  • Based on the color histogram

Threshold
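
The slides do not say how the threshold is picked from the histogram. As one possible illustration, the sketch below uses Otsu's method on gray levels, which is a swapped-in, well-known technique and not necessarily what the authors did. Function names are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the level t that maximizes between-class variance.

    Pixels <= t form the dark class, pixels > t the bright class.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * c for i, c in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # pixel count of the dark class so far
    sum0 = 0    # intensity sum of the dark class so far
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t


def binarize(img, t):
    """Map pixels above the threshold to 1 and the rest to 0."""
    return [[1 if p > t else 0 for p in row] for row in img]


gray = [[10, 12, 11], [200, 210, 205]]
t = otsu_threshold([p for row in gray for p in row])
mask = binarize(gray, t)
```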
15
KOCR Stroke Extraction
  • Labeling of Connected Component with
    8-connectivity
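
Connected-component labeling with 8-connectivity can be sketched with a breadth-first flood fill. This is a generic textbook version, assuming a binary image stored as a list of rows with 1 for foreground; the function name is hypothetical and the authors' implementation may differ.

```python
from collections import deque


def label_components(binary):
    """Label foreground pixels (value 1) using 8-connectivity.

    Returns (labels, count); background pixels keep label 0.
    """
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for sy in range(h):
        for sx in range(w):
            if binary[sy][sx] != 1 or labels[sy][sx]:
                continue
            count += 1
            labels[sy][sx] = count
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                # Visit all 8 neighbours (the centre is already labelled).
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] == 1
                                and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


binary = [
    [1, 0, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
]
labels, count = label_components(binary)
```

The two pixels on the diagonal end up in the same component, which is exactly what distinguishes 8-connectivity from 4-connectivity.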

16
KOCR Stroke Extraction (Cont.)
  • Why do we recognize strokes rather than whole characters?
  • A Korean character is composed of several stroke types
  • Korean has a limited set of stroke types

17
KOCR Stroke Feature
  • Our Proposed Feature
  • Five rays on each side
  • Difference of adjacent rays (-1, 0, or 1)
  • Has holes (0 or 1)
  • Dimension ratio of the stroke (width/height) (-1, 0,
    or 1)
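
One possible reading of this feature, sketched under several assumptions: the stroke is a binary image cropped to its bounding box; each of the four sides casts five evenly spaced rays inward and records the distance to the first stroke pixel; adjacent ray depths are compared by sign; the width/height ratio is quantized with a hypothetical tolerance of 0.2. None of these details are given on the slide, and all names are hypothetical.

```python
def ray_depths(stroke, side, n_rays=5):
    """Distance from `side` to the first stroke pixel along each ray."""
    h, w = len(stroke), len(stroke[0])
    depths = []
    for k in range(n_rays):
        if side in ("left", "right"):
            y = (2 * k + 1) * h // (2 * n_rays)      # row of this ray
            line = stroke[y] if side == "left" else stroke[y][::-1]
        else:
            x = (2 * k + 1) * w // (2 * n_rays)      # column of this ray
            col = [stroke[y][x] for y in range(h)]
            line = col if side == "top" else col[::-1]
        depths.append(line.index(1) if 1 in line else len(line))
    return depths


def sign(v):
    return (v > 0) - (v < 0)


def has_hole(stroke):
    """1 if some background pixel is unreachable from the border."""
    h, w = len(stroke), len(stroke[0])
    seen = [[False] * w for _ in range(h)]
    stack = [(y, x) for y in range(h) for x in range(w)
             if (y in (0, h - 1) or x in (0, w - 1)) and stroke[y][x] == 0]
    for y, x in stack:
        seen[y][x] = True
    while stack:
        y, x = stack.pop()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < h and 0 <= nx < w
                    and stroke[ny][nx] == 0 and not seen[ny][nx]):
                seen[ny][nx] = True
                stack.append((ny, nx))
    return int(any(stroke[y][x] == 0 and not seen[y][x]
                   for y in range(h) for x in range(w)))


def stroke_features(stroke, tolerance=0.2):
    """16 ray-difference signs, then the hole flag, then the ratio class."""
    h, w = len(stroke), len(stroke[0])
    feats = []
    for side in ("left", "right", "top", "bottom"):
        d = ray_depths(stroke, side)
        feats += [sign(b - a) for a, b in zip(d, d[1:])]
    feats.append(has_hole(stroke))
    ratio = w / h
    feats.append(1 if ratio > 1 + tolerance
                 else -1 if ratio < 1 - tolerance else 0)
    return feats


ring = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]
ring_features = stroke_features(ring)
```

Because every value is drawn from a small discrete set, two strokes with the same shape class tend to produce identical vectors, which is what makes exact matching plausible.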

18
KOCR Stroke Feature (Cont.)
  • Problems Faced
  • Training the stroke database takes a long time
  • Two or more strokes may stick together

19
KOCR Stroke Recognition
  • Exact Matching by Pre-learned Stroke Features
  • Trained Decision Tree
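
Exact matching against pre-learned features can be illustrated with a plain lookup table keyed by the feature vector. This is only a sketch: the labels and feature values below are hypothetical, and a dictionary stands in for whatever structure (such as the trained decision tree mentioned above) the authors actually used.

```python
def train(samples):
    """Build a lookup table from (feature_vector, label) pairs."""
    table = {}
    for features, label in samples:
        table[tuple(features)] = label
    return table


def recognize(table, features):
    """Return the label whose stored features match exactly, else None."""
    return table.get(tuple(features))


# Hypothetical training data: short feature vectors and made-up labels.
table = train([
    ([0, 1, -1, 0], "stroke_a"),
    ([1, 1, 0, 1], "stroke_b"),
])
known = recognize(table, [1, 1, 0, 1])
unknown = recognize(table, [0, 0, 0, 0])
```

Exact matching returns nothing for any vector it has not seen, which is consistent with the slides' remark that training the stroke database takes a long time.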

20
KOCR Pattern Identification
  • Six patterns of Korean characters
  • Identified by simple if-then-else statements

[Figure: the six character patterns, labelled 0 to 5]
21
Face Detection
  • Outline
  • 1. Find Face Region
  • 2. Find the potential eye region
  • 3. Locate the iris
  • 4. Improvement

22
1. Find Face Region
  • There are three methods available
  • 1. Projection of the image
  • 2. Based on the gray-scale image
  • 3. Color-based model

23
1. Find Face Region: Projection of the image
  • Consider only a single color channel: blue, green, or
    red.
  • The blue channel is usually used because it avoids
    interference from the facial features.
  • Project the blue pixel values vertically to find the
    left and right edges of the face.

24
1. Find Face Region (Cont.): Projection of the image
[Figure: projection curve; the vertical axis is the sum of pixel values]
25
1. Find Face Region (Cont.): Projection of the image
  • The high-frequency components of the projection
    curve should be filtered out using the FFT (Fast
    Fourier Transform)
  • This assumes the face occupies a large area of the image
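
The projection and smoothing steps can be sketched as below. To stay dependency-free, a naive O(n^2) discrete Fourier transform stands in for the FFT; the result is the same, only slower. The cutoff `keep` and the function names are assumptions for illustration.

```python
import cmath


def column_projection(blue):
    """Sum the blue-channel values of each column (vertical projection)."""
    return [sum(col) for col in zip(*blue)]


def dft(values):
    """Naive discrete Fourier transform (stand-in for an FFT)."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, v in enumerate(values))
            for k in range(n)]


def idft(coeffs):
    """Inverse transform; returns the real part of each sample."""
    n = len(coeffs)
    return [sum(c * cmath.exp(2j * cmath.pi * k * t / n)
                for k, c in enumerate(coeffs)).real / n
            for t in range(n)]


def low_pass(profile, keep=3):
    """Zero every frequency above `keep`, including its mirrored bin."""
    n = len(profile)
    coeffs = dft(profile)
    filtered = [c if min(k, n - k) <= keep else 0
                for k, c in enumerate(coeffs)]
    return idft(filtered)


blue = [[1, 2, 3], [4, 5, 6]]
profile = column_projection(blue)
smooth = low_pass([0, 10, 0, 10, 0, 10, 0, 10], keep=1)
```

In the usage above the input alternates at the highest possible frequency, so after filtering only the mean level remains.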

26
1. Find Face Region: Based on the gray-scale image
  • No color information
  • Pattern recognition

27
1. Find Face Region: Color-based model
  • We use this method because of its simplicity and
    robustness.
  • A color model is used to represent color.
  • Since the human retina has three types of color
    photoreceptor (cone cells), a color model needs three
    numerical components.

28
Color-based model (Cont.)
  • There are many color models, such as RGB, YUV
    (luminance-chrominance), and HSB (hue, saturation,
    and brightness)
  • The RGB color model is usually transformed to
    another color model such as YUV or HSB.

29
Color-based model (Cont.): YUV
  • We use the YUV (YCbCr) color model.
  • The Y component represents the intensity of
    the image
  • Cb and Cr represent the blue-difference and
    red-difference chroma components, respectively.
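
For reference, a minimal RGB to YCbCr conversion is sketched below. It uses the full-range BT.601 coefficients found in JFIF/JPEG; the slides do not state which YCbCr variant the authors used, so treat the exact constants as an assumption.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 (JFIF) conversion on a 0-255 scale.

    Results are returned as floats and are not rounded or clamped.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


white = rgb_to_ycbcr(255, 255, 255)
blue = rgb_to_ycbcr(0, 0, 255)
```

Neutral colors such as white or gray land at Cb = Cr = 128, so the chroma plane separates hue from brightness, which is why skin color clusters tightly there.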

30
Color-based model (Cont.): YCbCr Image
[Figure: original image and its YCbCr image]
31
Representation of Face color
  • How can the YUV color model represent face color?
  • What happens when we plot the pixels in a
    Cr-Cb histogram?

32
Representation of Face color
  • We just use a simple ellipse equation to model
    skin color.

[Figure: skin-color ellipse in the Cb-Cr plane]
33
Representation of Face color
The equation of the ellipse
  • where L is the length of the long axis and S is
    the length of the short axis.
  • We choose L = 35.42, S = 20.615, θ = -0.726
    (radians)
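
The ellipse equation itself did not survive in this transcript, so the following is only a sketch of a standard rotated-ellipse test using the slide's values. It assumes that L and S are full axis lengths (halve them to get semi-axes), that the rotation angle is measured in the Cb-Cr plane with Cb as the horizontal axis, and that the ellipse center is supplied by the caller; the center used in the example is purely illustrative and is not from the slides.

```python
import math

# Values taken from the slide.
L_AXIS, S_AXIS, THETA = 35.42, 20.615, -0.726


def is_skin(cb, cr, center, long_axis=L_AXIS, short_axis=S_AXIS,
            theta=THETA):
    """True if (cb, cr) falls inside the rotated ellipse.

    `center` is the (cb, cr) ellipse center; it is NOT given on the
    slide and must be provided by the caller.
    """
    dx, dy = cb - center[0], cr - center[1]
    # Rotate the offset into the ellipse's own axes.
    u = dx * math.cos(theta) + dy * math.sin(theta)
    v = -dx * math.sin(theta) + dy * math.cos(theta)
    a, b = long_axis / 2, short_axis / 2   # assumed: L, S are full lengths
    return (u / a) ** 2 + (v / b) ** 2 <= 1.0


# Illustrative center only; a hypothetical value for demonstration.
CENTER = (110.0, 150.0)
at_center = is_skin(110.0, 150.0, CENTER)
far_away = is_skin(0.0, 0.0, CENTER)
```

If the authors meant L and S as semi-axis lengths, the division by 2 would be dropped; the rest of the test stays the same.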

34
Representation of Face color: Color segmentation
  • The white regions represent the skin color pixels

35
Representation of Face color: Color segmentation (modified version 1)
  • We distribute some agents uniformly over the image.
  • Each agent then checks whether its pixel is a
    skin-like pixel that has not been visited by another
    agent.
  • If yes, it produces 4 more agents at its four
    neighboring points.
  • If no, it moves to one of its four
    neighboring points at random.
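
The agent rules above can be sketched as a small simulation. Several details are assumptions added so the sketch terminates: agents start on a regular grid with a hypothetical `spacing`, a wandering agent dies after `lifetime` random moves, a reproducing agent is replaced by its children, and children inherit the parent's region id. The skin test is passed in as a precomputed boolean map.

```python
import random


def agent_segmentation(skin, spacing=4, lifetime=20, seed=0):
    """Return (region map, number of distinct pixels examined).

    skin: 2-D list of booleans marking skin-like pixels.
    Region 0 means the pixel was never claimed by an agent.
    """
    rng = random.Random(seed)
    h, w = len(skin), len(skin[0])
    region = [[0] * w for _ in range(h)]
    examined = set()
    # Seed agents on a uniform grid: (y, x, region id, moves left).
    agents = []
    next_id = 1
    for y in range(0, h, spacing):
        for x in range(0, w, spacing):
            agents.append((y, x, next_id, lifetime))
            next_id += 1
    moves = ((-1, 0), (1, 0), (0, -1), (0, 1))
    while agents:
        y, x, rid, life = agents.pop()
        examined.add((y, x))
        neighbours = [(y + dy, x + dx) for dy, dx in moves
                      if 0 <= y + dy < h and 0 <= x + dx < w]
        if skin[y][x] and region[y][x] == 0:
            # Skin-like and unvisited: claim it and reproduce.
            region[y][x] = rid
            agents.extend((ny, nx, rid, lifetime) for ny, nx in neighbours)
        elif life > 0 and neighbours:
            # Otherwise take one random step to a 4-neighbour.
            ny, nx = rng.choice(neighbours)
            agents.append((ny, nx, rid, life - 1))
    return region, len(examined)


# 8x8 test image with a 3x3 skin block in rows 2-4, columns 2-4.
skin = [[2 <= y <= 4 and 2 <= x <= 4 for x in range(8)] for y in range(8)]
region, examined = agent_segmentation(skin)
```

In this small example one seed lands directly on the skin block, so its descendants claim the whole block; on a real image the number of examined pixels depends on the seed spacing and lifetime.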

36
Representation of Face color (Cont.): Color segmentation (modified version 1)
If the pixel is a skin-like pixel and has not been visited
by another agent, the agent produces 4 more agents at its
four neighboring points
37
Representation of Face color (Cont.): Color segmentation (modified version 1)
  • Otherwise, it moves to one of its four
    neighboring points at random

38
Representation of Face color (Cont.): Color segmentation (modified version 1)
  • Each agent searches its own region
  • The regions are shown in the next slide in
    different colors.

39
Representation of Face color (Cont.): Color segmentation (modified version 1)
  • The advantage of this algorithm is that we do
    not need to search the whole image.
  • Therefore, it is fast.

40
Representation of Face color (Cont.): Color segmentation (modified version 1)
  • 19270 of 102900 pixels are searched (about 18.7%)
  • There are 37 regions

41
2. Eye detection
  • After segmenting the face region, some parts of it
    are not regarded as skin color.
  • They are probably the eye and mouth regions.
  • We consider only the red component of these
    regions because it usually carries the most
    information about faces.

42
2. Eye detection (Cont.)
  • We extract such regions using a pseudo-convex hull.

43
2. Eye detection (Cont.)
  • We apply the following to the potential eye
    regions
  • Histogram equalization
  • Threshold

44
2. Eye detection (Cont.)
  • Histogram equalization

Threshold: pixel value < 49
After histogram equalization and thresholding,
the search space for the eyes is greatly reduced.
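
The two operations can be sketched in a few lines. The equalization below is the usual cumulative-histogram mapping; the threshold of 49 comes from the slide, while the function names and the handling of a completely flat image are assumptions.

```python
def equalize(img, levels=256):
    """Histogram-equalize a grayscale image given as rows of ints."""
    hist = [0] * levels
    for row in img:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    cdf, running = [], 0
    for c in hist:
        running += c
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:            # flat image: nothing to stretch
        return [row[:] for row in img]
    scale = (levels - 1) / (total - cdf_min)
    return [[round((cdf[p] - cdf_min) * scale) for p in row] for row in img]


def dark_mask(img, limit=49):
    """Mark pixels darker than `limit` (the slide uses < 49)."""
    return [[1 if p < limit else 0 for p in row] for row in img]


region = [[100, 100, 110], [120, 130, 140]]
equalized = equalize(region)
mask = dark_mask(equalized)
```

Equalization stretches the darkest pixels of the region down to 0, so a fixed threshold such as 49 selects the relatively darkest pixels regardless of the original exposure.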
45
3. Locate the iris
  • After the operations above, we have almost found the
    eye.
  • However, we still need to locate the iris. We tried the
    following methods:
  • Template matching
  • Hough Transform

46
3. Locate the iris (Cont.): Template matching
  • It is based on normalized cross-correlation (NCC).
  • NCC measures the similarity between two
    images

47
3. Locate the iris (Cont.): Template matching
  • Let I1, I2 be images of the same size.
  • I1(p_i) = a_i, I2(p_i) = b_i

NCC(I1, I2) lies in the range [-1, 1]
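
The NCC formula itself is missing from this transcript. A common definition that matches the stated range of [-1, 1] is the zero-mean form: subtract each image's mean, then divide the sum of products by the product of the two norms. The sketch below implements that form on flattened pixel lists; whether the authors subtracted the mean is an assumption, as is returning 0 for a flat image.

```python
import math


def ncc(a, b):
    """Zero-mean normalized cross-correlation of two pixel lists."""
    if len(a) != len(b):
        raise ValueError("images must have the same size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0:
        return 0.0   # flat image: correlation undefined, 0 by convention
    return sum(x * y for x, y in zip(da, db)) / denom


same_shape = ncc([1, 2, 3], [2, 4, 6])
inverted = ncc([1, 2, 3], [3, 2, 1])
```

A score near 1 means the candidate patch varies the same way as the template, and a score near -1 means it is the photographic negative of it.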
48
3. Locate the iris (Cont.): Template matching
We use this template to calculate the NCC. The
template can be obtained by averaging all the eye
images.
49
3. Locate the iris (Cont.): Template matching
The red region shows the result
50
3. Locate the iris (Cont.): Hough transform
  • The Hough transform can recover the complete shape of
    an edge from a small portion of the edge
    information.
  • It works with a parametric representation of the
    object we are looking for.
  • We use the Hough transform with a 2D circle parametric
    representation to find the iris.

51
3. Locate the iris (Cont.): Hough transform
We find the edges of the eye with a Sobel filter.
52
3. Locate the iris (Cont.): Hough transform
  • We place a circle on the edge image and count the
    number of edge pixels lying on the circle

53
3. Locate the iris (Cont.): Hough transform
  • A(x, y, r) <- number of edge pixels on the circle
  • where A(x, y, r) is the accumulator, x and y are the
    coordinates of the center, and r is the radius of
    the circle.
  • The search space for the circle is x, y, r =
    17, 17, 8.
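
The accumulator described above can be sketched by brute force: for every candidate center and radius, count the edge pixels that lie on that circle, then take the best-scoring triple. The circle sampling density, the radius range, and the function names are assumptions; the 17x17 image size simply mirrors the search space mentioned on the slide.

```python
import math


def circle_points(cx, cy, r, samples=64):
    """Integer pixel coordinates lying on a circle (deduplicated)."""
    pts = set()
    for k in range(samples):
        angle = 2 * math.pi * k / samples
        pts.add((round(cx + r * math.cos(angle)),
                 round(cy + r * math.sin(angle))))
    return pts


def hough_circle(edge, radii):
    """Fill A[(x, y, r)] with the edge-pixel count on each circle.

    Returns (best (x, y, r), accumulator dict).
    """
    h, w = len(edge), len(edge[0])
    acc = {}
    for r in radii:
        for cy in range(h):
            for cx in range(w):
                acc[(cx, cy, r)] = sum(
                    1 for x, y in circle_points(cx, cy, r)
                    if 0 <= x < w and 0 <= y < h and edge[y][x])
    best = max(acc, key=acc.get)
    return best, acc


# Synthetic 17x17 edge image containing one circle: center (8, 8), radius 5.
edge = [[0] * 17 for _ in range(17)]
for x, y in circle_points(8, 8, 5):
    edge[y][x] = 1
best, acc = hough_circle(edge, range(3, 8))
```

On a clean synthetic circle the true parameters collect every edge pixel; real eye edges are noisy and partially occluded by the eyelids, which may be one reason the method can fail in practice.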

54
3. Locate the iris (Cont.): Hough transform
  • We tried this method
  • It failed to find the iris

55
4. Improvement
  • Skin Color Detection
  • Neural network with a simplified (polynomial)
    activation function
  • Probability function (e.g. Bayesian estimation)
  • Set up a face shape model that estimates the shape
    of the face

56
4. Improvement (Cont.)
  • Template matching: replace it with a deformable
    template or a probability function.

57
Future Work
  • Stroke Combination
  • Dictionary
  • Face Detection Improvement
  • Face Recognition
  • Normal-luminance light source
  • About 20 people
  • > 90% accuracy
  • Port the system to Pocket PC

58
Q&A
59
The End