Computer Vision: Gesture Recognition from Images


1
Computer Vision: Gesture Recognition from Images
  • Joshua R. New
  • Knowledge Systems Laboratory
  • Jacksonville State University

2
Outline
  • Terminology
  • Current Research and Uses
  • Kjeldsen's PhD Thesis
  • Implementation Overview
  • Implementation Analysis
  • Future Directions

3
Terminology
  • Image Processing - Computer manipulation of images. Some of the many
    algorithms used in image processing include convolution (on which many
    others are based), edge detection, and contrast enhancement.
  • Computer Vision - A branch of artificial intelligence and image
    processing concerned with computer processing of images from the real
    world. Computer vision typically requires a combination of low-level
    image processing to enhance the image quality (e.g. remove noise,
    increase contrast) and higher-level pattern recognition and image
    understanding to recognize features present in the image.
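Since the definition above names convolution as the basis for many other operations, here is a minimal sketch of a 2D convolution in plain Python. The function name `convolve2d`, the list-of-lists image layout, and the zero-padding at the borders are assumptions made for illustration; they are not taken from the presentation.

```python
def convolve2d(image, kernel):
    """Convolve a 2D list-of-lists image with a small odd-sized kernel.

    Neighbours that fall outside the image are treated as 0 (zero padding).
    """
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oy, ox = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(kh):
                for kx in range(kw):
                    # Convolution flips the kernel relative to the image,
                    # which is why the kernel index is subtracted here.
                    sy = y + oy - ky
                    sx = x + ox - kx
                    if 0 <= sy < h and 0 <= sx < w:
                        acc += image[sy][sx] * kernel[ky][kx]
            out[y][x] = acc
    return out


# A 3x3 box filter (simple smoothing) applied to a single bright pixel.
box = [[1 / 9] * 3 for _ in range(3)]
smoothed = convolve2d([[0, 0, 0], [0, 9, 0], [0, 0, 0]], box)
```

Swapping the box kernel for a gradient kernel turns the same loop into a basic edge detector, which is the sense in which other operations build on convolution.
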
4
Current Research
  • Capture images from a camera
  • Process images to extract features
  • Use those features to train a learning system to
    recognize the gesture
  • Use the gesture as a meaningful input into a
    system
  • More information located at
    http://www.cybernet.com/ccohen/

5
Current Research Example
  • Starner and Pentland
  • 2 hands segmented
  • Hand shape from a bounding ellipse
  • Eight element feature vector
  • Recognition using Hidden Markov Models

6
Current Uses
  • SignStream (released demo for MacOS)
  • Database tool for analysis of linguistic data
    captured on video
  • Developed at Boston University with funding from
    ASL Linguistic Research Project and NSF
  • http://www.bu.edu/asllrp/SignStream/

7
Current Uses
  • Recursive Models of Human Motion (Smart Desk,
    MIT)
  • Models the constraints by which we move
  • Visually-guided gestural interaction, animation,
    and face recognition
  • Stereoscopic vision for 3D modeling
  • http://vismod.www.media.mit.edu/vismod/demos/smartdesk/

8
Current Uses
9
Kjeldsen's PhD thesis
  • Application
  • Gesture recognition as a system interface to
    augment that of the mouse
  • Menu selection, window move, and resize
  • Input: 200x300 image
  • Calibration of user's hand

10
Kjeldsen's PhD thesis
  • Image split into HSI channels (I = Intensity,
    Lightness, Value)
  • Segmentation with largest connected component
  • Eroded to get rid of edges
  • Gray-scale values sent to learning system
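
The segmentation and erosion bullets above can be sketched as follows. This is an illustrative reconstruction in plain Python, not code from the thesis: the function names, the 0/1 mask representation, 4-connectivity, and the cross-shaped erosion element are all assumptions.

```python
from collections import deque


def largest_component(mask):
    """Return a 0/1 mask keeping only the largest 4-connected region of 1s."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    best = []
    for sy in range(h):
        for sx in range(w):
            if mask[sy][sx] and not seen[sy][sx]:
                # Breadth-first search collects one connected region.
                seen[sy][sx] = True
                queue, region = deque([(sx, sy)]), []
                while queue:
                    x, y = queue.popleft()
                    region.append((x, y))
                    for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                        if 0 <= nx < w and 0 <= ny < h and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((nx, ny))
                if len(region) > len(best):
                    best = region
    out = [[0] * w for _ in range(h)]
    for x, y in best:
        out[y][x] = 1
    return out


def erode(mask):
    """Keep a pixel only if it and its 4 direct neighbours are all set.

    Pixels on the image border are always cleared.
    """
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            if (mask[y][x] and mask[y - 1][x] and mask[y + 1][x]
                    and mask[y][x - 1] and mask[y][x + 1]):
                out[y][x] = 1
    return out
```

Running `erode(largest_component(mask))` first discards stray regions and then trims the ragged outline of the remaining one.
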

11
Kjeldsen's PhD thesis
  • Learning System: Backprop network
  • 1014 input nodes (one for each pixel)
  • 20 hidden nodes
  • 1 output node for each classification
  • 40 images of each pose
  • Results
  • Correct classification 90-96% of the time on
    images
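
To make the network shape above concrete, here is a minimal sketch of the forward pass of a 1014-20-K network in plain Python. It shows only the forward computation, not backpropagation training. The sigmoid activation, the bias terms, the random initial weights, and the choice of K = 4 output classes are assumptions for illustration; the slide does not state them.

```python
import math
import random


def make_network(n_in, n_hidden, n_out, seed=0):
    """Build small random weights for one hidden layer and one output layer."""
    rng = random.Random(seed)

    def layer(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {"w1": layer(n_hidden, n_in), "b1": [0.0] * n_hidden,
            "w2": layer(n_out, n_hidden), "b2": [0.0] * n_out}


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(net, pixels):
    """Map one flattened gray-scale image to one score per pose class."""
    hidden = [sigmoid(sum(w * p for w, p in zip(row, pixels)) + b)
              for row, b in zip(net["w1"], net["b1"])]
    return [sigmoid(sum(w * a for w, a in zip(row, hidden)) + b)
            for row, b in zip(net["w2"], net["b2"])]


def count_parameters(net):
    """Total number of weights and biases in the network."""
    return (sum(len(r) for r in net["w1"]) + len(net["b1"])
            + sum(len(r) for r in net["w2"]) + len(net["b2"]))


net = make_network(1014, 20, 4)      # 4 pose classes is a hypothetical value
scores = forward(net, [0.5] * 1014)  # one score per class
```

With these assumptions the network holds 1014 × 20 + 20 + 20 × 4 + 4 = 20384 trainable values, nearly all of them in the input-to-hidden layer.
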

12
Implementation Overview
  • System
  • 1.33 GHz AMD Athlon
  • OpenCV and IPL libraries (from Intel)
  • Input
  • 2 640x480 images, saturation channel
  • Max hand size in x and y orientations in # of
    pixels
  • Output
  • Rough estimate of movement
  • Refined estimate of movement
  • Number of fingers being held up
  • Rough Orientation

13
Implementation Overview
  • Chronological order of system
  • Saturation channel extraction
  • Threshold Saturation channel
  • Calculate Center of Mass (CoM)
  • Reduce Noise
  • Remove arm from hand
  • Calculate refined-CoM
  • Calculate orientation
  • Count the number of fingers

14
Implementation Analysis
1. Saturation channel extraction
  • Digital camera, saved as JPGs
  • JPGs converted to 640x480 PPMs
  • Saturation channels extracted into PGMs
[Figure: Original Image with its Hue, Lightness, and Saturation channels]
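
As a rough illustration of the extraction step, the sketch below computes a 0-255 saturation channel from RGB pixels using Python's standard `colorsys` module. The presentation's system used OpenCV and IPL; using `colorsys.rgb_to_hls`, the HLS colour model, and the list-of-tuples image layout are assumptions made here so the example stays self-contained.

```python
import colorsys


def saturation_channel(rgb_image):
    """Map rows of (r, g, b) tuples in 0-255 to rows of 0-255 saturation values."""
    out = []
    for row in rgb_image:
        out_row = []
        for r, g, b in row:
            # rgb_to_hls expects floats in [0, 1] and returns (hue, lightness, saturation).
            _hue, _lightness, s = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)
            out_row.append(int(round(s * 255)))
        out.append(out_row)
    return out


# Pure red is fully saturated; gray and white have no saturation.
row = saturation_channel([[(255, 0, 0), (128, 128, 128), (255, 255, 255)]])
```
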
15
Implementation Analysis
2. Threshold Saturation channel
  • a) Threshold value: 50 (values range from 0 to 255)
  • b) PixelValue becomes 128 if it passes the threshold test against
    50, and 0 otherwise
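
A minimal sketch of this thresholding step in plain Python. The comparison operator did not survive in the slide text, so the strict `>` used here is an assumption, as are the function and constant names.

```python
THRESHOLD = 50   # from the slide
HAND = 128       # label given to pixels that pass the threshold


def threshold_saturation(sat, threshold=THRESHOLD):
    """Return a new image with passing pixels set to 128 and all others to 0."""
    # Assumption: "passing" means strictly greater than the threshold.
    return [[HAND if v > threshold else 0 for v in row] for row in sat]


binary = threshold_saturation([[0, 50, 51, 255]])
```
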
16
Implementation Analysis
3. Calculate Center of Mass (CoM)
  • a) Count the number of 128-valued pixels (the 0th moment of the
    image)
  • b) Sum the x-values and y-values of those pixels (the 1st moments
    for x and y of the image, respectively)
  • c) Divide each sum by the number of pixels to get the Center of
    Mass (location of the centroid)
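
The three steps above translate almost directly into code. The sketch below is a plain-Python illustration; the function name and the list-of-lists image layout are assumptions, and the moment names `m00`, `m10`, `m01` follow the usual image-moment notation rather than anything legible on the slide.

```python
def center_of_mass(image, value=128):
    """Return (x, y) of the centroid of all pixels equal to `value`, or None."""
    m00 = m10 = m01 = 0
    for y, row in enumerate(image):
        for x, v in enumerate(row):
            if v == value:
                m00 += 1   # 0th moment: pixel count
                m10 += x   # 1st moment in x
                m01 += y   # 1st moment in y
    if m00 == 0:
        return None        # no hand pixels found
    return m10 / m00, m01 / m00


blob = [
    [0, 0, 0, 0],
    [0, 128, 128, 0],
    [0, 128, 128, 0],
    [0, 0, 0, 0],
]
com = center_of_mass(blob)
```
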

17
Implementation Analysis
4. Reduce Noise
  • FloodFill at the computed CoM (128-valued pixels become 192)
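
One way to read this step: filling from the CoM relabels only the connected region under it, so isolated 128-valued specks keep their old value and can be ignored by later steps. The sketch below shows that idea with a plain-Python breadth-first flood fill; the system itself used a library FloodFill, and the function name, 4-connectivity, and integer seed are assumptions.

```python
from collections import deque


def flood_fill(image, seed, old=128, new=192):
    """Relabel, in place, the 4-connected region of `old` pixels containing `seed`.

    `seed` is an (x, y) pair of integers. Returns the number of pixels changed.
    """
    h, w = len(image), len(image[0])
    sx, sy = seed
    if old == new or not (0 <= sx < w and 0 <= sy < h) or image[sy][sx] != old:
        return 0
    image[sy][sx] = new
    queue = deque([(sx, sy)])
    filled = 1
    while queue:
        x, y = queue.popleft()
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < w and 0 <= ny < h and image[ny][nx] == old:
                image[ny][nx] = new
                filled += 1
                queue.append((nx, ny))
    return filled
```

If the CoM happens to fall on a background pixel (for example inside a gap between fingers), this sketch changes nothing, so a real system would need a fallback seed.
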
18
Implementation Analysis
  • 5. Remove arm from hand
  • Find top left of bounding box
  • Apply border for bounding box from calibration
    measure
  • FloodFill, 192 to 254
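
The slide gives only the outline of this step, so the sketch below is one possible reading rather than the presenter's method. It assumes the hand occupies a box anchored at the top-left corner of the filled region and sized by the calibrated hand width and height, and it replaces the bounded FloodFill with a simple relabel of every 192-valued pixel inside that box. All names are hypothetical.

```python
def bounding_box_top_left(image, value=192):
    """Return (min_x, min_y) over all pixels equal to `value`, or None."""
    coords = [(x, y) for y, row in enumerate(image)
              for x, v in enumerate(row) if v == value]
    if not coords:
        return None
    return min(x for x, _ in coords), min(y for _, y in coords)


def keep_hand_in_box(image, hand_w, hand_h, src=192, dst=254):
    """Relabel `src` pixels inside the calibrated box to `dst`, in place.

    Pixels outside the box (assumed to be the arm) keep the `src` label.
    Returns the number of pixels relabelled.
    """
    corner = bounding_box_top_left(image, src)
    if corner is None:
        return 0
    left, top = corner
    changed = 0
    for y in range(top, min(top + hand_h, len(image))):
        for x in range(left, min(left + hand_w, len(image[0]))):
            if image[y][x] == src:
                image[y][x] = dst
                changed += 1
    return changed
```

This only works when the arm leaves the box through its bottom or right edge; a different camera setup would need a different anchor corner.
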

19
Implementation Analysis
  • 6. Calculate refined-CoM (rCoM)
  • Threshold, 254 to 255
  • Compute CoM as before

20
Implementation Analysis
7. Orientation
  • a) 0th moment of an image
  • b) 1st moment for x and y of an image, respectively
  • c) 2nd moment for x and y of an image, respectively
  • d) Orientation of image major axis
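
The equations for this slide were lost, so the sketch below uses the standard image-moment formula for the major-axis angle, which may differ in form from what the slide showed. Note that the standard formula also needs the mixed moment `m11`, which the slide text does not list. The function name and the choice of 255 as the hand label are assumptions.

```python
import math


def orientation(image, value=255):
    """Return the major-axis angle in radians (image coordinates), or None.

    Uses theta = 0.5 * atan2(2 * mu11, mu20 - mu02), where the mu terms are
    second moments taken about the centroid.
    """
    m00 = m10 = m01 = m20 = m02 = m11 = 0
    for y, row in enumerate(image):
        for x, v in enumerate(row):
            if v == value:
                m00 += 1
                m10 += x
                m01 += y
                m20 += x * x
                m02 += y * y
                m11 += x * y
    if m00 == 0:
        return None
    xc, yc = m10 / m00, m01 / m00
    mu20 = m20 / m00 - xc * xc
    mu02 = m02 / m00 - yc * yc
    mu11 = m11 / m00 - xc * yc
    return 0.5 * math.atan2(2 * mu11, mu20 - mu02)


# A diagonal line of pixels lies at 45 degrees (y grows downward in images).
diagonal = [[255 if x == y else 0 for x in range(4)] for y in range(4)]
angle_deg = math.degrees(orientation(diagonal))
```
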
21
Implementation Analysis
8. Count the number of fingers (via FingerCountGivenX)
Function inputs:
  • a) Pointer to Image Data
  • b) rCoM
  • c) Radius (from .17 × HandSizeX and .17 × HandSizeY)
  • d) Starting Location (x or y, call appropriate function)
  • e) Ending Location (x or y, call appropriate function)
  • f) White Pixel Counter
  • g) Black Pixel Counter
  • h) Finger Counter
22
Implementation Analysis
  • 8. Count the number of fingers
  • 2 similar functions: start/end location in x or
    y
  • After all previous steps, the finger-finding
    function sweeps out an arc, counting the number
    of white and black pixels as it progresses
  • A finger in the current system is defined to be
    any 10 white pixels separated by 3 black pixels
    (salt/pepper tolerance) minus 1 for the hand
    itself
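
The sweep described above can be sketched as follows. This is a hypothetical re-creation, not the original FingerCountGivenX: it assumes the arc is the upper half of a circle around the rCoM, that x is stepped one pixel at a time while y is computed from the circle equation, and that hand pixels have the value 255. The 10-white and 3-black limits come from the slide; everything else is an assumption.

```python
import math


def count_white_runs(samples, min_white=10, gap=3):
    """Count runs holding at least `min_white` white samples.

    A run ends only after `gap` consecutive black samples, so shorter black
    stretches inside a finger are tolerated as noise.
    """
    runs = white = black = 0
    for is_white in samples:
        if is_white:
            white += 1
            black = 0
        else:
            black += 1
            if black >= gap:
                if white >= min_white:
                    runs += 1
                white = 0
    if white >= min_white:
        runs += 1
    return runs


def finger_count_given_x(image, rcom, radius, start_x, end_x, white=255):
    """Sweep the upper arc from start_x to end_x and return runs minus one."""
    cx, cy = rcom
    samples = []
    for x in range(start_x, end_x + 1):
        dx = x - cx
        if abs(dx) > radius:
            continue  # this x lies outside the circle
        y = int(round(cy - math.sqrt(radius * radius - dx * dx)))
        if 0 <= y < len(image) and 0 <= x < len(image[0]):
            samples.append(image[y][x] == white)
    # The slide subtracts 1 for the hand itself; never report a negative count.
    return max(count_white_runs(samples) - 1, 0)
```

Keeping the run counting separate from the arc sampling means a y-stepped variant could reuse `count_white_runs` unchanged.
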

23
Implementation Analysis
8. Count the number of fingers
24
Implementation Analysis
  • 8. Count the number of fingers
  • Illustration of noise tolerance

25
Implementation Analysis
System Input
System Output
26
Implementation Analysis
System Input
System Output
27
Implementation Analysis
Process Steps               Time (ms)                   Time (ms)
                            Athlon MP 1500 (1.33 GHz)   Pentium 850 MHz
1) Reading Image            ?                           ?
2) Reading Image            208                         340
3) Threshold                0.5                         6.5
4) Center of Mass           3.5                         18.5
5) Flood Fill               1.5                         27
6) Bounding Box Top-Left    3.5                         5.5
7) Arm Removal              2                           34.5
8) Refined CoM              4                           19
9) Finger Counting          0.5                         1
10) Write Image             233                         324
Time w/o RW                 16.5                        112
Time w/o Write              224.5                       452
Total Time                  457.5                       776.5
  • System Runtime
  • Real time requires 30 fps
  • Current time: 16.5 ms for one frame (without
    reading or writing)
  • Current processing capability on 1.33 GHz Athlon:
    60 fps
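
The frame-rate figure follows from the per-frame time by simple arithmetic, sketched below. The function name is illustrative; the 16.5 ms and 112 ms inputs are the "Time w/o RW" values from the timing table.

```python
def frames_per_second(ms_per_frame):
    """Convert a per-frame processing time in milliseconds to frames per second."""
    return 1000.0 / ms_per_frame


REAL_TIME_FPS = 30                     # requirement stated on the slide

athlon = frames_per_second(16.5)       # roughly 60 fps
pentium = frames_per_second(112)       # roughly 9 fps

athlon_is_real_time = athlon >= REAL_TIME_FPS     # True
pentium_is_real_time = pentium >= REAL_TIME_FPS   # False
```

On these numbers the Athlon has about twice the headroom needed for real time, while the Pentium falls well short even before image reading and writing are counted.
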

28
Future Directions
  • Optimization
  • Orientation for Hand Registration
  • New Finger Counting Approach
  • Learning System

For additional information, please visit
http://ksl.jsu.edu.