MobileASL: Making Cell Phones Accessible to the Deaf Community


1
MobileASL: Making Cell Phones Accessible to the
Deaf Community
  • Anna Cavender, Richard Ladner, Eve Riskin
  • University of Washington

2
Two Themes
  • MobileASL
  • Cyber-Community for Advancing Deaf and Hard of
    Hearing in STEM (Science, Technology,
    Engineering, and Math)

3
Our goal
  • ASL communication using video cell phones over
    the current U.S. cell phone network

Challenges
  • Limited network bandwidth
  • Limited processing power on cell phones

4
Cell Phone Network Constraints
  • Low bit rate goal
  • GPRS (General Packet Radio Service)
    • Ranges from 30 kbps to 80 kbps (download)
    • Perhaps half that for upload
    • Unpredictable variation and packet loss
  • 3G (3rd Generation)
    • Special service
    • Not yet widespread
    • Will still have congestion
  • Service providers are more likely to offer
    services if throughput can be minimized.
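
As a rough worked example of the upload budget these GPRS figures imply, the sketch below halves the quoted download range. The factor of 0.5 is only the slide's "perhaps half", not a measurement, and the function name is illustrative.

```python
# GPRS download range quoted on the slide, in kbps.
GPRS_DOWNLOAD_KBPS = (30, 80)


def estimated_upload_range(download_range, upload_fraction=0.5):
    """Scale a (low, high) download range by an assumed upload fraction."""
    low, high = download_range
    return (low * upload_fraction, high * upload_fraction)


low, high = estimated_upload_range(GPRS_DOWNLOAD_KBPS)
print(f"Estimated upload budget: {low:.0f} to {high:.0f} kbps")
```

The bit rates used later in the video phone study (15, 20, and 25 kbps) sit at the low end of this estimate.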

5
MobileASL Network Goals
  • Sign language presents a unique challenge
  • Not just appearance of video, intelligibility
    too!
  • If it works for sign language, other video
    applications benefit too.
  • MobileASL is about fair access to the current
    network
  • As soon as possible, no special accommodations
  • Not geographically limited
  • Lower bitrate brings power savings and makes the
    service more accessible

6
Architecture
[Diagram: on the sending cell phone, Camera, Encoder,
Transmitter; then the Cell Phone Network; on the
receiving cell phone, Receiver, Decoder, Player]
7
Codec Used: x264
  • Open source implementation of the H.264 standard
  • Doubles compression ratio over MPEG-2
  • Replacing MPEG-2 as industry standard
  • x264 offers faster encoding
  • Off-the-shelf H.264 decoder can be used
  • (speculation about H.264 on the iPhone)

8
Outline
  • Motivation
  • Introduction
  • MobileASL Consumers
  • Eyetracking Motivation
  • Video Phone Study
  • Compression Challenges
  • Current Work
  • Conclusions

9
Discussions with Consumers
  • Open-ended questions
  • Physical setup
    • Camera, distance, ...
  • Features
    • Compatibility, text, ...
  • Scenarios
    • Lighting, driving, relay services, ...

10
Consumer Response
  • "I don't foresee any limitations. I would use
    the phone anywhere: the grocery store, the bus,
    the car, a restaurant, anywhere!"
  • There is a need within the Deaf Community for
    mobile ASL conversations
  • Existing video phone technology (with minor
    modifications) would be usable

11
Video Encoding for ASL
  • Constraints of cell phone network create video
    compression challenges
  • How do we compress ASL video to maximize
    intelligibility?

13
Eyetracking Studies
  • Participants watched ASL videos while eye
    movements were tracked
  • Important regions of the video could be encoded
    differently

Muir et al. (2005) and Agrafiotis et al. (2003)
14
Eyetracking Results
  • 95% of eye movements fell within 2 degrees of
    visual angle of the signer's face (demo)
  • Implications: the face region of the video is
    most visually important
  • Detailed grammar in the face requires foveal
    vision
  • Hands and arms can be viewed in peripheral vision

Muir et al. (2005) and Agrafiotis et al. (2003)
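
To give a sense of how small a 2-degree region is on a handheld screen, here is a minimal sketch of the standard visual-angle geometry. The 40 cm viewing distance is an assumption for illustration and does not come from the eyetracking studies.

```python
import math


def visual_angle_radius_cm(angle_deg, viewing_distance_cm):
    """Radius on the screen subtended by angle_deg at the given distance."""
    return viewing_distance_cm * math.tan(math.radians(angle_deg))


# Hypothetical viewing distance for a handheld phone (not from the studies).
ASSUMED_DISTANCE_CM = 40.0

radius = visual_angle_radius_cm(2.0, ASSUMED_DISTANCE_CM)
print(f"2 degrees at {ASSUMED_DISTANCE_CM:.0f} cm: radius of about {radius:.2f} cm")
```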
16
Mobile Video Phone Study
  • 3 Region-of-Interest (ROI) values
  • 2 frame rates, in frames per second (FPS)
  • 3 different bit rates
    • 15 kbps, 20 kbps, 25 kbps
  • 18 participants (7 women)
    • 10 Deaf, 5 hearing, 3 CODA
    • All fluent in ASL

CODA: (Hearing) Child of a Deaf Adult
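
The three factors can be enumerated as below, assuming a full factorial design. That is an assumption: the slides do not say every combination was shown. The ROI labels are placeholders because this slide does not list the three ROI values. The frame rates are the 10 and 15 fps given on the FPS slide.

```python
from itertools import product

# Placeholder labels: the slide gives the number of ROI values, not the values.
ROI_LEVELS = ["roi_level_1", "roi_level_2", "roi_level_3"]
FRAME_RATES_FPS = [10, 15]
BIT_RATES_KBPS = [15, 20, 25]

# Every (ROI, frame rate, bit rate) combination.
conditions = list(product(ROI_LEVELS, FRAME_RATES_FPS, BIT_RATES_KBPS))

print(f"{len(conditions)} encoding conditions")
for roi, fps, kbps in conditions[:3]:
    print(roi, fps, "fps", kbps, "kbps")
```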
17
Example of ROI
  • Varied quality in fixed-sized region around the
    face
  • (demo)

2x quality in face
4x quality in face
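
One way to picture a fixed-size face ROI is a per-macroblock quantizer offset map, sketched below. This is an illustration, not the study's encoder configuration. It assumes "2x quality" means halving the quantizer step size, which in H.264 corresponds to lowering QP by 6. The face box coordinates and function names are hypothetical.

```python
import math

MB_SIZE = 16  # H.264 macroblocks cover 16x16 luma samples


def qp_offset_for_quality_factor(factor):
    """QP offset that divides the quantizer step size by `factor`.

    In H.264 the step size doubles for every increase of 6 in QP.
    """
    return -round(6 * math.log2(factor))


def roi_qp_offsets(width, height, face_box, factor):
    """Per-macroblock QP offsets: negative (finer) in face_box, 0 elsewhere.

    face_box is (left, top, right, bottom) in pixels, right/bottom exclusive.
    """
    cols = math.ceil(width / MB_SIZE)
    rows = math.ceil(height / MB_SIZE)
    left, top, right, bottom = face_box
    offset = qp_offset_for_quality_factor(factor)
    grid = []
    for r in range(rows):
        row = []
        for c in range(cols):
            x0, y0 = c * MB_SIZE, r * MB_SIZE
            overlaps = (x0 < right and x0 + MB_SIZE > left
                        and y0 < bottom and y0 + MB_SIZE > top)
            row.append(offset if overlaps else 0)
        grid.append(row)
    return grid


# Hypothetical face box inside a QVGA (320x240) frame.
grid = roi_qp_offsets(320, 240, face_box=(128, 32, 192, 96), factor=2)
for row in grid:
    print(" ".join(f"{v:3d}" for v in row))
```

Finer quantization in the face costs bits, so at a fixed bit rate the rest of the frame has to be quantized more coarsely.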
18
Examples of FPS
  • Varied frame rate: 10 fps and 15 fps
  • For a given bit rate, fewer frames means more
    bits per frame
  • (demo)
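
The arithmetic behind this tradeoff is simple enough to sketch. The snippet assumes 1 kbps = 1000 bits per second and ignores container and network overhead.

```python
def average_bits_per_frame(bit_rate_kbps, frames_per_second):
    """Average bit budget per frame at a constant bit rate."""
    return bit_rate_kbps * 1000 / frames_per_second


for kbps in (15, 20, 25):
    for fps in (10, 15):
        bits = average_bits_per_frame(kbps, fps)
        print(f"{kbps} kbps at {fps} fps: {bits:.0f} bits per frame")
```

Dropping from 15 fps to 10 fps gives each frame 1.5 times as many bits at the same bit rate.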

19
Questionnaire
20
User Preferences Results
[Charts: user preferences by bit rate, frame rate,
and region of interest]
21
Implications of results
  • A mid-range ROI was preferred
    • Optimal tradeoff between clarity in face and
      distortion in rest of sign-box
  • Lower frame rate preferred
    • Optimal tradeoff between clarity of frames and
      number of frames per second
  • Results independent of bit rate

23
Rate, distortion and complexity optimization
[Diagram: raw video and input parameters go into the
H.264 encoder, which produces compressed video]
  • The H.264 standard provides 50% bit savings over
    MPEG-2, but with higher complexity.
  • Objective: achieve the best possible quality for
    the least encoding time at a given bitrate
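
One generic way to frame this objective is to keep only the encoder settings that are not beaten on both encoding time and distortion, as in the sketch below. This is not necessarily the authors' optimization method. The setting names and numbers are made up for illustration.

```python
def pareto_front(points):
    """Return settings not dominated in (encoding_time, mse).

    Lower is better for both. A setting is dominated if another is at
    least as good on both measures and strictly better on one.
    """
    front = []
    for name, time_s, mse in points:
        dominated = any(
            (t <= time_s and m <= mse) and (t < time_s or m < mse)
            for _, t, m in points
        )
        if not dominated:
            front.append((name, time_s, mse))
    return sorted(front, key=lambda p: p[1])


# Made-up measurements at one fixed bit rate: (setting, seconds, MSE).
measurements = [
    ("setting_a", 1.0, 40.0),
    ("setting_b", 2.0, 30.0),
    ("setting_c", 2.5, 35.0),  # slower and worse than setting_b
    ("setting_d", 4.0, 25.0),
]

for name, time_s, mse in pareto_front(measurements):
    print(f"{name}: {time_s:.1f} s, MSE {mse:.1f}")
```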

24
Time Complexity Tradeoff
[Plot: MSE against encoding time]
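
Assuming MSE here means the mean squared error between original and decoded pixel values, a minimal sketch of the metric looks like this. The flat-list frame representation is a simplification for illustration.

```python
def mean_squared_error(original, decoded):
    """Mean squared error between two equal-length sequences of pixel values."""
    if len(original) != len(decoded) or not original:
        raise ValueError("frames must be non-empty and the same size")
    total = sum((a - b) ** 2 for a, b in zip(original, decoded))
    return total / len(original)


# Tiny illustrative "frames" of three luma samples each.
print(mean_squared_error([10, 20, 30], [12, 18, 30]))
```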
25
Encoding/Decoding on the Cell Phone
  • Implemented a command-line version of x264 on a
    cell phone using Windows Mobile Edition 5.0.

26
QVGA 320x240
28
Dynamic Region-of-Interest
  • Skin detection algorithms
  • Region-based metric for bit allocation
  • Automatically determine priority for face and
    hands based on currently available bitrate.
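
The slide does not say which skin detection algorithm is used. As a stand-in, the sketch below applies a simple, widely used chrominance threshold in YCbCr space. The threshold ranges are commonly cited values and should be treated as tunable assumptions. All function names are illustrative.

```python
def rgb_to_cbcr(r, g, b):
    """Convert 8-bit RGB to (Cb, Cr) with the full-range BT.601 equations."""
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return cb, cr


# Assumed chrominance bounds for skin tones (tunable, not from the slides).
CB_RANGE = (77, 127)
CR_RANGE = (133, 173)


def is_skin(r, g, b):
    """Classify one pixel as skin if its chrominance falls in both ranges."""
    cb, cr = rgb_to_cbcr(r, g, b)
    return (CB_RANGE[0] <= cb <= CB_RANGE[1]
            and CR_RANGE[0] <= cr <= CR_RANGE[1])


def skin_fraction(pixels):
    """Fraction of (r, g, b) pixels classified as skin."""
    pixels = list(pixels)
    if not pixels:
        return 0.0
    return sum(is_skin(*p) for p in pixels) / len(pixels)


block = [(224, 172, 140), (0, 0, 255)]
print(skin_fraction(block))
```

A per-block skin fraction like this could feed a priority map for the face and hands. How that priority turns into a bit allocation is the open question the slide raises.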

29
Activity Recognition
  • Can save data and power by detecting:
    • Fingerspelling: increase frame rate for better
      intelligibility
    • Signing: sign language-specific encoding
    • Just listening: less processing and
      transmission needed
  • (demo)
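
Once an activity has been detected, the encoder could switch settings with a simple lookup, sketched below. Only the ordering follows the slide: more frames for fingerspelling, least work while just listening. The specific frame rates are hypothetical, and the detection step itself is not shown.

```python
from enum import Enum


class Activity(Enum):
    FINGERSPELLING = "fingerspelling"
    SIGNING = "signing"
    LISTENING = "just listening"


# Hypothetical frame rates; only their ordering follows the slide.
FRAME_RATE_FPS = {
    Activity.FINGERSPELLING: 15,
    Activity.SIGNING: 10,
    Activity.LISTENING: 1,
}


def choose_frame_rate(activity):
    """Return the frame rate to encode at for the detected activity."""
    return FRAME_RATE_FPS[activity]


for activity in Activity:
    print(f"{activity.value}: {choose_frame_rate(activity)} fps")
```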

30
User Interface
  • Leverages users' prior experience with video
    conferencing interfaces (such as Sorenson, HOVRS,
    etc.)
  • Optimized for small screen space
  • Initial state user interface
  • Incoming Video Stream
  • Outgoing Video Stream
  • Control Toolbar
  • Toggle Privacy Mode
  • Toggle Chat View
  • Video Screen Layout
  • Toggle Status Bar

31
Building the System
32
MobileASL Team
  • Principal Investigators
    • Richard Ladner, Eve Riskin, and Sheila Hemami
      (Cornell)
  • Graduate Students
    • Anna Cavender, Rahul Vanam, Neva Cherniavsky,
      Jaehong Chon, Dane Barney, Frank Ciaramello
      (Cornell)
  • Undergraduate Students
    • Omari Dennis, Jessica DeWitt, Loren Merritt
  • National Science Foundation

33
Cyberinfrastructure for Advancing Deaf and Hard of
Hearing in STEM
  • Richard Ladner, Jorge Díaz-Herrera, James J.
    DeCaro, E. William Clymer, Anna Cavender
  • University of Washington
  • Rochester Institute of Technology / National
    Technical Institute for the Deaf
34
Our Goal
  • Advancing Deaf and Hard of Hearing people in STEM
    fields through better access to education.

35
Problems
  • Deaf students pursuing STEM fields need skilled
    interpreters and captioners with specific domain
    knowledge.
  • The best interpreter may not be at the student's
    locale.
  • Deaf students face challenging classroom
    environments: multiple sources of information are
    all visual
  • Deaf Whiplash
  • Sign language is growing to include STEM
    vocabulary
  • Community consensus is required.

36
Enabling Access to STEM Education
37
Enabling ASL to Grow in STEM
38
Summit to Create a Cyber-Community to Advance
Deaf and Hard-of-Hearing Individuals in STEM
(DHH Cyber-Community)
  • Scheduled for June 2008 at RIT/NTID
  • Discussion among the many stakeholders
  • Deaf and hard of hearing students in STEM fields.
  • Faculty and administrators in colleges and
    universities with a commitment to deaf and hard
    of hearing students in STEM fields.
  • Interpreters and captioners.
  • Researchers who study sign vocabulary for STEM
    fields and interpreting and captioning for
    education.
  • Educational technology researchers.
  • Experts in multimedia and network services that
    use the national cyberinfrastructure (e.g.,
    AccessGrid).
  • Companies already in the business of providing
    video relay interpreting (VRI) and real time
    captioning (RTC).
  • Leaders in organizations who have an interest in
    advancing deaf and hard of hearing students in
    STEM fields.

39
Questions?
  • Thanks!
  • MobileASL Webpage
  • www.cs.washington.edu/research/MobileASL
  • Richard Ladner
  • ladner_at_cs.washington.edu
  • Anna Cavender
  • cavender_at_cs.washington.edu