Title: Gestures and Avatar Signing
1Gestures and Avatar Signing
2Gesture Research and Virtual Human Signing
- John Glauert
- Ralph Elliott
- Richard Kennaway
- Judy Tryggvason
- Vince Jennings
- Collaboration with Graphics, Linguistics, Vision, and Speech groups
3Sign Language
- Purely visual
- Multimodal
- Manual and Facial gestures
- Natural language
- British Sign Language (BSL)
- 1 in 1000 prelingually deaf
- Grammar and phonetic structure not based on English
4Challenges
- Capturing gestures
- Motion capture
- Gesture recognition
- Animation of gestures
- Notations for gesture
- Realistic synthetic animation
- Grammar and semantics of signing
5Motion Capture
- Conventional Motion Capture
- Intrusive sensors
- Unstable
- Markerless Motion Capture
- Video-based
- Robust over long periods
- With BBC: Barry Theobald, Vince Jennings, John Glauert, Andrew Bangham
6Gesture Notation
- Hamburg Notation System
- Universal notation for sign language
- SiGML
- XML language
- Signing Gesture Markup Language
- EU eContent: Richard Kennaway, Ralph Elliott, John Glauert
7Realistic Synthetic Animation
- Signing demands precise animation
- Enhanced skeleton
- Collision detection
- Realistic motion
- Natural trajectory
- IK for aspects ignored by notation
- EU eContent: Vince Jennings, Richard Kennaway, Judy Tryggvason, John Glauert, Ralph Elliott
8Gestures and Avatar Signing
9Virtual Human Signing at UEA
- John Glauert, Ian Marshall
- Andrew Bangham, Stephen Cox, Ralph Elliott
- School of Computing Sciences
- University of East Anglia, UK
10ViSiCAST: Sign Language Using Virtual Humans
- John Glauert
- Andrew Bangham, Stephen Cox, Ralph Elliott, Ian Marshall
- University of East Anglia, Norwich UK
11UEA and Norwich
12University of East Anglia, Norwich UK
Norwich
Bristol
13University of East Anglia: School of Computing Sciences
- Computing Science
- Electronic Engineering
- Leading Research and Education
14science_at_uea
- Breadth of science
- Biological Sciences
- Chemical Sciences
- Environmental Sciences
- Information Systems
- Mathematics
15science_at_uea
- Diversity of Research
- TESSA: award-winning technology translating speech into sign language
- effect of forest fires in the Amazon
- hazard management of volcanoes
- internationally renowned climate research
- ecology and conservation of endangered species
- disease processes, novel solutions
16Norwich and Norfolk?
17Norwich and Norfolk?
18Norwich Research Park
19Norwich Research Park
University of East Anglia
BUPA Hospital Norwich
Norfolk & Norwich University Hospital
Institute of Food Research
Plant BioSciences Limited
John Innes Centre
British Sugar Technical Centre
The Sainsbury Laboratory
Zeneca Wheat Improvement Centre
20Deafness in Britain
21Deafness in Britain
- 1 in 7 deaf or hard of hearing
- Hearing aids, lip-reading English, Teletext
- 1 in 1000 pre-lingually deaf
- Signing is closest to natural language
- 50,000 pre-lingually deaf people in UK
- British Sign Language (BSL) has its own grammar, lexicon
- Reading age (for English) is usually low
22Deafness in Britain
- Lose opportunities available to other members of society
- Access to Services is Limited
- Shops and the High Street
- Television
- Web and Electronic Information
23Sign Supported vs. Authentic Sign Languages
- In UK
- SSE: Sign-Supported English
- one sign per word (approx.)
- follows English word order
- BSL: British Sign Language
- one sign per concept
- use of signing space around signer's body
- has own word order, morphology
- SSE and BSL both utilize finger-spelling
24Virtual Reality and Virtual Environments
25Virtual Reality
26Virtual Town Hall
27Virtual Humans or Avatars
28Ananova
29Avatars Virtual humans
30Virtual Dancing
31Simon the Signer
32Simon the Signer, 1997-1999
- Simon-the-Signer: Broadcast TV
- Generate signed accompaniment to broadcast
- Teletext stream as source
- SSE: Sign-Supported English
33Simon the Signer, 1997-1999
- Developing transmission technology
- virtual signer in set-top boxes
- transmission of signing from text subtitles
[Diagram labels: Audio/Video Stream, TV Capture Card, Avatar, Software Mixer, Teletext Stream, Computer]
34Simon the Signer
- Winner of two Royal Television Society awards
35TESSA
36TESSA, 1998-2000
- TESSA: Retail, PO
- Using speech recognizer
- Convert counter clerk's voice input to text
- Generate sign stream from text
- BSL: limited repertoire
37Post Office Counter Services
- Post Offices transact business with almost all Deaf people
- Counter Clerk asks questions using speech
- No back channel yet
- Customer
- Listens or
- Reads or
- Watches Virtual Human Signer
- Real trials at 5 sites
38TESSA
Tessa: Winner of BCS Gold Medal and IT 2000 Award
39ViSiCAST
40ViSiCAST
Virtual Signing: Capture, Animation, Storage and Transmission
- Translation of text to sign
- Animation of signing
- Broadcast Transmission
- Web and Multimedia
- Counter Services
41The ViSiCAST Project
- Virtual Signing: Capture, Animation, Storage and Transmission
- Funded under EU Framework V Programme
- Additional Funding from ITC and Consignia
- pre-competitive research
- IST-1999-10500
42ViSiCAST Aims
- Improved access for deaf citizens
- to information and services
- in their preferred medium of sign language
- Builds on SignAnim and Tessa
43The ViSiCAST Consortium
44ViSiCAST Partners
- ITC, UK: Project co-ordination
- IRT, Germany: Broadcast technology
- Televirtual, Norwich, UK: Avatar creation
- IDGS, Germany: Sign language notation
- UEA Norwich, UK: Processing of language, speech and signing
45ViSiCAST Partners
- INT, France: Broadcast imaging and animation standards
- IvD, Netherlands: Multimedia content creation
- Post Office, UK: Face-to-face transaction systems
- RNID, UK: Monitoring of signing and evaluation
46ViSiCAST Background
- Simon-the-Signer (ITC) (1997-1999)
- ITC (UK Independent Television Commission), Televirtual, UEA Norwich
- Tessa (Consignia) (1998-2000)
- Post Office, Televirtual, UEA Norwich
- Both based on virtual human signing
- using Televirtual's motion-capture driven avatar technology
47ViSiCAST Project
- Extend applications of virtual signing
- Target to natural sign languages
- BSL (British Sign Language) rather than
- SSE (Sign-Supported English)
- Improve animation technology
- increasingly natural avatars
- easier but more accurate sign capture
48ViSiCAST Structure
Applications
Enabling Technologies
49ViSiCAST Structure
WWW
Transactions
Broadcast
Language
Animation
50Multimedia and the Internet
- Adding signing services to multimedia
- improves access to information for leisure, learning and communication
- Browser plug-in
- accurate signing of existing content on the internet
- translation of own text to generate signed content on own website
51Face-to-Face Transactions
- Post Office, Advice Services, Shops
- Simple spoken phrases recognised and translated to sign language
- Aim for limited sign recognition for back channel
52Television and Broadcast
- Developing transmission technology
- virtual signer in set-top boxes
- transmission of compressed signing data
53Television and Broadcast
- Developing transmission technology
- virtual signer in set-top boxes
- transmission of signing from text subtitles
[Diagram labels: Audio/Video Stream, TV Capture Card, Avatar, Software Mixer, Teletext Stream, Computer]
54Television and Broadcast
55Virtual Human Signing Contexts
56Why use Virtual Human Signing?
57When do we need Signing?
- Events
- TV
- High Street
- Web and Communications
58Signing Interpreters
- Excellent for Events and TV
- Not enough to accompany all Deaf people
- Not practical for ephemeral information
- Newspapers
- Web
59Video of Signing
- Excellent for Fixed information sources
- Need to blend video sequences
- Hard
- Inflexible
- Expensive for ephemeral information
60Virtual Human Signing
- Can use realistic Captured motion
- Visual quality improving
- Possible to blend sequences
- Can be used to Synthesise signs
- Textual sign representation
- User freedom to create own content
- Much lower bandwidth than video
61Virtual Human Signing Approaches
62Virtual Human Signing
- Motion Capture and Playback
- Hand-Crafted Animation
- Blending
- Synthesis from Signing Notation
63Motion Capture
- Very lifelike animation
- Time-consuming to set up
- Blending of signs
- Combining signs from different signers
64Hand-Crafted Animation
- Define Key Frames
- Interpolate between Key Frames
- Can give good animation
- Time consuming (12 hours per sign)
- Blending of signs still required
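The key-frame idea on this slide can be illustrated with a small sketch. It assumes that a pose is a mapping from joint name to a single angle in degrees and that plain linear interpolation is used between key frames; the joint names, timings, and the `interpolate_pose` helper are hypothetical and are not taken from the project's tools.

```python
from bisect import bisect_right

# Hypothetical key frames: (time in seconds, {joint name: angle in degrees}).
KEY_FRAMES = [
    (0.0, {"shoulder": 0.0, "elbow": 10.0}),
    (0.5, {"shoulder": 40.0, "elbow": 90.0}),
    (1.0, {"shoulder": 20.0, "elbow": 30.0}),
]

def interpolate_pose(key_frames, t):
    """Return the pose at time t by linear interpolation between key frames."""
    times = [time for time, _ in key_frames]
    if t <= times[0]:
        return dict(key_frames[0][1])      # before the first key frame
    if t >= times[-1]:
        return dict(key_frames[-1][1])     # after the last key frame
    i = bisect_right(times, t)             # index of the first key frame after t
    (t0, p0), (t1, p1) = key_frames[i - 1], key_frames[i]
    w = (t - t0) / (t1 - t0)               # 0 at the earlier frame, 1 at the later
    return {joint: (1 - w) * p0[joint] + w * p1[joint] for joint in p0}
```

Calling `interpolate_pose(KEY_FRAMES, 0.25)` gives the pose halfway between the first two key frames. A production animator would more likely interpolate full joint rotations (for example with quaternions) and use smoother curves than straight lines.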
65Synthetic Signing
- Synthesis from Abstract Representation
- Quick to create lexicon
- a few minutes to transcribe a sign
- Instantly retargetable to any avatar with humanoid topology
- Automatic Blending
- Low Bandwidth
66Virtual Human Signing Motion Capture
67Motion-Capture for Virtual Human Signing
- Motion Capture Streams
- body
- magnetic tracking
- face
- reflective markers, head-mounted camera
- hands
- gloves with bend-sensors
68Data Capture: Face Tracking
Face tracker: 20 reflectors, helmet-mounted camera, 60/2 Hz
69Data Capture: Cybergloves
Cybergloves: 18 resistors modulated by bend; sample rate <50 Hz
70Data Capture: Magnetic Sensors
Magnetic sensors (Motion Star): wrist, elbow, head, body; 86/2 Hz
71Virtual Human Signing Animation
72Virtual Humans Animation
- Good motion capture allied with fast real-time graphics
- Bones-Set
- Lengths and interconnection topology (joints)
- Specify joint angles and orientation
- Rendering
- attach mesh (wire-frame) to Bones-set
- apply texture-mapping to mesh
- Animation
- sequence of rendered frames
- each defined by a Bones-Set configuration
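As a rough illustration of how a Bones-Set configuration defines a frame, the sketch below treats the Bones-Set as a planar chain of bones with fixed lengths and computes joint positions from a list of joint angles. The bone lengths, the 2D simplification, and the `joint_positions` helper are assumptions made for the example; the real avatar uses a full 3D skeleton with a mesh and texture attached.

```python
import math

# Hypothetical planar Bones-Set: bone lengths in chain order (arbitrary units).
BONE_LENGTHS = [0.30, 0.25, 0.10]   # upper arm, forearm, hand

def joint_positions(lengths, angles_deg, origin=(0.0, 0.0)):
    """Forward kinematics: each angle is measured relative to the previous bone."""
    x, y = origin
    heading = 0.0
    points = [(x, y)]
    for length, angle in zip(lengths, angles_deg):
        heading += math.radians(angle)      # accumulate rotation along the chain
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        points.append((x, y))
    return points

# An animation is a sequence of Bones-Set configurations, one per frame.
animation = [[0, 0, 0], [45, 45, 0], [90, 0, 0]]
frames = [joint_positions(BONE_LENGTHS, config) for config in animation]
```

Each entry of `frames` holds the joint positions for one frame; a renderer would attach the mesh to these positions and apply the texture map.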
76Virtual Humans Animation
- Three dimensional model
- Custom Skeleton driven by motion data
- Tracked by an enveloping mesh model
- Rendered with OpenGL
- Texture map
- Some of the 5000 polygons
- 50 frames per sec
- nVidia GeForce2
79Virtual Humans Animation
- Three dimensional model
- Custom Skeleton driven by motion data
- Tracked by an enveloping mesh model
- Rendered with OpenGL
- Texture map
- Some of the 5000 polygons
- Animated in real time
80Virtual Human Signing System
81Motion Capture and Display System
Computer System
care for
82Motion Capture
Computer System
Post-processing
83Display System
Computer System
Weather Forecast
84From Capture to Signing: Simon and Tessa
- Capture clips of signing
- based on vocabulary for chosen subject area
- requires calibration: match signer to avatar
- Segment/Edit clips
- save as files, one per sign
- Generate Stream of Sign Names
- for required script
- Feed Sign Stream to Avatar
- acts as a Player for stream
- blending between signs
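The last two steps, feeding a stream of sign names to a player that blends between signs, can be sketched as follows. The clip library, the frame format (a short list of joint angles per frame), and the `blend` and `play` helpers are all hypothetical; the actual player and its blending method are not described on the slide, so simple linear blending is assumed.

```python
# Hypothetical clip library: sign name -> list of frames (joint-angle vectors).
CLIPS = {
    "HELLO": [[0.0, 0.0], [10.0, 5.0]],
    "NAME": [[30.0, 25.0], [40.0, 20.0]],
}

def blend(last, first, steps):
    """Linearly interpolated transition frames between two poses (end points excluded)."""
    return [
        [a + (b - a) * k / (steps + 1) for a, b in zip(last, first)]
        for k in range(1, steps + 1)
    ]

def play(sign_names, clips, blend_steps=1):
    """Concatenate the clips for a stream of sign names, blending between signs."""
    frames = []
    for name in sign_names:
        clip = clips[name]                  # one stored clip per sign
        if frames:
            frames.extend(blend(frames[-1], clip[0], blend_steps))
        frames.extend(clip)
    return frames
```

With these toy clips, `play(["HELLO", "NAME"], CLIPS)` yields the two HELLO frames, one transition frame, then the two NAME frames.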
85ViSiCAST Applications
86Web Applications
87Web Applications: Weather Forecasts
- Signed Weather Forecasts
- SLN (The Netherlands)
- DGS (Germany)
- BSL (Britain)
- Form Filling for Forecast
- Dull and misty in places at first but soon becoming warm, dry and sunny.
- Met Office Summary, Midlands, 24/04/2002
88Web Applications: Weather Forecasts
89Weather Forecasts
Friday
Tomorrow
Today
Dull and misty in places at first but soon becoming warm, dry and sunny. Maximum temperature 23 deg C (73 deg F).
Tonight: Becoming cloudy overnight with perhaps an odd spot of rain. Minimum temperature 8 deg C (46 deg F).
Cloudy start, then dry with sunny spells but cooler.
Met Office for Midlands, 24/04/2002
90Web Applications: Weather Forecasts
- Grammar for normal Weather Phrases
- Sign Language version for each Phrase
- Forecast is sequence of Phrases
- Generate Common XML Weather Model
- XSLT processing for each Sign Language
- XSLT processing for Spoken Languages
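The idea of one common XML weather model rendered separately for each language can be sketched in a few lines. This is only an approximation: the slide names XSLT as the transformation step, whereas the sketch substitutes plain Python lookup tables for the stylesheets. The element names, phrase ids, and output strings are all hypothetical placeholders, and the sign-language entries are placeholder labels, not real BSL glosses.

```python
import xml.etree.ElementTree as ET

# Hypothetical common weather model: a forecast is a sequence of phrase ids.
MODEL = """<forecast region="Midlands">
  <phrase id="cloudy"/>
  <phrase id="rain_later"/>
</forecast>"""

# Hypothetical per-language tables, each standing in for one XSLT stylesheet.
RENDERINGS = {
    "english": {"cloudy": "Cloudy.", "rain_later": "Rain later."},
    "bsl": {"cloudy": "BSL-PHRASE-CLOUDY", "rain_later": "BSL-PHRASE-RAIN-LATER"},
}

def render(model_xml, language):
    """Map each phrase in the model to its rendering in the chosen language."""
    table = RENDERINGS[language]
    root = ET.fromstring(model_xml)
    return [table[phrase.get("id")] for phrase in root.iter("phrase")]
```

The point of the design is that the forecast is authored once as a sequence of phrases, and adding a language means adding one more rendering table (or stylesheet) rather than changing the model.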
91Web Application: Weather Forecasts
- Rather cloudy with patchy rain at times, but also some brighter intervals. Windy but mild. Maximum temperature 13 deg C (55 deg F).
- Tonight: Patchy rain will clear during the night leaving clear spells. Still rather breezy. Mild. Minimum temperature 5 deg C (41 deg F).
- Met Office for Southeast, 06/03/2002
92Web Application Demo
93Virtual Human Signing Synthesis from Notation
94SiGML Notation for Signing
- Hamburg Notation System
- HamNoSys
- Code for hand shape and orientation, location, and movement
- Signing Gesture Markup Language
- XML Compliant (W3C standards)
- Builds on HamNoSys
95HamNoSys
- General notation for signing
- originally defined primarily for purposes of recording, transcription, study of signing
- Intention
- capable of representing any sign language
- some enhancements in area of non-manual features needed
- Defines
- semantic model for signing gestures
- pictographic notation
96HamNoSys Examples
- DGS (German) Sign: GOING-TO
- BSL Sign: NAME
- BSL Sign: ME
97SiGML Notation for Signing
- Gloss level
- GIVE_BOOK_I_YOU
- code for a complete sign
- similar to SignAnim and Tessa approach
- HamNoSys level
- encodes sign phonemes as in HamNoSys
- Articulation level
- represents captured or synthesised motion
- encodes arbitrary gestures
98XML Format
- Use nested labelled bracket structure
- Similar to HTML
- represent brackets by element tags
- <myelement> </myelement>
- Element
- may contain sub-elements and/or text
- may have named attributes
- Document Type Definition
- Defines possible elements (tags)
- permitted attributes for elements
99Current SiGML Definition
- Two XML Applications focussed on manual subset of HamNoSys
- HamNoSysML
- DTD as close as possible to HamNoSys
- SiGML
- Tuned for animation
100SiGML Notation: NAME-ME
<sigml>
  <sigmlsign>
    <sign_manual both_hands="false">
      <handconfig extfidir="ul" palmor="dl" handshape="point12" thumbpos="across" location="forehead_right"/>
      <directedmotion direction="or">
        <handconfig palmor="r"/>
      </directedmotion>
    </sign_manual>
  </sigmlsign>
  <sigmlsign>
    <sign_manual both_hands="false">
      <handconfig extfidir="uil" palmor="l" handshape="point1" thumbpos="across" location="chest_near"/>
    </sign_manual>
  </sigmlsign>
</sigml>
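Because SiGML is ordinary XML, the NAME-ME example on this slide can be read with any XML parser once its angle brackets, quotes, and equals signs are restored. The sketch below uses Python's standard `xml.etree.ElementTree`; the element and attribute names are those on the slide, while the `initial_configs` helper is a hypothetical name introduced for illustration. The markup reflects the SiGML definition current at the time of the talk and may differ from later versions.

```python
import xml.etree.ElementTree as ET

# The NAME-ME example from the slide, with its markup restored.
SIGML = """<sigml>
  <sigmlsign>
    <sign_manual both_hands="false">
      <handconfig extfidir="ul" palmor="dl" handshape="point12"
                  thumbpos="across" location="forehead_right"/>
      <directedmotion direction="or">
        <handconfig palmor="r"/>
      </directedmotion>
    </sign_manual>
  </sigmlsign>
  <sigmlsign>
    <sign_manual both_hands="false">
      <handconfig extfidir="uil" palmor="l" handshape="point1"
                  thumbpos="across" location="chest_near"/>
    </sign_manual>
  </sigmlsign>
</sigml>"""

def initial_configs(sigml_text):
    """Return (handshape, location) of the first handconfig in each sign."""
    root = ET.fromstring(sigml_text)
    result = []
    for sign in root.findall("sigmlsign"):
        # First handconfig that is a direct child of sign_manual.
        config = sign.find("sign_manual/handconfig")
        result.append((config.get("handshape"), config.get("location")))
    return result
```

For this document the helper returns one pair per sign: the starting handshape and location of NAME, then of ME.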
101NLP and Synthesis
102English to Sign
- Translation via intermediate code: Discourse Representation Structure (DRS)
103Animation from Notation
104Animation of HamNoSys
- Make explicit everything HamNoSys leaves implicit or fuzzy
- Position
- Elbows and Shoulders
- Speed
- Trajectories
105Naturalistic Animation
- A hard problem in general
- e.g. walking
- Easier for signing
- No interaction with environment
- Ignore gravity
106Controller Response
107Inverse Kinematics
- Hand position and orientation given by HamNoSys
- From these, compute joint angles from clavicle to wrist
- Inverse Kinematics
- 3 degrees of freedom per arm left undetermined
- Respect the limits of the joints
- Avoid the arm passing through the body
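A much-reduced version of the inverse kinematics problem can be written down in closed form. The sketch below solves a planar two-bone arm (shoulder and elbow only) for a target wrist position; the bone lengths and function names are assumptions for the example. The real problem described on the slide is three-dimensional, runs from clavicle to wrist, and has to resolve the remaining degrees of freedom and avoid collisions with the body, none of which this sketch attempts.

```python
import math

def two_link_ik(x, y, upper=0.30, fore=0.25):
    """Planar shoulder and elbow angles (radians) placing the wrist at (x, y).

    Returns None if the target is out of reach. Bone lengths are assumed values.
    """
    d2 = x * x + y * y
    cos_elbow = (d2 - upper**2 - fore**2) / (2 * upper * fore)
    if not -1.0 <= cos_elbow <= 1.0:
        return None                        # target unreachable for these lengths
    elbow = math.acos(cos_elbow)           # acos gives 0..pi, so the elbow bends one way only
    shoulder = math.atan2(y, x) - math.atan2(
        fore * math.sin(elbow), upper + fore * math.cos(elbow)
    )
    return shoulder, elbow

def wrist_position(shoulder, elbow, upper=0.30, fore=0.25):
    """Forward kinematics, used here to check an IK solution."""
    x = upper * math.cos(shoulder) + fore * math.cos(shoulder + elbow)
    y = upper * math.sin(shoulder) + fore * math.sin(shoulder + elbow)
    return x, y
```

Feeding the result of `two_link_ik(0.3, 0.25)` back into `wrist_position` should reproduce the target, which is a convenient sanity check. Restricting the elbow angle to the range returned by `acos` is a crude stand-in for respecting joint limits.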
108Ambient Motion
- If only arms, hands, and face are animated, the result is stiff
- Mix synthetic animation with motion-captured ambient motion for the spine and head
109Stick-figure Avatars
- Useful for developing animations
- Easier to render, so more frames per second
- Skeleton gives clearer view of motion
- Prototyping tool only, not intended for end user!
110VRML for Prototyping
- Virtual Reality Modelling Language
- Textual description language for 3D animated scenes
- H-Anim standard for articulated humanoid figures
- H-Anim incorporated into MPEG-4
111Conclusions
112Role of Deaf Organisations
- Bridge between the project partners and the deaf people who could benefit from the technology
- Wide dissemination of project aims
- Collation of UK feedback by RNID through visits to deaf clubs and groups
- Evaluations of prototype systems by deaf people to influence how systems can be improved
113ViSiCAST Conclusion
- Aims ambitious within 3 years
- Novel computational linguistics work to generate and represent signing
- Advanced avatar technology for signing virtual humans
- Input essential from deaf people so that the technology develops to maximise benefits
114ViSiCAST
Virtual Signing: Capture, Animation, Storage and Transmission
http://www.visicast.cmp.uea.ac.uk
http://www.visicast.org