Title: A Standard for Developing Multimodal Applications
1 A Standard for Developing Multimodal Applications
- James A. Larson, Larson Technical Services, jim _at_ larson-tech.com
- SpeechTEK West, February 23, 2007
2 Status of W3C Multimodal Interface Languages
Recommendation
- VoiceXML 2.0
- Speech Recognition Grammar Specification (SRGS) 1.0
- Speech Synthesis Markup Language (SSML) 1.0
- Semantic Interpretation for Speech Recognition (SISR) 1.0
Proposed Recommendation
- VoiceXML 2.1
Candidate Recommendation
Last Call Working Draft
- Extensible MultiModal Annotation (EMMA) 1.0
Working Draft
- State Chart XML (SCXML) 1.0
- InkML 1.0
Requirements
3 Interaction Manager Approaches
[Figure: two approaches. X+V: Interaction Manager (XHTML) with VoiceXML 2.0 Modules. W3C: Interaction Manager (SCXML) with a Data Model, coordinating XHTML, VoiceXML 3.0, and InkML.]
4
[Table: Object-oriented, SALT, X+V, and W3C approaches compared by Standard Languages, Interaction Manager, and Modes. Languages named: XHTML, VoiceXML, SCXML, SRGS, SSML, SISR, EMMA, CCXML. Interaction managers named: XHTML, C, SCXML. Modes: GUI and Speech for all four approaches, plus Ink.]
5 MMI Architecture: Basic Components
- Interaction Manager: coordinates modality components and provides application flow
- Modality Components: provide modality capabilities such as speech, pen, keyboard, mouse
- Data Model: handles shared data
[Figure: Interaction Manager (SCXML) with a Data Model, connected to XHTML, VoiceXML 3.0, and InkML modality components]
6 Multimodal Architecture and Interfaces
- A loosely coupled, event-based architecture for integrating multiple modalities into applications
- All communication is event-based
- Based on a set of standard life-cycle events
- Components can also expose other events as required
- Encapsulation protects component data
- Encapsulation enhances extensibility to new modalities
- Can be used outside a Web environment
7 Specify Interaction Manager Using Harel State Charts
- Extension of state transition systems
- States
- Transitions
- Nested state-transition systems
- Parallel state-transition systems
- History
[Figure: state chart with states PrepareState, StartState, WaitState, EndState, and FailState; transitions labeled Prepare Response (success), Prepare Response (fail), Start Response, StartFail, Done Success, and DoneFail]
8 Example State Transition System
- State Chart XML (SCXML)
  <state id="PrepareState">
    <send event="prepare" contentURL="hello.vxml"/>
    <transition event="prepareResponse" cond="status=='success'" target="StartState"/>
    <transition event="prepareResponse" cond="status=='failure'" target="FailState"/>
  </state>
[Figure: the same state chart as on slide 7]
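The state transition system on this slide can be approximated in a few lines of Python. This is only a sketch under stated assumptions: the two PrepareState rows mirror the SCXML fragment, while the StartState and WaitState rows, the event names startResponse and done, and the helper names step and run are read off the figure or invented for illustration.

```python
# Transition table: (current state, event, status) -> next state.
# Only the PrepareState rows come from the SCXML on the slide; the
# remaining rows are assumptions based on the state chart figure.
TRANSITIONS = {
    ("PrepareState", "prepareResponse", "success"): "StartState",
    ("PrepareState", "prepareResponse", "failure"): "FailState",
    ("StartState", "startResponse", "success"): "WaitState",
    ("StartState", "startResponse", "failure"): "FailState",
    ("WaitState", "done", "success"): "EndState",
    ("WaitState", "done", "failure"): "FailState",
}


def step(state, event, status):
    """Return the next state, or stay in place if no transition matches."""
    return TRANSITIONS.get((state, event, status), state)


def run(events, state="PrepareState"):
    """Feed a sequence of (event, status) pairs through the machine."""
    for event, status in events:
        state = step(state, event, status)
    return state
```

Keeping the transitions in a table rather than in nested conditionals keeps the sketch close to the declarative style of SCXML, where each transition element names an event, a condition, and a target.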
9 Example State Chart with Parallel States
[Figure: two parallel state charts, one for Voice and one for GUI. Each has Prepare, Start, Wait, End, and Fail states, with transitions labeled Prepare Response Success, Prepare Response Fail, Start Response, Start Fail, Done Success, and Done Fail.]
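One way to picture parallel states is as several regions that are all active at once, each advancing independently. The sketch below assumes one region per modality (Voice and GUI) with the same states in each; the class name ParallelChart, the dotted event names, and the rule that the chart finishes when every region reaches End are illustrative assumptions, not taken from the SCXML draft.

```python
# Per-region transitions: (current state, event) -> next state.
# Event names are hypothetical labels for the arrows in the figure.
REGION_TRANSITIONS = {
    ("Prepare", "prepareResponse.success"): "Start",
    ("Prepare", "prepareResponse.fail"): "Fail",
    ("Start", "startResponse"): "Wait",
    ("Start", "start.fail"): "Fail",
    ("Wait", "done.success"): "End",
    ("Wait", "done.fail"): "Fail",
}


class ParallelChart:
    """Two or more regions that are all active at the same time."""

    def __init__(self, regions):
        self.states = {name: "Prepare" for name in regions}

    def dispatch(self, region, event):
        """Advance one region; the other regions keep their current state."""
        current = self.states[region]
        self.states[region] = REGION_TRANSITIONS.get((current, event), current)

    def finished(self):
        """The parallel state completes only when every region has ended."""
        return all(state == "End" for state in self.states.values())
```

For example, after ParallelChart(["Voice", "GUI"]) receives a successful prepare response for Voice only, the Voice region is in Start while the GUI region is still in Prepare.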
10 The Life Cycle Events
- prepare, prepareResponse
- start, startResponse
- cancel, cancelResponse
- pause, pauseResponse
- resume, resumeResponse
[Figure: each event pair exchanged between the Interaction Manager (SCXML) and the XHTML and VoiceXML modality components]
11 More Life Cycle Events
- newContextRequest, newContextResponse
- data
- done
- clearContext
[Figure: each event exchanged between the Interaction Manager (SCXML) and the XHTML and VoiceXML modality components]
12 Synchronization Using the Lifecycle Data Event
- Intent-based events
- Capture the underlying intent rather than the physical manifestation of user-interaction events
- Independent of the physical characteristics of particular devices
- Data/reset: Reset one or more field values to null
- Data/focus: Focus on another field
- Data/change: Field value has changed
[Figure: data events exchanged between the Interaction Manager (SCXML) and the XHTML and VoiceXML modality components]
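The three data events listed above can be illustrated with a small shared data model. This is a minimal sketch, assuming dotted event names in the style of the data.change event used later in the deck; the class DataModel and the payload keys field, fields, and value are hypothetical.

```python
class DataModel:
    """Shared field values plus the field that currently has focus."""

    def __init__(self, fields):
        self.values = {name: None for name in fields}
        self.focus = None

    def apply(self, event, **payload):
        """Apply one intent-based data event to the model."""
        if event == "data.reset":
            # Reset one or more field values to null.
            for name in payload["fields"]:
                self.values[name] = None
        elif event == "data.focus":
            # Move focus to another field.
            self.focus = payload["field"]
        elif event == "data.change":
            # Record that a field value has changed.
            self.values[payload["field"]] = payload["value"]
        else:
            raise ValueError(f"unknown data event: {event}")
```

Because the events describe intent ("the city changed") rather than a device action ("a key was pressed"), the same call works whether the value came from speech, pen, or keyboard.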
13 Lifecycle Events between Interaction Manager and Modality
[Figure: lifecycle events mapped onto the Interaction Manager state chart. prepare is sent from PrepareState; prepare response (success) leads to StartState and prepare response (failure) leads to FailState. start is sent from StartState, answered by start response (success) or start response (failure). data is exchanged in WaitState, and done leads to EndState.]
14 MMI Architecture Principles
- Interaction Manager communicates with Modality Components through asynchronous events
- Modality Components don't communicate directly with each other, but indirectly through the Interaction Manager
- Components must implement basic life cycle events, may expose other events
- Modality components can be nested (e.g. a Voice Dialog component like a VoiceXML <form>)
- Components need not be markup-based
- EMMA communicates users' inputs to the Interaction Manager
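The first two principles can be sketched as a hub that queues events and relays them, so that no component ever holds a reference to another. Everything here is an illustrative assumption: the class names, the post and run_pending methods, and the choice to relay each event to every component other than its sender.

```python
from collections import deque


class ModalityComponent:
    """A component that only ever talks to the interaction manager."""

    def __init__(self, name):
        self.name = name
        self.received = []

    def handle(self, event, data):
        self.received.append((event, data))


class InteractionManager:
    """Hub that queues incoming events and relays them to the other components."""

    def __init__(self):
        self.components = {}
        self.queue = deque()

    def register(self, component):
        self.components[component.name] = component

    def post(self, source, event, data=None):
        """Called by a component; returns at once, delivery happens later."""
        self.queue.append((source, event, data))

    def run_pending(self):
        """Deliver every queued event to all components except its sender."""
        while self.queue:
            source, event, data = self.queue.popleft()
            for name, component in self.components.items():
                if name != source:
                    component.handle(event, data)
```

The queue stands in for asynchronous delivery: a voice component that posts a change returns immediately, and the GUI component sees the event only when the manager next drains its queue.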
15 Modalities
- GUI Modality (XHTML)
- Adapter converts lifecycle events to XHTML events
- XHTML events converted to lifecycle events
- Voice Modality (VoiceXML 3.0)
- Lifecycle events are embedded into VoiceXML 3.0
16 Modalities
- VoiceXML supports
- Events sent from the Interaction Manager
- Sending events to the Interaction Manager
  <form>
    <catch name="change">
      <assign name="city" value="data"/>
    </catch>
    <field name="city">
      <prompt> Blah </prompt>
      <grammar src="city.grxml"/>
      <filled>
        <send event="data.change" data="city"/>
      </filled>
    </field>
  </form>
17 Modalities
- XHTML is extended to send events to the Interaction Manager
  <head>
    <ev:listener ev:event="onChange" ev:observer="app1" ev:handler="onChangeHandler()"/>
    <script>
      function onChangeHandler() {
        post("data", data="city");
      }
    </script>
  </head>
  <body id="app1">
    <input type="text" id="city" value=" "/>
  </body>
18 Modalities
- XHTML is extended to support events received from the Interaction Manager
  <head>
    <handler type="text/javascript" ev:event="data">
      if (event == "change") {
        document.app1.city.value = "data.city";
      }
    </handler>
  </head>
  <body id="app1">
    <input type="text" id="city" value=" "/>
  </body>
19 References
- SCXML
- Second working draft available at http://www.w3.org/TR/2006/WD-scxml-20060124/
- Open Source available from http://jakarta.apache.org/commons/sandbox/scxml/
- Multimodal Architecture and Interfaces
- Working draft available at http://www.w3.org/TR/2006/WD-mmi-arch-20060414/
- Voice Modality
- First working draft of VoiceXML 3.0 scheduled for November 2007
- XHTML
- Full recommendation
- Adapters must be hand-coded
- Other modalities
- TBD
20 Availability
- SAPI 5.3
- Microsoft Windows Vista
- X+V
- ACCESS Systems NetFront Multimodal Browser for PocketPC 2003 http://www-306.ibm.com/software/pervasive/multimodal/?Opencadaw-prod-mmb
- Opera Software Multimodal Browser for Sharp Zaurus http://www-306.ibm.com/software/pervasive/multimodal/?Opencadaw-prod-mmb
- Opera 9 for Windows http://www.opera.com/
- W3C
- First working draft of VoiceXML 3.0 not yet available
- Working drafts of SCXML are available; some open-source implementations are available
- Proprietary APIs
- Available from vendor
21 Final Advice
- The W3C is defining a rich collection of languages for authoring multimodal applications
- SCXML can be used as an Interaction Manager
- Many languages for modalities: VoiceXML, XHTML, ...
- EMMA may be used to describe data transmitted among modules
- W3C languages will be available on multiple platforms
- Avoid getting locked into using proprietary languages available only on a single platform
22 Web Resources
- http://www.w3.org/voice
- Specification of grammar, semantic interpretation, and speech synthesis languages
- http://www.w3.org/2002/mmi
- Specification of EMMA and InkML languages
- http://www.microsoft.com (and query SALT)
- SALT specification and download instructions for adding SALT to Internet Explorer
- http://www-306.ibm.com/software/pervasive/multimodal/
- X+V specification; download Opera and ACCESS browsers
- http://www.larson-tech.com/SALT/ReadMeFirst.html
- Student projects using SALT to develop multimodal applications
- http://www.larson-tech.com/MMGuide.html or http://www.w3.org/2002/mmi/Group/2006/Guidelines/
- User interface guidelines for multimodal applications