Title: WP3 Demo
1. WP3 Demo
- Torino Meeting 9-10 March 2006
www.loquendo.com
2. Mobile Platform Architecture
Markup Language
HTML, JSP, SALT Tags
The application is web-oriented. It is coded as
HTML templates with SALT tags for voice
interaction, and Java Server Pages (or Servlets)
fill the HTML templates dynamically by
accessing the application data.
The Server and the Client are connected to an IP
LAN/WAN through a Wi-Fi or equivalent connection.
The mobile platform is made up of a Server,
implemented on a workstation or a portable PC,
and a Client, implemented on a PDA or Tablet PC.
The browser runs on the PDA and hosts a
Multimedia plug-in that handles the voice
interactions.
The SALT tags are expanded into JavaScript, and
the HTML pages and scripts are exchanged
between the Web Server and the Browser.
The application data can be located on the server
or be available through the internet
The browser communicates with the Embedded Voice
Server over a TCP connection, using XML.
The server hosts the Application Server (a
Web Server), which implements the
application logic.
The Embedded Voice Server is a wrapper for Loquendo
ASR and TTS that interacts with the Multimedia
plug-in and with the Audio Resource Manager.
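The XML-over-TCP link between the browser plug-in and the Embedded Voice Server can be illustrated with a minimal sketch. The slides do not give the message schema or the framing, so the element names (`request`, `param`), the `recognize` command and the 4-byte length prefix below are all assumptions made for illustration only.

```python
import socket
import struct
import xml.etree.ElementTree as ET


def build_request(command, **params):
    """Serialize a hypothetical request; the schema is assumed, not Loquendo's."""
    root = ET.Element("request", {"command": command})
    for name, value in params.items():
        ET.SubElement(root, "param", {"name": name}).text = str(value)
    return ET.tostring(root, encoding="utf-8")


def send_message(sock, payload):
    # Assumed framing: a 4-byte big-endian length, then the XML bytes,
    # so the receiver knows where one message ends on the TCP stream.
    sock.sendall(struct.pack(">I", len(payload)) + payload)


def _recv_exact(sock, n):
    # TCP may deliver fewer bytes than requested, so loop until n bytes arrive.
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_message(sock):
    (length,) = struct.unpack(">I", _recv_exact(sock, 4))
    return ET.fromstring(_recv_exact(sock, length))


def demo():
    # A connected socket pair stands in for the browser/voice-server link.
    client, server = socket.socketpair()
    try:
        send_message(client, build_request("recognize", grammar="maintenance"))
        msg = recv_message(server)
        return msg.get("command"), {p.get("name"): p.text for p in msg.findall("param")}
    finally:
        client.close()
        server.close()
```

Calling `demo()` sends one request through the socket pair and returns the command and parameters as parsed on the receiving side.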
3. Demo Limitations (with respect to the full architecture)
- This demo shows the application interaction as if
  some active pages had been expanded and delivered
  from the server side
- offline demo: the demo also works where no
  network coverage is provided
- voice-enabled HTML pages are local to the PDA
- data are not saved to a server (no connection to
  a web server)
- audio files are saved locally
- recognition grammars are compiled off-line and
  deployed locally on the PDA as Recognition
  Objects
4. Multimodal Features
Output: Graphic, Vocal
Input: Click/Touch/Press, Vocal
- Interaction modes available at the same time
- Vocal/visual output completing each other or
  redundant, according to application design
- We keep interaction simple (noisy conditions);
  application complexity is potentially very high
5. Interaction Features
- Push-to-talk: recognition is triggered when a
  hardware button on the PDA is pressed
- Interaction can take place without touching the
  screen: fields in form filling are proposed in
  sequence.
- corrections or changes in the sequence are made by
  touch-pen: voice and touch are connected by focus
  on the form fields
- dialogue flow should take into account that the user
  could interact by voice only, by touch pen only,
  or by any mixture of the two. We are still
  experimenting on the GV User Interface
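The focus-based coupling of voice and touch described above can be sketched as a small state holder. This is an illustrative model only, not the demo's implementation: the class name, the method names and the field names (borrowed from the recognition grammars) are assumptions.

```python
class FormDialogue:
    """Tracks which form field has focus; voice fills it, touch moves it."""

    def __init__(self, fields):
        self.fields = list(fields)
        self.values = {}
        self.focus = 0  # index of the field currently proposed to the user

    @property
    def current_field(self):
        # None means every field has been filled.
        return self.fields[self.focus] if self.focus < len(self.fields) else None

    def tap(self, field):
        """Touch-pen input: move the focus to a field, e.g. to correct it."""
        self.focus = self.fields.index(field)

    def speak(self, recognized_text):
        """Voice input (after push-to-talk): fill the focused field."""
        field = self.current_field
        if field is None:
            return
        self.values[field] = recognized_text
        self._advance()

    def _advance(self):
        # Propose the first field that is still empty, in form order.
        for index, name in enumerate(self.fields):
            if name not in self.values:
                self.focus = index
                return
        self.focus = len(self.fields)


form = FormDialogue(["maintenance", "part", "type"])
form.speak("Checking")      # fills "maintenance", focus moves to "part"
form.speak("left wing")     # fills "part", focus moves to "type"
form.tap("maintenance")     # correction by touch-pen
form.speak("Replacement")   # overwrites "maintenance", focus returns to "type"
```

Because both input modes act on the same focus index, a voice-only, touch-only or mixed session follows the same code path.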
6. Upcoming Features (September Project Review)
- Use of a headset with a noise-cancelling
  microphone: a simple noise subtraction circuit
  can do some basic audio filtering
- The operator name can be associated with a PIN
  code, thus creating an initial authentication
  page to access the application.
- On-the-fly grammar compilation: as foreseen by the
  project schedule, the Embedded ASR will be
  integrated with the grammar compiler tool. This
  will let dynamic application content enter the
  recognition grammars too.
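As a rough illustration of what on-the-fly compilation would enable, the sketch below builds the text of a grammar rule from data known only at run time. The rule notation (`<name> (a | b)`), the `operator` rule and the operator names are assumptions; the actual input format of the grammar compiler tool is not described in these slides.

```python
def build_rule(name, alternatives):
    """Return rule text listing each distinct alternative once, in order."""
    seen = []
    for alt in alternatives:
        alt = alt.strip()
        if alt and alt not in seen:
            seen.append(alt)
    if not seen:
        raise ValueError("a rule needs at least one alternative")
    return f"<{name}> ({' | '.join(seen)})"


# Hypothetical operator names fetched from the application data at run time.
operators = ["Rossi", "Bianchi", "Rossi"]
rule_text = build_rule("operator", operators)
```

In the planned setup, text like `rule_text` would be handed to the grammar compiler on the device instead of being compiled off-line.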
7. Recognition grammars
- <maintenance> (Checking | Replacement | Reparation | Removal)
- <part> ((right | left) wing | (right | left) aileron |
  (right | left) flap | (right | left) spoiler |
  (right | left) engine | landing gear |
  horizontal stabilizer | vertical fin | elevator | rudder)
- <type> (each spoken form is followed by its written form)
  "Falcon nine hundred E X" "Falcon 900 EX"
  "T B M seven hundred" "TBM 700"
  "Airbus three twenty" "Airbus 320"
  "Boeing seven seventy seven" "Boeing 777"
  "Fokker one hundred" "Fokker 100"
  "C R J one hundred" "CRJ 100"
  "A T R forty two" "ATR 42"
  "Cessna one hundred eighty two" "Cessna 182"
  "Citation Jet" "Citation Jet"
  "Boeing seven forty seven" "Boeing 747"
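The grammars above can be modelled as plain data to see which phrases they accept and how a spoken aircraft type maps to its written form. This is a sketch for illustration, not the Loquendo grammar format; it assumes that the side (right or left) is mandatory for the five sided parts and that matching is case-insensitive. Only part of the type list is reproduced.

```python
MAINTENANCE = ["Checking", "Replacement", "Reparation", "Removal"]
SIDES = ["right", "left"]
SIDED_PARTS = ["wing", "aileron", "flap", "spoiler", "engine"]
UNSIDED_PARTS = ["landing gear", "horizontal stabilizer", "vertical fin",
                 "elevator", "rudder"]

# Spoken form -> written form, taken from the <type> rule (subset).
AIRCRAFT_TYPES = {
    "Falcon nine hundred E X": "Falcon 900 EX",
    "T B M seven hundred": "TBM 700",
    "Airbus three twenty": "Airbus 320",
    "Boeing seven seventy seven": "Boeing 777",
    "Boeing seven forty seven": "Boeing 747",
}


def part_phrases():
    """Enumerate every phrase accepted by the <part> rule."""
    sided = [f"{side} {part}" for part in SIDED_PARTS for side in SIDES]
    return sided + UNSIDED_PARTS


def written_type(spoken):
    """Map a recognized spoken form to its written form, or None if unknown."""
    lookup = {key.lower(): value for key, value in AIRCRAFT_TYPES.items()}
    return lookup.get(spoken.strip().lower())


phrases = part_phrases()
```

Here `phrases` contains the ten sided combinations followed by the five unsided parts, and `written_type` returns the text that would be placed in the form field.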
8. Application Maintenance Report Demo
DEMO