Title: A Context Inference and Multimodal Approach to Mobile Information Access


1
A Context Inference and Multimodal Approach to
Mobile Information Access
  • David West
  • Trent Apted
  • Aaron Quigley

2
Overview
  • Multimodal Interaction (David West)
  • Motivation: Email Scenario
  • Agent-Based Architecture
  • EMMA
  • Application Supplied Components
  • Input Generation
  • Implicit IO Agent Configuration
  • Application Processing Context
  • Implementation
  • Context Awareness (Trent Apted)
  • Relevance
  • Our Approach
  • Implementation
  • Conclusion

3
Motivation - Example
Instrumented Environment
4
Motivation
  • To support perceptual spaces - multiple users,
    multiple (embedded) input/output devices,
    multiple applications
  • To enable seamless roaming between PAN/PLAN/PWAN
    environments
  • User should be able to approach any device and
    use it to continue her multimodal dialogue
    without interruption (e.g. without explicit
    configuration).
  • To decouple multimodal application processing
    logic from modality-specific input/output
    processing logic

5
Distributed Agent-Based Architecture
[Diagram: three columns of agents — Input Agents
(Speech, Pen, Mouse/Keyboard, ...), Application Agents
(Email, Scrap book) and Output Agents (Voice, Graphics
(XHTML/XUL/...)) — with the Context Plane between the
columns and EMMA flowing from input agents to
application agents]
  • Application agents and input/output agents reside
    on multiple devices
  • Context plane controls their bindings (see the
    sketch below)
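A minimal Java sketch of the three agent roles and the
context-plane binding; all interface and method names here
(InputAgent, ApplicationAgent, OutputAgent, ContextPlane,
applicationFor, outputsFor) are hypothetical illustrations,
not the framework's actual API.

// Hypothetical interfaces illustrating the three agent roles and the
// context-plane bindings sketched in the diagram above.
interface InputAgent {
    // Produce modality-neutral input (an EMMA document) from a user signal.
    String captureAsEmma();
}

interface ApplicationAgent {
    // Consume EMMA input and return application-defined output.
    String process(String emmaDocument);
}

interface OutputAgent {
    // Render application-defined output in a concrete modality (voice, GUI, ...).
    void render(String applicationOutput);
}

interface ContextPlane {
    // The context plane decides which application an input agent currently
    // feeds, and which output agents render that application's responses.
    ApplicationAgent applicationFor(InputAgent in, String userId);
    Iterable<OutputAgent> outputsFor(ApplicationAgent app, String userId);
}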

6
EMMA Example
<emma:emma emma:version="1.0">
  <emma:interpretation emma:id="speech1"
      emma:start="2004-07-26T0:00:00.2"
      emma:end="2004-07-26T0:00:00.4"
      emma:confidence="0.8"
      emma:medium="acoustic"
      emma:mode="speech">
    <command>next</command>
  </emma:interpretation>
  <emma:interpretation></emma:interpretation>
</emma:emma>
7
Application Supplied Components for Input and
Output Agents
8
Input Generation
  • Recognition: users produce an input signal, which
    is passed through a recogniser in the input agent.
    The recogniser is constrained by a grammar.
  • E.g. an email application speech grammar may look
    like the following (JSGF-style; a matching sketch
    follows the grammar):
public <COMMAND> = Read | Next | Delete | <FILE>;
<FILE> = (File | store | move) (in | to) folder <FOLDER>;
<FOLDER> = personal | spam | work;
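As a rough illustration of grammar-constrained recognition,
the sketch below accepts only utterances generated by the
<COMMAND> grammar above; the regular expression is a
simplified stand-in for a speech recogniser loaded with the
grammar, and the CommandMatcher class is invented for this
example.

import java.util.regex.Pattern;

// Simplified stand-in for a grammar-constrained recogniser: the regular
// expression mirrors the <COMMAND> grammar above, so only utterances the
// grammar allows are accepted.
public class CommandMatcher {
    private static final Pattern COMMAND = Pattern.compile(
        "(?i)(read|next|delete|(file|store|move)\\s+(in|to)\\s+folder\\s+(personal|spam|work))");

    public static boolean isInGrammar(String utterance) {
        return COMMAND.matcher(utterance.trim()).matches();
    }

    public static void main(String[] args) {
        System.out.println(isInGrammar("File in folder personal")); // true
        System.out.println(isInGrammar("Archive everything"));      // false
    }
}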

9
Application Supplied Components for Input and
Output Agents
10
Input Generation
  • Interpretation: once user input is recognised, it
    is passed through an interpretation component to
    produce EMMA.
  • E.g. an EMMA document may contain the following
    interpretation (a sketch of such an interpreter
    follows):
<command>
  File
  <folder>
    personal
  </folder>
</command>
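A minimal sketch of such an interpretation component,
assuming the recognised utterance arrives as plain text; the
EmailInterpreter class is hypothetical, and a real
application-supplied interpreter would wrap this fragment in
a complete emma:interpretation element with metadata.

// Hypothetical interpretation component: maps a recognised utterance such as
// "File in folder personal" onto the application-specific instance data above.
public class EmailInterpreter {
    public String interpret(String utterance) {
        String[] tokens = utterance.trim().split("\\s+");
        if (tokens.length == 4 && tokens[2].equalsIgnoreCase("folder")) {
            return "<command>\n"
                 + "  " + tokens[0] + "\n"
                 + "  <folder>" + tokens[3].toLowerCase() + "</folder>\n"
                 + "</command>";
        }
        // Single-word commands such as "next" or "delete".
        return "<command>" + tokens[0].toLowerCase() + "</command>";
    }
}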

11
Implicit Activation of IO Agents
  • Command agent activation of input and output
    agents
  • An identification component: the user indicates
    that they wish to use an agent
  • Simple GUI login mechanism
  • Could use biometric identification mechanisms
    (thumbprint, retina scanners, ...)
  • RFID, proximity sensors, ...

12
Seamless Agent Configuration
  1. Pen agent requests ApplicationWithFocus for the
     current user
  2. Pen agent requests the data-plane location of the
     grammar and interpreter
  3. Pen agent loads the grammar and interpreter
  4-6. As for 1-3.
  7. Pen agent sends EMMA to the email agent
  8. Email agent sends application-defined output to
     the Voice agent.
  (A rough code walk-through of steps 1-3 follows.)
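A rough code walk-through of steps 1-3 (and, in comments,
7-8) against a hypothetical context-plane lookup API; none
of the interface or method names below are claimed to be the
real implementation.

// Illustrative pen-agent configuration against invented context/data-plane APIs.
public class PenAgentBinding {

    interface ContextPlaneLookup {
        String applicationWithFocus(String userId);   // step 1
        String grammarLocation(String appId);         // step 2
        String interpreterLocation(String appId);     // step 2
    }

    interface DataPlane {
        byte[] fetch(String location);                // step 3: raw grammar/interpreter bytes
    }

    void configure(ContextPlaneLookup context, DataPlane data, String userId) {
        String appId = context.applicationWithFocus(userId);                  // 1
        byte[] grammar     = data.fetch(context.grammarLocation(appId));      // 2-3
        byte[] interpreter = data.fetch(context.interpreterLocation(appId));  // 2-3
        // ... the pen agent then recognises input against the grammar, produces
        // EMMA and sends it to the email agent (step 7); the email agent's reply
        // is routed to the voice output agent by the context plane (step 8).
    }
}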

13
Application Processing Context
  • Input agents also load command
    grammars/interpreters
  • Allow application focus shifts
  • E.g. "Switch to my scrap book application"
  • Application focus shift causes context plane to
    trigger all input/output agents in use by a user
    to rebind to the new application
  • Currently only one active application allowed.

14
Implementation
  • Object-oriented framework
  • Input modes: pen, speech, GUI
  • Output modes: text-to-speech, GUI
  • Test applications: email and scrapbook
  • See our UbiComp 2004 demo paper, "The UbiComp
    scrapbook"
  • Current context plane implemented using LIME
    (Linda In a Mobile Environment), providing a
    shared tuple space abstraction
  • Grammars, interpreters and stylers are Java
    classes.
  • Stored in the data plane
  • Custom class loaders in IO agents load these from
    the data plane as required (a minimal sketch
    follows)
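A minimal sketch of the custom class-loader idea, assuming
the IO agent can obtain raw class bytes from the data plane
(faked here with a map); ClassLoader.defineClass is standard
JDK, the surrounding names are illustrative.

// Loads application-supplied interpreter/grammar/styler classes whose
// bytecode has been fetched out of the data plane.
public class DataPlaneClassLoader extends ClassLoader {
    private final java.util.Map<String, byte[]> dataPlane; // class name -> bytecode

    public DataPlaneClassLoader(java.util.Map<String, byte[]> dataPlane) {
        this.dataPlane = dataPlane;
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        byte[] bytes = dataPlane.get(name); // stand-in for a real data-plane fetch
        if (bytes == null) {
            throw new ClassNotFoundException(name);
        }
        return defineClass(name, bytes, 0, bytes.length);
    }
}

An IO agent could then instantiate, say, an email interpreter via
loader.loadClass(...).getDeclaredConstructor().newInstance().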

15
Context Awareness
  • Trent Apted

16
Context Awareness
  • Not just concerned with location and time, nor
    simply computing resources
  • Applications
  • present their own context
  • provide evidence to influence the resolution
  • we resolve the context into high-level concepts
  • We want to function well without this evidence,
    but better when we have it
  • We want to learn from user decisions

17
Motivation - Recap
Instrumented Environment
18
Scenario
  • As you walk into a room, you receive an email
  • There is evidence to suggest that the email is
    urgent and confidential
  • From past actions, we know that you prefer to
    read email on large displays
  • However, the large, public display in this room
    can be seen by anyone also in the room
  • But we also know there is nobody else in the room

19
Decisions
  • How do we convey the context to the email
    application?
  • How do we initially present the email?
  • How do you influence the choice of output
    modality we make for you?
  • How are you able to reply to the email?
  • How do we learn from the choices you make?

20
Our Approach
  • Basic Rules (domain knowledge)
  • Ontologies to aid our sharing of context
  • Infer context through relationships
  • Probabilistic and temporal logic
  • Resolve context to establish possible actions
  • Rank the possibilities based on suitability, user
    preference and evidence provided
  • Feedback (reinforcement learning)
  • If the user adjusts the decision, adjust the
    reasoning model (a toy sketch of this
    ranking-and-feedback loop follows)
  • The Context Plane
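A toy sketch of the ranking-and-feedback loop above:
candidate actions are scored by suitability, a learned
per-action preference and application-supplied evidence, and
a user override shifts the preference weights. The
multiplicative score and the fixed 0.1 adjustment are
illustrative assumptions, not the actual reasoning model.

import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

// Toy ranking of possible actions plus a simple preference update on feedback.
public class ActionRanker {
    private final Map<String, Double> preference = new HashMap<>(); // learned per-action weight

    public String best(Map<String, Double> suitability, Map<String, Double> evidence) {
        return suitability.keySet().stream()
                .max(Comparator.comparingDouble((String a) ->
                        suitability.get(a)
                        * preference.getOrDefault(a, 0.5)
                        * evidence.getOrDefault(a, 1.0)))
                .orElseThrow();
    }

    // Feedback: if the user overrides our choice, shift preference toward theirs.
    public void observeUserChoice(String chosenByUs, String chosenByUser) {
        if (!chosenByUs.equals(chosenByUser)) {
            preference.merge(chosenByUser, 0.6, (old, inc) -> Math.min(1.0, old + 0.1));
            preference.merge(chosenByUs,   0.4, (old, dec) -> Math.max(0.0, old - 0.1));
        }
    }
}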

21
Implementation (current work)
  • We want to harness existing work
  • Representation, Rule Based Inference
  • CYC Upper Ontology (domain knowledge)
  • F-OWL (f-logic for the Web Ontology Language)
  • Probabilistic and Temporal Logic
  • Dynamic Bayesian Networks (K. Murphy 2002)
  • Intel Probabilistic Networks Library (PNL)
  • Infrastructure (the Plane)
  • Applications tap in regardless of connectivity
  • Feedback, new applications, new context
  • Our own techniques for collecting evidence and
    dynamically adapting our inference network (a toy
    filtering step is sketched below)
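Not PNL code, just a minimal two-state forward-filtering
step (the simplest dynamic Bayesian network, an HMM-style
filter) showing the predict-then-weigh-by-evidence pattern;
the probabilities and the room-occupancy framing are made up
for illustration.

// One forward-filtering step: predict the new state from the transition model,
// then weigh by the likelihood of the observed evidence and renormalise.
public class RoomOccupancyFilter {
    // P(occupied_t | occupied_{t-1}) and P(occupied_t | empty_{t-1}) -- assumed numbers.
    static final double STAY_OCCUPIED = 0.9, BECOME_OCCUPIED = 0.2;

    // belief = current P(room occupied); the other arguments are the evidence
    // likelihoods P(sensor reading | occupied) and P(sensor reading | empty).
    static double update(double belief, double pEvidenceGivenOccupied, double pEvidenceGivenEmpty) {
        double predicted = belief * STAY_OCCUPIED + (1 - belief) * BECOME_OCCUPIED;  // predict
        double num = predicted * pEvidenceGivenOccupied;                             // weigh
        double den = num + (1 - predicted) * pEvidenceGivenEmpty;
        return num / den;                                                            // renormalise
    }

    public static void main(String[] args) {
        double belief = 0.5;
        // A proximity sensor fires: far more likely if the room is occupied.
        belief = update(belief, 0.8, 0.1);
        System.out.printf("P(occupied) = %.2f%n", belief);   // prints 0.91
    }
}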

22
Conclusion
  • An infrastructure and protocol for multi-modal
    interaction
  • supports multiple users across multiple
    applications
  • multiple I/O modalities in a mobile/instrumented
    environment
  • Integrated with a supporting information access
    infrastructure (currently using LIME)
  • Discussion
  • {dwest, tapted, aquigley}@it.usyd.edu.au

23
Questions / Discussion
24
This slide intentionally left blank ?
25
Virtual Personal Server Space (VPSS)
26
Application Input
  • Input agents send application-defined,
    modality-neutral input to application agents in
    the form of the Extensible MultiModal Annotation
    Language (EMMA), part of the W3C's multimodal
    interaction framework
  • EMMA consists of:
  • Instance data: application-specific
    interpretation(s) of user intent
  • Data model: specifies constraints on the format
    of the instance data, e.g. XML Schema, DTD. May
    be implicit
  • Metadata: information about the instance data,
    e.g. timestamps, confidence scores, processing
    information... (a parsing sketch follows)
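A sketch of how an application agent might read the metadata
and instance data out of an EMMA document such as the slide 6
example, using the JDK's DOM parser; the namespace URI below
is the one used in the W3C EMMA working drafts of that
period, stated here as an assumption.

import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Pulls the metadata (confidence, mode) and the instance data (<command>)
// out of a small EMMA document.
public class EmmaReader {
    static final String EMMA_NS = "http://www.w3.org/2003/04/emma";

    public static void main(String[] args) throws Exception {
        String doc = "<emma:emma emma:version=\"1.0\" xmlns:emma=\"" + EMMA_NS + "\">"
                   + "<emma:interpretation emma:id=\"speech1\" emma:confidence=\"0.8\""
                   + " emma:medium=\"acoustic\" emma:mode=\"speech\">"
                   + "<command>next</command></emma:interpretation></emma:emma>";

        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Element interp = (Element) factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(doc)))
                .getElementsByTagNameNS(EMMA_NS, "interpretation").item(0);

        System.out.println("confidence = " + interp.getAttributeNS(EMMA_NS, "confidence"));
        System.out.println("mode       = " + interp.getAttributeNS(EMMA_NS, "mode"));
        System.out.println("command    = "
                + interp.getElementsByTagName("command").item(0).getTextContent());
    }
}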

27
Software Architecture
  • Object-Oriented Framework
  • Application writers write top-most layer only

28
Our Method
  • The Context Plane
  • Collects and resolves context from the
    infrastructure
  • Makes context available to mobile devices
  • Collects context/evidence from applications to
    share with other applications and assist
    inferences
  • Uses a common protocol between applications
  • Binds application and I/O agents across multiple
    devices (a Linda-style tuple-space sketch follows)
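Since the current context plane is built on LIME's shared
tuple spaces (slide 14), the interaction style can be
sketched Linda-fashion as below; the TupleSpace interface
and ToyTupleSpace class are hypothetical stand-ins, not
LIME's actual classes.

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Linda-style sketch of evidence/context sharing through the context plane.
public class ContextSharingSketch {
    // A toy tuple space: out() publishes a tuple, in() blocks until one whose
    // first field matches the given type is available.
    interface TupleSpace {
        void out(String[] tuple);
        String[] in(String type) throws InterruptedException;
    }

    // Naive single-queue implementation, enough to show the interaction style
    // (it would spin if a matching tuple never arrived).
    static class ToyTupleSpace implements TupleSpace {
        private final BlockingQueue<String[]> tuples = new LinkedBlockingQueue<>();
        public void out(String[] tuple) { tuples.add(tuple); }
        public String[] in(String type) throws InterruptedException {
            while (true) {
                String[] t = tuples.take();
                if (t[0].equals(type)) return t;
                tuples.add(t); // not a match; put it back
            }
        }
    }

    public static void main(String[] args) throws Exception {
        TupleSpace contextPlane = new ToyTupleSpace();
        // An email application publishes evidence about an incoming message.
        contextPlane.out(new String[] {"evidence", "email-42", "urgent", "confidential"});
        // The context plane (or another application) later picks that evidence up.
        String[] e = contextPlane.in("evidence");
        System.out.println("evidence about " + e[1] + ": " + e[2] + ", " + e[3]);
    }
}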