Title: An Oxygenated Presentation Manager
1An Oxygenated Presentation Manager
- Larry Rudolph
- Oxygen Workshop, January, 2002
2Goals Overview
- Integrate Many Oxygen Technologies
- Application Driven
- Use an application that we understand
- Personally use often
- Would help if it were more human-centric
- Develop Architectural Infrastructure
- Exposes new requirements
- Identify some important, basic issues
- Infrastructure
- Critique of Presentation Manager
- What is wrong with it
3Application Scenario
4An Oxygen Application
- Components
- Input
- Vision
- Speech
- Touch
- Processing
- Changing configuration
- Output
- Projector
- Handheld
- Archive
5Camera watching laser point on screen
- Camera Challenges
- Inexpensive ones have wrong focal length
- Alignment issues: use edge of screen, display pattern, figure out from what is known to be visible
- We ended up displaying a pattern of concentric circles
- Relative size of laser point depends on distance
- Beyond ten feet, had to use only certain types of lasers
- Could slow down camera and let pixels saturate (too complicated)
- Camera Interface
- Click at point (x,y)
- Hold laser at same location for 5 seconds
- Select horizontal line ( (x1,y1) , (x2,y1) )
- Sweep laser back and forth, line is diameter of ellipse
- Select object centered at point (x,y)
- Sweep laser in circle, point is center of circle
- Previous or Next
- Click in left (right) 1/8 of screen
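The gesture vocabulary above can be sketched as a small classifier over tracked laser positions. This is a minimal illustration, not the deck's implementation: the function name, the `(t, x, y)` sample format with coordinates normalized to the screen, and every threshold except the 5-second dwell and the 1/8 edge zones are assumptions.

```python
def classify_gesture(samples, dwell_seconds=5.0, dwell_radius=0.02, edge=0.125):
    """Classify a laser trace given as a list of (t, x, y) samples.

    t is in seconds; x and y are assumed to be normalized to [0, 1]
    relative to the projected screen. dwell_radius is an illustrative guess.
    """
    if not samples:
        return None
    xs = [x for _, x, _ in samples]
    ys = [y for _, _, y in samples]
    width, height = max(xs) - min(xs), max(ys) - min(ys)
    cx, cy = sum(xs) / len(xs), sum(ys) / len(ys)
    duration = samples[-1][0] - samples[0][0]

    # Dwell: the point stays inside a small box for long enough.
    if max(width, height) <= 2 * dwell_radius:
        if duration < dwell_seconds:
            return None
        if cx < edge:
            return ("previous",)
        if cx > 1 - edge:
            return ("next",)
        return ("click", cx, cy)

    # Back-and-forth sweep: a flat ellipse whose long axis is the selected line.
    if height < 0.25 * width:
        return ("line", (min(xs), cy), (max(xs), cy))

    # Anything else is treated as a circular sweep; report the box center.
    return ("select", (min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2)
```

Only horizontal sweeps are recognized as lines here, matching the "horizontal line" gesture on the slide; a real tracker would also need noise filtering.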
6Microphone listening to speaker
- Microphone
- Many technologies; current approach: ipaq
- Push to speak
- Audio server on ipaq
- Detects start and stop
- Best results when human pushes to start and releases to stop
- Audio wave file sent to Galaxy speech system
- Galaxy output (HTTP Post cmd) to CGI-script Server
- Actions are
- Powerpoint
- Next slide, Previous slide, Goto slide n, Goto slide named <xxx>
- Next item, Previous item, Goto item n, Goto item named <xxx>
- Next animation, Previous animation, Goto animation n
- Presentation
- Start presentation <name>, End presentation, Pause presentation
- Session
- Initialize Camera, test microphone
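As a rough illustration of how the listed actions might be turned into fields for the HTTP POST to the CGI script, here is a small parser. The grammar, the field names (`action`, `target`, `index`, `name`), and the helper names are assumptions made for this sketch; the deck does not specify the wire format.

```python
import re
from urllib.parse import urlencode

# Hypothetical patterns covering the Powerpoint and Presentation actions above.
STEP = re.compile(r"^(next|previous) (slide|item|animation)s?$")
GOTO_N = re.compile(r"^goto (slide|item|animation) (\d+)$")
GOTO_NAMED = re.compile(r"^goto (slide|item) named (.+)$")
PRESENTATION = re.compile(r"^(start|end|pause) presentation(?: (.+))?$")

def parse_command(text):
    """Return a dict of form fields for a recognized command, else None."""
    text = " ".join(text.lower().split())  # normalize case and spacing
    if m := STEP.match(text):
        return {"action": m.group(1), "target": m.group(2)}
    if m := GOTO_N.match(text):
        return {"action": "goto", "target": m.group(1), "index": int(m.group(2))}
    if m := GOTO_NAMED.match(text):
        return {"action": "goto", "target": m.group(1), "name": m.group(2)}
    if m := PRESENTATION.match(text):
        fields = {"action": m.group(1), "target": "presentation"}
        if m.group(2):
            fields["name"] = m.group(2)
        return fields
    return None  # unrecognized text is ignored rather than guessed

def to_post_body(fields):
    """Encode the fields as an application/x-www-form-urlencoded body."""
    return urlencode(fields)
```

For example, `to_post_body(parse_command("Goto slide 7"))` yields `action=goto&target=slide&index=7`, which either the speech system or the handheld could send.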
7Speaker controlling presentation via ipaq
- Ipaq output (HTTP Post cmd) to CGI-script Server
- Same actions as from speech server
- Actions are
- Powerpoint
- Next slide, Previous slide, Goto slide n, Goto slide named <xxx>
- Next item, Previous item, Goto item n, Goto item named <xxx>
- Next animation, Previous animation, Goto animation n
- Presentation
- Start presentation <name>, End presentation, Pause presentation
- Session
- Initialize Camera, test microphone
- Handheld (Ipaq) display
- GUI generated from speechbuilder grammar
- List of slides, items per slides
- Currently use an ad-hoc solution where powerpoint sends lists to ipaq; need a more automatic solution
8Processing controlling session
- Easy to embed movies into presentation
- Add rule that is invoked on slide n
- Switch output producer from powerpoint to media player
- Remove interrupting technologies
- Dynamically disconnect any input / output source
- All done via core language
9Output to projector, handheld, archive
- Unlimited number of video / audio output producers
- E.g. powerpoint just one producer of output
- At any time, each output device has an associated producer
- This producer can receive input from several producers
- Handheld has proxy
- To reduce bandwidth to ipaq
- Current slide, list of slides, list of commands
- Archive
- Each slide shown, audio (from a different microphone) sent to archive
- Currently just gif of current slide
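The device-to-producer association, together with a rule that fires on slide n to switch a producer, might be modeled as follows. This is only a sketch: `OutputRouter`, its method names, and the device and producer labels are hypothetical, and the real system does this through the core language rather than a Python class.

```python
class OutputRouter:
    """Tracks which producer currently feeds each output device."""

    def __init__(self):
        self.producer_for = {}  # device name -> producer name
        self.slide_rules = {}   # slide number -> (device, producer)

    def attach(self, device, producer):
        # Each device has exactly one associated producer at any time.
        self.producer_for[device] = producer

    def detach(self, device):
        # Dynamically disconnect an output; unknown devices are ignored.
        self.producer_for.pop(device, None)

    def on_slide(self, n, device, producer):
        # Register a rule that switches `device` to `producer` at slide n.
        self.slide_rules[n] = (device, producer)

    def show_slide(self, n):
        # Apply any rule for this slide and return a snapshot of the routing.
        if n in self.slide_rules:
            self.attach(*self.slide_rules[n])
        return dict(self.producer_for)
```

With `on_slide(4, "projector", "media_player")` registered, the projector shows powerpoint output until slide 4 and the media player from then on, until another rule or `attach` call switches it back.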
10CORE: Communication Oriented Routing Environment
- Part of ORG (Oxygen Research Group)
- Assumptions
- Actuators / Sensors (I/O) in the environment
- Many are shared by apps and users
- Many are flaky / faulty
- User does not know much about them
- Environment, application, and users' desires change over time
11An Oxygen Application
- Interconnected Collection of Stuff
- Who specifies the stuff?
- I don't know, but it's mostly virtual stuff
- Many layers of abstraction
- Don't ask, it's turtles all the way down
- Two main layers of programming
- Professionals
- Users, e.g. grandmother
12Communications-Oriented Programs
- Connecting the (virtual) stuff done by user
- Home stereo / theater analogy
- Plug stuff together; unplug it if it doesn't work
- Don't like it, unplug it
- Device drivers, services, clients don't know to whom or to what they connect
- In client/server model,
- server knows a lot about the client,
- the client knows even more about the server
- Extend Unix Pipes
13CORE
[Diagram labels: Other COREs, Larry Bear]
14Message Flow
- Messages flow between nodes and core
- Core is both language and router
- Within Core Router, some messages
- are interpreted and may trigger actions
- other messages get routed to other nodes
- Request-Reply message strategy
- Even number of messages
- No reply within time period, means error
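The request-reply strategy, where a missing reply within the time period counts as an error, can be sketched with standard-library queues. The names `request`, `echo_node`, and `ReplyTimeout` are invented for this example, and a thread stands in for a node.

```python
import queue
import threading

class ReplyTimeout(Exception):
    """Raised when a request gets no reply within the time period."""

def request(node_inbox, payload, timeout=1.0):
    # Every request carries its own reply box, so messages pair up.
    reply_box = queue.Queue(maxsize=1)
    node_inbox.put((payload, reply_box))
    try:
        return reply_box.get(timeout=timeout)
    except queue.Empty:
        raise ReplyTimeout(f"no reply to {payload!r} within {timeout}s") from None

def echo_node(inbox):
    # A trivial node: acknowledge each request until told to stop.
    while True:
        item = inbox.get()
        if item is None:
            break
        payload, reply_box = item
        reply_box.put(("ok", payload))
```

Because each request is matched by exactly one reply, a healthy exchange always involves an even number of messages; an unanswered request surfaces as `ReplyTimeout`.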
15CORE Language Elements
- Four elements
- Nodes,
- Links,
- Messages,
- Rules
- Features
- Interpreted Language
- Statement is a message reply
- Each element has an inverse
16Nodehandler (nickname, specifier)
Nodes Specify via INS
Cam: device=web-cam, location=518
PTRvision: device=process, OS=Linux, File=Laser Vision, ..
[Diagram labels: CORE, Laser Vision]
17 Node Statement Handler
- When node message arrives
- Verified for correctness (statements allowed)
- Routed to Node Manager (just another node)
- Node Manager
- INS lookup, verifies if allowed, creates if needed
- Creates core thread to manage communication with node
- Bookkeeping; reply message with handle/error
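The node manager's steps (lookup, permission check, create if needed, reply with a handle or an error) could look roughly like this. The `directory` dict is a stand-in for a real INS lookup, the permission model is an assumption, and creation of the per-node core thread is omitted.

```python
import itertools

class NodeManager:
    """Toy node manager; `directory` stands in for an INS lookup."""

    def __init__(self, directory, allowed_addresses):
        self.directory = directory              # frozenset of (attr, value) -> address
        self.allowed = set(allowed_addresses)   # assumed permission model
        self.nodes = {}                         # nickname -> bookkeeping record
        self._next_handle = itertools.count(1)

    def handle_node_statement(self, nickname, specifier):
        address = self.directory.get(frozenset(specifier.items()))
        if address is None:
            return {"error": f"no match for {nickname}"}
        if address not in self.allowed:
            return {"error": f"{nickname} not allowed"}
        if nickname not in self.nodes:          # create only if needed
            self.nodes[nickname] = {
                "handle": next(self._next_handle),
                "address": address,
            }
        return {"handle": self.nodes[nickname]["handle"]}
```

Repeating the same node statement returns the existing handle instead of creating a second node.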
18Links
Lcamera,vision (Cam,PTRvision)
[Diagram labels: Slide Speech, Presentation Speech, Command Speech, CORE, Laser Vision]
19Link Statement Handler
- Message routed to link manager
- Two queries to node manager for thread controllers
- Message to thread controller of source node
- Specifying destination thread controller
- Message to thread controller of dest node
- Specifying source thread controller
- Bookkeeping; reply message with handle/error
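A link statement can be pictured as the link manager looking up both thread controllers and telling each one about the other. In this sketch a plain dict replaces the two queries to the node manager, and `ThreadController` and `LinkManager` are hypothetical names.

```python
class ThreadController:
    """Per-node controller that records where its messages go and come from."""

    def __init__(self, name):
        self.name = name
        self.destinations = []
        self.sources = []

class LinkManager:
    def __init__(self, controllers):
        self.controllers = controllers  # nickname -> ThreadController
        self.links = {}                 # link name -> (source, dest)

    def handle_link_statement(self, link_name, source, dest):
        # Two lookups, one per endpoint of the link.
        src = self.controllers.get(source)
        dst = self.controllers.get(dest)
        if src is None or dst is None:
            return {"error": f"unknown node in link {link_name}"}
        src.destinations.append(dst.name)  # tell the source about its destination
        dst.sources.append(src.name)       # tell the destination about its source
        self.links[link_name] = (source, dest)
        return {"handle": link_name}
```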
20Messages
Messages flow over the links
Next Slide!
[Diagram labels: Slide Speech, Presentation Speech, Command Speech, CORE, Laser Vision]
21Message Handling
- Messages can be encrypted
- Core statement messages have fixed format
- Everything else is data message
- Each node thread has two unbounded buffers
- Core to node, node to core
- Logging, rollback, fault-tolerance
22Rules
RULES (trigger,action)
( MESSQuestion , Lslide,lcd -- Lslide,qlcd )
[Diagram labels: Slide Speech, Presentation Speech, Command Speech, Questions, CORE, Laser Vision]
23Rule Statement Handler
- ( trigger , consequence )
- Both are event sets
- Eight basic events
- Node, -Node, Link, -Link
- Message, -Message, Rule, -Rule
- Event set is a set of events
- Trigger is true when events are true
- Consequence makes events true
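One way to read the trigger and consequence semantics is as operations on a set of facts. In the sketch below an event is a `(sign, kind, name)` tuple, where sign `"+"` means the element is present and `"-"` means it is absent; this encoding and the function names are assumptions rather than the core language's actual representation.

```python
def holds(event, state):
    """An event is true when its element is present (+) or absent (-)."""
    sign, kind, name = event
    present = (kind, name) in state
    return present if sign == "+" else not present

def make_true(event, state):
    """Change the state so that the event holds."""
    sign, kind, name = event
    if sign == "+":
        state.add((kind, name))
    else:
        state.discard((kind, name))

def fire(rule, state):
    """Apply the consequence if every trigger event is true."""
    trigger, consequence = rule
    if all(holds(e, state) for e in trigger):
        for e in consequence:
            make_true(e, state)
        return True
    return False
```

Under this reading, a rule like the Questions example could be written as `([("+", "Message", "Question")], [("-", "Link", "slide,lcd"), ("+", "Link", "slide,qlcd")])`: when a Question message appears, the slide-to-lcd link is removed and a slide-to-qlcd link is added.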
24Rules: A link is a rule
- A message event is of form
- (node, message specifier)
- ( message specifier , node )
- Message came from or going to node
- A link (x,y) is just shorthand for the rule
- ( x , m ) -> ( - (x, m) , (m , y) )
- If a message m arrives at node x, then make that
event false (remove the message) and make the
event of m arriving at y from core true.
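The link shorthand can be made concrete with a list of pending message events. Here `("from", x, m)` stands for the event (x, m) and `("to", y, m)` for (m, y); the tuple tags and the `link` helper are illustrative choices, not the core language's syntax.

```python
def link(x, y):
    """Return a rule implementing the link (x, y) over a list of message events."""
    def rule(events):
        arrived = [e for e in events if e[0] == "from" and e[1] == x]
        for e in arrived:
            events.remove(e)                # -(x, m): the arrival is consumed
            events.append(("to", y, e[2]))  # (m, y): m is now headed to y
        return bool(arrived)
    return rule
```

Calling `link("Cam", "PTRvision")` on an event list moves every message that arrived from Cam onto PTRvision's outgoing side and leaves other events untouched.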
25Rules: Access Control Lists
- An access control list is just a rule
- When messages arrive at node, if they arrive from valid node, then allowed to continue to flow.
- Modifying access control lists is just adding or removing rules.
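Treating an access control list as a set of rules might look like the following, where each rule is a permitted (source, destination) pair. `AccessRules` and its methods are hypothetical names for this sketch.

```python
class AccessRules:
    """Access control expressed as a set of (source, destination) rules."""

    def __init__(self):
        self.rules = set()

    def add(self, source, dest):
        self.rules.add((source, dest))

    def remove(self, source, dest):
        self.rules.discard((source, dest))

    def deliver(self, source, dest, message):
        # A message continues to flow only if a matching rule exists.
        return message if (source, dest) in self.rules else None
```

Granting or revoking access is then nothing more than `add` or `remove` on the rule set.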
26Rules
- Rule statement gets sent to rule manager
- Event set is just another shorthand for rules
- Rule manager sends command to trigger node thread that tells it about the consequence
- Rules are reversible
27Reversibility
- Each statement is invertible (reversible)
- If there is an error in the application specification, then can undo it all.
- General debugging is possible with reversible rules and message flow
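Reversibility can be illustrated with a log of applied statements that is replayed backwards using each statement's inverse. The `Session` class, its `"+"`/`"-"` operation codes, and the two element kinds used here are assumptions for the sketch; the actual language also covers messages and rules.

```python
class Session:
    """Applies node/link statements and can undo them all in reverse order."""

    def __init__(self):
        self.elements = {"node": set(), "link": set()}
        self.log = []  # statements that actually changed the state

    def apply(self, op, kind, item, record=True):
        target = self.elements[kind]
        if op == "+":
            if item in target:
                return False
            target.add(item)
        else:
            if item not in target:
                return False
            target.remove(item)
        if record:
            self.log.append((op, kind, item))
        return True

    def undo_all(self):
        # Replay the log backwards, applying the inverse of each statement.
        while self.log:
            op, kind, item = self.log.pop()
            self.apply("-" if op == "+" else "+", kind, item, record=False)
```

Only statements that changed something are logged, so undoing restores exactly the state that existed before the first statement.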
28Critique of Presentation Manager
29Vision / Gesture Recognition
- Laser Pointer
- Great for drawing attention to content
- Audience is primary consumer
- Secondary use to control presentation
- But it is not a mouse
- Semantics are tied to slide context
- Differs from Intelligent-room use
- Small number of identified gestures
- Gestures easily punctuated
- Low computational overhead
- Soon will be handled with a H21
30Critique of Vision / Gesture Recognition
- Laser Pointer
- Great for drawing attention to content
- Cheap technology but mostly distracting
- Too shaky, imprecise
- But it is not a mouse
- More awkward to use than mouse
- Another gadget to hold in the hand, button to identify, batteries to maintain
- Small number of identified gestures
- There are better ways of drawing attention to slide content
- I rarely use it and don't like it when others do
- Low computational overhead
- Dumb vs Intelligent Device Discussion
31Speech Recognition
- Initially seems like great idea
- Speaker is already speaking, so can use it to control presentation
- Want passive, intelligent listener
- Not a dialog
- No prompts: an alienating distraction
- Want no mistakes
- For dialog, better to guess than ignore
- For us, high cost for incorrect guess
- Most words are not relevant to speech system
- More trouble than it is worth
- But may be good for real-time search of content
32More useful aspect: Output modalities
- Presenter has put the time and effort into the production
- Simpler is better
- Audience has harder task
- Understand material being presented
- Record thoughts, impressions, connections
- Filter for later review
- Process in real-time
- Keep up with presentation
- Do all this with minimal distractions
- Output modalities
- Content for live audience
- Content for speaker (superset of audience)
- Content for retrieval
- Correlate notes with content
33Record and correlate notes with presentation