Title: SHARK Shorthand: A Text Input Method for Future Client Computing
1. SHARK Shorthand: A Text Input Method for Future Client Computing
- Shumin Zhai
- Per-Ola Kristensson (Linköping University)
- Barton Smith
- Alison Sue, Clemens Drews, Johnny Accot, Jon Graham, Paul Lee (Stanford), Michael Hunter (BYU), Jingtao Wang (Berkeley), Tue Andersen (Copenhagen)
- IBM Almaden Research Center
2. Future Client Computing
- Today
- PC desktop centric
- Occasionally goes up to the net and down to mobile devices
- Mouse
- Typewriter keyboard
- Input-based interfaces
- Emerging and Future
- Data in network / information in ether
- VPU (hand-held devices) as handle to digital life
- Opportunistically augmented by PCs etc.
3. The text input challenge
- Indispensable user task
- Efficiency vs. ease of entry
- Size / portability
- Skill based interaction
- Visual, cognitive, motor load balancing
- History of writing technology
4. Zhai, Hunter, Smith 2000; Zhai, Sue, Accot 2002, Drews
- Alphabetically Tuned and Optimized Mobile Interface Keyboard (ATOMIK)
5. Limitations and hints from ATOMIK
- Tapping one key at a time is tedious. The stylus can be more expressive and dexterous.
- People tend to remember the pattern of a whole word, not individual letters.
- Does not utilize language redundancy/statistical intelligence.
6. New Approach: Shorthand-Aided Rapid Keyboarding (SHARK)
Sokgraph: shorthand on keyboard as a graph
7.
- Sokgraph: Shorthand On Keyboard
- Sokgraph Aided Rapid Keyboarding: SHARK
- Video Demo
8. A form of shorthand: writing one word (not character) at a time
9. Scale and location relaxation/flexibility
- Sokgraph patterns, not individual letters crossed, are recognized and entered
- Statistical intelligence embodied in a lexicon is used not for prediction but to relax the input requirement
- Lower visual attention demand than tapping
10. Duality and gradual skill transition
- Skill acquisition: gradual, continuous shift from novice to expert
- Total novice: tracing letter to letter; visually guided action; recognition based; closed-loop performance; slow, accurate
- Total expert: gesturing sokgraph; memory recall based; open-loop performance; fast, inaccurate
- Consistent movement pattern
- Falling back and relearning
- Keyboard as training wheel and mnemonic device
11. Related Work
- Novel alphabets
- Unistrokes (Goldberg & Richardson 1993)
- Graffiti (Blickenstorfer 1995)
- EdgeWrite (Wobbrock, Myers, Kembel 2003)
- Quikwriting (Perlin 1998)
- Wipe-activated Keyboard (Montgomery 1982)
- Cirrin (Mankoff & Abowd 1998)
- Dasher (Ward, Blackwell, Mackay 2000)
- Marking menus (Kurtenbach & Buxton 1993)
- T-Cube (Venolia & Neiberg 1994)
12. Sokgraph Gesture Recognition
- Gesture recognition
- sampling
- filtering
- normalization
- matching against prototypes
- Requirements
- complexity scalability (10K)
- artificial symbols - no training data
- Accuracy and flexibility
- cognitive, perceptive, motoric factors
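The recognition steps listed on this slide can be illustrated with a small sketch. It assumes a hypothetical five-key toy layout (not ATOMIK), builds each word's prototype by connecting key centers (one way to get prototypes for artificial symbols without training data), resamples both stroke and prototype to the same number of points, and picks the nearest prototype by mean point-to-point distance. The names KEYS, resample, prototype and recognize, and the choice of 32 points, are illustrative assumptions, not the SHARK implementation. Filtering and normalization are left out here.

```python
import numpy as np

# Hypothetical key centers on a toy layout (NOT the ATOMIK layout).
KEYS = {"t": (0.0, 0.0), "h": (1.0, 0.0), "e": (1.0, 1.0),
        "a": (0.0, 1.0), "n": (2.0, 1.0)}

def resample(points, n=32):
    """Resample a polyline to n points equally spaced along its arc length."""
    pts = np.asarray(points, dtype=float)
    seg = np.linalg.norm(np.diff(pts, axis=0), axis=1)
    keep = np.concatenate([[True], seg > 0])  # drop repeated points
    pts, seg = pts[keep], seg[seg > 0]
    s = np.concatenate([[0.0], np.cumsum(seg)])  # arc length at each vertex
    if s[-1] == 0.0:  # degenerate stroke: a single location
        return np.repeat(pts[:1], n, axis=0)
    target = np.linspace(0.0, s[-1], n)
    return np.column_stack([np.interp(target, s, pts[:, 0]),
                            np.interp(target, s, pts[:, 1])])

def prototype(word, n=32):
    """Ideal trace of a word: straight lines between its key centers."""
    return resample([KEYS[c] for c in word], n)

def recognize(stroke, lexicon, n=32):
    """Return the nearest word and the distance to every candidate."""
    g = resample(stroke, n)
    dist = {w: float(np.mean(np.linalg.norm(g - prototype(w, n), axis=1)))
            for w in lexicon}
    return min(dist, key=dist.get), dist

# A slightly sloppy trace over t -> h -> e.
stroke = [(0.05, -0.05), (0.5, 0.05), (0.95, 0.1), (1.05, 0.6), (0.95, 1.0)]
word, dist = recognize(stroke, ["the", "ten", "hat", "than"])
```

A linear scan over the lexicon like this would not scale to the 10K range the slide asks for, which is presumably why pruning appears as a separate step later in the deck.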
13. Novel recognition architecture and algorithms
Kristensson & Zhai, UIST 2004
- Multiple channels recognition
- shape
- location
- context (language model)
- Channel integration
- Bayes rule
- Dynamic channel weighting
- Performance based weighting
- Lexicon and language model
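One way to picture the channel integration above is sketched here. It assumes each channel reports a distance per candidate word, that a Gaussian turns each distance into a likelihood, and that the channels are combined as independent evidence with Bayes' rule; an optional prior stands in for the context channel. The sigma values and all function names are hypothetical choices for illustration only.

```python
import math

def channel_posterior(distances, sigma):
    """Turn one channel's per-word distances into normalized probabilities.

    A larger sigma flattens the distribution, so the channel has less say.
    Assumes at least one likelihood is non-zero.
    """
    like = {w: math.exp(-0.5 * (d / sigma) ** 2) for w, d in distances.items()}
    total = sum(like.values())
    return {w: v / total for w, v in like.items()}

def integrate(shape_dist, location_dist, prior=None,
              sigma_shape=0.2, sigma_location=0.5):
    """Combine shape and location channels (and an optional word prior)."""
    p_shape = channel_posterior(shape_dist, sigma_shape)
    p_loc = channel_posterior(location_dist, sigma_location)
    words = shape_dist.keys() & location_dist.keys()
    joint = {w: p_shape[w] * p_loc[w] * (prior[w] if prior else 1.0)
             for w in words}
    z = sum(joint.values())
    return {w: v / z for w, v in joint.items()}

# Shape alone barely separates "the" from "tie"; location settles it.
shape_dist = {"the": 0.10, "tie": 0.12, "ten": 0.60}
location_dist = {"the": 0.30, "tie": 0.90, "ten": 0.80}
posterior = integrate(shape_dist, location_dist)
best = max(posterior, key=posterior.get)
```

In this sketch, dynamic channel weighting would amount to changing a channel's sigma per gesture rather than keeping it fixed.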
14. Using human performance laws to improve recognition
- Fast gestures are more open-loop, using less visual attention
- Channel integration should be adaptive
- We use Fitts' law to compute normative writing time in stylus keyboarding
- Adjust location channel dynamically if user is exceeding the normative writing time
- Adjustment of recognition of individual words
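A rough sketch of this idea follows. The normative time is taken as the sum of Fitts' law movement times, T = a + b log2(D/W + 1), between successive key centers of the word. The constants a and b, the key width, the toy key positions, and in particular the rule for widening the location channel's sigma are assumptions made for illustration; the slide does not state them. The sketch reads the slide as saying that a gesture completed faster than the normative time is more open-loop, so location should count for less.

```python
import math

# Hypothetical key centers, in units of one key width.
KEYS = {"t": (0.0, 0.0), "h": (1.0, 0.0), "e": (1.0, 1.0)}

def normative_time(word, keys=KEYS, key_width=1.0, a=0.083, b=0.127):
    """Sum of Fitts' law times (seconds) for tapping the word key by key.

    a and b are illustrative constants, not values taken from the slides.
    """
    total = 0.0
    for c1, c2 in zip(word, word[1:]):
        d = math.dist(keys[c1], keys[c2])
        total += a + b * math.log2(d / key_width + 1.0)
    return total

def location_sigma(base_sigma, actual_time, norm_time, gain=1.0):
    """Hypothetical rule: widen the location sigma for faster-than-normative gestures."""
    if actual_time <= 0:
        raise ValueError("actual_time must be positive")
    if actual_time >= norm_time:
        return base_sigma  # slow enough to be visually guided: trust location
    return base_sigma * (1.0 + gain * math.log2(norm_time / actual_time))

t_norm = normative_time("the")                # two unit-length movements
sigma = location_sigma(0.5, t_norm / 2, t_norm)  # gesture twice as fast as normative
```

Because the adjustment is computed from the timing of each gesture, it applies word by word, which matches the last bullet on the slide.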
15. Stream editor
16. Customized lexicon expansion
- Automatic capitalization and punctuation (stateless pattern matching of context)
- Dynamically add words by tapping them. If they are not in the lexicon, they can be added dynamically to the system
17. A Feasibility Experiment: the memorability of sokgraphs
Zhai, Kristensson, CHI 2003
- 6 subjects (novices, no knowledge of ATOMIK)
- 5 sessions (each on a different day)
- Session 1: practice (40 min) only
- Sessions 2 to 4: test first (about 10 min), then practice (40 min)
- Session 5: test only
- Practice with Expanding Rehearsal Interval (Landauer & Bjork 1978; Zhai, Sue, Accot 2002)
- Words taken from BNC top 100 or 300
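The practice schedule named above can be pictured with a toy simulation. In an expanding rehearsal scheme, a word that is recalled correctly comes back after a longer gap, and a word that is missed comes back soon. The doubling factor, the reset to an interval of one trial, and the function name are assumptions for illustration; the slide does not give the schedule used in the study.

```python
import heapq

def expanding_rehearsal(words, recall, trials, growth=2):
    """Simulate a practice session.

    recall(word) -> bool says whether the word was written correctly.
    Returns a list of (word, correct, next_interval) per trial.
    """
    interval = {w: 1 for w in words}
    queue = [(i, w) for i, w in enumerate(words)]  # (trial when due, word)
    heapq.heapify(queue)
    history = []
    for t in range(trials):
        _, w = heapq.heappop(queue)  # rehearse the word that is due soonest
        ok = recall(w)
        # Assumed rule: expand the interval on success, reset it on failure.
        interval[w] = interval[w] * growth if ok else 1
        heapq.heappush(queue, (t + interval[w], w))
        history.append((w, ok, interval[w]))
    return history

# With perfect recall, the gap for each word doubles every time it is rehearsed.
log = expanding_rehearsal(["the", "and"], lambda w: True, trials=6)
```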
18. Results: mean accumulated words remembered
19. Results: number of words learned per session
20. Empirical records
21. Evaluation in the wild
- Free trial download at IBM alphaWorks, Oct 21, 2004
22. User Reaction: Oct 28, Slashdot.org
- "IBM's famous research lab for nanotechnology,
microelectronics and exotic science, Almaden
Research Center, has released an advanced,
efficient, pen-based text input method for mobile
computing, that allows you to trace letters on
the keyboard to enter a word rather than typing
each letter individually. The new technology
provides a more fluid, smooth, and natural
interaction (see demo) than tapping on stylus
keyboards."
23. User Reaction: Nov. 3, jkOnTheRun, Top Ten Tech Blog
Text Entry Epiphany for the Tablet PC - SHARK
http://jkontherun.blogs.com/jkontherun/2004/11/text_entry_epip.html
- "I am happy to report what I feel is a revolutionary breakthrough"
- "This method is so simple and accurate it amazes me every time I use it - it is phenomenal"
- "It is almost faster than touch typing on a keyboard."
25. More user reaction
- Australian Financial Review (Nov 16, 2004)
- may revolutionise how we use touch-screen computers
- Other blogs and news
26. Thank you and Questions
27. Where do we go from here?
- End of the beginning: much more can be done on the technology
- Evaluations / analysis / understanding
- PDA / Smart Phones
- How can IBM play in the commodity world, where the user interaction with technology is?
- Key-component technologies?
- Technology lock-in (Qwertynomics)
28. Optimizing layout for sokgraph gesture
- Yes, have tried and will try more
- Difficult: less room for improvement
- Computationally challenging (least ambiguity for thousands of words, weighted by frequency, or trigraph)
- Ambiguity depends on recognition algorithm
- ATOMIK (isotropic) vs. QWERTY (zig-zag)
29. Preprocessing and pruning
- Smoothing (filtering)
- Equidistant re-sampling to a fixed number N of points
- Normalization in scale and translation (for shape channel and pruning)
- Pruning scheme
30. Using higher level language regularity
- Bigram language model
- Viterbi decoding of most likely word sequence
- Problem of highly accurate recognition data being integrated with noisy statistics
- Integration using a Gaussian function; again, sigma is an empirical parameter
31. Marking menu vs. SHARK
- Marking menu
- 1990
- Command selection
- Angular direction
- Dozens of commands
- Binary novice-expert transition (delayed feedback)
- Shark
- 2000
- Text input
- Pattern recognition
- Thousands of words
- Gradual visual tracing to recall-based gesturing