Title: Visualization and Evaluation at Microsoft Research
1. Visualization and Evaluation at Microsoft Research
- George Robertson, Mary Czerwinski and VIBE team
2. Visualization Benefits
- Visualization helps deal with info overload
- Synthesize meaning from multiple data sources
- Large volume of dynamically changing data
- Identify topics and trends
- Identify relationships along multiple dimensions
- Must evaluate against existing techniques
3. Visualization in Microsoft Products
- Data Visualization
  - Excel charting
- Information Visualization
  - Basic hierarchy visualization (TreeView)
  - Microsoft Business Solutions
  - BizTalk Server
4. Visualization Research Categories: Information Management
- Data Mountain (UIST98)
- Photo Mountain (WinHEC 2001)
- DateLens (CHI 2003)
- FacetMap (AVI 2006)
- FaThumb (CHI 2006)
- Principles: leverage spatial memory, animation, and space-filling for scaling; provide tools for personalization
5. Document Management: Data Mountain (UIST98)
- Size is strongest cue
- 26% faster than IE4
- After 6 months, no performance change
- Images help, but not required
- Faster retrieval when similar pages are highlighted
Figure: Subject Layout of 100 Pages
6. Evaluation
- Wanted to test the spatial memory hypothesis
- Also wanted to know the influence of other factors
  - Thumbnail image
  - Audio cues
  - Text summaries
7. Method
- Gave subjects a cue to look for after they arranged their Data Mountain
- Cue had either a text summary, a thumbnail, an audio cue, or all 3
- Time to retrieve the right page and number of misses were the dependent measures
- After 6 months, had them do it again
- This time, 50% of the trials had the thumbnail images turned off!
8. Calendar Visualization: DateLens (CHI 2003)
- With Ben Bederson at U. Maryland
- Fisheye representation of dates
- Compact overviews
- User control over the view
- Integrated search (keyword)
- Enables overviews, fluid navigation to discover patterns and outliers
- Integrated with Outlook
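The fisheye idea on this slide can be illustrated with a small sketch. It is not DateLens's actual layout code: the function name `fisheye_widths`, the `focus_share` parameter, and the even split of the remaining space are all assumptions made for illustration. The sketch gives one focused column (say, one day of the week) a fixed share of the screen width and divides the rest evenly, so the overview stays visible while the focus gets room for detail.

```python
def fisheye_widths(n_cols, focus, total_width, focus_share=0.5):
    """Return column widths where the focused column gets extra space.

    Hypothetical helper for illustration only; not the DateLens algorithm.
    """
    if not 0 <= focus < n_cols:
        raise ValueError("focus must index an existing column")
    if n_cols == 1:
        return [float(total_width)]
    focus_width = total_width * focus_share
    # The non-focused columns share whatever width is left, evenly.
    other_width = (total_width - focus_width) / (n_cols - 1)
    return [focus_width if i == focus else other_width for i in range(n_cols)]


# Seven day-columns on a 240-pixel-wide screen, focusing on the third column.
widths = fisheye_widths(n_cols=7, focus=2, total_width=240)
print([round(w, 1) for w in widths])
```

The same allocation could be applied to rows (weeks) to obtain a two-dimensional tabular fisheye.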
9. Evaluation
- First, prototyped on desktop to perform formative evaluation but also tested against existing UI
- Next built onto Pocket PC
- Gave to PPC owners for 3 days of use
- Performed benchmark tasks with them on 4th day, satisfaction ratings over all 4 days
10. Benchmark Study
- DateLens vs. Microsoft's Pocket PC 2002
- Goals
  - 1st iteration of UI with potential users
  - Compare its overall usability against an existing product
- Used Mary's calendar, seeded with artificial calendar events
11. Figure 7: Screenshots of the Microsoft Pocket PC Calendar program that was used in the study, showing day, week, month, and year views.
12. Methods
- 11 knowledge workers (5 F)
- All experienced PC users, but not PDA users
- 11 isomorphic browsing tasks on each calendar
  - All conditions counterbalanced
  - All tasks had deadline of 2 minutes
  - Find the dates of specific calendar events (e.g., birthdays)
  - Determine how many Mondays a month contained
  - View all birthdays for the next 3 months
- Measures: task times, success rate, verbal protocols, user satisfaction and preference
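As a minimal sketch of what counterbalancing could look like here, the snippet below alternates which calendar each participant uses first. The slides do not say how orders were actually assigned; the function name and the simple alternation scheme are assumptions.

```python
CALENDARS = ("DateLens", "Pocket PC")


def counterbalanced_orders(n_participants, conditions=CALENDARS):
    """Alternate which of two conditions each participant sees first.

    Illustrative only; the study's real assignment procedure is not
    described in the slides.
    """
    first, second = conditions
    orders = []
    for p in range(n_participants):
        orders.append((first, second) if p % 2 == 0 else (second, first))
    return orders


orders = counterbalanced_orders(11)
for participant, order in enumerate(orders, start=1):
    print(participant, " then ".join(order))
```

With an odd number of participants, as in this study, simple alternation cannot be perfectly balanced: one order is used six times and the other five.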
13. Results: Task Times
- Tasks were performed faster using DateLens, F(1,8) = 3.5, p = .08
  - Avg = 49 vs. 55.8 secs for the Pocket PC
- Complex tasks were significantly harder, p < .01, but were handled reliably better by DateLens (task x calendar interaction), p = .04
14. Results: Task Times
15. Task Success
- Tasks were completed successfully significantly more often using DateLens (on average, 88.2% vs. 76.3% for the PPC), p < .001.
- In addition, there was a significant main effect of task, p < .001.
- For the most difficult task (11), no participant using the Pocket PC completed the task successfully.
16. Task Success
17. Usability Issues
- Many users disliked the view when more than 6 months were shown
- Concerns about the readability of text, needs to be customizable
- Wanted more control over how weeks were viewed (e.g., start with Sunday or Monday?)
- Needed better visual indicators of conflicts for both calendars, e.g., red highlights and/or a conflicts filter
18. FacetMap: Faceted Search Results of Digital Bits
- Meant to use metadata of your digital stuff to aid in browsing
- Abstract, scalable, space-filling
- Visual more than textual
- Study showed it was favored over existing techniques for browsing tasks
19. Small Size
20. Large Size (Wall Display)
21. Evaluation
- Wanted to test against textual search UI (existing system)
- Needed to use both text search and browse at various levels of depth
  - Targeted: "Find the earliest piece of email Gordon received from Jim Gemmell" (text search for "Gemmell").
  - Browse: "Name a document that Gordon modified in the 3rd week of May, 2000."
- Also, needed to test search for different kinds of dimension (file type, date, people, etc.)
22. The Text Baseline
23. Results
Question                              FacetMap    Memex
Mental Demand                         4.0 (1.8)   4.3 (1.6)
Physical Demand                       3.6 (2.1)   3.6 (1.6)
System Response Time                  4.8 (1.4)   5.7 (1.1)
Satisfaction                          5.6 (1.4)   5.4 (0.8)
Preference over Existing Techniques   4.9 (1.2)   5.2 (1.4)
Browsing Support                      5.9 (0.9)   5.9 (0.9)
Text Search Support                   5.9 (1.4)   5.3 (0.8)
Aesthetic Appeal                      5.3 (1.3)   4.1 (1.5)
24. Visualization Research Categories: Task Management
- Scalable Fabric (WinHEC 2003)
- Clipping Lists (summer 2005)
- Change Borders (summer 2005)
- Principles: leverage spatial memory and periphery to reduce clutter and improve glanceability
- Users stay in the flow of their tasks longer, switch more optimally
25. Task Management: Scalable Fabric (WinHEC 2003)
- Beyond Minimization
- Manage Windows tasks using natural human skills
- Central focus area
- Periphery windows scaled
- Cluster of windows = task
- Works on variety of displays
- Download available Aug. 2005; 5,000 downloads in first 2 months
26. Evaluation
- Similar to TG, users lay out tasks
- Simulate task switching
- Compare to TaskBar
- Also, satisfaction after 3 weeks of real usage
27. Visualization Research Categories: Improved Productivity and Readability
- Clipping Lists and Change Borders
- Principles: remove content of less importance to get more info on the screen, reduce occlusion for readability
28. Study: compare abstraction techniques
- Change detection
  - signals when a change has occurred
- Semantic content extraction
  - pulling out and showing the most relevant content
- Scaling
  - shrunken version of all the content
- Which will most improve multitasking efficiency?
29. Our Designs
30. Study Design
- 2 x 2 design: semantic content extraction (no / yes) x change detection (no / yes)
31. Comparing Tradeoffs
- No change detection, no semantic content extraction: spatial layout; no legible content
- No change detection, semantic content extraction: most relevant task info; detailed visuals / text
- Change detection, no semantic content extraction: spatial layout; simple visual cue for change; limited info
- Change detection, semantic content extraction: most relevant task info; simple visual cue for change
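The crossing of the two factors in the study design can be sketched by enumerating every combination of levels. The dictionary layout and the function name `conditions` are assumptions for illustration; only the two factor names come from the slides.

```python
from itertools import product

# Factor names are taken from the slides; representing the levels as
# booleans is an illustrative choice.
FACTORS = {
    "semantic content extraction": (False, True),
    "change detection": (False, True),
}


def conditions(factors=FACTORS):
    """Return one dict per cell of the fully crossed design."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


cells = conditions()
for cell in cells:
    print(cell)
```

Two factors with two levels each yield four cells, one for each interface variant compared in the study.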
32. User Study: Participants
- 26 users from the Seattle area (10 female)
- moderate to high experience using computers and Microsoft Office-style applications
33. User Study: Tasks
- Four tasks designed to mimic real world tasks
  - Quiz - wait for modules to load
  - Uploads - wait for documents to upload
  - Email - wait for quiz answers and upload task documents to arrive
  - Puzzle - high-attention task done while waiting
34. Quiz
36. Uploads
39. Puzzle
41. User Study: Setup
right monitor
left monitor
42. Measures
- Overall performance
  - task duration
- Accuracy of task resumption timing
  - time to resume task (e.g., time between an upload finishing and the user clicking on the upload tool)
- Task flow
  - number of task switches
- Recognition of windows and reacquisition of task
  - number of window switches within a task
- User satisfaction
  - survey after each trial and after the lab session
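The task resumption timing measure can be sketched as a computation over a time-stamped event log. The `Event` record, the event kinds `"ready"` and `"resume"`, and the sample log are all hypothetical; the slides give only the definition (time between a task becoming ready and the user returning to it).

```python
from dataclasses import dataclass


@dataclass
class Event:
    time: float  # seconds since the start of the trial
    kind: str    # "ready" (e.g., upload finished) or "resume" (user clicked the tool)
    task: str


def resumption_lags(events):
    """Pair each "ready" event with the next "resume" event for the same task."""
    pending = {}  # task -> time at which it became ready
    lags = []
    for ev in sorted(events, key=lambda e: e.time):
        if ev.kind == "ready":
            # Keep the earliest unanswered "ready" time for this task.
            pending.setdefault(ev.task, ev.time)
        elif ev.kind == "resume" and ev.task in pending:
            lags.append((ev.task, ev.time - pending.pop(ev.task)))
    return lags


# Made-up log for illustration; not data from the study.
log = [
    Event(10.0, "ready", "uploads"),
    Event(12.5, "ready", "quiz"),
    Event(14.0, "resume", "uploads"),
    Event(20.5, "resume", "quiz"),
]
lags = resumption_lags(log)
print(lags)
```

Smaller lags mean the user noticed sooner that a waiting task was ready, which is the sense in which resumption timing is "more accurate".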
43. Results: overall performance
- Clipping Lists → faster task times
- Change Borders → no significant improvement
44. Results: task resumption timing
- Clipping Lists → trend toward more accurate task resumption timing
45. Results: task flow
- Clipping Lists → reduced switches
- Change Borders → increased switches for SF
46. Results: recognition and reacquisition
- Clipping Lists → reduced window switches
- Change Borders → no significant improvement
47. Results: user satisfaction
- Clipping List UIs
  - → rated better than those without
- Change Border UIs
  - → rated better than those without
- Preferred UI
  - 17: Clipping Lists + Change Borders
  - 4: Scalable Fabric + Change Borders
  - 2: Clipping Lists
  - 2: Scalable Fabric
48. Results Summary
- Clipping Lists were most effective for all metrics
  - Overall performance speed
  - Accuracy of task resumption timing (not significant)
  - Task flow
  - Recognition of windows and reacquisition of task
  - User satisfaction
- Improvements are cumulative, adding up to a sizeable impact on daily multitasking productivity
  - Clipping Lists
    - → 29 seconds faster on average
  - Clipping Lists + Change Borders
    - → 44 seconds faster on average
49. Results: semantic content extraction
- benefits task flow, resumption timing, and reacquisition
- improves multitasking performance more than either change detection or scaling
- → Implication for design of peripheral interfaces that support multitasking
  - providing enough relevant task info is more important than very simplistic designs
50. Visualization Research Categories: Software Visualization
- FastDASH
- Principles: leverage usage data to expose most important, relevant content to improve discoverability
51. FastDASH
- Peripheral display for showing a dev team who has what checked out of a code base
- Shows individual team members, what they've checked out, what method they're in, what they've changed, where they may be blocked and need help
- Display devotes more screen real estate to bigger files in code base
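The last point, giving bigger files more screen real estate, can be sketched as a proportional split of the display area. This is not FastDASH's layout code; the function name, the file names, and the sizes are hypothetical, and a real space-filling layout would also have to place rectangles, which this sketch omits.

```python
def allocate_area(file_sizes, total_area):
    """Split total_area among files in proportion to their size.

    Illustrative only; computes areas, not rectangle positions.
    """
    total_size = sum(file_sizes.values())
    if total_size <= 0:
        raise ValueError("file sizes must sum to a positive number")
    return {name: total_area * size / total_size
            for name, size in file_sizes.items()}


# Hypothetical files and sizes (e.g., lines of code).
sizes = {"Parser.cs": 600, "Lexer.cs": 300, "Util.cs": 100}
areas = allocate_area(sizes, total_area=1000)
print(areas)
```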
52. Evaluation
- Developed coding scheme to quickly document communication and display usage behaviors of team
- Code 2 days w/o FastDASH
- Insert FastDASH display on 3rd day
- Code 2 days w/ FastDASH display
- Pre- and post- satisfaction and situation awareness surveys
53. Reduction in Use of Shared Artifacts
54. Increase in Certain Communications
55. Visualization Research Categories: Large Information Spaces
- Polyarchy (CHI 2002)
- PaperLens (InfoVis 2004, CHI 2005)
- Schema Mapper (CHI 2005)
- Treemap Vis of Newsgroup Communities
- Principles: support interactive data exploration through highlighting, transparency, animation and focus + context techniques
56. Polyarchy Visualization (CHI 2002)
- Multiple Intersecting Hierarchies
  - Show multiple hierarchies
  - Show other relationships
  - Search results in context
57. Evaluation
- Systematically explored each potential animation speed and transition style
- Also, keystroke evaluation
58. Topic Trends Visualization: PaperLens (InfoVis 2004, CHI 2005)
- Understanding a conference
  - InfoVis (8 years)
  - CHI (23 years)
- Helps understand
  - Topic evolution
  - Frequently published authors
  - Frequently cited papers/authors
  - Relationship between authors
59. Evaluation
- Formative evaluation with target end users
- Used the information visualization contest questions to make sure the prototype satisfied the requirements
- Noted usability issues and redesigned
- Scaled up for CHI, which required massive changes
60. Schema Mapper (CHI 2005)
- Current techniques fail for large schemas/maps
61. Schema Mapper
- Emphasize relevant relationships
- De-emphasize other relationships
62. Evaluation
- Systematically explored each new feature addition against shipping product doing mapping tasks
- Used real schema map designers
63. Goals for Future
- Visual representations that
  - Exploit human perception, pattern matching and spatial memory
  - Summarize and scale to very large datasets
  - Use animated transitions to help retain context
  - Scale to a variety of display form factors
- Move into collaborative/sharing task domains
- Challenges: user-centered design, creative breakthroughs, need machine learning expertise
64. Thanks for your attention!
- http://research.microsoft.com/research/vibe