Title: Ed H. Chi www.geekbiker.com
1 Ed H. Chi www.geekbiker.com
- U of Minnesota
- Ph.D.: Visualization Spreadsheets
- M.S.: Computational Biology
- Expertise: InfoVis, Study of the Web, TaeKwonDo, Poetry, Motorcycling, Pottery
2 Information Scent: Modeling User Browsing Strategies on the Web
- Ed H. Chi
- Peter Pirolli
- User Interface Research Group
- This research was supported in part by Office of Naval Research contract number N00014-96-C-007.
3 Comparison to Library
- Experience tells us
- general layout of content
- which floor, which section.
- which books are of greatest interest
- by the wear on the spines.
- which information is timely or deadwood
- by looking at the circulation check-out stamps
inside the book covers.
4 Trends and Problems
- 200M Web users, 6M web sites
- Web design process ad-hoc, not optimal
- Some tools extract behaviors and correlations but not intentionally
- Being successful requires making the Web more useful and usable to a broader audience
5 Information Foraging
6 Underlying Concept
- Users' information seeking is similar to hunter-gatherers' optimization strategies.
7 Underlying Concept
- Information Scent is the user's perception of the cost and value of information.
- Similar to hunters following animal footprints.
9 Information Scent
- Users forage by surfing along links
- Foragers use proximal cues (text snippets or graphics) to assess distal content (destination page)
- Scent is the proximal perception of value and cost of distal content
(Diagram labels: snippet, link, content)
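A minimal sketch of the idea on this slide, assuming scent can be approximated by the cosine similarity between the words of the user's goal and the words of a link's proximal cue. The function names, the tokenization, and the example links are hypothetical and are not taken from the talk.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def scent(goal, snippet):
    """Cosine similarity between goal words and a link's proximal cue."""
    g, s = Counter(tokenize(goal)), Counter(tokenize(snippet))
    dot = sum(g[w] * s[w] for w in g)
    norm = math.sqrt(sum(v * v for v in g.values())) * math.sqrt(
        sum(v * v for v in s.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical link snippets on one page (illustrative only).
links = {
    "/investor/": "shareholder information and annual report",
    "/products/": "printers copiers and scanners",
}
goal = "shareholder dividend information"

# Rank the links by how strongly their snippets "smell" of the goal.
ranked = sorted(links, key=lambda url: scent(goal, links[url]), reverse=True)
print(ranked)
```

A forager following the strongest scent would pick the first link in `ranked`; a snippet that shares no words with the goal gets a scent of zero.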
10 Assumptions
- Users have information goals; their surfing patterns are guided by information scent
- Two questions
- Given an information goal and a starting point
- Where do users go? (Behavior)
- Given some surfing pattern
- What is the user's goal? (Need)
11 WUFIS: Web User Flow by Information Scent
(Diagram labels: Web site, Web page content, links, user information goal, Web user flow simulation, predicted paths)
12 How does it work?
- Start users at page with some goal
- Examine user patterns
- Flow users through the network
13 WUFIS Algorithm
- Step 1 (diagram labels): Weight Matrix, Query, Relevant Documents
14 WUFIS Algorithm (cont.)
- Step 2 (diagram labels): Scent Matrix
- R: relevant documents; T: topology matrix
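The two steps above can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: it assumes step 1 scores documents by a dot product of word weights with the query, step 2 weights each link by the relevance of its destination and normalizes per page, and users are then pushed along links in proportion to scent. The toy site, the function names, and the `alpha` damping parameter are all hypothetical.

```python
def relevance(weights, query):
    """Step 1: score each document as the dot product of its word weights with the query."""
    n_docs = len(weights[0])
    return [
        sum(weights[w][d] * query[w] for w in range(len(query)))
        for d in range(n_docs)
    ]


def scent_matrix(topology, rel):
    """Step 2: weight each link i -> j by the relevance of j, then normalize each row."""
    scent = []
    for row in topology:
        raw = [link * r for link, r in zip(row, rel)]
        total = sum(raw)
        scent.append([x / total if total else 0.0 for x in raw])
    return scent


def flow(scent, start, steps, alpha=1.0):
    """Push one unit of users from `start` along links in proportion to scent.

    Returns the accumulated user mass seen at each page over all steps.
    """
    n = len(scent)
    users = [0.0] * n
    users[start] = 1.0
    visits = users[:]
    for _ in range(steps):
        users = [
            alpha * sum(users[i] * scent[i][j] for i in range(n))
            for j in range(n)
        ]
        visits = [v + u for v, u in zip(visits, users)]
    return visits


# Hypothetical toy site: 4 pages, 2 vocabulary words ("stock", "printer").
weights = [
    [0.1, 0.9, 0.1, 0.8],  # weight of "stock" in pages 0..3
    [0.1, 0.0, 0.9, 0.0],  # weight of "printer" in pages 0..3
]
query = [1.0, 0.0]  # information goal: "stock"
topology = [
    [0, 1, 1, 0],  # page 0 links to pages 1 and 2
    [0, 0, 0, 1],  # page 1 links to page 3
    [1, 0, 0, 0],  # page 2 links back to page 0
    [0, 0, 0, 0],  # page 3 has no outgoing links
]
rel = relevance(weights, query)
visits = flow(scent_matrix(topology, rel), start=0, steps=2)
print([round(v, 2) for v in visits])
```

Pages with the largest accumulated values in `visits` are the predicted destinations for that goal and starting point.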
15 Prelim. Evaluation of WUFIS
- Show that WUFIS generates good URL destinations based on information need.
- 19 Websites
- Size: 27-12,000 pages
- Info Provider, eCommerce, Large Corp.
- Info need: from very general (product info) to very specific (migraine headaches)
- The top ten URLs produced by the simulation are extracted.
- Each URL is blindly rated for relevancy.
16 WUFIS Evaluation
- 570 ratings are collected: 3 variations of the algorithm x 10 URLs x 19 sites
- Tabulated, averaged.
- Result: 7.54 (out of 10)
(Table: 19 Websites; columns: Website Info, Algorithm Performance)
17 IUNIS: Inferring User Need by Info Scent
(Diagram labels: Web site, Web page content, links, user information goal, Web user flow simulation, observed paths)
18 Extracting Paths
- Longest Repeating Sequence (LRS)
- New path mining technique
- Extracts significant surfing paths
- Reduces the complexity of path model
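A rough sketch of the path-mining idea, under a simplified definition that is an assumption here: a contiguous run of pages counts as repeating if it occurs at least `min_count` times, and it is kept unless a longer repeating run that contains it occurs at least as often. The function names and the example paths are hypothetical; the slides do not give the exact LRS definition.

```python
from collections import Counter


def contains(longer, shorter):
    """True if `shorter` appears as a contiguous run inside `longer`."""
    k = len(shorter)
    return any(longer[i:i + k] == shorter for i in range(len(longer) - k + 1))


def longest_repeating_sequences(paths, min_count=2):
    """Simplified LRS: repeated runs not subsumed by a longer, equally frequent run."""
    counts = Counter()
    for path in paths:
        # Count every contiguous run of length >= 2 in this path.
        for i in range(len(path)):
            for j in range(i + 2, len(path) + 1):
                counts[tuple(path[i:j])] += 1
    repeating = {seq: c for seq, c in counts.items() if c >= min_count}
    return {
        seq: c
        for seq, c in repeating.items()
        if not any(
            len(other) > len(seq) and oc >= c and contains(other, seq)
            for other, oc in repeating.items()
        )
    }


# Hypothetical surfing paths over pages A, B, C, D, X.
paths = [
    ["A", "B", "C"],
    ["A", "B", "C"],
    ["A", "B", "D"],
    ["X", "A", "B"],
]
lrs = longest_repeating_sequences(paths)
print(lrs)
```

In this example A-B survives alongside A-B-C because it also occurs in paths that do not continue to C, while B-C is dropped because it only ever appears inside A-B-C. One-off paths disappear entirely, which is how the technique reduces the size of the path model.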
19 IUNIS
- Step 1 (diagram legend): P = observed user path, T = topology matrix, W = word x document weights, K = relevant keywords
- Step 2 (diagram labels): Topology, Path, Weight, Path
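A simplified sketch of how keywords K might be derived from an observed path P and the word x document weights W. It assumes each word is scored by its summed weight over the pages on the path and it ignores the topology matrix T, so it is an illustration of the general idea rather than the IUNIS algorithm itself. The vocabulary, weights, and function name are hypothetical.

```python
def infer_keywords(weights, vocabulary, path, top_k=3):
    """Rank words by their summed weight over the pages on the observed path."""
    n_docs = len(weights[0])
    # Count how often each page occurs on the path.
    visits = [0] * n_docs
    for page in path:
        visits[page] += 1
    scores = {
        word: sum(w * v for w, v in zip(row, visits))
        for word, row in zip(vocabulary, weights)
    }
    # Highest score first; ties broken alphabetically for a stable result.
    ranked = sorted(scores, key=lambda word: (-scores[word], word))
    return ranked[:top_k]


# Hypothetical word x document weights (rows: words, columns: pages 0..2).
vocabulary = ["dividend", "printer", "shareholder", "stock"]
weights = [
    [0.0, 0.6, 0.7],  # dividend
    [0.9, 0.0, 0.0],  # printer
    [0.1, 0.9, 0.5],  # shareholder
    [0.2, 0.5, 0.4],  # stock
]
path = [1, 2]  # observed user path: page 1, then page 2
keywords = infer_keywords(weights, vocabulary, path, top_k=2)
print(keywords)
```

The returned words play the role of the keyword summary of the path: words that are heavily weighted on the visited pages rise to the top, and words that appear only on unvisited pages drop out.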
20 Evaluation of IUNIS
- Goal
- Show that keyword summaries produced by IUNIS are good at communicating the content of the user paths.
- Dataset
- 8 participants
- 10 random paths (5/18/1998, xerox.com, path length = 6)
- booklets of pages on paths (in order)
21 Evaluation of IUNIS
- Procedure
- Single rating sheet with the ten 20-word summaries. Beside each summary, users are asked to rate the summary on a 5-point Likert scale. A copy of this rating sheet is attached to each of the ten path booklets.
- Users are asked to read through each booklet and rate each of the path summaries.
- Users are also asked to identify which of the ten summaries was the best match.
22 Evaluation of IUNIS
- Results
- Matching summary: mean 4.58 (median = 5)
- Non-matching summary: mean 1.97 (median = 1)
- Difference highly significant (p < .001)
- Best match summary: 5.6 out of 10 (Cohen's Kappa = 0.51)
- Evaluation yields strong evidence that IUNIS generates good summaries of the Web paths.
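As a sanity check on the best-match figure, the reported kappa is consistent with a simple calculation, assuming chance agreement of 1 in 10 (picking one summary at random out of ten). That chance level is an assumption; the slides do not say how kappa was computed.

```python
def cohen_kappa(observed, expected):
    """Cohen's kappa: agreement beyond chance, scaled by the maximum possible."""
    return (observed - expected) / (1.0 - expected)


observed = 5.6 / 10  # reported: best-match summary identified 5.6 times out of 10
expected = 1 / 10    # assumption: uniform guessing among ten summaries
kappa = cohen_kappa(observed, expected)
print(round(kappa, 2))  # 0.51
```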
23 ScentViz Tasks
- Overall site
- High-level traffic flow and routes?
- Ease of access and costs?
- Given a specific Web page
- Where do users come from?
- Where do they go?
- What other pages are related?
- Users
- What are the interests of the users?
- Where should they go based on their need?
- Do observed data match simulation?
24 Visualization Demo
- Dome Tree
- Usage Based Layout
- Path Embedding
25 Scenario 1: Page Types
- Multi-way branching point
investor/sitemap.htm
26 Scenario 1: Drill-down
- Few well-traveled future paths
- shareholder info
- 1998 fact book
- financial doc order
- Conclusion
- good local sitemap
27 Scenario 2: Well-traveled
- Related information all over the site
- One well-worn path on the left relating to
product tutorial
Scansoft/tbpro98win/index.htm
28 Scenario 3: Identify Need
- Need of path from shareinfo to orderdoc
- reinvestment
- stock
- brochure
- dividend
- shareholder
investor/sitemap.htm
29 Scenario 4: Scent Predict
- Scent computed based on pagis need
- Good match between scent and LRS paths
Scansoft/pagis/index.html
30 InfoScent Summary
- The overall goal is to model Web user information needs
- Bridge gap between clicks and information needs
- Predict user navigation behavior
- Develop new applications and Web usability metrics
31 Questions?
- Ed H. Chi
- Chi_at_acm.org
- http://www.geekbiker.com