Title: Search, Text Mining, Web Site Usability
1 Search, Text Mining, Web Site Usability
2 Search Interfaces: Past Projects
- TileBars
- Scatter/Gather
- DynaCat
- Cat-a-Cone
3 BAILANDO Projects
- Better Access to Information using Language Analysis and Novel Dynamic Organizations
4 Current BAILANDO Projects
- CHA-CHA: Web Search results in Context
- LINDI: UI support for Search; Text Data Mining
- TANGO: Automated Web Site Usability
5 Search UIs
- Combine Browsing & Search
- Place Search Results in Context
Large Category Hierarchies
6 Cha-Cha
- Students: Mike Chen, Jamie Laflen, Jason Hong, Jimmy Lin, Shiang Chen
7 Medical Category Hierarchy
8 DynaCat (Pratt, Hearst, Fagan 99)
9 DynaCat Study
- Design
- Three queries
- 24 cancer patients
- Compared three interfaces
- ranked list, clusters, categories
- Results
- Participants strongly preferred categories
- Participants found more answers using categories
- Participants took the same amount of time with all three interfaces
- Similar results have been verified by another study by Chen and Dumais (CHI 2000)
10 Cat-a-Cone Interface (Hearst & Karadi 97)
11 Improving Search via Large Category Hierarchies
- How to show intersections across category types?
- How to preview related categories in a
user-tailored, dynamic manner?
12 Information retrieval
Text Data Mining
13 Information retrieval
- Selection or rejection of existing documents
based on a function of word match.
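As a minimal sketch of "selection or rejection based on a function of word match", the Python below scores each document by the fraction of query words it contains and keeps those above a threshold. The scoring function, the threshold value, and the sample documents are illustrative assumptions, not taken from the slides.

```python
def word_match_score(query: str, document: str) -> float:
    """Fraction of distinct query words that also occur in the document."""
    query_words = set(query.lower().split())
    doc_words = set(document.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & doc_words) / len(query_words)


def retrieve(query: str, documents: list[str], threshold: float = 0.5) -> list[str]:
    """Select documents whose score reaches the threshold, best match first.

    The threshold is a hypothetical cut-off chosen for illustration.
    """
    scored = [(word_match_score(query, doc), doc) for doc in documents]
    selected = [(score, doc) for score, doc in scored if score >= threshold]
    # sorted() is stable, so ties keep their original order.
    return [doc for score, doc in sorted(selected, key=lambda p: p[0], reverse=True)]


if __name__ == "__main__":
    docs = [
        "magnesium deficiency and stress",
        "web site usability metrics",
        "stress levels in patients",
    ]
    for doc in retrieve("stress magnesium", docs):
        print(doc)
```

Note that this only selects among existing documents; it cannot combine information across them, which is the contrast the next slide draws.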
14 Text Data Mining
- Relationships between information in documents
can create new facts, not previously known.
15 Imagine
- You are a medical researcher
- Your patient has
- spinal inflammation
- numbness in fingers
- low TC levels
- negative results for all tests
- How can you help her?
16 Idea
- A new way of searching text.
- Link pieces of information together to formulate hypotheses
17 LINDI: Linking Information for New DIscoveries
- Students: Barbara Rosario, David Blei
- Three main parts
- Search UI for building and reusing hypothesis seeking strategies.
- Statistical language analysis techniques for interpreting the text.
- Backend for interfacing with various databases and translating different formats.
18 Gathering Evidence
Spinal Inflammation
Numbness in fingers
Low TC Levels
19 Gathering Evidence
Find diseases associated with each
Spinal Inflammation
Numbness in fingers
Low TC Levels
20 Gathering Evidence
Find unanticipated commonalities
Spinal Inflammation
Numbness in fingers
Low TC Levels
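The three Gathering Evidence steps (list the findings, find diseases associated with each, find commonalities) can be sketched as a counting operation over sets. The Python below is only an illustration: the function name, the `min_support` parameter, and the `condition_A`/`condition_B`/`condition_C` labels are placeholders, not medical facts or part of LINDI.

```python
from collections import Counter


def common_candidates(associations: dict[str, set[str]], min_support: int = 2) -> dict[str, int]:
    """Count how many findings each candidate condition is associated with.

    Returns only the candidates linked to at least `min_support` findings.
    """
    counts = Counter()
    for finding, conditions in associations.items():
        counts.update(set(conditions))  # each finding votes once per condition
    return {cond: n for cond, n in counts.items() if n >= min_support}


if __name__ == "__main__":
    # Placeholder associations; a real system would retrieve these from text.
    associations = {
        "spinal inflammation": {"condition_A", "condition_B"},
        "numbness in fingers": {"condition_B", "condition_C"},
        "low TC levels": {"condition_A", "condition_B"},
    }
    for cond, n in sorted(common_candidates(associations).items()):
        print(cond, n)
```

A candidate shared by all findings is the kind of unanticipated commonality the slide refers to; lowering `min_support` widens the set of hypotheses to inspect.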
21 Supporting Cascaded Search Operations
23 New Language Analysis
- First use category labels to retrieve candidate documents
- Then use language analysis to detect causal relationships between concepts
- Title
- Magnesium deficiency implicated in increased stress levels.
- Interpretation
- <nutrient><reduction> related-to <increase><symptom>
- Use these to find relationships and formulate hypotheses
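To make the title-to-interpretation mapping concrete, here is a toy Python sketch that tags words with concept types and splits them around a relation cue. It uses a hand-written lexicon as a stand-in; the slides call for statistical language analysis, which this is not. The lexicon entries, the cue word, and the function name are all assumptions made for illustration.

```python
# Hypothetical hand-built lexicon mapping words to concept types.
CONCEPT_LEXICON = {
    "magnesium": "<nutrient>",
    "deficiency": "<reduction>",
    "increased": "<increase>",
    "stress": "<symptom>",
}
# Hypothetical cue words that signal a relationship between two concept groups.
RELATION_CUES = {"implicated": "related-to"}


def interpret(title: str):
    """Return '<left concepts> relation <right concepts>', or None if no cue is found."""
    tokens = [t.strip(".,").lower() for t in title.split()]
    left, right, relation = [], [], None
    for tok in tokens:
        if tok in RELATION_CUES and relation is None:
            relation = RELATION_CUES[tok]
        elif tok in CONCEPT_LEXICON:
            # Concepts before the cue go left of the relation, the rest go right.
            (left if relation is None else right).append(CONCEPT_LEXICON[tok])
    if relation is None:
        return None
    return "".join(left) + " " + relation + " " + "".join(right)


if __name__ == "__main__":
    print(interpret("Magnesium deficiency implicated in increased stress levels."))
```

A lexicon lookup like this breaks down quickly on real titles, which is one motivation for the statistical approach described on the following slide.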
24 Statistical Semantic Parsing
- Modern statistical techniques
- Mainly applied to syntactic structure
- Probabilistic knowledge representation
- Represent hypotheses with different degrees of
certainty.
25 Automating Assessment of Web Site Usability
26 Why Worry?
- Problem: IBM's extranet
- Heavy use of help and search
- Unhappy users
- Solution
- Massive web site redesign
- Focus on info-organization, not the purchasing process.
- Cost: "in the millions"
- Results
- Not announced or trumped up
- Use of "help" decreased 84%
- Sales increased 400%
27 Web TANGO: Tool for Assessing NaviGation Organization
- Student: Melody Ivory
- Goal: automated support for comparing design alternatives
- How: Assess usability of the information architecture
- Approximate people's information-seeking behavior (Monte Carlo simulation)
- Output: quantitative usability metrics
28 Anatomy of Web Site Design
29 Usability Evaluation: Standard Techniques
- User studies
- Have people use the interface to complete some tasks
- Requires an implemented interface
- "Discount" vs. Scientific Results
- Heuristic Evaluation
- An expert assesses a design or implementation
according to certain guidelines
30 Automated Usability Evaluation
- Logging/capture
- Pro: Easy
- Con: Requires implemented system
- Con: Don't know the user task (web)
- Con: Don't present alternatives
- Con: Don't distinguish error from success
- Analytical Modeling
- Pro: doable at design phase
- Con: models an expert
- Con: academic exercise
- Simulation
31 Existing Metrics
- Web metric analysis tools report on what is easy to measure, e.g.
- Predicted download time
- Depth/breadth of site
- We want to worry about
- Content
- User goals/tasks
- Not available from logs
- We also want to compare alternative designs.
32 Monte Carlo Simulation
- Have a model of information structure
- Have a set of user goals
- Want to assess navigation structure
- Compare alternatives/tradeoffs
- Identify bottlenecks
- Identify critically important pages/links
- Check all pairs of start/end points
- Check overall reachability before and after a
change.
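The last two checks (all pairs of start/end points, reachability before and after a change) do not need randomness; a plain breadth-first search over the link graph is enough. The Python below is a sketch under assumptions: the site is modeled as a dictionary from page to outgoing links, and the page names are invented for illustration.

```python
from collections import deque


def reachable_from(site: dict[str, list[str]], start: str) -> set[str]:
    """Pages reachable from `start` by following links (breadth-first search)."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for nxt in site.get(page, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def unreachable_pairs(site: dict[str, list[str]]) -> set[tuple[str, str]]:
    """All (start, end) pairs where `end` cannot be reached from `start`."""
    pages = set(site) | {p for links in site.values() for p in links}
    return {(a, b) for a in pages for b in pages - reachable_from(site, a)}


if __name__ == "__main__":
    # Hypothetical three-page site, before and after removing one link.
    before = {"home": ["products", "support"], "products": ["home"], "support": ["home"]}
    after = {"home": ["products"], "products": ["home"], "support": ["home"]}
    print(sorted(unreachable_pairs(before)))
    print(sorted(unreachable_pairs(after)))
```

Comparing the two results shows which start/end pairs a proposed change would break before any user encounters it.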
33 Monte Carlo Simulation
- At each step in the simulation
- Assume a probability distribution over a set of next choices.
- The next choice is a function of
- The current goal
- The understandability of the choice
- The overall complexity of the set of choices
- Prior interaction history
- These can use models of "scent"
- Varying the distribution corresponds to varying properties of the links
- Spot-check important choices
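The simulation loop described above can be sketched in a few lines of Python. This is a simplified illustration, not the Web TANGO simulator: each page carries fixed link weights standing in for the probability distribution over next choices, and goal, understandability, complexity, and history are all folded into those weights by assumption. Page names other than "renter support" are hypothetical.

```python
import random


def choose_link(links: dict[str, float], rng: random.Random) -> str:
    """Sample the next page according to the link weights."""
    targets = list(links)
    weights = [links[t] for t in targets]
    return rng.choices(targets, weights=weights, k=1)[0]


def simulate(site, start, goal, rng, max_steps=50):
    """One simulated session; returns the step count, or None if the goal is not reached."""
    page, steps = start, 0
    while page != goal and steps < max_steps:
        links = site.get(page)
        if not links:          # dead end: no outgoing links
            return None
        page = choose_link(links, rng)
        steps += 1
    return steps if page == goal else None


def estimate(site, start, goal, runs=1000, seed=0):
    """Return (success rate, mean steps over successful runs)."""
    rng = random.Random(seed)
    results = [simulate(site, start, goal, rng) for _ in range(runs)]
    successes = [r for r in results if r is not None]
    mean_steps = sum(successes) / len(successes) if successes else None
    return len(successes) / runs, mean_steps


def make_site(p_correct: float):
    """Hypothetical site: the home page link to 'rentals' is chosen with probability p_correct."""
    return {
        "home": {"rentals": p_correct, "about": 1.0 - p_correct},
        "about": {"home": 1.0},
        "rentals": {"renter support": 1.0},
    }


if __name__ == "__main__":
    for label, p in [("clear link text", 0.8), ("vague link text", 0.3)]:
        rate, steps = estimate(make_site(p), "home", "renter support")
        print(label, rate, steps)
```

Varying `p_correct` plays the role of varying a link property: a design alternative that makes the right link more likely to be chosen should show a lower mean step count.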
34 Figure: One Monte Carlo simulation step for Design 1, Task 1. Simulation starts from the home page and the target information is at Renter Support.
35 Figure: Monte Carlo simulation results for Design 1, Task 1. Simulation runs start from all pages in the site. Average navigation times are shown for Tasks 2 & 3.
36 Using Simulator Results
- Design Decisions
- Use Design 1
- Improve Tasks 1 & 2
- Next Steps
- Analyze results for Tasks 1 & 2
- Create new Design 1
- Repeat simulation to compare old & new designs
- Iterate if necessary
37 Research Issues: Navigation Predictions
- Develop IR model for predicting link selection
- Requirements
- Information need (task metadata)
- Representation of pages (page metadata)
- Method for selecting links (relevance ranking)
- Maintaining users' conceptual model during site traversal (scent [Fur97, LC98, Pir97])
- One possible approach
- Information Foraging Theory [PC95, Pir97, PPR96]
- Functional categorization of pages based on features
- Prediction of relevance to current page
- Consider link connectivity, text similarity, & usage
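One simple way to illustrate "relevance ranking" for link selection is to score each link's text against the task description with cosine similarity over word counts. The Python below is a sketch of that text-similarity component only; it ignores link connectivity and usage, and the task wording and link labels are invented for the example.

```python
import math
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercased word counts; a deliberately crude page/task representation."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_links(task: str, links: list[str]) -> list[tuple[float, str]]:
    """Rank link texts by similarity to the task description, best first."""
    task_vec = bag_of_words(task)
    scored = [(cosine(task_vec, bag_of_words(text)), text) for text in links]
    return sorted(scored, key=lambda p: p[0], reverse=True)


if __name__ == "__main__":
    # Hypothetical task and link labels.
    for score, text in rank_links("find renter support contact",
                                  ["Renter Support", "Company News", "Contact Us"]):
        print(round(score, 3), text)
```

Scores like these could be normalized into the per-step probability distribution that the Monte Carlo simulation samples from, though how the slides' authors combined them is not stated here.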
38 Other HCC-Related Projects
- Using a large digital desk in design
- Ame Elliot
- Using visualization for light design
- Dan Glaser
- User interfaces and computer security
- Prof. Doug Tygar, Rachna Dhamija