Search Text Mining Web Site Usability

UCB HCC Retreat

1
Search, Text Mining, Web Site Usability
  • Marti Hearst
  • SIMS

2
Search Interfaces: Past Projects
  • TileBars
  • Scatter/Gather
  • DynaCat
  • Cat-a-Cone

3
BAILANDO Projects
  • Better Access to Information
  • using Language Analysis and
  • Novel Dynamic Organizations

4
Current BAILANDO Projects
  • CHA-CHA
  • Web Search results in Context
  • LINDI
  • UI support for Search
  • Text Data Mining
  • TANGO
  • Automated Web Site Usability

5
Search UIs
  • Combine Browsing & Search
  • Place Search Results in Context

Large Category Hierarchies
6
Cha-Cha
Students: Mike Chen, Jamie Laflen, Jason Hong, Jimmy Lin, Shiang Chen
7
Medical Category Hierarchy
8
DynaCat (Pratt, Hearst, & Fagan 99)
9
DynaCat Study
  • Design
  • Three queries
  • 24 cancer patients
  • Compared three interfaces
  • ranked list, clusters, categories
  • Results
  • Participants strongly preferred categories
  • Participants found more answers using categories
  • Participants took same amount of time with all
    three interfaces
  • Similar results have been verified by another
    study by Chen and Dumais (CHI 2000)

10
Cat-a-Cone Interface (Hearst & Karadi 97)
11
Improving Search via Large Category Hierarchies
  • How to show intersections across category types?
  • How to preview related categories in a
    user-tailored, dynamic manner?

12
Information retrieval

Text Data Mining
13
Information retrieval
  • Selection or rejection of existing documents
    based on a function of word match.
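
As a minimal sketch of "selection or rejection based on a function of word match", the Python below scores each document by the number of distinct query words it contains and keeps those at or above a threshold. The function names and the threshold are assumptions made for illustration; the slides do not specify a particular matching function.

```python
def word_match_score(query: str, document: str) -> int:
    """Count how many distinct query words also appear in the document."""
    query_words = set(query.lower().split())
    doc_words = set(document.lower().split())
    return len(query_words & doc_words)


def retrieve(query: str, documents: list[str], threshold: int = 1) -> list[str]:
    """Select documents whose score reaches the threshold; reject the rest."""
    return [d for d in documents if word_match_score(query, d) >= threshold]
```

Note that nothing new is created here: the output is always a subset of the existing documents.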

14
Text Data Mining
  • Relationships between information in documents
    can create new facts, not previously known.

15
Imagine
  • You are a medical researcher
  • Your patient has
  • spinal inflammation
  • numbness in fingers
  • low TC levels
  • negative results for all tests
  • How can you help her?

16
Idea
  • A new way of searching text.
  • Link pieces of information together
  • to formulate hypotheses

17
LINDI: Linking Information for New DIscoveries
  • Students: Barbara Rosario, David Blei
  • Three main parts
  • Search UI for building and reusing hypothesis
    seeking strategies.
  • Statistical language analysis techniques for
    interpreting the text.
  • Backend for interfacing with various databases
    and translating different formats.

18
Gathering Evidence
Spinal Inflammation
Numbness in fingers
Low TC Levels
19
Gathering Evidence
Find diseases associated with each
Spinal Inflammation
Numbness in fingers
Low TC Levels
20
Gathering Evidence
Find unanticipated commonalities
Spinal Inflammation
Numbness in fingers
Low TC Levels
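
The evidence-gathering steps above (find what is associated with each finding, then look for commonalities) can be sketched as a set intersection. This is only an illustration: the function names are hypothetical, and the association sets are assumed to come from some earlier search step that the slides do not detail.

```python
from collections import Counter


def common_associations(findings: dict[str, set[str]]) -> list[str]:
    """Return the items associated with every finding."""
    sets = list(findings.values())
    if not sets:
        return []
    return sorted(set.intersection(*sets))


def association_counts(findings: dict[str, set[str]]) -> Counter:
    """Count how many findings each item is associated with."""
    return Counter(item for items in findings.values() for item in items)


# Placeholder labels only; these are not real medical associations.
findings = {
    "spinal inflammation": {"condition_A", "condition_B"},
    "numbness in fingers": {"condition_B", "condition_C"},
    "low TC levels": {"condition_B", "condition_D"},
}
shared = common_associations(findings)
```

Items that appear for most but not all findings can still be surfaced by sorting `association_counts(findings)` by count.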
21
Supporting Cascaded Search Operations
22
(No Transcript)
23
New Language Analysis
  • First use category labels to retrieve candidate
    documents
  • Then use language analysis to detect causal
    relationships between concepts
  • Title:
  • Magnesium deficiency implicated in increased
    stress levels.
  • Interpretation:
  • <nutrient><reduction> related-to
    <increase><symptom>
  • Use these to find relationships and formulate
    hypotheses
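
To make the title-to-interpretation step concrete, here is a toy sketch that maps words to semantic categories and checks for the nutrient/reduction/increase/symptom sequence. The lexicon and function names are assumptions for illustration; a hand-written lookup like this stands in for, and is much cruder than, the statistical language analysis the project proposes.

```python
import re

# Hypothetical toy lexicon: word -> semantic category.
LEXICON = {
    "magnesium": "nutrient",
    "deficiency": "reduction",
    "increased": "increase",
    "stress": "symptom",
}

PATTERN = ["nutrient", "reduction", "increase", "symptom"]


def tag(title: str) -> list[str]:
    """Return the categories of the known words, in order of appearance."""
    words = re.findall(r"[a-z]+", title.lower())
    return [LEXICON[w] for w in words if w in LEXICON]


def interpret(title: str):
    """Return the relation template if the title matches it, else None."""
    if tag(title) == PATTERN:
        return "<nutrient><reduction> related-to <increase><symptom>"
    return None
```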

24
Statistical Semantic Parsing
  • Modern statistical techniques
  • Mainly applied to syntactic structure
  • Probabilistic knowledge representation
  • Represent hypotheses with different degrees of
    certainty.

25
Automating Assessment of Web Site Usability

26
Why Worry?
  • Problem: IBM's extranet
  • Heavy use of help and search
  • Unhappy users
  • Solution
  • Massive web site redesign
  • Focus on info-organization, not the purchasing
    process.
  • Cost "in the millions"
  • Results
  • Not announced or trumped up
  • Use of "help" decreased 84%
  • Sales increased 400%

27
Web TANGO: Tool for Assessing NaviGation &
Organization
  • Student: Melody Ivory
  • Goal: automated support for comparing design
    alternatives
  • How: Assess usability of the information
    architecture
  • Approximate people's information-seeking behavior
    (Monte Carlo simulation)
  • Output: quantitative usability metrics

28
Anatomy of Web Site Design
29
Usability Evaluation: Standard Techniques
  • User studies
  • Have people use the interface to complete some
    tasks
  • Requires an implemented interface
  • "Discount" vs. Scientific Results
  • Heuristic Evaluation
  • An expert assesses a design or implementation
    according to certain guidelines

30
Automated Usability Evaluation
  • Logging/capture
  • Pro: Easy
  • Con: Requires implemented system
  • Con: Don't know the user task (web)
  • Con: Don't present alternatives
  • Con: Don't distinguish error from success
  • Analytical Modeling
  • Pro: doable at design phase
  • Con: models an expert
  • Con: academic exercise
  • Simulation

31
Existing Metrics
  • Web metric analysis tools report on what is easy
    to measure, e.g.
  • Predicted download time
  • Depth/breadth of site
  • We want to worry about
  • Content
  • User goals/tasks
  • Not available from logs
  • We also want to compare alternative designs.

32
Monte Carlo Simulation
  • Have a model of information structure
  • Have a set of user goals
  • Want to assess navigation structure
  • Compare alternatives/tradeoffs
  • Identify bottlenecks
  • Identify critically important pages/links
  • Check all pairs of start/end points
  • Check overall reachability before and after a
    change.
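
The "check all pairs of start/end points" and "reachability before and after a change" items can be sketched with a breadth-first search over the site's link graph. The graph representation (a dict from page to its outgoing links) and the function names are assumptions for illustration, not part of the tool described in the slides.

```python
from collections import deque


def reachable(links: dict[str, list[str]], start: str) -> set[str]:
    """Return every page that can be reached from start by following links."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for nxt in links.get(page, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def unreachable_pairs(links: dict[str, list[str]]) -> list[tuple[str, str]]:
    """List every (start, end) pair where end cannot be reached from start."""
    pages = set(links) | {p for targets in links.values() for p in targets}
    return sorted((s, e) for s in pages for e in pages - reachable(links, s))


# Hypothetical three-page site, before and after removing the home -> b link.
before = {"home": ["a", "b"], "a": ["home"], "b": ["home"]}
after = {"home": ["a"], "a": ["home"], "b": ["home"]}
```

Comparing `unreachable_pairs(before)` with `unreachable_pairs(after)` shows which start/end pairs a design change breaks.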

33
Monte Carlo Simulation
  • At each step in the simulation
  • Assume a probability distribution over a set of
    next choices.
  • The next choice is a function of
  • The current goal
  • The understandability of the choice
  • The overall complexity of the set of choices
  • Prior interaction history
  • These can use models of "scent"
  • Varying the distribution corresponds to varying
    properties of the links
  • Spot-check important choices
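
A minimal sketch of the simulation loop follows, assuming each page carries a weight for every outgoing link. In this sketch a single weight per link stands in for the combined effect of goal, understandability, complexity, and history; how those factors would actually be combined is not specified in the slides. All names and the `max_steps` cutoff are hypothetical.

```python
import random


def choose_next(page, links, rng):
    """Sample the next page from the weighted distribution over choices."""
    targets = list(links[page])
    weights = [links[page][t] for t in targets]
    return rng.choices(targets, weights=weights, k=1)[0]


def simulate(links, start, target, rng, max_steps=50):
    """Return the number of steps taken to reach target, or None on failure."""
    page = start
    for step in range(max_steps + 1):
        if page == target:
            return step
        if not links.get(page):  # dead end: no outgoing links
            return None
        page = choose_next(page, links, rng)
    return None


def average_steps(links, start, target, runs=1000, seed=0):
    """Return (mean steps over successful runs, fraction of runs that succeed)."""
    rng = random.Random(seed)
    results = [simulate(links, start, target, rng) for _ in range(runs)]
    successes = [r for r in results if r is not None]
    mean = sum(successes) / len(successes) if successes else None
    return mean, len(successes) / runs
```

Changing the weights on a page's links and rerunning `average_steps` corresponds to varying the properties of the links, as described above.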

34
One Monte Carlo simulation step for Design 1,
Task 1. Simulation starts from the home page and
the target information is at Renter Support.
35
Monte Carlo simulation results for Design 1, Task
1. Simulation runs start from all pages in the
site. Average Navigation times are shown for
Tasks 2 & 3.
36
Using Simulator Results
  • Design Decisions
  • Use Design 1
  • Improve Tasks 1 & 2
  • Next Steps
  • Analyze results for Tasks 1 & 2
  • Create new Design 1
  • Repeat simulation to compare old & new designs
  • Iterate if necessary

37
Research Issues: Navigation Predictions
  • Develop IR model for predicting link selection
  • Requirements
  • Information need (task metadata)
  • Representation of pages (page metadata)
  • Method for selecting links (relevance ranking)
  • Maintaining user's conceptual model during site
    traversal (scent [Fur97, LC98, Pir97])
  • One possible approach
  • Information Foraging Theory [PC95, Pir97, PPR96]
  • Functional categorization of pages based on
    features
  • Prediction of relevance to current page
  • Consider link connectivity, text similarity &
    usage
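
One way to picture "method for selecting links (relevance ranking)" is to rank a page's links by the text similarity between the task description and each link's text. The sketch below uses plain bag-of-words cosine similarity as a stand-in; it is not Information Foraging Theory, and the function names and the task/link strings are assumptions for illustration.

```python
import math
from collections import Counter


def cosine(a: str, b: str) -> float:
    """Cosine similarity between the word-count vectors of two strings."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def rank_links(task: str, link_texts: dict[str, str]) -> list[str]:
    """Order links from most to least similar to the task description."""
    return sorted(link_texts,
                  key=lambda link: cosine(task, link_texts[link]),
                  reverse=True)
```

The resulting scores could serve as the per-link weights in a navigation simulation, though link connectivity and usage data would need to be folded in separately.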

38
Other HCC-Related Projects
  • Using a large digital desk in design
  • Ame Elliot
  • Using visualization for light design
  • Dan Glaser
  • User interfaces and computer security
  • Prof. Doug Tygar, Rachna Dhamija