Search, Text Mining, Web Site Usability

UCB CS Research Fair.

Transcript and Presenter's Notes


1
Search, Text Mining, Web Site Usability
  • Marti Hearst
  • SIMS

2
BAILANDO Projects
  • Better Access to Information
  • using Language Analysis and
  • Novel Dynamic Organizations

3
Current BAILANDO Projects
  • CHA-CHA & FLAMENCO
    • Better Search Interfaces
  • LINDI
    • UI support for Search
    • Text Data Mining
  • TANGO
    • Automated Web Site Usability

4
Search UIs
  • Combine Browsing & Search
  • Place Search Results in Context

Large Category Hierarchies
5
Cha-Cha
  • Students: Mike Chen, Jamie Laflen, Jason Hong, Jimmy Lin, Shiang Chen
6
Medical Category Hierarchy
7
DynaCat (Pratt, Hearst, Fagan 99)
8
DynaCat Study
  • Design
    • Three queries
    • 24 cancer patients
    • Compared three interfaces: ranked list, clusters, categories
  • Results
    • Participants strongly preferred categories
    • Participants found more answers using categories
    • Participants took the same amount of time with all three interfaces
  • Similar results have been verified by another study by Chen and Dumais (CHI 2000)

9
Cat-a-Cone Interface (Hearst & Karadi 97)
10
FLAMENCO: Improving Search via Large Category Hierarchies
  • How to show intersections across category types?
  • How to preview related categories in a
    user-tailored, dynamic manner?

11
Text Data Mining
  • Relationships between information in documents
    can create new facts, not previously known.

12
Imagine
  • You are a medical researcher
  • Your patient has
  • spinal inflammation
  • numbness in fingers
  • low TC levels
  • negative results for all tests
  • How can you help her?

13
Idea
  • A new way of searching text.
  • Link pieces of information together to formulate hypotheses

14
LINDI: Linking Information for New DIscoveries
  • Three main parts
    • Search UI for building and reusing hypothesis-seeking strategies.
    • Statistical language analysis techniques for interpreting the text.
    • Backend for interfacing with various databases and translating different formats.

15
Gathering Evidence
Spinal Inflammation
Numbness in fingers
Low TC Levels
16
Gathering Evidence
Find diseases associated with each
Spinal Inflammation
Numbness in fingers
Low TC Levels
17
Gathering Evidence
Find unanticipated commonalities
Spinal Inflammation
Numbness in fingers
Low TC Levels
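
The "find diseases associated with each, then find commonalities" step can be sketched as a counting problem over sets. This is only a minimal illustration, not the LINDI implementation: the association table and the `condition_*` names are hypothetical placeholders (not medical facts), and `shared_conditions` is a name invented for this example.

```python
from collections import Counter

# Hypothetical finding -> associated-condition index. The condition names are
# placeholders, not medical facts; a real system would populate this from
# searches over the literature.
ASSOCIATIONS = {
    "spinal inflammation": {"condition_A", "condition_B", "condition_C"},
    "numbness in fingers": {"condition_B", "condition_C", "condition_D"},
    "low TC levels": {"condition_C", "condition_E"},
}


def shared_conditions(findings, associations, min_findings=None):
    """Return conditions linked to at least `min_findings` of the findings.

    By default a condition must be linked to every finding.
    """
    if min_findings is None:
        min_findings = len(findings)
    counts = Counter()
    for finding in findings:
        counts.update(associations.get(finding, set()))
    return sorted(c for c, n in counts.items() if n >= min_findings)


findings = ["spinal inflammation", "numbness in fingers", "low TC levels"]
print(shared_conditions(findings, ASSOCIATIONS))     # linked to all three findings
print(shared_conditions(findings, ASSOCIATIONS, 2))  # linked to at least two
```

Relaxing `min_findings` is one way to surface partial overlaps when no single condition explains every finding.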
18
Supporting Cascaded Search Operations
19
(No Transcript)
20
New Language Analysis
  • First use category labels to retrieve candidate
    documents
  • Then use language analysis to detect causal
    relationships between concepts
  • Title
    • Magnesium deficiency implicated in increased stress levels.
  • Interpretation
    • <nutrient><reduction> related-to <increase><symptom>
  • Use these to find relationships and formulate
    hypotheses
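
As a rough illustration of the title-to-interpretation step, the sketch below tags words with concept labels and splits the title at a cue phrase. It is a toy stand-in, not the statistical analysis the project describes: the lexicon, the cue phrases, and the function names `tag` and `interpret` are all assumptions made for this example.

```python
import re

# Hypothetical mini-lexicon mapping surface words to concept tags.
LEXICON = {
    "magnesium": "<nutrient>",
    "deficiency": "<reduction>",
    "increased": "<increase>",
    "stress": "<symptom>",
}
# Hypothetical cue phrases treated as signalling a relationship.
RELATION_CUES = ("implicated in", "associated with", "linked to")


def tag(text):
    """Concatenate the concept tags of the known words in `text`, in order."""
    words = re.findall(r"[a-z]+", text.lower())
    return "".join(LEXICON[w] for w in words if w in LEXICON)


def interpret(title):
    """Split a title at the first relation cue found and tag both sides."""
    lowered = title.lower()
    for cue in RELATION_CUES:
        if cue in lowered:
            left, right = lowered.split(cue, 1)
            return f"{tag(left)} related-to {tag(right)}"
    return None  # no cue phrase found


print(interpret("Magnesium deficiency implicated in increased stress levels."))
```

A hand-built lexicon like this does not scale, which is presumably why the slides point toward statistical techniques instead.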

21
Statistical Semantic Parsing
  • Modern statistical techniques
    • Mainly applied to syntactic structure
  • Probabilistic knowledge representation
    • Represent hypotheses with different degrees of certainty.

22
Automating Assessment of Web Site Usability

23
Why Worry?
  • Problem: IBM's extranet
    • Heavy use of help and search
    • Unhappy users
  • Solution
    • Massive web site redesign
    • Focus on info-organization, not the purchasing process.
    • Cost "in the millions"
  • Results
    • Not announced or trumped up
    • Use of "help" decreased 84%
    • Sales increased 400%

24
Web TANGO: Tool for Assessing NaviGation & Organization
  • Goal: automated support for comparing design alternatives
  • How
    • Assess usability of the information architecture
    • Approximate people's information-seeking behavior (Monte Carlo simulation)
  • Output: quantitative usability metrics
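
One way to picture the Monte Carlo idea is to simulate many visitors clicking through a site and record how often, and how quickly, they reach a target page. The sketch below is a simplified assumption, not the Web TANGO simulator: the site graph is invented, visitors pick links uniformly at random, and `simulate_session` and `estimate_usability` are hypothetical names.

```python
import random

# Hypothetical site structure: page -> pages it links to.
SITE = {
    "home": ["products", "support", "about"],
    "products": ["home", "laptops", "servers"],
    "support": ["home", "downloads", "contact"],
    "about": ["home"],
    "laptops": ["products"],
    "servers": ["products"],
    "downloads": ["support"],
    "contact": ["support"],
}


def simulate_session(site, start, target, max_clicks, rng):
    """Follow uniformly random links; return clicks used, or None on give-up."""
    page, clicks = start, 0
    while page != target:
        if clicks == max_clicks:
            return None  # the simulated visitor gives up
        page = rng.choice(site[page])
        clicks += 1
    return clicks


def estimate_usability(site, start, target, max_clicks=10, trials=10_000, seed=0):
    """Estimate success rate and mean clicks-to-target over many sessions."""
    rng = random.Random(seed)  # seeded so repeated runs are comparable
    results = [simulate_session(site, start, target, max_clicks, rng)
               for _ in range(trials)]
    successes = [r for r in results if r is not None]
    success_rate = len(successes) / trials
    mean_clicks = sum(successes) / len(successes) if successes else float("nan")
    return success_rate, mean_clicks


# Compare the current design with an alternative that links "downloads" from "home".
redesign = dict(SITE, home=SITE["home"] + ["downloads"])
rate_a, clicks_a = estimate_usability(SITE, "home", "downloads")
rate_b, clicks_b = estimate_usability(redesign, "home", "downloads")
print(f"current:  success={rate_a:.2f}, mean clicks={clicks_a:.2f}")
print(f"redesign: success={rate_b:.2f}, mean clicks={clicks_b:.2f}")
```

Real visitors do not click at random, so a more faithful model would weight link choices by how well link text matches the visitor's goal; uniform choice is used here only to keep the sketch short.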

25
Guidelines
  • There are many usability guidelines
  • A survey of 21 sets of web guidelines found
    little overlap (Ratner et al. 96)
  • Why?
  • Our hypothesis: not empirically validated
  • So let's figure out what works!

26
An Empirical Study
Which features distinguish well-designed web
pages?
27
Methodology
  • Data collection
    • 1108 pages
    • 163 sites
    • 3 levels per site
  • 14 metrics
    • About 85% accurate
    • Text cluster and text positioning counts less accurate

28
Metrics
29
Preliminary Results
  • Linear regression to predict Webby judges' ratings
    • Top 30 vs bottom 30
  • Prediction accuracy
    • 72% if categories not taken into account
    • 83% if categories assessed separately
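
To make "prediction accuracy" concrete, here is a minimal sketch of fitting a least-squares line to a single page metric and counting how many pages it places in the right group. Everything in it is an assumption for illustration: the data are made-up toy values, only one hypothetical metric (link count) is used rather than the study's 14, and thresholding the fitted value at 0.5 is a choice made here, since the slides do not say how regression output was turned into a top/bottom prediction.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def accuracy(xs, labels, slope, intercept, threshold=0.5):
    """Fraction of pages whose thresholded prediction matches the label."""
    hits = sum((slope * x + intercept >= threshold) == bool(label)
               for x, label in zip(xs, labels))
    return hits / len(xs)


# Made-up toy data: one hypothetical metric (link count) per page,
# labelled 1 for a top-rated page and 0 for a bottom-rated one.
link_counts = [12, 15, 18, 22, 30, 35, 41, 44]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

slope, intercept = fit_line(link_counts, labels)
print(accuracy(link_counts, labels, slope, intercept))
```

This measures accuracy on the same pages used for fitting, so it flatters the model; a held-out set would give a fairer estimate.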

30
Goals
  • Create empirical foundations for what is still guesswork
  • Next step
    • A free online tool
  • Long term goal
    • A Monte Carlo simulator for comparing potential designs

31
For More Information
  • http://webtango.berkeley.edu
  • hearst@sims.berkeley.edu