SIMS 202 Information Organization and Retrieval

1
SIMS 202: Information Organization and Retrieval
  • Prof. Marti Hearst and Prof. Ray Larson
  • UC Berkeley SIMS
  • Tues/Thurs 9:30-11:00am
  • Fall 2000

2
Today
  • Modern IR textbook topics
  • The Information Seeking Process

3
Information Retrieval (IR)
  • Information
  • Representation
  • Storage
  • Organization
  • Access

4
Information vs. Data Retrieval
  • IR is concerned more with retrieving information
    about a subject than with retrieving data that
    satisfies a given query
  • IR usually deals with natural language text, which
    is not always well structured and can be
    semantically ambiguous

5
Textbook Topics
6
More Detailed View
7
What We'll Cover
(Figure legend: A Lot / A Little)
8
Search and Retrieval: Outline of Part I of SIMS 202
  • The Search Process
  • Information Retrieval Models
  • Content Analysis/Zipf Distributions
  • Evaluation of IR Systems
  • Precision/Recall
  • Relevance
  • User Studies
  • System and Implementation Issues
  • Web-Specific Issues
  • User Interface Issues
  • Special Kinds of Search

9
What is an Information Need?
10
The Standard Retrieval Interaction Model
11
Standard Model
  • Assumptions
  • Maximizing precision and recall simultaneously
  • The information need remains static
  • The value is in the resulting document set
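
The first assumption is easier to discuss with the two measures written out. What follows is a minimal sketch in Python, assuming the usual set-based definitions: precision is the fraction of retrieved documents that are relevant, and recall is the fraction of relevant documents that are retrieved. The function name and the document IDs are made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant documents that were retrieved
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical document IDs, for illustration only.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d2", "d4", "d7"]
p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")
```

Retrieving more documents can only keep recall the same or raise it, but it tends to lower precision, which is why maximizing both at once is a strong assumption.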

12
Problem with Standard Model
  • Users learn during the search process
  • Scanning titles of retrieved documents
  • Reading retrieved documents
  • Viewing lists of related topics/thesaurus terms
  • Navigating hyperlinks
  • Some users don't like long, disorganized lists of
    documents

13
Search is an Iterative Process
14
Berry-Picking as an Information Seeking
Strategy (Bates 90)
  • Standard IR model
  • assumes the information need remains the same
    throughout the search process
  • Berry-picking model
  • interesting information is scattered like berries
    among bushes
  • the query is continually shifting

15
Figure: A sketch of a searcher moving through many
actions towards a general goal of satisfactory
completion of research related to an information
need. (after Bates 89)
Figure labels: Q0, Q1, Q2, Q3, Q4, Q5
16
Berry-picking model (cont.)
  • The query is continually shifting
  • New information may yield new ideas and new
    directions
  • The information need
  • is not satisfied by a single, final retrieved set
  • is satisfied by a series of selections and bits
    of information found along the way.
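
The contrast between a final retrieved set and selections gathered along the way can be made concrete with a toy simulation. This is only a sketch under loud assumptions: the corpus, the `search` function, and the rule "follow every new term seen in a retrieved document" are all hypothetical stand-ins for a human searcher's judgment, not part of Bates' model.

```python
from collections import deque

# Toy collection: document ID -> set of index terms (hypothetical data).
CORPUS = {
    "d1": {"search", "tactics"},
    "d2": {"search", "interfaces"},
    "d3": {"interfaces", "browsing"},
    "d4": {"browsing", "hypertext"},
}


def search(term):
    """Return IDs of documents indexed under the term, in ID order."""
    return sorted(d for d, terms in CORPUS.items() if term in terms)


def berry_pick(start_term, max_steps=10):
    """Follow a shifting query, keeping what is picked at every step."""
    basket = []                    # selections accumulated along the way
    tried = set()                  # terms already used as queries
    pending = deque([start_term])  # new directions suggested so far
    steps = 0
    while pending and steps < max_steps:
        term = pending.popleft()
        if term in tried:
            continue
        tried.add(term)
        steps += 1
        for doc in search(term):
            if doc not in basket:
                basket.append(doc)
            # Terms seen in retrieved documents suggest new queries.
            for t in sorted(CORPUS[doc]):
                if t not in tried:
                    pending.append(t)
    return basket


print(berry_pick("search"))  # everything gathered across all queries
print(search("hypertext"))   # what the last query alone returns
```

In this toy run the accumulated basket holds all four documents, while the last query on its own returns only one, which is the point of the slide: the value lies in the series of selections, not in the final set.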

17
Information Seeking Behavior
  • Two parts of a process
  • search and retrieval
  • analysis and synthesis of search results
  • This is a fuzzy area; we will look at several
    different working theories.

18
Search Tactics and Strategies
  • Search Tactics
  • Bates 79
  • Search Strategies
  • Bates 89
  • O'Day and Jeffries 93

19
Tactics vs. Strategies
  • Tactic: short-term goals and maneuvers
  • operators, actions
  • Strategy: overall planning
  • link a sequence of operators together to achieve
    some end

20
Information Search Tactics (after Bates 79)
  • Monitoring tactics
  • keep search on track
  • Source-level tactics
  • navigate to and within sources
  • Term and Search Formulation tactics
  • designing search formulation
  • selection and revision of specific terms within
    search formulation

21
Term Tactics
  • Move around the thesaurus
  • superordinate, subordinate, coordinate
  • neighbor (semantic or alphabetic)
  • trace -- pull out terms from information already
    seen as part of search (titles, etc)
  • morphological and other spelling variants
  • antonyms (contrary)

22
Source-level Tactics
  • Bibble
  • look for a pre-defined result set
  • e.g., a good link page on web
  • Survey
  • look ahead, review available options
  • e.g., don't simply use the first term or first
    source that comes to mind
  • Cut
  • eliminate large proportion of search domain
  • e.g., search on rarest term first
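
The Cut tactic has a direct analogue in how a conjunctive (AND) query can be evaluated over an inverted index. The sketch below assumes a hypothetical in-memory index mapping each term to a sorted list of document IDs; the index contents and the function name are invented for illustration.

```python
# Hypothetical inverted index: term -> sorted list of document IDs.
INDEX = {
    "information": [1, 2, 3, 4, 5, 6, 7, 8],
    "retrieval": [2, 3, 5, 8],
    "orienteering": [5, 9],
}


def and_query(terms, index):
    """Intersect posting lists, starting with the rarest term."""
    # Rarest first: the candidate set starts as small as possible,
    # so a large part of the search domain is eliminated immediately.
    ordered = sorted(terms, key=lambda t: len(index.get(t, [])))
    if not ordered:
        return []
    candidates = set(index.get(ordered[0], []))
    for term in ordered[1:]:
        if not candidates:
            break  # nothing left to intersect
        candidates &= set(index.get(term, []))
    return sorted(candidates)


print(and_query(["information", "retrieval", "orienteering"], INDEX))
```

Starting from "orienteering" leaves two candidate documents before the other terms are even consulted; starting from "information" would leave eight.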

23
Source-level Tactics (cont.)
  • Stretch
  • use source in unintended way
  • e.g., use patents to find addresses
  • Scaffold
  • take an indirect route to goal
  • e.g., when looking for references to obscure
    poet, look up contemporaries
  • Cleave
  • binary search in an ordered file
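
Cleave corresponds to an ordinary binary search. A minimal sketch using Python's standard `bisect` module follows; the alphabetized list of names stands in for an ordered file and is purely illustrative, as is the function name.

```python
from bisect import bisect_left


def cleave(sorted_entries, target):
    """Binary search: return the index of target, or -1 if absent."""
    i = bisect_left(sorted_entries, target)
    if i < len(sorted_entries) and sorted_entries[i] == target:
        return i
    return -1


# Hypothetical alphabetized author file.
authors = ["Bates", "Hearst", "Jeffries", "Larson", "O'Day", "Russell"]
print(cleave(authors, "Larson"))
print(cleave(authors, "Salton"))
```

Each probe halves the remaining range, so the number of comparisons grows with the logarithm of the file size rather than with the file size itself.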

24
Monitoring Tactics (strategy-level)
  • Check
  • compare original goal with current state
  • Weigh
  • make a cost/benefit analysis of current or
    anticipated actions
  • Pattern
  • recognize common strategies
  • Correct Errors
  • Record
  • keep track of (incomplete) paths

25
Additional Considerations (Bates 79)
  • Add a Sort tactic!
  • More detail is needed about short-term
    cost/benefit decision rule strategies
  • When to stop?
  • How to judge when enough information has been
    gathered?
  • How to decide when to give up an unsuccessful
    search?
  • When to stop searching in one source and move to
    another?

26
Lexis-Nexis Interface
  • What tactics did you use?
  • What strategies did you use?

27
Implications
  • Interfaces should make it easy to store
    intermediate results
  • Interfaces should make it easy to follow trails
    with unanticipated results
  • Makes evaluation more difficult.

28
Orienteering (O'Day and Jeffries 93)
  • Interconnected but diverse searches on a single,
    problem-based theme
  • Focus on information delivery rather than search
    performance
  • Classifications resulting from an extended
    observational study
  • 15 clients of professional intermediaries
  • financial analyst, venture capitalist, product
    marketing engineer, statistician, etc.

29
Orienteering (O'Day and Jeffries 93)
  • Identified three main search types
  • Monitoring
  • Following a plan
  • Exploratory
  • A series of interconnected but diverse searches
    on one problem-based theme
  • Changes in direction caused by triggers
  • Each stage followed by reading, assimilation, and
    analysis of resulting material.

30
Orienteering (O'Day and Jeffries 93)
  • Defined three main search types
  • monitoring
  • a well-known topic over time
  • e.g., research four competitors every quarter
  • following a plan
  • a typical approach to the task at hand
  • e.g., improve business process X
  • exploratory
  • explore topic in an undirected fashion
  • e.g., get to know an unfamiliar industry

31
Orienteering (O'Day and Jeffries 93)
  • Trends
  • A series of interconnected but diverse searches
    on one problem-based theme
  • This happened in all three search modes
  • Each analyst did at least two search types
  • Each stage followed by reading, assimilation, and
    analysis of resulting material

32
Orienteering (O'Day and Jeffries 93)
  • Searches tended to trigger new directions
  • Overview, then detail, repeat
  • Information need shifted between search requests
  • Context of problem and previous searches were
    carried to next stage of search
  • The value was contained in the accumulation of
    search results, not the final result set
  • These observations verified Bates' predictions.

33
Orienteering (O'Day and Jeffries 93)
  • Triggers: motivation to switch from one strategy
    to another
  • next logical step in a plan
  • encountering something interesting
  • explaining change
  • finding missing pieces

34
Stop Conditions (O'Day and Jeffries 93)
  • Stopping conditions not as clear as for triggers
  • People stopped searching when
  • no more compelling triggers
  • finished an appropriate amount of searching for
    the task
  • specific inhibiting factor
  • e.g., learning that the market was too small
  • lack of increasing returns
  • 80/20 rule
  • Missing information/inferences ok
  • business world different than scholarship
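
The "lack of increasing returns" condition can be sketched as a simple rule of thumb. The function name, the window of three searches, and the threshold of one new item are assumptions chosen for illustration; the study describes observed behavior, not a formula.

```python
def should_stop(new_items_per_search, window=3, min_new=1):
    """Stop once each of the last `window` searches found fewer than
    `min_new` previously unseen useful items."""
    if len(new_items_per_search) < window:
        return False  # too little history to judge returns
    recent = new_items_per_search[-window:]
    return all(n < min_new for n in recent)


# Hypothetical counts of new useful items found by successive searches.
print(should_stop([5, 3, 1, 0, 0, 0]))
print(should_stop([5, 3, 0, 0]))
```

The first call reports that it is time to stop because three searches in a row produced nothing new; the second does not, since one of the last three searches was still productive.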

35
After the Search: Analyzing and Synthesizing
Search Results
  • Orienteering Post-Search Behaviors
  • Read and Annotate
  • Analyze: 80% fell into six main types

36
Post-Search Analysis Types (O'Day and Jeffries 93)
  • Trends
  • Comparisons
  • Aggregation and Scaling
  • Identifying a Critical Subset
  • Assessing
  • Interpreting
  • The rest
  • cross-reference
  • summarize
  • find evocative visualizations
  • miscellaneous

37
Sensemaking (Russell et al. 93)
  • The process of encoding retrieved information to
    answer task-specific questions
  • Combine
  • internal cognitive resources
  • external retrieved resources
  • Create a good representation
  • an iterative process
  • contend with a cost/benefit tradeoff

38
Sensemaking (Russell et al. 93)
  • Most of the effort is in the synthesis of a good
    representation
  • covers the data
  • increase usability
  • decrease cost-of-use

39
Summary
  • The information access process
  • Berry picking/orienteering offer an alternative
    to the standard IR model
  • More difficult to assess results
  • Interactive search behavior can be analyzed in
    terms of tactics and strategies
  • Sensemaking
  • Combining searching with the use of the results
    of search.

40
Next Time
  • IR Systems Overview
  • Query Languages
  • Boolean Model
  • Boolean Queries