Title: Information Seeking Behavior
1Information Seeking Behavior
2Introduction
- Every day we are deluged by data
- It is received through our five senses, which are continuously at work
- Wide variety of input sources
- Written material (hard copy and electronic)
- Auditory (speech, radio, CDs, etc.)
- Imagery (photographs, graphs, etc.)
- Video (TV, movies, etc.)
3Information Overload
- "The greatest problem of today is how to teach people to ignore the irrelevant, how to refuse to know things, before they are suffocated. For too many facts are as bad as none at all." (W.H. Auden)
4Information Theory
- Claude Shannon, 1940s, studying communication
- Ways to measure information
- Communication: producing the same message at its destination as that seen at its source
- Problem: a noisy channel can distort the message
- Between transmitter and receiver, the message must be encoded
- Semantic aspects are irrelevant
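One of the "ways to measure information" from this slide is Shannon's entropy, H = sum of p * log2(1/p) over the symbols of a message. The sketch below estimates it from the symbol frequencies of a single string. The function name `entropy_bits` and the sample strings are illustrative assumptions, not part of the slides. The measure depends only on how often each symbol occurs, never on what the message means, which is the sense in which semantic aspects are irrelevant.

```python
import math
from collections import Counter


def entropy_bits(message):
    """Estimate Shannon entropy (bits per symbol) from symbol frequencies.

    Assumption: the empirical frequencies in `message` stand in for the
    true symbol probabilities of the source.
    """
    counts = Counter(message)
    n = len(message)
    # H = sum over symbols of p * log2(1 / p)
    return sum((c / n) * math.log2(n / c) for c in counts.values())


if __name__ == "__main__":
    for sample in ("aaaa", "abab", "abcd"):
        print(sample, entropy_bits(sample))
```

A message made of one repeated symbol carries no information by this measure, while a message that uses four symbols equally often needs two bits per symbol.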
5Information Theory
- Better called Communication Theory
- Communication may be over time and space
6What kinds of information are there?
- Text
- books, periodicals, WWW, memos, ads
- published/refereed
- Film
- Photos, other Images
- Broadcast TV, Radio
- Telephone Conversations
- Databases
7How much information is there?
8How Much Information?
- Stored Information
- Print
- Film
- Optical
- Magnetic
- Communicated
- Internet
- Broadcast
- Phone
- Mail
9Print
- Annual Production
- Books: 968,735 (8 Terabytes, compressed image)
- Newspapers: 22,643 (25 Terabytes)
- Journals: 40,000 (2 Terabytes)
- Magazines: 80,000 (10 Terabytes)
- Office Documents: 12 x 10^9 pages (312 Terabytes)
- TOTAL: 357 Terabytes (1824 scanned, 35 text)
10Print
- Library of Congress Printed book collection
- About 18 Million books
- About 130 Terabytes (compressed image)
- For all of LC we should also assume
- 13M photographs, 5MB each: 65 TB
- 4M maps: say 200 TB
- 500K files, 1GB each: 500 TB
- 3.5M sound recordings: 2000 TB
- Grand total: 3 petabytes (3000 terabytes)
- Books in Print
- 3.2 Million titles
- About 26 Terabytes
11Film and Image
- Film
- Photographs 410 Petabytes per year
- Movies 16 Terabytes (Commercial Production of about 4000 films)
- X-Rays 12 Petabytes
12Optical Media
- CD-Music 90,000 items 58 TB
- CD-ROM 3,000 items 3 TB
- DVD-Video 5,000 items 22 TB
- Total 83 TB
13Magnetic Media
- Audio Tape 184,200,000 184.2 Petabytes
- Video Tape 355,000,000 1420
- Floppy disks 0.07
- Removable disks 1.69
- Hard Disks 500
14Medium    Type of content     Terabytes/Year   Terabytes/Year
                                 (Upper Bound)    (Lower Bound)
  Paper     Books                            8                7
            Newspapers                      25               20
            Periodicals                     12               12
            Office documents               312              312
            SUBTOTAL                       357              351
  Film      Photographs                410,000          100,000
            Cinema                          16               16
            X-Rays                      12,000           12,000
            SUBTOTAL                   422,000          112,016
  Optical   Music CDs                       58               40
            Data CDs                         3                3
            DVDs                            22               22
            SUBTOTAL                        83               65
  Magnetic  Camcorder                  300,000          300,000
            Disk drives              2,555,000        1,000,200
            SUBTOTAL                 2,855,000        1,300,200
  TOTAL                              3,277,440        1,412,632
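The subtotals and totals in the table above can be checked with a few lines of arithmetic. The sketch below copies the rows from the slide. It makes one assumption: the lower bound for disk drives is taken as 1,000,200, the value implied by the Magnetic subtotal of 1,300,200. The names `ROWS`, `STATED_SUBTOTALS` and `subtotals` are illustrative.

```python
# Rows copied from the slide: (medium, type of content, upper, lower),
# all in terabytes per year.
ROWS = [
    ("Paper", "Books", 8, 7),
    ("Paper", "Newspapers", 25, 20),
    ("Paper", "Periodicals", 12, 12),
    ("Paper", "Office documents", 312, 312),
    ("Film", "Photographs", 410_000, 100_000),
    ("Film", "Cinema", 16, 16),
    ("Film", "X-Rays", 12_000, 12_000),
    ("Optical", "Music CDs", 58, 40),
    ("Optical", "Data CDs", 3, 3),
    ("Optical", "DVDs", 22, 22),
    ("Magnetic", "Camcorder", 300_000, 300_000),
    # Assumption: lower bound inferred from the Magnetic subtotal.
    ("Magnetic", "Disk drives", 2_555_000, 1_000_200),
]

# Subtotals and total exactly as stated on the slide.
STATED_SUBTOTALS = {
    "Paper": (357, 351),
    "Film": (422_000, 112_016),
    "Optical": (83, 65),
    "Magnetic": (2_855_000, 1_300_200),
}
STATED_TOTAL = (3_277_440, 1_412_632)


def subtotals(rows):
    """Sum the upper and lower bounds per medium."""
    out = {}
    for medium, _content, upper, lower in rows:
        u, l = out.get(medium, (0, 0))
        out[medium] = (u + upper, l + lower)
    return out


computed = subtotals(ROWS)
total_from_stated = tuple(
    sum(pair[i] for pair in STATED_SUBTOTALS.values()) for i in (0, 1)
)

if __name__ == "__main__":
    for medium, stated in STATED_SUBTOTALS.items():
        print(medium, "computed:", computed[medium], "stated:", stated)
    print("total from stated subtotals:", total_from_stated)
```

Under these assumptions every subtotal matches its rows except the Film upper bound, where the rows sum to 422,016 and the slide states 422,000. The stated grand totals are the sums of the stated subtotals.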
15Current Size of Web
- There are an estimated 2.1 Billion pages on the Web
- About 21 Terabytes
- About 7500 further Terabytes in web-accessed DBs.
- 610 Billion email messages per year: 11,285 TB
- Internet Traffic is doubling every 100 days
- An estimated 62 Million Americans now use the internet
- Radio took 38 years to get 50 M listeners, TV took 13 years, the Net took 4 years...
16Internet Hosts 1989-2005
17Projected Voice and Data Traffic
18Language Distribution of Web Content
19Language Distribution in a 634 Million Web Page Corpus
20Human Memory
- Landauer (1986): human brain holds 200MB
- looked at rate of information intake and rate of forgetting, and amount of information adults need for normal tasks
- 6B people on earth implies total memory of all people alive about 1,200 petabytes
- Another way
- estimate that people take in a byte/sec
- lifetime 25,000 days or 2B sec
- result is 2 GB (doesn't count synthesizing new info)
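Both estimates on this slide are simple multiplications, sketched below with decimal units (1 MB = 10^6 bytes). The variable names are illustrative. Two billion seconds works out to roughly 23,000 days, or about 63 years, so the lifetime figure is a round number rather than an exact one.

```python
MB = 10**6
GB = 10**9
PB = 10**15
SECONDS_PER_DAY = 86_400

# First estimate: 200 MB per person, 6 billion people.
brain_bytes = 200 * MB
population = 6 * 10**9
total_pb = brain_bytes * population / PB

# Second estimate: one byte per second over a lifetime of 2 billion seconds.
intake_bytes_per_sec = 1
lifetime_seconds = 2 * 10**9
lifetime_gb = intake_bytes_per_sec * lifetime_seconds / GB
lifetime_days = lifetime_seconds / SECONDS_PER_DAY

if __name__ == "__main__":
    print("all people alive:", total_pb, "petabytes")
    print("one lifetime of intake:", lifetime_gb, "GB")
    print("lifetime in days:", round(lifetime_days))
```

The two methods differ by a factor of ten per person (200 MB retained versus 2 GB taken in), which fits the first estimate's allowance for forgetting.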
21Data and Information
- These two terms are quite often used interchangeably
- used without any definitions or explanation
- There are no standard definitions for these two terms
- Two possible definitions
22Data and Information (cont.)
- Data
- items such as text, facts, numbers, images or sounds that may or may not be useful for a particular purpose
- Information
- data which has been processed so that its form and content are appropriate for a particular purpose
23Intuitive Notion
- Information must
- Be something, although the exact nature (substance, energy, or abstract concept) is not clear
- Be new: repetition of previously received messages is not informative
- Be true: false or counterfactual information is misinformation
- Be about something
- This human-centered approach emphasizes the meaning and use of the message
24Knowledge
- Quite often the terms information and knowledge are used interchangeably
- One possible definition of knowledge
- a combination of information, instincts, rules, ideas, procedures and experience that guide actions and decisions
25Knowledge (cont.)
- Two types of knowledge
- Tacit
- also called implicit, private or personal knowledge
- knowledge held by an individual that may not have been articulated or may not be articulable
- For example, how does Michael Jordan accomplish his slam dunks?
26Knowledge (cont.)
- Explicit
- also called public or social knowledge
- expressed in a form that makes it available to others
- usually in a written form, but may be in other forms such as verbal
27Continuum
- Quite often data, information and knowledge are expressed as a continuum
- Data > Information > Knowledge
28Pyramid
- Data, information and knowledge are also depicted as a pyramid
- a distillation occurs as we move up the pyramid
- data is raw material
- as data is processed, information is distilled from it and the resulting amount is smaller in size; the same result is experienced in going from information to knowledge
30Wisdom
- Long term goal should be the acquisition of wisdom
- but there is not much discussion in the literature or in the media
- The current situation was aptly described by T.S. Eliot
- "Where is the wisdom we have lost in knowledge? Where is the knowledge we have lost in information?"
31Wisdom (cont.)
- Wisdom connotes the ability to acquire and use
knowledge and information judiciously, possessing
the power of judging rightly and following the
soundest course of action based on knowledge,
skill, experience and understanding.
32Information Hierarchy
- Data
- The raw material of information
- Information
- Data organized and presented by someone
- Knowledge
- Information read, heard or seen and understood
- Wisdom
- Distilled and integrated knowledge and
understanding
33What is Data?
- Represented by shapes or symbols that require cognitive skill to decipher
- May not provide a context to fully understand its meaning
- e.g.
- 10,000,000
- 5,000,000
34What is Information?
- Involves process of reception, recognition and conversion
- May involve a novelty factor--a new piece of data
- May have multiple interpretations resulting in public and private information
- e.g.
- Joe won 10,000,000 in the lottery last year and 5,000,000 more this year.
35What is Knowledge?
- Is created/acquired from a collection of information
- Knowledge builds on a foundation of accurate information and can be passed on to others
- e.g.
- Joe has been paying a lot of taxes because of
his lottery winnings and the brand new mansion he
bought.
36What is Wisdom/Insight?
- Represents the highest level of complexity in the chain of concepts
- Difficult to impart via a storage medium
- Argued to exist only within an individual
- e.g.
- He who has money has friends.
37Where is the Life we have lost in living? Where
is the wisdom we have lost in knowledge? Where is
the knowledge we have lost in information?
-- T.S. Eliot, The Rock
Where is the information we have lost in data?
38Whom Do People Ask for Information?
- People immediately present
- People they know
- People they trust
- Gatekeepers
- People in authority generally
- People with cognitive authority
- Teachers
- Librarians
39How Do People Ask for Information?
- At the moment of need
- By the easiest available route
- By what they expect will give them the most suitable answer
- By what they expect will give them the most accessible answer
40Information and People
- Information reinforces social bonds
- People exchange familiar information
- People continue to believe erroneous information
- People say they value information (more than they use it)
- People want a known available source
41Limits to Information
- People do not want information that will upset them
- People do not want information that might upset them
- People do not want more information than they can store
- People do not want more information than they can process
- People must eventually stop getting information and act on what they know
42Dangers of Information
- Information might be erroneous
- Information might be deliberately misleading
- Information might be contradictory
- Information might be so excessive as to paralyze action
- Information may cost more than its worth
- Relying on authority may be better than information
- Possessing information may make one a too conspicuous social figure
- Possessing information may make one a challenge to authority
43Storing Information
- People do not want more information than they can store
- Immediate storage
- Short and long term memory
- Active knowledge
- People need more information than they can store immediately
- At hand
- In the library
- On the web
44Information Wants and Needs
- What people truly need
- What people recognize they need
- What people are willing to admit they need
- What people truly want now
- What people think they want now
- What people say they want now
45The Standard Retrieval Interaction Model
46Standard Model Assumptions
- Maximizing precision and recall simultaneously
- The information need remains static
- The value is the resulting document set
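Precision is the fraction of retrieved documents that are relevant, and recall is the fraction of relevant documents that are retrieved. The sketch below computes both for two hypothetical result sets. The document ids and the function name `precision_recall` are made up for illustration. It shows why maximizing the two at once is hard: broadening the search raises recall but lowers precision.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a retrieved set against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: four documents are relevant to the information need.
relevant = {"d1", "d2", "d3", "d4"}
narrow = ["d1", "d2"]                          # cautious query
broad = ["d1", "d2", "d3", "d5", "d6", "d7"]   # broader query

if __name__ == "__main__":
    print("narrow:", precision_recall(narrow, relevant))
    print("broad: ", precision_recall(broad, relevant))
```

Both measures score a single, final document set against a fixed notion of relevance, which is exactly the static-need assumption listed on this slide.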
47Problems with Standard Model
- Users learn during the search process
- Scanning titles of retrieved documents
- Reading retrieved documents
- Viewing lists of related topics
- Navigating hyperlinks
- Some users don't like long disorganized lists of documents
48Berry-Picking as an Information Seeking Strategy
- Standard IR model
- Assumes the information need remains the same throughout the search process
- Berry-picking model
- Interesting information is scattered like berries among bushes
- The query is continually shifting
49Berry-Picking Model (cont.)
- The query is continually shifting
- New information may yield new ideas and new directions
- The information need
- Is not satisfied by a single, final retrieved set
- Is satisfied by a series of selections and bits of information found along the way
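The contrast with the standard model can be made concrete with a toy sketch. It is only an illustration under strong assumptions: the four-document corpus, its index terms, and the rule "follow the first unused term of the document just picked" are all hypothetical, not part of the berry-picking model itself. The point is that each picked item shifts the query, and the result is the basket collected along the way rather than one retrieved set.

```python
# Hypothetical toy corpus: document id -> index terms.
CORPUS = {
    "doc1": {"information", "seeking"},
    "doc2": {"seeking", "berry-picking"},
    "doc3": {"berry-picking", "browsing"},
    "doc4": {"browsing", "hyperlinks"},
}


def search(term):
    """Static retrieval: ids of documents indexed under `term`, sorted."""
    return sorted(doc for doc, terms in CORPUS.items() if term in terms)


def berry_pick(start_term, max_steps=10):
    """Collect documents one at a time, letting each pick shift the query."""
    basket = []                  # items kept along the way
    used_terms = {start_term}
    term = start_term
    for _ in range(max_steps):
        new_docs = [doc for doc in search(term) if doc not in basket]
        if not new_docs:
            break
        picked = new_docs[0]
        basket.append(picked)
        # The picked document suggests a new direction for the query.
        next_terms = sorted(CORPUS[picked] - used_terms)
        if not next_terms:
            break
        term = next_terms[0]
        used_terms.add(term)
    return basket


if __name__ == "__main__":
    print("single static query:", search("information"))
    print("berry-picking:      ", berry_pick("information"))
```

In this toy corpus the single query finds one document, while the shifting query gathers all four.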
51Systems View (cont.)
- Data enter the system and are converted into information through a process of formatting, filtering and summarizing.
- knowledge is used to determine how to format, filter and summarize data
- Guided by knowledge, the resulting information is interpreted
- this leads to decisions and actions
52Systems View (cont.)
- The actions generate results.
- Comparison of actions and results helps accumulate new knowledge
- this improves the process of interpreting information, making decisions and taking new actions.
53The information search process
- the user's constructive activity of finding meaning from information in order to extend his or her state of knowledge
- the process of sense-making within a personal frame of reference
54The user and information-seeking behavior
- There is a long history of studying human behavior in seeking and using information
- Systems-oriented studies and information-as-object-oriented systems
- User-oriented studies and user-oriented systems
55Systems orientation
- Information is viewed as
- an external objective entity
- having a content-based reality
- existing independently of users or social systems.
56User-centered
- Information is viewed as
- a subjective construction
- that is created internally in the minds of the users
- User orientation
- Users, looking for information to aid problem solving and decision-making, have inadequacies in their state of knowledge: gaps or uncertainties
- sometimes they know what they need to find out; sometimes they don't
57User-centered (cont.)
- Information systems should be designed to assist users in discovering and representing their knowledge of a problem situation
- User model
- A general user model of information seeking behavior must encompass both the user and his context, i.e., the information behavior of the user and the environment in which this behavior occurs.
58Information Behavior
- Information needs
- Information seeking
- Information use
59Dervin's Sense-Making Model
- Dervin's sense-making model focuses on the user's cognitive needs
- the user moves through space and time
- making sense of his/her actions, the environment and the information system's inputs
- As long as everything is meaningful, movement ahead is possible.
60Dervin's Sense-Making Model
- But, movement ahead may be blocked by stops or cognitive gaps
- And the user must define the nature of the gap or the cause of the stop
- Based on the user's assessment, he selects tactics and information to bridge the gap.
61Kuhlthau's model
- Distinguished stages in the information search and use process
- each stage characterized by the user's behavior in three realms of experience
- the affective (feelings)
- the cognitive (thought)
- the physical (action)
62Kuhlthau's model (cont.)
- Six stages of the information search process
- initiation
- selection
- exploration
- formulation
- collection
- presentation
63Communication between the user and the
information retrieval system
- each has its own language (concepts vs. symbols)
- user must translate his information need into one the information system will understand OR
- the information system must interpret the information need of the user and translate the user's request into one that the system can process
64How is this communication accomplished?
- Different ways of searching
- controlled vocabulary
- natural language
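A controlled vocabulary does the translation step by mapping the many words a user might type onto one preferred term. The sketch below shows the idea with a tiny lookup table. The thesaurus entries and the function name `to_controlled_term` are hypothetical. A natural-language system would instead accept the user's words as typed and leave the matching to the system.

```python
# Hypothetical controlled vocabulary: preferred term -> entry terms (synonyms).
THESAURUS = {
    "automobiles": {"car", "cars", "auto", "autos", "automobile"},
    "information retrieval": {"ir", "document retrieval", "text retrieval"},
}


def to_controlled_term(user_term):
    """Map a user's term to the preferred term, or None if it is not covered."""
    term = user_term.strip().lower()
    for preferred, entry_terms in THESAURUS.items():
        if term == preferred or term in entry_terms:
            return preferred
    return None


if __name__ == "__main__":
    for query in ("Cars", " IR ", "bicycle"):
        print(repr(query), "->", to_controlled_term(query))
```

A term that is missing from the vocabulary returns `None`, which reflects the trade-off: controlled terms are consistent, but the user or an intermediary must find a term the system knows.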
65Information ecology
- User behavior and user environments
- part of what Davenport calls the information ecology of an organization
- internal environments and
- external environments
66Information ecology (Davenport)
- Davenport views an information ecology as encompassing six components
- Information strategy
- Information politics
- Information behavior and culture
- Information staff
- Information processes (use)
- Information architecture
67Information Systems
- An information system is a combination of work practices, information, people, and information technology organized to accomplish goals in an organization
- goals are actually outside the information system
68Forms of Information Systems
- database systems
- information storage and retrieval systems
- transaction processing systems
- management information systems
- decision support systems
- knowledge management systems
69Components of information systems
- Work practices
- Information
- People
- Information technology
70Information Life Cycle
- A useful way to envision information is in terms of its life cycle
- the life cycle identifies the phases through which information passes from creation to final disposition
- Life cycle phases
- Creating (Authoring)
- Distribution (Networking)
71Information Life Cycle (cont.)
- Life cycle phases (cont.)
- Use
- Organizing/Indexing
- Storing/Retrieving
- Accessing/Retrieving
- Reusing/Modifying
- Disposition