INF384c: Organizing and Providing Access to Information - PowerPoint PPT Presentation

Slides: 39
Provided by: miles2

Transcript and Presenter's Notes



1
INF384c: Organizing and Providing Access to
Information
  • Preliminaries: Information and Organization

2
Why are we here?
  • Is organizing information different from
    organizing anything else?
  • There's no class on organizing your medicine
    cabinet, so why a course on organizing information?
    Why do you need a master's to organize a library
    but not a grocery store?
  • What does it mean to organize information well?

3
I believe that information is unique with respect
to organization
  • A theme in this course: information and
    organization are intimately related.
  • I might even argue that they are
    indistinguishable.

4
The crux of my argument
  • The act of organizing is itself informative.
  • Organizing information makes new information.

5
An example: Information in crossword puzzles
  • Consider this very simplified word guessing game.
    How difficult is it?

6
An example: Information in crossword puzzles
  • Consider this very simplified word guessing game.
    How difficult is it?

Let's constrain the problem a bit. Pretend that
the answer must appear in the Merriam-Webster
online dictionary.
7
An example: Information in crossword puzzles
  • Consider this very simplified word guessing game.
    How difficult is it?

The dictionary contains 3275 three-letter words,
so we have a 1/3275 chance of guessing the answer
correctly.
8
An example: Information in crossword puzzles
  • Consider this very simplified word guessing game.
    How difficult is it?

There are only 150 three-letter words with an r
in the middle. So now our chance of guessing the
answer has risen from 1/3275 to 1/150.
9
An example: Information in crossword puzzles
  • Consider this very simplified word guessing game.
    How difficult is it?

Another way of saying this is that we have gained
a lot of information here. This is synonymous
with saying we've reduced our uncertainty.
10
Information, Probability, and Uncertainty
  • Note that I'm talking about information in a
    fairly constrained sense, i.e.:
  • An event or object is informative to the extent
    that it reduces our uncertainty with respect to a
    given problem.
  • Here I've casually equated uncertainty with
    probability.

11
Information, Probability, and Uncertainty
  • An event or object is informative to the extent
    that it reduces our uncertainty with respect to a
    given problem.

This is not my own idea. It was most thoroughly
formalized by Claude Shannon in a body of work
that has come to be called information
theory. For now we won't delve into Shannon's
work. Instead, let's continue our example.
12
Information, Probability, and Uncertainty
  • An event or object is informative to the extent
    that it reduces our uncertainty with respect to a
    given problem
  • Probability theory provides a convenient way to
    quantify information.
  • What is a probability?
  • What does it mean to say Pr(x) = 1/2?

13
An example: Quantifying Information
Probabilistically
  • Consider this very simplified word guessing game.
    How difficult is it?

There are only 150 three-letter words with an r
in the middle. So now our chance of guessing the
answer has risen from 1/3275 to 1/150.
14
An example: Quantifying Information
Probabilistically
  • Consider this very simplified word guessing game.
    How difficult is it?

There are 246 possibilities for this arrangement.
In information-theoretic terms, then, this is
less informative than knowing the middle letter
is an r.
15
An example: Quantifying Information
Probabilistically
That is, at a 1/246 chance of guessing correctly,
our uncertainty is higher than it was at 1/150.
Note what we've done: we've quantified how
informative something is!
  • Consider this very simplified word guessing game.
    How difficult is it?

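
The comparison between the two clues can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it treats every remaining word as equally likely, and it measures the gain as the base-2 logarithm of the shrinkage in the candidate set, which is one common convention rather than something the slides specify. The names `guess_probability` and `bits_gained` are made up for this sketch; the counts 3275, 150, and 246 are the ones quoted on the slides.

```python
import math

# Counts quoted on the slides.
ALL_THREE_LETTER_WORDS = 3275
CANDIDATES_AFTER_CLUE = {
    "middle letter is r": 150,
    "the other arrangement": 246,
}

def guess_probability(candidates: int) -> float:
    """Chance of a correct guess, assuming all candidates are equally likely."""
    return 1 / candidates

def bits_gained(before: int, after: int) -> float:
    """Information gained (in bits) when the candidate set shrinks.

    Assumption for this sketch: gain = log2(before / after).
    """
    return math.log2(before / after)

for clue, remaining in CANDIDATES_AFTER_CLUE.items():
    p = guess_probability(remaining)
    gain = bits_gained(ALL_THREE_LETTER_WORDS, remaining)
    print(f"{clue}: Pr(correct) = 1/{remaining} = {p:.4f}, gain = {gain:.2f} bits")
```

Under these assumptions the middle-r clue is worth roughly 4.45 bits and the 246-word arrangement roughly 3.73 bits, which matches the claim that the second clue is less informative.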
16
Quantifying information: Shannon's entropy measure
  • Entropy measures the uncertainty of a random
    event

17
Quantifying information: Shannon's entropy measure
  • Entropy measures the uncertainty of a random
    event

Event X: a toss of a two-headed coin
H(X) = -1 log(1) - 0 log(0) = -(1)(0) - 0 = 0
(taking 0 log(0) to be 0)
18
Quantifying information: Shannon's entropy measure
  • Entropy measures the uncertainty of a random
    event

Event X: a toss of a fair coin
H(X) = -1/2 log(1/2) - 1/2 log(1/2)
     = -1/2 (-1) - 1/2 (-1)
     = 1/2 + 1/2 = 1
(logarithms are base 2)
19
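Quantifying information: Shannon's entropy measure
Note that when all outcomes are equally likely,
entropy rises as the probability of each outcome
falls (H = log(1/p)), so we'll stick to
probabilities.
  • Entropy measures the uncertainty of a random
    event

Event X: draw a letter from the alphabet at
random
H(X) = -26 (1/26 log(1/26)) = -1 log(1/26)
     = log(26) ≈ 4.70
(base-2 logarithm, as in the coin examples; with
the natural logarithm the value is about 3.26)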
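
The three entropy calculations (two-headed coin, fair coin, random letter) can be checked with a short Python sketch. It assumes base-2 logarithms, so the results are in bits, and it follows the usual convention that an outcome with probability 0 contributes nothing to the sum. The function name `entropy_bits` is an illustrative choice, not something from the slides.

```python
import math

def entropy_bits(probabilities):
    """Shannon entropy H(X) = sum of p * log2(1/p), in bits.

    Outcomes with probability 0 are skipped, following the
    convention that 0 * log(0) counts as 0.
    """
    return sum(p * math.log2(1 / p) for p in probabilities if p > 0)

two_headed_coin = [1.0, 0.0]      # heads is certain
fair_coin = [0.5, 0.5]            # heads and tails equally likely
random_letter = [1 / 26] * 26     # each letter equally likely

print(f"two-headed coin: {entropy_bits(two_headed_coin):.2f} bits")
print(f"fair coin:       {entropy_bits(fair_coin):.2f} bits")
print(f"random letter:   {entropy_bits(random_letter):.2f} bits")
```

The more evenly the probability is spread over more outcomes, the larger the entropy: no uncertainty for the two-headed coin, one bit for the fair coin, and about 4.70 bits for the letter draw.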
20
An example: Information in crossword puzzles
This guessing game (a simple crossword) is harder
than our first game, but how hard?
A naive guess at our uncertainty. Why is this
wrong?
21
An example: Information in crossword puzzles
You only have to solve half of the clues to fill
the puzzle. But it's more complex than just
that. Filling in any square reduces, to some
extent, the uncertainty for all other squares, so
this is actually a very hard problem.
22
An example: Information in crossword puzzles
Given this puzzle, our chances of guessing the
remaining answer correctly are well above
1/3275. That is, of the 3275 possibilities, how
many will make three three-letter words from our
across spaces? Probably on the order of 1-20.
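
The way crossing entries shrink the candidate set can be sketched as a filter. Everything specific here is hypothetical: the 14-word `WORDS` set stands in for the 3275-word dictionary, the three across patterns are invented (the puzzle image is not part of this transcript), and `completes_all` and `viable_down_words` are names made up for the illustration. The sketch only shows the mechanism: a down answer survives only if each of its letters completes a real across word.

```python
# Hypothetical miniature dictionary standing in for the
# 3275 three-letter words mentioned on the slides.
WORDS = {
    "cat", "car", "cab", "ore", "ode", "owe", "ten",
    "tea", "tee", "are", "bee", "ran", "tan", "rat",
}

def completes_all(down, across_patterns):
    """True if each letter of `down` turns its across pattern into a word.

    Each pattern has exactly one blank ("_"), in the column
    that the down answer occupies.
    """
    return all(
        pattern.replace("_", letter) in WORDS
        for letter, pattern in zip(down, across_patterns)
    )

def viable_down_words(across_patterns):
    """Down candidates that are consistent with every across entry."""
    return sorted(w for w in WORDS if completes_all(w, across_patterns))

# Invented across entries, each missing its last letter.
across = ["ca_", "or_", "te_"]
survivors = viable_down_words(across)
print(f"{len(survivors)} of {len(WORDS)} candidates survive: {survivors}")
```

In this toy setup only 4 of the 14 candidates remain, so the chance of a correct guess is 1/4 rather than 1/14. The grid's structure did the narrowing, without any clue being read.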
23
An example: Information in crossword puzzles
The crucial point for us is that the uncertainty
here is far below our naïve estimate. The
organization of the crossword puzzle is itself
informative.
24
Organizing Information Creates New Information
welfare, slash, income, taxes, redefine, improve, defense, increase, security, raise
hole, mow, turf, cart, slice, putter, swing, ball, miss, ride
28
Organizing Information Creates New Information
welfare, slash, income, taxes, redefine, improve, defense, increase, security, raise
????
hole, mow, turf, cart, slice, putter, swing, ball, miss, ride
We must ______ ______ !
30
Organizing Information Creates New Information
  • How does HEB help shoppers find, say, jelly?
    (assuming benign intent)
  • How does an academic library (PCL) help
    researchers find Everything Is Miscellaneous?
  • How does Amazon help a reader find Everything Is
    Miscellaneous?

31
From http://tinyurl.com/6gvxs5
32
Grocery store example
  • What are the effects and side effects of a
    typical grocery store organization?
  • i.e. what does the organization communicate?
    What does it omit? Which omissions are
    intentional, and which are unintentional?

33
Searching/Finding/Browsing in a library
  • An exercise (see syllabus)

34
Searching/Finding/Browsing in a library
  • Christine Borgman argues that library catalogs
    contain vestiges of card catalogs. These relate
    to Charles Cutter's objectives.
  • A catalog should:
  • Enable a person to find a book by author, title,
    or subject
  • Enable a person to find whether or not the
    library contains a book by a given author, on a
    given subject, or of a given kind of literature
  • Assist in the choice of a book as to its
    edition (bibliographic) or its character
    (literary or topical).
  • Having just searched through a catalog, what
    objectives are missing from Cutter's list?

35
Searching/Finding/Browsing in a library
bibliographic control
  • The term bibliographic control occurs often in
    the literature related to organizing information.
  • Roughly speaking, bibliographic control is
    maintained by the library catalog. It refers to
    the work of organizing the collection by applying
    meaningful organizational structures to the
    contents of the library (the physical contents
    and their surrogates).
  • In fact, the idea that the library catalog should
    give searchers access to relevant information is
    relatively new in the history of bibliographic
    control. This development is due largely to Cutter
    and his formulation of the objectives we just
    discussed.

36
My mantra
  • The act of organizing information creates new
    information

37
Some implications
  • The act of organizing information creates new
    information

By organizing information, we are making
something new. We may even be saying
something new.
38
Some implications
  • The act of organizing information creates new
    information

Maybe we could compare the quality of two
organizational schemes by analyzing what they
communicate.