SEARCHING FOR TRUTH - PowerPoint PPT Presentation

About This Presentation
Title:

SEARCHING FOR TRUTH

Description:

Much older content is not digitized. How to digitize flowing text, ancient scripts, etc. ... Electronic bookshelves. Restricted access since most are commercial ... – PowerPoint PPT presentation

Slides: 22
Provided by: Sny64
Category:
Tags: for | searching | truth


Transcript and Presenter's Notes

Title: SEARCHING FOR TRUTH


1
SEARCHING FOR TRUTH
Chapter 5
  • Locating Information on the WWW

2
Information Retrieval
  • Major issues
  • Finding information
  • How is it organized
  • Is it searchable
  • Is it available
  • Much older content is not digitized
  • How to digitize flowing text, ancient scripts,
    etc.
  • Who should have access at what cost
  • Intersection of computer science and library
    science

3
Information Retrieval
  • Online access to card catalogs was the first
    achievement
  • Digital library databases are the current
    state of the art
  • Electronic bookshelves
  • Restricted access since most are commercial
  • You can access these via Pitt library
  • Future trend is to make the entire web an
    intelligent information repository

4
UW libraries Top 20 Databases with links
5
Summary links for UW libraries reference materials
6
National Public Radio (NPR) home page www.npr.org
7
The NPR home page's Programming pull-down menu
What are the top level classifications?
8
The NPR programming hierarchy tree
How many levels in the hierarchy?
9
Top level, second level and third level
classifications of a collection
10
Information Retrieval
  • How search engines find pages
  • Crawl the web, scanning pages for keywords and
    following links to other pages
  • Build huge databases of web resources (docs,
    images, etc.)
  • Google has over 6 billion resources in its DB
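
The crawl-and-index steps above can be sketched in a few lines. This is a minimal illustration, not how any real search engine is built: the "web" is a hypothetical in-memory dictionary (TOY_WEB), the URLs are made up, and the function name crawl is an assumption.

```python
from collections import deque
import re

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "a.example/home": ("thai restaurants in pittsburgh",
                       ["a.example/menu", "b.example/review"]),
    "a.example/menu": ("thai curry and noodle menu", ["a.example/home"]),
    "b.example/review": ("review of thai restaurants", []),
    "c.example/orphan": ("never linked so never crawled", []),
}

def crawl(start_url, web):
    """Breadth-first crawl from start_url.

    Returns an inverted index: keyword -> set of URLs containing it.
    """
    index = {}
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        text, links = web[url]
        # Scan the page for keywords and record where each one occurs.
        for word in re.findall(r"[a-z]+", text.lower()):
            index.setdefault(word, set()).add(url)
        # Follow links to pages that have not been visited yet.
        for link in links:
            if link in web and link not in seen:
                seen.add(link)
                queue.append(link)
    return index

index = crawl("a.example/home", TOY_WEB)
print(sorted(index["thai"]))
# ['a.example/home', 'a.example/menu', 'b.example/review']
```

Note that c.example/orphan never enters the index: a crawler only finds pages it can reach by following links.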

11
Information Retrieval
  • How search engines rank pages
  • Page ranking based on location and frequency of
    keywords
  • Page more relevant if keyword in title or top of
    body
  • Page more relevant if keyword occurs frequently
  • Webmasters can manipulate page to increase
    ranking (meta tag in head section)
  • Monitor links clicked to measure relevance
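
A rough sketch of location-and-frequency ranking follows. The weights (title_weight, top_weight), the top_words cutoff, and the sample pages are all illustrative assumptions; real engines use far more signals and do not publish their formulas.

```python
import re

def score(page, keyword, title_weight=3.0, top_weight=2.0, top_words=10):
    """Score one page for one keyword. All weights are assumed values."""
    keyword = keyword.lower()
    title_words = re.findall(r"[a-z]+", page["title"].lower())
    body_words = re.findall(r"[a-z]+", page["body"].lower())
    s = float(body_words.count(keyword))   # frequency: more occurrences score higher
    if keyword in title_words:             # location: keyword in the title
        s += title_weight
    if keyword in body_words[:top_words]:  # location: keyword near the top of the body
        s += top_weight
    return s

def rank(pages, keyword):
    """Return pages ordered from most to least relevant."""
    return sorted(pages, key=lambda p: score(p, keyword), reverse=True)

# Hypothetical pages, for illustration only.
pages = [
    {"url": "one.example", "title": "City guide",
     "body": "Hotels, museums, parks, shopping, nightlife, transit, "
             "weather, events, history, sports, and one thai place."},
    {"url": "two.example", "title": "Thai cooking",
     "body": "Thai curry basics. Thai noodles and thai soups."},
]

for page in rank(pages, "thai"):
    print(page["url"], score(page, "thai"))
# two.example 8.0
# one.example 1.0
```

Because the score depends only on what the page itself says, a webmaster can raise it by repeating keywords, which is the manipulation the slide warns about.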

12
The Google search engine's advanced search view
13
Restricting the Thai restaurant hits by
eliminating any page containing the word "review".
14
Making Effective Queries
  • Queries use the three logical operators
  • worda AND wordb -- both words must appear
  • worda OR wordb -- either word may appear
  • NOT worda -- the word is prohibited from
    appearing
  • Google has separate windows for each
  • When only 1 window is available, write a formula
  • (Lincoln OR Jefferson) AND NOT Memorial

Which parts of the diagram do the operators cover?
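
The three operators map directly onto set operations, which is also what the diagram regions represent: AND is intersection, OR is union, and NOT is the complement. The sketch below evaluates the formula (Lincoln OR Jefferson) AND NOT Memorial over a hypothetical five-page collection; the page ids and their words are made up.

```python
# Hypothetical toy collection: page id -> set of words on that page.
PAGES = {
    "p1": {"lincoln", "memorial", "washington"},
    "p2": {"lincoln", "biography"},
    "p3": {"jefferson", "memorial"},
    "p4": {"jefferson", "monticello"},
    "p5": {"roosevelt", "biography"},
}

def containing(word):
    """Return the set of page ids whose text contains the word."""
    return {page for page, words in PAGES.items() if word in words}

ALL = set(PAGES)

# OR -> union (|), AND -> intersection (&), NOT -> complement (ALL - ...)
hits = (containing("lincoln") | containing("jefferson")) & (ALL - containing("memorial"))
print(sorted(hits))  # ['p2', 'p4']
```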
15
Information Retrieval
  • Alternative Search Engines
  • A9.com
  • Search service provided by Amazon.com
  • Piggybacks on Google
  • Sorts web files into categories
  • Files, images, books, etc.
  • Can set up an account and save search results
  • Amazon can track your search usage!
  • Search results for HTML Tutorial

16
Information Retrieval
  • Vivisimo.com
  • Located in Pittsburgh, founded by CMU researchers
  • Uses clustering technique to sort web files into
    topical folders (common associations between
    keywords and other words)
  • Off to a successful start; Google is pursuing the
    same approach
  • Search results for HTML Tutorial
  • Search engines are not impartial
  • Most surfers can't identify links that are
    effectively advertisements (see article)
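
The idea of sorting results into topical folders can be illustrated with a very crude stand-in: label each result with the non-query word it shares with the most other results. This is not Vivisimo's actual algorithm, which the slides do not describe; the function name folder_results, the stopword list, and the sample titles are assumptions.

```python
import re
from collections import Counter

STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}

def folder_results(query, titles):
    """Group result titles into folders named after a widely shared word."""
    query_words = set(query.lower().split())
    # Words of each title, minus the query words and stopwords.
    word_sets = [
        set(re.findall(r"[a-z]+", t.lower())) - query_words - STOPWORDS
        for t in titles
    ]
    # Document frequency: in how many titles does each word appear?
    doc_freq = Counter(w for words in word_sets for w in words)
    folders = {}
    for title, words in zip(titles, word_sets):
        # Choose the word shared by the most titles; ties break alphabetically.
        label = min(words, key=lambda w: (-doc_freq[w], w)) if words else "other"
        folders.setdefault(label, []).append(title)
    return folders

# Hypothetical result titles for the query "HTML Tutorial".
titles = [
    "HTML tutorial for beginners",
    "Beginners guide: HTML tutorial",
    "HTML tables tutorial",
    "Tutorial: HTML forms and tables",
    "HTML tutorial",
]
for label, group in folder_results("HTML Tutorial", titles).items():
    print(label, "->", group)
```

On this sample the titles fall into three folders: "beginners", "tables", and "other".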

17
Information Retrieval
  • Semantic Web
  • Tim Berners-Lee's article in Scientific American
  • Today's web is for people to read, not computers
  • Goal is to bring a knowledge structure to www
  • Software agents will roam the WWW and perform
    tasks on behalf of the user
  • Based on knowledge representation and inference
    rules, from the field of Artificial Intelligence
    (AI)
  • XML and other technologies used to structure
    knowledge
  • WWW becomes enormous database where new knowledge
    can be discovered
  • Will be used to control physical devices too

18
Information Retrieval
  • Semantic Web
  • SW agents will follow hyperlinks to definitions
    of key terms
  • And will be able to reason logically
  • Users will compose semantic web pages using
    commercial SW
  • Web will be structured info with sets of
    inference rules
  • Knowledge representation is part of AI
  • Basic computer science research that predates the
    web
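
To make "structured info with sets of inference rules" concrete, here is a minimal sketch of an agent reasoning over stored facts. It applies a single assumed rule (locatedIn is transitive) until no new facts appear. The fact format, the locatedIn relation, and the function name infer are illustrative assumptions, not part of any Semantic Web standard.

```python
def infer(facts):
    """Apply one transitivity rule for 'locatedIn' until nothing new is derived.

    Rule: (a locatedIn b) and (b locatedIn d) implies (a locatedIn d).
    """
    known = set(facts)
    while True:
        new = {
            (a, "locatedIn", d)
            for (a, p1, b) in known if p1 == "locatedIn"
            for (c, p2, d) in known if p2 == "locatedIn" and b == c
        } - known
        if not new:
            return known
        known |= new

facts = {
    ("Vivisimo", "locatedIn", "Pittsburgh"),
    ("Pittsburgh", "locatedIn", "Pennsylvania"),
    ("Pennsylvania", "locatedIn", "USA"),
}
derived = infer(facts) - facts
print(sorted(derived))
```

The three derived facts (for example, Vivisimo locatedIn USA) were never stated on any page; they are discovered by applying the rule, which is the sense in which new knowledge can be inferred.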

19
Information Retrieval
  • Semantic Web
  • Custom XML tags used to structure the web
    document
  • Resource Description Framework (RDF) used to
    define what the XML tags mean
  • Subject - Verb - Object triple
  • John - is the brother of - Mary
  • Subjects, objects, and verbs are defined by a URI
  • URI is Uniform Resource Identifier (URL is a type
    of URI)
  • Ontology is collection of logical statements
    written in RDF
  • Can also be used to identify synonyms
  • ex) zip code = postal code
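
The triple and synonym ideas above can be sketched with plain Python tuples. This is not real RDF syntax or an RDF library: the example.org URIs, the code values, and the helper names canonical and query are all hypothetical. Literal strings are used as objects for brevity.

```python
# Hypothetical URIs; example.org is reserved for illustrations.
EX = "http://example.org/"
JOHN = EX + "people/John"
MARY = EX + "people/Mary"
BROTHER_OF = EX + "terms/brotherOf"
ZIP_CODE = EX + "terms/zipCode"
POSTAL_CODE = EX + "terms/postalCode"

# Each statement is a (subject, verb, object) triple.
triples = {
    (JOHN, BROTHER_OF, MARY),       # John - is the brother of - Mary
    (JOHN, ZIP_CODE, "15260"),      # made-up literal values
    (MARY, POSTAL_CODE, "15213"),
}

# Ontology-style statement that two terms mean the same thing.
SYNONYMS = {ZIP_CODE: POSTAL_CODE}

def canonical(term):
    """Map a term to its preferred synonym, if one is declared."""
    return SYNONYMS.get(term, term)

def query(subject=None, verb=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    verb = canonical(verb) if verb else None
    return sorted(
        (s, v, o) for (s, v, o) in triples
        if (subject is None or s == subject)
        and (verb is None or canonical(v) == verb)
        and (obj is None or o == obj)
    )

# Asking for postal codes also finds the triple stored under "zip code".
for s, v, o in query(verb=POSTAL_CODE):
    print(s, o)
```

Because every term is a URI, two documents that use the same URI are talking about the same thing, and the synonym table lets a query phrased with one vocabulary match data written in another.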

20
Information Retrieval
  • Semantic Web is potential Killer App
  • Improved web searches based on meaning of
    keywords
  • automated discovery of useful information via
    agents
  • Entire WWW becomes vast repository of linked
    information understandable by computers
  • Agents will be able to read digital signatures
    that prove authenticity of information
  • Exponential increase in the usefulness of
    information

21
Information Retrieval
  • Digital Libraries
  • Effort to turn WWW into immense library
    collection
  • Recently Google announced it'll scan/digitize
    books from
  • Harvard, Stanford, Oxford, Michigan, and NYC
    libraries
  • Like semantic web, based on knowledge
    representation
  • More than just an electronic bookshelf
  • Organization, classification, retrieval, and
    analysis of information
  • Pitt SIS planning a Master's program in Digital
    Library Information Management (DLIM)
  • The WWW of the Future
  • An automated, intelligent web containing all of
    the world's information (even historical
    archives!) that real-time decisions can be based
    on