John Cox - PowerPoint PPT Presentation


Title: John Cox


1
The Search for Quality: Productive Web Searching
  • John Cox
  • James Hardiman Library
  • NUI, Galway

2
The Problem
  • 7.3 million new Web pages daily
  • Quality varies, mainly due to ease of publication
    and lack of checks
  • Quality is in the eye of the beholder
  • Over-dependence on general search engines
  • Simplistic use of search tools

3
Some Usage Findings
  • NUI, Galway Library survey, March 2000
  • Search engines cited by 79 out of 167 respondents
  • Exclusively used for, eg Nazism, defamation law,
    hepatitis C
  • Less than 50% satisfied
  • Other surveys show very simplistic use
  • 33% of users enter one word only
  • A further 33% of users enter two words only
  • UK survey indicates 80% of searchers waste some time
  • US survey shows search rage within 12 minutes

4
Key Question
  • How much better than users are information staff
    at finding high-quality information on the Web
    and what leadership do we provide?
  • 5 key actions needed

5
5 Key Actions
  • Get the best from the search engines
  • Go vertical: subject-specific sources
  • Take time to experiment, eg helper software
  • Exploit the invisible Web
  • Actively promote quality searching

6
1. Get the Best from the Search Engines
  • Understand how they work
  • Know their limitations
  • Use advanced features
  • Search more than one
  • Know when not to use them

7
Search Engine Components
  • Crawler follows links
  • Indexer builds database
  • Query processor lets us search
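
The three components above can be sketched as a toy pipeline. This is a minimal illustration only, not how any real search engine is built: the in-memory `TOY_WEB` dictionary, its example URLs, and the function names `crawl`, `build_index` and `search` are all assumptions made for the sketch.

```python
from collections import defaultdict, deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "a.example/home": ("library search guide", ["a.example/tips", "b.example/db"]),
    "a.example/tips": ("search tips for medline", ["a.example/home"]),
    "b.example/db": ("medline database guide", []),
    "c.example/orphan": ("unlinked page about embase", []),
}


def crawl(start):
    """Crawler: follow links breadth-first from a start URL."""
    seen, queue = [], deque([start])
    while queue:
        url = queue.popleft()
        if url in seen or url not in TOY_WEB:
            continue
        seen.append(url)
        queue.extend(TOY_WEB[url][1])
    return seen


def build_index(urls):
    """Indexer: map each word to the set of crawled URLs containing it."""
    index = defaultdict(set)
    for url in urls:
        for word in TOY_WEB[url][0].split():
            index[word].add(url)
    return index


def search(index, query):
    """Query processor: return URLs that contain every query word."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


pages = crawl("a.example/home")
index = build_index(pages)
```

Here `search(index, "medline guide")` returns only `b.example/db`, while `search(index, "embase")` returns nothing: the page mentioning embase exists in the toy web, but no crawled page links to it, so it never reaches the index.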

8
Common Limitations
  • Profit-oriented
  • Paid entries listed at top
  • Out of date
  • Partial site indexing
  • Technically must exclude many sites, eg
      • Password-protected
      • Registration needed
      • Database-driven
      • Hidden search facilities

9
Understanding Google
  • Strengths
      • Coverage
      • Cached pages
      • File types, eg PDF, .doc, .ppt
      • Relevance: link popularity
      • Beyond pages: images, newsgroups
  • Weaknesses
      • Poor Boolean support
      • No truncation
      • Limited date searching
      • Invisible search facilities
      • Two pages per site displayed by default
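
The link popularity strength listed above can be illustrated with a crude inbound-link count. This is a sketch of the general idea only and is not Google's actual ranking method; the `LINKS` graph and both function names are hypothetical.

```python
from collections import Counter

# Hypothetical link graph: page -> pages it links to.
LINKS = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c", "b"],
}


def inbound_link_counts(links):
    """Count how many distinct pages link to each page (self-links ignored)."""
    counts = Counter({page: 0 for page in links})
    for source, targets in links.items():
        for target in set(targets):
            if target != source:
                counts[target] += 1
    return counts


def rank_by_popularity(matches, links):
    """Order matching pages by inbound link count, most linked first."""
    counts = inbound_link_counts(links)
    return sorted(matches, key=lambda page: (-counts[page], page))


ranking = rank_by_popularity(["a", "b", "c", "d"], LINKS)
```

With this graph, page `c` has three inbound links, `b` has two, `a` has one and `d` has none, so `ranking` is `["c", "b", "a", "d"]`.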

10
Google coverage
11
Google search modes
Basic
Advanced
12
Google file types
13
Google newsgroup search
14
Google cached pages 1
15
Google cached pages 2
16
Google Boolean limitations 1
Correct syntax: medline OR embase
17
Google Boolean limitations 2
Correct syntax: medline embase (or use Advanced
Search)
18
Google: no truncation
Use: clinton (tax OR taxes OR taxation)
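
The workaround on this slide, spelling out each word variant in an OR group because truncation is unavailable, can be automated with a small helper. This is a minimal sketch; `or_group` and `build_query` are hypothetical names, and the output is just a query string to paste into a search box, not a call to any search engine API.

```python
def or_group(variants):
    """Join word variants into a parenthesised OR group."""
    cleaned = [v.strip() for v in variants if v.strip()]
    if not cleaned:
        raise ValueError("at least one variant is required")
    if len(cleaned) == 1:
        return cleaned[0]
    return "(" + " OR ".join(cleaned) + ")"


def build_query(required_terms, variant_groups):
    """Combine required terms with one OR group per set of variants."""
    parts = list(required_terms) + [or_group(g) for g in variant_groups]
    return " ".join(parts)


query = build_query(["clinton"], [["tax", "taxes", "taxation"]])
```

Here `query` is `clinton (tax OR taxes OR taxation)`, matching the slide's example.
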
19
Google: few date limits
20
Google hidden features 1
Discovered at www.searchengineshowdown.com
(buried in Google help)
21
Google hidden features 2
Partial URL v Specific Site Search: not possible
on Advanced Search despite Domains limit
22
Other Search Engines
  • Always worth searching more than one, eg
  • All the Web (FAST)
  • AltaVista
  • Lycos/HotBot
  • Northern Light (?)
  • Overlap may be limited
  • Different ranking criteria
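
The point about limited overlap can be checked for any query by comparing the result lists from two engines. The sketch below assumes the result URLs have already been collected by hand; the example URLs and the `overlap_report` name are hypothetical, and no real engine is queried.

```python
def overlap_report(results_a, results_b):
    """Compare two result lists and report shared and unique URLs."""
    set_a, set_b = set(results_a), set(results_b)
    shared = set_a & set_b
    union = set_a | set_b
    return {
        "shared": sorted(shared),
        "only_a": sorted(set_a - set_b),
        "only_b": sorted(set_b - set_a),
        # Jaccard index: shared results as a fraction of all distinct results.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Made-up result lists standing in for two engines' top hits.
engine_a = ["x.example/1", "x.example/2", "y.example/3"]
engine_b = ["y.example/3", "z.example/4"]
report = overlap_report(engine_a, engine_b)
```

For these made-up lists only one of four distinct URLs is shared, so `report["jaccard"]` is 0.25 and searching only one engine would miss at least one result.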

23
2. Go Vertical: specific tools
24
Horses for Courses 1
25
Horses for Courses 2
26
Horses for Courses 3
27
3. Experimentation
  • Try out add-on search software, eg
  • BullsEye Pro
  • Copernic
  • Copernic Summariser

28
BullsEye Pro searching
29
BullsEye Pro Webliographies
30
Copernic
31
Copernic Summariser
32
4. Explore the Invisible Web
  • Material, often of high quality, that general
    search engines can't or won't index
      • Unlinked pages
      • Non-HTML file types, eg audio, video, PDF
      • Authenticated sites
      • Databases
  • Much greater in size than visible Web
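
The categories above can be expressed as a simple checklist. The sketch below is illustrative only: the `Resource` fields and the `invisibility_reasons` function are hypothetical, and real engines differ in what they index (some, for example, do index certain non-HTML file types).

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical description of a Web resource."""
    url: str
    linked_from_elsewhere: bool = True
    content_type: str = "text/html"
    requires_login: bool = False
    database_driven: bool = False


def invisibility_reasons(resource):
    """List the reasons a general search engine might not index a resource."""
    reasons = []
    if not resource.linked_from_elsewhere:
        reasons.append("unlinked page")
    if resource.content_type != "text/html":
        reasons.append("non-HTML file type")
    if resource.requires_login:
        reasons.append("authenticated site")
    if resource.database_driven:
        reasons.append("database")
    return reasons


report = Resource(
    "x.example/report.pdf",
    content_type="application/pdf",
    requires_login=True,
)
reasons = invisibility_reasons(report)
```

An empty list means none of the listed barriers apply; it does not guarantee that the page has actually been indexed.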

33
invisibleweb.com
34
invisible-web.net
35
WebData
36
Librarians' Index to the Internet
37
5. Promote Quality Searching
  • Old sources
  • Old habits
  • New media

38
Old Sources
39
Old Habits
Concept analysis
Search strategy formulation
Critical source selection
Patience
Flexibility
Critical appraisal of search hits
40
New Media
Library Web Site
E-newsletter
http://www.hw.ac.uk/libWWW/irn/irn.html
Weblog
41
Towards a Brighter Future
  • Automatically-generated, accurate metadata
  • Smarter search engines
      • More quality-sensitive
      • More penetrative
  • XML structured data

42
References
  • Sherman, Chris and Price, Gary. The invisible Web:
    uncovering information sources search engines
    can't see. Medford, N.J.: Information Today,
    2001. ISBN 091096551X. (accompanying database at
    http://invisible-web.net)
  • Search Engine Watch: http://www.searchenginewatch.com
  • Search Engine Showdown: www.searchengineshowdown.com