The Invisible Web - PowerPoint PPT Presentation


Title: The Invisible Web


1
The Invisible Web
  • Gary Price, MLIS
  • George Washington University
  • Chris Sherman
  • Associate Editor
  • Search Engine Watch

2
How Search Engines Work
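[Diagram. Labels: The Web; Crawler (URL1, URL2); Indexer (URL3, URL4); Search Engine Database; Your Browser. The example query "Eggs?" returns scored matches (Eggs - 90, Eggo - 81, Ego - 40, Huh? - 10), with "All About Eggs" by S. I. Am as the answer: "Eggs."]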
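
The crawler, indexer, and database flow in this slide can be sketched in a few lines. This is only an illustrative toy, not how any particular search engine is built: the in-memory `WEB` dictionary, the page texts, and the term-count scoring are all assumptions made up for the example.

```python
from collections import Counter, defaultdict

# Hypothetical stand-in for the Web: URL -> (page text, outgoing links).
WEB = {
    "URL1": ("all about eggs eggs eggs", ["URL2", "URL3"]),
    "URL2": ("eggo waffles and eggs", ["URL4"]),
    "URL3": ("ego and the self", []),
    "URL4": ("huh", ["URL1"]),
}

def crawl(seed):
    """Crawler: follow links from the seed, visiting each URL once."""
    seen, queue = set(), [seed]
    while queue:
        url = queue.pop(0)
        if url in seen or url not in WEB:
            continue
        seen.add(url)
        text, links = WEB[url]
        yield url, text
        queue.extend(links)

def build_index(pages):
    """Indexer: build an inverted index, word -> Counter of URL -> count."""
    index = defaultdict(Counter)
    for url, text in pages:
        for word in text.lower().split():
            index[word][url] += 1
    return index

def search(index, query):
    """Database lookup: (url, score) pairs, highest term count first."""
    return index.get(query.lower(), Counter()).most_common()

index = build_index(crawl("URL1"))
results = search(index, "eggs")
```

Anything the crawler never reaches through a link never enters the index, which is the gap the rest of the deck calls the Invisible Web.
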
3
What is the Invisible Web?
  • Stuff that search engine crawlers (spiders) cannot
    -- or will not -- add to their databases
  • 2 to 50 times larger than the visible Web
  • Resources often much higher quality than the
    visible Web

4
What is the Invisible Web?
  • Certain file formats (PDF, Flash, Office files,
    streaming media)
  • Why? They aren't HTML text
  • Most real-time data (stock quotes, weather,
    airline flight info)
  • Why? Ephemeral and storage intensive
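
One way a crawler of this era might skip non-HTML formats is to check the Content-Type header before indexing. The sketch below is an assumption for illustration: the `INDEXABLE_TYPES` set and the `is_indexable` helper are hypothetical names, not taken from any real crawler.

```python
# Assumption: this hypothetical crawler indexes only plain text and HTML.
INDEXABLE_TYPES = {"text/html", "text/plain"}

def is_indexable(content_type_header):
    """Return True if the Content-Type header names an indexable MIME type."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    mime = content_type_header.split(";", 1)[0].strip().lower()
    return mime in INDEXABLE_TYPES
```

Under this rule a PDF (`application/pdf`) or a Flash file would be fetched at most once and then left out of the database.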

5
What is the Invisible Web?
  • Dynamically generated pages (CGI, JavaScript,
    ASP, or most pages with ? in URL)
  • Why? Spider traps
  • Web accessible databases
  • Why? Spiders can't type
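
The "? in URL" rule can be illustrated with a small filter. This is a sketch of the kind of conservative heuristic the slide describes, assuming a crawler that avoids spider traps by refusing anything that looks dynamically generated; the suffix list and function names are hypothetical.

```python
from urllib.parse import urlsplit

# Assumption: script extensions this hypothetical crawler treats as dynamic.
DYNAMIC_SUFFIXES = (".cgi", ".asp")

def looks_dynamic(url):
    """True if the URL has a query string or a known script extension."""
    parts = urlsplit(url)
    return bool(parts.query) or parts.path.lower().endswith(DYNAMIC_SUFFIXES)

def crawlable(urls):
    """Keep only the URLs the crawler is willing to follow."""
    return [u for u in urls if not looks_dynamic(u)]
```

A database search form is missed for a different reason: its results exist only after someone types a query, so there is no link for the crawler to follow in the first place.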

6
Invisible Web Gateways
  • Intelliseek
  • http://www.invisibleweb.com
  • http://beta.profusion.com
  • Complete Planet
  • http://www.completeplanet.com/
  • Librarians' Index to the Internet
  • http://www.lii.org

7
The Invisible Web and The Librarian
  • The Need For Knowledge!
  • Awareness that the IW Exists
  • Maybe the IW Holds the Content Your Users Can't Find! What is the
    cost in both wasted time/effort and total
    frustration?
  • Let Others Know About the IW
  • Awareness of The Synonyms
  • Invisible Web
  • Deep Web
  • Hidden Web
  • Let the Content be Your Calling Card
  • Focus Less on the Amount of IW Data

8
The Invisible Web and The Librarian
  • Why is the IW Useful to the Librarian and the End
    User?
  • Quality of Content (Authority)
  • Deep Content on Subject Area (Comprehensiveness)
  • Focused Databases (Limited Scope)
  • Smaller Universe of Documents to Search (Maximize
    Precision/Recall)
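
Precision and recall have standard definitions: precision is the share of retrieved documents that are relevant, and recall is the share of relevant documents that were retrieved. A minimal sketch follows; the document IDs are invented for the example.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search, given document IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical search: 4 documents returned, 3 actually relevant, 2 overlap.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d1", "d2", "d5"])
```

A focused database shrinks the pool of irrelevant documents that can be retrieved, which is why a smaller universe tends to help precision.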

9
The Invisible Web and The Librarian
  • Why is the IW Useful to the Librarian and the End User?
  • Material Unavailable Elsewhere on the Web
    (Uniqueness)
  • Many Options to Limit, Sort, Interact with the
    Data (Maximize Precision)
  • Timeliness vs. Time Lag of General Search Tools
    (Currency)

10
The Invisible Web and The Librarian
  • The IW, The Librarian, The Future
  • What Happens If/When the General Search Tools
    Crawl IW Material? Good News? Bad News?
  • General Search Tools May NOT Offer Many
    Interactive/Limiting Tools
  • May Not be Updated/Refreshed (time lag) as
    Frequently
  • Timeliness, making current info available, is one
    of the things the Net does well.

11
The Invisible Web and The Librarian
  • The IW, The Librarian, The Future
  • The Search Engine Business: Will IW Material be a
    Priority?
  • Just One Dialog or SilverPlatter Database?
  • NO, in Terms of Content!!!
  • Yes, Common Interface, Syntax
  • Perhaps XML will Assist

12
The Invisible Web and The Librarian
  • Challenges
  • It's Not The Magic Bullet. It's a Tool
  • We Still Need Traditional Online Databases
  • Learning Curve, Sorry!
  • Database Selection, When To Use the IW?
  • Numerous Interfaces, Syntax
  • A Non-Stop Flow of New Material

13
The Invisible Web and The Librarian
  • Things To Do!
  • Build Your Own Collections: Internet Resource
    Collection Development
  • Mine Entire Sites, Often the IW Material Gets
    Little or No Notice In Reviews
  • Create Links When Possible DIRECT to the
    Interface.
  • Save the Time of the Web Researcher
  • Keep Current

14
The Invisible Web and The Librarian
Types of IW Content in Librarian Terms
  • Bibliographic: OPACs, Subject Bibs
  • Non-Bibliographic: Full-Text, Numeric,
    Graphic, Directory, Real-Time

15
Future Trends
  • Killer apps will lead the way
  • Research Index (CiteSeer)
  • Search engines will work harder to find
    Invisible Web content
  • Inktomi (Index Connect, Ultraseek)
  • WhizBang (wrappers)
  • No matter what, there will always be a problem!

16
Coming Soon
Available July 2001, CyberAge Books
ISBN 0-910965-51-X, http://www.invisible-web.net
17
Invisible Web: Computer Science
  • McAfee World Virus Map
  • http://www.mcafee.com
  • ResearchIndex
  • http://www.researchindex.com

18
Invisible Web: Company Research
  • European High-Tech Industry Database
  • http://www.tornado-insider.com/radar/
  • Kompass
  • http://www.kompass.com

19
Invisible Web: Intellectual Property
  • Delphion Intellectual Property Network
  • http://www.delphion.com/
  • esp@cenet (European Patent Office) Patent
    Database
  • http://ep.espacenet.com/

20
Invisible Web: Dictionaries and Languages
  • EuroDicAutom
  • http://eurodic.ip.lu
  • Verbix
  • http://www.verbix.com/index.html

21
Invisible Web: Art and Artists
  • ADAM (Art, Design, Architecture and Media
    Information Gateway)
  • http://adam.ac.uk/
  • Artcyclopedia
  • http://www.artcyclopedia.com/

22
Invisible Web: Real-Time Information
  • Flight Tracker
  • http://www.trip.com/ft/home/0,2096,1-1,00.shtml
  • J-Track 3-D Satellite Locator
  • http://liftoff.msfc.nasa.gov/realtime/JTrack/Spacecraft.html

23
Invisible Web: Maps and Driving Directions
  • MapBlast
  • http://www.mapblast.com
  • Streetmap.co.uk
  • http://www.streetmap.co.uk/

24
Invisible Web: Government Info
  • Parline Database
  • http://www.ipu.org
  • United Nations Daily Press Briefings
  • http://www.un.org/News/

25
Invisible Web: Health and Medicine
  • Economics of Tobacco Control Database
  • http://www1.worldbank.org/tobacco/database.asp
  • International Digest of Health Legislation
  • http://www.who.int

26
Invisible Web: News and Current Events
  • Cold North Wind Newspaper Archive Project
  • http://www.coldnorthwind.com
  • Financial Times Global Archive
  • http://www.globalarchive.ft.com

27
Invisible Web: Science
  • Great Barrier Reef Online Image Catalogue
  • http://www.gbrmpa.gov.au/corp_site/info_services/library/index.html
  • Nuclear Explosions Database
  • http://www.ausseis.gov.au/databases

28
Invisible Web: Transportation
  • Equasis (Merchant Ships)
  • http://www.equasis.org/
  • World Aircraft Accident Summary (WAAS) Fatal
    Airline Accident Subset
  • http://www.waasinfo.net/