Search Engine Comparisons - PowerPoint PPT Presentation


1
Search Engine Comparisons
  • By Thomie Ventura

2
Search Engines
  • Today, much, but not all, of the work we do
    revolves around the web
  • The Internet is accessible to almost anyone
  • Impact on businesses, schools, professionals,
    home users
  • The web is changing every day, but not everything
    is ACCESSIBLE yet

3
FTP Servers
  • Main way of sharing files on the Internet up to 1990
  • FTP Servers and FTP Clients
  • Downside
  • Servers were mostly known through word of mouth
  • Not everyone was setting up their servers

4
Grandfather, Grandmother, Mother
  • Archie (Grandfather)
  • Used FTP file Servers
  • Veronica (Grandmother)
  • Used Gopher file Servers
  • World Wide Web Wanderer (Mother)
  • First Robot
  • Caused Controversy
  • Are Robots a good or bad thing for the Internet?

5
Web Search
  • What exactly does it mean?
  • Does it involve tools?
  • Accessing proprietary databases such as
    www.Factiva.com or www.dialog.com
  • We'll focus on web search as an open web
    source, and look at it from a searcher's point of view

6
Difficulty Coping
  • Volume and Speed of the web and Search Engines
  • Something new happens each day
  • So many things to do, so little time to do them
  • Dynamic nature of web searching (indexing new
    documents)
  • Staying up to date with traditional tools (which
    also undergo changes)
  • Other random issues that arise every day

7
Will an open web search engine always have my
answers?
  • Questions that should arise about searching the
    web
  • How long did it take to get it?
  • What is the database or search engine?
  • What kinds of questions will it help me answer?
  • Open web will not always give me the answer
  • What can it be used for?

8
Quality of Information
  • Anyone can become a publisher
  • Evaluating content is crucial
  • Reputation
  • Background
  • Qualifications
  • Where did it come from?
  • What is its purpose?
  • Relevant to my topic?

9
Limitations of General Web Search Tools
  • Spiders don't crawl in real time
  • Recency
  • Linked or Submitted Sites
  • A website with 1,000 pages does not mean that
    Search Engines make all of them accessible
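
A minimal sketch of why a spider may index only part of a site, assuming a toy in-memory link graph. The page names and the `crawl` helper are hypothetical and do not describe any real engine's crawler. The crawl follows links breadth-first from the home page and stops at a page budget, so pages with no inbound links and pages beyond the budget never reach the index.

```python
from collections import deque

def crawl(links, start, max_pages):
    """Breadth-first crawl over an in-memory link graph, stopping at max_pages."""
    indexed = []
    seen = {start}
    queue = deque([start])
    while queue and len(indexed) < max_pages:
        page = queue.popleft()
        indexed.append(page)
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return indexed

# Hypothetical site: "orphan" has no inbound links, so a spider cannot reach it.
site = {
    "home": ["about", "products"],
    "about": [],
    "products": ["item1", "item2"],
    "item1": [],
    "item2": [],
    "orphan": [],
}

indexed = crawl(site, "home", max_pages=4)
missed = sorted(set(site) - set(indexed))
print(indexed)  # ['home', 'about', 'products', 'item1']
print(missed)   # ['item2', 'orphan']
```

Raising `max_pages` picks up `item2`, but `orphan` stays invisible until some crawled page links to it or the site is submitted directly.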

10
Invisible or Hidden Web resources
  • Examples
  • Interactive resources that return customized pages
  • Registration
  • Why is it hidden?
  • Created on the fly
  • Spiders don't fill in registration forms
  • No-robots tag (robots exclusion)
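
One common form of robot exclusion is a robots.txt file at the root of a site. The sketch below uses Python's standard `urllib.robotparser` to show how a well-behaved spider could check a URL before fetching it. The robots.txt content, the `ExampleSpider` user-agent name, and the example.com URLs are assumptions made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for an example site.
robots_txt = """\
User-agent: *
Disallow: /members/
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

def allowed(url):
    """Return True if the (hypothetical) ExampleSpider may fetch this URL."""
    return parser.can_fetch("ExampleSpider", url)

print(allowed("http://example.com/index.html"))       # True
print(allowed("http://example.com/members/profile"))  # False
```

Pages under `/members/` stay out of the index not because the spider fails to find them, but because the site owner asked for them to be skipped.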

11
Hidden is not always bad
  • Research and Effort
  • Without proper tools, we can make large databases
    even larger
  • Google
  • Altavista
  • Excite
  • Distributing Information Properly

12
Specialized, Focused, and Site-Specific Search Tools
  • Necessary and Important
  • Hidden Web is out of reach of general purpose
    Search Engines
  • More Precision than Recall
  • Examples
  • www.Psychcrawler.com, www.Inomics.com,
    http://newssearch.bbc.co.uk/ksenglish/query.htm
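
Precision and recall can be made concrete with a small calculation. Precision is the share of returned documents that are relevant; recall is the share of relevant documents that were returned. The sketch below uses made-up document IDs and two hypothetical result lists, one from a specialized tool and one from a general engine.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a result list against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical data: four documents are truly relevant to the query.
relevant = {"d1", "d2", "d3", "d4"}
specialized = ["d1", "d2"]                                   # few results, all on topic
general = ["d1", "d2", "d3", "d4", "x1", "x2", "x3", "x4"]   # everything, plus noise

print(precision_recall(specialized, relevant))  # (1.0, 0.5)
print(precision_recall(general, relevant))      # (0.5, 1.0)
```

The specialized list misses half of the relevant documents but returns nothing off topic, which is the "more precision than recall" trade-off.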

13
Identifying and Collecting Specialized Engines
  • Profusion
  • http://www.profusion.com
  • Librarians' Index
  • Covers a large number of specialized and invisible
    web databases
  • http://www.lii.org

14
Meta Search Engines
  • Major Disadvantages
  • You get it all!! High recall, low precision
  • Only the basic features of each Search Engine are used
  • Queries are sent to pay-for-placement engines
  • A good metasearch engine
  • www.vivisimo.com
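
A rough sketch of the merging step in a metasearch engine, assuming each underlying engine returns a ranked list of URLs. The engine names, URLs, and the reciprocal-rank scoring are assumptions chosen for illustration; this is not how Vivisimo or any other named engine works.

```python
from collections import defaultdict

def merge_results(result_lists):
    """Merge ranked URL lists; URLs returned by several engines score higher."""
    scores = defaultdict(float)
    for urls in result_lists.values():
        for rank, url in enumerate(urls, start=1):
            scores[url] += 1.0 / rank  # rank 1 adds 1.0, rank 2 adds 0.5, ...
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda url: (-scores[url], url))

# Hypothetical ranked results from two engines.
results = {
    "engine_a": ["http://a.example/1", "http://b.example/2"],
    "engine_b": ["http://b.example/2", "http://c.example/3"],
}

merged = merge_results(results)
print(merged)
# ['http://b.example/2', 'http://a.example/1', 'http://c.example/3']
```

Duplicates collapse into one entry, but every URL from every engine still appears in the merged list, which is where the high recall and low precision come from.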

15
Old Pages, GONE!
  • Trying to find old pages?
  • Contact webmaster
  • Fortunately
  • Archiving Old Material
  • Example
  • http://www.clinton.nara.gov/index.html
  • Alexa Research
  • http://archive.alexa.com/
  • Carries over 18 terabytes of data covering some 5
    million Web sites and some 1.9 billion pages
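
To get a feel for these archive figures, a quick back-of-the-envelope calculation helps. It assumes decimal terabytes and treats the rounded numbers on the slide as exact, so the results are rough averages only.

```python
archive_bytes = 18 * 10**12   # "over 18 terabytes", taken as decimal terabytes
pages = 1_900_000_000         # "some 1.9 billion pages"
sites = 5_000_000             # "some 5 million Web sites"

bytes_per_page = archive_bytes / pages
pages_per_site = pages / sites

print(round(bytes_per_page / 1000, 1), "KB per page on average")  # 9.5
print(round(pages_per_site), "pages per site on average")         # 380
```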

16
Search Engine Sizes
  • This is a search engine size analysis as of
    December 11, 2001
  • Google Dominates

17
Sizes Over Time

18
Closer Look

19
Dealing with Coping
  • Use the Search Engine
  • Conduct research on a topic
  • This will get you familiar with the search engine
  • You can see how results are displayed
  • Relevancy of returned documents
  • Lets you gather your own bookmarks

20
Understanding limitations
  • What to do with these limitations?
  • Know limitations
  • Use more than one search engine
  • Use specialized search engines that go deeper
    into a site to collect more information
  • Use invisible web resources
  • Use web directories, and bookmark important sites

21
Ability to Search Multimedia
  • Now available, but still expanding
  • What used to take weeks is now instant
  • Search tools provide access to video and audio
    material through non-text mechanisms, e.g.
    searching for a specific background or type
    of color
  • Still image tools
  • Google, Altavista, and Fast use the text
    surrounding an image
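
A minimal sketch of text-based image search, assuming an engine describes each image by its `alt` attribute plus the text of the paragraph that contains it. The `ImageTextIndexer` class, the `search_images` helper, and the sample HTML are hypothetical; the actual methods used by Google, Altavista, or Fast are not described in this presentation.

```python
from html.parser import HTMLParser

class ImageTextIndexer(HTMLParser):
    """Map each <img> src to words from its alt text and enclosing paragraph."""

    def __init__(self):
        super().__init__()
        self.images = {}     # src -> list of lowercase words
        self._pending = []   # images seen in the current paragraph
        self._text = []      # words seen in the current paragraph

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._flush()
        elif tag == "img":
            attrs = dict(attrs)
            src = attrs.get("src")
            if src:
                self.images[src] = (attrs.get("alt") or "").lower().split()
                self._pending.append(src)

    def handle_data(self, data):
        self._text.extend(data.lower().split())

    def handle_endtag(self, tag):
        if tag == "p":
            self._flush()

    def close(self):
        super().close()
        self._flush()

    def _flush(self):
        # Attach the paragraph's words to every image found inside it.
        for src in self._pending:
            self.images[src].extend(self._text)
        self._pending, self._text = [], []

def search_images(images, word):
    """Return the image sources whose associated words include the query word."""
    return sorted(src for src, words in images.items() if word.lower() in words)

page = """
<p>Golden Gate Bridge at sunset <img src="bridge.jpg" alt="bridge photo"></p>
<p>Our office cat <img src="cat.jpg"></p>
"""

indexer = ImageTextIndexer()
indexer.feed(page)
indexer.close()
print(search_images(indexer.images, "sunset"))  # ['bridge.jpg']
print(search_images(indexer.images, "cat"))     # ['cat.jpg']
```

Note that `cat.jpg` is found only through the surrounding words; nothing in the image itself is examined, which is the limitation that non-text multimedia tools try to address.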

22
Become Aware of Multimedia Search
  • Video Searches
  • Virage www.virage.com
  • TVeyes www.tveyes.com
  • ShadowTv www.shadowtv.com
  • Wordwave www.wordwave.com
  • SpeechBot (keyword search engine demo by Compaq,
    uses speech technology to create real-time
    transcripts) www.speechbot.com
  • Image Searches
  • Webseek (search or browse criteria in image)
    www.ctr.columbia.edu/webseek/
  • Visoo (uses software that looks for words
    embedded in images) www.visoo.com

23
Making Old Pages Stay
  • Long Term?
  • Offer comments (suggest how material can be made
    more accessible and searchable; a great archive
    of content is a hassle without the right means
    of accessing it)
  • Short Term?
  • Take advantage of Google's cache feature (Google
    crawls a site and, unless the site disallows it,
    stores a copy on its servers; if the site is
    gone, the copy is still on Google's servers. In
    the search results, click "Cached" next to the
    URL. The copy will not always be there: the next
    time the spider crawls the site and finds the
    page missing, the copy is no longer kept)
  • www.savethis.com (lets you save web pages, and
    access them)