1
Things You Just Have to Know About Search Engines
  • Ran Hock
  • Online Strategies
  • May 14, 2002
  • InfoToday 2002

2
Things You Just Have to Know About Search Engines
  • 1 - No Search Engine Covers Everything
  • 2 - Different Engines "Miss" and Find
    Different Things
  • 3 - Large Numbers Aren't Necessarily Bad Searches
  • 4 - All Search Engines Have Techniques That Allow
    You to Improve Results

3
Things You Just Have to Know About Search Engines
  • 5 - Metasearch engines
    are not "search engines"
  • 6 - Google is great, but not the only one you
    should use.
  • 7 - Some Things Change, Some Don't

4
1 - No Search Engine Covers Everything
  • There are pages no engine covers: invisible pages
  • Un-linked pages, database pages, password-protected
    sites, deep pages, etc.
  • Different engines "miss" and find different
    things (Point 2)

5
2 - Different Engines Find and Miss Different
Things
  • Each engine may find something others missed.
  • Even 2nd tier engines find things missed by the
    top 3
  • Consider the results of the following search on
    erris head sailing

6
2 - Different Engines Find and Miss Different
Things

7
2 - Different Engines Find and Miss
Different Things
  • Of the 20 different records retrieved by all the
    engines, Google found (only) 14 (70%)
  • Google missed 6 (30%)
  • If you had searched Google, then just one more
    engine, your retrieval would have increased by
    15%
  • Even HotBot found 2 that the other three engines
    missed.
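
The arithmetic behind these figures can be sketched with set operations. This is a minimal illustration, not the presenter's method: only the totals (20 pooled records, 14 found by Google) come from the slide, while the record IDs and the `second` engine's hits are made up for the example.

```python
def coverage(engine_hits, all_hits):
    """Fraction of the pooled records that one engine retrieved."""
    return len(engine_hits & all_hits) / len(all_hits)


def gain_from_adding(base, extra):
    """Relative increase in retrieval when a second engine is added."""
    return len(extra - base) / len(base)


pooled = set(range(1, 21))   # 20 distinct records (total from the slide)
google = set(range(1, 15))   # 14 found by Google (total from the slide)
second = {3, 7, 15, 16}      # hypothetical second engine, IDs invented

print(coverage(google, pooled))   # 0.7
print(len(pooled - google))       # 6
print(round(gain_from_adding(google, second), 3))
```

The gain depends entirely on which second engine is chosen, so the last figure is only as meaningful as the invented `second` set.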

8
2 - Different Engines Find and Miss Different
Things - Why ?
  • Indexing "policies"
      • What words and other items get indexed
      • How those things are "parsed"
  • Crawling differences
      • Starting points
      • Depth / breadth of crawling, etc.
  • Spam policies
  • Ranking

9
3 - Large Numbers Aren't Necessarily Bad Searches
  • Too many results is the most common complaint
  • You're not obligated to look at all of them
  • All engines use some form of relevance ranking
  • Relevance ranking does, to some degree at least,
    the same things we do to find the best items
  • What relevance ranking uses

10
3 - Large Numbers Aren't Necessarily Bad Searches
  • Relevance ranking uses some combination of:
      • Popularity
      • Frequency of terms
      • Weighting by field (e.g., Title counts more than
        Summary)
      • Proximity of terms
      • Weighting by size of the type
      • Weighting according to the order in which the
        searcher entered terms
      • Etc.
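
As a rough illustration of how such factors might be combined, here is a toy scoring function. It is a sketch under stated assumptions, not any engine's actual algorithm: the weights, the `inlinks` popularity field, and the two sample documents are all hypothetical, and only three of the listed factors (term frequency, field weighting, popularity) are modelled.

```python
import math


def relevance(doc, query_terms, w_title=3.0, w_body=1.0, w_pop=0.5):
    """Toy score: field-weighted term frequency plus a popularity term."""
    terms = [t.lower() for t in query_terms]
    title_words = doc["title"].lower().split()
    body_words = doc["body"].lower().split()
    title_hits = sum(title_words.count(t) for t in terms)  # title counts more
    body_hits = sum(body_words.count(t) for t in terms)
    popularity = math.log1p(doc.get("inlinks", 0))         # assumed link count
    return w_title * title_hits + w_body * body_hits + w_pop * popularity


# Hypothetical documents, invented for the example.
docs = [
    {"title": "Trip notes", "body": "we went sailing near erris head", "inlinks": 2},
    {"title": "Sailing around Erris Head", "body": "a coastal route", "inlinks": 2},
]
query = ["erris", "head", "sailing"]
ranked = sorted(docs, key=lambda d: relevance(d, query), reverse=True)
print([d["title"] for d in ranked])
```

With equal popularity, the document that carries the terms in its title ranks first because of the higher field weight.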

11
3 - Large Numbers Aren't Necessarily Bad Searches
  • Most search engines automatically enhance your
    search:
      • Automatic phrase identification
      • Word variants (and/or truncation)
      • Case sensitivity
      • Analysis of documents in the database (links,
        term association, associative networks, cluster
        analysis, co-occurrence, etc.)
      • Etc.
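
Two of these enhancements, case handling and word variants, can be sketched in a few lines. The variant rule below (add or strip a trailing "s") is a deliberately naive assumption for illustration; real engines use their own, more elaborate rules.

```python
def variants(term):
    """Return the lowercased term plus a naive singular/plural variant."""
    t = term.lower()
    forms = {t}
    if t.endswith("s") and len(t) > 3:
        forms.add(t[:-1])      # engines -> engine
    else:
        forms.add(t + "s")     # boat -> boats
    return forms


def enhanced_match(query_terms, text):
    """True if every query term, or one of its variants, occurs in the text."""
    words = set(text.lower().split())
    return all(variants(t) & words for t in query_terms)


print(enhanced_match(["Engines"], "a search engine"))  # True
print(enhanced_match(["boat"], "sailing ship"))        # False
```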

12
Automatic Re-Write - AllTheWeb

13
4 - All Search Engines Provide Options for You to
Enhance Your Search
  • Field Searching
      • title
      • URL
      • date
      • language
      • etc.
  • Boolean (yes, Boolean, which is neither
    difficult nor bad)
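
What a field-restricted Boolean search does can be sketched as a filter over records. The query syntax in the comment (`title:... AND NOT url:...`) is a generic illustration, since each engine has its own syntax, and the pages and example.com URLs are invented.

```python
def field_contains(doc, field, term):
    """Case-insensitive substring test restricted to one field."""
    return term.lower() in doc.get(field, "").lower()


# Hypothetical index entries, invented for the example.
pages = [
    {"title": "Sailing guide", "url": "http://example.com/guide"},
    {"title": "Sailing forum", "url": "http://example.com/forum/sailing"},
    {"title": "Hill walking", "url": "http://example.com/walks"},
]

# Roughly equivalent to: title:sailing AND NOT url:forum
hits = [p for p in pages
        if field_contains(p, "title", "sailing")
        and not field_contains(p, "url", "forum")]
print([p["url"] for p in hits])

# Roughly equivalent to: title:sailing OR title:walking
either = [p for p in pages
          if field_contains(p, "title", "sailing")
          or field_contains(p, "title", "walking")]
print(len(either))
```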

14
4 - All Search Engines Provide Options for You to
Enhance Your Search
  • How do you know about these options?
  • Use the Advanced Search page
  • Read the documentation
  • ________________

15
4 - All Search Engines Provide Options for You to
Enhance Your Search
  • Use the Advanced Search page

16
(No Transcript)

17
5 - Metasearch engines are not search engines
  • Consider the following example of a search done
    in individual engines, then in metasearch engines

18
Search done for geologic resources worcester

19
5 - Metasearch engines are not search
engines
  • Most don't search all of the largest engines
  • Most don't give you more than 10 or 20 records
    from each engine
  • Most don't convey your full query syntax to the
    target engines
  • Most give paid sites first
  • Client-side metasearch programs, e.g., Copernic
    and Bulls-Eye, do NOT have the above problems.
  • Even online metasearch engines have occasional
    socially redeeming features (Vivisimo's
    clustering).
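
The effect of the per-engine record limit can be sketched as follows. This is a simplified model of a metasearch merge, not how any particular service works; the engine names, URLs, and the `per_engine_limit` parameter are hypothetical.

```python
def metasearch(results_by_engine, per_engine_limit=10):
    """Merge the top results of each engine, dropping duplicate URLs."""
    merged, seen = [], set()
    for engine, urls in results_by_engine.items():
        for url in urls[:per_engine_limit]:   # only the first few per engine
            if url not in seen:
                seen.add(url)
                merged.append((url, engine))
    return merged


# Invented result lists for two made-up engines.
results = {
    "engine_a": ["u1", "u2", "u3"],
    "engine_b": ["u2", "u4", "u5"],
}

limited = metasearch(results, per_engine_limit=2)
full = metasearch(results, per_engine_limit=100)
print(len(limited), len(full))   # 3 5
```

A tight limit hides records that the underlying engines did retrieve, which is one reason a metasearch result set is not a substitute for searching the engines directly.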

20
6 - Google is Great, But Not the Only One You
Should Use
  • Points 1 and 2 - No search engine finds
    everything and different engines find different
    things

21
6 - Google is Great, But Not the Only One You
Should Use
  • Great Because of:
      • Size
      • Popularity-based ranking
      • Unique content
          • newsgroups
          • PDFs and other file types
          • largest image collection
      • Dandy little features like addresses,
        definitions, etc.
      • Pretty good search options

22
6 - Google is Great, But Not the Only One You
Should Use
  • But Doesn't Have:
      • Everything
      • Truncation and NEAR that AltaVista has
      • As much news coverage as AllTheWeb
      • As much currentness as AllTheWeb (maybe)
      • Etc.
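
For readers unfamiliar with the two operators, the sketch below shows what truncation and a NEAR-style proximity test do. It is an illustration only: the default window of 10 words is an assumption, the whitespace tokenization is simplistic, and the sample sentences are invented.

```python
def near(text, a, b, window=10):
    """True if words a and b occur within `window` words of each other."""
    words = text.lower().split()
    pos_a = [i for i, w in enumerate(words) if w == a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == b.lower()]
    return any(abs(i - j) <= window for i in pos_a for j in pos_b)


def truncation_match(text, stem):
    """True if any word starts with the stem, as in a query like sail*."""
    return any(w.startswith(stem.lower()) for w in text.lower().split())


sample = "geologic maps and resources of worcester county"
print(near(sample, "geologic", "worcester", window=5))   # True
print(near(sample, "geologic", "worcester", window=3))   # False
print(truncation_match("sailing near erris head", "sail"))  # True
```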

23
7 - Search Engines Change
  • In some ways a lot, in other ways very little

24
7 - Search Engines Change
  • Areas of little change
  • For most engines: how they do basic things such
    as phrases, Boolean, truncation, field searching,
    etc.

25
7 - Search Engines Change
  • Areas of frequent/considerable change
  • Some come, some go
      • Gone: Go/InfoSeek et al.
      • Arrived: WiseNut, Teoma
  • How things are arranged on the home page (esp.
    AltaVista)
  • Partners (which directory they use, featured
    partners and tools, etc.)
  • Added content, esp. content types (PDFs,
    newsgroups, etc., in Google)

26
In Summary
  • 1 - No Search Engine Covers Everything
  • 2 - Different Engines "Miss" and Find Different
    Things
  • 3 - Large Numbers Aren't Necessarily Bad Searches
  • 4 - All Search Engines Have Techniques That Allow
    You to Improve Results
  • 5 - Metasearch engines are not "search engines"
  • 6 - Google is great, but not the only one you
    should use.
  • 7 - Some Things Change, Some Don't

27
  • Ran Hock
  • Online Strategies
  • 1-800-871-4033
  • www.onstrat.com
  • ran_at_onstrat.com