Title: Shayna Keces
1. Basic Internet Search Techniques
- Or
- How to really find information on the internet
August 2004
2. Agenda
- Size of Internet
- Types of search engines
- Search strategies
- Some hints on selecting search strategies
- Interpretation of search results
- Tutorials on searching and search engines
3. Size of Internet/World Wide Web
- July 2000: 2.1 billion web pages; est. 4 billion pages by early 2001 (some estimates are much higher if the invisible or deep web is counted)
- Size of search engine databases:
- Google 4.28 billion
- Fast (alltheweb) 2.1 billion
- AltaVista 1.1 billion
- Yahoo 2 million catalogued
4. Search strategies
- Do not:
- use the search button
- use a string of keywords without specifying Boolean properties
- use upper case unless it is part of your strategy
- use NOT or - unless absolutely sure it is necessary:
- it can eliminate pages unexpectedly
- the format is not standardized across search engines
5. Search Strategies
- Do:
- Consider what type of resource will best answer your question and search for that resource (e.g. a dictionary or a certain type of web page)
- Think of a list of keywords that will narrow or broaden your search, keeping in mind that on the internet, narrowing your search is usually better
- Stick to a small list of search engines and learn the search syntax for the search engine you're using
6. Types of search engines
- Keyword or robot based (builds a database)
- Directory based (categories indexed by people rather than computer)
- Annotated directory-based search engines
- Meta indexes (can combine searches or allow you to search a variety of engines individually)
- Specialized search engines
7. Keyword or robot based Search Engines
- Large database of web pages
- No human involvement and no quality control
- You can submit a website, or the engine will find some on its own
- Searches full text to a certain level; does not search the deep or invisible web
- Google (www.google.com)
- Alta Vista (www.altavista.com)
- Fast (www.alltheweb.com)
- Wisenut (www.wisenut.com)
8. Google (www.google.com)
- Presently largest database (ca. 4 billion)
- Very sophisticated placement of results, particularly good for popular sites and company sites
- Advanced search can limit search to title of page or to URL
- Implied AND
- + for stop words
- If you want OR, it needs to be expressed in caps
- Not case sensitive
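The query syntax on this slide can be sketched in code. The following is only an illustration, assuming the 2004-era syntax of implied AND between terms, a leading + to force a stop word into the search, and OR written in capitals. The function `build_query` and its parameter names are hypothetical and are not part of any Google API.

```python
from urllib.parse import quote_plus


def build_query(all_terms, any_terms=(), stop_words=()):
    """Build a query string using the syntax described on this slide.

    all_terms  -- terms that must all appear (joined by the implied AND)
    any_terms  -- alternatives, joined with an upper-case OR
    stop_words -- common words to force into the search with a leading +
    """
    parts = [t.lower() for t in all_terms]          # search is not case sensitive
    parts += ["+" + w.lower() for w in stop_words]  # + keeps a stop word
    if any_terms:
        parts.append(" OR ".join(t.lower() for t in any_terms))
    return " ".join(parts)


query = build_query(["Ottawa", "library"], any_terms=["hours", "schedule"])
print(query)              # ottawa library hours OR schedule
print(quote_plus(query))  # URL-encoded form of the same query
```

Only the operator OR is kept in upper case; the search terms themselves are lowered because case makes no difference to the results.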
9. Google (www.google.com) cont.
- No stemming or truncation (except on an ad hoc basis controlled by Google)
- Description shows keywords in context
- Cached pages helpful for sites not working
- Searches some formats not found in other search engines (e.g. Adobe Acrobat and PostScript files, Excel, PowerPoint, and Word files, as well as rich text files)
- Innovative in new features (e.g. ability to convert measurements, such as 4 miles in km). See www.google.ca/help/features.html for a description of features.
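The measurement conversion used as an example above is simple arithmetic. The sketch below shows the calculation behind "4 miles in km"; it illustrates the result only and makes no claim about how Google computes it. The function name `miles_to_km` is hypothetical.

```python
KM_PER_MILE = 1.609344  # exact, by the international definition of the mile


def miles_to_km(miles):
    """Convert a distance in miles to kilometres."""
    return miles * KM_PER_MILE


print(round(miles_to_km(4), 3))  # 6.437
```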
10. AltaVista (www.altavista.com)
- One of the larger search engines (1.3 billion pages/objects or more)
- Particularly good for finding less popular sites
- Implied AND, but the default is known to change
- Case sensitive when word is in quotations
- Stemming with * at end or in middle of words
- Has related terms, which help you focus your search
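To show what stemming with a wildcard does, here is a small sketch using Python's standard `fnmatch` module, assuming the wildcard character is * as in AltaVista's syntax. It only mimics the matching idea on a made-up word list and says nothing about how AltaVista implements it; `WORDS` and `expand` are hypothetical names.

```python
from fnmatch import fnmatchcase

# Hypothetical vocabulary standing in for words found on indexed pages.
WORDS = ["color", "colour", "colours", "librarian", "libraries", "library"]


def expand(pattern, words=WORDS):
    """Return the words matched by a pattern in which * stands for any run of characters."""
    return [w for w in words if fnmatchcase(w, pattern)]


print(expand("librar*"))  # wildcard at the end: librarian, libraries, library
print(expand("col*r"))    # wildcard in the middle: color, colour
```

A wildcard at the end of a word broadens a search to all endings, while one in the middle is useful for spelling variants.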
11. AltaVista Advanced Search
- Has a "build a Boolean search" facility, or you can create your own query
- Can specify that pages be from a certain country based on country codes, so results will not include .com etc.
- Can specify dates of last modification
12. Directory-based Search Engines
- Indexed by individuals, so subject searches will be more accurate
- Smaller database than robot engines
- Used mainly for finding a good site on a general topic
- Yahoo (www.yahoo.com or ca.yahoo.com)
- About (about.com)
- Looksmart (www.looksmart.com)
13. Yahoo (ca.yahoo.com)
- Most popular of directory based search engines
- Many different versions (international versions have the same pages as the others, but local options are supplied first)
- Now has its own web search, which is competing with Google's
- Can search by categories and sub-categories
14. Annotated directory-based search engines
- Because it is annotated, the database is even smaller than that of a directory-based engine
- Quality of web pages is better
- Web pages often rated
- Librarians' Index to the Internet (lii.org)
- The Internet Public Library (www.ipl.org/)
15. Librarians' Index to the Internet (lii.org)
- Topical list of high quality websites with abstracts and qualitative analysis
- Can winnow down by topic or use the search capability
- Only websites which meet the standards of the editors are included
- Provides the date a site was added to the index as well as the date the lii entry was last updated
16. Meta indexes
- One site searches more than one search engine
- Results can be separated or combined
- Sometimes a problem in interpreting the question equally effectively for all search engines
- Used if not sure which search engine will give you best results and/or for obscure topics
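The idea of combining results from several engines can be sketched as follows. This is a simplified illustration with made-up data, not the method used by any particular meta index: the engine names, the URLs, and the function `combine_results` are all hypothetical, and duplicates are detected only by ignoring a trailing slash.

```python
def combine_results(results_by_engine):
    """Merge per-engine result lists into one list without duplicate URLs.

    results_by_engine maps an engine name to its ranked list of URLs.
    Returns (url, engines) pairs, with pages found by more engines first.
    """
    found_by = {}
    for engine, urls in results_by_engine.items():
        for url in urls:
            key = url.rstrip("/")  # treat "page" and "page/" as the same result
            engines = found_by.setdefault(key, [])
            if engine not in engines:
                engines.append(engine)
    # sorted() is stable, so ties keep the order in which pages were first seen.
    return sorted(found_by.items(), key=lambda item: -len(item[1]))


separated = {
    "engine_a": ["www.example.org/a", "www.example.org/b"],
    "engine_b": ["www.example.org/b/", "www.example.org/c"],
}
for url, engines in combine_results(separated):
    print(url, engines)
```

The `separated` dictionary corresponds to showing results per engine, and the return value of `combine_results` to the combined view.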
17. Meta indexes examples
- Dogpile (www.dogpile.com)
- Metacrawler (www.metacrawler.com/index.html)
- Surfwax (www.surfwax.com)
- Hotbot (www.hotbot.com)
18. Specialized Search Engines
- Geographic based (www.altavistacanada.com, http://www.ottawastart.com/)
- Phone directories (canada411.sympatico.ca/, www.infospace.com/canada/index.htm)
- Newsgroup searching (groups.google.com)
- News searching (news.google.ca)
- Women's information (wwwomen.com)
- Different formats (www.gimpsy.com/, www.kartoo.com/)
19. Specialized sites
- Ottawa Public Library (www.library.ottawa.on.ca)
- Reference tools (see library reference sites, e.g. lii.org, www.ipl.org/ref)
- Encyclopedias (www.britannica.com; Columbia Encyclopedia, www.bartleby.com/65/)
- Canadian information (vrl.tpl.toronto.on.ca/; Canadian information by subject, www.nlc-bnc.ca/caninfo/ecaninfo.htm; Canadian Encyclopedia online, www.thecanadianencyclopedia.com/)
20. Some hints on selecting search strategies
- For any page on a general topic to which you need an introduction, try a directory-based search engine. If you do not need specific quality, you can use the address bar search
- For the web page of a major company or organization, try Google or Alta Vista
- For a specific web page that would not necessarily be popular, try Alta Vista or Google
21. Some hints on selecting search strategies cont.
- For health topics, try a health website engine like www.medbroadcast.com or the Canadian Health Network (www.canadian-health-network.ca/customtools/homee.html), or the library's health database, Health Source (www.library.ottawa.on.ca/electronic/index.htm), or the health links on the library's web page (www.library.ottawa.on.ca/english/links/PublicAdults/index.htm).
22. Some hints on selecting search strategies cont.
- For a very obscure topic, try Google or Alta Vista or one of the meta indexes
- For items in databases, try to find the correct host or search a special site for invisible websites (e.g. www.invisible-web.net/)
23. Interpretation of search results
- Look at the results and reformulate the search using things like searching within results, Prisma, and adding new keywords.
- Analytically choose which sites to look at in the result list
- Anatomy of a URL: the domain name, i.e. the name of the organization followed by the type of organization. Some popular suffixes are:
- .com for commercial sites, .edu for university sites (mainly American), .org for non-profit organizations, .gov for U.S. government sites, and .gc.ca for Canadian government sites.
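Reading the type of organization off a URL can be sketched with Python's standard `urllib.parse` module. The suffix table below copies the list on this slide; the function name `site_type` and the sample address `www.example.gc.ca` are made up for illustration, and a suffix is only a hint about who runs a site, not a guarantee.

```python
from urllib.parse import urlsplit

# Suffixes and descriptions taken from the slide.
SUFFIXES = {
    ".gc.ca": "Canadian government site",
    ".com": "commercial site",
    ".edu": "university site (mainly American)",
    ".org": "non-profit organization",
    ".gov": "U.S. government site",
}


def site_type(url):
    """Guess the type of organization behind a URL from its domain suffix."""
    if "//" not in url:
        url = "//" + url  # lets urlsplit treat a bare "www..." address as a host name
    host = (urlsplit(url).hostname or "").lower()
    # Check longer suffixes first so that more specific ones win.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if host.endswith(suffix):
            return SUFFIXES[suffix]
    return "unknown"


print(site_type("www.britannica.com"))        # commercial site
print(site_type("www.ipl.org/ref"))           # non-profit organization
print(site_type("http://www.example.gc.ca/")) # Canadian government site
```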
24. Interpretation of search results cont.
- Consider things like the authority of the author, the currency of the information, and the reason for creating the website (implications for bias)
- Do not look through pages and pages of results. If the first three pages are not promising, refine the search (see the first point on interpreting the results).
25. Some useful tutorials for searching
- See the "Learning to search" section of "Collection of special search engines" (appears under contents on the left-hand side of the page)
- www.leidenuniv.nl/ub/biv/specials.htm
- Web searching tips (www.searchenginewatch.com/facts/index.html)
- Net tutor (gateway.lib.ohio-state.edu/tutor/les5/)
26. Some useful tutorials for searching cont.
- In the links section of the Ottawa Public Library's web site (www.library.ottawa.on.ca/english/links/PublicAdults/index.htm), look under the category WWW, under the subcategory Internet
27. To find more info on search engines
- Searchenginewatch (www.searchenginewatch.com)
- Searchengineshowdown (www.searchengineshowdown.com)
28. For More Help on Searching
- Contact the Reference Dept. of the Main Branch of OPL by phoning 236-0302, ext. 233, or email ref_at_library.ottawa.on.ca
- Consult this web page or other specialized web presentations on the library's web page at http://www.library.ottawa.on.ca/english/services/reference/index.htm