Title: INTERNET REFERENCE SOURCES
1. INTERNET REFERENCE SOURCES
- Krys Bottrill
- http://www.tehuti.co.uk/links.htm (Links for indexers)
2. THE SIZE OF THE PROBLEM
- Aggregating estimates from several reputable sources puts the size of the visible web at somewhere between 2.5 and 4bn pages, growing at the rate of about 7m a day. (Search for the invisible web, Chris Sherman, The Guardian, Thursday September 6, 2001)
- 10,000,000,000 (?) web pages
3. THE INVISIBLE WEB
- Not all web pages are indexed
- The Internet consists of more than just html pages
  - databases
  - other file formats (doc, pdf, multimedia)
  - content on sites needing a log-in
- Need to use specialised resources or choose an appropriate search engine
4. SEARCH ENGINES
FACTORS INFLUENCING SEARCH RESULTS
- coverage: estimated max. 25%
- indexing depth
- re-indexing frequency: new pages, dead links
- search features: operators, possibility to limit and refine search, other facilities
- speed
- algorithms
- Comparisons of major engines on Search Engine Watch and Search Engine Showdown
5. ADVANCED SEARCH FEATURES
- Boolean or other operators (sometimes disguised)
- Filters
- Sort by keyword
- Refine search
- Document type: image, sound, pdf
6. OPERATORS 1
- OR: to obtain either or both of the search terms
- You want a soup recipe; either tomato or onion will do: tomato OR onion
[Venn diagram: ONION, TOMATO]
7. OPERATORS 2
- You have lots about tomatoes and onions, but where is the soup?
- AND: to combine search terms together
- (tomato OR onion) AND soup
[Venn diagram: ONION, TOMATO, SOUP]
8. OPERATORS 3
- AND NOT: to exclude a term
- soup AND NOT tomato
[Venn diagram: SOUP, TOMATO]
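The OR, AND and AND NOT operators on the preceding slides correspond to set union, intersection and difference. The following is a minimal sketch in Python, assuming a tiny made-up index; the page names and the `pages` helper are hypothetical and only illustrate the logic, not how any real search engine is implemented.

```python
# Hypothetical mini-index: search term -> set of pages that contain it.
index = {
    "tomato": {"page1", "page2", "page4"},
    "onion": {"page2", "page3"},
    "soup": {"page2", "page4", "page5"},
}


def pages(term):
    """Return the set of pages containing the term (empty set if unknown)."""
    return index.get(term, set())


# tomato OR onion: pages with either or both terms (union).
either = pages("tomato") | pages("onion")

# (tomato OR onion) AND soup: keep only pages that also mention soup (intersection).
soup_recipes = either & pages("soup")

# soup AND NOT tomato: soup pages with the tomato pages removed (difference).
no_tomato = pages("soup") - pages("tomato")

print(sorted(either))
print(sorted(soup_recipes))
print(sorted(no_tomato))
```

The brackets matter: the OR is evaluated first, and the result is then narrowed by AND, just as in (tomato OR onion) AND soup.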
9. OPERATORS 4
- ADJ, NEAR (check the syntax!)
- Used in non-Boolean searches
  - phrase
  - must include
  - must exclude
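The non-Boolean options on this slide (phrase, must include, must exclude) can be sketched as a small query builder. This is an illustration only: it assumes the widely used convention of double quotes for a phrase, a leading + for a required term and a leading - for an excluded term. The `build_query` function is hypothetical, and individual engines differ, so check each engine's help page.

```python
def build_query(phrase=None, include=(), exclude=()):
    """Build a non-Boolean query string.

    Assumed convention (varies by engine):
      "..."  exact phrase
      +term  term must be present
      -term  term must be absent
    """
    parts = []
    if phrase:
        parts.append(f'"{phrase}"')
    parts.extend(f"+{term}" for term in include)
    parts.extend(f"-{term}" for term in exclude)
    return " ".join(parts)


query = build_query(phrase="onion soup", include=["recipe"], exclude=["tomato"])
print(query)
```

Under the assumed convention, this asks for pages containing the exact phrase onion soup and the word recipe, but not the word tomato.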
10. Used with permission. http://www.searchengineshowdown.com/
11. MY FAVOURITE SEARCH ENGINES
- Google: 3080 million, link popularity ranks, cache, pdf, xls, ppt, search within results, similar page search, for word in phrase
- WiseNut: 1500 million, sneak a peek, clusters
- AllTheWeb: 2100 million, very fast (< 0.5 sec. average), pdf, ftp, multimedia, clusters, refinement suggestions, multimedia separated
- AltaVista: 240 million multimedia files, all Boolean operators, wildcards
- Teoma: subject-specific popularity, resource lists, hints for search refinement
12. METASEARCHERS
- Ixquick: highly relevant results, wide range of commands, highlight feature; search results depend on selected language!
- Vivisimo: categorises results into folders; can also search PubMed, patents and other sources as well as the Web
14. META-SITES
- Provide information about other sites
- Can be subject-specific or classified by subject
- Selection and quality control
- Guarantee (as far as is possible!) that links are current
- BIOME (OMNI), EEVL, BUBL, ARGE, PINAKES (a meta-site of meta-sites)
15. OTHER RESOURCES TO ACCESS THE INVISIBLE WEB
- Scirus: searches scientific web sites, Medline, Neuroscion, BioMed Central (not all info is free)
- Mednets: set of search engines for searching medical databases, organized by subject specialities
- Web-Lens: resources for searching the Invisible Web
16. HOW TO KEEP CURRENT?
- Internet Resources Newsletter
- Free Pint
- Scout Report
- Regular visits to favourite meta-sites
- Subject-specific newsgroups and mailing lists
- Alerts services
17. INTERNET SKILLS
- Explore advanced search features of engines
- Netskills Interactive Course (TONIC)
- Internet Detective: tutorial on quality evaluation
- RDN Training Suite
18. WHAT TO DO WITH THE LOOT?
- Downloading files
  - Web: right click, save target as
  - FTP sites: IE5, Win2000 or software, e.g. FTP Explorer (http://www.ftpx.com/), Cute FTP (http://www.cuteftp.com)
- Reading files
  - pdf: Adobe reader (http://www.adobe.com)
  - MS Office files (Powerpoint, Word, Excel, Access): Microsoft Office Converters and Viewers (http://www.microsoft.com/office/000/viewers.asp)