Search Engines - PowerPoint PPT Presentation

About This Presentation
Title:

Search Engines

Slides: 31
Provided by: john1501

Transcript and Presenter's Notes

Title: Search Engines


1
Search Engines
2
Introducing
  • Directories, meta search engines
  • How search engines work
  • What influences the ranking

3
Directories
  • hand-constructed hierarchy of topics (e.g.
    Yahoo!)
  • use human editors for page selection, indexing
    and classification
  • Covers a small part of the web
  • Rarely updated
  • No ranking

4
Directories II
  • No search across a full-text index
  • Search runs across the editors' reviews instead
  • Sometimes partnership with search engines to
    increase coverage

5
Meta Search Engines
  • Queries for rare keywords require more than one
    web search engine
  • Submit the same query to many engines in parallel
  • Duplicate entries are eliminated
  • The results are shown in a uniform format
  • No harvesting or indexing of their own
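
The meta search steps above can be sketched roughly as follows. This is only an illustration: the two engine functions, their result format and the example URLs are hypothetical stand-ins, not real search engine APIs.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for real engines: each maps a query to (url, title) hits.
def engine_a(query):
    return [("http://example.org/a", "Page A"), ("http://example.org/b", "Page B")]

def engine_b(query):
    return [("http://example.org/b", "Page B (copy)"), ("http://example.org/c", "Page C")]

def meta_search(query, engines):
    # Submit the same query to all engines in parallel.
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        result_lists = list(pool.map(lambda engine: engine(query), engines))

    # Eliminate duplicates by URL and convert every hit to one uniform format.
    merged, seen = [], set()
    for hits in result_lists:
        for url, title in hits:
            if url not in seen:
                seen.add(url)
                merged.append({"url": url, "title": title})
    return merged

results = meta_search("rare keyword", [engine_a, engine_b])
```

Note that the sketch keeps no index of its own; it only forwards the query and merges what comes back.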

6
How search engines work
  • Harvesting
  • Indexing
  • Analyzing Requests
  • Ranking

7
Harvesting
  • Programs (robots, gatherers or crawlers) visit web
    sites and gather the web pages for indexing
  • Start with an initial page
  • Follow hyperlinks (<a href>)
  • Sometimes more than 2 sub-levels are visited
  • These programs are started periodically
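
A minimal sketch of the harvesting loop described above, assuming a tiny in-memory "web" instead of real HTTP requests. The PAGES dictionary, the URLs and the max_depth parameter are made up for illustration.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory "web": URL -> HTML source.
PAGES = {
    "/start": '<a href="/a">A</a> <a href="/b">B</a>',
    "/a": '<a href="/a1">A1</a>',
    "/b": '<a href="/start">back</a>',
    "/a1": '<a href="/a2">A2</a>',
    "/a2": "no links",
}

class LinkExtractor(HTMLParser):
    """Collects the href values of all <a> tags."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def harvest(start, max_depth=2):
    gathered = {}
    queue = deque([(start, 0)])  # start with an initial page
    while queue:
        url, depth = queue.popleft()
        if url in gathered or url not in PAGES:
            continue
        gathered[url] = PAGES[url]  # gather the page for indexing
        if depth < max_depth:       # follow links only down to max_depth sub-levels
            parser = LinkExtractor()
            parser.feed(PAGES[url])
            for link in parser.links:
                queue.append((link, depth + 1))
    return gathered

harvested = harvest("/start", max_depth=2)
```

With max_depth=2 the page "/a2" sits three levels below the start page and is therefore not gathered.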

8
Harvesting II
  • Problems:
  • Links aren't found in:
  • Frames
  • Image maps
  • Many robots are started by a search engine
  • -> heavy traffic

9
Robot Exclusion
  • Two methods:
  • Meta tags
  • <meta name="robots" content="noindex,nofollow">
  • robots.txt
  • User-agent: Scooter
  • Disallow: /privat/geht_dich_gar_nix_an.html
  • Allow: /allesOffen
10
Robot Exclusion II
  • robots.txt (Example 2)
  • User-agent: *
  • Allow: /allesOffen

11
Indexing
  • The index table receives the harvesting results
  • The index table contains keywords
  • The table is held in main memory -> fast access
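
One common way to build such a keyword table is an inverted index that maps each keyword to the pages containing it. The sketch below assumes this structure (the slides do not say how the table is organized) and uses made-up page texts.

```python
import re
from collections import defaultdict

def build_index(pages):
    """Map every keyword to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(url)
    return index

# Hypothetical harvesting results: URL -> page text.
pages = {
    "/a": "Search engines rank pages",
    "/b": "Directories list pages by topic",
}
index = build_index(pages)  # an ordinary in-memory dictionary -> fast access
```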

12
Analysing Requests
  • The search string is compared with the index table
  • The search string is a single word
  • -> easy processing
  • The search string contains truncation or Boolean
    operators
  • -> complex processing
  • If the search string is found in the index, the
    page is added to the hit list
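
The difference between the easy and the complex case can be sketched as follows. The query syntax here is an assumption for illustration: a trailing * marks truncation, AND and OR are the only operators, and they are evaluated strictly from left to right.

```python
def lookup(index, term):
    """Return the hit list for one term; a trailing '*' means truncation."""
    term = term.lower()
    if term.endswith("*"):
        prefix = term[:-1]
        hits = set()
        for keyword, urls in index.items():  # complex case: scan all keywords
            if keyword.startswith(prefix):
                hits |= urls
        return hits
    return set(index.get(term, set()))       # easy case: one dictionary access

def search(index, query):
    tokens = query.split()
    if not tokens:
        return set()
    hits = lookup(index, tokens[0])
    i = 1
    while i + 1 < len(tokens):
        operator, term = tokens[i].upper(), tokens[i + 1]
        if operator == "AND":
            hits &= lookup(index, term)
        elif operator == "OR":
            hits |= lookup(index, term)
        else:
            raise ValueError(f"unknown operator: {tokens[i]}")
        i += 2
    return hits

# Hypothetical index table: keyword -> pages.
INDEX = {"search": {"/a", "/c"}, "engine": {"/a"}, "engines": {"/b"}, "directory": {"/c"}}

single_word = search(INDEX, "search")
truncated = search(INDEX, "engin*")
combined = search(INDEX, "search AND engin*")
either = search(INDEX, "engine OR directory")
```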

13
Ranking
  • Influences on the ranking:
  • How many keywords are found
  • Keyword frequency
  • Keyword position:
  • Domain/URL
  • Document name

14
Ranking II
  • Headline
  • Early in the text
  • Meta-Tags
  • Ranking for cash
  • Page Rank
  • Click frequency / Hit Popularity Engine

15
Ranking for cash
  • Capitalist principle
  • Paying money -> high ranking
  • Content is not relevant
  • Additional income

16
Ranking for cash II
  • Not used on its own
  • Mostly used by e-commerce companies
  • Second method:
  • Pay for faster indexing

17
Page Rank (Google)
  • Evaluation by the internet community
    (web admins)
  • Relation between the quality of a page and the
    number of links that point to it
  • Links from popular web sites count for more
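
The idea that links from popular sites count for more can be illustrated with the widely published simplified PageRank iteration. The slides give no formula, so the damping factor of 0.85, the iteration count and the example link graph are assumptions; this is not a description of Google's production system.

```python
def page_rank(links, damping=0.85, iterations=50):
    """links maps each page to the pages it links to (every target must be a key)."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, targets in links.items():
            if targets:
                # A page passes its rank on in equal shares to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its rank over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical link graph: three pages link to "popular", which links only to "a".
LINKS = {"popular": ["a"], "a": ["popular"], "b": ["popular"], "new": ["popular"]}
ranks = page_rank(LINKS)
```

In this graph "a" ends up far ahead of "b" and "new" although each of them links to the same page, because "a" receives the single link from the popular page and the other two receive no links at all.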

18
Page Rank (Google) II
  • Disadvantage
  • new web-sites have a bad ranking
  • Queries with many Boolean operators and keywords
    are not easy to process

19
Hit Popularity Engine
  • The index already exists and is pre-sorted
  • A click on a link counts as a vote for the site
    concerned -> the click is recorded in the database
  • Pages with many clicks are more popular
  • developed by Direct Hit
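
A toy version of click-based re-ranking, assuming an in-memory counter in place of the click database. The class and method names are invented for this sketch and do not describe Direct Hit's actual implementation.

```python
from collections import Counter

class HitPopularity:
    def __init__(self):
        self.clicks = Counter()  # stands in for the click database

    def record_click(self, url):
        # Every click on a result link counts as one vote for that page.
        self.clicks[url] += 1

    def rerank(self, presorted):
        # sorted() is stable, so pages with equal clicks keep their pre-sorted order.
        return sorted(presorted, key=lambda url: -self.clicks[url])

engine = HitPopularity()
presorted = ["/a", "/b", "/c"]           # hypothetical pre-sorted hit list
before_clicks = engine.rerank(presorted)
for url in ["/c", "/b", "/c"]:
    engine.record_click(url)
after_clicks = engine.rerank(presorted)
```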

20
Hit Popularity Engine II
  • This method is usually combined with others
  • Disadvantage
  • new web-sites have a bad ranking

21
Ranking-Manipulation
  • Why?
  • Commercial interest
  • Done by:
  • Search engine optimizers (SEOs)
  • Purpose:
  • To boost the page rank

22
Link Farm
  • Many domains are registered
  • Programs generate thousands of interlinked pages
  • Each page contains keywords
  • Some of these pages are even arranged in quite
    complex structures

23
Forwarding
  • An intermediate page contains the search terms
    and forwards the visitor
  • Forwarding via HTML meta tags and simple
    JavaScript can be detected
  • SEOs obfuscate the forwarding instructions ->
    no detection
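
Detecting the simple case, a forward written as an HTML meta refresh tag, might look like the sketch below. The function name and the example pages are hypothetical; the second page shows the kind of obfuscated script that such a simple check does not catch.

```python
from html.parser import HTMLParser

class RedirectDetector(HTMLParser):
    """Remembers the target of a <meta http-equiv="refresh"> tag, if any."""
    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("http-equiv") or "").lower() != "refresh":
            return
        # The content attribute looks like "0; url=http://example.org/shop".
        _, _, rest = (attributes.get("content") or "").partition(";")
        key, _, value = rest.strip().partition("=")
        if key.strip().lower() == "url":
            self.target = value.strip()

def find_meta_redirect(html):
    detector = RedirectDetector()
    detector.feed(html)
    return detector.target

plain = '<html><head><meta http-equiv="refresh" content="0; url=http://example.org/shop"></head></html>'
obfuscated = '<script>var u = "ht" + "tp://example.org/shop"; window["loc" + "ation"] = u;</script>'

plain_target = find_meta_redirect(plain)
obfuscated_target = find_meta_redirect(obfuscated)
```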

24
IP Delivery
  • A normal site is indexed by the robots
  • Afterwards, the contents of the site are swapped

25
IP Cloaking
  • Server programs determine who sent the request
  • If a robot sends the request, "cloaked" content is
    delivered that is designed to influence the ranking
  • Human visitors do not see the "cloaked" content

26
Other simple tricks
  • Links in guestbooks
  • Particularly effective with high-ranking
    guestbooks
  • Blind text
  • Text in the background color

27
Trading in web links
  • Paying for links
  • Partnership -> commission

28
Summary
  • Select suitable tools
  • The WWW is dynamic ->
  • keep up with new developments
  • Assess rankings correctly

29
Thank You!
30
Sources
  • [1] www.suchfibel.de
  • [2] Jo Bager: Orientierungslose Infosammler,
    c't 23/99
  • [3] Stefan Karzauninkat: Zielfahndung, c't 23/99
  • [4] Sven Lennartz: Ich bin wichtig, c't 23/99
  • [5] Stefan Karzauninkat: Google zugemüllt, c't 1/03
  • [6] www.google.com/webmasters
  • [7] Dr. Wolfgang Sander-Beuermann: Schatzsucher,
    c't 13/98
  • [8] Arno Dittmar: Suchmaschinen und Anfragen im
    WWW
  • [9] Ralf Rudolf: Suchmaschinen und Anfragen im WWW