Title: Internet Data Syndicator (IDS)
1 Internet Data Syndicator (IDS)
2 Motivation
- Web users rely on search engines to find
information of interest
- Two issues remain with traditional search
engines
- Capacity of web coverage
- Browsability of search results
3 Capacity of web coverage
- The coverage of any one search engine is limited
- No single search engine indexes more than about one-third of the indexable web
- Steve Lawrence and C. Lee Giles (1998), NEC Research Institute
4 Browsability of search results
- Users have to browse through a long ranked list of search results to find their target information
5 Internet Data Syndicator
- Gather information from the Internet with minimum cost and time
- Divided into two parts
- Content Collection
- Content Navigation
6 Content Collection
- Metasearch approach for web searching
- Gathers information from many sources
- Better web coverage
7 Content Navigation
- Filtering, organizing and displaying information in a sensible way
- Help users find target information more easily
- Improve the browsability of search results
8 Content Collection
Content Navigation
10 Extract information from the Internet
- Meta Search Engine supports unified access to multiple search engines (e.g. Dogpile, Metacrawler)
- News Crawler extracts daily news from various news sites (e.g. NewzCrawler)
11 Meta Search Engine consults multiple search
engines at once and combines their results as
input for further processing
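The slides do not say how the combined list is ordered, so the following is only a minimal sketch under stated assumptions: each engine's ranked URL list is merged by summing reciprocal ranks (an assumed scoring scheme, not the authors' method). The engine names and URLs are hypothetical placeholders, and no real search engine is contacted.

```python
from collections import defaultdict

def merge_results(result_lists):
    """Merge ranked URL lists from several engines into one ranked list.

    Each URL scores the sum of 1/rank over the engines that returned it
    (an assumed scheme, not taken from the slides).
    """
    scores = defaultdict(float)
    for results in result_lists.values():
        for rank, url in enumerate(results, start=1):
            scores[url] += 1.0 / rank
    # Sort by descending score, then by URL for a deterministic order.
    return sorted(scores, key=lambda url: (-scores[url], url))

# Hypothetical engine outputs for one query (placeholder names and URLs).
engine_results = {
    "engine_a": ["http://a.example/1", "http://b.example/2"],
    "engine_b": ["http://b.example/2", "http://c.example/3"],
}
merged = merge_results(engine_results)
```

A URL returned by more than one engine accumulates score from each, so agreement between engines pushes it up the merged list.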
12 [Figure: search form with the example query "computer" and a Search button]
14 Filtering: Filter search results by numerous criteria, e.g. broken links, duplicates
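As a rough illustration of the two criteria named above, the sketch below drops duplicates by comparing normalized URLs and drops broken links using a caller-supplied check. The normalization rules, the `is_alive` parameter, and the sample URLs are all assumptions made for this example; a real system would replace the stub with an actual HTTP request.

```python
from urllib.parse import urlsplit, urlunsplit

def normalize(url):
    """Canonical form used to detect duplicates (assumed rules):
    lowercase scheme and host, no trailing slash, no fragment."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )

def filter_results(urls, is_alive):
    """Drop duplicate URLs and URLs that is_alive() reports as broken."""
    seen = set()
    kept = []
    for url in urls:
        key = normalize(url)
        if key in seen or not is_alive(url):
            continue
        seen.add(key)
        kept.append(url)
    return kept

# Hypothetical data: a stub set stands in for a real HTTP check.
dead = {"http://gone.example/page"}
results = [
    "http://Site.example/docs/",
    "http://site.example/docs#intro",
    "http://gone.example/page",
    "http://other.example/",
]
filtered = filter_results(results, is_alive=lambda u: u not in dead)
```

Passing the liveness check in as a function keeps the filter testable without network access.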
16 Grouping: Classify the search results into predefined categories in a concept hierarchy.
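The slides do not describe the classifier itself, so this is only a minimal sketch assuming a keyword-overlap rule. The hierarchy, its keywords, and the sample snippets are hypothetical; each category is written as a path from the root of the concept hierarchy.

```python
# Hypothetical concept hierarchy: category path -> keywords.
HIERARCHY = {
    ("Computers", "Hardware"): {"cpu", "laptop", "memory"},
    ("Computers", "Software"): {"compiler", "browser", "linux"},
    ("Science", "Astronomy"): {"planet", "telescope"},
}

def classify(text):
    """Return the category path whose keywords overlap the text most."""
    words = set(text.lower().split())
    best_path, best_hits = ("Uncategorized",), 0
    for path, keywords in HIERARCHY.items():
        hits = len(words & keywords)
        if hits > best_hits:
            best_path, best_hits = path, hits
    return best_path

def group(snippets):
    """Group result snippets by their assigned category path."""
    groups = {}
    for snippet in snippets:
        groups.setdefault(classify(snippet), []).append(snippet)
    return groups

snippets = [
    "Cheap laptop memory upgrade",
    "Linux browser comparison",
    "Cooking pasta at home",
]
grouped = group(snippets)
```

Results that match no keyword fall into an "Uncategorized" bucket rather than being discarded.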
17 Grouping
18 Visualizing: Provides a mechanism for navigating the concept hierarchy
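One simple way to make a concept hierarchy navigable is an indented outline with result counts per node. The sketch below assumes results are already keyed by category path; the paths and result labels are hypothetical, and a text outline merely stands in for whatever graphical view the system uses.

```python
def render_tree(groups):
    """Render {category path: [results]} as an indented outline."""
    lines = []
    seen = set()
    for path in sorted(groups):
        for depth in range(len(path)):
            prefix = path[: depth + 1]
            if prefix in seen:
                continue
            seen.add(prefix)
            # Count every result filed under this node or its children.
            count = sum(
                len(items)
                for p, items in groups.items()
                if p[: depth + 1] == prefix
            )
            lines.append("  " * depth + f"{path[depth]} ({count})")
    return "\n".join(lines)

# Hypothetical grouped results.
groups = {
    ("Computers", "Hardware"): ["r1", "r2"],
    ("Computers", "Software"): ["r3"],
    ("Science", "Astronomy"): ["r4"],
}
outline = render_tree(groups)
```

The counts let a user judge at a glance which branch of the hierarchy is worth opening.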
19 Visualize
20 Summarizing: Download full pages and produce summaries of the retrieved text
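The summarization method is not specified in the slides. As a hedged illustration only, the sketch below uses a simple frequency-based extractive approach: it keeps the sentences whose words occur most often in the page. It assumes the page has already been downloaded and reduced to plain text; the sample text is made up.

```python
import re
from collections import Counter

def summarize(text, max_sentences=2):
    """Pick the sentences whose words are most frequent in the text."""
    sentences = [
        s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()
    ]
    freq = Counter(re.findall(r"[a-z]+", text.lower()))

    def score(sentence):
        # Average word frequency, so long sentences are not favored.
        words = re.findall(r"[a-z]+", sentence.lower())
        return sum(freq[w] for w in words) / max(len(words), 1)

    top = sorted(sentences, key=score, reverse=True)[:max_sentences]
    # Keep the chosen sentences in their original order.
    return " ".join(s for s in sentences if s in top)

# Hypothetical page text, already stripped of markup.
page_text = (
    "Metasearch engines query several search engines. "
    "The weather was pleasant. "
    "Search engines return ranked results."
)
summary = summarize(page_text, max_sentences=2)
```

Sentences about the page's dominant topic score highest, so the off-topic sentence is the one left out here.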
21 Summarize
22 Discussion
- Prepare a framework for research and practical use
- Help users search for and identify target information
23 References
- Hierarchical Structural Approach to Improving the Browsability of Web Search Engine Results
- Hang Cui and Osmar R. Zaïane
- In Proc. of International Workshop on Digital Libraries (DLib'2001), in conjunction with the 12th International Conference on Database and Expert Systems Applications (DEXA'2001), Munich, Germany, September 3-7, 2001
- Integrating Web Information Sources
- Kurt D. Fenstermacher and Kristian J. Hammond
- Intelligent Information Laboratory, 1890 Maple Avenue, Evanston, IL 60201, USA
24
- Copernic Agent Family
- http://www2.copernic.com/index.html
- Dogpile
- http://www.dogpile.com
- Metacrawler
- http://www.metacrawler.com/
- NewzCrawler
- http://www.newzcrawler.com/
25 Thank You
26 Example of visualization
http://www.sims.berkeley.edu/hearst/papers/cac-sigir97/sigir97.html
27 Example of visualization
http://www.kartoo.com/