Open Search - PowerPoint PPT Presentation

About This Presentation
Title: Open Search

Provided by: usfi
Learn more at: https://www.cs.usfca.edu

Transcript and Presenter's Notes

1
Open Search
  • David Wolber

2
Overview
  • Proliferation of Digital Libraries
  • Metasearch and Fixed Lists of Sources
  • Open Search Architecture
  • PublishMe for P2P Knowledge Sharing
  • Webtop Metasearch Clients

3
Contributors
  • Michael Kepe
  • Igor Ranitovic
  • Iman Sadreddin
  • Senior Team 03
  • Ken Chong
  • Rudd Stevens
  • Colin Bean
  • Tim Chan
  • Julian Chan
  • Pooja Garg

4
Information Source Explosion
  • Google, Amazon APIs
  • Internet Archive
  • Technorati: The World Live Web
  • Domain-specific libraries:
    ACM Digital Library for CS, Lexis-Nexis for law,
    MLA for literature

5
End-User Created Digital Libraries
  • Personal Web (shared Google desktop)
  • Personal Web Neighborhood
  • Topic-Specific Personal Crawlers
  • Ordinary people creating search engines as easily
    as web pages

6
Subsets of the Web
7
Motivation for Small, Independent Subsets of the
Web
  • Avoid information being channeled through a
    single portal (a "Googleopoly")
  • Google does no evil, but:
  • Censorship in China
  • Creeping level of commercialization
  • Unregulated manipulation of secret ranking
    algorithms (see PageKing case)
  • Other media are lost; this is the last frontier

8
Little support for using multiple search engines
9
Overview
  • Proliferation of Digital Libraries
  • Metasearch and Fixed Lists of Sources
  • Open Search Architecture
  • PublishMe for P2P Knowledge Sharing
  • Webtop Metasearch Clients

10
Metasearch
  • Help users discover and use digital libraries
  • Send queries to multiple, selected search engines
  • Filter, process, and unify results
  • A9.com: Amazon's metasearch
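
The send, filter, and unify steps on this slide can be sketched in a few lines of Python. This is only an illustration: the two source functions and their canned results are hypothetical stand-ins for real search engines, and filtering on the title and de-duplicating by URL are assumed rules, not something the talk specifies.

```python
from typing import Callable, Dict, List

Result = Dict[str, str]                      # assumed shape: {"title": ..., "url": ...}
Source = Callable[[str], List[Result]]


def library_a(keywords: str) -> List[Result]:
    # Hypothetical source returning canned results.
    return [{"title": "Metasearch basics", "url": "http://example.org/meta"},
            {"title": "Cooking", "url": "http://example.org/cook"}]


def library_b(keywords: str) -> List[Result]:
    # Hypothetical source that overlaps with library_a on one URL.
    return [{"title": "Metasearch Basics (mirror)", "url": "http://example.org/meta"},
            {"title": "Open metasearch", "url": "http://example.org/open"}]


def metasearch(keywords: str, sources: List[Source]) -> List[Result]:
    unified: List[Result] = []
    seen_urls = set()
    for search in sources:                   # send the query to each selected engine
        for hit in search(keywords):
            if keywords.lower() not in hit["title"].lower():
                continue                     # filter: drop hits whose title misses the keyword
            if hit["url"] in seen_urls:
                continue                     # unify: keep one entry per URL
            seen_urls.add(hit["url"])
            unified.append(hit)
    return unified


results = metasearch("metasearch", [library_a, library_b])
```

Here `results` holds one entry per distinct URL, in the order the sources were queried.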

11
Web Services Basis
[Diagram comparing the Web Page Model (a server delivering html) with the Web Service Model (a server delivering xml to software)]
12
How does metasearch evolve?
  • New digital library
13
How does metasearch evolve?
  • New digital library
  • Metasearch clients discover it
14
How does metasearch evolve?
  • New digital library
  • Metasearch clients discover it
  • Metasearch programmers write an adaptor/scraper
15
How does metasearch evolve?
  • New digital library
  • Metasearch clients discover it
  • Metasearch programmers write an adaptor/scraper
  • Users can access it within metasearch
SLOWLY
16
Overview
  • Proliferation of Digital Libraries
  • Metasearch and Fixed Lists of Sources
  • Open Search Architecture
  • PublishMe for P2P Knowledge Sharing
  • Webtop Metasearch Clients

17
Goal: Automate the Process
  • Metasearch engines should provide users with
    up-to-date lists of existing digital libraries
  • Digital libraries should be able to register and
    be made immediately available to all metasearch
    clients.
  • Metasearch and library development are independent.

18
What is Necessary?
  • Standard Search API
  • So metasearch clients can use polymorphism to
    access sources:
      for each source s in sourceList
          searchEngine.endPointUrl = s.endPointUrl
          resultList += searchEngine.keywordSearch(keywords)
  • Search API Registry
  • So metasearch clients can get a dynamic list of
    sources
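
The loop on this slide can be made runnable as a rough Python sketch. It assumes one client object whose endpoint is retargeted for each source, as the slide's pseudocode suggests. `Source`, `SearchClient`, and `fake_transport` are hypothetical names, and the transport is faked so the example runs without a network.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Source:
    name: str
    end_point_url: str


class SearchClient:
    """Stand-in for a generated service stub; the transport is injected."""

    def __init__(self, transport: Callable[[str, str], List[str]]) -> None:
        self.end_point_url = ""
        self._transport = transport

    def keyword_search(self, keywords: str) -> List[str]:
        return self._transport(self.end_point_url, keywords)


def fake_transport(end_point_url: str, keywords: str) -> List[str]:
    # Hypothetical: a real transport would send the request over the network.
    return [f"{end_point_url} matched '{keywords}'"]


def search_all(source_list: List[Source], keywords: str) -> List[str]:
    search_engine = SearchClient(fake_transport)
    result_list: List[str] = []
    for s in source_list:
        search_engine.end_point_url = s.end_point_url   # retarget the same client
        result_list += search_engine.keyword_search(keywords)
    return result_list


sources = [Source("A", "http://a.example/osp"), Source("B", "http://b.example/osp")]
hits = search_all(sources, "digital libraries")
```

Because every source answers the same `keyword_search` call, adding a source means adding an entry to the list rather than writing a new adaptor.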

19
Web Service Standards
  • WSDL: Web Services Description Language
  • SOAP: Simple Object Access Protocol
  • UDDI: Universal Description, Discovery, and
    Integration

20
Standards on top of Web Services
  • WSDL, SOAP, and UDDI are the basis for standards
    in many domains.
  • e.g., an MS-initiated standard for securities
    information providers
  • Businesses agree on a standard, then client
    applications can use polymorphism and new
    businesses can register services.
  • In this case, we want a cross-domain standard.

21
Open Search Architecture
  • Open Search Protocol (OSP)
  • Cross-Domain Search-related services
  • Not just keyword search, but citations, authorOf,
    etc.
  • Open Search Registry
  • Based on UDDI
  • Can add customization, e.g., parsing to find out
    which search operations are implemented.
  • Web and web service access
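
A registry of this kind can be pictured with a small in-memory sketch. It is not UDDI and not the project's code: `ServiceEntry`, `InMemoryRegistry`, and the operation names are hypothetical, and listing each library's operations at registration time is an assumption made to illustrate the "which search operations are implemented" customization.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class ServiceEntry:
    name: str
    end_point_url: str
    operations: FrozenSet[str]       # e.g. {"keywordSearch", "citations"}


class InMemoryRegistry:
    """In-memory stand-in for a UDDI-backed registry (illustration only)."""

    def __init__(self) -> None:
        self._entries: Dict[str, ServiceEntry] = {}

    def register(self, entry: ServiceEntry) -> None:
        # Keyed by endpoint so re-registering a library updates its entry.
        self._entries[entry.end_point_url] = entry

    def source_list(self, operation: Optional[str] = None) -> List[ServiceEntry]:
        entries = list(self._entries.values())
        if operation is None:
            return entries
        return [e for e in entries if operation in e.operations]


registry = InMemoryRegistry()
registry.register(ServiceEntry("Law library", "http://law.example/osp",
                               frozenset({"keywordSearch", "citations"})))
registry.register(ServiceEntry("Blog index", "http://blogs.example/osp",
                               frozenset({"keywordSearch"})))
names_all = [e.name for e in registry.source_list()]
names_citing = [e.name for e in registry.source_list("citations")]
```

A client asks for the full source list, or only for sources that implement a given operation.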

22
Open Search Architecture
[Diagram: OSP-conforming libraries register their services with the OS Registry, and OSP metasearch clients get their source list from it]
23
User Can Choose Sources
24
Open Search Protocol
  • Keyword search
  • Citations (inward links, outward links)
  • AuthorOf and other associative operations
  • Metadata object results based on Dublin Core
  • Restriction object for advanced search options
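
To make the operation list concrete, here is a hedged Python sketch of what such an interface might look like. The actual OSP is defined in WSDL and is not reproduced here. The method names, the `Restriction` fields, and the sample records are assumptions. `Metadata` uses four element names taken from Dublin Core (identifier, title, creator, date).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Metadata:
    """Result record using a few Dublin Core element names."""
    identifier: str
    title: str
    creator: str
    date: str


@dataclass(frozen=True)
class Restriction:
    """Hypothetical advanced-search options."""
    creator: Optional[str] = None
    max_results: int = 10


class TinyLibrary:
    def __init__(self, records: List[Metadata], links: Dict[str, List[str]]) -> None:
        self._records = records
        self._links = links              # identifier -> identifiers it cites

    def keyword_search(self, keywords: str,
                       restriction: Restriction = Restriction()) -> List[Metadata]:
        hits = [r for r in self._records if keywords.lower() in r.title.lower()]
        if restriction.creator is not None:
            hits = [r for r in hits if r.creator == restriction.creator]
        return hits[:restriction.max_results]

    def author_of(self, creator: str) -> List[Metadata]:
        return [r for r in self._records if r.creator == creator]

    def outward_links(self, identifier: str) -> List[str]:
        return list(self._links.get(identifier, []))

    def inward_links(self, identifier: str) -> List[str]:
        return [src for src, targets in self._links.items() if identifier in targets]


library = TinyLibrary(
    records=[Metadata("doc:1", "Open search protocols", "Ada", "2004"),
             Metadata("doc:2", "Search wrappers", "Ben", "2005"),
             Metadata("doc:3", "Search registries", "Ada", "2005")],
    links={"doc:2": ["doc:1"], "doc:3": ["doc:1", "doc:2"]},
)
by_ada = library.keyword_search("search", Restriction(creator="Ada"))
cited_by = library.inward_links("doc:1")
```

Returning the same `Metadata` shape from every operation is what lets a client treat results from different libraries uniformly.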

25
Publishing a Library
  • Access OSP WSDL Specification from
    webtop.cs.usfca.edu
  • Generate code in language of choice
  • Implement the search operations for the digital
    library
  • Deploy the service
  • Register with Open Search registry

26
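Deploying an Open Search Library
[Diagram, steps in order]
  • 1. Programmer gets the OS wsdl from the Open
    Search information registry
  • 2. Programmer passes the wsdl to wsdl2java
  • 3. wsdl2java generates skeleton code
  • 4. Programmer deploys the service on the library
    server
  • 5. Registration info is sent to the Open Search
    information registry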
27
Wrapping a Library
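[Diagram, steps in order]
  • 1. Metasearch client sends an OSP query to the
    Open Search wrapper (located on a 3rd party
    server)
  • 2. Wrapper sends a custom query to the custom
    search API, e.g., the Google API
  • 3. Custom search API returns a custom result
  • 4. Wrapper returns an OSP result to the
    metasearch client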
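
The four wrapping steps amount to an adapter. The sketch below assumes a made-up custom API (`CustomApi.find`, with `headline` and `link` fields) rather than any real vendor API, and a made-up `OspResult` type. It shows only the translation in each direction.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class OspResult:
    title: str
    identifier: str


class CustomApi:
    """Hypothetical custom search API with its own parameter and field names."""

    def find(self, q: str, limit: int) -> List[Dict[str, str]]:
        canned = [{"headline": "Wrapping search APIs", "link": "http://example.org/wrap"},
                  {"headline": "Search adapters", "link": "http://example.org/adapt"}]
        return [c for c in canned if q.lower() in c["headline"].lower()][:limit]


class OpenSearchWrapper:
    def __init__(self, api: CustomApi) -> None:
        self._api = api

    def keyword_search(self, keywords: str, max_results: int = 10) -> List[OspResult]:
        # Steps 2 and 3: translate the query, call the custom API, get its result.
        raw = self._api.find(q=keywords, limit=max_results)
        # Step 4: translate the custom result into the common result type.
        return [OspResult(title=r["headline"], identifier=r["link"]) for r in raw]


wrapper = OpenSearchWrapper(CustomApi())
results = wrapper.keyword_search("search")    # step 1: the client's query
```

The client never sees `headline` or `link`; only the wrapper knows the custom field names.
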
28
Wrappers Developed at USF
  • Google
  • Amazon (sort of)
  • Internet Archive
  • Technorati
  • Feedster

29
Overview
  • Proliferation of Digital Libraries
  • Metasearch and Fixed Lists of Sources
  • Open Search Architecture
  • PublishMe for P2P Knowledge Sharing
  • Webtop Metasearch Clients

30
PublishMe
  • Like Google Desktop, but shared.
  • Periodically updates inverted index and linkbase
    on PC
  • Deploys Web Service on user's PC
  • Auto-registers with Open Search Registry
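
As a reminder of what such an index is, here is a minimal Python sketch. It is not PublishMe's implementation: the tokenizer, the AND semantics for multi-word queries, and the sample files are all assumptions chosen to keep the example short.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def tokenize(text: str) -> List[str]:
    # Lowercase alphanumeric runs; a deliberately simple tokenizer.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(documents: Dict[str, str]) -> Dict[str, Set[str]]:
    # Map each term to the set of files that contain it.
    index: Dict[str, Set[str]] = defaultdict(set)
    for path, text in documents.items():
        for term in tokenize(text):
            index[term].add(path)
    return index


def lookup(index: Dict[str, Set[str]], keywords: str) -> Set[str]:
    # Return the files that contain every keyword (AND semantics, assumed).
    terms = tokenize(keywords)
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


documents = {
    "notes.txt": "Open search registry notes",
    "todo.txt": "Fix the search wrapper",
}
index = build_inverted_index(documents)
matches = lookup(index, "search registry")
```

Rebuilding `index` on a schedule corresponds to the periodic update mentioned on the slide.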

31
Metasearch with P2P Knowledge Sharing
WEBTOP
32
Integrating Global and Personal Libraries
33
Motivation for Sharing Personal Webs
People create knowledge every day when they
bookmark, annotate, link, organize, and
synthesize. Communication is a separate step
which often doesn't happen.
34
Motivation for Sharing Personal Webs
Collaborative Work
Experts
35
Computers are designed using our brains for a
model
  • Knowledge creation and dissemination separate
  • Explicit effort required to communicate
  • Just as we model our word processors on paper.

36
Additions to OSP for P2P
  • GetFile
  • OnLine(ip): handles a user starting up and
    dynamic IPs
  • OffLine
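
One way to picture the OnLine and OffLine calls is as presence bookkeeping on the registry side. The sketch below is an assumption about how that might work, not the project's design: `PeerDirectory` and the `peer_id` argument are hypothetical (the slide shows only an `ip` parameter), and the addresses come from a documentation-only range.

```python
from typing import Dict, List, Optional


class PeerDirectory:
    """Hypothetical registry-side bookkeeping for OnLine(ip) and OffLine."""

    def __init__(self) -> None:
        self._current_ip: Dict[str, str] = {}    # peer id -> last announced IP

    def on_line(self, peer_id: str, ip: str) -> None:
        # Called when a user's PC starts up; a new IP simply replaces the old one.
        self._current_ip[peer_id] = ip

    def off_line(self, peer_id: str) -> None:
        # Forget the peer so clients stop querying it; unknown peers are ignored.
        self._current_ip.pop(peer_id, None)

    def address_of(self, peer_id: str) -> Optional[str]:
        return self._current_ip.get(peer_id)

    def online_peers(self) -> List[str]:
        return sorted(self._current_ip)


directory = PeerDirectory()
directory.on_line("alice-pc", "192.0.2.10")
directory.on_line("bob-pc", "192.0.2.20")
directory.on_line("alice-pc", "192.0.2.33")      # same peer, new dynamic IP
directory.off_line("bob-pc")
```

After these calls only `alice-pc` is listed, at its most recent address.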

37
But What About PRIVACY?
  • The Big Question
  • How much of the information hidden within your
    personal web is hidden due to privacy concerns?

38
I Want you to be a Search Engine!
39
Overview
  • Proliferation of Digital Libraries
  • Metasearch and Fixed Lists of Sources
  • Open Search Architecture
  • PublishMe for P2P Knowledge Sharing
  • Metasearch Clients

40
Goal: Implement Vannevar Bush's Associative Trails
  • View a document/thing in context
  • History of an idea

41
Thinkmap-like Interface
42
Association Types
  • Outward links
  • Inward links
  • Similar-Content links
  • People links: author, people referenced in paper
  • Domain-specific links: law citations, movie-actor
  • Associations specified by Annotators

43
Webtop Tree View webtop.cs.usfca.edu
44
Expanding a Tree
  • Bird's-eye view
  • Local/Web files integrated
  • Follow different Associative Trails
  • Ins of Outs of Ins, etc.
  • Siblings
  • Weird though, as ins and outs both expand right
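
Trails such as "ins of outs of ins" can be expressed as repeated expansion over a link graph. The Python sketch below uses a tiny made-up graph, and its function names are hypothetical. Treating siblings as "outs of ins, minus the page itself" is an assumed reading of the slide.

```python
from typing import Dict, List

# Hypothetical linkbase: page -> pages it links to (outward links).
OUT_LINKS: Dict[str, List[str]] = {
    "A": ["B", "C"],
    "B": ["C"],
    "D": ["C"],
}


def outs(node: str) -> List[str]:
    return list(OUT_LINKS.get(node, []))


def ins(node: str) -> List[str]:
    # Inward links are derived by scanning every page's outward links.
    return sorted(src for src, targets in OUT_LINKS.items() if node in targets)


def follow_trail(start: str, steps: List[str]) -> List[str]:
    """Apply "in"/"out" expansions in order, keeping first-seen order."""
    frontier = [start]
    for step in steps:
        if step not in ("in", "out"):
            raise ValueError(f"unknown step: {step}")
        expand = ins if step == "in" else outs
        next_frontier: List[str] = []
        for node in frontier:
            for neighbor in expand(node):
                if neighbor not in next_frontier:
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return frontier


def siblings(node: str) -> List[str]:
    # Pages that share at least one inward link with the given page.
    return [n for n in follow_trail(node, ["in", "out"]) if n != node]


trail = follow_trail("C", ["in", "out", "in"])    # ins of outs of ins of C
```

Each step in the list corresponds to expanding one more level of the tree.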

45
Webtop Side Panel View
46
Project Status
Too many bugs, Dad
47
Future Work
  • Open Search Protocol
  • In-depth study of existing search APIs
  • Provide REST alternative to SOAP
  • Metasearch development
  • Complete and refine existing clients
  • Dream up new ones
  • Thinkmap Graph
  • Automated Source Selection and Reputation System
  • Page Ranking
  • Initiate grass-roots involvement

48
Future Work: Documents and Things
[Diagram labels: resource, associations, annotations; resource types document (html, word, pdf), person, and creative work (film, book)]
49
Stop talking about Webtop daddy!
webtop.cs.usfca.edu