Title: Distributed Databases and Applications
1Distributed Databases and Applications
- John Wieczorek
- Museum of Vertebrate Zoology, UC Berkeley
2Warning
- What you are about to witness is not meant to be experienced in a single dose.
- Side effects may include nausea, eye twitching, and general discomfort.
- Do not panic, call for help.
3DiGIR: Distributed Generic Information Retrieval
- John Wieczorek, Stan Blum, Dave Vieglais, P.J. Schwartz
4Information Retrieval
- Distributed - a protocol for retrieving structured data from multiple, heterogeneous databases across the Internet.
- Generic - a protocol independent of the data retrieved and of the software to retrieve it.
5Project Rationale
- Avoid multiple incongruous development efforts
- Pool resources and create a support community of experts
- Solve scalability problems
6Design Goals
- Use open protocols and standards, such as HTTP and XML
- Decouple the protocol, software, and semantics
- Make new data provider installations as easy as possible
- Develop open source software under the GNU General Public License (it's free).
7DiGIR Component Summary
8DiGIR Architecture
9Provider
- Receives requests
- Retrieves data from database
- Sends results to requestor
- Supplies metadata to describe content, contacts, and capabilities
- Logs requests
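The provider duties listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the DiGIR provider software: the class name `ProviderSketch`, the dictionary-shaped requests, the `specimens` table, and the sample rows are all invented for the example.

```python
import logging
import sqlite3

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("provider_sketch")


class ProviderSketch:
    """Toy provider (hypothetical): answers metadata and search requests."""

    def __init__(self, name, contact, connection):
        self.name = name
        self.contact = contact
        self.db = connection

    def metadata(self):
        # Describe content, contacts, and capabilities (field names are invented).
        count = self.db.execute("SELECT COUNT(*) FROM specimens").fetchone()[0]
        return {
            "name": self.name,
            "contact": self.contact,
            "record_count": count,
            "request_types": ["metadata", "search"],
        }

    def handle(self, request):
        log.info("request received: %s", request)  # log every request
        if request["type"] == "metadata":
            return self.metadata()
        if request["type"] == "search":
            # Retrieve matching rows from the local database.
            rows = self.db.execute(
                "SELECT catalog_number, genus FROM specimens WHERE genus = ?",
                (request["genus"],),
            ).fetchall()
            return {"provider": self.name, "records": rows}
        raise ValueError(f"unsupported request type: {request['type']}")


def demo_provider():
    # In-memory database with made-up sample rows.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE specimens (catalog_number TEXT, genus TEXT)")
    conn.executemany(
        "INSERT INTO specimens VALUES (?, ?)",
        [("CAT-1", "Manis"), ("CAT-2", "Peromyscus"), ("CAT-3", "Manis")],
    )
    return ProviderSketch("example-provider", "curator@example.org", conn)
```

Calling `demo_provider().handle({"type": "search", "genus": "Manis"})` returns the two matching sample rows; the database stays under the provider's local control and only results leave it.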
10DiGIR Architecture
11Portal Engine
- The entry point for an application
- Can query a registry to discover potential providers
- Can determine, based on provider metadata, whether a provider should be queried
- Can send requests to multiple providers
12Portal Engine, continued
- Assembles responses from providers
- Returns packaged results to the requesting application
- Communicates via protocol-compliant messaging only
- Logs activity
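A minimal sketch of the portal engine's fan-out-and-assemble behavior, assuming each provider is represented by a metadata dictionary plus a callable that answers a request. The names `should_query` and `portal_search`, and the `concepts` metadata field, are hypothetical and are not taken from the DiGIR code base.

```python
from concurrent.futures import ThreadPoolExecutor


def should_query(metadata, request):
    # Decide from provider metadata alone whether a provider is worth querying.
    return request["concept"] in metadata["concepts"]


def portal_search(providers, request):
    """providers: list of (metadata dict, callable taking a request)."""
    selected = [(m, q) for m, q in providers if should_query(m, request)]
    if not selected:
        return {"request": request, "responses": []}
    # Send the same request to every selected provider in parallel.
    with ThreadPoolExecutor(max_workers=len(selected)) as pool:
        futures = [(m["name"], pool.submit(q, request)) for m, q in selected]
        # Assemble the individual responses into one packaged result.
        responses = [
            {"provider": name, "records": future.result()}
            for name, future in futures
        ]
    return {"request": request, "responses": responses}
```

The application sees a single packaged result and never talks to providers directly, which is the decoupling the slides describe.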
13Registry
- Provides a yellow pages to advertise the existence and capabilities of a provider
- Provides a means to discover potential providers of interest
- May be public or private
- Need not be a part of the architecture
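The register and discover roles of a registry can be illustrated with a small in-memory lookup. This is a sketch only: `RegistrySketch`, its method names, and the capability strings are assumptions made for the example, and a real registry would be a network service.

```python
class RegistrySketch:
    """Toy yellow pages (hypothetical): maps provider endpoints to capabilities."""

    def __init__(self):
        self._entries = {}

    def register(self, endpoint, capabilities):
        # A provider advertises its existence and what it can answer.
        self._entries[endpoint] = set(capabilities)

    def discover(self, needed):
        # A portal engine asks which providers cover everything it needs.
        needed = set(needed)
        return sorted(
            endpoint
            for endpoint, caps in self._entries.items()
            if needed <= caps
        )
```

Because discovery is just a lookup, a portal engine that already knows its providers can skip the registry entirely, consistent with the point that it need not be part of the architecture.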
14DiGIR Architecture
- Provider
- Registry (register)
15DiGIR Architecture
- Portal Engine
- Registry (discover)
16DiGIR Protocol
- Defines request and response message formats for communication between provider, portal engine, and applications
- Metadata requests
- Search requests
- Inventory requests
- Remains unfettered by the structure of the data it transfers
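To make the idea of a data-independent message format concrete, here is a sketch of an XML request envelope whose header names the request type while the body is passed through untouched. The element names (`request`, `header`, `source`, `destination`, `type`) are invented for illustration and are not the actual DiGIR schema.

```python
import xml.etree.ElementTree as ET

# The three request types named on the slide.
REQUEST_TYPES = {"metadata", "search", "inventory"}


def build_request(request_type, source, destination, body=None):
    """Wrap an opaque body element in a hypothetical request envelope."""
    if request_type not in REQUEST_TYPES:
        raise ValueError(f"unknown request type: {request_type}")
    root = ET.Element("request")
    header = ET.SubElement(root, "header")
    ET.SubElement(header, "source").text = source
    ET.SubElement(header, "destination").text = destination
    ET.SubElement(header, "type").text = request_type
    if body is not None:
        # The envelope does not inspect the body, so any data structure fits.
        root.append(body)
    return ET.tostring(root, encoding="unicode")


def read_header(xml_text):
    """Return the header fields of a request as a plain dictionary."""
    header = ET.fromstring(xml_text).find("header")
    return {child.tag: child.text for child in header}
```

A receiver can route on `read_header(...)["type"]` without knowing anything about the data described in the body.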
17DiGIR Architecture
18DiGIR Architecture
- Application
- Protocol (request)
- Portal Engine
19DiGIR Architecture
- Application
- Protocol (request)
- Portal Engine
- Protocol (request)
- Provider
20DiGIR Architecture
- Application
- Protocol (request)
- Portal Engine
- Protocol (response)
- Provider
21DiGIR Architecture
- Application
- Protocol (response)
- Portal Engine
22Applications
- Must be able to assemble and send a request document to a portal
- Must be able to receive and interpret a response document from the portal
- Must do something incredibly useful and interesting with the data
- This is where the real fun is!
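The application's three obligations (assemble a request, interpret the response, do something with the data) can be sketched as follows. The transport is passed in as a `send` callable so the example stays self-contained; `run_query`, `summarize_response`, and the XML element and attribute names are hypothetical, not part of any DiGIR client.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def summarize_response(xml_text):
    # Interpret the response document: count records per contributing provider.
    root = ET.fromstring(xml_text)
    return Counter(record.get("provider") for record in root.iter("record"))


def run_query(send, genus):
    """Assemble a request document, send it to a portal, summarize the reply.

    `send` is any callable that takes request text and returns response text,
    for example a thin wrapper around an HTTP POST to a portal URL.
    """
    request = ET.Element("request", {"type": "search"})
    ET.SubElement(request, "genus").text = genus
    response_text = send(ET.tostring(request, encoding="unicode"))
    return summarize_response(response_text)
```

Swapping `send` for a function that posts to a real portal is the only change needed to move from a canned response to live data, under the assumption that the portal returns a document of the same shape.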
23MaNIS The Mammal Networked Information System
It's more than just a pangolin
24MaNIS Network Configuration
MaNIS DiGIR Portal
MaNIS DiGIR Portal
MaNIS DiGIR Portal
MVZ-MaNIS Presentation Layer
UMNH-MaNIS Presentation Layer
UWBM-MaNIS Presentation Layer
32MaNIS Network Configuration
MaNIS DiGIR Portal
MVZ-MaNIS Presentation Layer
44MaNIS Network Configuration
CalNet DiGIR Portal
45MaNIS Network Configuration
CalNet DiGIR Portal
BioGeomancer Web Service
46MaNIS Network Configuration
NBII DiGIR Portal
47MaNIS Network Configuration
NBII DiGIR Portal
GBIF Presentation Layers
48Intra-Network Configuration (BNHM)
BNHM DiGIR Portal
BNHM Presentation Layer
49Other Network Configurations
55Distributed vs. centralized
- Multiple sources of data
- under local control,
- with concepts in common
- and a desire to deliver as part of a community
56Distributed vs. centralized
- In other words, distribute the headache to avoid migraines.
57Project Information
- DiGIR is a collaborative open source development project on SourceForge (https://sourceforge.net/projects/digir).
- Software and documentation are available on the DiGIR web site (http://digir.net).
- MaNIS is an international network collaboration among mammal specimen collections (http://elib.cs.berkeley.edu/manis).
58Portal Demonstrations