Title: ORNIS: Looking into the Future
1 ORNIS: Looking into the Future
Accipiter gularis
- Principal Investigators: A. Town Peterson, Carla Cicero
- Lead Programmer: John Wieczorek
- Web Site Portal: http://ornisnet.org
2 What is ORNIS?
An integrated, multi-institutional,
cross-disciplinary information resource
(ORNithological Information System) providing
state-of-the-art, up-to-the-minute information
about bird biology and bird diversity worldwide
that can address pressing questions in
Ornithology and in broader policy decisions
3 ORNIS Scope
- NSF Biological Databases and Informatics Program
- 5 years (from 1 Sep 2004)
- $1.5M (65% cut from original budget)
- 29 institutions funded
- 33 participating
- open to additional participants (U.S./foreign)
- >5M specimens, >21M observations
4 [Diagram: Primary Species Occurrence Data and linked data types]
- Taxonomic data
- Scientific literature
- Gene sequence data, genomics
- Recordings, images, videos
- Stable isotope data
- Field notes, other ancillary information
- Parasites, etc.
- Stomach contents, etc.
- Geospatial data describing locality
- Remote-sensing data showing locality in space and time
5 ORNIS: Improved resource for
- Identifying gaps in taxonomic and/or geographic sampling
- need for additional collecting for genetic or other studies
- Tracking emerging diseases (e.g., WNV, avian flu)
- Correlating changes in species distributions and/or abundances with environmental changes (GIS analysis)
- inclusion of observational (monitoring) data improves power of analysis
6 ORNIS Rationale
- Multiple sources of data
- under local control
- with concepts in common
- Distributed nature of biodiversity data requires collaboration across institutions for efficient access
- similar efforts for mammals, herps, fish, etc.
- registered with GBIF for global access
7 Example: Mexico
8 ORNIS Goals
- Project Web Site: http://ornisnet.org
- Distributed Data Network
- 19 of 33 providers currently online
- Georeferencing (promised North America, can do more)
- automated georeferencing via Biogeomancer (http://www.biogeomancer.org)
- Data Validation (taxonomy, geography, collecting events)
- error detection
- outlier detection
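One simple way to picture the outlier-detection goal is the sketch below. It assumes georeferenced records for a single species are available as (latitude, longitude) pairs and flags points that lie much farther from the median point than is typical. The function names, the factor of 5, and the sample coordinates are hypothetical illustrations, not part of ORNIS, and the coordinate-wise median ignores longitude wrap-around.

```python
import math
from statistics import median

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def flag_outliers(points, factor=5.0):
    """Return indices of points far from the coordinate-wise median point."""
    c_lat = median(p[0] for p in points)
    c_lon = median(p[1] for p in points)
    dists = [haversine_km(lat, lon, c_lat, c_lon) for lat, lon in points]
    typical = median(dists)  # typical distance from the centre
    return [i for i, d in enumerate(dists) if typical > 0 and d > factor * typical]

# Made-up coordinates: four clustered points and one with flipped signs,
# a common georeferencing error.
points = [(19.4, -99.1), (19.5, -99.2), (19.3, -99.0), (19.6, -99.3), (-19.4, 99.1)]
suspects = flag_outliers(points)
```

Here `suspects` holds the index of the sign-flipped record, which a curator could then review rather than delete automatically.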
9 ORNIS Network Architecture
Sasia ochracea
10 DiGIR: Distributed Generic Information Retrieval
- Distributed: a protocol for retrieving structured data from multiple, heterogeneous databases across the Internet.
- Generic: a protocol independent of the data retrieved and of the software to retrieve it.
Developed collaboratively by University of Kansas Biodiversity Research Center, California Academy of Sciences, Museum of Vertebrate Zoology. Used widely by similar initiatives (MaNIS, HerpNET, FishNet, GBIF, etc.).
11 DWC2: Darwin Core 2 Schema
- A simple set of data element definitions designed to support the sharing and integration of primary biodiversity data
- Relevant extensions for ORNIS include
- Curatorial
- Geospatial
- (new version still under review: http://darwincore.calacademy.org)
12 Relevance for DNA barcoding
- Allows querying of tissues across multiple institutions
- now a separate field
- new version of DWC2 includes tissues as part of specimen preparation in the curatorial extension
- need standardized way of storing tissue data
- Allows linking to GenBank accession numbers if supported by institutional database
- new version of DWC2 includes GenbankNum as standard field in the curatorial extension
13 Provider
- Receives requests
- Logs requests
- Retrieves data from database
- Sends results to requester
- Can be registered
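The provider's responsibilities listed above can be sketched in a few lines. This is an illustrative model only: the `Provider` class, its `handle` method, and the sample records are hypothetical, and a real DiGIR provider exchanges XML over HTTP rather than Python objects.

```python
from dataclasses import dataclass, field

@dataclass
class Provider:
    """Hypothetical provider wrapping one institution's local database."""
    name: str
    records: list                      # stand-in for the local database
    log: list = field(default_factory=list)

    def handle(self, concept, value):
        """Receive a request, log it, query the database, return the results."""
        self.log.append((concept, value))  # log the request
        # Retrieve matching rows from the (in-memory) database.
        return [r for r in self.records if r.get(concept) == value]

# Made-up sample data for illustration.
museum_a = Provider("Museum A", [
    {"Genus": "Sasia", "Country": "India"},
    {"Genus": "Accipiter", "Country": "Japan"},
])
hits = museum_a.handle("Genus", "Sasia")
```

Keeping the log inside the provider reflects the point that each institution retains its own record of who asked for what.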
14 Registry
- Provides a "yellow pages" to advertise the existence and capabilities of a provider
- Provides a means to discover potential providers of interest
- May be public or private
- Need not be a part of the architecture
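A registry of this kind can be pictured as a lookup table from provider names to advertised capabilities. The sketch below is a minimal, hypothetical model: the `Registry` class, its method names, and the example URLs are assumptions made for illustration and do not describe the registry ORNIS uses.

```python
class Registry:
    """Hypothetical yellow-pages registry of providers."""

    def __init__(self):
        self._entries = {}

    def register(self, name, url, concepts):
        """Advertise a provider and the concepts it can answer."""
        self._entries[name] = {"url": url, "concepts": set(concepts)}

    def discover(self, concept):
        """Return the names of providers that advertise the given concept."""
        return sorted(n for n, e in self._entries.items() if concept in e["concepts"])

registry = Registry()
# Placeholder names and URLs, not real endpoints.
registry.register("Museum A", "http://example.org/a", ["Genus", "Country"])
registry.register("Museum B", "http://example.org/b", ["Genus", "GenbankNum"])
found = registry.discover("GenbankNum")
```

Because discovery is just a lookup, a portal that already knows its providers could skip the registry entirely, which matches the note that it need not be part of the architecture.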
15 Portal/Dispatcher
- Queries a registry to discover potential providers
- Determines, based on provider metadata, whether a provider should be queried
- Sends requests to multiple providers
- Assembles responses from providers
- Returns packaged results to the requesting application
- Logs activity
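The dispatcher steps above can be sketched as follows. This is a simplified, sequential illustration: the `Dispatcher` class, the `make_provider` helper, and the dictionary used as registry metadata are hypothetical stand-ins, not the ORNIS implementation.

```python
def make_provider(records):
    """Return a hypothetical provider: a function answering (concept, value)."""
    return lambda concept, value: [r for r in records if r.get(concept) == value]

class Dispatcher:
    def __init__(self, registry):
        # registry: name -> {"concepts": set, "query": callable}, a stand-in
        # for the provider metadata a real registry would supply.
        self.registry = registry
        self.log = []

    def search(self, concept, value):
        # Use provider metadata to decide which providers to query.
        targets = [n for n, meta in self.registry.items() if concept in meta["concepts"]]
        results = []
        for name in targets:
            for rec in self.registry[name]["query"](concept, value):
                results.append({"provider": name, **rec})  # assemble responses
        self.log.append((concept, value, targets))          # log activity
        return results

# Made-up providers and records for illustration.
registry = {
    "Museum A": {"concepts": {"Genus"},
                 "query": make_provider([{"Genus": "Sasia"}])},
    "Museum B": {"concepts": {"Genus", "Country"},
                 "query": make_provider([{"Genus": "Sasia", "Country": "Nepal"},
                                         {"Genus": "Accipiter", "Country": "Japan"}])},
}
dispatcher = Dispatcher(registry)
by_country = dispatcher.search("Country", "Nepal")
```

In this run only Museum B is contacted, because Museum A does not advertise the Country concept; skipping it saves a round trip that could not return anything.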
16 Construct query on data portal
ORNIS DiGIR Dispatcher
ORNIS Data Portal
17 Dispatcher distributes request
18 Providers query databases
19 Databases respond to providers
20 Providers respond to dispatcher
21 Last response returns to dispatcher
22 Dispatcher sends results to portal
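The request flow in slides 16 to 22 amounts to a fan-out followed by a gather. The sketch below simulates it with threads, assuming each provider answers independently and with a different latency. The function names, delays, and catalog numbers are hypothetical, and a real dispatcher would issue HTTP requests instead of calling local functions.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def query_provider(name, delay, records, genus):
    """Simulate one provider querying its database (delay = latency)."""
    time.sleep(delay)
    return name, [r for r in records if r["Genus"] == genus]

def dispatch(providers, genus):
    """Send the request to every provider at once; return after the last response."""
    merged, order = [], []
    with ThreadPoolExecutor(max_workers=len(providers)) as pool:
        futures = [pool.submit(query_provider, n, d, recs, genus)
                   for n, d, recs in providers]
        for fut in as_completed(futures):  # responses may arrive in any order
            name, recs = fut.result()
            order.append(name)
            merged.extend(recs)
    return merged, order

# Made-up providers: (name, simulated latency in seconds, records).
providers = [
    ("Museum A", 0.02, [{"Genus": "Sasia", "CatalogNumber": "A-1"}]),
    ("Museum B", 0.01, [{"Genus": "Sasia", "CatalogNumber": "B-7"},
                        {"Genus": "Accipiter", "CatalogNumber": "B-8"}]),
]
results, arrival_order = dispatch(providers, "Sasia")
```

The total wait is governed by the slowest provider, not the sum of all providers, which is why the dispatcher returns results to the portal only once the last response is in.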
23 Key Issues
- Ownership maintained locally
- data controlled locally
- varying data quality and standards (e.g., taxonomic names, geographic names, specimen part names)
- not all fields supported by all institutions
- mapping of fields across institutions may be a problem
- Institutions may restrict classes of data
- entire records (e.g., proprietary research material)
- certain fields (e.g., locality, collector)
- Updated data served in real time
- frequency of updates depends on local institution
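Two of the issues above, field mapping and field-level restrictions, can be illustrated together. The sketch assumes each institution keeps a table that maps its local column names to shared concept names, plus a list of concepts it chooses to withhold. The function `to_shared`, the local column names, and the sample row are all hypothetical.

```python
def to_shared(record, field_map, restricted=()):
    """Map local column names to shared concept names and drop restricted ones."""
    shared = {}
    for local_name, value in record.items():
        concept = field_map.get(local_name)
        if concept is None or concept in restricted:
            continue  # unmapped or withheld fields are not served
        shared[concept] = value
    return shared

# Made-up local row; "shelf" has no shared concept and is never exposed.
local_row = {"gen": "Glaucidium", "ctry": "Nepal", "loc": "withheld site",
             "coll": "J. Doe", "shelf": "12B"}
field_map = {"gen": "Genus", "ctry": "Country", "loc": "Locality", "coll": "Collector"}
served = to_shared(local_row, field_map, restricted={"Locality", "Collector"})
```

Because the mapping and the restriction list live with the institution, each provider decides for itself what leaves its database.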
24 What does this mean for DNA Barcoding?
- Data on tissues will vary depending on local databases (how data are stored)
- Access to data IS NOT access to tissues
- institutional policies
- institutional decision-making
- Critical link between barcoding data and original provenance of samples (NO centralization)
- barcode sequences should be voucher-based
- data should be maintained at institution where voucher is deposited (taxonomic/identification updates, etc.)
- GenBank sequences should link directly back to vouchers and institutional data
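The voucher link described above can be sketched as a join between sequence records and specimen records. The example assumes both sides carry an institution code and a catalog number; the field names other than GenbankNum, the placeholder accession numbers, and the function `link_sequences` are hypothetical.

```python
def link_sequences(sequences, vouchers):
    """Attach each barcode sequence to its voucher specimen record, if any."""
    index = {(v["InstitutionCode"], v["CatalogNumber"]): v for v in vouchers}
    linked, orphans = [], []
    for seq in sequences:
        key = (seq["InstitutionCode"], seq["CatalogNumber"])
        if key in index:
            # Take the name from the voucher so identification updates made
            # at the holding institution carry through to the sequence.
            linked.append({**seq, "ScientificName": index[key]["ScientificName"]})
        else:
            orphans.append(seq["GenbankNum"])  # no voucher found
    return linked, orphans

# Placeholder data; accession numbers are not real GenBank entries.
vouchers = [{"InstitutionCode": "INST", "CatalogNumber": "100",
             "ScientificName": "Sasia ochracea"}]
sequences = [
    {"GenbankNum": "XX000001", "InstitutionCode": "INST", "CatalogNumber": "100"},
    {"GenbankNum": "XX000002", "InstitutionCode": "INST", "CatalogNumber": "999"},
]
linked, orphans = link_sequences(sequences, vouchers)
```

Sequences that end up in `orphans` cannot be traced to a specimen, which is the situation a voucher-based policy is meant to prevent.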
25 Success of the DNA barcoding initiative depends on
community collaboration to provide data and
genetic resources!
26 Downstream Costs
- Support for maintenance, enhancement, growth, and use of genetic resource collections
- Support for institutions that manage and serve data
- computerization of collections (staff time, dedicated computers)
- secure servers to enhance online access
- programming time to add institutions to the network
27 ORNIS portal: http://ornisnet.org
Glaucidium brodiei
28 Query Steps
- Select one or multiple providers
- Select one or more query conditions
- Concepts, e.g., Genus, Country
- Comparators, e.g., equals, contains
- Select result set structure
- Mapping
- allows mapping, table views, downloads
- subset of fields displayed
- only georeferenced points mapped
- Full result set
- all fields displayed
- currently not mappable
- Custom result set
- Selected (custom) fields displayed
- currently not mappable
- use to output Genbank numbers
- Specify record limit (default to display 10)
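The query steps above combine conditions (a concept, a comparator, and a value), a choice of fields, and a record limit. The sketch below models that combination over in-memory records. The `run_query` function, the comparator table, and the sample records with placeholder accession numbers are hypothetical and do not reproduce the portal's code.

```python
COMPARATORS = {
    "equals": lambda field_value, value: field_value == value,
    "contains": lambda field_value, value: value.lower() in field_value.lower(),
}

def run_query(records, conditions, fields=None, limit=10):
    """Filter records by all conditions, project selected fields, apply the limit."""
    out = []
    for rec in records:
        if len(out) >= limit:
            break  # record limit reached
        if all(COMPARATORS[op](str(rec.get(concept, "")), value)
               for concept, op, value in conditions):
            # Custom result set: keep only the requested fields, if given.
            out.append({k: rec.get(k) for k in fields} if fields else dict(rec))
    return out

# Made-up records for illustration.
records = [
    {"Genus": "Glaucidium", "Country": "Nepal", "GenbankNum": "XX000001"},
    {"Genus": "Glaucidium", "Country": "China", "GenbankNum": "XX000002"},
    {"Genus": "Sasia", "Country": "Nepal", "GenbankNum": None},
]
conditions = [("Genus", "equals", "Glaucidium"), ("Country", "contains", "nep")]
custom = run_query(records, conditions, fields=["Genus", "GenbankNum"])
```

Passing `fields` corresponds to the custom result set, which is the option the slide recommends for outputting GenBank numbers; omitting it returns every field, as in the full result set.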
29 ORNIS portal: http://ornisnet.org
Glaucidium brodiei