Title: Natural History Collections Infrastructure
1Natural History Collections Infrastructure
- Ricardo Scachetti-Pereira
- The University of Kansas
- Biodiversity Research Center
- Natural History Museum
- KU-BRC/NHM
2Biological Collections
- Natural History Museums, Herbaria and
Culture Collections provide fundamental
resources for biodiversity and ecological
research
- Development of IT infrastructure to make
results of those services (specimen data)
more accessible
3Collections Data Integration
- Integrate Common Information
- Scientific Name, Taxonomy
- Geography, Locality, GPS coordinates
- Collection Events (Collector) and other
information
- Geographically Distributed
- Birds of Mexico spread over 43 institutions
around the world; main holder had only
16% of total specimens
- Heterogeneous Hardware and Software
- Database Vendors (Access, Oracle, SQL Server)
- Database Schemas (Table Definitions)
- Software (Specify, Biota, In-house, etc.)
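The schema heterogeneity listed above can be illustrated with a minimal sketch, assuming two hypothetical institutions whose local column names are mapped to a shared set of fields. The institution names, column names and the `to_common` helper are illustrative assumptions, not taken from Specify, Biota or any real schema.

```python
# Hypothetical per-institution mappings: local column -> common field name.
FIELD_MAPS = {
    "museum_a": {"sci_name": "scientific_name", "lat": "latitude",
                 "lon": "longitude", "coll": "collector"},
    "museum_b": {"Taxon": "scientific_name", "DecLat": "latitude",
                 "DecLong": "longitude", "CollectedBy": "collector"},
}

def to_common(source, record):
    """Rename a local record's fields to the shared vocabulary.

    Local-only fields (not in the mapping) are dropped.
    """
    mapping = FIELD_MAPS[source]
    return {mapping[k]: v for k, v in record.items() if k in mapping}

# Made-up example records, one per institution.
a = to_common("museum_a", {"sci_name": "Turdus migratorius", "lat": 19.4,
                           "lon": -99.1, "coll": "J. Smith", "shelf": "B12"})
b = to_common("museum_b", {"Taxon": "Turdus migratorius", "DecLat": 20.1,
                           "DecLong": -98.7, "CollectedBy": "M. Lopez"})
merged = [a, b]  # both records now share the same keys
```

Keeping the mapping as data rather than code means a new institution could be added by supplying one more dictionary, which is one plausible reason to prefer this design.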
4Collections Data Integration
- Distributed Generic Information Retrieval
(DiGIR)
- XML-based protocol for retrieving structured
data from multiple, distributed,
heterogeneous databases over the Internet
5DiGIR Protocol
- Portal (UI) builds XML query
- Portal broadcasts XML query to providers
- Each provider translates XML query into
native SQL query (database schema)
- Provider translates results into XML result
set and sends it back to portal
- Portal integrates answers from various
providers into a single, homogeneous result
table
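The steps above can be sketched in a few lines, assuming in-memory SQLite databases standing in for remote providers. The XML element names (`query`, `equals`, `resultset`, `record`), the `Provider` class and the sample rows are hypothetical and do not follow the actual DiGIR message format; the "broadcast" is also done sequentially rather than over the network.

```python
import sqlite3
import xml.etree.ElementTree as ET

def build_query(field, value):
    """Portal side: build a toy XML query (not the real DiGIR schema)."""
    q = ET.Element("query")
    cond = ET.SubElement(q, "equals", {"field": field})
    cond.text = value
    return ET.tostring(q, encoding="unicode")

class Provider:
    """Wraps one local database; column_map is common field -> local column."""
    def __init__(self, name, table, column_map, rows):
        self.name, self.table, self.column_map = name, table, column_map
        self.db = sqlite3.connect(":memory:")
        cols = list(column_map.values())
        self.db.execute(f"CREATE TABLE {table} ({', '.join(cols)})")
        marks = ", ".join("?" for _ in cols)
        self.db.executemany(f"INSERT INTO {table} VALUES ({marks})", rows)

    def handle(self, xml_query):
        cond = ET.fromstring(xml_query).find("equals")
        local_col = self.column_map[cond.get("field")]
        # Translate the XML condition into a native, parameterized SQL query.
        select = ", ".join(f"{c} AS {f}" for f, c in self.column_map.items())
        sql = f"SELECT {select} FROM {self.table} WHERE {local_col} = ?"
        cur = self.db.execute(sql, (cond.text,))
        fields = [d[0] for d in cur.description]
        # Translate the native rows into an XML result set.
        result = ET.Element("resultset", {"provider": self.name})
        for row in cur:
            rec = ET.SubElement(result, "record")
            for f, v in zip(fields, row):
                ET.SubElement(rec, f).text = str(v)
        return ET.tostring(result, encoding="unicode")

def portal_search(providers, field, value):
    """Send one query to every provider and merge the answers into one table."""
    xml_query = build_query(field, value)
    table = []
    for p in providers:
        result = ET.fromstring(p.handle(xml_query))
        for rec in result.findall("record"):
            row = {child.tag: child.text for child in rec}
            row["provider"] = result.get("provider")
            table.append(row)
    return table

# Two made-up providers with different table and column names.
providers = [
    Provider("museum_a", "birds",
             {"scientific_name": "sci_name", "locality": "loc"},
             [("Turdus migratorius", "Oaxaca"), ("Passer domesticus", "Puebla")]),
    Provider("museum_b", "specimens",
             {"scientific_name": "Taxon", "locality": "Place"},
             [("Turdus migratorius", "Chiapas")]),
]
rows = portal_search(providers, "scientific_name", "Turdus migratorius")
```

Each row in `rows` uses the common field names regardless of which provider answered, which is the "single, homogeneous result table" idea in miniature.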
6MaNIS
7MaNIS
8MaNIS
9GBIF
10GBIF
11GBIF
12GBIF
13Lifemapper
14Lifemapper
15Applications of Collections Data: Methods
- Prediction of Species Distributions
- Looks for non-random correlations between
point occurrence and environmental conditions
- Genetic Algorithms (GARP)
- Artificial Neural Networks (ANN)
- Generalized Linear Models (GLM)
- Generalized Additive Models (GAM)
- Many, many others
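As a rough illustration of one method from the list, the sketch below fits a logistic regression (a binomial GLM with a logit link) to synthetic presence/absence points. This is not GARP and not any tool named in these slides; the two environmental variables, the "warm and wet" rule that generates the data, and all function names are assumptions made up for the example.

```python
import math
import random

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit logistic regression by batch gradient ascent on the log-likelihood."""
    w = [0.0] * (len(X[0]) + 1)  # intercept + one weight per variable
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            p = 1.0 / (1.0 + math.exp(-z))
            err = yi - p
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * g / len(X) for wj, g in zip(w, grad)]
    return w

def predict(w, x):
    """Predicted probability of presence at environmental conditions x."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

# Synthetic sites: standardized temperature and precipitation in [-1, 1].
random.seed(0)
X, y = [], []
for _ in range(200):
    temp, precip = random.uniform(-1, 1), random.uniform(-1, 1)
    X.append((temp, precip))
    # Made-up rule: the hypothetical species occurs at warm, wet sites.
    y.append(1 if temp + precip > 0 else 0)

w = fit_logistic(X, y)
warm_wet = predict(w, (0.8, 0.8))    # conditions resembling presence sites
cold_dry = predict(w, (-0.8, -0.8))  # conditions resembling absence sites
```

Applying `predict` to every cell of an environmental grid would give a map of predicted suitability, which is the general shape of the distribution-prediction workflow.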
16Applications of Collections Data: Examples
- Prediction of Species Actual and Potential
Geographical Distribution
- Invasive Species
- Spread of Diseases
- Evolutionary Biology
- Management
- Monitoring
17Applications: Predicting the distribution of filovirus disease
Slide by A. Townsend Peterson (KU)
18Applications: Predicting the impact of species invasions
19Ecological Data Integration
- Specimen data currently integrated
- Other data required for analysis
- Climate, Relief, Land Cover/Use, Remote
Sensing, etc.
- Acquisition, processing and integration are
still largely manual
20Ecological Data Integration: The SEEK Vision
- Science Environment for Ecological Knowledge
(SEEK)
- Partners: NCEAS, UNM, SDSC, KU
21SEEK Objectives
- Provide access to biodiversity, ecological
and environmental data (discovery, sharing
and reuse)
- Provide scalable and streamlined framework
for analysis and synthesis
- Use Semantic Mediation to integrate
heterogeneous data and analytical steps
22SEEK Overview
Slide by Chad Berkeley (NCEAS)
23SEEK Ecogrid
- Integrate diverse data networks from
ecology, biodiversity and environmental
sciences
- XML-based language used for data
documentation
- Access to computational resources via the Grid
Slide by Matt Jones (NCEAS)
24SEEK Data Integration
Slide by Matt Jones (NCEAS)
25SEEK Analysis Modeling
Slide by Matt Jones (NCEAS)
26SEEK Kepler
27SEEK Kepler
28SEEK Taxonomic Object Service
- Taxon concepts change over time (and space)
- Multiple competing concepts coexist
- Names are re-used for multiple concepts
[Figure: treatments of Rhynchospora plumosa s.l. by successive authors (Elliot 1816, Gray 1834, Chapman 1860, Kral 1998, Peet 2002?), with labels A, B and C. Names shown: R. plumosa, R. plumosa v. plumosa, R. plumosa v. intermedia, R. plumosa v. interrupta, R. plumosa v. pineticola, R. intermedia, R. pineticola, R. sp. 1]
Slide by Bill Michener (UNM)
Information by Robert Peet (UNC)
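The point that one name can stand for several concepts can be sketched as a small data structure, assuming a concept is identified by a name together with the publication that uses it. The `TaxonConcept` class and `resolve` helper are hypothetical and are not the Taxonomic Object Service API; the name and author pairings are read off the slide's figure as extracted, and no relationships between concepts are asserted.

```python
from dataclasses import dataclass
from collections import defaultdict

@dataclass(frozen=True)
class TaxonConcept:
    name: str          # the scientific name as used
    according_to: str  # the publication in which this usage appears

# Pairings taken from the slide; the list is illustrative, not complete.
concepts = [
    TaxonConcept("R. plumosa", "Elliot 1816"),
    TaxonConcept("R. plumosa", "Gray 1834"),
    TaxonConcept("R. plumosa", "Kral 1998"),
    TaxonConcept("R. pineticola", "Kral 1998"),
]

# Index concepts by name so ambiguous names are easy to detect.
by_name = defaultdict(list)
for c in concepts:
    by_name[c.name].append(c)

def resolve(name, according_to=None):
    """Return matching concepts; a bare name may match several."""
    return [c for c in by_name.get(name, [])
            if according_to is None or c.according_to == according_to]

ambiguous = resolve("R. plumosa")             # three candidate concepts
exact = resolve("R. plumosa", "Kral 1998")    # one concept
```

A specimen record that stores only "R. plumosa" would match all three entries, while one that also records the reference resolves to a single concept.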
29SEEK Road map
- Now into the 2nd year (out of 5)
- Working prototypes for
- Ecogrid
- Kepler (UI)
- Semantic Mediation System
- Taxonomic Object Service
30Role of Collections in NEON
- Provide fundamental services for biodiversity
and ecological research and monitoring
- Collections count on IT infrastructure to
provide valuable information to NEON
- Will be seamlessly integrated with other
relevant sources of data
31Integrating Collections into NEON
- Rate and amount of deposits limited by
- Physical Installations (Storage Facilities)
- Personnel (Allocation and Training)
- Preservation/Storage Processes
- Computerization Process
- Require proper allocation of resources to
function as part of a monitoring facility