Title: North American initiatives in Ecoinformatics: VegBank and SEEK
1North American initiatives in Ecoinformatics: VegBank and SEEK
Robert K. Peet and The Ecological Society of America Vegetation Panel
The SEEK development team
2Case Studies
4Mean Species Richness
          Upland (1090 plots)             Riparian (121 plots)
Native    31.12                           55.66
Exotic    0.20 (268 plots with exotics)   7.98 (110 plots with exotics)
Kruskal-Wallis:
Native Richness: Chisq = 353.2, df = 1, P < 0.0001
Exotic Richness: Chisq = 127.7, df = 1, P < 0.0001
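The Kruskal-Wallis comparison reported on this slide can be sketched in a few lines. This is a minimal illustration, not the analysis behind the slide: the plot counts below are hypothetical toy values, and the function names are chosen here for the example. With two groups (df = 1), the chi-square tail probability reduces to a complementary error function.

```python
import math
from itertools import chain


def kruskal_wallis_h(*groups):
    """Tie-corrected Kruskal-Wallis H statistic for two or more samples."""
    pooled = sorted(chain.from_iterable(groups))
    n = len(pooled)
    rank_of = {}   # value -> average rank among all pooled observations
    tie_term = 0   # sum of (t^3 - t) over each run of tied values
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i                             # size of this run of ties
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        tie_term += t**3 - t
        i = j
    # For each group: (sum of ranks)^2 / group size.
    terms = (sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * sum(terms) - 3 * (n + 1)
    correction = 1 - tie_term / (n**3 - n)
    if correction == 0:
        raise ValueError("all observations are identical; H is undefined")
    return h / correction


def p_value_df1(h):
    """Upper-tail chi-square probability for df = 1 (two groups)."""
    return math.erfc(math.sqrt(h / 2))


# Hypothetical native-richness counts per plot (NOT the data behind the slide).
upland = [28, 31, 25, 35, 30]
riparian = [52, 60, 48, 57, 55]
h = kruskal_wallis_h(upland, riparian)
print(f"H = {h:.3f}, P = {p_value_df1(h):.4f}")
```

Feeding the slide's own statistics into `p_value_df1` (353.2 and 127.7) gives probabilities far below 0.0001, consistent with the reported P values.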
6Type A plots that keep on giving
- Unusual: supply-side driven, influenced little by competition
- Little spatial structure; disturbance important
(? species pool)
(? competitive exclusion)
9Traditional Community Ecology
- The questions
- How are communities structured?
- How do taxa interact?
- The solutions
- Simple observations.
- Simple experiments.
- The scale
- Stand or landscape.
10Major data types
- Site data: climate, soils, topography, etc.
- Taxon attribute data: identification, phylogeny, distribution, life-history, functional attributes, etc.
- Occurrence data: attributes of individuals (e.g., size, age, growth rate) and taxa (e.g., cover, biomass) that co-occur at a site.
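One way to picture how the three data types relate is as three record types joined on site and taxon. The sketch below is illustrative only: the class names, field names, and records are hypothetical and are not taken from VegBank or any other schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:                 # site data
    site_id: str
    elevation_m: float
    soil_ph: float


@dataclass(frozen=True)
class TaxonAttributes:      # taxon attribute data
    taxon: str
    growth_form: str


@dataclass(frozen=True)
class Occurrence:           # occurrence data: a taxon recorded at a site
    site_id: str
    taxon: str
    cover_percent: float


def summarize_site(site, occurrences, taxa):
    """Join the three data types into a per-site summary."""
    form_of = {t.taxon: t.growth_form for t in taxa}
    here = [o for o in occurrences if o.site_id == site.site_id]
    cover_by_form = {}
    for occ in here:
        form = form_of.get(occ.taxon, "unknown")
        cover_by_form[form] = cover_by_form.get(form, 0.0) + occ.cover_percent
    return {
        "site_id": site.site_id,
        "elevation_m": site.elevation_m,
        "richness": len({o.taxon for o in here}),
        "cover_by_growth_form": cover_by_form,
    }


# Hypothetical records, for illustration only.
site = Site("plot-1", elevation_m=850.0, soil_ph=5.5)
taxa = [
    TaxonAttributes("taxon A", "tree"),
    TaxonAttributes("taxon B", "tree"),
    TaxonAttributes("taxon C", "herb"),
]
occurrences = [
    Occurrence("plot-1", "taxon A", 40.0),
    Occurrence("plot-1", "taxon B", 10.0),
    Occurrence("plot-1", "taxon C", 5.0),
    Occurrence("plot-2", "taxon A", 20.0),
]
summary = summarize_site(site, occurrences, taxa)
print(summary)
```

Keeping occurrences separate from site and taxon attributes means either side can be revised (a new soil measurement, a new trait) without touching the plot records themselves.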
11EcoInformatics ?
- Massive plot data have the potential to create new disciplines and allow critical syntheses.
- Theoretical community ecology. Which taxa occur together, and where, and following what rules?
- Remote sensing. What is really on the ground?
- Monitoring. What changes are really taking place in the vegetation?
- Restoration. What should be our restoration targets?
- Vegetation and species modeling. Where should we expect species and communities to occur after environmental changes?
12Conclusions?
- Standard data structures
- Standards for data exchange
- Public data archives (functions for deposit, discovery, withdrawal, citation, annotation)
- Standards for data archiving
- Standards for reference to taxonomic data
- Standard software tools
13Background
The ESA Vegetation Classification Panel was
established in 1993 with a mandate to support the
emerging U.S. Vegetation Classification.
14I am pleased to acknowledge the support and
cooperation of
15Guidelines for Vegetation Classification
The ESA Vegetation Panel and its partners have collaborated to develop guidelines for the floristic levels of the classification covering:
- Requirements for vegetation field plots.
- Documentation and description of floristic types.
- Submission and peer review of proposed types.
- Management, citation, and archiving of vegetation data.
16Overview of online resources
Stores plots and makes them publicly accessible
Stores current communities in the NVC
Stores current plant taxonomy
Allows people to change and update NVC and plants
17VegBank
- The ESA Vegetation Panel is developing a public archive for vegetation plots known as VegBank (http://vegbank.org).
- VegBank is expected to function for vegetation plot data in a manner analogous to GenBank.
- Primary data will be deposited for reference, novel synthesis, and reanalysis.
- The database architecture is generalizable to most types of species co-occurrence data.
18http://www.vegbank.org
20VegBank data are open access
- All data placed in VegBank are available to the public at no charge (unless the plot contributor places restrictions to protect location information for rare and endangered species or private lands).
- Key data can be viewed by a simple web link.
- The following link shows information for two VegBank plots:
- http://vegbank.org/get/std/observation/5153,5906
21http://vegbank.org/get/std/observation/'VB.Ob.5153.YOSE98M19'
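The two link forms shown on these slides (comma-separated numeric IDs, and a single accession code in single quotes) can be generated with a small helper. This sketch only reproduces the URL pattern as it appears on the slides; the function names are hypothetical, and whether the links still resolve, or whether other identifier forms are accepted, is not something the slides establish.

```python
BASE_URL = "http://vegbank.org/get/std/observation/"


def url_for_ids(ids):
    """Link for one or more numeric observation IDs, comma-separated."""
    ids = list(ids)
    if not ids or any(not isinstance(i, int) or i <= 0 for i in ids):
        raise ValueError("expected one or more positive integer IDs")
    return BASE_URL + ",".join(str(i) for i in ids)


def url_for_accession(code):
    """Link for a single accession code, wrapped in single quotes."""
    if not code or "'" in code:
        raise ValueError("accession code must be non-empty and unquoted")
    return f"{BASE_URL}'{code}'"


print(url_for_ids([5153, 5906]))
print(url_for_accession("VB.Ob.5153.YOSE98M19"))
```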
22Biodiversity data structure
SynTaxon
Community type databases
23Core elements of VegBank
Project
Plot
Plot Observation
Taxon / Individual Observation
Taxon Interpretation
Plot Interpretation
24Taxon/community interpretation
- Multiple concepts can be linked simultaneously
- Degree of fit for each can be indicated
- Subsequent interpretations supported.
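The three properties listed above (several concepts at once, a degree of fit for each, and later reinterpretation without loss of history) can be sketched as an append-only list of interpretations attached to an observation. The class names, the fit vocabulary ("good", "fair"), the parties, and the dates are all hypothetical and are not VegBank's actual schema; only the concept labels come from the hickory slides in this deck.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class Interpretation:
    concept: str          # name plus reference, e.g. "Carya ovata sec. Gleason 1952"
    fit: str              # hypothetical vocabulary: "good", "fair", "poor"
    interpreted_on: date
    party: str


@dataclass
class TaxonObservation:
    plot_observation: str
    name_as_recorded: str
    interpretations: list = field(default_factory=list)

    def interpret(self, concept, fit, interpreted_on, party):
        """Add an interpretation; earlier ones are never overwritten."""
        self.interpretations.append(
            Interpretation(concept, fit, interpreted_on, party)
        )

    def latest(self):
        """All interpretations made on the most recent interpretation date."""
        if not self.interpretations:
            return []
        newest = max(i.interpreted_on for i in self.interpretations)
        return [i for i in self.interpretations if i.interpreted_on == newest]


# Hypothetical observation and interpretation history.
obs = TaxonObservation("plot-obs-1", "Carya ovata")
# Two concepts linked at the same time, each with its own degree of fit.
obs.interpret("Carya ovata sec. Gleason 1952", "good", date(2000, 1, 1), "party A")
obs.interpret("Carya ovata sec. Radford et al. 1968", "fair", date(2000, 1, 1), "party A")
# A subsequent interpretation by another party.
obs.interpret("Carya ovata sec. Radford et al. 1968", "good", date(2005, 6, 1), "party B")

for interp in obs.latest():
    print(interp.party, interp.concept, interp.fit)
```

Because interpretations are appended rather than replaced, the name as originally recorded and every earlier judgment remain available for reanalysis.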
25VegBank Interface Tools
- Desktop client (VegBranch) for data preparation and local use.
- Flexible XML data import supporting VegBranch (and TurboVeg) formats.
- Flexible data export.
- Easy web access to central archive
26VegBranch can be used for converting legacy data,
entering data, and maintaining a local plot
database.
27Challenges
- Data ownership, intellectual property rights, confidentiality
- Multiple classifications of organisms and communities
- Multiple plot types (relevés to Hubbell plots)
- Data entry and submission tools
- Perfect archiving
- Plot and taxon interpretation
28The Taxonomic database challenge: Standardizing organisms and communities
The problem: Integration of data potentially representing different times, places, investigators, and taxonomic standards.
The traditional solution: A standard list of organisms / communities.
29Standardized taxon lists fail to allow dataset integration
- The reasons include:
- The user cannot reconstruct the database as viewed at an arbitrary time in the past,
- Taxonomic concepts are not defined (just lists),
- Multiple party perspectives on taxonomic concepts and names cannot be supported or reconciled.
- This is the single largest impediment to large-scale synthesis in ecology
30High-elevation fir trees of western North America
Distribution: AZ NM CO WY MT AB eBC wBC WA OR
USDA - ITIS: Abies lasiocarpa var. arizonica; Abies lasiocarpa var. lasiocarpa
Flora North America: Abies bifolia; Abies lasiocarpa
31Three concepts of shagbark hickory
Splitting one species into two illustrates the ambiguity often associated with scientific names. If you encounter the name Carya ovata (Miller) K. Koch in a database, you cannot be sure which of two meanings applies.
- Carya ovata (Miller) K. Koch sec. Gleason 1952
- Carya ovata (Miller) K. Koch sec. Radford et al. 1968
- Carya carolinae-sept. (Ashe) Engler & Graebner sec. Radford et al. 1968
32Six shagbark hickory assertions
Possible taxonomic synonyms are listed together.
Names: Carya ovata; Carya carolinae-septentrionalis; Carya ovata v. ovata; Carya ovata v. australis
Taxon concepts:
- (One shagbark) C. ovata sec Gleason 52; C. ovata sec FNA 97
- (Southern shagbark) C. carolinae-s. sec Radford 68; C. ovata v. australis sec FNA 97
- (Northern shagbark) C. ovata sec Radford 68; C. ovata (v. ovata) sec FNA 97
References: Gleason 1952 (Britton & Brown); Radford et al. 1968 (Flora Carolinas); Stone 1997 (Flora North America)
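The six assertions can be written down as (name, reference) pairs, each tagged with the circumscription it denotes. The sketch below encodes the grouping shown on this slide; the structure and function names are hypothetical, and "FNA 1997" is used here as shorthand for the Stone 1997 Flora North America treatment. It shows why a bare name is ambiguous while a name plus reference is not.

```python
# (name, reference, circumscription) for the six shagbark hickory assertions.
ASSERTIONS = [
    ("Carya ovata", "Gleason 1952", "one shagbark"),
    ("Carya ovata", "FNA 1997", "one shagbark"),
    ("Carya carolinae-septentrionalis", "Radford et al. 1968", "southern shagbark"),
    ("Carya ovata v. australis", "FNA 1997", "southern shagbark"),
    ("Carya ovata", "Radford et al. 1968", "northern shagbark"),
    ("Carya ovata v. ovata", "FNA 1997", "northern shagbark"),
]


def meanings_of(name):
    """Every circumscription a bare name may refer to."""
    return {circ for n, _, circ in ASSERTIONS if n == name}


def synonyms(name, reference):
    """Other (name, reference) pairs with the same circumscription."""
    target = next(
        circ for n, ref, circ in ASSERTIONS if (n, ref) == (name, reference)
    )
    return [
        (n, ref)
        for n, ref, circ in ASSERTIONS
        if circ == target and (n, ref) != (name, reference)
    ]


print(meanings_of("Carya ovata"))
print(synonyms("Carya ovata", "Radford et al. 1968"))
```

The bare name "Carya ovata" maps to two different circumscriptions, whereas "Carya ovata sec. Radford et al. 1968" resolves to exactly one and can be matched to its Flora North America counterpart.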
33Party Perspective
- The Party Perspective on a Concept includes:
- Status: Standard, Nonstandard, Undetermined
- Correlation with other concepts: Equal, Greater, Lesser, Overlap, Undetermined.
- Start and Stop dates.
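A party perspective with start and stop dates is what allows a taxonomy to be reconstructed as it stood on any past date. The sketch below is a minimal illustration under stated assumptions: the status and correlation vocabularies come from the slide, but the class names, the half-open date interval, the party name, and all dates are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Status(Enum):
    STANDARD = "Standard"
    NONSTANDARD = "Nonstandard"
    UNDETERMINED = "Undetermined"


class Correlation(Enum):
    EQUAL = "Equal"
    GREATER = "Greater"
    LESSER = "Lesser"
    OVERLAP = "Overlap"
    UNDETERMINED = "Undetermined"


@dataclass(frozen=True)
class StatusAssertion:
    party: str
    concept: str
    status: Status
    start: date
    stop: Optional[date] = None   # None: still in force

    def in_force(self, as_of):
        # Assumed convention: valid from start up to, not including, stop.
        return self.start <= as_of and (self.stop is None or as_of < self.stop)


@dataclass(frozen=True)
class CorrelationAssertion:
    party: str
    concept: str
    correlation: Correlation
    other_concept: str
    start: date
    stop: Optional[date] = None


def standard_concepts(assertions, party, as_of):
    """Concepts the party treated as Standard on the given date."""
    return sorted(
        a.concept
        for a in assertions
        if a.party == party and a.status is Status.STANDARD and a.in_force(as_of)
    )


# Hypothetical perspective of a hypothetical party.
history = [
    StatusAssertion("party X", "Carya ovata sec. Gleason 1952",
                    Status.STANDARD, date(1990, 1, 1), date(2000, 1, 1)),
    StatusAssertion("party X", "Carya ovata sec. Gleason 1952",
                    Status.NONSTANDARD, date(2000, 1, 1)),
    StatusAssertion("party X", "Carya ovata sec. Radford et al. 1968",
                    Status.STANDARD, date(2000, 1, 1)),
]
# The broader concept is recorded as "Greater" than the narrower one.
link = CorrelationAssertion(
    "party X", "Carya ovata sec. Gleason 1952", Correlation.GREATER,
    "Carya ovata sec. Radford et al. 1968", date(2000, 1, 1))

print(standard_concepts(history, "party X", date(1995, 6, 1)))
print(standard_concepts(history, "party X", date(2005, 6, 1)))
```

Closing an assertion with a stop date instead of deleting it keeps the earlier view queryable, which a single replaceable standard list cannot offer.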
34VegBank is populating USDA concepts and relationships
- Reference list
- USDA PLANTS / ITIS
- 1999, 2005
- Standard treatments
- Flora North America (8 volumes)
- Isley Legumes
- Rollins Brassicaceae
- Selected treatments
35Best practices
- When reporting identity of organisms, provide not only the full scientific name of each kind of organism, but also the reference that formed the basis of the taxonomic concept.
- Reference high quality sources for taxon concepts such as major compendia that provide their own defined concepts.
- Comprehensive checklists typically lack true taxonomic circumscriptions, but might be considered to contain taxonomic concepts sufficient for documenting organism identity.
- Identifications should include linkage to at least one concept, but in some cases should be linked to multiple concepts.
36NatureServe provides access to the NVC and supporting documentation: http://www.natureserve.org/explorer
37Simple searches allow information on communities
to be located.
38Key descriptive data are available online, but
the classification process is not yet open to the
full scientific community.
39Coming soon: direct links to views of typal and occurrence plots in VegBank
40The ESA Panel and VegBank staff are developing an
open peer-review system to allow anyone to
contribute proposed revisions for the NVC.
42The results of the peer-review process will be published in an online journal linked to VegBank
43Concluding remarks
- Much of what we are doing with the US National Vegetation Classification is common to the vegetation classification enterprise worldwide, but much is also novel.
- Public plot archives, initially driven by the classification enterprise, have the potential to radically change the development of ecology and biodiversity management in general.
44Highlights from the Science Environment for Ecological Knowledge (SEEK)
45What is SEEK?
- Science Environment for Ecological Knowledge
- Multidisciplinary project to create
- Scientific-workflow system (Kepler)
- Design, reuse, and execute scientific analyses
- Distributed data network (EcoGrid)
- Environmental, ecological, and systematics data
- KR and Semantic Mediation
- Discover, integrate, and compose hard-to-relate data and services via ontologies
- Taxonomic concept services
- Resolve taxon ambiguities
- Collaborators (the SEEK team)
- NCEAS, UNM, SDSC/UCSD, U Kansas
- Vermont, Napier, ASU, UNC
46Kepler Scientific Workflows
- Model the way scientists work with their data now
- Mentally coordinate export and import of data among software systems
- Workflows emphasize data flow
- Metadata-driven data ingestion
- Output generation includes creating appropriate
metadata
Archive output to EcoGrid with workflow metadata
Query EcoGrid to find data
47SEEK EcoGrid
- Goal: allow diverse environmental data systems to interoperate
- Hides complexity of underlying systems using lightweight interfaces
- Integrate diverse data networks from ecology, biodiversity, and environmental sciences
- Data systems
- Any system can implement these interfaces
- Prototyping using
- Metacat, DiGIR, etc.
- Supports multiple metadata standards
- EML, Darwin Core as foci
48EcoGrid client interactions
- Modes of interaction
- Client-server
- Fully distributed
- Peer-to-peer
- EcoGrid Registry
- Node discovery
- Service discovery
- Aggregation services
- Centralized access
- Reliability
- Data preservation
49Knowledge Representation
- Current Ontologies
- Ecological Concepts, Models, Networks
- Measurements
- Properties
- Statistical Analyses
- Time and Space
- Taxonomic Identifiers
- Units
- Symbiosis
- Recent Developments
- Biodiversity (measured traits, computation of traits)
- Descriptive Terminology for Plant Communities
- Ontology documentation
50Data Procurement
- Find all datasets that contain abundance
measurements of Manica bradleyi inter-ant
parasites observed within California
51SEEK High-Level Approach
52Acknowledgements
This material is based upon work supported by:
The National Science Foundation under Grant Numbers 9980154, 9904777, 0131178, 9905838, 0129792, and 0225676.
Collaborators: NCEAS (UC Santa Barbara), University of New Mexico (Long Term Ecological Research Network Office), San Diego Supercomputer Center, University of Kansas (Center for Biodiversity Research), University of Vermont, University of North Carolina, Napier University, Arizona State University, UC Davis
The National Center for Ecological Analysis and Synthesis, a Center funded by NSF (Grant Number 0072909), the University of California, and the UC Santa Barbara campus.
The Andrew W. Mellon Foundation.
Kepler contributors: SEEK, Ptolemy II, SDM/SciDAC, GEON
53Conclusions?
- Standard data structures
- Standards for data exchange
- Public data archives (functions for deposit, discovery, withdrawal, citation, annotation)
- Standards for data archiving
- Standards for reference to taxonomic data
- Standard software tools