Title: Observation Data Model: Creating a Unified Model for Observational Data in the Ecological and Environmental Sciences.
1. Observation Data Model: Creating a Unified Model for Observational Data in the Ecological and Environmental Sciences
- Steve Kelling
- Cornell Lab of Ornithology
2. There is an enormous variety of observation data. A major challenge is joining observation data gathered in different projects.
Nitrogen
Skunk
Ocean Current
Trilobites
3. Integrating Variables to Model Species Occurrence
When multiple independent variables (e.g., land cover, human density, climate) are incorporated into species distribution and abundance analysis, accurate estimates of species occurrence can be obtained.
Top: Distribution of Northern Cardinal during the summer of 2006. Over 200 independent variables were used in the model.
Right: Confidence intervals provide an indication of difference between locations.
4. Observations Workshop
Santa Barbara / NCEAS, 9-11 July 2007
Mark Schildhauer: "I agree with what I said here"
5. A variety of observational data models were analyzed at the meeting.
Organization | Short description of observational data modeling approach
SEEK | The SEEK Extensible Observation Ontology (OBOE) focuses on capturing the essential information about observations required to comprehensively discover and integrate heterogeneous ecological data.
NatureServe | The NatureServe Observational Data Standard focuses on developing an XML Schema for specimen-oriented survey data to improve data aggregation and sharing within and between organizations.
ALTER-NeT | The European ALTER-NeT Ontology, CEDEX, focuses on developing an object-oriented data system for cataloguing observational ecological data while retaining semantic information to aid data discovery and analysis.
SPIRE | The Spire initiative focuses on developing domain-independent, general-purpose ontologies to enable annotation of the contents and structure of existing ecological databases, with an initial focus on taxonomy and food web issues (ETHAN).
OGC | The OGC Observation and Measurement Standard focuses on developing a generic conceptual XML Schema for representing all aspects of observation and measurement data.
VSTO | The Virtual Solar-Terrestrial Observatory focuses on building ontologies for interoperating among different existing meteorological and atmospheric metadata standards.
TDWG | TDWG is developing a meta-model to integrate biodiversity observations with specimen data by identifying similarities between these two data types, determining whether existing standards suffice to describe them, and, if not, developing the additional concepts needed for clarification.
ODM | The CUAHSI Observations Data Model (ODM) and associated relational database focus on storing hydrologic observations data in a system designed to optimize data retrieval for integrated analysis of information collected by multiple investigators.
6. Observations Workshop Summary
- Holistic, integrative, large-scale science would benefit from better data discovery, interpretation, and integration within and across disciplines.
- The workshop participants found much commonality among their approaches in modeling observational data.
- An extensible observational data model has advantages over conventional models.
- The development of a core observational data model should be domain independent and be conducted through an established standards body.
7. This presentation will review the following outcomes of the meeting:
- Definition of Observation
- Capabilities
- Requirements
8. An Observation is the Determination of the Value of a Property of some Entity in a particular Context
- Entity: thing, process, or phenomenon
- Determination: the outcome of the process by which the Value of a Characteristic is measured
- Characteristic: a property that can be assigned a value
- Value: discrete or continuous quantification or qualification of a Characteristic
- Context: the setting and conditions that constrain the interpretation and applicability of a Measurement, such as space, time, or treatment
- Property
- Observer
- Protocol
- Standard
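The definition above maps naturally onto a small record type. The following is a minimal sketch in Python, not the model agreed at the workshop: the class and field names are assumptions chosen to mirror the slide's terms, Observer, Protocol, and Standard are carried as optional plain strings because the slide lists them without definitions, and the example values are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Entity:
    """The thing, process, or phenomenon being observed."""
    name: str


@dataclass(frozen=True)
class Characteristic:
    """A property that can be assigned a value."""
    name: str
    unit: Optional[str] = None  # assumption: units live with the characteristic


@dataclass(frozen=True)
class Context:
    """Setting and conditions that constrain interpretation."""
    space: Optional[str] = None
    time: Optional[str] = None
    treatment: Optional[str] = None


@dataclass(frozen=True)
class Observation:
    """Determination of the Value of a Characteristic of an Entity in a Context."""
    entity: Entity
    characteristic: Characteristic
    value: Union[float, str]  # continuous quantity or discrete qualification
    context: Context
    observer: Optional[str] = None   # hypothetical representation
    protocol: Optional[str] = None   # hypothetical representation
    standard: Optional[str] = None   # hypothetical representation

    def describe(self) -> str:
        unit = f" {self.characteristic.unit}" if self.characteristic.unit else ""
        return f"{self.characteristic.name} of {self.entity.name} = {self.value}{unit}"


# Illustrative values only; they do not come from any real dataset.
obs = Observation(
    entity=Entity("Northern Cardinal"),
    characteristic=Characteristic("count", unit="individuals"),
    value=3,
    context=Context(space="hypothetical site A", time="2006-07-01"),
    observer="hypothetical observer",
)
print(obs.describe())
```

Keeping Context as its own object, rather than as loose fields on the observation, lets several observations made at the same place and time share one context.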
9. Capabilities: Organizational Approach
- Producers
  - Design
  - Create
  - Manage
  - Publish
- Consumers
  - Find/Discover
  - Access
  - Interpret
  - Integrate
  - Analyze
  - Report/Present/Visualize
10. Community
- Consumers
  - Scientist -- obtain content and information about content, run data analysis
  - Aggregator -- organizations such as GBIF
  - Application Programmer -- writes analyses, obtains data, queries
  - Citizens -- K-12, analysis results, aggregated data
- Producers
  - Scientist -- collect data
  - Application Programmer -- tools for publishing and sharing data
  - Information Manager -- database admin, schema designer
  - Data Entry Personnel -- enter data into system
  - Citizen Scientists -- e.g., collecting census data
11. Producers
- Design: Standardize schema components, catalogs of properties, and attributes so they can be shared.
- Create: Import/export of data assets should preserve data integrity and maintain data ownership.
- Manage: Develop flexible tools for resource management and access control.
- Publish: Enable structured data and provenance descriptions that can be published via common data exchange formats.
12. Consumers
- Discover: Facilitate discovery by providing access to content themes, context, provenance, and attributes via semantic searches.
- Access: Standardize data access processes via exchange schemas and improved machine-to-machine communication.
- Integrate: Capture relationships and mediate differences between datasets to allow integration.
- Analyze: Enable scientific workflow processes to explore patterns and test hypotheses.
- Report: Provide resources for data visualizations and result publication.
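To make the "Integrate" capability concrete, here is a small Python sketch of mediating differences between two datasets that record the same characteristic under different field names and units. Everything in it is an assumption for illustration: the `harmonize` helper, the field names, and the two one-row datasets are hypothetical and not part of any of the models reviewed at the workshop.

```python
from typing import Callable, Dict, List, Tuple

# Maps a dataset-specific field to (shared field name, value converter).
FieldMapping = Dict[str, Tuple[str, Callable[[object], object]]]


def harmonize(rows: List[dict], mapping: FieldMapping) -> List[dict]:
    """Rewrite each row into the shared vocabulary; unmapped fields are dropped."""
    return [
        {shared: convert(row[src])
         for src, (shared, convert) in mapping.items()
         if src in row}
        for row in rows
    ]


def fahrenheit_to_celsius(value: object) -> float:
    return round((float(value) - 32) * 5 / 9, 2)


# Made-up example rows from two hypothetical projects.
dataset_a = [{"temp_c": 20.0, "site": "A1"}]
dataset_b = [{"temp_f": 68.0, "station": "B7"}]

map_a: FieldMapping = {
    "temp_c": ("temperature_c", float),
    "site": ("location", str),
}
map_b: FieldMapping = {
    "temp_f": ("temperature_c", fahrenheit_to_celsius),
    "station": ("location", str),
}

integrated = harmonize(dataset_a, map_a) + harmonize(dataset_b, map_b)
print(integrated)
```

The mappings are plain data, so they can be published alongside a dataset rather than buried in analysis code.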
13. Requirements: Organizational Approach
- Data Model Characteristics: Accurate portrayal of observational data via well-defined terms that conform to existing standards.
- Data Model Items
- Extensibility
14. Data Model Items
- Organization of observations by survey-type,
protocol, project, data set, data stream, or
particular entity must be maintained.
- Specific relationships among controlled terms (e.g., taxon names, categorical response values such as H/M/L) must be identified.
- Context must unambiguously represent spatial and temporal location and other relevant aspects of the data, with some indication of uncertainty.
- Provenance and ownership information must be maintained at the atomic level of data precision.
- Experimental design and methods must be
described.
- Collection Event must be maintained.
- Measurements must be accurately maintained.
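Several of these items can be illustrated together. The Python sketch below is one possible reading rather than a prescribed schema: it treats H/M/L as a controlled vocabulary with an explicit order, stores a positional uncertainty with each location, and attaches provenance to every individual measurement. All class names, field names, and the choice of metres for uncertainty are assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    """Controlled categorical response values; the numbers make the order explicit."""
    L = 1
    M = 2
    H = 3


@dataclass(frozen=True)
class Location:
    latitude: float
    longitude: float
    uncertainty_m: float  # assumed unit: radius of positional uncertainty in metres


@dataclass(frozen=True)
class Provenance:
    owner: str
    dataset: str
    protocol: str


@dataclass(frozen=True)
class Measurement:
    characteristic: str
    value: Level
    location: Location
    timestamp: str          # assumed to be an ISO 8601 string
    provenance: Provenance  # kept per value, i.e. at the atomic level


def parse_level(raw: str) -> Level:
    """Map raw input onto the controlled vocabulary, rejecting unknown terms."""
    try:
        return Level[raw.strip().upper()]
    except KeyError:
        raise ValueError(f"{raw!r} is not one of H, M, L") from None


# Illustrative record with made-up values.
m = Measurement(
    characteristic="canopy cover class",
    value=parse_level(" h "),
    location=Location(42.48, -76.45, uncertainty_m=30.0),
    timestamp="2007-07-10T08:00:00",
    provenance=Provenance("hypothetical owner", "hypothetical dataset", "hypothetical protocol"),
)
```

Rejecting unknown terms at parse time keeps free-text variants out of the stored data, which is what makes relationships among controlled terms checkable later.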
15. Extensibility
- Support extensions that are specific to sub-disciplines.
- Allow referencing to ontologies from different domains.
- Allow terms and definitions to be packaged for re-use.
- Allow competing domain extensions.
- Extensions should not impact the core model.
- Allow extensions to be related (crosswalks).
- Allow extensions to be further extended.
- Allow core extensions for a particular community.
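One way to picture these extensibility requirements is a registry of namespaced term packages layered over a fixed core. The Python sketch below is illustrative only: `CORE_TERMS`, `Extension`, `Registry`, and the example namespaces and definitions are hypothetical and not drawn from any of the standards discussed.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Assumed core vocabulary; extensions may not redefine these terms.
CORE_TERMS = frozenset({"entity", "characteristic", "value", "context"})

QualifiedTerm = Tuple[str, str]  # (namespace, term)


@dataclass
class Extension:
    """A reusable package of terms and definitions for one community."""
    namespace: str
    terms: Dict[str, str]
    parent: Optional["Extension"] = None  # extensions can be further extended

    def resolve(self, term: str) -> Optional[str]:
        """Look the term up here, then in each parent extension in turn."""
        ext: Optional[Extension] = self
        while ext is not None:
            if term in ext.terms:
                return ext.terms[term]
            ext = ext.parent
        return None


class Registry:
    def __init__(self) -> None:
        self.extensions: Dict[str, Extension] = {}
        self.crosswalks: Dict[QualifiedTerm, QualifiedTerm] = {}

    def register(self, ext: Extension) -> None:
        clash = CORE_TERMS.intersection(ext.terms)
        if clash:
            raise ValueError(
                f"extension {ext.namespace!r} redefines core terms: {sorted(clash)}"
            )
        self.extensions[ext.namespace] = ext

    def add_crosswalk(self, source: QualifiedTerm, target: QualifiedTerm) -> None:
        """Relate a term in one extension to a term in another."""
        for namespace, term in (source, target):
            if self.extensions[namespace].resolve(term) is None:
                raise KeyError(f"unknown term {namespace}:{term}")
        self.crosswalks[source] = target


registry = Registry()
# Two competing definitions of "count" coexist because each is namespaced.
birds = Extension("birds", {"count": "number of individuals detected"})
fish = Extension("fish", {"count": "number of individuals netted"})
point_counts = Extension("point_counts", {"radius_m": "survey radius in metres"}, parent=birds)
for ext in (birds, fish, point_counts):
    registry.register(ext)
registry.add_crosswalk(("birds", "count"), ("fish", "count"))
```

Because `register` refuses any package that redefines a core term, the core model stays unchanged no matter how many extensions are added.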
16. Conclusion: Look at comparisons between the developing BIS TDWG Observations and Specimen Records Interest Group model and OGC and SDD. For example, SDD may provide a vocabulary for Characteristics/Properties in the Observations model.
17. Acknowledgements: Mark Schildhauer, Matt Jones, Paul Allen