Ontologies and Semantic Applications in Earth Sciences

Description:

Peter Fox (TWC/RPI; formerly HAO/NCAR). Thanks to many. ... anywhere above 100km and toward the arctic circle (above 45N) at any time of ...

Transcript and Presenter's Notes


1
Ontologies and Semantic Applications in Earth
Sciences
  • Peter Fox (TWC/RPI; formerly HAO/NCAR)
  • Thanks to many.
  • Projects funded by NSF/OCI and NASA/ACCESS/ESTO

2
Background
  • Scientists should be able to access a global,
    distributed knowledge base of scientific data
    that:
      • appears to be integrated
      • appears to be locally available
  • But data is obtained by multiple means (models
    and instruments), using various protocols, in
    differing vocabularies, using (sometimes
    unstated) assumptions, with inconsistent (or
    non-existent) metadata. It may be inconsistent,
    incomplete, evolving, and distributed
  • And there exist(ed) significant levels of
    semantic heterogeneity, large-scale data, complex
    data types, legacy systems, inflexible and
    unsustainable implementation technology

3
Data-types as service
Limited interoperability
  • VOTable
  • Simple Image Access Protocol
  • Simple Spectrum Access Protocol
  • Simple Time Access Protocol

VO App1, VO App2, VO App3
VO layer
DB1, DB2, DB3, ..., DBn
Open Geospatial Consortium Web Feature,
Coverage, and Mapping Services and Sensor Web
Enablement Sensor Observation, Planning, and
Analysis Services use the same approach
4
VO API
Web Serv.
VO Portal
Knowledge as service!
Query, access and use of data
  • Mediation Layer
  • Ontology - capturing concepts of Parameters,
    Instruments, Date/Time, Data Product (and
    associated classes, properties) and Service
    Classes
  • Maps queries to underlying data
  • Generates access requests for metadata, data
  • Allows queries, reasoning, analysis, new
    hypothesis generation, testing, explanation, etc.
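The mapping step in the mediation layer can be sketched roughly as follows. This is a minimal illustration, not the VSTO implementation: the concept names, database names, and local field names are all hypothetical, and a plain dictionary stands in for the ontology.

```python
from dataclasses import dataclass

# Hypothetical mapping from an ontology concept to the local vocabulary
# used by each underlying database (all names are illustrative).
CONCEPT_TO_LOCAL = {
    "NeutralTemperature": {"DB1": "tn", "DB2": "T_NEUTRAL"},
    "NeutralWind": {"DB1": "vn"},
}


@dataclass(frozen=True)
class AccessRequest:
    database: str      # which underlying database to ask
    local_field: str   # the name that database uses for the concept
    start: str         # requested time range, passed through unchanged
    end: str


def mediate(concept, start, end):
    """Translate one concept-level query into one request per database holding it."""
    local_names = CONCEPT_TO_LOCAL.get(concept, {})
    return [
        AccessRequest(db, local_field, start, end)
        for db, local_field in sorted(local_names.items())
    ]
```

A caller asks for a concept such as "NeutralTemperature" and never sees the per-database field names; an unknown concept simply yields no requests.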

Semantic mediation layer - VSTO - low level
Standard, or not, vocabularies and schema
Metadata, schema, data
DB1, DB2, DB3, ..., DBn
5
Semantic Web Methodology and Technology
Development Process
  • Establish and improve a well-defined methodology
    vision for Semantic Technology based application
    development
  • Leverage any existing vocabularies

Adopt Technology Approach
Leverage Technology Infrastructure
Science/Expert Review and Iteration
Rapid Prototype
Open World: Evolve, Iterate, Redesign, Redeploy
Use Tools
Analysis
Use Case
Develop model/ontology
Small Team, mixed skills
6
E.g. Science and technical use cases
  • Find data which represents the state of the
    neutral atmosphere anywhere above 100 km and
    toward the arctic circle (above 45°N) at any time
    of high geomagnetic activity.
  • Extract information from the use-case - encode
    knowledge
  • Translate this into a complete query for data -
    inference and integration of data from
    instruments, indices and models
  • Provide semantically-enabled, smart data query
    services via a SOAP web service for the Virtual
    Ionosphere-Thermosphere-Mesosphere Observatory
    that retrieve data, filtered by constraints on
    Instrument, Date-Time, and Parameter, in any order
    and with constraints included in any combination.
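One way to picture "constraints in any order and in any combination" is a filter where every constraint is optional. The sketch below is an assumption-laden illustration: the Record fields, the sample records, and the instrument abbreviations are hypothetical, and the geomagnetic-activity condition from the use case is left out because it would need an index dataset.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Record:
    instrument: str
    parameter: str
    time: datetime
    altitude_km: float
    latitude_deg: float


def select(records, instrument=None, parameter=None, start=None, end=None,
           min_altitude_km=None, min_latitude_deg=None):
    """Keep records that satisfy every supplied constraint; None means unconstrained."""
    def ok(r):
        return ((instrument is None or r.instrument == instrument)
                and (parameter is None or r.parameter == parameter)
                and (start is None or r.time >= start)
                and (end is None or r.time <= end)
                and (min_altitude_km is None or r.altitude_km > min_altitude_km)
                and (min_latitude_deg is None or r.latitude_deg > min_latitude_deg))
    return [r for r in records if ok(r)]


# Hypothetical sample records, for illustration only.
RECORDS = [
    Record("FPI", "NeutralTemperature", datetime(2005, 1, 26, 18), 250.0, 67.0),
    Record("FPI", "NeutralTemperature", datetime(2005, 1, 26, 19), 90.0, 67.0),
    Record("ISR", "ElectronDensity", datetime(2005, 1, 27, 2), 300.0, 42.0),
]

# The spatial part of the use case: above 100 km and above 45 degrees north.
high_and_north = select(RECORDS, min_altitude_km=100, min_latitude_deg=45)
```

Because the constraints are keyword arguments, callers can give them in any order and omit any subset.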

7
VSTO - semantics and ontologies in an operational
environment: vsto.hao.ucar.edu, www.vsto.org
8
Semantic Web Services
9
Semantic Web Services
OWL document returned using VSTO ontology - can
be used either syntactically or semantically
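The difference between the two uses can be sketched with a tiny RDF/XML fragment. Everything here is an assumption for illustration: the class names are invented rather than taken from the VSTO ontology, and a hand-written transitive walk over rdfs:subClassOf stands in for a real reasoner.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# Hypothetical fragment; class names are illustrative only.
DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}">
  <owl:Class rdf:about="#Instrument"/>
  <owl:Class rdf:about="#OpticalInstrument">
    <rdfs:subClassOf rdf:resource="#Instrument"/>
  </owl:Class>
  <owl:Class rdf:about="#Interferometer">
    <rdfs:subClassOf rdf:resource="#OpticalInstrument"/>
  </owl:Class>
</rdf:RDF>"""


def class_names(doc):
    """Syntactic use: treat the document as plain XML and list the declared classes."""
    root = ET.fromstring(doc)
    return [c.get(f"{{{RDF}}}about").lstrip("#")
            for c in root.findall(f"{{{OWL}}}Class")]


def superclasses(doc, name):
    """Semantic use (simplified): follow rdfs:subClassOf links transitively."""
    root = ET.fromstring(doc)
    parents = {}
    for c in root.findall(f"{{{OWL}}}Class"):
        child = c.get(f"{{{RDF}}}about").lstrip("#")
        for s in c.findall(f"{{{RDFS}}}subClassOf"):
            parent = s.get(f"{{{RDF}}}resource").lstrip("#")
            parents.setdefault(child, set()).add(parent)
    found, stack = set(), [name]
    while stack:
        for p in parents.get(stack.pop(), ()):
            if p not in found:
                found.add(p)
                stack.append(p)
    return found
```

A purely syntactic client only sees that Interferometer is declared; a semantic client can also conclude that it is an Instrument, even though the document never states that directly.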
10
Semantic Web Benefits
  • Unified/abstracted query workflow: Parameters,
    Instruments, Date-Time across widely different
    disciplines
  • Decreased input requirements for query: in one
    case reducing the number of selections from eight
    to three
  • Semantic query support: by using background
    ontologies and a reasoner, our application has
    the opportunity to only expose coherent queries
    (portal and services)
  • Semantic integration: in the past users had to
    remember (and maintain codes) to account for
    numerous different ways to combine and plot the
    data, whereas now semantic mediation provides the
    level of sensible data integration required, and
    is exposed as smart web services
  • understanding of coordinate systems,
    relationships, data synthesis, transformations,
    etc.
  • returns independent variables and related
    parameters
  • A broader range of potential users (PhD
    scientists, students, professional research
    associates and those from outside the fields)
  • VSTO: http://vsto.hao.ucar.edu,
    http://www.vsto.org
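"Only expose coherent queries" can be illustrated with a small lookup that narrows the remaining choices once one selection is made. This is a sketch under stated assumptions: the instrument and parameter names are hypothetical, and a static dictionary replaces the background ontology and reasoner mentioned above.

```python
# Hypothetical "instrument measures parameter" relation; in the approach
# described above this would be derived from an ontology by a reasoner.
MEASURES = {
    "FabryPerotInterferometer": {"NeutralTemperature", "NeutralWind"},
    "IncoherentScatterRadar": {"ElectronDensity", "IonTemperature"},
}


def coherent_parameters(instrument=None):
    """Parameters that still make sense given the chosen instrument (or all, if none chosen)."""
    if instrument is None:
        return sorted(set().union(*MEASURES.values()))
    return sorted(MEASURES.get(instrument, set()))


def coherent_instruments(parameter=None):
    """Instruments that can supply the chosen parameter (or all, if none chosen)."""
    return sorted(name for name, params in MEASURES.items()
                  if parameter is None or parameter in params)
```

A portal built this way never offers a parameter the selected instrument cannot supply, which is one plausible reading of how the number of selections drops.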

11
http://dataportal.ucar.edu/schemas/vsto_all.owl
(1.0, 2.0 coming)
12
Ingest/pipelines: problem definition
  • Data is coming in faster, in greater volumes and
    outstripping our ability to perform adequate
    quality control
  • Data is being used in new ways and we frequently
    do not have sufficient information on what
    happened to the data along the processing stages
    to determine if it is suitable for a use we did
    not envision
  • We often fail to capture, represent and propagate
    manually generated information that needs to go
    with the data flows
  • Each time we develop a new instrument, we develop
    a new data ingest procedure and collect different
    metadata and organize it differently. It is then
    hard to use with previous projects
  • The task of event determination and feature
    classification is onerous and we don't do it
    until after we get the data

13
(No Transcript)
14
Use cases
  • Who (person or program) added the comments to the
    science data file for the best vignetted,
    rectangular polarization brightness image from
    January 26, 2005 18:49:09 UT taken by the ACOS
    Mark IV polarimeter?
  • What were the cloud cover and atmospheric seeing
    conditions during the local morning of January
    26, 2005 at MLSO?
  • Find all good images on March 21, 2008.
  • Why are the quick look images from March 21,
    2008, 19:00 UT missing?
  • Why does this image look bad?

15
(No Transcript)
16
(No Transcript)
17
Provenance
  • Origin or source from which something comes,
    intention for use, who/what generated for, manner
    of manufacture, history of subsequent owners,
    sense of place and time of manufacture,
    production or discovery, documented in detail
    sufficient to allow reproducibility
  • Knowledge provenance: enrich with ontologies and
    ontology-aware tools
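A question such as "who added the comments to this file?" becomes answerable once each processing step is recorded with its source, action, and agent. The sketch below is a minimal illustration with hypothetical product names, actions, and agents; it assumes each product has a single source and that the chain has no cycles, and it does not use PML or any real provenance vocabulary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Step:
    output: str             # product created by this step
    source: Optional[str]   # product it was derived from (None for the origin)
    action: str             # what was done
    agent: str              # person or program that did it


# Hypothetical processing chain, for illustration only.
STEPS = [
    Step("raw_image", None, "acquire", "polarimeter"),
    Step("calibrated_image", "raw_image", "calibrate", "calibration_program"),
    Step("annotated_image", "calibrated_image", "add_comments", "observer"),
]


def history(product, steps):
    """Walk back from a product to its origin and return the steps oldest first."""
    by_output = {s.output: s for s in steps}
    chain = []
    current = product
    while current is not None and current in by_output:
        step = by_output[current]
        chain.append(step)
        current = step.source
    return chain[::-1]


def who_did(product, action, steps):
    """Agents that performed the given action anywhere in the product's history."""
    return [s.agent for s in history(product, steps) if s.action == action]
```

Here `who_did("annotated_image", "add_comments", STEPS)` walks the recorded chain instead of relying on someone's memory of how the file was produced.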

18
(No Transcript)
19
(No Transcript)
20
Quick look browse
21
(No Transcript)
22
Visual browse
23
(No Transcript)
24
(No Transcript)
25
Search and structured query
Structured Query
Search
26
Search
27
Data Integration Use Case
  • Determine the statistical signatures of both
    volcanic and solar forcings on the height of the
    tropopause

28
Detection and attribution relations
29
(No Transcript)
30
SWEET 2.0
31
Semantic framework indicating how volcano and
atmospheric parameters and databases can
immediately be plugged in to the semantic data
framework to enable data integration.
32
Faceted Search
33
Summary
  • Level of ontology encoding relates to use, e.g.
      • VSTO
      • SPCDIS
      • SESDI: data integration needs a higher level of
        curation of ontologies and mapping to data
  • Languages and tools
      • Rapid prototyping (PHP, Semantic MediaWiki)
      • Clean and simple (RDFS, Perl and SPARQL)
      • Complex and rich (Java, Protégé, Jena, Pellet,
        ELMO, Maven, Eclipse)

34
Modified GEON Solution Framework
Data Discovery
Data Integration
Level 1: Data Registration at the Discovery
Level, e.g. volcano location and activity
Level 2: Data Registration at the Inventory
Level, e.g. list of datasets by types, times,
products
Level 3: Data Registration at the Item
Detail Level, e.g. access to individual quantities
Earth Sciences Virtual Database: a data warehouse
where the schema heterogeneity problem is solved
Schema-based integration
Ontology-based data integration
A.K. Sinha, Virginia Tech, 2006
35
Spare material
36
Example 1: Registration of Volcanic Data
  • Location Codes
  • U - Above the 180° turn at Holei Pali (upper
    Chain of Craters Road)
  • L - Below Holei Pali (lower Chain of Craters
    Road)
  • UL - Individual traverses were made both above
    and below the 180° turn at Holei Pali
  • H - Highway 11
SO2 Emission from Kilauea east rift zone -
vehicle-based (Source: HVO)
Abbreviations: t/d = metric tonne (1000 kg)/day,
SD = standard deviation, WS = wind speed, WD = wind
direction east of true north, N = number of
traverses
37
Registering Volcanic Data (2)
  • No explicit lat/long data
  • Volcano identified by name
  • Volcano ontology framework will link name to
    location

38
Registering Atmospheric Data (2)
39
Building blocks
  • Data formats and metadata: IAU standard FITS,
    with SoHO keyword convention, JPEG, GIF
  • Ontologies: OWL-DL and RDF
  • The proof markup language (PML) provides an
    interlingua for capturing the information agents
    need to understand results and to justify why
    they should believe the results.
  • The Inference Web toolkit provides a suite of
    tools for manipulating, presenting, summarizing,
    analyzing, and searching PML in efforts to
    provide a set of tools that will let end users
    understand information and its derivation,
    thereby facilitating trust in and reuse of
    information.
  • Capturing semantics of data quality, event, and
    feature detection within suitable community
    ontology packages (SWEET, VSTO)