Title: CCBR Systems applied to Earthquake Science
1. CCBR Systems applied to Earthquake Science
Mehmet Aktas, Dr. David Leake, Dr. Marlon Pierce
Indiana University
2. SERVOGrid
- SERVOGrid is a NASA project to integrate historical, measured, and calculated earthquake data (GPS, seismicity, faults) with simulation codes.
- It uses GML extensions as a common data format.
3. General Earthquake Model (GEM)
4. Distributed Resources
5. GEM Codes
http://www-aig.jpl.nasa.gov/public/dus/gem/gemcodes.html
6. Motivation
- Purpose
  - Intelligent retrieval on metadata describing codes written for earthquake science.
  - Guidance on how to run the codes to get reasonable results.
  - Guidance for inexpert users to browse and select codes through a portal.
- Casebase
  - disloc: produces surface displacements based on multiple arbitrarily dipping dislocations in an elastic half-space.
  - simplex: inverts surface geodetic displacements to produce fault parameters.
  - VC: simulates interactions between vertical strike-slip faults.
  - Visualization and analysis services.
7. CCBR Case
[Diagram: a CCBR case pairs a Problem with a Solution. The Problem is a set of Features, each a <Question, Answer> pair.]
8. Conversational CBR
[Diagram: a query case is compared with a case from the CCBR casebase; the two cases are labeled A and B. Each case is a <Problem, Solution> pair.]
Feature sets shown: Feature 1, Feature 2, Feature 5 and Feature 1, Feature 2, Feature 3, Feature 4
IF ((A.Feature1.Solution == B.Feature1.Solution) AND
    (A.Feature2.Solution == B.Feature2.Solution)) THEN Consistency = 2
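The consistency rule above can be read as a count of the features on which two cases agree. The following is a minimal sketch of that idea, assuming each case is a mapping from feature name to answer; the feature values are hypothetical and are not taken from the prototype.

```python
def consistency(case_a: dict, case_b: dict) -> int:
    """Count the features present in both cases whose answers agree."""
    return sum(
        1
        for name, answer in case_a.items()
        if name in case_b and case_b[name] == answer
    )


# Hypothetical feature values, for illustration only.
case_a = {"Feature1": "yes", "Feature2": "grid", "Feature5": "no"}
case_b = {"Feature1": "yes", "Feature2": "grid", "Feature3": "x", "Feature4": "y"}

score = consistency(case_a, case_b)
print(score)
```

Features that appear in only one of the two cases (Feature 3, Feature 4, Feature 5 here) neither raise nor lower the score in this sketch.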
9. Prototype CCBR Application for GEM Code Selection
http://ripvanwinkle.ucs.indiana.edu:4747/cbr/index.jsp
10. Prototype CCBR Application for GEM Code Guidance
11. Developing an Ontology for General Earthquake Modeling Codes
12. Motivating Scenario
- We have a collection of codes, visualization tools, computing resources, and data sets that we want to combine in an ontology.
- Instances of the ontology can then be made that describe specific resources.
- After we have built instances, we can pose queries on the data to retrieve values.
- Values may be structured, so we can do stepped queries.
- We thus need to start by grouping together related resources.
13. Group 1: Simulation Codes
- Disloc: calculates surface stress displacements caused by a fault placed in an elastic half-space. Surface data can be either on a grid or at defined scattered points. Can also create InSAR-style surface displacements.
- Simplex: inverts Disloc to estimate fault parameters from observed surface displacements. Surface displacements can be either on a grid or at defined points.
- GeoFEST: builds a realistic model of the stresses created by a fault, using the finite element method and realistic material properties.
- AKIRA: converts a geometry specification (layers, faults) into a finite element mesh. Successive calls refine the mesh. Needed as a helper application for GeoFEST.
- lee2geof: converts the finite element mesh to GeoFEST by associating boundary conditions and material properties with the nodes.
- Virtual California: simulates interacting fault systems, based on realistic fault and fault friction models.
14. Group 2: Visualization Codes
- We associate simulation codes with zero or more visualization systems:
  - GMT (Generic Mapping Tools)
  - IDL
  - RIVA
- In practice, we usually refer to scripts for specific tasks rather than to the entire toolkit.
15. Group 3: Compute Resources
- Grids: a Sun Ultra 60 with Disloc, Simplex, and VC installed.
- Danube: a Linux dual-processor machine with GeoFEST, lee2geof, AKIRA, and GMT installed.
- Jabba: an SGI 8-processor machine with RIVA installed.
16. Group 4: Data Types and Formats
- This is a mixture of data objects and representations. As always, the data itself is not represented, but information such as the creator of the data is.
  - Faults
  - GPS data
  - Seismicity
  - Surface stress data
  - InSAR data
  - Surface data representation: grid or point data
17. Building an Ontology
- Putting together an ontology with RDF and RDFS
18. Front Matters
- We now wish to create classes and properties that we can couple into an ontology.
- First, let's define a base object, GEMObject, that we will extend as necessary.
- This object doesn't do anything, but it will have some uses when we define property ranges and domains.
    <?xml version="1.0"?>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
      <rdf:Description rdf:ID="GEMObject">
        <rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
        <rdfs:label>GEMObject</rdfs:label>
        <rdfs:comment>This is a generic object from which everything in our ontology will be derived.</rdfs:comment>
      </rdf:Description>
    </rdf:RDF>
19. Defining Some Useful Classes
- Based on our introductory comments, we need the following classes:
  - GEMCodes, with application and visualization extensions.
  - GEMData, such as Faults, GPS, and so on.
  - GEMDataFormat: either grid or point data.
  - ComputeResources: host computers.
20. Defining a GEMCode
- GEMCode should extend our GEMObject generic superclass.
- It should itself be extended by other, more specific resource types.
    <rdf:Description rdf:ID="GEMCode">
      <rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
      <rdfs:subClassOf rdf:resource="#GEMObject"/>
      <rdfs:label>GEMCode</rdfs:label>
      <rdfs:comment>This is a general code class that we will extend</rdfs:comment>
    </rdf:Description>
Application Codes
- We next choose to subdivide the GEMCode class into finer distinctions: Application and Visualization.
- Application resources are described like this:
    <rdf:Description rdf:ID="ApplicationCode">
      <rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
      <rdfs:subClassOf rdf:resource="#GEMCode"/>
      <rdfs:label>Application Code</rdfs:label>
      <rdfs:comment>This is used to describe science application</rdfs:comment>
    </rdf:Description>
GEMData and GEMDataFormat
- The GEMData resource looks like this:
    <rdf:Description rdf:ID="GEMData">
      <rdfs:label>GEMData</rdfs:label>
      <rdfs:comment>This is a general data class that we will extend</rdfs:comment>
      <rdfs:subClassOf rdf:resource="#GEMObject"/>
    </rdf:Description>
- The GEMDataFormat resource looks like this:
    <rdf:Description rdf:ID="GEMDataFormat">
      <rdfs:label>GEMDataFormat</rdfs:label>
      <rdfs:comment>This is a general data format class that we will extend</rdfs:comment>
      <rdfs:subClassOf rdf:resource="#GEMObject"/>
    </rdf:Description>
Computing Resources
    <rdf:Description rdf:ID="ComputeResources">
      <rdfs:label>ComputeResources</rdfs:label>
      <rdfs:comment>This is a general compute resource class that we will extend</rdfs:comment>
      <rdfs:subClassOf rdf:resource="#GEMObject"/>
    </rdf:Description>
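The rdfs:subClassOf statements on these slides form a small class hierarchy rooted at GEMObject. A minimal sketch of how such a hierarchy might be walked follows, assuming the statements have already been loaded into a plain dictionary (a hypothetical stand-in for an RDF store, not part of the slides' tooling).

```python
# Hypothetical in-memory mirror of the rdfs:subClassOf statements:
# each class maps to its direct superclass.
SUBCLASS_OF = {
    "GEMCode": "GEMObject",
    "ApplicationCode": "GEMCode",
    "GEMData": "GEMObject",
    "GEMDataFormat": "GEMObject",
    "ComputeResources": "GEMObject",
}


def is_subclass(cls, ancestor):
    """Follow subClassOf links upward until ancestor is found or the chain ends."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


print(is_subclass("ApplicationCode", "GEMObject"))
print(is_subclass("GEMData", "GEMCode"))
```

Treating a class as a subclass of itself matches the RDFS reading of rdfs:subClassOf as reflexive.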
21. Defining Useful Properties
- Making Classes Mean Something
22. Defining Properties
- Classes by themselves don't tell us much, but when we associate them with properties, things start to fall into place. Before describing how a property may be encoded, let's try to enumerate the ones that we will need.
  - ownsGEMResource: who owns a particular resource.
  - installedOn, hasCode: where a GEMCode is installed, or, conversely, what codes a particular computing resource has.
  - hasData: where some piece of data is (on some compute resource).
  - hasDataFormat: associates a data format with a piece of data.
  - takesInputData, createsOutputData: what kind of data a code takes as input and generates as output.
  - dependsUpon: a code depends upon another operation before it can be completed.
23. A Property for Ownership
- Now let's look at how to encode this first property. It looks like this:
    <rdf:Description rdf:ID="ownsGEMResource">
      <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Property"/>
      <rdfs:domain rdf:resource="http://www.w3.org/2001/vcard-rdf/3.0#"/>
      <rdfs:range rdf:resource="#GEMObject"/>
    </rdf:Description>
24. Tag Explanations
- Property descriptions start out the same as our class descriptions, but instead of having a Class <type>, they are of <type> Property.
- Next, we use two new RDFS tags, domain and range.
- Domain says which classes can be the subject of this property, and range specifies the classes that can be its objects.
- The range of the ownsGEMResource property is the GEMObject class.
- The domain, on the other hand, is a human, and we have not yet defined classes for humans.
- Instead of inventing our own, we use a standard definition for people, known as a vCard.
- vCard properties include first and last names, addresses, email addresses, and so on.
25. The installedOn Property
- The other properties can be constructed similarly.
- We show the installedOn example:
    <rdf:Description rdf:ID="installedOn">
      <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Property"/>
      <rdfs:domain rdf:resource="#GEMCode"/>
      <rdfs:range rdf:resource="#ComputeResources"/>
    </rdf:Description>
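Domain and range declarations can be read as a type check on statements: the subject must belong to the domain class and the object to the range class. The sketch below illustrates that reading under simplifying assumptions. The dictionaries are hypothetical stand-ins for the ontology and its instances, and "VCard" is used here only as a label for the vCard-described person class.

```python
# Direct superclass of each class (assumed subset of the ontology).
SUBCLASS_OF = {
    "GEMCode": "GEMObject",
    "ApplicationCode": "GEMCode",
    "ComputeResources": "GEMObject",
}

# Property name -> (domain class, range class).
PROPERTIES = {
    "ownsGEMResource": ("VCard", "GEMObject"),
    "installedOn": ("GEMCode", "ComputeResources"),
}

# Instance name -> class (hypothetical instance data).
INSTANCE_TYPES = {
    "Disloc": "ApplicationCode",
    "Grids": "ComputeResources",
    "A. Donnellan": "VCard",
}


def is_a(cls, ancestor):
    """True if cls equals ancestor or reaches it through subClassOf links."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def is_valid(subject, prop, obj):
    """Check a (subject, property, object) statement against domain and range."""
    domain, rng = PROPERTIES[prop]
    return is_a(INSTANCE_TYPES[subject], domain) and is_a(INSTANCE_TYPES[obj], rng)


print(is_valid("Disloc", "installedOn", "Grids"))
print(is_valid("Grids", "installedOn", "Disloc"))
```

Note that RDFS itself uses domain and range to infer types rather than to reject statements; the validation reading shown here is one common application-level use.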
26. Developing a Fault Schema
- In addition to the classes we outlined above, we need to expand on our GEMData class.
- This class needs to be extended to describe specific data types.
- Faults are characterized in several ways, which we will group as follows:
  - Descriptive characteristics: the name of the fault (like San Andreas, Northridge, etc.), the journal that describes the fault, the publication date, and other Dublin Core-like parameters.
  - Material properties: viscosity, Lamé parameters.
  - Geometric properties: size and orientation of the fault (see figure).
  - Geo-location properties: latitude and longitude.
28. Fault Resources
- Let's look at how to describe the fault, concentrating on the geometric properties. We'll start as before:
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
             xmlns:gem="http://www.servogrid.org/schemas/GEMOntology#">
      <rdf:Description rdf:ID="Fault">
        <rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
        <rdfs:subClassOf rdf:resource="http://www.servogrid.org/schemas/GEMOntology#GEMData"/>
        <rdfs:label>Fault</rdfs:label>
        <rdfs:comment>
          This is a fault class. Would be useful to compare to an XML fault definition.
        </rdfs:comment>
      </rdf:Description>
      <!-- Property values go here -->
    </rdf:RDF>
29. Fault Properties
- We are simply defining a class called Fault that is a subclass of the GEMData resource we defined previously. The real usefulness comes from the properties. We show only geometric properties, but we can easily add to these. Let's look at one such value:
    <rdf:Description rdf:ID="hasDepthValue">
      <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Property"/>
      <rdfs:domain rdf:resource="#Fault"/>
      <rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#double"/>
    </rdf:Description>
    <rdf:Description rdf:about="http://www.w3.org/2001/XMLSchema#double">
      <rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Datatype"/>
    </rdf:Description>
30. Fault Properties with Units
- Fault depths are measured in kilometers, so we could add units by defining a new resource called KilometerUnits. This value would be described by two properties, hasKilometerUnits and hasKilometerValue.
    <rdf:Description rdf:ID="KilometerUnits">
      <rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
      <rdfs:label>KilometerUnits</rdfs:label>
    </rdf:Description>
    <rdf:Description rdf:ID="hasKilometerUnits">
      <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Property"/>
      <rdfs:domain rdf:resource="#KilometerUnits"/>
    </rdf:Description>
    <rdf:Description rdf:ID="hasKilometerValue">
      <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Property"/>
      <rdfs:domain rdf:resource="#KilometerUnits"/>
      <rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#double"/>
    </rdf:Description>
31. Giving hasDepthValue a Property with Units
- We then modify hasDepthValue so that its range is the KilometerUnits resource instead of the simple XML double type.
    <rdf:Description rdf:ID="hasDepthValue">
      <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Property"/>
      <rdfs:domain rdf:resource="#Fault"/>
      <rdfs:range rdf:resource="#KilometerUnits"/>
    </rdf:Description>
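Making the range of hasDepthValue a structured resource means a depth is retrieved in two steps: first the KilometerUnits resource, then its value or units. A minimal sketch of that structure follows, assuming plain Python data classes as a stand-in for RDF resources; the fault name, unit string, and depth are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KilometerUnits:
    """Structured value mirroring hasKilometerUnits and hasKilometerValue."""
    has_kilometer_units: str
    has_kilometer_value: float


@dataclass
class Fault:
    """Fault resource whose depth is a structured value rather than a bare double."""
    name: str
    has_depth_value: KilometerUnits


# Hypothetical instance data, for illustration only.
fault = Fault("ExampleFault", KilometerUnits("km", 12.5))

# Stepped query: first the structured depth, then its numeric value and units.
depth = fault.has_depth_value
print(depth.has_kilometer_value, depth.has_kilometer_units)
```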
32. Creating Schema Instances
- We're now ready to look at some simple examples of creating instances of our ontology. Let's start with Disloc. We need to know the following about Disloc:
  - Where is it installed?
  - What does it take for input data?
  - Where is the input data?
  - What does it create as output data?
  - What can I use to visualize its output?
- Note that we are not actually trying to invoke the code. We are just trying to manage enough information about the code so that we know how to invoke it with other services.
33. Creating URIs
- I first need to assign several URIs for the things that I will use to describe my resource.
- Again, these are just intended to be structured names of things on the internet, or at least pointers to information about things.
- The URIs we will use are:
  - The GEMOntology itself will be assigned the URI http://www.servogrid.org/schemas/GEMOntology.
  - All computing platforms will be subdirectories of the URI http://www.servogrid.org/instances/ComputeResources.
  - All data names will start with http://www.servogrid.org/instances/data/.
  - All applications will have names starting with http://www.servogrid.org/instances/gemcodes/.
- We can further extend the relative paths to specify and distinguish applications and visualizations.
34. An Instance for Disloc
- We first need to describe the application (Disloc). In particular, we need to describe the input and output that Disloc takes, and the places where this code is installed. This looks like the following:
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
             xmlns:gem="http://www.servogrid.org/schemas/GEMOntology#"
             xmlns:dc="http://purl.org/dc/elements/1.0/">
      <rdf:Description rdf:ID="Disloc">
        <rdf:type rdf:resource="http://www.servogrid.org/schemas/GEMOntology#ApplicationCode"/>
        <dc:creator>A. Donnellan</dc:creator>
        <gem:installedOn rdf:resource="http://www.servogrid.org/instances/ComputeResources/Grids"/>
        <gem:takesInputData rdf:resource="http://www.servogrid.org/instances/data/Faults"/>
        <gem:createsOutputData rdf:resource="http://www.servogrid.org/instances/data/SurfaceStress"/>
      </rdf:Description>
    </rdf:RDF>
35. RDF Representation in CCBR Systems
36. RDF Representation of a GEM Code: disloc.c (Meta-data)
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
             xmlns:gem="http://www.servogrid.org/schemas/GEMOntology#"
             xmlns:dc="http://purl.org/dc/elements/1.0/">
      <rdf:Description rdf:ID="Disloc">
        <rdf:type rdf:resource="http://www.servogrid.org/schemas/GEMOntology#ApplicationCode"/>
        <dc:creator>A. Donnellan</dc:creator>
        <gem:installedOn rdf:resource="http://www.servogrid.org/instances/ComputeResources/Grids"/>
        <gem:takesInputData rdf:resource="http://www.servogrid.org/instances/data/Faults"/>
        <gem:createsOutputData rdf:resource="http://www.servogrid.org/instances/data/SurfaceStress"/>
      </rdf:Description>
    </rdf:RDF>
37. Triples in disloc.c RDF Representation
Class   Property               Property Value
Disloc  dc:creator             A. Donnellan
Disloc  gem:takesInputData     rdf:resource="http://www.servogrid.org/instances/data/Faults"
Disloc  gem:createsOutputData  rdf:resource="http://www.servogrid.org/instances/data/SurfaceStress"
Disloc  gem:installedOn        rdf:resource="http://www.servogrid.org/instances/ComputeResources/Grids"
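The triples listed above can be recovered mechanically from the RDF/XML description. The sketch below does this with Python's standard xml.etree.ElementTree module; it handles only the flat rdf:Description form used on these slides and is not a general RDF/XML parser (a real system would use an RDF library). The function name extract_triples is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

RDF_XML = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:gem="http://www.servogrid.org/schemas/GEMOntology#"
         xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description rdf:ID="Disloc">
    <dc:creator>A. Donnellan</dc:creator>
    <gem:installedOn rdf:resource="http://www.servogrid.org/instances/ComputeResources/Grids"/>
    <gem:takesInputData rdf:resource="http://www.servogrid.org/instances/data/Faults"/>
    <gem:createsOutputData rdf:resource="http://www.servogrid.org/instances/data/SurfaceStress"/>
  </rdf:Description>
</rdf:RDF>"""


def extract_triples(rdf_xml):
    """Return (subject, property, value) tuples from flat rdf:Description blocks."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}ID") or desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree tags look like "{namespace}localname"; keep the local name.
            predicate = prop.tag.split("}", 1)[1]
            # The value is either an rdf:resource reference or literal text.
            value = prop.get(f"{{{RDF_NS}}}resource") or (prop.text or "").strip()
            triples.append((subject, predicate, value))
    return triples


for triple in extract_triples(RDF_XML):
    print(triple)
```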
38. Queries on RDF Meta-data - I
39. Queries on RDF Meta-data - II
40. Ideas on RDF representation and integration with IUCBRF
- Each case has an RDF instance.
- Each RDF instance has a number of triples.
- Each RDF instance might have more descriptive information than is necessary for CCBR system retrieval.
- The user will form a triple as a query, rather than answering a question. (The Protégé Query Service is a good example of this.)
41. Protégé Query Service
42. Ideas on RDF representation and integration with IUCBRF
- Ranking of the cases
  - Cases will be ranked by their number of consistent triples.
  - If a case has a matching triple, it will be ranked higher.
  - If a case does not have the entered triple, its ranking won't change, unless the user wants the cases that don't have this triple.
- Ranking of the questions
  - We need to recommend to the user which properties of the GEMObject class are good starting points.
  - Ranking can be based on the appearance of (property, property value) pairs in the triples stored in the case base.
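One way to read the question-ranking idea is to count how often each (property, property value) pair occurs across the case base and to suggest the most frequent pairs first. The sketch below illustrates that reading only; the case base contents are hypothetical and abbreviated, and frequency is just one possible ranking criterion.

```python
from collections import Counter

# Hypothetical case base: case name -> list of (property, property value) pairs.
case_base = {
    "Disloc": [("installedOn", "Grids"), ("takesInputData", "Faults")],
    "Simplex": [("installedOn", "Grids"), ("createsOutputData", "Faults")],
    "VC": [("installedOn", "Grids")],
    "GeoFEST": [("installedOn", "Danube"), ("takesInputData", "Faults")],
}


def rank_questions(cases):
    """Return (pair, count) tuples, most frequent (property, value) pair first."""
    counts = Counter(pair for pairs in cases.values() for pair in pairs)
    return counts.most_common()


for pair, count in rank_questions(case_base):
    print(count, pair)
```

A pair shared by every case would not help discriminate between cases, so a fuller treatment might prefer pairs that split the case base more evenly.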
43. CCBR Case with RDF Representation
[Diagram: a CCBR case pairs a Problem with a Solution. The Problem is a set of RDF triples, each of the form (Subject, Predicate, Object).]
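44. How will it work?
[Diagram: a query case is compared with a case from the CCBR casebase; the two cases are labeled A and B. Each case is a <Problem, Solution> pair.]
RDF triple sets shown: RDF Triple 1, RDF Triple 4, RDF Triple 5 and RDF Triple 1, RDF Triple 2, RDF Triple 3, RDF Triple 4
    // Case ranking calculation for B.RDFTriple1.
    // Pseudocode in the style of the Jena API: caseA stands for the RDF
    // instance of case A, and q_st for the query statement (B.RDFTriple1).
    Model model = new ModelMem(caseA);
    StmtIterator stmts = model.listStatements();
    while (stmts.hasNext()) {
        Statement st = (Statement) stmts.next();
        if (st.equals(q_st)) {
            // increase the consistency for this case
        } else {
            // do nothing
        }
        System.out.println(st);
    }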
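The ranking loop on this slide can be sketched end to end without an RDF library by treating each case as a set of (subject, predicate, object) triples. This is an illustrative sketch only, not the IUCBRF implementation; the case contents and the shortened object names are hypothetical.

```python
# Hypothetical case base: case name -> set of (subject, predicate, object) triples.
case_base = {
    "Disloc": {
        ("Disloc", "gem:installedOn", "Grids"),
        ("Disloc", "gem:takesInputData", "Faults"),
    },
    "Simplex": {
        ("Simplex", "gem:installedOn", "Grids"),
    },
    "GeoFEST": {
        ("GeoFEST", "gem:installedOn", "Danube"),
        ("GeoFEST", "gem:takesInputData", "Faults"),
    },
}


def rank_cases(cases, query):
    """Score each case by how many (predicate, object) query pairs it contains."""
    scores = {}
    for name, triples in cases.items():
        pairs = {(pred, obj) for _, pred, obj in triples}
        # A matching pair raises the score; a missing pair leaves it unchanged.
        scores[name] = sum(1 for pair in query if pair in pairs)
    # sorted() is stable, so ties keep the case base's insertion order.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


query = [("gem:installedOn", "Grids"), ("gem:takesInputData", "Faults")]
for name, score in rank_cases(case_base, query):
    print(name, score)
```

The query matches on predicate and object only, since the subject of a stored triple is the case being described and the user does not know it in advance.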
45. Ideas on RDF representation and integration with IUCBRF
- Any suggestions and/or comments?
- Thanks.