Title: Tutorial
1Tutorial 5: Scientific Data Integration and Mediation
Bertram Ludäscher, Ilkay Altintas, Amarnath Gupta, Kai Lin
San Diego Supercomputer Center, U.C. San Diego
2Acknowledgements
- National Science Foundation (NSF)
- www.nsf.gov
- GEOsciences Network (NSF)
- www.geongrid.org
- Biomedical Informatics Research Network (NIH)
- www.nbirn.net
- Science Environment for Ecological Knowledge (NSF)
- seek.ecoinformatics.org
- Scientific Data Management Center (DOE)
- sdm.lbl.gov/sdmcenter/
3Outline
- 8:30 - 10:30am Tutorial: Data Integration and Mediation
- Introduction to database mediation
- motivation and architecture
- XML-based data integration
- Database mediation theory primer
- logic view definitions, view unfolding, computing feasible plans
- From XML-based to Knowledge-based mediation
- use of ontologies in data integration, ...
- 10:30 - 10:45am BREAK
- 10:45 - 12:00 Applications and Demos
- 10:45 - 11:05 Mediator Demo
- 11:05 - 11:20 Queries w/ Ontology Support
- 11:20 - 11:40 Scientific Workflows
- 11:40 - 12:00 KNOW-ME Ontology Tool
4Information Integration Challenges
- System aspects: Grid Middleware
- distributed data and computing
- Web Services, WSDL/SOAP, ...
- sources: functions, files, databases, ...
- Syntax and Structure
- XML-Based Mediators
- wrapping, restructuring
- XML queries and views
- sources: XML databases
- Semantics
- Model-Based/Semantic Mediators
- conceptual models and declarative views
- SemanticWeb/KnowledgeGrid stuff: ontologies, description logics (RDF(S), DAML+OIL, OWL ...)
- sources: knowledge bases (DB+CMs+ICs)
5Information Integration from a DB Perspective
- Information Integration Problem
- Given: data sources S1, ..., Sk (DBMS, web sites, ...) and user questions Q1, ..., Qn that can be answered using the Si
- Find: the answers to Q1, ..., Qn
- The Database Perspective: source = database
- Si has a schema (relational, XML, OO, ...)
- Si can be queried
- define virtual (or materialized) integrated views V over S1, ..., Sk using database query languages (SQL, XQuery, ...)
- questions become queries Qi against V(S1, ..., Sk)
6Standard (XML-Based) Mediator Architecture
wrappers implemented as web services
7Some BIRNing Data Integration Questions
Biomedical Informatics Research Network http://nbirn.net
- Data Integration Approaches
- Let's just share data, e.g., link everything from a web page!
- ... or better put everything into a relational or XML database
- ... and do remote access using the Grid
- ... or just use Web services!
- Nice try. But:
- Find the files where the amygdala was segmented.
- Which other structures were segmented in the same files?
- Did the volume of any of those structures differ much from normal?
- What is the cerebellar distribution of rat proteins with more than 70% homology with human NCS-1? Any structure specificity? How about other rodents?
8An Online Shopper's Information Integration Problem
El Cheapo: Where can I get the cheapest copy (including shipping cost) of Wittgenstein's Tractatus Logico-Philosophicus within a week?
One-World Mediation
9A Home Buyer's Information Integration Problem
What houses for sale under 500k have at least 2
bathrooms, 2 bedrooms, a nearby school ranking
in the upper third, in a neighborhood with
below-average crime rate and diverse population?
Multiple-Worlds Mediation
10A Geoscientist's Information Integration Problem
What are the distribution and U/Pb zircon ages of A-type plutons in VA? How about their 3-D geometry? How does it relate to host rock structures?
Complex Multiple-Worlds Mediation
11A Neuroscientist's Information Integration Problem
Biomedical Informatics Research Network http://nbirn.net
What is the cerebellar distribution of rat proteins with more than 70% homology with human NCS-1? Any structure specificity? How about other rodents?
Complex Multiple-Worlds Mediation
12Structural / XML-Based Mediation
13Abstract XML-Based Mediator Architecture
USER/Client
Query Q o V (S_1,...,S_k)
Integrated XML View V
Integrated View Definition IVD(S1,...,Sn)
MEDIATOR
XML Queries Results
XML View
XML View
XML View
Wrapper
Wrapper
Wrapper
S_1
S_2
S_k
14Extensible Markup Language (XML)
... in their wonderful book called <title>SemWeb Tractat</title> by B. Schatz and T.B. Lee, the authors show how ...
... in their wonderful book called <title>SemWeb Tractat</title> by <author>B. Schatz</author> and <author>T.B. Lee</author>, the authors show how ...
... in their wonderful book called SemWeb Tractat by B. Schatz and T.B. Lee, the authors show how ...
<book> <title>SemWeb Tractat</title> <author>B. Schatz</author> <author>T.B. Lee</author> </book>
- (meta)language for marking up text data with user-definable tags
- (X)HTML, XSLT, XML Schema, ...
- MathML, BioML, GeoML, NeuroML, ...
- XML-RPC, SOAP, ...
- semistructured tree data model
- flexible: marked-up text, web-pages, databases, ...
- container model
- boxes within boxes
15Example: Relational Data => XML
R
<R>
<tuple> <A> a1 </A> <B> b1 </B> <C> c1 </C> </tuple>
<tuple> <A> a2 </A> <B> b2 </B> <C> c2 </C> </tuple>
</R>
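The relational-to-XML wrapping on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `relation_to_xml` is hypothetical, and the `<tuple>` / column-name nesting simply follows the example above.

```python
import xml.etree.ElementTree as ET

def relation_to_xml(name, columns, rows):
    """Wrap a relation as XML: one <tuple> per row, one child element per column."""
    root = ET.Element(name)
    for row in rows:
        tup = ET.SubElement(root, "tuple")
        for col, value in zip(columns, row):
            ET.SubElement(tup, col).text = str(value)
    return root

# Relation R(A, B, C) with the two tuples from the slide.
r = relation_to_xml("R", ["A", "B", "C"],
                    [("a1", "b1", "c1"), ("a2", "b2", "c2")])
xml_text = ET.tostring(r, encoding="unicode")
print(xml_text)
```

A wrapper in the mediator architecture would do essentially this on the fly, exporting an XML view of the source instead of materializing a file.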
16Tag Names + Nesting => XML DTDs (Grammars)
17XML DTDs vs. XML Schema
- XML DTDs
- set of allowed tag names
- their nesting structure (via grammar rules)
- XML Schema
- tag names and nesting structure
- user-defined complex data types
- subtyping (no multiple inheritance): RESTRICT and EXTEND
- separate namespace for type names and tag (element) names
- ...
18XML Schema User-Defined Type/Class Hierarchy
19XML Schema Declarations (home-style syntax)
Complex Type Declarations
20XML Schema (home-style)
Simple Type Declarations
Complex Types
21XML Schema Substitution Groups
Elements of a substitution group (hexagons) and
associated complex types (boxes)
22XML Schema Declarations (W3C syntax)
23XML Query Languages
- XPath
- root//books/book[cover_style="paperback"][price<80]
- XQuery
- the W3C XML query language
- XSLT
- XML transformations (XML => HTML, XML => XML)
- ...
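To make the XPath bullet concrete, here is a small sketch using Python's standard `xml.etree.ElementTree`. The sample document is invented for illustration, and it assumes `cover_style` and `price` are child elements of `book` (the slide does not show the document). ElementTree implements only a subset of XPath, so the numeric comparison on `price` is done in Python rather than in the path expression.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; element names follow the XPath example above.
doc = ET.fromstring("""
<root><books>
  <book><title>A</title><cover_style>paperback</cover_style><price>35</price></book>
  <book><title>B</title><cover_style>hardcover</cover_style><price>60</price></book>
  <book><title>C</title><cover_style>paperback</cover_style><price>95</price></book>
</books></root>
""")

# Path step with a child-text predicate (supported by ElementTree).
paperbacks = doc.findall(".//books/book[cover_style='paperback']")

# The price filter is applied in Python.
cheap_titles = [b.findtext("title") for b in paperbacks
                if float(b.findtext("price")) < 80]
print(cheap_titles)
```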
24Transforming and Rendering XML: XSLT
25XMAS: XML Matching And Structuring language
Integrated View Definition: Find books from amazon.com and DBLP, join on author, group by authors and title
26Database Mediation Theory Primer
27Mediator Query Processing
Query Q
Integrated View Definition V
Translator
parsed plan
Composition (Q o V)
composed plan
Rewriter/Optimizer
Compile-time
optimized plan
Run-time
Plan Execution
28Logic View Definitions (Global-as-View)
or: Querying and Reasoning with the Family ...
- Warm up: Who says this?
- "You are my son, but I'm not your father!"
- The mother!
29Logic View Definitions (Global-as-View)
- Global-as-View (GAV)
- Integrated view V is defined in terms of the sources Src_1, ..., Src_k
- Given the following source databases:
- Src_1 schema: father(Father,Child), mother(Mother,Child)
- Src_2 schema: spouse(Spouse, Spouse)
- Src_3 schema: male(Person), female(Person)
- Can you define integrated views V for ... ?
- parent(Parent,Child)
- short: parent/2, i.e., table/relation name is parent, arity (columns) is 2
- son/2, daughter/2
- brother/2, sister/2
- brother_in_law/2, sister_in_law/2
- aunt/2, uncle/2
- married/2, bachelor/2
30Logic View Definitions (Global-as-View)
Source relations: father/2, mother/2, spouse/2, male/1, female/1
Notation: "←" if, "," conjunction (and), "∨" disjunction (or), "not" negation
- parent(C,P) ←
- father(C,P) ∨ mother(C,P) .
- son(P,S) ←
- parent(S,P) , male(S) .
- brother(X,B) ←
- parent(X,P), son(P,B), X ≠ B .
- brother_in_law(X,B) ←
- sister(X, Z), spouse(Z, B)
- ∨ spouse(X, Z), brother(Z, B) .
31Logic View Definitions (Global-as-View)
Source relations: father/2, mother/2, spouse/2, male/1, female/1
Notation: "←" if, "," conjunction (and), "∨" disjunction (or), "not" negation
- uncle(X, U) ←
- parent(X, Z), brother(Z, U)
- ∨ parent(X, Z), brother_in_law(Z, U) .
- aunt(X, A) ←
- parent(X, Z), sister(Z, A)
- ∨ parent(X, Z), sister_in_law(Z, A) .
- married(X) ←
- spouse(X, _) .
- bachelor(X) ←
- person(X) , not married(X) .
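As a rough illustration of how such global-as-view definitions behave, the following Python sketch evaluates a few of the views over a tiny, made-up source database. The facts are invented, the argument order follows the rules above (child first, as in parent(C,P)), and only the first branch of uncle/2 is implemented, so this is a sketch rather than a full translation.

```python
# Hypothetical source facts, child-first as in the rules: father(Child, Father).
father = {("tom", "bob"), ("ann", "bob")}
mother = {("tom", "sue"), ("ann", "sue"), ("joe", "ann")}
male = {"tom", "bob", "joe"}

def parent():
    # parent(C,P) <- father(C,P) or mother(C,P)
    return father | mother

def son():
    # son(P,S) <- parent(S,P), male(S)
    return {(p, s) for (s, p) in parent() if s in male}

def brother():
    # brother(X,B) <- parent(X,P), son(P,B), X != B
    return {(x, b) for (x, p) in parent()
            for (p2, b) in son() if p == p2 and x != b}

def uncle():
    # uncle(X,U) <- parent(X,Z), brother(Z,U)   (brother_in_law branch omitted)
    return {(x, u) for (x, z) in parent()
            for (z2, u) in brother() if z == z2}

print(brother())  # ann has brother tom
print(uncle())    # joe has uncle tom
```

Each view is computed from scratch from the source relations, which is the "virtual view" reading; a mediator would instead unfold the definitions into source queries.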
32Query Rewriting and Query Evaluation
- Query Rewriting
- Given a user query Q in terms of virtual views V ...
- Find an equivalent query Q' in terms of the sources Src_1, ..., Src_k
- Query Evaluation
- Given a query Q', evaluate Q' over the source databases
- D = Src_1 ∪ ... ∪ Src_k
- Examples
- Q_uncle/2 = { (X,Y) | uncle(X,Y) holds in D }
- Q_toms_uncle/1 = { X | uncle(tom, X) holds in D }
- Q_whose_uncle_is_tom/1 = { X | uncle(X, tom) holds in D }
33Query Rewriting (for GAV)
- Query rewriting
- Given a user query Q in terms of virtual views V ...
- Find an equivalent query Q' in terms of the sources Src_1, ..., Src_k
- Query Q, views V, source schemas S
- View unfolding
- starting with Q, repeatedly replace view predicates by their definitions
- Creating a feasible plan
- here: compute disjunctive normal form (DNF)
- DNF = disjunction of conjunctions (= union of joins)
- order goals within each conjunction according to the sources' query capabilities
34Example
- ?- plan(brother(X0,X1)) .
- brother(X0, X1)
- LQP =>
- (father(X0, X2) v mother(X0, X2))
- ∧ (father(X1, X2) v mother(X1, X2)) ∧ male(X1) ∧ neq(X0, X1)
- brother(X0, X1)
- NNF LQP =>
- (father(X0, X2) v mother(X0, X2))
- ∧ (father(X1, X2) v mother(X1, X2)) ∧ male(X1) ∧ neq(X0, X1)
35Example (Cont'd)
- ?- plan(brother(X0,X1)) .
- brother(X0, X1)
- DNF LQP =>
- father(X0, X2) ∧ father(X1, X2) ∧ male(X1) ∧ neq(X0, X1)
- v mother(X0, X2) ∧ father(X1, X2) ∧ male(X1) ∧ neq(X0, X1)
- v father(X0, X2) ∧ mother(X1, X2) ∧ male(X1) ∧ neq(X0, X1)
- v mother(X0, X2) ∧ mother(X1, X2) ∧ male(X1) ∧ neq(X0, X1)
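The unfolding into DNF shown above can be approximated with a short Python sketch. It is not the prototype's planner: the `VIEWS` table, the tuple encoding of atoms, and the fresh-variable scheme are all assumptions made for illustration. It should yield the same four disjuncts for brother(X0,X1), though possibly in a different order.

```python
from itertools import count, product

fresh = count(2)  # X0 and X1 are the query variables; fresh ones start at X2

def new_var():
    return f"X{next(fresh)}"

# Each view returns a list of alternative bodies (a disjunction of conjunctions).
def parent(c, p):
    return [[("father", c, p)], [("mother", c, p)]]

def son(p, s):
    return [[("parent", s, p), ("male", s)]]

def brother(x, b):
    p = new_var()
    return [[("parent", x, p), ("son", p, b), ("neq", x, b)]]

VIEWS = {"parent": parent, "son": son, "brother": brother}

def unfold(atom):
    """Return the DNF of an atom as a list of conjunctions over source predicates."""
    name, *args = atom
    if name not in VIEWS:
        return [[atom]]  # source (or built-in) predicate: nothing to unfold
    dnf = []
    for body in VIEWS[name](*args):
        # DNF of a conjunction = cross product of the DNFs of its conjuncts
        for combo in product(*(unfold(a) for a in body)):
            dnf.append([a for conj in combo for a in conj])
    return dnf

plan = unfold(("brother", "X0", "X1"))
for disjunct in plan:
    print(disjunct)
```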
36Example (Cont'd)
- ?- plan(brother(X0,X1)) .
- brother(X0, X1)
- Bp ordered LQP =>
- parentDb(father(X1, X2) ∧ father(X0, X2))
- ∧ genderDb(male(X1)) ∧ mediator(neq(X0, X1))
- v parentDb(father(X1, X2) ∧ mother(X0, X2))
- ∧ genderDb(male(X1)) ∧ mediator(neq(X0, X1))
- v parentDb(mother(X1, X2) ∧ father(X0, X2))
- ∧ genderDb(male(X1)) ∧ z_mediator(neq(X0, X1))
- v parentDb(mother(X1, X2) ∧ mother(X0, X2))
- ∧ genderDb(male(X1)) ∧ z_mediator(neq(X0, X1))
37Computing Feasible Plans (Goal Ordering)
- A conjunctive query Q is an expression of the form
- q(X) ← p1(X1), ..., pn(Xn)
- order of subgoals p_i is irrelevant
- An ordered plan P is an expression of the form
- q(X) ← p1(X1), ..., pn(Xn)
- order of subgoals p_i is important
- Problem
- given Q, compute P which is feasible, i.e., observes the limited query capabilities of sources
- Here: binding patterns, i.e., predicates' arguments can be
- b: bound
- f: free
- _: bound or free
38A Simple Algorithm for Ordering Goals
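The transcript does not preserve the algorithm from this slide, so what follows is only one plausible greedy sketch, not necessarily the one presented. It repeatedly picks any subgoal that is callable under one of its binding patterns given the variables bound so far. The pattern table for father, male, and neq is hypothetical, and variables are assumed to start with an uppercase letter.

```python
def is_var(term):
    return term[:1].isupper()

def callable_now(pattern, args, bound):
    """Check one binding pattern (a string over 'b', 'f', '_') against the current bindings."""
    for mode, arg in zip(pattern, args):
        is_bound = (not is_var(arg)) or arg in bound
        if mode == "b" and not is_bound:
            return False
        if mode == "f" and is_bound:
            return False
    return True

def order_goals(goals, patterns):
    """Greedily order subgoals into a feasible plan; return None if the search gets stuck."""
    bound, remaining, plan = set(), list(goals), []
    while remaining:
        ready = next((g for g in remaining
                      if any(callable_now(p, g[1:], bound) for p in patterns[g[0]])),
                     None)
        if ready is None:
            return None
        remaining.remove(ready)
        plan.append(ready)
        bound.update(a for a in ready[1:] if is_var(a))  # results bind the goal's variables
    return plan

# Hypothetical capabilities: father can be scanned, male and neq need bound arguments.
patterns = {"father": ["__"], "male": ["b"], "neq": ["bb"]}
goals = [("neq", "X0", "X1"), ("male", "X1"),
         ("father", "X0", "X2"), ("father", "X1", "X2")]
ordered = order_goals(goals, patterns)
print(ordered)
```

When patterns use only `b` and `_`, calling a goal never makes another goal uncallable, so this greedy pass finds a feasible order whenever one exists; with strict `f` patterns that guarantee no longer holds.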
39Query Containment
- A query Q1 is contained in Q2, denoted Q1 ⊆ Q2,
- if for all possible database instances, the set of answers to Q1 is contained in the set of answers to Q2.
- Q1 and Q2 are called equivalent
- if Q1 ⊆ Q2 and Q2 ⊆ Q1.
- Query containment is undecidable for many languages, e.g., for the relational calculus (SQL).
- For conjunctive queries, the problem is NP-complete (and thus decidable)
- Since query sizes tend to be small (in particular, when compared to database sizes), query containment is still of use in practice (indeed, it is one of the most fundamental tools for logic-based query optimization).
40Query Containment
- Q1(Xs,Ys) is contained in Q2(Xs,Zs) iff
- ALL Xs (EXISTS Ys Q1(Xs,Ys)) → (EXISTS Zs Q2(Xs,Zs))
- iff we can refute its negation
- iff
- NOT ALL Xs
- (EXISTS Ys Q1(Xs,Ys)) → (EXISTS Zs Q2(Xs,Zs))
- iff
- EXISTS Xs (EXISTS Ys Q1(Xs,Ys))
- AND NOT (EXISTS Zs Q2(Xs,Zs))
- iff
- canonical_db(Q1) AND ← Q2(Xs,Zs)
- create database from Q1, then run Q2 as a query ...
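The "create database from Q1, then run Q2 as a query" idea can be sketched in Python for conjunctive queries without negation. The encoding is an assumption made for this example: a query is a pair (head variables, list of body atoms), atoms are tuples, variables start with an uppercase letter, and "freezing" a variable just prefixes it with `frozen_`. The search is plain backtracking, so worst-case exponential, in line with NP-completeness.

```python
def is_var(t):
    return t[:1].isupper()

def match(body, db, subst):
    """Yield every substitution extending subst that maps all atoms of body into db."""
    if not body:
        yield subst
        return
    (name, *args), rest = body[0], body[1:]
    for fact in db:
        if fact[0] != name or len(fact) != len(args) + 1:
            continue
        s = dict(subst)
        if all((s.setdefault(a, v) == v) if is_var(a) else (a == v)
               for a, v in zip(args, fact[1:])):
            yield from match(rest, db, s)

def contained_in(q1, q2):
    """Q1 is contained in Q2 iff Q2, run on the canonical database of Q1, returns Q1's frozen head."""
    (head1, body1), (head2, body2) = q1, q2
    freeze = lambda t: "frozen_" + t if is_var(t) else t
    canonical_db = {(a[0], *map(freeze, a[1:])) for a in body1}
    frozen_head = tuple(map(freeze, head1))
    return any(tuple(s.get(v, v) for v in head2) == frozen_head
               for s in match(body2, canonical_db, {}))

# q1(X) <- father(X,P), male(X)      q2(X) <- father(X,P)
q1 = (("X",), [("father", "X", "P"), ("male", "X")])
q2 = (("X",), [("father", "X", "P")])
print(contained_in(q1, q2), contained_in(q2, q1))
```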
41Query Containment Algorithm (in Prolog)
- Applications
- query minimization (a conjunctive query is minimal if no conjunct can be dropped)
- semantic query optimization
- Q ⊆ denial
- here: denial is an integrity constraint and states what must not hold
- example denial: false ← mother(X,M), father(Y,M)
42Example
- 50% of the clauses of the executable plan are irrelevant ...
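One plausible reading of the 50% figure: with the denial false ← mother(X,M), father(Y,M), two of the four DNF disjuncts of the brother plan can never produce answers. The Python sketch below checks this by testing whether the denial body embeds into each disjunct. The tuple encoding is an assumption of this example, and all denial arguments are treated as variables.

```python
def embeds(pattern, atoms, subst=None):
    """True if every atom of pattern maps into atoms under one consistent variable mapping."""
    subst = {} if subst is None else subst
    if not pattern:
        return True
    (name, *args), rest = pattern[0], pattern[1:]
    for atom in atoms:
        if atom[0] != name or len(atom) != len(args) + 1:
            continue
        trial = dict(subst)
        # Plan variables act as constants; denial variables are bound consistently.
        if all(trial.setdefault(a, v) == v for a, v in zip(args, atom[1:])):
            if embeds(rest, atoms, trial):
                return True
    return False

tail = [("male", "X1"), ("neq", "X0", "X1")]
plan = [
    [("father", "X0", "X2"), ("father", "X1", "X2")] + tail,
    [("mother", "X0", "X2"), ("father", "X1", "X2")] + tail,
    [("father", "X0", "X2"), ("mother", "X1", "X2")] + tail,
    [("mother", "X0", "X2"), ("mother", "X1", "X2")] + tail,
]
denial = [("mother", "X", "M"), ("father", "Y", "M")]  # nobody is both a mother and a father

relevant = [d for d in plan if not embeds(denial, d)]
print(len(relevant), "of", len(plan), "disjuncts remain")
```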
43Mediator Demo
- Computer Science Challenges
- Given a query Q over virtual integrated database V, how to come up with Q' over the source schemas? (cf. Garlic, DiscoveryLink, ...)
- query rewriting of Q(V) into Q'(SRCs) using unfolding and normalization
- computation of feasible orders (NP-complete!?) while minimizing number of chunks sent to sources
- semantic query optimization (reasoning over plans!), e.g., conjunctive query containment is NP-complete [Chandra-Merlin-77]
- A Quick Demo of the current prototype
- Find 3D reconstructions of cells found in cerebellar cortex
- ?- ccdbData('cerebellar cortex').
- Join everything reachable along cerebellar-cortex.(has-a) in UMLS
- ... with concept markup in CCDB
- ... retrieve (links to) results
- ... also show on SmartAtlas tool
44Mediator Demo
45From XML-Based to Logic and Model-Based
(Semantic) Mediation
46What's the Problem with XML and Complex Multiple-Worlds?
- XML is Syntax
- DTDs talk about element nesting
- XML Schema schemas give you data types
- need anything else? => write comments!
- Domain Semantics is complex
- implicit assumptions, hidden semantics
- sources seem unrelated to the non-expert
- Need Structure and Semantics beyond XML trees!
- employ richer OO models
- make domain semantics and glue knowledge explicit
- use ontologies to fix terminology and conceptualization
- avoid ambiguities by using formal semantics
47From XML-Based to Model-Based Mediation
- Data and Knowledge Sharing Potential
- Database Mediation + Knowledge Representation = Model-Based Mediation
- Basic Ideas
- turn primary data sources into knowledge sources
- employ secondary glue knowledge sources
- generic UMLS, ...
- specific community/laboratory ontologies
48Information Integration Landscape
49Knowledge Representation: Relating Theory to the World via Formal Models
"All models are wrong, but some are useful!"
50XML-Based vs. Model-Based Mediation
CM: Descr. Logic, ER, UML, RDF/XML(-Schema), ...
CM-QL: F-Logic, DAML+OIL, ...
51What's the Glue? What's in a Link?
[Figure: link ? between X and Y]
- Syntactic Joins
- ?(X,Y): X.SSN = Y.SSN (equality)
- ?(X,Y): X.UMLS-ID = Y.UID
- Speciality Joins
- ?(X,Y,Score): BLAST(X,Y,Score) (similarity)
- Semantic/Rule-Based Joins
- ?(X,Y,C):
- X isa C, Y isa C, BLAST(X,Y,S), S > 0.8 (homology, lub)
- ?(X,Y,produces,B,increased_in):
- X produces B, B increased_in Y. (rule-based)
- e.g., X = ?-secretase, B = beta amyloid, Y = Alzheimer's disease
- Challenge
- compile semantic joins into efficient syntactic ones
52Model-Based Mediation Methodology ...
- Lift Sources to export CMs
- CM(S) = OM(S) + KB(S) + CON(S)
- Object Model OM(S)
- complex objects (frames), class hierarchy, OO constraints
- Knowledge Base KB(S)
- explicit representation of (hidden) source semantics
- logic rules over OM(S)
- Contextualization CON(S)
- situate OM(S) data using glue maps (GMs)
- domain maps DMs (= ontology)
- terminological knowledge: concepts + roles
- process maps PMs
- procedural knowledge: states + transitions
53... Model-Based Mediation Methodology
- Integrated View Definition (IVD)
- declarative (logic) rules with object-oriented features
- defined over CM(S), domain maps, process maps
- needs "mediation engineers": domain and KRDB experts
- Knowledge-Based Querying and Browsing (runtime)
- mediator composes the user query Q with the IVD
- ... rewrites (Q o IVD), sends subqueries to sources
- ... post-processes returned results (e.g., situate in context)
54Model-Based Mediator Architecture
First results and demos: KIND prototype, formal DM semantics, PMs [SSDBM'00] [VLDB'00] [ICDE'01] [NIH-HB'01] (w/ Gupta, Martone)
55Domain Maps (Ontologies) as Glue Knowledge Sources
- Domain Map = Ontology
- representation of terminological knowledge
- Use in Model-Based Mediation
- (derived) concepts as drop points, anchor points, context for source classes
- compile-time use: view definition, subsumption, classification, ...
- runtime use: querying/deduction, path queries, ...
- Formalisms
- Semantic nets, Thesauri, Frame-logic, Description logics, ...
56Ontologies
- So what is an Ontology?
- definition of things that are relevant to your application
- representation of terminological knowledge (TBox)
- explicit specification of a conceptualization
- concept hierarchy (is-a)
- further semantic relationships between concepts
- abstractions of relational schemas, (E)ER, UML classes, XML Schemas
- Examples
- NCMIR ANATOM
- GO (Gene Ontology)
- UMLS (Unified Medical Language System)
- CYC
57Formalism for Ontologies: Description Logic
- DL definition of "Happy Father"
(Example from Ian Horrocks, U Manchester, UK)
58Description Logic Statements as Rules
- In first-order logic (rule form)
- happyFather(X) ←
- man(X), child(X,C1), child(X,C2), blue(C1), green(C2),
- not ( child(X,C3), poorunhappyChild(C3) ).
- poorunhappyChild(C) ←
- not rich(C), not happy(C).
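Read as a query over a concrete database, the two rules above can be evaluated directly. The following Python sketch does so over an invented set of facts; the people and children are hypothetical, and the rule leaves open whether C1 and C2 are distinct children, which the code mirrors.

```python
# Hypothetical facts (an ABox, in DL terms).
man = {"al", "bo"}
child = {("al", "c1"), ("al", "c2"), ("bo", "c3"), ("bo", "c4"), ("bo", "c5")}
blue = {"c1", "c3"}
green = {"c2", "c4"}
rich = {"c1", "c3", "c4"}
happy = {"c2"}

def poor_unhappy_child(c):
    # poorunhappyChild(C) <- not rich(C), not happy(C)
    return c not in rich and c not in happy

def happy_father(x):
    # happyFather(X) <- man(X), a blue child, a green child,
    #                   and no child that is both poor and unhappy
    kids = {c for (p, c) in child if p == x}
    return (x in man
            and any(c in blue for c in kids)
            and any(c in green for c in kids)
            and not any(poor_unhappy_child(c) for c in kids))

print(happy_father("al"), happy_father("bo"))
```

This is querying in the sense of checking happyFather against one given database; deciding whether one concept subsumes another for all databases is the separate reasoning task.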
59Description Logics
- Terminological Knowledge (TBox)
- Concept Definition (naming of concepts)
- Axiom (constraining of concepts)
- => a mediator's glue knowledge source
- Assertional Knowledge (ABox)
- the marked neuron in image 27
- => the concrete instances/individuals of the concepts/classes that your sources export
60Querying vs. Reasoning
- Querying
- given a DB instance I (= logic interpretation), evaluate a query expression (e.g. SQL, FO formula, Prolog program, ...)
- boolean query: check if I ⊨ φ (i.e., if I is a model of φ)
- (ternary) query: { (X, Y, Z) | I ⊨ φ(X,Y,Z) }
- => check happyFathers in a given database
- Reasoning
- check if I ⊨ φ implies I ⊨ ψ for all databases I,
- i.e., if φ => ψ
- undecidable for FO, F-logic, etc.
- Description Logics are decidable fragments
- concept subsumption, concept hierarchy, classification
- semantic tableaux, resolution, specialized algorithms
61What's in an Answer? (What's in a Link? revisited)
[Figure: link ? between X and Y]
- Semantic/Rule-Based Joins
- ?(X,Y,produces,B,increased_in):
- X produces B, B increased_in Y. (rule-based)
- e.g., X = ?-secretase, B = beta amyloid, Y = Alzheimer's disease
- What is the Erdős number of person P?
- 3
- Really? Why?
- authority based: <VIP> said so
- faith based: don't know but firmly believe
- query statement Q ... derived it from DB I
- query Q ... derived it from DB I and KB T using derivation D
- => logic-based systems often come with explanations (computations as proofs)
62Formalizing Glue Knowledge: Domain Map for SYNAPSE and NCMIR
- Domain Map
- labeled graph with
- concepts ("classes") and
- roles ("associations")
- additional semantics expressed as logic rules
(F-logic)
63Source Contextualization and DM Refinement
- sources can register new concepts at the mediator ...
64Example: ANATOM Domain Map
65Browsing Registered Data with Domain Maps
66Process Maps with Abstractions and Elaborations: From Terminological to Procedural Glue
67Summary: Mediation Scenarios and Techniques
Common Schema | Mediated Schema | Common Glue Maps
SQL, rules | XML query languages | DOOD query languages
Schema Transformations | Syntax-Aware Mappings | Semantics-Aware Mappings
Syntactic Joins | Syntactic Joins | Semantic Joins via Glue Maps
DB expert | DB expert | KRDB + domain expert
68Semantic (Community) Webs
69Combine Everything: Die eierlegende Wollmilchsau (the egg-laying wool-milk-sow)
- Database Federation/Mediation
- query rewriting under GAV/LAV
- w/ binding pattern constraints
- distributed query processing
- Semantic Mediation
- semantic integrity constraints, reasoning w/ plans, automated deduction
- deductive database/logic programming technology, AI stuff ...
- Semantic Web technology
- Scientific Workflow Management
- more procedural than database mediation (often the scientist is the query planner)
- deployment using web services
70B R E A K
- ... followed by demos ...
72GEON SMART Metadata: Multihierarchical Rock Classification for Thematic Queries (GSC)
Genesis
Fabric
Composition
Texture
73GEON SMART Metadata: Multihierarchical Rock Classification for Thematic Queries
http://klin-pc.sdsc.edu:8080/examples/jsp/geon/composition.jsp
74GEON Ontology Demo
- http://klin-pc.sdsc.edu:8080/examples/jsp/geon/old-rock.jsp
- http://klin-pc.sdsc.edu:8080/examples/jsp/geon/rock.jsp
75Architecture of Ontology Based Map Integration
Ontology Mapping
Global Web Map Server
Web Map Server
Web Map Server
Web Map Server
Database
Database
Database
76DOE Scientific Data Management Center
77Example A Scientific Workflow
Microarray analysis
Database search for promoter identification
cDNA Cluster
Promoter sequences
Promoter model
Common promoter alignment
- New candidate target genes
Database search
Adapted from Thomas Werner, Biomolecular Engineering 17: 87-94 (2001)
78Conceptual Workflow
Compute clusters (min. distance)
For each promoter
Select gene-set (cluster-level)
Compute Subsequence labels
For each gene
With all Promoter Models
Compute Joint Promoter Model
79Mapping This Workflow To Web Sites
80Customized CGI Application
83ClustalW Output
Transfac Query Results
84SDM-SciDAC System Architecture
AWF
EWF
web service invocation
web service invocation
ET
ET
query rewriting
semantic type checking
data type conversion
web service matching
Genbank
BLAST
Abstract Task (AT) Repository
Data Parameter Ontologies
Datatype Conversion Repository
Executable Task (ET) Repository
85AWF to EWF
User supplied
Declarative specification
GetGenomicSequence(selectedGene, -GenomicSequence) :-
GENBANK(selectedGene, -cDNASequence),
BLAST(cDNASequence, dbName, format, -rankedGenomicSequenceList).
GetGenomicSequence(selectedGene, -GenomicSequence) :-
GENBANK(selectedGene, -cDNASequence),
BLAST(cDNASequence, QueryType, SortCriteria, OutputType, -rankedGenomicSequenceList).
IdentifyPromoterElements(rankedGenomicSequenceList, -element) :-
PromoterSequences(rankedGenomicSequenceList, getBeginEnd(Species, -Begin, -End), -element).
For each gene
Need extra domain knowledge
Translation to EWF needs creation of iterators
Same functionality, different operational
constraints and availability
87Abstract Task (AT) Registration
88Abstract Task (AT) View and Delete
89Abstract Task (AT) Update
90AWF Design
91EWF Planning and Compilation
92EWF Execution
93BIRN Tools Demo
94Some References (starting points)
- XML
- General: http://xml.coverpages.org/xml.html
- XQuery: http://www.w3.org/XML/Query
- XSLT: http://xml.coverpages.org/xsl.html
- Query Rewriting
- database research literature
- Logic Programming
- Learn Prolog Now!: http://www.coli.uni-sb.de/kris/learn-prolog-now/
- SWI-Prolog (nice free Prolog system): http://www.swi-prolog.org/
- Ontologies
- Web Ontology Language (OWL): http://www.w3.org/TR/owl-features/
- http://www-ksl.stanford.edu/kst/what-is-an-ontology.html
- http://www.cs.utexas.edu/users/mfkb/related.html
- Model-Based Mediation
- http://www.sdsc.edu/ludaesch/Paper/icde01.html
- Semantic Web
- http://www.w3.org/2001/sw/
95References Project Web Sites
- GEOsciences Network (NSF)
- www.geongrid.org
- Biomedical Informatics Research Network (NIH)
- www.nbirn.net
- Science Environment for Ecological Knowledge (NSF)
- seek.ecoinformatics.org
- Scientific Data Management Center (DOE)
- sdm.lbl.gov/sdmcenter/