Dan Crichton, Manager, Enterprise Data Architecture Task

1
Interoperability and Data Architecture for Metadata Development
Biomarkers Knowledge System Meeting
Bethesda, MD, September 8, 2000
Dan Crichton, Manager, Enterprise Data Architecture Task; Principal Investigator, Object Oriented Data Technology Task
Steve Hughes, Lead System Engineer, Planetary Data System; Co-Investigator, Object Oriented Data Technology Task
Thuy Tran, Senior Software Engineer, Enterprise Data Architecture and Object Oriented Data Technology Tasks
Jet Propulsion Laboratory, California Institute of Technology
National Aeronautics and Space Administration
2
Outline
D. Crichton S. Hughes T. Tran
  • Describe JPL and enterprise computing
  • Define an Enterprise Data Architecture
  • Define why an EDA is Critical to NASA
  • Address Data Interoperability Challenges
  • Describe the Object Oriented Data Technology task
  • Case Study: The Planetary Data System
    • Provide an Overview of PDS and Objectives
    • Describe the PDS Organizational Structure
    • What has the PDS accomplished
    • Discuss the role of Metadata within the PDS
  • Demo search of PDS data sets and PTI data sets

3
About JPL and Enterprise Computing
  • JPL is a federally funded research and
    development center (FFRDC) run by Caltech for
    NASA
  • JPL is NASA's lead center for robotic exploration
    of the universe
  • JPL has an enormous amount of data to manage,
    ranging from scientific to engineering to
    institutional data
  • We represent several efforts on both the research
    and enterprise sides of JPL that are addressing
    enterprise architectures for integrating data at
    both JPL and NASA. Such efforts include:
    • Knowledge Management
    • Enterprise Data Architecture Task
    • Planetary Data System
    • Object Oriented Data Technology

4
What is an Enterprise Data Architecture?
  • An enterprise data architecture provides the
    infrastructure necessary to enable development of
    interoperable, enterprise-wide applications
  • Data Interoperability
  • Data Sharing
  • Data Access
  • Facilitate access
  • Reduce complexity
  • Data Management
  • Data Archiving
  • Basic infrastructure to support knowledge
    discovery

5
Why is an EDA Critical to NASA?
  • Interoperability is an important key to unlock
    knowledge discovery
  • Allows scientists the ability to locate critical
    information
  • Enables knowledge management across NASA
  • A key to scientific discovery
  • State of data systems at NASA agency-wide
    • Difficult to access (no standard interfaces)
    • Geographically distributed
    • Have no standard language or protocol for
      interchange (no EDI) agency-wide
    • No common metadata language agency-wide
    • Have no system for registration of data products
    • Have little or no interoperability
    • Have few common terms for describing data

6
Interoperability Challenges and Needs
  • Space scientists cannot easily locate or use
    data across the hundreds, if not thousands, of
    autonomous, heterogeneous, and distributed data
    systems currently in the Space Science community.
  • Heterogeneous Systems
    • Data Management - RDBMS, ODBMS, Home Grown DBMS,
      Binary Files
    • Platforms - UNIX, Linux, WIN3.x/9x/NT, Mac, VMS, ...
    • Interfaces - Web, Windows, Command Line
    • Data Formats - HDF, CDF, NetCDF, PDS, FITS, VICAR,
      ASCII, ...
    • Data Volume - Kilobytes to Terabytes
  • Heterogeneous Disciplines
    • Moving targets and stationary targets
    • Multiple coordinate systems
    • Multiple data object types (images, cubes, time
      series, binary, document)
    • Multiple interpretations of single object types
    • Multiple software solutions to the same problem
  • Incompatible and/or missing metadata

7
Solution: Build a Data Architecture
  • Solution: build a data architecture by initially
    focusing on
    • Metadata management
    • A middleware framework for interoperability

8
Focus on Metadata
  • Metadata is data about data
    • Provides descriptive information about the data
    • Classification, identification, etc.
  • Metadata Example
    • Data Value: 55 (not descriptive)
    • Metadata Values
      • Data Element Name: Vehicle_Speed
      • Unit: Miles per Hour
      • Description: The average velocity of a vehicle.
  • Build metadata repositories that manage
    information about distributed data products (e.g.
    location, target, observation date, etc.)
  • Use standards where appropriate
    • ISO/IEC 11179: A framework for the Specification
      and Standardization of Data Elements
    • Dublin Core: A metadata element set intended to
      facilitate discovery of electronic resources.
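The Vehicle_Speed example on this slide can be sketched in code. This is a minimal illustration only: the class name `DataElement` and the method `describe` are assumptions made for the sketch and are not taken from ISO/IEC 11179 or Dublin Core.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    """Metadata describing a data value (hypothetical structure)."""
    name: str
    unit: str
    description: str

    def describe(self, value):
        """Attach the metadata to a raw value so it becomes interpretable."""
        return f"{self.name} = {value} {self.unit}"


# The bare value 55 says nothing; the metadata gives it meaning.
vehicle_speed = DataElement(
    name="Vehicle_Speed",
    unit="Miles per Hour",
    description="The average velocity of a vehicle.",
)
print(vehicle_speed.describe(55))
```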

9
Focus on Middleware
  • Middleware defined as:
    • "In the computer industry, middleware is a
      general term for any programming that serves to
      glue together or mediate between two separate
      and usually already existing programs. A common
      application of middleware is to allow programs
      written for access to a particular database to
      access other databases."
    • "Messaging is a common service provided by
      middleware programs so that different
      applications can communicate. The systematic
      tying together of disparate applications is known
      as enterprise application integration."
    • Source: http://www.whatis.com
  • Middleware allows for the encapsulation of
    individual data systems
  • Hides uniqueness by introducing the data
    architecture layer
  • Ties distributed applications together and often
    works with an Electronic Data Interchange (EDI)
    type mechanism
  • Enables reuse and promotes standards
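The encapsulation idea can be illustrated with a small sketch. Everything here is hypothetical: `RelationalSystem`, `FileSystemArchive`, and `Middleware` are stand-ins invented for the example, not OODT components. The point is only that callers see one `search()` call while each data system keeps its own native interface.

```python
class RelationalSystem:
    """Stand-in for a data system with its own query interface (hypothetical)."""
    def run_sql(self, where):
        return [f"row matching {where}"]


class FileSystemArchive:
    """Stand-in for a file-based archive with a different interface (hypothetical)."""
    def list_files(self, pattern):
        return [f"file matching {pattern}"]


class Middleware:
    """Presents one search() call and hides each system's unique interface."""
    def __init__(self):
        self._adapters = {}

    def register(self, name, adapter):
        # adapter: a callable that translates a common query into a native call
        self._adapters[name] = adapter

    def search(self, term):
        return {name: adapter(term) for name, adapter in self._adapters.items()}


mw = Middleware()
mw.register("rdbms", lambda term: RelationalSystem().run_sql(f"target = '{term}'"))
mw.register("files", lambda term: FileSystemArchive().list_files(f"*{term}*"))
results = mw.search("MARS")
```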

10
Role of Middleware
[Diagram: Applications, User Interface, and Data, connected through Middleware]
Middleware can tie applications, data, and user
interfaces together and hide the unique interfaces
11
NIST I.T. Architecture for Federal Govt
Redrawn from Federal Enterprise Architecture
Framework version 1.1, September 1999, Chief
Information Officers Council
[Diagram: a chain of architecture layers linked by labeled arrows]
Drives → Information Architecture
Prescribes → Information Systems Architecture
Identifies → Enterprise Data Architecture
Supported by → Delivery Systems Architecture
12
Object Oriented Data Technology Task
  • Research task funded by the Office of Space
    Science (OSS) at NASA
  • Provides a framework for managing data access and
    interoperability
    • Archive Service: for managing data sets
    • Profile Service: for managing metadata profiles
      about data systems, data sets, and data products
    • Product Service: to tie individual data systems
      into a larger enterprise data system
  • Presented a paper at CODATA in March 2000 called
    "Science Search and Retrieval using XML"

13
OODT Focus
  • Focus on building middleware components for an
    enterprise data architecture
  • Focus on building profiles for managing
    metadata information about cross-disciplinary
    resources
  • Provide sufficient layers of abstraction in the
    architecture to isolate technology choices from
    architecture choices
    • XML for the data content
    • CORBA for the data transport
  • Research technologies for implementing a
    distributed data architecture
    • Distributed Object Computing (CORBA, DCOM, etc.)
    • Database Technology (RDBMS, ODBMS)
    • Data Access Technologies (O/JDBC, STEP, XML, etc.)
    • Directory Implementations (LDAP)
    • Data Interchange (XML)
    • Communication Technologies (Web/HTTP, MOM, RPC,
      etc.)

14
OODT Pilot Activity
  • Partner with the Planetary Data System (PDS) to
    address interoperability across 10 PDS silos
  • Build a generic XML Document Type Definition
    (DTD) that will support PDS data dictionary and
    metadata infrastructure
  • Demonstrate how a science query can return data
    across the PDS nodes
  • Demonstrate how the same interface can return
    information between planetary and astrophysics
    data systems

15
OODT Metadata Development
  • Metadata Registry: Develop a data management
    system for managing the semantics of data that is
    shared within and between domains.
    • Terminology Base: Domain-specific name space.
    • Data Dictionary: Inventory of domain terms with
      definitions and other distinguishing attributes.
    • Ontology: A set of concepts, their relationships,
      and constraints, all within the scope of a
      domain.
  • XML for metadata registry and communication
  • The PDS experience with the Planetary Science
    Data Dictionary has shown the criticality of
    metadata in enabling data sharing and system
    interoperability.

16
What is the PDS?
  • PDS is the official planetary science data
    archive for the NASA Office of Space Science
    (OSS) Solar System Exploration (SSE).
  • PDS is chartered to ensure that SSE planetary
    data are archived and available to the scientific
    community.
  • PDS is a distributed system designed to optimize
    scientific oversight in the archiving process.
  • The PDS has been in existence for 10 years.

17
Objectives of the PDS
  • Publish and disseminate documented data sets for
    use in scientific analysis
  • Work with projects to help design, generate, and
    validate data products for placement in archive
  • Develop and maintain archive data standards to
    ensure future usability.
  • Provide expert scientific help to the user
    community.

18
What is meant by a documented Data Set?
  • The goal of the PDS archiving system is for each
    data set to be autonomous, i.e., all information
    required to understand and interpret the data
    should be included in the archive.
  • To that end, an archive package includes:
    • Raw data
    • Data calibrated to physical units
    • Calibration data and algorithms
    • Ancillary data, e.g. observation geometry
    • Higher level data products (maps, projections,
      other aggregations)
    • Documentation for the data, instrument, flight
      project, etc. (metadata)

19
What is the structure of the PDS?
  • PDS is a distributed system designed to optimize
    scientific oversight in the archiving process
  • The PDS is managed by discipline scientists
    working with the project manager
  • PDS Science Discipline Nodes
  • Archival of data and supporting documentation
  • Expertise in researching and interpreting the
    data
  • Expertise in the planning and design of future
    observations and data sets
  • Distribution of data to the community
  • PDS Central Node
  • Program management
  • Project engineering
  • Standards development

20
PDS Nodes and Institutions (Silos)
Geosciences/Washington University
Rings/Ames
Radio Science/Stanford
Small Bodies/UMD
Planetary Plasma/UCLA
Imaging/JPL
Central Node/JPL
Imaging/USGS
Atmospheres/New Mexico State
NAIF/JPL
21
What has the PDS Accomplished?
  • Produced a high-quality peer-reviewed archive of
    Solar System Exploration Data
  • Stored for long-term viability
  • Described by metadata
  • Distributed either online or on CD media
  • Developed a robust standards architecture
  • Planetary Science Data Dictionary - Provides the
    domain of discourse for the planetary science
    community.
  • Planetary Community Model - Provides formalized
    descriptions of the entities and their
    relationships within the planetary science
    community.
  • Developed science driven management structure
  • Responsive to changing mission project
    environment through distributed, science
    discipline oriented nodes.

22
The Use of Metadata in the PDS
  • Locate and Use Data
    • Use context to find data
    • Use context to understand data
  • System Interoperability
    • Use context to share data
  • Correlative Science
    • Use context to find new relationships between data
[Diagram: Planetary Science Model relating Mission, Target, Data Set Collection, Spacecraft, Data Set, Instrument, Spectrum, Time Series, Image, and Document; also labeled Model Attributes, Label, and Data]
23
What is OODT doing for PDS?
  • Problem Statement - In spite of the Web and a
    common standards architecture, the PDS continues
    to be a collection of heterogeneous data systems
    with little resource sharing.
  • Solution
  • Prototype a PDS profile service that will manage
    metadata profiles for data sets, data products,
    and data systems.
  • Prototype PDS product servers to integrate
    individual data systems.
  • Promote the use of archive services by mission
    projects for more efficient production of data
    products.

24
OODT Demonstration
  • Search for PDS data sets
  • Search for PDS images (granules)
  • Search for Astrophysics data by star
  • Searches the Palomar Testbed Interferometer (PTI)
    archive

25
OODT Query Flow
[Diagram: query flow between the components and messages listed below]
Components
  • Search Web Page
  • Web server (search.jsp)
  • Query Client
  • Query Server
  • Profile Server "jpl" with Profile DB
  • Product Server "jpl.pti" with PTI Repository
  • Product Server "jpl.pds" with PDS DVD Jukebox
  • Product Server "jpl.pds.mola" with PDS MOLA Oracle DB
Messages
  • User query
  • XMLQuery (no results)
  • XMLQuery (profiles of resources to handle query)
  • XMLQuery (product search)
  • XMLQuery (data results)
  • XMLQuery (profiles or data results, as requested)
  • XSL (profiles or data products, formatted)
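The two-step flow in this diagram (first ask the profile server which resources can handle a query, then send the query to those product servers) can be sketched as follows. This is a rough illustration under stated assumptions: the `PROFILES` table, the function names, and the query string syntax are invented for the example; only the server names come from the slide.

```python
# Hypothetical profile index: product server name -> contexts it can serve.
PROFILES = {
    "jpl.pti": {"PTI"},
    "jpl.pds": {"PDS"},
    "jpl.pds.mola": {"PDS"},
}


def profile_lookup(context):
    """Step 1: ask the profile server which product servers can handle the query."""
    return sorted(name for name, contexts in PROFILES.items() if context in contexts)


def product_search(server, query):
    """Step 2: stand-in for sending the query to one product server."""
    return f"{server}: results for {query!r}"


def run_query(context, query):
    """Fan the query out to every product server the profile lookup returned."""
    return [product_search(server, query) for server in profile_lookup(context)]


results = run_query("PDS", "TARGET_NAME = MARS")
```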
26
More Information
  • "Science Search and Retrieval using XML" by the OODT
    Team. Presented at the Second National Conference on
    Scientific and Technical Data, National Academy
    of Sciences, Washington D.C.
    • http://oodt.jpl.nasa.gov/doc/papers/codata/paper.pdf
  • Planetary Data System
    • http://pds.jpl.nasa.gov
  • Dublin Core
    • http://purl.oclc.org/dc
  • Extensible Markup Language
    • http://www.w3c.org/XML
  • ISO/IEC 11179 Specification and Standardization
    of Data Elements
  • Object Management Group (CORBA and UML standards)
    • http://www.omg.org
  • Federal CIO Statement on Metadata
    • http://www.cio.gov/docs/metadata.htm
  • National Information Standards Organization
    Z39.50 Information Retrieval Protocol
    • http://www.niso.org/z3950.html

27
Backup Slides

28
JPL Org Chart (partial)
Caltech President
JPL Office of the Director
Institutional Computing/ Chief Information
Officer
Engineering and Science Directorate
Science and Earth Science Programs
Enterprise Infrastructure Office
Enterprise Applications Office
Object Oriented Data Technology
Planetary Data System
Science Data Management and Archiving
29
Org Chart - Responsibility Flow
Program Offices
Implementation Organizations
Enterprise Applications Office
Deliverables
Science Data Management and Archiving
Funding, Programmatic Oversight
Deliverables
Planetary Data System
Deliverables
Object Oriented Data Technology
Task Management Design and Implementation
Responsibility
30
Institutional Enterprise Data Architecture Breakdown
  • Paradigm shift from stovepipe implementations to
    horizontal solutions that cross organizational
    boundaries
  • Include such services as:
  • Enterprise Application Standards
  • Object Services
  • Data Infrastructure Services
  • Database Hosting
  • Metadata Management
  • Data Interchange
  • Information Architecture Services
  • Institutional directory and security access
  • Data system APIs for access
  • Data mining and data warehousing
  • Data Management Services
  • Data archiving

31
JPL Enterprise Architecture (Logical View)
32
What is a profile?
  • A profile is a set of resource definitions
    implemented in XML for data products residing in
    one or more distributed systems
  • Profile servers are CORBA servers that manage XML
    profile definitions
  • Profile servers communicate via XML-over-CORBA
  • Developed Java classes that map XML profiles to a
    Java object

Profile Distributed Node Architecture
33
Profile Server Architecture
34
Profile DTD
<!ELEMENT profiles (profile)>
<!ELEMENT profile (profAttributes, resAttributes, profElement)>
<!ELEMENT profAttributes (profId, profVersion, profTitle, profDesc,
    profType, profStatusId, profSecurityType, profParentId, profChildId,
    profRegAuthority, profRevisionNote, profDataDictId)>
<!ELEMENT resAttributes (Identifier, Title, Format, Description,
    Creator, Subject, Publisher, Contributor, Date, Type, Source,
    Language, Relation, Coverage, Rights, resContext, resAggregation,
    resClass, resLocation)>
<!ELEMENT profElement (elemId, elemName, elemDesc, elemType,
    elemUnit, elemEnumFlag, (elemValue | (elemMinValue, elemMaxValue)),
    elemSynonym, elemObligation, elemMaxOccurrence, elemComment)>
35
Profile Example
<profile>
  <profAttributes>
    <profId>OODT_PDS_DATA_SET_INV_82</profId>
    <profDataDictId>OODT_PDS_DATA_SET_DD_V1.0</profDataDictId>
  </profAttributes>
  <resAttributes>
    <Identifier>VO1/VO2-M-VIS-5-DIM-V1.0</Identifier>
    <Title>VO1/VO2 MARS VISUAL IMAGING SUBSYSTEM DIGITAL IMAGING MODEL</Title>
    <Format>text/html</Format>
    <Language>en</Language>
    <resContext>PDS</resContext>
    <resAggregation>dataSet</resAggregation>
    <resClass>data.dataSet</resClass>
    <resLocation>http://pds.jpl.nasa.gov/cgi-bin/pdsserv.pl?OBJECT_ID=PDS100751</resLocation>
  </resAttributes>
  <profElement>
    <elemId>ARCHIVE_STATUS</elemId>
    <elemName>ARCHIVE_STATUS</elemName>
    <elemType>ENUMERATION</elemType>
    <elemEnumFlag>T</elemEnumFlag>
    <elemValue>ARCHIVED</elemValue>
  </profElement>
  <profElement>
    <elemId>TARGET_NAME</elemId>
    <elemName>TARGET_NAME</elemName>
    <elemType>ENUMERATION</elemType>
    <elemEnumFlag>T</elemEnumFlag>
    <elemValue>MARS</elemValue>
  </profElement>
</profile>
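The slides mention Java classes that map XML profiles to objects. As a rough analogue, the sketch below reads a trimmed copy of the profile example into a plain Python dictionary using the standard library. The function name `load_profile` and the shape of the returned dictionary are assumptions for illustration; they do not reflect the OODT classes themselves.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the profile example from this slide.
PROFILE_XML = """
<profile>
  <profAttributes>
    <profId>OODT_PDS_DATA_SET_INV_82</profId>
  </profAttributes>
  <resAttributes>
    <Identifier>VO1/VO2-M-VIS-5-DIM-V1.0</Identifier>
    <resContext>PDS</resContext>
  </resAttributes>
  <profElement>
    <elemName>ARCHIVE_STATUS</elemName>
    <elemValue>ARCHIVED</elemValue>
  </profElement>
  <profElement>
    <elemName>TARGET_NAME</elemName>
    <elemValue>MARS</elemValue>
  </profElement>
</profile>
"""


def load_profile(xml_text):
    """Map a <profile> document to a dictionary (hypothetical layout)."""
    root = ET.fromstring(xml_text)
    return {
        "id": root.findtext("profAttributes/profId"),
        "identifier": root.findtext("resAttributes/Identifier"),
        "context": root.findtext("resAttributes/resContext"),
        # One entry per profElement: element name -> list of its values.
        "elements": {
            elem.findtext("elemName"): [v.text for v in elem.findall("elemValue")]
            for elem in root.findall("profElement")
        },
    }


profile = load_profile(PROFILE_XML)
```

With the profile in this form, a search for data sets whose TARGET_NAME is MARS becomes a lookup on `profile["elements"]`.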
36
OODT Product Server
  • The Product Server plugs into the OODT framework
    and manages the handshake between the data
    system and the OODT system.
  • Extensible by dynamically loading objects at
    runtime which are specific to the data system
    model
  • Queries and results are passed using an OODT XML
    Query structure
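Loading data-system-specific objects at runtime can be sketched with Python's `importlib`. This is an analogy, not the OODT implementation: the `ProductServer` class, the "module:attribute" configuration string, and the sample file names are assumptions, and the standard-library function `fnmatch.filter` merely stands in for a handler written for one data system.

```python
import importlib


def load_handler(spec):
    """Resolve a 'module:attribute' string to a callable at runtime."""
    module_name, attr = spec.split(":")
    return getattr(importlib.import_module(module_name), attr)


class ProductServer:
    """Generic server; the data-system-specific part is loaded by name."""
    def __init__(self, handler_spec):
        # The handler is chosen by configuration, not hard-coded.
        self._handler = load_handler(handler_spec)

    def query(self, names, pattern):
        return self._handler(names, pattern)


# fnmatch.filter(names, pattern) plays the role of a file-system handler.
server = ProductServer("fnmatch:filter")
matches = server.query(["mola_01.img", "notes.txt"], "*.img")
```

Swapping in a different data system then means changing the configuration string rather than the server code.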

Generic Server
Implementation Class
File Sys
Query
Result
Database
Product Server
37
XML Query Structure
  • Defined as follows:
    • The query description
    • The results
      • Result 1
        • Result Header
        • Result Data
      • Result 2
        • Result Header
        • Result Data
      • ...
      • Result N
        • Result Header
        • Result Data
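A document with this shape (a query description followed by a list of results, each with a header and data) can be built as in the sketch below. The element names `query`, `queryDescription`, `results`, `result`, `resultHeader`, and `resultData` are assumptions chosen for illustration; the slide does not give the actual OODT tag names, and the query string and result contents are made up.

```python
import xml.etree.ElementTree as ET


def build_query(description, results):
    """results: list of (header, data) pairs; empty for a query not yet answered."""
    query = ET.Element("query")
    ET.SubElement(query, "queryDescription").text = description
    results_node = ET.SubElement(query, "results")
    for header, data in results:
        result = ET.SubElement(results_node, "result")
        ET.SubElement(result, "resultHeader").text = header
        ET.SubElement(result, "resultData").text = data
    return query


# The same structure carries the request (no results) and the response.
request = build_query("TARGET_NAME = MARS", [])
response = build_query(
    "TARGET_NAME = MARS",
    [("text/plain", "first product"), ("text/plain", "second product")],
)
response_text = ET.tostring(response, encoding="unicode")
```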

38
Why XML for OODT?
  • XML doesn't provide a "silver bullet," but it
    does allow us to refocus the problem on metadata
  • Metadata is a key to interoperability
  • XML is language neutral
  • Allows the designer to separate the data and the
    transport (re: CORBA vs. XML-over-CORBA)
    • Transport mechanism and data are not tied
      together
    • Could be XML/HTTP
  • Simpler deployments
  • Simpler interfaces
  • Allows technologies to grow and change
    independently
  • Real value of XML is the content

39
CORBA vs XML
  • XML over CORBA/IIOP
    module jpl { module user {
      interface UserManager { string do(string xml); };
    }; };

    <transaction> <findUser> <user>
      <surname>Doe</surname>
    </user> </findUser> </transaction>
  • CORBA method
    module jpl { module user {
      interface User { string getName(); };
      interface UserManager { User findUser(string name); };
    }; };
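The contrast between the two interface styles can be sketched in Python. This is an illustration only, with no CORBA involved: the class names, the sample `USERS` list, and the shape of the XML reply are assumptions. The typed version exposes one method per operation, while the XML version has a single generic entry point and lets the document name the operation.

```python
import xml.etree.ElementTree as ET

USERS = ["Doe", "Roe"]  # made-up sample data


class TypedUserManager:
    """Like the typed CORBA interface: one method per operation."""
    def find_user(self, name):
        return name if name in USERS else None


class XmlUserManager:
    """Like do(string xml): one generic entry point; the XML names the operation."""
    def do(self, xml):
        transaction = ET.fromstring(xml)
        surname = transaction.findtext("findUser/user/surname")
        reply = ET.Element("transaction")
        if surname in USERS:
            # Hypothetical reply format: echo the matching user back.
            user = ET.SubElement(ET.SubElement(reply, "findUser"), "user")
            ET.SubElement(user, "surname").text = surname
        return ET.tostring(reply, encoding="unicode")


request = (
    "<transaction><findUser><user><surname>Doe</surname></user>"
    "</findUser></transaction>"
)
typed_result = TypedUserManager().find_user("Doe")
xml_result = XmlUserManager().do(request)
```

In the XML style, supporting a new operation changes the document content but not the `do` signature, which is one way to read the slide's point about separating data from transport.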

40
Middleware Framework for OODT
Archive Client
OBJECT ORIENTED DATA TECHNOLOGY FRAMEWORK