Digital Government Research Projects at SDSC


1
Digital Government Research Projects at SDSC
  • Chaitan Baru
  • baru_at_sdsc.edu
  • Data Intensive Computing Group
  • SDSC

2
Digital Government Research Projects
  • SGER Grant: Evaluating an Architecture for a
    National Statistical Data Infrastructure
    • 6-month duration ($20K from Census to SANDAG)
    • Evaluate architecture of the FERRETT/DataWeb
      being developed by CDC and Census
  • I2T: An Information Integration Testbed for
    Digital Government
    • 3-year duration
    • Extending the MIX effort at SDSC for mediating
      statistical and geospatial information

3
Government Partners
  • SGER
    • Census Bureau, SANDAG
  • I2T
    • Federal
      • U.S. Census Bureau
      • Bureau of Labor Statistics
      • (NARA, USGS)
    • State
      • State of Pennsylvania Depts. of Community and
        Economic Development (DCED) and Labor and
        Industry (DLI)
    • Local
      • San Diego County SANDAG

4
Project Personnel
  • SDSC
    • Chaitan Baru
    • Amarnath Gupta
    • Bertram Ludaescher
    • Richard Marciano
    • Yannis Papakonstantinou
    • Shabbar Tamblawala
    • Ilya Zaslavsky
  • U.Penn
    • Robert Hollebeek, NCDM
  • U.Michigan
    • Peter Joftis, ICPSR
  • Also consultation with Iowa State
    • Sarah Nusser / Hal Stern

5
SGER: Evaluating the Architecture of the DataWeb
Demographic, economic, environmental, health,
(and more) datasets
  • Query: Locate all surveys related to children
  • Group families by number of children and compute
    mean family income
  • Cross tabulate incidence of violent crime with
    percentage of families with children

[Diagram: Clients; Metadata Catalog Servers (replicated metadata); Sources, including the Current Population Survey (CPS), a crime survey, and a health survey]
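
The second and third example queries on this slide are a grouped aggregate and a cross tabulation. A minimal Python sketch of both follows. All records, field names, and category labels are invented for illustration; nothing here reflects the actual DataWeb interface or any real survey data.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical family records (invented values, not survey data).
families = [
    {"children": 0, "income": 52000},
    {"children": 2, "income": 61000},
    {"children": 2, "income": 45000},
    {"children": 1, "income": 70000},
    {"children": 0, "income": 38000},
]

# Hypothetical per-region categories (invented labels).
regions = [
    {"violent_crime": "high", "families_with_children": "low"},
    {"violent_crime": "high", "families_with_children": "low"},
    {"violent_crime": "low", "families_with_children": "high"},
    {"violent_crime": "low", "families_with_children": "low"},
]


def mean_income_by_children(records):
    """Group families by number of children and return mean income per group."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["children"]].append(rec["income"])
    return {count: mean(incomes) for count, incomes in sorted(groups.items())}


def crosstab(records, row_key, col_key):
    """Count how many records fall into each (row, column) category pair."""
    return Counter((rec[row_key], rec[col_key]) for rec in records)


income_by_children = mean_income_by_children(families)
crime_table = crosstab(regions, "violent_crime", "families_with_children")
```

In a federated setting the interesting part is where these operations run (at the sources or at the client); the sketch only shows what the queries compute.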
6
The SDSC Storage Resource Broker (SRB)
[Diagram, top to bottom]
  • Application (SRB client)
  • SRB Middleware
  • MCAT
  • SRB Servers
  • DB2, Oracle, Illustra, ObjectStore
  • HPSS, UniTree
  • UNIX, ftp
7
SGER Tasks
  • Interview users from Census Bureau, CDC, SANDAG
    re. their wish list for DataWeb
  • Evaluate software architecture based on:
    • Extensibility: ability to add new metadata, new
      data types, new source capabilities
    • Scalability: number of users, number of sources,
      types of platforms
    • Performance: efficient query processing, load
      balancing
    • Availability

8
SGER Tasks
  • Installation of a DataWeb replica site at
    SANDAG/SDSC
  • Study feasibility of XML-based standards for
    representing metadata, to support:
    • Extensible metadata specifications
    • Easy plug-in of new sources (surveys)
    • Addition of new data types, data models (e.g.
      time series)
  • The SGER study will provide valuable input to our
    I2T research

9
I2T: Information Integration Testbed
  • How we formed the team
    • FedWeb extended its membership to NPACI
    • Met with Census and BLS at FedWeb and NPACI
      meetings
    • Made contact with NARA via DARPA/USPTO-funded
      Distributed Object Computation Testbed (DOCT)
      project
    • NARA brought along USGS
    • Bob Hollebeek at U.Penn was already working with
      State of Pennsylvania
    • Made efforts to contact San Diego Association of
      Governments (SANDAG), to bring in the local view

10
The MIX Architecture
[Diagram, top to bottom: MIXm Mediator; XML View(s), one per source; Wrappers; Data Source, XML Data Source, Data Source]
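
The mediator/wrapper layering in this diagram can be sketched in Python as follows. This is an illustration of the general pattern only, not the MIX implementation: the class names (`RowWrapper`, `XmlSource`, `Mediator`), the `xml_view` method, and the sample data are all hypothetical. Generators are used so that source elements are produced only when the mediator pulls them, which is one simple way to get lazy evaluation.

```python
import xml.etree.ElementTree as ET


class RowWrapper:
    """Hypothetical wrapper: exposes rows of a non-XML source as an XML view."""

    def __init__(self, name, rows):
        self.name = name
        self.rows = rows

    def xml_view(self):
        # Elements are built one at a time, only when the consumer asks.
        for row in self.rows:
            elem = ET.Element("record", source=self.name)
            for key, value in row.items():
                ET.SubElement(elem, key).text = str(value)
            yield elem


class XmlSource:
    """A source that is already XML; it only tags and yields its own elements."""

    def __init__(self, name, xml_text):
        self.name = name
        self.root = ET.fromstring(xml_text)

    def xml_view(self):
        for elem in self.root.iter("record"):
            elem.set("source", self.name)
            yield elem


class Mediator:
    """Answers a simple selection query over the XML views of all sources."""

    def __init__(self, sources):
        self.sources = sources

    def query(self, field, predicate):
        for src in self.sources:
            for elem in src.xml_view():
                value = elem.findtext(field)
                if value is not None and predicate(value):
                    yield elem


# Invented sample data for illustration.
census = RowWrapper("census", [
    {"county": "San Diego", "population": 100},
    {"county": "Imperial", "population": 200},
])
labor = XmlSource(
    "labor",
    "<records><record><county>San Diego</county><jobs>50</jobs></record></records>",
)
mediator = Mediator([census, labor])
hits = [
    (e.get("source"), e.findtext("county"))
    for e in mediator.query("county", lambda v: v == "San Diego")
]
```

The client sees a single query interface; whether a source needed a wrapper is hidden behind its XML view.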
11
I2T Research Topics
  • Extending the MIX system to support integration
    and mediation of statistical and geospatial
    information sources
  • Spatial/statistical extensions to the XMAS query
    language
  • E.g. define a basic set of operators in a spatial
    algebra
  • Expose operators at language level
  • Provide mappings from declarative query language
    to navigational interfaces provided by GIS
  • Wrapping GIS sources
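
As one possible reading of "a basic set of operators in a spatial algebra" exposed at the language level, the Python sketch below defines three operators over axis-aligned bounding boxes and registers them by name so a query evaluator can look them up. The `Box` type, the operator names, and the `evaluate` helper are assumptions made for illustration; the slides do not specify the operator set or how XMAS would expose it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box, a stand-in for a real geometry type."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def intersects(a: Box, b: Box) -> bool:
    """True if the two boxes share at least one point."""
    return (a.xmin <= b.xmax and b.xmin <= a.xmax
            and a.ymin <= b.ymax and b.ymin <= a.ymax)


def contains(a: Box, b: Box) -> bool:
    """True if box b lies entirely inside box a."""
    return (a.xmin <= b.xmin and a.ymin <= b.ymin
            and a.xmax >= b.xmax and a.ymax >= b.ymax)


def intersection(a: Box, b: Box) -> Optional[Box]:
    """Return the overlapping box, or None if the boxes are disjoint."""
    if not intersects(a, b):
        return None
    return Box(max(a.xmin, b.xmin), max(a.ymin, b.ymin),
               min(a.xmax, b.xmax), min(a.ymax, b.ymax))


# Name-to-function registry: a hypothetical way for a query language
# to refer to spatial operators by name.
OPERATORS = {
    "intersects": intersects,
    "contains": contains,
    "intersection": intersection,
}


def evaluate(op_name, left, right):
    """Apply a named spatial operator to two operands."""
    return OPERATORS[op_name](left, right)


a = Box(0, 0, 4, 4)
b = Box(2, 2, 6, 6)
c = Box(1, 1, 2, 2)
```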

12
I2T Research Topics
  • Incorporating source metadata as part of query
    processing
  • Identifying and defining appropriate metadata
  • Dealing with heterogeneity in accuracy,
    resolution, feature space, schema
  • Supporting iterative or user-guided query
    processing
  • Query processing with inexact values
    • Value ranges, probabilistic quantities, similar
      strings
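
Two of the inexact-value cases listed above, value ranges and similar strings, can be sketched in Python with interval overlap and the standard library's `difflib.SequenceMatcher`. The 0.8 similarity threshold, the record layout, and the sample values are assumptions chosen for illustration; probabilistic quantities are not covered by this sketch.

```python
from difflib import SequenceMatcher


def ranges_overlap(a, b):
    """True if closed intervals a=(lo, hi) and b=(lo, hi) share any value."""
    lo1, hi1 = a
    lo2, hi2 = b
    return lo1 <= hi2 and lo2 <= hi1


def similar(s, t, threshold=0.8):
    """True if the case-insensitive similarity ratio meets the threshold."""
    return SequenceMatcher(None, s.lower(), t.lower()).ratio() >= threshold


def match_records(records, name_query, income_range):
    """Select records whose name is similar and whose income range overlaps."""
    return [
        rec for rec in records
        if similar(rec["name"], name_query)
        and ranges_overlap(rec["income_range"], income_range)
    ]


# Invented records: one misspelled name, one different name, one non-overlapping range.
records = [
    {"name": "San Deigo", "income_range": (40000, 50000)},
    {"name": "Sacramento", "income_range": (45000, 55000)},
    {"name": "San Diego", "income_range": (70000, 80000)},
]
matches = match_records(records, "San Diego", (45000, 60000))
```

Only the misspelled "San Deigo" record is selected here: it is close enough to the query string and its range overlaps the requested one.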

13
I2T Research Topics
  • DTD-guided wrapping of unstructured text
    • Converting Census codebooks to XML based on the
      Data Documentation Initiative (DDI) DTD
  • Applications of the I2T infrastructure
    • The DataWeb
    • Distributed decision support and data mining
      applications
    • Sociology Workbench: access to remote DDI-encoded
      survey information. Ability to read DDI-encoded
      survey codebooks and other XML data sets
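
To make the codebook-to-XML idea concrete, the Python sketch below wraps a small plain-text codebook as XML. The input layout (`VAR` lines followed by indented `code = label` lines) is invented for illustration and is not the Census codebook format. The element names (`codeBook`, `dataDscr`, `var`, `labl`, `catgry`, `catValu`) are assumed to follow DDI codebook naming; the output is not validated against the DDI DTD.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical plain-text codebook fragment.
CODEBOOK_TEXT = """\
VAR AGEGRP  Age group of respondent
  1 = Under 18
  2 = 18 and over
VAR SEX  Sex of respondent
  1 = Male
  2 = Female
"""

VAR_LINE = re.compile(r"^VAR\s+(\w+)\s+(.*)$")    # variable name and label
CAT_LINE = re.compile(r"^\s+(\S+)\s*=\s*(.*)$")   # indented "code = label"


def codebook_to_xml(text):
    """Convert the text codebook into an element tree with DDI-style names."""
    root = ET.Element("codeBook")
    data_dscr = ET.SubElement(root, "dataDscr")
    current_var = None
    for line in text.splitlines():
        var_match = VAR_LINE.match(line)
        cat_match = CAT_LINE.match(line)
        if var_match:
            # Start a new variable description.
            current_var = ET.SubElement(data_dscr, "var", name=var_match.group(1))
            ET.SubElement(current_var, "labl").text = var_match.group(2).strip()
        elif cat_match and current_var is not None:
            # Attach a category (code value plus label) to the current variable.
            catgry = ET.SubElement(current_var, "catgry")
            ET.SubElement(catgry, "catValu").text = cat_match.group(1)
            ET.SubElement(catgry, "labl").text = cat_match.group(2).strip()
    return root


codebook = codebook_to_xml(CODEBOOK_TEXT)
variables = codebook.findall("./dataDscr/var")
```

A DTD-guided wrapper would go further and use the DTD itself to decide which elements are allowed where; this sketch hard-codes that structure.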

14
Some Related Projects
  • R&D project with ESRI
  • Evaluation of the ArcXML specification
  • Provide advice for the next version, which will
    include metadata and other services
  • Advise/help ESRI in their OGC Web Mapping Testbed
    2 effort
  • Evaluate the Geography Network system
  • GNXML
  • ESRI will be industrial partner in California
    Institute for Telecommunications and Information
    Technology (Cal-IT2)
  • Geoinformatics ITR
  • Working with the geoscience community
  • GeoGrid