Very Large Dataset Access and Manipulation: Active Data Repository (ADR) DataCutter and MetaChaos - PowerPoint PPT Presentation

Provided by: JoelS152. Learn more at: http://www.spscicomp.org

Transcript and Presenter's Notes

1
Very Large Dataset Access and ManipulationActive
Data Repository (ADR)DataCutter and MetaChaos
  • Joel Saltz
  • University of Maryland, College Park
  • and
  • Johns Hopkins Medical Institutions
  • http://www.cs.umd.edu/projects/adr

2
What we do
  • Develop database tools for interacting with large
    multi-scale, multi-resolution datasets
  • Run ad-hoc queries, produce data products, and
    support visualization of disk- and tape-based datasets
  • Query, subset and filter very large archival
    datasets
  • Operating system and middleware for very large
    active network attached storage systems
  • Compilers that allow users to easily specify user
    defined data transformations (e.g. using Java
    dialect)
  • Tools targeted at distributed multi-architecture
    platforms

3
Tools to Manage Storage Hierarchy
  • Fast secondary storage (Active Data Repository)
  • Tools for on-demand data product generation,
    interactive data exploration, visualization
  • Target closely coupled sets of processors/disks
  • Archival Storage (DataCutter)
  • Load subset of data from tertiary storage into
    disk cache or client
  • Access data from distributed data collections
  • Preprocess close to data sources
  • Available stand-alone and integrated into the NPACI
    Storage Resource Broker

4
Tool to Couple Applications
  • MetaChaos
  • Parallel programs distribute data structures
    between processor memories
  • Separately developed programs will use different
    schemes to distribute data
  • MetaChaos coordinates movement of data between
    separately developed, compiled parallel programs
  • Layered on a standard message passing layer such
    as MPICH-G or PVM
  • Garlik: integration of MetaChaos with KeLP floor
    plans

5
Irregular Multi-dimensional Datasets
  • Spatial/multi-dimensional multi-scale,
    multi-resolution datasets
  • Applications select portions of one or more
    datasets
  • Selection of data subset makes use of spatial
    index (e.g., R-tree, quad-tree, etc.)
  • Data is not used as-is; preprocessing is generally
    needed, often to reduce data volumes

6
Querying Irregular Multi-dimensional Datasets
  • Irregular datasets
  • Think of disk-based unstructured meshes, data
    structures used in adaptive multiple grid
    calculations, and sensor data
  • Indexed by spatial location (e.g., position on
    earth, position of microscope stage)
  • Spatial query used to specify an iterator
  • Computation on data obtained from the spatial query
  • Computation aggregates data - the resulting data
    product is significantly smaller than the results
    of the range query

7
Loading Datasets into ADR
  • A user
  • must decompose the dataset into data chunks
  • optionally can distribute the chunks across the
    disks and provide an index for accessing them
  • ADR, given data chunks and associated minimum
    bounding rectangles in a set of files,
  • can distribute the data chunks across the disks
    using a Hilbert curve based declustering algorithm,
  • can create an R-tree based index on the dataset.
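The declustering step above can be sketched as: compute a Hilbert-curve index for each chunk's bounding-box center, sort chunks along the curve, and deal them round-robin across disks. A minimal 2-D sketch, assuming illustrative chunk/disk structures (the `xy2d` routine is the classic Hilbert-curve conversion, not ADR's actual code):

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Classic Hilbert-curve conversion: (x, y) on an n x n grid -> distance d
// along the curve. n must be a power of two.
long long xy2d(long long n, long long x, long long y) {
    long long d = 0;
    for (long long s = n / 2; s > 0; s /= 2) {
        long long rx = (x & s) > 0 ? 1 : 0;
        long long ry = (y & s) > 0 ? 1 : 0;
        d += s * s * ((3 * rx) ^ ry);
        if (ry == 0) {                      // rotate the quadrant
            if (rx == 1) { x = s - 1 - x; y = s - 1 - y; }
            std::swap(x, y);
        }
    }
    return d;
}

struct Chunk { long long cx, cy; };  // bounding-box center (illustrative)

// Sort chunks along the Hilbert curve, then deal them round-robin to disks,
// so spatially nearby chunks land on different disks.
std::vector<int> decluster(const std::vector<Chunk>& chunks,
                           int numDisks, long long gridSize) {
    std::vector<int> order(chunks.size());
    for (size_t i = 0; i < order.size(); ++i) order[i] = (int)i;
    std::sort(order.begin(), order.end(), [&](int a, int b) {
        return xy2d(gridSize, chunks[a].cx, chunks[a].cy)
             < xy2d(gridSize, chunks[b].cx, chunks[b].cy);
    });
    std::vector<int> diskOf(chunks.size());
    for (size_t r = 0; r < order.size(); ++r)
        diskOf[order[r]] = (int)(r % numDisks);
    return diskOf;
}
```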

8
Loading Datasets into ADR
  • ADR Data Loading Service
  • Distributes chunks across the disks in the system
    (e.g., using Hilbert curve based declustering)
  • Constructs an R-tree index using bounding boxes
    of the data chunks

Disk Farm
9
Data Loading Service
  • The user must decompose the dataset into chunks
  • For a fully cooked dataset, the user
  • moves the data and index files to the disks (via
    ftp, for example)
  • registers the dataset using ADR utility programs
  • For a half cooked dataset, ADR
  • computes placement information using a Hilbert
    curve-based declustering algorithm,
  • builds an R-tree index,
  • moves the data chunks to the disks,
  • registers the dataset

10
Query Execution in Active Data Repository
  • An ADR Query contains a reference to
  • the data set of interest,
  • a query window (a multi-dimensional bounding box
    in the input dataset's attribute space),
  • default or user defined index lookup functions,
  • user-defined accumulator,
  • user-defined projection and aggregation
    functions,
  • how the results are handled (write to disk, or
    send back to the client).
  • ADR handles multiple simultaneous active queries
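The query components above can be mirrored in a simple descriptor; all names below are hypothetical illustrations, not ADR's actual declarations:

```cpp
#include <string>
#include <vector>

// Hypothetical mirror of the fields an ADR query carries (illustrative only).
struct Box { std::vector<double> lo, hi; };   // multi-dimensional query window

enum class OutputMode { WriteToDisk, SendToClient };

struct AdrQuery {
    std::string dataset;      // reference to the dataset of interest
    Box window;               // bounding box in the dataset's attribute space
    std::string indexLookup;  // default or user-defined index lookup function
    std::string accumulator;  // user-defined accumulator
    std::string projection;   // user-defined projection function
    std::string aggregation;  // user-defined aggregation function
    OutputMode output;        // write to disk, or send back to the client
};
```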

11
ADR Query Execution
Query execution pipeline (from the slide figure):
  • query
  • index lookup
  • generate query plan
  • initialize output
  • aggregate local input data into output
  • combine partial output results
  • send output to clients
12
Dataset Structure
  • Spatial and temporal resolution may depend on
    spatial location
  • Physical quantities computed and stored vary with
    spatial location

13
Processing Irregular Datasets: Example --
Interpolation
(Figure: a portion of raw sensor data, specified by some search criterion,
is projected onto an output grid)
14
Active Data Repository (ADR)
  • Set of services for building parallel databases
    of multi-dimensional datasets
  • enables integration of storage, retrieval and
    processing of multi-dimensional datasets on
    parallel machines.
  • can maintain and jointly process multiple
    datasets.
  • provides support and runtime system for common
    operations such as
  • data retrieval,
  • memory management,
  • scheduling of processing across a parallel
    machine.
  • customizable for various application specific
    processing.

15
Data Processing Scenario
(Figure: source data elements)
16
Data Processing Scenario
(Figure: a mapping function maps source data elements to result data elements)
17
Data Processing Scenario
(Figure: a reduction function aggregates source data elements into
intermediate data elements (accumulator elements) that form the result data
elements)
18
Data Processing Scenario
(Figure: data elements declustered across disks attached to processors P0,
P1, P2 of distributed memory machines)
19
Data Processing
  • Source and result datasets are multi-dimensional
  • Result dataset often smaller than source dataset
  • Perform processing near where source datasets
    live
  • Correctness of the reduction functions does not
    depend on the order source elements are processed

20
Order-independent Reduction Functions
P2
P0
P1
P2
P1
P0
combine phase
reduction phase
P0
P1
P2
P0
P1
P2
Correctness of reduction function does not depend
on the order elements are processed
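The order-independence requirement means the aggregation operator must be commutative and associative, so each processor can reduce its local chunks in whatever order the disks deliver them. A tiny sketch with illustrative data (per-cell max aggregation):

```cpp
#include <algorithm>
#include <vector>

// A source element mapped to one accumulator cell (illustrative layout).
struct Elem { int cell; double value; };

// Order-independent reduction: per-cell max is commutative and associative,
// so any processing order yields the same accumulator.
std::vector<double> reduce(std::vector<double> accum,
                           const std::vector<Elem>& src) {
    for (const Elem& e : src)
        accum[e.cell] = std::max(accum[e.cell], e.value);
    return accum;
}
```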
21
Data Processing Strategies
  • Fully Replicated Accumulator (FRA) -- high memory requirement
  • initialization: replicate the accumulator on all
    processors
  • reduction:
  • read source elements from local disks
  • process source elements with local accumulator
    elements
  • combine: merge the replicated accumulator elements
  • Sparsely Replicated Accumulator (SRA)
  • initialization: replicate accumulator elements only
    where required
  • Distributed Accumulator (DA) -- low memory requirement
  • initialization: partition the accumulator among
    processors
  • reduction:
  • read source elements from local disks
  • send source elements to the processor that owns the
    mapped accumulator elements for processing
22
ADR Applications
  • Visualize Thematic Mapper (TM) Landsat images
  • Global Land Cover Facility
  • Enhanced the capabilities of the GLCF TM metadata
    browser to allow browsing of the raw TM
    images
  • Visualize astronomy data using MPIRE
  • MPIRE/ADR implementation extended the
    functionality of MPIRE to allow out-of-core
    computations
  • MPIRE runs on very large data sets even on
    relatively small numbers of processors.
  • Applications were demonstrated at SC99.

23
ADR Applications
  • Energy and Environment NPACI Alpha project
  • Data repository for flow data, mesh interpolation
    used in coupling flow results to projection,
    transport codes
  • History matching -- examining differences and
    similarities in a set of simulation realizations
  • Virtual Microscope
  • Exploration of large microscopy datasets

24
Pathology
Volume Rendering
Applications
Surface/Groundwater Modeling
Satellite Data Analysis
25
Example, ADR and MetaChaos: Coupling of Surface
Water Codes
  • Carry out a surface water pollution remediation
    using a chain of flow codes and reactive
    transport codes.
  • Codes run on separate platforms and their
    results are stored in ADR which, along with
    MetaChaos, provides the coupling.
  • Parallelization of Projection/Ground Water Code
    using KeLP

26
Projection Code UTPROJ
  • Couples 3D surface water flow model to
    contaminant and salinity transport models, can be
    used as ground water code
  • Implements conservative velocity projection
    method
  • Improves local mass conservation
  • Projection formulation based on mixed finite
    element method

(Figure: map of the Delaware Bay, C&D Canal, and Chesapeake Bay region,
showing the Upper Chesapeake Bay, Philadelphia, Aberdeen Proving Grounds,
Baltimore, the US Naval Academy, and the Atlantic Ocean)
27
Current state of project
(Figure: system diagram labeled ADR)
28
Water Contamination Studies
(Figure: results plotted over simulation time)
29
(Figure only)
30
(Figure only)
31
Example: Split PARSIM
  • UT Austin code PARSIM models flow and reactive
    transport
  • Applications: bay and estuary, reservoir, blood
    flow
  • Computationally intensive flow calculations
  • Data intensive reactive transport (20
    components)
  • Flow and reactive transport run on different
    platforms, coupled using MetaChaos
  • Data archived in ADR on an I/O cluster
  • Reactive transport data analyzed using ADR
    (isosurface contour)

32
ADR Subsets Data, Carries out Iso-surface
Rendering over a Range of Timesteps (VTK client)
33
Other Research
  • DataCutter
  • Supports data subsetting, filters connected by
    streams (coarse grained dataflow).
  • Integrated in NPACI SRB; end-to-end tests
    included spatial subsetting, decompression, and
    clipping of 5TB (uncompressed) datasets
  • Middleware for large scale data storage
  • Building large (50TB) disk based clusters
  • Active disk "disklet" model for placing
    processing near disks
  • Compilers for user-defined functions
  • Data parallel model
  • Users write procedures and customized runtime
    support is generated
  • Interprocedural and slicing analysis

34
New IBM Collaborations
  • Active Network Attached Storage
  • HPSS
  • Assume dedicated storage cluster(s) and zero, one
    or more large SP configurations
  • SDSC
  • Hopkins
  • Florida State

35
HPSS
  • Collaborators: Bob Coyne, Otis Graf
  • Stage high end computing and large scale data
    manipulation on a collection of clusters and
    parallel machines linked by a high bandwidth
    local area network
  • Deploy HPSS to use the very large tape store at
    SDSC for tertiary storage but instantiate the
    data cache in the disk cluster at the University
    of Maryland
  • OC-48 network connection (Abilene) will make it
    possible to separate HPSS disk cache and tape
    library
  • Library routines to invoke filters on data
    obtained from tape
  • Library will use IBM client API to open files
    and to bypass disk cache to directly access data
  • DataCutter filters will process data

36
Software for Network Attached Storage
  • Douglas Pase -- Netfinity Network attached
    Storage (NAS)
  • Extend filesystems to support pipelined
    communicating processes to perform computation as
    data is stored or retrieved.
  • Filter data to implement a database operation
    such as a join or datacube, or to support a more
    specialized data mining or data intensive
    scientific calculation
  • Determine whether and how to replicate frequently
    accessed files, or how to change file placement
    or file striping
  • Related work in context of GPFS filesystem (Roger
    Haskin, IBM Almaden)

37
Details on Collaborative Work with Doug Pase
  • Work distributed using Java-based software agents
    or disklets.
  • Software transported from client to a server,
    executed on server.
  • Client would be the application, and the server
    would be the disk or NAS server.
  • Agent processes local data, sends results back to
    the client as needed.
  • Disk or NAS server can maintain its configuration
    as an appliance, while still offering the
    opportunity to move computations to data.
  • The agent server would restrict any agent's
    access to data or other resources appropriately.
  • Close link with Ongoing Maryland work --
    DataCutter, Active Disk and Java based ADR
    compiler

38
Research Group
  • Alan Sussman
  • Tahsin Kurc
  • Umit Catalyurik
  • Chialin Chang
  • Renato Ferreira
  • Mike Beynon
  • Henrique Andrade

Collaborators: Mary Wheeler's group, Scott
Baden's group
39
Architecture of Active Data Repository
(Figure: Client 1 (parallel) and Client 2 (sequential) send a query to the
Front End and receive results. The Front End comprises the Application Front
End, Query Interface Service, and Query Submission Service; the Back End
comprises the Query Execution Service, Query Planning Service, Dataset
Service, Attribute Space Service, Data Aggregation Service, and Indexing
Service.)
40
ADR Query Execution
(Figure: back-end phases -- Local Reduction Phase, Global Combine Phase,
Output Handling Phase -- with results sent to the Client)
DataCutter
42
DataCutter
  • A suite of middleware for subsetting and
    filtering multi-dimensional datasets stored on
    archival storage systems
  • Integrated with NPACI Storage Resource Broker
    (SRB)
  • Standalone Prototype

43
DataCutter
  • Spatial Subsetting using Range Queries
  • a hyperbox defined in the multi-dimensional space
    underlying the dataset
  • items whose multi-dimensional coordinates fall
    into the box are retrieved.
  • Two-level hierarchical indexing -- summary and
    detailed index files
  • Customizable --
  • Default R-tree index
  • User can add new indexing methods

44
Processing
  • Processing (filtering/aggregations) through
    Filters
  • to reduce the amount of data transferred to the
    client
  • filters can run anywhere, but are intended to run
    near the storage system (i.e., over a local area
    network)
  • Standalone system allows multiple filters placed
    on different platforms
  • The SRB release allows only a single filter, which
    can be placed anywhere
  • Motivated by Uysal's disklet work

45
Filter Framework
    class MyFilter : public AS_Filter_Base {
    public:
        int init(int argc, char *argv[]);
        int process(stream_t *st);
        int finalize(void);
    };
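A concrete filter under this interface might look like the sketch below. `AS_Filter_Base` and `stream_t` are named on the slide, but their definitions here are hypothetical stand-ins so the example is self-contained; the real DataCutter stream API differs:

```cpp
#include <vector>

// Hypothetical stand-ins for the framework types named on the slide.
struct stream_t { std::vector<double> in, out; };

class AS_Filter_Base {
public:
    virtual int init(int argc, char* argv[]) { return 0; }
    virtual int process(stream_t* st) = 0;
    virtual int finalize(void) { return 0; }
    virtual ~AS_Filter_Base() {}
};

// Example data-reducing filter: drop values below a threshold.
class ThresholdFilter : public AS_Filter_Base {
    double cutoff_ = 0.0;
public:
    int init(int argc, char* argv[]) override {
        (void)argc; (void)argv;
        cutoff_ = 0.5;           // a real filter would parse argv here
        return 0;
    }
    int process(stream_t* st) override {
        for (double v : st->in)
            if (v >= cutoff_) st->out.push_back(v);
        return 0;
    }
    int finalize(void) override { return 0; }
};
```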

46
DataCutter -- Subsetting
  • Datasets are partitioned into segments
  • segments are used to index the dataset and are the
    unit of retrieval
  • Indexing very large datasets
  • Multi-level hierarchical indexing scheme
  • Summary index files -- to index a group of
    segments or detailed index files
  • Detailed index files -- to index the segments

47
Placement
  • The dynamic assignment of filters to particular
    hosts for execution is placement (mapping)
  • Optimization criteria
  • Communication
  • leverage filter affinity to dataset
  • minimize communication volume on slower
    connections
  • co-locate filters with large communication volume
  • Computation
  • expensive computation on faster, less loaded hosts

48
Integration of DataCutter with the Storage
Resource Broker
49
Storage Resource Broker (SRB)
  • Middleware between clients and storage resources
  • Remote Access to storage resources.
  • Various types
  • File Systems - UNIX, HPSS, UniTree, DPSS (LBL).
  • DB large objects - Oracle, DB2, Illustra.
  • Uniform client interface (API).

50
Storage Resource Broker (SRB)
  • MCAT - MetaData Catalog
  • Datasets (files) and Collections (directories) -
    inodes and more.
  • Storage resources
  • User information - authentication, access
    privileges, etc.
  • Software package
  • Server, client library, UNIX-like utilities, Java
    GUI
  • Platforms - Solaris, Sun OS, Digital Unix, SGI
    Irix, Cray T90.

51
SRB/DataCutter
  • Support for Range Queries
  • Creation of indices over data sets (composed of a
    set of data files)
  • Subsetting of data sets
  • Search for files or portions of files that
    intersect a given range query
  • Restricted filter operations on portions of files
    (data segments) before returning them to the
    client (to perform filtering or aggregation to
    reduce data volume)

52
SRB/DataCutter System
(Figure: an application (SRB client) issues requests -- File SID, DBLobjID,
ObjSID, Range Query -- to the Storage Resource Broker (SRB), which hosts the
DataCutter Indexing Service and Filtering Service alongside the SRB I/O and
MCAT API. MCAT holds user, resource, and application meta-data in DB2,
Oracle, Illustra, or ObjectStore; distributed storage resources include HPSS,
UniTree, UNIX file systems, and ftp.)
53
SRB/DataCutter Client Interface
  • Creating and Deleting Index

int sfoCreateIndex(srbConn *conn, sfoClass class,
                   int catType, char *inIndexName,
                   char *outIndexName,
                   char *resourceName)

int sfoDeleteIndex(srbConn *conn, sfoClass class,
                   int catType, char *indexName)
54
SRB/DataCutter Client Interface
  • Searching Index -- R-tree index

int sfoSearchIndex(srbConn *conn, sfoClass class,
                   char *indexName, void *query,
                   indexSearchResult *myresult,
                   int maxSegCount)

typedef struct {
    int dim;
    double *min, *max;
} rangeQuery;

int sfoGetMoreSearchResult(srbConn *conn,
                           int continueIndex,
                           indexSearchResult *myresult,
                           int maxSegCount)
55
SRB/DataCutter Client Interface
  • Searching Index -- R-tree index

typedef struct {
    int dim;              /* bounding box dimensions */
    double *min;          /* minimum in each dimension */
    double *max;          /* maximum in each dimension */
} sfoMBR;                 /* bounding box structure */

typedef struct {
    sfoMBR segmentMBR;    /* bounding box of the segment */
    char *objID;          /* object in SRB that contains the segment */
    char *collectionName; /* collection where object is stored */
    unsigned int offset;  /* offset of the segment in the object */
    unsigned int size;    /* size of segment */
} segmentInfo;            /* segment meta-data information */

typedef struct {
    int segmentCount;     /* number of segments returned */
    segmentInfo *segments;/* segment meta-data information */
    int continueIndex;    /* continuation flag */
} indexSearchResult;      /* search result structure */
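The two-call protocol implies a simple client-side loop: issue `sfoSearchIndex`, consume a page of segment meta-data, and keep calling `sfoGetMoreSearchResult` while `continueIndex` is nonzero. A runnable sketch with stubbed, deliberately simplified stand-ins for the SRB types and calls (the real library's signatures, shown above, carry more parameters):

```cpp
#include <vector>

// Simplified stand-ins so the pagination pattern compiles and runs; the real
// definitions live in the SRB/DataCutter client library.
struct srbConn {};
struct segmentInfo { unsigned int offset, size; };
struct indexSearchResult {
    int segmentCount;
    segmentInfo* segments;
    int continueIndex;            // nonzero => more results pending
};

// Stub index: five matching segments, returned in pages of maxSegCount.
static segmentInfo g_match[5] = {{0, 1}, {1, 1}, {2, 1}, {3, 1}, {4, 1}};
static int g_pos = 0;

static void fillPage(indexSearchResult* r, int maxSegCount) {
    r->segments = &g_match[g_pos];
    r->segmentCount = 0;
    while (g_pos < 5 && r->segmentCount < maxSegCount) { ++g_pos; ++r->segmentCount; }
    r->continueIndex = (g_pos < 5) ? g_pos : 0;
}
int sfoSearchIndex(srbConn*, indexSearchResult* r, int maxSegCount) {
    g_pos = 0; fillPage(r, maxSegCount); return 0;
}
int sfoGetMoreSearchResult(srbConn*, int, indexSearchResult* r, int maxSegCount) {
    fillPage(r, maxSegCount); return 0;
}

// The client-side pattern: issue the query, then drain continuation pages.
int countMatchingSegments(srbConn* conn, int maxSegCount) {
    indexSearchResult res;
    int total = 0;
    if (sfoSearchIndex(conn, &res, maxSegCount) != 0) return -1;
    for (;;) {
        total += res.segmentCount;             // consume this page
        if (!res.continueIndex) break;         // no more pages
        sfoGetMoreSearchResult(conn, res.continueIndex, &res, maxSegCount);
    }
    return total;
}
```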
56
Applying Filters
int sfoApplyFilter(srbConn *conn, sfoClass class,
                   char *hostName,
                   int filterID, char *filterArg,
                   int numOfInputSegments,
                   segmentInfo *inputSegments,
                   filterDataResult *myresult,
                   int maxSegCount)

int sfoGetMoreFilterResult(srbConn *conn,
                           int continueIndex,
                           filterDataResult *myresult,
                           int maxSegCount)
57
Applying Filters
typedef struct {
    segmentInfo segInfo;  /* info on segment data buffer after filter oper. */
    char *segment;        /* segment data buffer after filter is applied */
} segmentData;

typedef struct {
    int segmentDataCount; /* segments in segmentData array */
    segmentData *segments;/* segmentData array */
    int continueIndex;    /* continuation flag */
} filterDataResult;
58
Application: Virtual Microscope
  • Interactive software emulation of high power
    light microscope for processing/visualizing image
    datasets
  • 3-D Image Dataset (100MB to 5GB per focal plane)
  • Client-server system organization
  • Rectangular region queries, multiple data chunk
    reply
  • pipeline style processing

59
Virtual Microscope Client
60
VM Application using SRB/DataCutter
(Figure: the client issues a query over a local area network to
SRB/DataCutter, which runs on a distributed collection of workstations over
distributed storage resources. Pipeline stages:)
  • read: read image chunks (via the indexing service)
  • decompress: convert JPEG image chunks into RGB pixels
  • clip: clip the image to the query boundaries
  • zoom: sub-sample to the required magnification
  • view: stitch image pieces together and display the image (on the client)
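The read, decompress, clip, zoom, view chain is coarse-grained dataflow: each stage consumes its predecessor's output. A toy sketch of such a chain on integer "pixels" (the stages below are illustrative stand-ins, not the real VM filters):

```cpp
#include <functional>
#include <utility>
#include <vector>

using Image = std::vector<int>;

// Run a pipeline of stages; each stage transforms the running image and
// hands it to the next, mimicking filters connected by streams.
Image runPipeline(Image img,
                  const std::vector<std::function<Image(Image)>>& stages) {
    for (const auto& stage : stages)
        img = stage(std::move(img));
    return img;
}
```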
61
Experimental Setup
  • UMD 10-node IBM SP (1 4-CPU, 3 2-CPU, 6 1-CPU nodes)
  • HPSS system (10TB tape storage, 500GB disk cache)
  • 4GB JPEG compressed dataset (90GB
    uncompressed), 180k x 180k RGB pixels (200 x 200
    JPEG blocks of 900x900 pixels each)
  • 250GB JPEG compressed dataset (5.6TB
    uncompressed), 1.44M x 1.44M RGB pixels (1600x1600
    JPEG blocks)
  • R-tree index based query lookups
  • Server host: SP 2-CPU node
  • Read, Decompress, Clip, Zoom, View distributed
    between client and server
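The quoted dataset sizes are internally consistent: 200 x 200 JPEG blocks of 900 x 900 pixels gives a 180k x 180k image, and at 3 bytes per RGB pixel that is about 97.2e9 bytes, i.e. roughly 90 GiB, matching the "90GB uncompressed" figure. A quick check:

```cpp
#include <cstdint>

// Side length of the full image from block count and block size.
constexpr std::int64_t side(std::int64_t blocks, std::int64_t blockPixels) {
    return blocks * blockPixels;
}

// Uncompressed size at 3 bytes per RGB pixel.
constexpr std::int64_t rgbBytes(std::int64_t w, std::int64_t h) {
    return w * h * 3;
}
```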

62
Dataset: 250 GB (Compressed), All Computation on
Server
63
Breakdown of DataCutter Costs: 250 GB dataset,
9600x9600 query
64
Effect of Filter Placement: 9600x9600 Query, Warm
Cache
65
Effect of Dataset Size: 4.5Kx4.5K Query, Server
Does Everything but View, Warm Cache
66
The Future
  • Integrated suite of tools for handling very deep
    memory hierarchies
  • Common set of tools for grid and disk cache
    computations
  • Programmability
  • Use XML metadata
  • Ongoing data parallel compiler project -- uses
    Java based user defined functions
  • Applications development toolkit (Visual
    DataCutter)
  • Implementation
  • NPACI
  • Private sector (?)