Title: Very Large Dataset Access and Manipulation: Active Data Repository (ADR), DataCutter, and MetaChaos
1 Very Large Dataset Access and Manipulation: Active Data Repository (ADR), DataCutter, and MetaChaos
- Joel Saltz
- University of Maryland, College Park
- and
- Johns Hopkins Medical Institutions
- http://www.cs.umd.edu/projects/adr
2 What we do
- Develop database tools for interacting with large multi-scale, multi-resolution datasets
- Ad-hoc queries; production of data products; support for visualization of disk- and tape-based datasets
- Query, subset, and filter very large archival datasets
- Operating system and middleware for very large active network-attached storage systems
- Compilers that allow users to easily specify user-defined data transformations (e.g., using a Java dialect)
- Tools targeted at distributed multi-architecture platforms
3 Tools to Manage the Storage Hierarchy
- Fast secondary storage (Active Data Repository)
  - Tools for on-demand data product generation, interactive data exploration, and visualization
  - Targets closely coupled sets of processors/disks
- Archival storage (DataCutter)
  - Loads subsets of data from tertiary storage into a disk cache or client
  - Accesses data from distributed data collections
  - Preprocesses close to data sources
  - Stand-alone and integrated into the NPACI Storage Resource Broker
4 Tool to Couple Applications
- MetaChaos
  - Parallel programs distribute data structures across processor memories
  - Separately developed programs will use different schemes to distribute data
  - MetaChaos coordinates the movement of data between separately developed, separately compiled parallel programs
  - Layered on a standard message-passing layer such as MPICH-G or PVM
  - Garlik integration of MetaChaos with KeLP floor plans
5 Irregular Multi-dimensional Datasets
- Spatial/multi-dimensional, multi-scale, multi-resolution datasets
- Applications select portions of one or more datasets
- Selection of a data subset makes use of a spatial index (e.g., R-tree, quad-tree)
- Data are not used as-is; preprocessing is generally needed, often to reduce data volumes
6 Querying Irregular Multi-dimensional Datasets
- Irregular datasets
  - Think of disk-based unstructured meshes, data structures used in adaptive multiple-grid calculations, sensor data
  - Indexed by spatial location (e.g., position on Earth, position of a microscope stage)
- A spatial query is used to specify an iterator
  - Computation on the data obtained from the spatial query
  - Computation aggregates data; the resulting data product is significantly smaller than the results of the range query
7 Loading Datasets into ADR
- A user
  - should decompose the dataset into data chunks
  - optionally can distribute the chunks across the disks and provide an index for accessing them
- ADR, given data chunks and associated minimum bounding rectangles in a set of files,
  - can distribute the data chunks across the disks using a Hilbert-curve based declustering algorithm
  - can create an R-tree based index on the dataset
8 Loading Datasets into ADR
- ADR Data Loading Service
  - Distributes chunks across the disks in the system (e.g., using Hilbert-curve based declustering)
  - Constructs an R-tree index using the bounding boxes of the data chunks
(figure: disk farm)
9 Data Loading Service
- The user must decompose the dataset into chunks
- For a fully cooked dataset, the user
  - moves the data and index files to the disks (via ftp, for example)
  - registers the dataset using ADR utility programs
- For a half-cooked dataset, ADR
  - computes placement information using a Hilbert-curve based declustering algorithm
  - builds an R-tree index
  - moves the data chunks to the disks
  - registers the dataset
10 Query Execution in the Active Data Repository
- An ADR query contains a reference to
  - the dataset of interest,
  - a query window (a multi-dimensional bounding box in the input dataset's attribute space),
  - default or user-defined index lookup functions,
  - a user-defined accumulator,
  - user-defined projection and aggregation functions,
  - how the results are handled (written to disk, or sent back to the client)
- ADR handles multiple simultaneous active queries
11 ADR Query Execution
(figure: query flow -- generate query plan; index lookup; initialize output; aggregate local input data into output; combine partial output results; send output to clients)
12 Dataset Structure
- Spatial and temporal resolution may depend on spatial location
- The physical quantities computed and stored vary with spatial location
13 Processing Irregular Datasets: Example -- Interpolation
(figure: an output grid onto which a projection is carried out; the portion of raw sensor data corresponding to some search criterion is specified)
14 Active Data Repository (ADR)
- A set of services for building parallel databases of multi-dimensional datasets
  - enables integration of storage, retrieval, and processing of multi-dimensional datasets on parallel machines
  - can maintain and jointly process multiple datasets
  - provides support and a runtime system for common operations such as
    - data retrieval,
    - memory management,
    - scheduling of processing across a parallel machine
  - customizable for various application-specific processing
15 Data Processing Scenario
(figure: source data elements)
16 Data Processing Scenario
(figure: a mapping function maps source data elements to result data elements)
17 Data Processing Scenario
(figure: a reduction function maps source data elements to intermediate data elements (accumulator elements), which yield the result data elements)
18 Data Processing Scenario
(figure: data elements declustered across disks attached to processors P0, P1, P2 of a distributed memory machine)
19 Data Processing
- Source and result datasets are multi-dimensional
- The result dataset is often smaller than the source dataset
- Processing is performed near where the source datasets live
- Correctness of the reduction functions does not depend on the order in which source elements are processed
20 Order-independent Reduction Functions
(figure: reduction phase followed by combine phase across processors P0, P1, P2)
- Correctness of the reduction function does not depend on the order in which elements are processed
21 Data Processing Strategies
- Fully Replicated Accumulator (FRA) -- high memory requirement
  - initialization: replicate the accumulator on all processors
  - reduction:
    - read source elements from local disks
    - process source elements against local accumulator elements
  - combine: merge the replicated accumulator elements
- Sparsely Replicated Accumulator (SRA)
  - initialization: replicate accumulator elements only where required
- Distributed Accumulator (DA) -- low memory requirement
  - initialization: partition the accumulator among processors
  - reduction:
    - read source elements from local disks
    - send source elements to the processor that owns the mapped accumulator elements for processing
22 ADR Applications
- Visualize Thematic Mapper (TM) Landsat images
  - Global Land Cover Facility
  - Enhanced the capabilities of the GLCF TM metadata browser to allow browsing of the raw TM images
- Visualize astronomy data using MPIRE
  - The MPIRE/ADR implementation extended the functionality of MPIRE to allow out-of-core computations
  - MPIRE runs on very large datasets even on relatively small numbers of processors
- Applications were demonstrated at SC99
23 ADR Applications
- Energy and Environment NPACI Alpha project
  - Data repository for flow data; mesh interpolation used in coupling flow results to projection and transport codes
  - History matching -- examining differences and similarities in a set of simulation realizations
- Virtual Microscope
  - Exploration of large microscopy datasets
24 Applications
(figure: pathology, volume rendering, surface/groundwater modeling, satellite data analysis)
25 Example: ADR and MetaChaos Coupling of Surface Water Codes
- Carry out surface water pollution remediation using a chain of flow codes and reactive transport codes
- The codes run on separate platforms; their results are stored in ADR which, along with MetaChaos, provides the coupling
- Parallelization of the projection/groundwater code using KeLP
26 Projection Code UTPROJ
- Couples a 3D surface water flow model to contaminant and salinity transport models; can be used as a groundwater code
- Implements a conservative velocity projection method
- Improves local mass conservation
- Projection formulation based on a mixed finite element method
(figure: Delaware Bay, C&D Canal, and Chesapeake Bay -- showing the Upper Chesapeake Bay, Philadelphia, the C&D Canal, Aberdeen Proving Grounds, Baltimore, the US Naval Academy, Delaware Bay, and the Atlantic Ocean)
27 Current State of the Project
(figure: ADR)
28 Water Contamination Studies
(figure: simulation time)
29-30 (figures)
31 Example: Split PARSIM
- The UT Austin code PARSIM models flow and reactive transport
- Applications: bay and estuary, reservoir, blood flow
- Computationally intensive flow calculations
- Data-intensive reactive transport (20 components)
- Flow and reactive transport run on different platforms, coupled using MetaChaos
- Data archived in ADR on an I/O cluster
- Reactive transport data analyzed using ADR (iso-surface contour)
32 ADR Subsets Data, Carries out Iso-surface Rendering Over a Range of Timesteps (vtk client)
33 Other Research
- DataCutter
  - Supports data subsetting; filters connected by streams (coarse-grained dataflow)
  - Integrated into the NPACI SRB; end-to-end tests included spatial subsetting, decompression, and clipping of 5 TB (uncompressed) datasets
- Middleware for large-scale data storage
  - Building large (50 TB) disk-based clusters
  - Active-disk "disklet" model for placing processing near disks
- Compilers for user-defined functions
  - Data-parallel model
  - Users write procedures and customized runtime support is generated
  - Interprocedural and slicing analysis
34 New IBM Collaborations
- Active network-attached storage
- HPSS
- Assume dedicated storage cluster(s) and zero, one, or more large SP configurations
- SDSC
- Hopkins
- Florida State
35 HPSS
- Collaborators: Bob Coyne, Otis Graf
- Stage high-end computing and large-scale data manipulation on a collection of clusters and parallel machines linked by a high-bandwidth local area network
- Deploy HPSS to use the very large tape store at SDSC for tertiary storage, but instantiate the data cache in the disk cluster at the University of Maryland
- An OC-48 network connection (Abilene) will make it possible to separate the HPSS disk cache and tape library
- Library routines to invoke filters on data obtained from tape
- The library will use the IBM client API to open files and to bypass the disk cache to access data directly
- DataCutter filters will process the data
36 Software for Network Attached Storage
- Douglas Pase -- Netfinity Network Attached Storage (NAS)
- Extend filesystems to support pipelined communicating processes that perform computation as data is stored or retrieved
- Filter data to implement a database operation such as a join or datacube, or to support a more specialized data mining or data-intensive scientific calculation
- Determine whether and how to replicate frequently accessed files, or how to change file placement or file striping
- Related work in the context of the GPFS filesystem (Roger Haskin, IBM Almaden)
37 Details on Collaborative Work with Doug Pase
- Work is distributed using Java-based software agents, or disklets
- Software is transported from the client to a server and executed on the server
- The client would be the application; the server would be the disk or NAS server
- The agent processes local data and sends results back to the client as needed
- The disk or NAS server can maintain its configuration as an appliance while still offering the opportunity to move computations to the data
- The agent server would restrict any agent's access to data or other resources appropriately
- Close link with ongoing Maryland work -- DataCutter, active disks, and the Java-based ADR compiler
38 Research Group
- Alan Sussman
- Tahsin Kurc
- Umit Catalyurek
- Chialin Chang
- Renato Ferreira
- Mike Beynon
- Henrique Andrade
Collaborators: Mary Wheeler's group, Scott Baden's group
39 Architecture of the Active Data Repository
(figure: Client 1 (parallel) and Client 2 (sequential) submit queries through an application front end and receive results. Front end: Query Interface Service, Query Submission Service. Back end: Query Execution Service, Query Planning Service, Dataset Service, Attribute Space Service, Data Aggregation Service, Indexing Service)
40 ADR Query Execution
(figure: local reduction phase, global combine phase, output handling phase; output delivered to the client)
41 DataCutter
42 DataCutter
- A suite of middleware for subsetting and filtering multi-dimensional datasets stored on archival storage systems
- Integrated with the NPACI Storage Resource Broker (SRB)
- Standalone prototype
43 DataCutter
- Spatial subsetting using range queries
  - a hyperbox defined in the multi-dimensional space underlying the dataset
  - items whose multi-dimensional coordinates fall into the box are retrieved
- Two-level hierarchical indexing -- summary and detailed index files
- Customizable
  - Default R-tree index
  - Users can add new indexing methods
44 Processing
- Processing (filtering/aggregation) through filters
  - to reduce the amount of data transferred to the client
  - filters can run anywhere, but are intended to run near the storage system (i.e., over a local area network)
- The standalone system allows multiple filters placed on different platforms
- The SRB release allows only a single filter, which can be placed anywhere
- Motivated by Uysal's disklet work
45 Filter Framework

class MyFilter : public AS_Filter_Base {
public:
    int init(int argc, char *argv[]);
    int process(stream_t *st);
    int finalize(void);
};
46 DataCutter -- Subsetting
- Datasets are partitioned into segments
  - used to index the dataset; the unit of retrieval
- Indexing very large datasets
  - Multi-level hierarchical indexing scheme
  - Summary index files -- index a group of segments or detailed index files
  - Detailed index files -- index the segments
47 Placement
- Placement (mapping) is the dynamic assignment of filters to particular hosts for execution
- Optimization criteria
  - Communication
    - leverage filter affinity to the dataset
    - minimize communication volume on slower connections
    - co-locate filters with large communication volumes
  - Computation
    - place expensive computation on faster, less loaded hosts
48 Integration of DataCutter with the Storage Resource Broker
49 Storage Resource Broker (SRB)
- Middleware between clients and storage resources
- Remote access to storage resources
- Various types
  - File systems -- UNIX, HPSS, UniTree, DPSS (LBL)
  - DB large objects -- Oracle, DB2, Illustra
- Uniform client interface (API)
50 Storage Resource Broker (SRB)
- MCAT -- Metadata Catalog
  - Datasets (files) and collections (directories) -- inodes and more
  - Storage resources
  - User information -- authentication, access privileges, etc.
- Software package
  - Server, client library, UNIX-like utilities, Java GUI
  - Platforms -- Solaris, SunOS, Digital Unix, SGI Irix, Cray T90
51 SRB/DataCutter
- Support for range queries
  - Creation of indices over datasets (composed sets of data files)
  - Subsetting of datasets
    - Search for files, or portions of files, that intersect a given range query
    - Restricted filter operations on portions of files (data segments) before returning them to the client (to perform filtering or aggregation to reduce data volume)
52 SRB/DataCutter System
(figure: an application (SRB client) issues range queries to the Storage Resource Broker; the SRB resolves file, DB large-object, and object segment IDs through the MCAT and application metadata (DB2, Oracle, Illustra, ObjectStore), invokes the DataCutter indexing and filtering services, and accesses distributed storage resources (HPSS, UniTree, UNIX, ftp) via the SRB I/O and MCAT APIs)
53 SRB/DataCutter Client Interface
- Creating and deleting an index

int sfoCreateIndex(srbConn *conn, sfoClass class, int catType,
                   char *inIndexName, char *outIndexName,
                   char *resourceName);

int sfoDeleteIndex(srbConn *conn, sfoClass class, int catType,
                   char *indexName);
54 SRB/DataCutter Client Interface
- Searching an index -- R-tree index

int sfoSearchIndex(srbConn *conn, sfoClass class, char *indexName,
                   void *query, indexSearchResult *myresult,
                   int maxSegCount);

typedef struct {
    int dim;
    double *min, *max;
} rangeQuery;

int sfoGetMoreSearchResult(srbConn *conn, int continueIndex,
                           indexSearchResult *myresult,
                           int maxSegCount);
55 SRB/DataCutter Client Interface
- Searching an index -- R-tree index

typedef struct {
    int dim;               /* bounding box dimensions */
    double *min;           /* minimum in each dimension */
    double *max;           /* maximum in each dimension */
} sfoMBR;                  /* bounding box structure */

typedef struct {
    sfoMBR segmentMBR;     /* bounding box of the segment */
    char *objID;           /* object in SRB that contains the segment */
    char *collectionName;  /* collection where the object is stored */
    unsigned int offset;   /* offset of the segment in the object */
    unsigned int size;     /* size of the segment */
} segmentInfo;             /* segment meta-data information */

typedef struct {
    int segmentCount;      /* number of segments returned */
    segmentInfo *segments; /* segment meta-data information */
    int continueIndex;     /* continuation flag */
} indexSearchResult;       /* search result structure */
56 Applying Filters

int sfoApplyFilter(srbConn *conn, sfoClass class, char *hostName,
                   int filterID, char *filterArg,
                   int numOfInputSegments,
                   segmentInfo *inputSegments,
                   filterDataResult *myresult,
                   int maxSegCount);

int sfoGetMoreFilterResult(srbConn *conn, int continueIndex,
                           filterDataResult *myresult,
                           int maxSegCount);
57 Applying Filters

typedef struct {
    segmentInfo segInfo;   /* info on the segment data buffer after the filter operation */
    char *segment;         /* segment data buffer after the filter is applied */
} segmentData;

typedef struct {
    int segmentDataCount;  /* segments in the segmentData array */
    segmentData *segments; /* segmentData array */
    int continueIndex;     /* continuation flag */
} filterDataResult;
58 Application: Virtual Microscope
- Interactive software emulation of a high-power light microscope for processing/visualizing image datasets
- 3-D image dataset (100 MB to 5 GB per focal plane)
- Client-server system organization
- Rectangular region queries; multiple data chunk replies
- Pipeline-style processing
59 Virtual Microscope Client
60 VM Application using SRB/DataCutter
(figure: the Virtual Microscope client issues queries over a local area network to SRB/DataCutter running on a distributed collection of workstations with distributed storage resources; a pipeline of filters processes the image data)
- read: read image chunks
- decompress: convert JPEG image chunks into RGB pixels
- clip: clip the image to the query boundaries
- zoom: sub-sample to the required magnification
- view: stitch image pieces together and display the image
61 Experimental Setup
- UMD 10-node IBM SP (1 4-CPU, 3 2-CPU, 6 1-CPU nodes)
- HPSS system (10 TB tape storage, 500 GB disk cache)
- 4 GB JPEG-compressed dataset (90 GB uncompressed), 180K x 180K RGB pixels (200 x 200 JPEG blocks of 900x900 pixels each)
- 250 GB JPEG-compressed dataset (5.6 TB uncompressed), 1.44M x 1.44M RGB pixels (1600 x 1600 JPEG blocks)
- R-tree index based query lookups
- Server host: SP 2-CPU node
- Read, Decompress, Clip, Zoom, and View distributed between client and server
62 Dataset -- 250 GB (Compressed), All Computation on the Server
63 Breakdown of DataCutter Costs: 250 GB Dataset, 9600x9600 Query
64 Effect of Filter Placement: 9600x9600 Query, Warm Cache
65 Effect of Dataset Size: 4.5Kx4.5K Query, Server Does Everything but View, Warm Cache
66 The Future
- Integrated suite of tools for handling very deep memory hierarchies
- Common set of tools for grid and disk-cache computations
- Programmability
  - Use XML metadata
  - Ongoing data-parallel compiler project -- uses Java-based user-defined functions
  - Applications development toolkit (Visual DataCutter)
- Implementation
  - NPACI
  - Private sector (?)