Database Methods for Scientific Computing

Provided by: davidoh
Learn more at: http://www.cs.cmu.edu
1
Database Methods for Scientific Computing
  • David R. O'Hallaron
  • Associate Professor of CS and ECE
  • Carnegie Mellon University
  • (joint work with Tiankai Tu and Julio Lopez)

2
The Scientific Computing Process
[Diagram: Physical model -> Mesh generation -> Mesh -> Solver -> Simulation results -> Visualization]
3
The Euclid Project
  • Goal: Run large-scale physical simulations on
    PCs with limited physical memory.
  • Approach: Index and store the input and output
    datasets in databases, and compute on the
    databases directly.
  • Requires research at the intersection of
    scientific computing, algorithms, databases, and
    systems.

[Diagram: Physical model DB -> Mesh generation -> Mesh DBs -> Solver -> Simulation results DB -> Visualization]
4
David O'Hallaron, Jacobo Bielak, Omar Ghattas
(Carnegie Mellon); Jonathan Shewchuk (UC
Berkeley); Steven Day (SD State)
5
Teora, Italy 1980
6
San Fernando Valley
7
San Fernando Valley (Top View)
[Figure labels: hard rock, soft soil, epicenter (marked x)]
8
San Fernando Valley (Side View)
[Figure labels: soft soil, hard rock]
9
Node Distribution
10
Partitioned Unstructured Mesh
[Figure labels: nodes, element]
11
Simulation and Visualization
12
Scientific Computing with Euclid
  • Represent physical model, mesh, and simulation
    results on disk in spatial database structures
    called etrees (Euclid trees)
  • Linear octree indexed by standard Morton-based
    locational codes.
  • Disk pages indexed by standard B-tree indexing
    structure.
  • Perform entire process out-of-core by querying
    and updating the etrees.

[Diagram: Physical model etree -> Mesh generation -> Mesh node and element etrees -> Solver -> Simulation results etree -> Visualization]
13
Octrees
Octree mesh generation
Balance requirement for meshes (2-to-1
constraint)
14
Linear Octrees
[Figure: a 2D linear octree on a domain with x and y axes from 0 to 8, containing leaf octants a through m; the octants are stored in B-tree pages located through a B-tree index]
15
Addressing Linear Octree Elements
Worked example for octant d (2D domain with x and y axes from 0 to 8, leaf octants a through m):
  • d's left-lower corner: (2, 2)
  • Binary form: (010, 010)
  • Interleave the bits to obtain the Morton code: 00 11 00
  • Append the level of d to obtain the locational code: 001100_11
Morton code: Maps n-dimensional points to
one-dimensional scalars.
Locational code: Appends an octant's level to the
Morton code of its left-lower corner.
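The worked example above can be reproduced with a few lines of C. This is a minimal 2D sketch, not the etree library's own code: the function names `morton2d` and `locational_code` are hypothetical, and the bit order (each y bit placed just above the matching x bit) is an assumption, since the symmetric corner (2, 2) on the slide does not reveal which coordinate comes first.

```c
#include <assert.h>
#include <stdint.h>

/* Interleave the low `bits` bits of x and y into a 2D Morton code.
 * Assumed bit order: bit i of x goes to position 2*i,
 * bit i of y goes to position 2*i + 1. */
uint32_t morton2d(uint32_t x, uint32_t y, unsigned bits)
{
    assert(bits <= 16);            /* 2 * 16 bits fit in a uint32_t */
    uint32_t code = 0;
    for (unsigned i = 0; i < bits; i++) {
        code |= ((x >> i) & 1u) << (2 * i);
        code |= ((y >> i) & 1u) << (2 * i + 1);
    }
    return code;
}

/* Append the octant level, stored in `level_bits` low-order bits,
 * to the Morton code of the octant's left-lower corner. */
uint64_t locational_code(uint32_t x, uint32_t y, unsigned bits,
                         unsigned level, unsigned level_bits)
{
    assert(level_bits > 0 && level_bits < 32);
    assert(level < (1u << level_bits));
    return ((uint64_t)morton2d(x, y, bits) << level_bits) | level;
}
```

With 3 bits per coordinate and a 2-bit level field, `morton2d(2, 2, 3)` yields binary 001100 and `locational_code(2, 2, 3, 3, 2)` yields binary 00110011, matching the slide's 001100_11.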
16
Nice Properties of Linear Octrees
[Figure: the leaf octants a through m of the example octree]
  • An addressing scheme that clusters nearby
    octants
  • Finding an octant without knowing its
    locational code
  • The order imposed by the locational code is the
    same as the preorder traversal of the leaves in
    the octree
17
Etree Mesh Generator
[Diagram: application-specific input -> construct -> unbalanced octree -> balance -> balanced octree -> transform -> element database and node database; each of the three steps is built on the etree library]
18
Etree Library: A Framework in C for Manipulating
Etrees on Disk
  • Etree API: Octant-level (insert) and octree-level
    (balance) operations.
  • Linear octree: Well-known coding scheme to
    assign keys to octants.
  • Auto navigation: New algorithm for constructing
    an octree automatically.
  • Local balancing: New algorithm to speed up
    the balancing operation.
  • B-tree: Well-known DB indexing structure.

[Diagram, top to bottom: Application (e.g., construct, balance); Etree API; Etree Library, made up of Linear Octree, Auto Navigation, Local Balancing, and B-Tree]
19
Mesh Element Etree
[Figure: octree whose root has children 00 (leaf A), 01 (decomposed into leaves B, C, D, E), 10 (leaf F), and 11 (leaf G)]

B-tree page (locational code keys)
Key      Octant
0000_01  A
0100_10  B
0101_10  C
0110_10  D
0111_10  E
1000_01  F
1100_01  G

Query examples: X = 0101_10 is an exact hit (it matches C); Y = 1010_10 is an aggregate hit (it falls within F).

Key fact: Leaf nodes and aggregated nodes can be
located within a B-tree page with a fast binary
search, without traversing the edges of the
octree.
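The lookup described in the key fact can be sketched as a binary search over a sorted array that stands in for one B-tree page. This is an illustration under stated assumptions, not the etree library's implementation: the type and function names (`struct octant`, `find_octant`, `cells_covered`, `example_page`) are hypothetical, the example is 2D with a finest level of 2, and keys are compared by the Morton part of the locational code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DIM 2u        /* 2D example (assumption) */
#define MAX_LEVEL 2u  /* finest level in the example page (assumption) */

enum hit { MISS = 0, EXACT_HIT = 1, AGGREGATE_HIT = 2 };

struct octant {
    uint32_t morton;  /* Morton code of the left-lower corner */
    unsigned level;   /* depth in the octree, root = 0 */
    char name;
};

/* The page from the slide, sorted by locational code. */
const struct octant example_page[7] = {
    {0x0, 1, 'A'}, {0x4, 2, 'B'}, {0x5, 2, 'C'}, {0x6, 2, 'D'},
    {0x7, 2, 'E'}, {0x8, 1, 'F'}, {0xC, 1, 'G'},
};

/* Number of finest-level cells covered by an octant at `level`. */
uint32_t cells_covered(unsigned level)
{
    assert(level <= MAX_LEVEL);
    return 1u << (DIM * (MAX_LEVEL - level));
}

/* Find the stored leaf that equals or contains the query octant. */
enum hit find_octant(const struct octant *page, size_t n,
                     uint32_t morton, unsigned level,
                     const struct octant **found)
{
    assert(found != NULL);
    *found = NULL;

    /* Binary search for the last entry whose Morton code is <= query. */
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (page[mid].morton <= morton)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo == 0)
        return MISS;

    const struct octant *c = &page[lo - 1];
    if (morton - c->morton >= cells_covered(c->level))
        return MISS;               /* query lies outside the candidate */
    if (c->level > level)
        return MISS;               /* query is an interior node, not stored */

    *found = c;
    return c->level == level ? EXACT_HIT : AGGREGATE_HIT;
}
```

Searching `example_page` for X (Morton 0101, level 2) returns an exact hit on C, and searching for Y (Morton 1010, level 2) returns an aggregate hit on F. No tree edges are followed in either case.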
20
Mesh Node Etree
Nodes (x, y): n(4,4), k(2,4), j(1,4), i(0,4), c(0,3), h(2,3), e(1,3), b(0,2), m(4,2), g(2,2), d(1,2), a(0,0), f(2,0), l(4,0)
[Figure: the nodes are stored in B-tree leaf page 1 and B-tree leaf page 2, both keyed by Morton code]
21
Auto Navigation
  • Navigation octree
  • Guided by an application function
  • An in-memory pointer-based octree
  • Dynamically grows in depth-first fashion
  • Leaf octants are pruned and flushed to disk in
    preorder (in increasing locational code order)
  • Appends the octants to the etree database to
    avoid database search

[Figure legend: octants not yet processed (in memory); non-leaf octants being decomposed (in memory); leaf octants (flushed to database)]
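The depth-first growth and in-order flushing described above can be illustrated with a short recursive sketch in 2D. It is not the etree library's algorithm: the call stack stands in for the in-memory pointer-based navigation octree, the `emit` callback stands in for appending to the etree database, and all names (`navigate`, `refine_fn`, `emit_fn`, `refine_near_origin`, `record_leaf`, `struct leaf_log`) are hypothetical. Children are visited in Morton order (x bit low, y bit high, an assumed convention), so leaves come out in increasing Morton order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Application function: nonzero means "decompose this octant". */
typedef int (*refine_fn)(uint32_t x, uint32_t y, uint32_t size);
/* Called once per leaf, in preorder. */
typedef void (*emit_fn)(uint32_t x, uint32_t y, uint32_t size, void *ctx);

/* Depth-first decomposition of the square octant at (x, y). */
void navigate(uint32_t x, uint32_t y, uint32_t size,
              refine_fn refine, emit_fn emit, void *ctx)
{
    assert(refine != NULL && emit != NULL);
    if (size > 1 && refine(x, y, size)) {
        uint32_t half = size / 2;
        for (unsigned c = 0; c < 4; c++)      /* children in Morton order */
            navigate(x + (c & 1u) * half, y + (c >> 1) * half,
                     half, refine, emit, ctx);
    } else {
        emit(x, y, size, ctx);                /* leaf: flush immediately */
    }
}

/* Example application function: refine only octants at the origin. */
int refine_near_origin(uint32_t x, uint32_t y, uint32_t size)
{
    (void)size;
    return x == 0 && y == 0;
}

/* Example sink: record the Morton code of each leaf's corner. */
struct leaf_log { uint32_t codes[64]; size_t count; };

void record_leaf(uint32_t x, uint32_t y, uint32_t size, void *ctx)
{
    struct leaf_log *out = ctx;
    uint32_t code = 0;
    (void)size;
    for (unsigned i = 0; i < 16; i++) {
        code |= ((x >> i) & 1u) << (2 * i);
        code |= ((y >> i) & 1u) << (2 * i + 1);
    }
    if (out->count < 64)
        out->codes[out->count++] = code;
}
```

Calling `navigate(0, 0, 4, refine_near_origin, record_leaf, &log)` on a 4 x 4 domain produces seven leaves whose recorded codes are strictly increasing, which is why each leaf can be appended to the end of the database without a search.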
22
Local Balancing
  • Operational steps
  • Partition the entire domain into equal-size
    blocks
  • Perform internal balancing to enforce 2-to-1
    constraint within each block (in a memory
    resident blocking array)
  • Perform boundary balancing to resolve
    interactions between adjacent blocks

Key fact: Interactions between adjacent blocks
are always absorbed by boundary octants and will
not be propagated into the blocks.
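To make the 2-to-1 constraint concrete, the sketch below checks it on one memory-resident block. It is a checker only, not the local balancing algorithm itself, and it rests on assumptions: the block is a 2D row-major array in which each finest-level cell stores the level of the leaf octant covering it, only edge-sharing neighbors are compared, and the name `block_is_balanced` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Return 1 if every pair of edge-adjacent cells in the n x n block
 * is covered by leaves whose levels differ by at most one
 * (that is, whose edge sizes differ by at most a factor of 2). */
int block_is_balanced(const unsigned char *levels, size_t n)
{
    assert(levels != NULL);
    for (size_t y = 0; y < n; y++) {
        for (size_t x = 0; x < n; x++) {
            int here = levels[y * n + x];
            /* Compare with the right neighbor. */
            if (x + 1 < n && abs(here - levels[y * n + x + 1]) > 1)
                return 0;
            /* Compare with the upper neighbor. */
            if (y + 1 < n && abs(here - levels[(y + 1) * n + x]) > 1)
                return 0;
        }
    }
    return 1;
}
```

Because neighbors are found by array indexing rather than by database lookups, a check like this touches each cell a constant number of times, which is in the spirit of the array-based neighbor finding that the slides credit for the speedup.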
23
Some Evaluation Questions
  • Is etree mesh generation feasible?
  • How does running time vary with the physical
    memory size?
  • What is the performance impact of auto
    navigation?
  • What is the performance impact of local
    balancing?
24
Evaluation Methodology
Used the etree mesh generator to build a family of
finite element meshes for San Fernando Valley
earthquake ground motion simulations.

Mesh   Elements     Nodes        Slave nodes
SF10   7,940        12,118       4,432
SF5    76,330       105,886      34,858
SF2    1,838,524    2,213,035    407,336
SF1    13,579,124   15,097,365   1,649,855

SFx: A mesh of the 50 km x 50 km x 12 km San
Fernando Valley that resolves seismic waves with
periods of at most x seconds.
25
Evaluation Setup
All experiments were conducted on a PIII 1 GHz machine
running Linux 2.4.17. The machine's physical memory
for the experiments ranged from 128 MB to 880
MB. Before each experiment, two 1.5 GB files were
sequentially scanned to ensure that the operating
system's buffer cache was flushed.
26
Etree Feasibility
All experiments were performed with 128 MB of physical
memory.

Mesh   Elements     DB size (MB)   Time (sec)   Throughput (elem/s)
SF10   7,940        2.5            40           199
SF5    76,330       24             186          410
SF2    1,838,524    583            1,637        1,123
SF1    13,579,124   4,300          9,449        1,439

  • Generating a mesh with 13.6 million elements and
    a size of 4.3 GB in 2.6 hours seems reasonable
  • The overall throughput increases with mesh size

27
Impact of Physical Memory Size
  • Memory size does not have a significant impact
    on the running time
  • The etree method does not rely on the operating
    system's internal caching mechanism to achieve
    its performance

28
Impact of Auto Navigation
  • Reducing B-tree buffer size does not increase
    the construction time
  • Auto navigation is not sensitive to B-tree
    buffer size

29
Impact of Local Balancing
  • Achieves speedups ranging from 8 (SF1) to 28
    (SF10)
  • Benefits from the one-time scan of the database
    and the efficient array-based neighbor finding
    algorithm

30
Some Related Work
  • General octree algorithms: Samet 90
  • Octree mesh: Shephard & Georges 91, Bern et al. 90,
    Young et al. 91, Wang 99
  • Out-of-core octree solver method: Salmon 97
  • Linear quadtree: Gargantini 82, Morton 66
  • Space-filling curve: Orenstein 84, Orenstein 86,
    Faloutsos & Roseman 89
  • Large dataset processing: Freitag & Loy 99,
    Seamons & Winslett 96, Ferreira et al. 99,
    Kurc et al. 01, Choudhary et al. 99,
    Parashar & Browne 97
31
Summary and Conclusions
  • Euclid project aims to recast entire scientific
    computing process in terms of database ops.
  • Incorporating existing database techniques
    (linear octree and B-tree) with new algorithms
    (auto navigation and local balancing) in a
    unified framework (the etree) can deliver new
    capabilities.
  • On the horizon:
  • Caching and prefetching for etree solver
  • Remote access and derived value caching for
    visualization
  • Parallel visualization system based on etrees
  • Unstructured tetrahedral mesh generation using
    R-trees.

32
Etree API
Unix file I/O style, three levels of abstraction:
Initialization and cleanup, e.g.,
etree_t *etree_open(const char *path, int flag, ...)
Octant-level operations, e.g.,
int etree_insert(etree_t *ep, location_t loc, void *value)
Octree-level operations, e.g.,
int etree_balance(etree_t *ep, decom_t baldecom)