Title: Wendy S. K. Doyle, Ann C. Gentile
1 Extracting Information from Data: Ease Data Analysis Development with FCLib
- Wendy S. K. Doyle, Ann C. Gentile
- W. Philip Kegelmeyer
- Sandia National Laboratories, USA
- NECDC06
- October 24, 2006
- Los Alamos, NM
Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under contract DE-AC04-94AL85000.
2 Data is NOT Information (yet)
- Information is actionable: it answers questions and informs decisions
- Data ("just the facts, ma'am") needs to be interpreted to become information
- The distinction can be fuzzy
- Data analysis helps turn data into information
3 Data Analysis Example
- Overall process: create a model, stress it, evaluate (repeat)
- Model: can with a hatch attached by screws
- Stress: crush at an angle
- Evaluate: is the damage within acceptable limits?
4 Data vs. Information
- Damage information
  - How many screws break?
  - How many screws are damaged?
  - How many tears (dead element regions)?
  - Do any tears penetrate?
  - How much gapping?
5 Data vs. Information
- Use FCLib to convert data to information
- The analyst can now act on the information.
6 Data Analysis Development
- Issues
  - Analysis development is iterative
  - Data size is growing
  - Humans are better at creating information than computers
- Problems with current methods
  - Specialized and one-off tools can't be extended
  - By-hand methods are slow, hard to quantify, and have poor repeatability
- Our solution is to help the human create information
  - Automate as much as possible
  - But also allow flexibility
7 Feature Characterization Library (FCLib)
- FCLib: toolkit of data analysis building blocks
  - C library
  - Open source (BSD): http://fclib.ca.sandia.gov
- FCLib features
  - Provide a variety of simple building blocks that can be composed into complex analyses
  - Support feature-based analysis (feature = region of interest)
  - Minimize low-level processing (automation)
  - Generality: applicable to a variety of science domains
  - Current file formats: ExodusII, SAF, and LSDyna
  - Provide an elegant interface: simple but not constrained
8 Presentation Outline
- Introduction
- FCLib Overview
- Example 1: Damage Assessment in Weapon Drop Simulation
- Example 2: Support for Machine Learning
- Example 3: Feature Tracking
- Summary and Conclusions
- Future Work
9 Key Features of the Data Model
- Spatio-temporal data stored on unstructured meshes
- Minimal set of abstract data objects
- Types are deeper in the interface
- Subsets are full-fledged data objects
- Mesh ownership of data
- Time-varying data available per variable instead of per step
10 Analysis Building Blocks I
- Mesh Topology
  - Get mesh entity children/parents/neighbors
  - Segment into separate connected components
  - Get skin
- Mesh Geometry (Spatial)
  - Edge lengths/surface areas/region volumes
  - Get mesh entities within bounding box or sphere
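The "segment into separate connected components" block can be pictured with a small, self-contained sketch. This is not FCLib code: the function name, the 0/1 membership mask, and the flat list of adjacent element pairs are all assumptions made for illustration. It labels each selected element with a component number using a union-find structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Follow parent links to the representative of i, halving the path as we go. */
static size_t find_root(size_t *parent, size_t i)
{
    while (parent[i] != i) {
        parent[i] = parent[parent[i]];
        i = parent[i];
    }
    return i;
}

/*
 * Hypothetical helper (not part of FCLib).
 * in_subset[i] != 0 marks element i as selected.
 * adjacent holds num_adjacent pairs (a, b) stored flat: a0, b0, a1, b1, ...
 * On return, label[i] is the component id (0..k-1) of element i, or -1 if
 * element i is not selected. Returns k, or -1 if memory allocation fails.
 */
int segment_components(size_t num_elem, const unsigned char *in_subset,
                       const size_t *adjacent, size_t num_adjacent,
                       int *label)
{
    if (num_elem == 0) {
        return 0;
    }
    size_t *parent = malloc(num_elem * sizeof *parent);
    if (parent == NULL) {
        return -1;
    }
    for (size_t i = 0; i < num_elem; i++) {
        parent[i] = i;
    }
    /* Merge two elements only when both are selected and they are adjacent. */
    for (size_t e = 0; e < num_adjacent; e++) {
        size_t a = adjacent[2 * e];
        size_t b = adjacent[2 * e + 1];
        assert(a < num_elem && b < num_elem);
        if (in_subset[a] && in_subset[b]) {
            size_t ra = find_root(parent, a);
            size_t rb = find_root(parent, b);
            if (ra != rb) {
                parent[rb] = ra;
            }
        }
    }
    /* Assign compact ids in order of first appearance. */
    for (size_t i = 0; i < num_elem; i++) {
        label[i] = -1;
    }
    int count = 0;
    for (size_t i = 0; i < num_elem; i++) {
        if (!in_subset[i]) {
            continue;
        }
        size_t r = find_root(parent, i);
        if (label[r] < 0) {
            label[r] = count++;
        }
        label[i] = label[r];
    }
    free(parent);
    return count;
}
```

For a chain of six elements with the third one unselected, this yields two components: elements 0-1 and elements 3-5.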
11 Analysis Building Blocks II
- Variable
  - Variable math
  - Statistics: Min/Max/Mean/StandardDeviation/Sum
  - Decompose vectors into normal and tangent components
  - Threshold: find subset
- Subset
  - Set operations: AND, OR, or XOR to get new subset
- Time
  - Feature tracking: track regions of interest (ROIs) over time
- Build your own!
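As an illustration of the subset set operations listed above, here is a minimal sketch that treats a subset as a 0/1 membership mask over mesh entities. The mask representation, the enum, and the function name are assumptions for this example and are not taken from the FCLib API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operation selector (not an FCLib type). */
typedef enum { SUBSET_AND, SUBSET_OR, SUBSET_XOR } SubsetOp;

/*
 * Combines two membership masks entity by entity.
 * Writes the new mask to result and returns its number of members.
 */
size_t combine_subsets(const unsigned char *a, const unsigned char *b,
                       size_t n, SubsetOp op, unsigned char *result)
{
    size_t count = 0;
    assert(a != NULL && b != NULL && result != NULL);
    for (size_t i = 0; i < n; i++) {
        int in_a = a[i] != 0;
        int in_b = b[i] != 0;
        int keep;
        switch (op) {
        case SUBSET_AND:
            keep = in_a && in_b;
            break;
        case SUBSET_OR:
            keep = in_a || in_b;
            break;
        default: /* SUBSET_XOR: in exactly one of the two subsets */
            keep = in_a != in_b;
            break;
        }
        result[i] = (unsigned char)keep;
        count += (size_t)keep;
    }
    return count;
}
```

Because the result is itself a mask, it can be fed back into further set operations or statistics, which mirrors the idea that subsets are full-fledged data objects.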
12 How to Build an Analysis
- Explore stress in compressed regions

    /* get handles to objects of interest */
    fc_loadDataset("some_data.ex2", &dataset);
    fc_getMesh(dataset, "some_mesh", &mesh);
    fc_getVariable(mesh, "pressure", &pressureVar);
    fc_getVariable(mesh, "stress", &stressVar);

    /* analysis: string together building blocks */
    fc_threshold(pressureVar, ">", 0., &subset);
    fc_getSubsetVariableMeanSdev(subset, stressVar, &mean, &sdev);
    printf("Stress = %g +/- %g\n", mean, sdev);
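To make the two analysis steps concrete without relying on FCLib itself, the sketch below reimplements them on plain arrays: threshold one variable to get a subset, then compute statistics of a second variable over that subset. The function names and the mask representation are assumptions for illustration, and the use of the sample variance (divide by n - 1) is a choice made here, not something the slides specify.

```c
#include <assert.h>
#include <stddef.h>

/* Marks entries with values[i] > cutoff; returns how many were marked. */
size_t threshold_gt(const double *values, size_t n, double cutoff,
                    unsigned char *mask)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        mask[i] = (unsigned char)(values[i] > cutoff);
        count += mask[i];
    }
    return count;
}

/*
 * Mean and sample variance of values restricted to the masked entries.
 * Returns the number of masked entries. The mean is 0 for an empty mask
 * and the variance is 0 when fewer than two entries are masked.
 */
size_t masked_mean_variance(const double *values, const unsigned char *mask,
                            size_t n, double *mean, double *variance)
{
    size_t count = 0;
    double sum = 0.0;
    double sum_sq_dev = 0.0;
    assert(mean != NULL && variance != NULL);
    for (size_t i = 0; i < n; i++) {
        if (mask[i]) {
            sum += values[i];
            count++;
        }
    }
    *mean = count > 0 ? sum / (double)count : 0.0;
    /* Second pass: accumulate squared deviations from the mean. */
    for (size_t i = 0; i < n; i++) {
        if (mask[i]) {
            double d = values[i] - *mean;
            sum_sq_dev += d * d;
        }
    }
    *variance = count > 1 ? sum_sq_dev / (double)(count - 1) : 0.0;
    return count;
}
```

The standard deviation reported on the slide would be the square root of this variance (for example via `sqrt` from `<math.h>`).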
13 Example 1: Damage Assessment in Weapon Drop Simulation
- Three tools were used by W80 analysts on 150 datasets
  - screwBreaks
  - tears
  - gaplines
- Usage modes
  - Generate a human-readable report
  - Use single measure for sensitivity plots
  - Use single measure to drive optimization of drop angle to produce maximum damage
- Tools were developed by FCLib developers in close communication with analysts
14 About the Data
- Example Data
  - 2.5 Gb
  - 5 parts, 640,000 elements
  - Largest: 400,000 elements
  - 18 screws
  - 13 time steps
- Real Data
  - 50 Gb
  - 100 parts, 1.8M elements
  - Largest: 350,000 elements
  - 80 screws
  - 4-30 time steps
15 How Many Damaged or Broken Screws?
- Visual inspection
- fclib/bin/screwBreaks d3plot
16 screwBreaks Output

    Mesh Screws Screw 0 Step 12 BR 0.00 ( 0.00/ 6.00)
    Mesh Screws Screw 1 Step 12 BR 0.00 ( 0.00/ 6.00)
    Mesh Screws Screw 2 Step 12 BR 0.00 ( 0.00/ 6.00)
    Mesh Screws Screw 3 Step 12 BR 0.00 ( 0.00/ 6.00)
    Mesh Screws Screw 4 Step 12 BR 0.00 ( 0.00/ 6.00)
    Mesh Screws Screw 5 Step 12 BR 0.31 ( 1.89/ 6.00)
    Mesh Screws Screw 6 Step 12 still broken
    Mesh Screws Screw 7 Step 12 BR 0.27 ( 1.64/ 6.00)
    Mesh Screws Screw 8 Step 12 BR 0.00 ( 0.00/ 6.00)
    Mesh Screws Screw 9 Step 12 BR 0.00 ( 0.00/ 6.00)
    Mesh Screws Screw 10 Step 12 BR 0.00 ( 0.00/ 6.00)
    Mesh Screws Screw 11 Step 12 BR 0.00 ( 0.00/ 6.00)
    Mesh Screws Screw 12 Step 12 BR 0.00 ( 0.00/ 6.00)
    Mesh Screws Screw 13 Step 12 BR 0.00 ( 0.00/ 6.00)
    Mesh Screws Screw 14 Step 12 BR 0.15 ( 0.93/ 6.00)
    Mesh Screws Screw 15 Step 12 still broken
    Mesh Screws Screw 16 Step 12 BR 0.27 ( 1.64/ 6.00)
    Mesh Screws Screw 17 Step 12 BR 0.00 ( 0.00/ 6.00)

- Status of each screw at each time step (only last step shown)
- "still broken": screw is broken
- "BR": screw is damaged (BR = breakage ratio)
- Summary for mesh
- Summary for dataset
17 How Bad are the Tears?
- Visual inspection
- fclib/bin/tears -d displacement d3plot
18 tears Output

    Tear characterizations for dataset 'd3plot'
    Tears criteria 'elem_death' < 0
    Time step index 12
    5 mesh(es)
    Mesh 0 'Shell' has 18 dead element region(s)
    Mesh 1 'Plate' has 26 dead element region(s)
    Mesh 2 'Cover Plate' has 0 dead element region(s)
    Mesh 3 'Horseshoe Plate' has 0 dead element region(s)
    Mesh 4 'Screws' has 6 dead element region(s)
    Combining of dead elem regions not requested
    Found 50 tears
    Sorting tears by region diameter (largest first) ...
    Tear 0
      numDeadElementRegions 1
      meshIDs 0
      meshNames 'Shell'
      numCell 280
      region volume 35.8475
      region diameter 19.8277

- Input details
- Per mesh summary
- Tear details
19 More tears Output
- File tear-regions.bb -> bounding boxes of the tears
20 How Much Gapping?
- Visual inspection
- fclib/bin/gaplines -d displ d3plot Shell "Cover Plate" 0.1
21 gaplines Output

    Dataset 'd3plot'
    Meshes 'Shell' and 'Cover Plate'
    Displ 'displacement'
    Min Dist 0.1
    Number of gap lines found 12482
    Number of sets of sides involved 2
    Stats for set 1 ('Shell_shape0_side18-Cover Plate_shape0_side2')
    numGapline 10628

    Step            Gap Length
    ID  Value       num     min       max       mean      stdev
    0   0.000000    10628   0.000000  0.028636  0.003368  0.005055
    ...
    12  0.003000    10628   0.002291  6.851685  1.527860  1.449041

    Step            Normal Component of Gap Length
    ID  Value       num     min       max       mean      stdev
    ...
    12  0.003000    10628   0.002291  6.851685  1.527860  1.449041

- Input details
- Result summary
- Stats reported for each side and overall
- Gap length stats
- Resolved with respect to face normals
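Resolving a gap vector with respect to a face normal amounts to splitting it into a part along the normal and a part in the plane of the face. The sketch below shows that decomposition on a plain 3-vector type; `Vec3` and the function name are hypothetical and are not FCLib's interface. The normal does not need to be unit length, because the projection divides by its squared length.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 3-vector type used only for this illustration. */
typedef struct {
    double x, y, z;
} Vec3;

static double dot(Vec3 a, Vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/*
 * Splits v into a component along normal and a component tangent to the
 * face, so that v == normal_part + tangent_part.
 * Returns 0 on success, or -1 if normal is the zero vector.
 */
int decompose_normal_tangent(Vec3 v, Vec3 normal,
                             Vec3 *normal_part, Vec3 *tangent_part)
{
    assert(normal_part != NULL && tangent_part != NULL);
    double nn = dot(normal, normal);
    if (nn == 0.0) {
        return -1;
    }
    /* Projection of v onto the normal direction: (v.n / n.n) n */
    double scale = dot(v, normal) / nn;
    normal_part->x = scale * normal.x;
    normal_part->y = scale * normal.y;
    normal_part->z = scale * normal.z;
    /* Whatever is left lies in the plane of the face. */
    tangent_part->x = v.x - normal_part->x;
    tangent_part->y = v.y - normal_part->y;
    tangent_part->z = v.z - normal_part->z;
    return 0;
}
```

Under this reading, the "Normal Component of Gap Length" would be the length of `normal_part` for each gap line, though the slides do not spell out the exact definition the tool uses.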
22 More gaplines Output
- File gaplines.ex2 -> the gaplines and length variables
- 2 adj. surfaces found -> Cover Plate sits in recess in Shell
- Location and amount of gapping easily visualized
23 Example 2: Support for Machine Learning (ML)
- FCLib used to manipulate data to and from ML algorithms
- At least 6 of the tools written by ML researchers
- varSmooth: smooth a variable by averaging values within given radius
  - Reduces noise and other artifacts of ML algorithms
- connect_comp_regions and gen_overlap_matrix: extract and match up regions from different datasets
  - Generate accuracy measures for different ML algorithms
- Crossval: breaks the dataset into multiple partitions
  - Generate random samples for cross validation training
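The idea behind varSmooth, averaging a variable over all points within a given radius, can be sketched as follows. This is a brute-force illustration written for this document, not the tool's source: the function name and the flat coordinate layout are assumptions, and a real tool would likely use a spatial search structure rather than comparing every pair of points.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical smoothing helper (not the varSmooth implementation).
 * xyz holds n points stored flat: x0, y0, z0, x1, y1, z1, ...
 * smoothed[i] becomes the mean of values[j] over all points j whose
 * distance to point i is at most radius (point i itself included).
 */
void smooth_within_radius(const double *xyz, const double *values, size_t n,
                          double radius, double *smoothed)
{
    assert(radius >= 0.0);
    const double r2 = radius * radius;
    for (size_t i = 0; i < n; i++) {
        double sum = 0.0;
        size_t count = 0;
        for (size_t j = 0; j < n; j++) {
            double dx = xyz[3 * i] - xyz[3 * j];
            double dy = xyz[3 * i + 1] - xyz[3 * j + 1];
            double dz = xyz[3 * i + 2] - xyz[3 * j + 2];
            /* Compare squared distances to avoid a square root. */
            if (dx * dx + dy * dy + dz * dz <= r2) {
                sum += values[j];
                count++;
            }
        }
        /* count >= 1 because point i is always within its own radius. */
        smoothed[i] = sum / (double)count;
    }
}
```

With points at x = 0, 1, and 10 and a radius of 1.5, the first two values are averaged together while the isolated third value is left unchanged.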
24 Example 3: Feature Tracking: What's Going on in the Ridges?
[Figure: snapshots at t0, t5, t10, t15, t20, t25, t30, t35]
25 Find and Track Features

    /* get handles to objects of interest */
    fc_loadDataset("cancrush.saf", &dataset);
    fc_getMeshByName(dataset, "can", &mesh);
    fc_getSeqVariable(mesh, "eqps", &numStep, &stressVar);

    /* create a feature group and populate with features */
    fc_createFeatureGroup(&featureGroup);
    for (i = 0; i < numStep; i++) {
        fc_threshold(stressVar[i], ">", 1.0, &subset);
        fc_segment(subset, 0, &numROI, &ROIs);
        fc_trackStep(i, numROI, ROIs, featureGroup);
    }
    fc_writeFeatureGraph(featureGroup, "graph.dot");
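The slides do not say how regions in one time step are matched to regions in the next. One common approach, assumed here purely for illustration, is to count how many mesh elements each pair of regions shares and link the pairs with nonzero overlap. The sketch below builds such an overlap matrix from per-element region labels; the function name and label convention are hypothetical and are not FCLib's.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of FCLib).
 * label_a[i] and label_b[i] give the region id of element i at two time
 * steps, or -1 if the element is in no region at that step.
 * overlap is a num_a x num_b matrix stored row by row; on return,
 * overlap[a * num_b + b] is the number of elements that belong to region a
 * at the first step and region b at the second.
 */
void overlap_matrix(const int *label_a, const int *label_b, size_t num_elem,
                    size_t num_a, size_t num_b, size_t *overlap)
{
    for (size_t k = 0; k < num_a * num_b; k++) {
        overlap[k] = 0;
    }
    for (size_t i = 0; i < num_elem; i++) {
        int a = label_a[i];
        int b = label_b[i];
        if (a < 0 || b < 0) {
            continue; /* element is outside a region in at least one step */
        }
        assert((size_t)a < num_a && (size_t)b < num_b);
        overlap[(size_t)a * num_b + (size_t)b]++;
    }
}
```

A row with several nonzero entries would then correspond to a region splitting, and a column with several nonzero entries to regions merging, which is the kind of event a feature graph records.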
26 Reading the Feature Graph
[Figure: feature graph with labeled features (Feature 4, F6, F8, F9) and time markers (t0, t25, t35)]
27 Analyze the Features

    /* loop over each feature */
    fc_featureGroupGetNumFeature(featureGroup, &numFeature);
    for (i = 0; i < numFeature; i++) {
        fc_getFeatureROIs(featureGroup, i, &numROI, &stepIDs, &ROIs);
        /* loop over each time step the feature exists */
        for (j = 0; j < numROI; j++) {
            /* do analysis; ROIs[j] is the feature's region at step stepIDs[j] */
            fc_getVariableSubsetMinMax(seqVar[stepIDs[j]], ROIs[j],
                                       &mins[i][j], &maxs[i][j]);
            /* print stats (not shown) */
        }
    }
28 Interpret Results and Repeat
[Figure: feature graph]
29 Summary and Conclusions
- FCLib is an open source toolkit of analysis building blocks
- FCLib has been used by weapons analysts and machine learning researchers to create data analysis tools
- FCLib makes development of data analysis easier
  - Automates low-level processing
  - Can assemble analyses from building blocks
  - Developed building blocks can be reused
  - Analyses are tweakable and extensible
  - Analyses are quantitative and repeatable
30 Future Directions
- Make FCLib more usable, looking into
  - Scriptable (Python?)
  - GUI interface (like AVS Express or LabView)
  - Output database (XML) to support varied query modes
- Grow FCLib community
  - Create space for users to share code
- More building blocks! More data analyses!
  - Dataset comparisons (very hard)
31 Acknowledgements
- Jay Dike, Tim Shelton, and Tim Kostka - SNL, Analysts
- Ken Buch - SNL, Machine Learning (ML)
- Robert Banfield, Larry Hall, et al. - Univ. of S. Florida, ML
- This research was supported by ASC's Pre and Post Processing Environments (PPPE) Data Discovery (DD) Program.
- Contact: Wendy Doyle <wkoegle_at_sandia.gov>
- FCLib available at http://fclib.ca.sandia.gov