Title: Mining Sequence Patterns from Wind Tunnel Experimental Data
1 Mining Sequence Patterns from Wind Tunnel Experimental Data
- Zhenyu Liu, Wesley W. Chu, Adam Huang, Chris Folk, Chih-Ming Ho
- wwc,vicliu_at_cs.ucla.edu, pohao,chrisf,chihming_at_ucla.edu
- Computer Science Department, Mechanical and Aerospace Engineering Department
- University of California
- Los Angeles, California
2 Outline
- Problem statement
- Scientific experimental data characteristics
- Conventional mining methods
- Decision tree
- Association rules
- Mining of sequence patterns
- Conclusion
3 Delta Wing Aircraft Control via MEMS Actuators
- Vortex symmetry is broken by the actuation of MEMS actuators, resulting in desirable aerodynamic loadings
Werle, 1958
4 Problem Statement
- Inputs: angle of attack, stream velocity, actuation angle
- Output: rolling moment
- Problem: discover knowledge about the input-output relationship
5 Scientific Experimental Data Characteristics
- The output is highly dependent on all inputs. A subset of inputs is inadequate to predict the output.
- The data contain sequences of input-output relationships (e.g. the rolling moment w.r.t. the actuation angle)
6 Conventional Methods: Decision Tree Generation
- High coverage but low accuracy (an error rate of 46.35% in predicting the original dataset using the decision tree)
- Reason:
  - The decision tree generation algorithm uses a univariate-split strategy to induce the input-output relationship
  - Each single input has low predictive power over the output
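[Figure: decision tree fragment. Split nodes: Angle of Attack ({0, 5, 10} vs. {15, 20, 25, 30, 35}); Actuation Angle ({60, 100} vs. {40, 80, 120, 140}); Angle of Attack ({15, 20} vs. {25, 30, 35}). Leaf labels: MZ in [-0.00024, 0.00028]; MZ in [-0.0124, 0.00537]; MZ in [-0.00024, 0.00028]; MZ in [-0.01179, -0.0038].]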
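Why a univariate split struggles can be seen on a toy case. The sketch below uses made-up data (not the wind tunnel measurements) in which the output depends on both inputs jointly. `single_input_accuracy` is a hypothetical helper that predicts the output from one input alone by majority vote.

```python
from collections import Counter, defaultdict

def single_input_accuracy(rows, input_index):
    """Accuracy of predicting the output from one input alone,
    using the majority output class observed for each input value."""
    by_value = defaultdict(Counter)
    for *inputs, output in rows:
        by_value[inputs[input_index]][output] += 1
    # For each input value, the majority class is right max(counts) times.
    correct = sum(max(counts.values()) for counts in by_value.values())
    return correct / len(rows)

# Toy rows (input_a, input_b, output); illustrative values only.
rows = [
    (0, 0, "low"), (0, 1, "high"),
    (1, 0, "high"), (1, 1, "low"),
]

acc_a = single_input_accuracy(rows, 0)
acc_b = single_input_accuracy(rows, 1)
```

On this toy data each input alone reaches only 50% accuracy, although the two inputs together determine the output exactly.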
7 Conventional Methods: Rule Induction
- Acceptable accuracy
- Low input state space coverage (25%), which is insufficient for flight control applications
8 Conventional Methods: The Cause of the Low Coverage
- The output variable in a scientific dataset cannot be summarized using a subset of the inputs
- Rules induced from the sensitive input regions (large angle of attack in this case) cannot have both high confidence and high support
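The trade-off between support and confidence can be made concrete with the usual association-rule definitions: support is the fraction of all records that satisfy both the antecedent and the consequent, and confidence is the fraction of records satisfying the antecedent that also satisfy the consequent. The sketch below is illustrative only: the records, the field names `aoa` and `mz`, and the class labels are assumptions, not the experimental data.

```python
def rule_stats(rows, antecedent, consequent):
    """Return (support, confidence) of the rule IF antecedent THEN consequent."""
    matching = [r for r in rows if antecedent(r)]
    hits = [r for r in matching if consequent(r)]
    support = len(hits) / len(rows)
    confidence = len(hits) / len(matching) if matching else 0.0
    return support, confidence

# Hypothetical discretized records: the output is stable at small angles
# of attack and scattered at the large (sensitive) angle of attack.
rows = [
    {"aoa": 5, "mz": "near_zero"},
    {"aoa": 5, "mz": "near_zero"},
    {"aoa": 10, "mz": "near_zero"},
    {"aoa": 35, "mz": "negative"},
    {"aoa": 35, "mz": "positive"},
    {"aoa": 35, "mz": "near_zero"},
]

stable = rule_stats(rows, lambda r: r["aoa"] <= 10, lambda r: r["mz"] == "near_zero")
sensitive = rule_stats(rows, lambda r: r["aoa"] == 35, lambda r: r["mz"] == "negative")
```

In this toy case the rule for the stable region has support 0.5 and confidence 1.0, while any single-class rule for the sensitive region is weak on both measures because the outputs there spread over several classes.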
9 Mining of Sequence Patterns
- Extract sequences from the dataset
- Bottom-up sequence clustering (binary hierarchy) based on the Euclidean distance measure
- Sequence pattern extraction from the binary cluster hierarchy based on the variance measure
- Rule induction from sequence patterns
10 Mining of Sequence Patterns: 1. Sequence Extraction
- Merge the output with one of the inputs to form a composite output variable. More specifically:
  - A dataset $D$ with inputs $X_1, \ldots, X_n$ and an output $Y$
  - A predicate $p$ defined on the inputs
  - A sequence of $Y$ w.r.t. $X_i$ ($1 \le i \le n$) characterized by $p$ is a set of 2-item tuples $\langle y_1, x_{i1} \rangle, \ldots, \langle y_m, x_{im} \rangle$ calculated by $\pi_{Y, X_i}(\sigma_p(D))$
[Figure: a sequence characterized by p: aoa = 20, vel = 10]
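The selection-then-projection step can be sketched in a few lines. This is a minimal illustration, assuming the dataset is a list of records; the field names (`aoa`, `vel`, `act`, `mz`) and all values are hypothetical. Sorting by the input value is an added convenience for plotting and is not part of the definition above.

```python
def extract_sequence(dataset, predicate, x_name, y_name):
    """Select the rows satisfying `predicate`, then project onto (y, x)."""
    pairs = [(row[y_name], row[x_name]) for row in dataset if predicate(row)]
    # Ordering by x is an assumption made here so the tuples read as a curve.
    return sorted(pairs, key=lambda pair: pair[1])

# Hypothetical records: angle of attack, velocity, actuation angle, rolling moment.
dataset = [
    {"aoa": 20, "vel": 10, "act": 80, "mz": -0.002},
    {"aoa": 20, "vel": 10, "act": 40, "mz": 0.001},
    {"aoa": 20, "vel": 15, "act": 40, "mz": 0.003},
    {"aoa": 35, "vel": 10, "act": 40, "mz": -0.008},
]

# Sequence of mz w.r.t. act, characterized by p: aoa = 20 and vel = 10.
seq = extract_sequence(
    dataset, lambda r: r["aoa"] == 20 and r["vel"] == 10, "act", "mz"
)
```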
11 Mining of Sequence Patterns: 2. Bottom-up Sequence Clustering
- Using the Euclidean distance measure to generate a binary cluster hierarchy
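A bottom-up (agglomerative) clustering pass can be sketched as follows. The slides do not state how the distance between two clusters is computed, so this sketch assumes the Euclidean distance between cluster mean sequences. It also assumes every sequence is a list of output values sampled at the same input points. The nested-pair tree representation and all function names are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def mean_sequence(seqs):
    """Point-wise mean of equal-length sequences."""
    return [sum(vals) / len(vals) for vals in zip(*seqs)]

def build_hierarchy(sequences):
    """Repeatedly merge the two closest clusters (requires >= 1 sequence).
    Leaves are sequence indices; internal nodes are (left, right) pairs."""
    clusters = [(i, [i]) for i in range(len(sequences))]  # (tree, members)
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                mean_a = mean_sequence([sequences[i] for i in clusters[a][1]])
                mean_b = mean_sequence([sequences[i] for i in clusters[b][1]])
                d = euclidean(mean_a, mean_b)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = ((clusters[a][0], clusters[b][0]), clusters[a][1] + clusters[b][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return clusters[0][0]

# Toy sequences: two near zero, two near five.
tree = build_hierarchy([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.0, 5.2]])
```

The toy input yields a hierarchy that first pairs the two low sequences, then the two high ones, and joins the pairs at the root.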
12 Mining of Sequence Patterns: 3. Sequence Pattern Extraction
- From the hierarchy, merge branches with variances below a user-specified threshold (0.35 in this example)
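The thresholding step can be sketched as a top-down walk over a binary hierarchy: a branch whose members have a variance below the threshold is kept as one pattern, otherwise its two children are examined. The slides do not define the variance measure, so this sketch assumes the mean squared Euclidean distance of the member sequences from their point-wise mean. The hierarchy is assumed to be given as nested pairs of sequence indices, and all names are hypothetical.

```python
def leaves(tree):
    """Sequence indices under a node; a leaf is a plain integer index."""
    if isinstance(tree, int):
        return [tree]
    left, right = tree
    return leaves(left) + leaves(right)

def cluster_variance(seqs):
    """Assumed measure: mean squared distance of sequences to their mean."""
    mean = [sum(vals) / len(vals) for vals in zip(*seqs)]
    total = sum(sum((p - m) ** 2 for p, m in zip(s, mean)) for s in seqs)
    return total / len(seqs)

def extract_patterns(tree, sequences, threshold):
    """Return the member lists of the largest branches below the threshold."""
    members = leaves(tree)
    if isinstance(tree, int):
        return [members]  # a single sequence is always its own pattern
    if cluster_variance([sequences[i] for i in members]) < threshold:
        return [members]
    left, right = tree
    return (extract_patterns(left, sequences, threshold)
            + extract_patterns(right, sequences, threshold))

sequences = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.0, 5.2]]
tree = ((0, 1), (2, 3))  # toy hierarchy over the four sequences
patterns = extract_patterns(tree, sequences, 0.35)
```

With the toy data, a threshold of 0.35 keeps the two tight pairs as separate patterns, while a very loose threshold would collapse everything into one pattern. This is how the threshold controls how closely the cluster mean represents its members.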
13 Mining of Sequence Patterns: 4. Rule Induction from Sequence Patterns
- Cluster 8 as an example
- Sample rules generated
  - IF angle of attack = 35° THEN the rolling moment curve with actuation angle follows mean(cluster 8), confidence 100%, variance 0.243029
- Rules have higher coverage and confidence, and the accuracy (of the mean) is controllable through the variance measure
[Figure: cluster 8 (wvar 0.243029), containing the sequences aoa = 35, vel = 10; aoa = 35, vel = 15; and aoa = 35, vel = 20. Plots: all sequences in cluster 8; the average of the sequences in cluster 8, i.e. mean(cluster 8).]
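One way to induce such a rule is to look for input conditions shared by every sequence in a cluster, and to take the confidence as the fraction of all sequences satisfying the condition that belong to the cluster. The sketch below follows that reading, which is an assumption rather than the authors' stated procedure. The cluster 8 labels come from the slide, while the two extra sequences and the function name are hypothetical.

```python
def induce_rules(cluster_labels, all_labels):
    """Return (attribute, value, confidence) for each input condition
    that holds for every sequence in the cluster.
    Assumes the cluster's sequences are also present in `all_labels`."""
    rules = []
    for attr, value in cluster_labels[0].items():
        if all(label[attr] == value for label in cluster_labels):
            matching = [label for label in all_labels if label[attr] == value]
            confidence = len(cluster_labels) / len(matching)
            rules.append((attr, value, confidence))
    return rules

# Predicates characterizing the sequences in cluster 8 (from the slide).
cluster_8 = [
    {"aoa": 35, "vel": 10},
    {"aoa": 35, "vel": 15},
    {"aoa": 35, "vel": 20},
]
# Hypothetical sequences outside cluster 8, added for illustration.
others = [{"aoa": 20, "vel": 10}, {"aoa": 20, "vel": 15}]

rules = induce_rules(cluster_8, cluster_8 + others)
```

Here the only shared condition is aoa = 35, and every sequence with aoa = 35 lies in cluster 8, so the rule has confidence 1.0, matching the 100% reported for the sample rule.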
14 Conclusion
- Developed a mining technique to discover relationships for highly correlated input-output pairs
- Conventional methods (decision tree or rule induction) fail to generate knowledge with both high coverage and accuracy
- Developed a new technique for mining sequence patterns from the wind tunnel experimental data
  - Sequence extraction
  - Sequence clustering (binary hierarchy) based on the Euclidean distance measure
  - Sequence patterns extracted from the binary cluster hierarchy using the variance measure
  - Rule induction from sequence patterns
- Rules generated
  - Nontrivial to experimenters
  - Useful for flight control
15 Directly Applying Conventional Mining Methods
- Conventional methods work with categorical variables.
- The first step is a discretization of the output variable
- 6 partitions generated: [-0.01179, -0.00380], [-0.00380, -0.00138], [-0.00138, -0.00024], [-0.00024, 0.00028], [0.00028, 0.00124], [0.00124, 0.00537]
- Then apply decision tree or rule induction to the discretized dataset