1
STATISTICAL LEARNING METHODS FOR MICROSTRUCTURES
Veera Sundararaghavan and Prof. Nicholas Zabaras
Materials Process Design and Control Laboratory
Sibley School of Mechanical and Aerospace Engineering
188 Frank H. T. Rhodes Hall, Cornell University, Ithaca, NY 14853-3801
Email: vs85@cornell.edu, zabaras@cornell.edu
URL: http://mpdc.mae.cornell.edu/
2
WHAT IS STATISTICAL LEARNING?
  • Statistical learning automates the process of
    searching for patterns in large-scale data.
  • Which patterns are interesting?
  • Mathematical techniques for associating input
    data with desired attributes and identifying
    correlations
  • A powerful tool for designing materials

3
FOR MICROSTRUCTURES?
  • Properties of a material are affected by the
    underlying microstructure
  • Microstructural attributes are related to specific
    properties
  • Examples: correlation functions → elastic moduli;
    orientation distribution → yield stress
    in polycrystals
  • Attributes evolve during processing (thermomechanical,
    chemical, solidification etc.)
  • Can we identify specific patterns in these
    relationships?
  • Is it possible to probabilistically predict the
    best microstructure and the best processing paths
    for optimizing properties based on available
    structural attributes?

4
TERMINOLOGY
  • Microstructure can be represented in terms of
    typical attributes
  • Examples are volume fractions, probability
    functions, shape/size attributes, orientation of
    grains, cluster functions, lineal measures and so
    on
  • All these attributes affect physical properties
  • Attributes evolve during processing of a
    microstructure
  • Attributes are represented in a discrete (vector)
    form as features
  • Features are represented as vectors x_k,
    k = 1,…,n, where n is the dimensionality of the
    feature
  • Each distinct feature is written as x_k^(i),
    where the superscript denotes the ith feature that we
    are interested in

5
TERMINOLOGY
Given a data set of computational or experimental
microstructures, can we learn the functional
differences between them based on features? We
denote microstructures that are similar in
attributes by a class label y, y = 1,…,k, where k
is the number of classes. Classes are formed into
hierarchies, with each level represented by a
feature x^(i). Structure-based classes are
affiliated with processes and properties: a
powerful tool for exploring the complex
microstructure design space.
6
APPLICATIONS
7
MICROSTRUCTURE LIBRARIES FOR REPRESENTATION
(Sundararaghavan and Zabaras, Acta Materialia, 2004)
[Pipeline: input microstructure → pre-processing → feature detection
(employ lower-order features) → classifier (quantification and mining
of associations) → identify and add new classes]
8
MICROSTRUCTURE RECONSTRUCTION
(Sundararaghavan and Zabaras, Computational Materials Science, 2005)
[Pipeline: 2D imaging techniques → feature extraction → pattern
recognition/vision → database (fed by microstructure evolution models
and process data) → microstructure analysis (FEM/bounding theory) →
3D realizations; process parameters can be reverse engineered]
9
STATISTICAL LEARNING TOOLBOX
[Flowchart: training samples come from numerical simulation of material
response (multi-length-scale analysis, polycrystalline plasticity). The
statistical learning toolbox (functions, classification methods,
identification of new classes) takes inputs such as images, ODFs and
pole figures, associates the data with a class, updates the classes and
the data in the library, and feeds a process controller.]
10
DESIGNING PROCESSES FOR MICROSTRUCTURES
(Sundararaghavan and Zabaras, Acta Materialia, 2005)
[Flowchart: a database stores process sequences (process sequence-1,
process sequence-2, …), each with process parameters, ODF histories
and a reduced basis; new datasets are added over time. Given a desired
texture/property, a classifier with adaptive basis selection proposes
probable process sequences and initial parameters, and a reduced-basis
optimization over the process stages (stage 1, stage 2) yields the
optimum parameters.]
11
THIS LECTURE WILL COVER…
  • The math behind statistical learning, and two
    especially useful techniques: support vector
    machines and Bayesian clustering.
  • Applications to microstructure representation,
    reconstruction and process design will be shown.
  • We will skim over the physics and some important
    computational tools behind these problems.

12
STATISTICAL LEARNING TECHNIQUES
13
STATISTICAL LEARNING TECHNIQUES
This lecture
14
STATISTICAL LEARNING TECHNIQUES
This lecture
Function approximation: useful for prediction in
regions that are computationally unreachable (not
covered in this lecture).
15
PRELIMINARIES OF SUPERVISED CLASSIFIERS
[Scatter plot of features, pore density vs. volume fraction:
+ denotes y = +1 (high strength), − denotes y = −1 (low strength)]
Two-class problem: the classes of the training
specimens are known a priori. Aim: predict the
strength of a new microstructure.
16
SUPPORT VECTOR MACHINES
f(x, w, b) = sign(w · x − b)
(+ denotes y = +1, − denotes y = −1)
How would you classify this data?
17
OCCAM'S RAZOR
"Plurality should not be assumed without necessity"
(William of Ockham, Surrey, England, 1285-1347 AD, theologian)
  • Simpler models are more likely to be correct than
    complex ones
  • Nature prefers simplicity
  • Principle of uncertainty maximization

18
SUPPORT VECTOR MACHINES
f(x, w, b) = sign(w · x − b)
(+ denotes y = +1, − denotes y = −1)
How would you classify this data?
19
SUPPORT VECTOR MACHINES
f(x, w, b) = sign(w · x − b)
(+ denotes y = +1, − denotes y = −1)
Any of these would be fine… but which is best?
20
SUPPORT VECTOR MACHINES
f(x, w, b) = sign(w · x − b)
(+ denotes y = +1, − denotes y = −1)
Define the margin of a linear classifier as the
width that the boundary could be increased by
before hitting a datapoint.
21
SUPPORT VECTOR MACHINES
f(x, w, b) = sign(w · x − b)
(+ denotes y = +1, − denotes y = −1)
The maximum margin linear classifier is the
linear classifier with the, um, maximum
margin. This is the simplest kind of SVM, called a
linear SVM (LSVM).
Support vectors are those datapoints that the
margin pushes up against.
22
SUPPORT VECTOR MACHINES
M = margin width
[Figure: plus-plane w·x + b = +1, decision boundary w·x + b = 0 and
minus-plane w·x + b = −1, with x⁺ on the plus-plane, x⁻ on the
minus-plane, a "Predict Class = +1" zone and a "Predict Class = −1"
zone]
How do we compute M in terms of w and b?
  • Plus-plane = { x : w · x + b = +1 }
  • Minus-plane = { x : w · x + b = −1 }
  • Claim: x⁺ = x⁻ + λw for some value of λ. Why?
  • The vector w is perpendicular to the plus-plane.
    Why?

Let u and v be two vectors on the plus-plane.
What is w · (u − v)?
And so of course the vector w is also
perpendicular to the minus-plane.
23
SUPPORT VECTOR MACHINES
Computing the margin width M.
  • What we know:
  • w · x⁺ + b = +1
  • w · x⁻ + b = −1
  • x⁺ = x⁻ + λw
  • |x⁺ − x⁻| = M
  • It's now easy to get M in terms of w and b:

w · (x⁻ + λw) + b = 1
⇒ w · x⁻ + b + λ w·w = 1
⇒ −1 + λ w·w = 1
⇒ λ = 2 / (w·w)
M = |x⁺ − x⁻| = λ |w| = 2 / √(w·w) = 2 / ‖w‖

24
SUPPORT VECTOR MACHINES
Learning the maximum margin classifier:
minimize w·w. What are the constraints?
w · x_k + b ≥ +1 if y_k = +1
w · x_k + b ≤ −1 if y_k = −1
25
SUPPORT VECTOR MACHINES
This is going to be a problem! What should we
do? Minimize w·w + C × (distance of error points
to their correct place).
26
SUPPORT VECTOR MACHINES
[Figure: soft margin with slack variables ε₂, ε₇, ε₁₁ measuring how
far the error points lie from their correct side of the margin]
Minimize (1/2) w·w + C Σ_k ε_k
Constraints?
w · x_k + b ≥ +1 − ε_k if y_k = +1
w · x_k + b ≤ −1 + ε_k if y_k = −1
ε_k ≥ 0 for all k
27
SUPPORT VECTOR MACHINES
A harder, one-dimensional dataset: not linearly
separable. What can be done about this?
28
SUPPORT VECTOR MACHINES
Quadratic basis functions: lift each point x to z = (x, x²), so the
harder one-dimensional dataset becomes linearly separable.
[Figure: data points on the line at x = 0, lifted onto a parabola]
29
SUPPORT VECTOR MACHINES WITH KERNELS
Map features into a higher-dimensional space: Φ : x → φ(x)
Minimize (1/2) w·w + C Σ_k ε_k
Constraints?
w · Φ(x_k) + b ≥ +1 − ε_k if y_k = +1
w · Φ(x_k) + b ≤ −1 + ε_k if y_k = −1
ε_k ≥ 0 for all k
30
SUPPORT VECTOR MACHINES: QUADRATIC PROGRAMMING
Maximize
  Σ_k α_k − (1/2) Σ_k Σ_l α_k α_l y_k y_l Φ(x_k)·Φ(x_l)
subject to these constraints:
  0 ≤ α_k ≤ C for all k,   Σ_k α_k y_k = 0.
Then define
  w = Σ_k α_k y_k Φ(x_k).
Datapoints with α_k > 0 will be the support vectors.
Then classify with f(x, w, b) = sign(w · Φ(x) − b).
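As a concrete illustration, here is a minimal sketch (assumed, not the
presentation's code) of training a soft-margin kernel SVM with
scikit-learn, whose fit() solves a dual quadratic program of exactly
this form; the feature vectors and labels are random placeholders.

```python
# A minimal sketch (assumed, not the presentation's code) of soft-margin
# kernel SVM training with scikit-learn; data is an illustrative placeholder.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))                  # feature vectors x_k
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)    # labels y_k in {+1, -1}

clf = SVC(C=10.0, kernel="poly", degree=2)    # quadratic kernel, soft margin C
clf.fit(X, y)                                 # internally solves the dual QP
print(clf.support_)                           # datapoints with alpha_k > 0
print(clf.predict([[0.5, -0.2]]))             # sign(sum_k alpha_k y_k K(x_k, x) - b)
```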
31
[Figure: multi-class classification with p = 3 classes; pairwise
boundaries between classes A, B and C partition the feature space
into Class-A, Class-B and Class-C regions]
32
MULTIPLE FEATURES
HIERARCHICAL LIBRARIES (a.k.a. DIVISIVE CLUSTERING)
33
DYNAMIC MICROSTRUCTURE LIBRARY CONCEPTS
[Schematic: the space of all possible microstructures is partitioned
into classes (e.g. equiaxial grains) with hierarchical sub-classes
(e.g. medium grains). Class partitions are expandable (retraining);
new class partitions and distance measures are created as new
microstructures are added, and the axes of the representation are
updated dynamically.]
34
QUANTIFICATION OF DIVERSE MICROSTRUCTURE
A common framework for quantification of diverse microstructure:
  • Qualitative representation, e.g. "equiaxed grains, small
    grain size"
  • Lower-order descriptor approach, e.g. the grain size
    distribution (number of grains vs. grain size) over the
    equiaxial grain microstructure space
  • Quantitative approach: the microstructure is represented by a
    set of numbers (e.g. 1.4, 2.6, 4.0, 0.9, …) in the
    representation space of all possible polyhedral
    microstructures
35
BENEFITS
  • A data-abstraction layer for describing
    microstructural information.
  • An unbiased representation for comparing
    simulations and experiments AND for evaluating
    correlation between microstructure and
    properties.
  • A self-organizing database of valuable
    microstructural information which can be
    associated with processes and properties.
  • Data mining: process sequence selection for
    obtaining desired properties
  • Identification of multiple process paths leading
    to the same microstructure
  • Adaptive selection of basis for reduced order
    microstructural simulations.
  • Hierarchical libraries for 3D microstructure
    reconstruction in real-time by matching multiple
    lower order features.
  • Quality control: allows machine inspection and
    unambiguous quantitative specification of
    microstructures.

36
PRINCIPAL COMPONENT ANALYSIS
  • Let there be n input images.
  • Vectorize the input images
  • Create an average image
  • Generate training images by subtracting the average
  1. Create the correlation matrix (L_mn)
  2. Find the eigenbasis (v_i) of the correlation matrix
  3. Eigen-microstructures (u_i) are generated from the
    basis (v_i)
  4. Any new image can be transformed to
    eigen-microstructure components through n coefficients
    (w_k)

[Figure: data points projected onto the reduced basis, giving the
representation coefficients]
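The steps above can be sketched in a few lines of NumPy (a minimal
illustration on placeholder data, not the authors' code); note the
small n × n correlation-matrix trick used when the image dimension is
much larger than the number of images:

```python
import numpy as np

imgs = np.random.rand(10, 64 * 64)      # n = 10 vectorized 64x64 images
mean = imgs.mean(axis=0)                # average image
A = imgs - mean                         # training images (mean subtracted)
L = A @ A.T                             # small n x n correlation matrix L_mn
vals, V = np.linalg.eigh(L)             # eigenbasis v_i of L (ascending order)
U = A.T @ V[:, ::-1]                    # eigen-microstructures u_i, largest first
U /= np.linalg.norm(U, axis=0)          # normalize each basis vector
w = U.T @ (imgs[0] - mean)              # coefficients w_k of an image
recon = mean + U[:, :5] @ w[:5]         # reconstruction over 5 basis vectors
```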
37
REQUIREMENTS OF A REPRESENTATION SCHEME
A set of numbers which completely represents a
microstructure within its class: the representation
space of that particular microstructure, e.g.
(2.7, 3.6, 1.2, 0.1, …) vs. (8.4, 2.1, 5.7, 1.9, …).
It must differentiate other cases (must be
statistically representative).
Needed: a technique that is autonomous,
applicable to a variety of microstructures,
computationally feasible and provides complete
representation.
38
PCA REPRESENTATION OF MICROSTRUCTURE: AN EXAMPLE
Input microstructures are quantified by coefficients over the
eigen-microstructures (basis 1 to basis 5); Image-1 is quantified by
5 coefficients over the eigen-microstructures.
Representation coefficients (× 0.001):
  0.0125   1.3142  -4.23     4.5429  -1.6396
 -0.8406   0.8463  -3.0232   0.3424   2.6752
  3.943   -4.2162  -0.6817  -0.9718   1.9268
  1.1796  -1.3354  -2.8401   6.2064  -3.2106
  5.8294   5.2287  -3.7972  -3.6095  -3.6515
39
EIGENVALUES AND RECONSTRUCTION OVER THE BASIS
Significant eigenvalues capture most of the image
features.
Reconstruction of microstructures over fractions
of the basis:
  1. Reconstruction with 100% of the basis
  2. Reconstruction with 80% of the basis
  3. Reconstruction with 60% of the basis
  4. Reconstruction with 40% of the basis
40
INCREMENTAL PCA METHOD
  • For updating the representation basis when new
    microstructures are added in real-time.
  • The basis update is based on an error measure between
    the reconstruction of the new microstructure over the
    existing basis and the original microstructure.

IPCA: given the eigenbasis for 9
microstructures, the update in the basis for the
10th microstructure is based on a PCA of 10 × 1
coefficient vectors instead of 16384 × 1
microstructure vectors.
41
ROSE OF INTERSECTIONS FEATURE ALGORITHM
(Saltykov, 1974)
Identify the intercepts of lines with grain
boundaries plotted within a circular domain.
Count the number of intercepts over several lines
placed at various angles.
The total number of intercepts at each angle
is given as a polar plot, called the rose of
intersections.
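A minimal sketch of the feature (assuming a binary image whose
grain-boundary pixels are 1; the scan-line approach is one possible
implementation, not necessarily Saltykov's): rotate the image and
count boundary crossings along horizontal lines at each angle. The
counts, plotted against angle in polar coordinates, form the rose of
intersections.

```python
import numpy as np
from scipy.ndimage import rotate

def rose_of_intersections(boundary, angles=range(0, 180, 10)):
    counts = []
    for a in angles:
        img = rotate(boundary.astype(float), a, reshape=False, order=0) > 0.5
        d = np.diff(img.astype(np.int8), axis=1)
        counts.append(int((d == 1).sum()))   # 0 -> 1 transitions = intercepts
    return np.array(counts)                  # one intercept count per angle

boundary = np.zeros((64, 64))
boundary[::8, :] = 1                         # placeholder boundary image
print(rose_of_intersections(boundary))
```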
42
GRAIN SHAPE FEATURE EXAMPLES
43
GRAIN SIZE PARAMETER
Several lines are superimposed on the
microstructure, and the intercept lengths of the
lines with the grain boundaries are recorded
(Vander Voort, 1993).
The intercept length (x-axis) versus number of
lines (y-axis) histogram is used as the measure
of grain size.
44
GRAIN SIZE FEATURE EXAMPLES
45
SVM TRAINING FORMAT
Grain features given as input to the SVM training algorithm (one
data point per row):

  Class | Feature number | Feature value | Feature number | Feature value
    1   |       1        |    23.32      |       2        |    21.52
    2   |       1        |    24.12      |       2        |    31.52

Classification success:

  Total images | Classes | Training images | Highest success rate | Average success rate
      375      |   11    |       40        |        95.82%        |        92.53%
      375      |   11    |      100        |        98.54%        |        95.80%
46
CLASS HIERARCHY
Level 1: grain shapes (Class 1, Class 2).
Level 2: subclasses based on grain sizes
(Class 1(a), Class 1(b), Class 1(c); Class 2(a), Class 2(b),
Class 2(c)).
New classes: determined by the distance of an image feature from
the average feature vector of a class.
47
IPCA QUANTIFICATION WITHIN CLASSES
Class-i microstructures (elongated 45 degrees, small grain size);
class-j microstructures (equiaxial grains, medium grain size).
The library: quantification and image representation.

  Representation matrix        | Image-1 | Image-2 | Image-3
  Component in basis vector 1  |   123   |   23    |   38
  Component in basis vector 2  |    91   |   54    |  -85
  Component in basis vector 3  |   -54   |   90    |   12
  Average image                |    21   |   23    |   24

Eigenbasis vectors: (0.9, 0.84, 0.23, …), (0.54, 0.21, 0.74, …)
48
REPRESENTATION FORMAT FOR MICROSTRUCTURE
Date: 1/12, 02:23 PM, basis updated. Shape class 3
(oriented 40 degrees, elongated); size class 1 (large grains).
Coefficients in the basis: 2.42, 12.35, -4.14, 1.95, 1.96, -1.25.
Improvement of microstructure representation due to classification:
reconstruction with 6 coefficients (24% of the basis) in a class
with 25 images, versus an improved reconstruction with 6 coefficients
(10% of the basis) in a class of 60 images; compare the original
image and a reconstruction over 15 coefficients.
49
  • A DYNAMIC LIBRARY APPROACH
  • Classify microstructures based on lower-order
    descriptors.
  • Create a common basis for representing images in
    each class at the last level in the class
    hierarchy.
  • Represent 3D microstructures as coefficients over
    a reduced basis in the base classes.
  • Dynamically update the basis and the
    representation for new microstructures.

COMMON BASIS FOR MICROSTRUCTURE REPRESENTATION
[Figure: the eigenvalue spectrum of the common basis does not decay
to zero]
50
Basis components
Reconstruction using two basis components:
(image) ≈ 5.89 × (basis component 1) + 14.86 × (basis component 2)
Project onto the basis, then round off the pixel values.
Representation using just 2 coefficients: (5.89, 14.86).
51
  1. Creation of 3D microstructure models from 2D
    images.
  2. 3D imaging requires time and effort. Need to
    address real-time methodologies for generating 3D
    realizations.
  3. Make intelligent use of available information
    from computational models and experiments.

[Pipeline: 2D imaging techniques → pattern recognition/vision →
database → microstructure analysis]
52
Available methods are optimization-based: features of the 2D image
are matched to those of a 3D microstructure by posing an optimization
problem. Drawbacks: (1) they do not make use of available information
(experimental/simulated data); (2) they cannot perform reconstructions
in real-time. The processes that create these microstructures must be
taken into account (Oren and Bakke, 2003) to correctly model the
geometric connectivity.
  • Key assumptions employed for 3D image
    reconstruction from a single 2D image:
  • Randomness assumption (Ohser and Mücklich,
    2000).
  • Grains in a polyhedral microstructure are
    assumed to be of similar shape but of
    different sizes.
  • Two-phase microstructures can be characterized
    using rotationally-invariant probability functions.
53
  • PATTERN RECOGNITION: A DATA-DRIVEN OPTIMIZATION
    TOOL
  • Feature matching for real-time reconstruction of 3D
    microstructures

DATABASE CREATION: datasets of microstructures from experiments or
physical models.
FEATURE EXTRACTION: extraction of statistical features from the
database.
TRAINING: creation of a microstructure class hierarchy using
classification methods.
PREDICTION: prediction of 3D reconstructions, process paths,
etc.
54
  • Algorithm (1 Monte Carlo step):
  • Calculate the free energy of a randomly
    selected node (Hi)
  • Randomly choose a new crystallographic
    orientation for the node
  • Recalculate the free energy of the
    node (Hf)
  • The orientation that minimizes the energy
    (min(Hf, Hi)) is chosen.

Potts Hamiltonian (H), summed over neighbor pairs with unlike
orientations. Ns = total no. of nodes; Nn(i) = no. of neighbors of
node i.
Classes of microstructures are built based on the grain size
feature (microstructure database).
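A sketch of one such Monte Carlo step for a 2D Potts grain-growth
model (the grid size, 4-neighbor connectivity and unit interaction
penalty are assumptions, not the presentation's parameters):

```python
import numpy as np

def node_energy(q, i, j):
    """Free-energy contribution of node (i, j): one unit per unlike neighbor."""
    s = q[i, j]
    nbrs = (q[(i - 1) % q.shape[0], j], q[(i + 1) % q.shape[0], j],
            q[i, (j - 1) % q.shape[1]], q[i, (j + 1) % q.shape[1]])
    return sum(n != s for n in nbrs)

def mc_step(q, n_orient, rng):
    i, j = rng.integers(q.shape[0]), rng.integers(q.shape[1])
    h_i = node_energy(q, i, j)          # free energy of the current orientation
    old = q[i, j]
    q[i, j] = rng.integers(n_orient)    # random new crystallographic orientation
    h_f = node_energy(q, i, j)
    if h_f > h_i:                       # keep the orientation with min(Hf, Hi)
        q[i, j] = old

rng = np.random.default_rng(0)
q = rng.integers(0, 16, size=(64, 64))  # random initial orientations
for _ in range(10_000):
    mc_step(q, 16, rng)
```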
55
Slice:
intercept lengths of a parallel network of lines
with the grain boundaries are recorded at several
angles.
The intercept length (x-axis) versus number of
lines (y-axis) histogram is the measure of grain
size (Heyn intercept histogram).
56
FEATURE BASED CLASSIFICATION
3D microstructures are classified in two levels.
Level 1: rose of intersections (classes 1 and 2).
Level 2: Heyn intercept histogram (classes 1 to 4).
57
RECONSTRUCTION OF POLYHEDRAL MICROSTRUCTURE
Polarized light micrographs of aluminum alloy
AA3002 representing the rolling plane (Wittridge
and Knutsen, 1999); a reconstructed 3D image; and a
comparison of the average feature of the 3D class with
that of the 2D image.
58
The stereological integral equation estimates the 3D grain size
distribution of a polyhedral microstructure from a 2D image.
Na, Fa(s): density of grains and grain size
distribution in the 2D image. Nv, Fv(u): density of
grains and grain size distribution in the 3D
microstructure. The kernel involves the rotation average of the
size of a particle with maximum size 1. Gu(s):
size distribution function of the section
profiles under the condition that a random size
U equals the 3D particle mean size (u). Remark:
sizes are defined as the maximum caliper
diameter of a grain.
59
STEREOLOGICAL DISTRIBUTIONS (GEOMETRICAL)
[Figure: a 2D grain profile sectioned from a 3D grain, and the 3D
reconstruction]
Na, Fa(s): density of grains and grain size
distribution in the 2D image. Nv, Fv(u): density of
grains and grain size distribution in the 3D
microstructure.
60
Rotationally invariant probability functions
(S_i^N) can be interpreted as the probability of
finding the N vertices of a polyhedron separated
by relative distances x1, x2, …, xN in phase i
when tossed, without regard to orientation, into
the microstructure.
MC sampling: computing the three-point
probability function of a 3D microstructure
(40×40×40 microns), S3(r, s, t) with r = s = t = 2,
using 5000 initial points and 4 samples at each
initial point.
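The sampling can be sketched as follows (an assumed reconstruction on
a placeholder medium: triangles with side lengths r, s, t are tossed
at random positions and orientations, with periodic wrapping, and the
fraction landing with all vertices in phase 1 estimates S3):

```python
import numpy as np

def s3_estimate(phase, r, s, t, n_samples=5000, rng=None):
    rng = rng or np.random.default_rng(0)
    a = (r**2 + s**2 - t**2) / (2 * r)       # third vertex in a local 2D frame
    b = np.sqrt(max(s**2 - a**2, 0.0))
    dims = np.array(phase.shape)
    hits = 0
    for _ in range(n_samples):
        e1 = rng.normal(size=3); e1 /= np.linalg.norm(e1)   # random direction
        u = rng.normal(size=3); u -= (u @ e1) * e1
        e2 = u / np.linalg.norm(u)           # completes a random plane
        x0 = rng.uniform(0, dims)            # random origin
        pts = (x0, x0 + r * e1, x0 + a * e1 + b * e2)
        idx = [tuple(np.floor(p).astype(int) % dims) for p in pts]
        hits += all(phase[i] == 1 for i in idx)   # periodic wrapping
    return hits / n_samples

phase = (np.random.rand(40, 40, 40) < 0.3).astype(int)   # placeholder medium
print(s3_estimate(phase, 2, 2, 2))
```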
61
  • The microstructure is represented using voxels.
  • The probability of solidification (P) depends on:
  • 1) The net weight (w) of the number of neighbors of a
    solid voxel:
  • If w > 8.6568, the voxel solidifies (P = 1)
  • If 3.8284 < w < 8.6568, P = 0.1
  • If w < 3.8284, the voxel remains liquid (P = 0)
  • 2) The solute concentration: a linear probability
    distribution with P = 0 at the critical concentration
    and P = 1 when the concentration is 0.

[Figure: final state]
When a voxel solidifies, liquid is expelled to
its neighbors, creating solute concentration
(c_ijk) gradients. Movement of solute to
minimize concentration gradients is modeled using
Fick's law, where (i,j,k) is a voxel coordinate, n is the
time step and D is the diffusion coefficient.
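A minimal sketch of the solute redistribution step, assuming an
explicit finite-difference form of Fick's second law on a periodic
voxel grid with unit spacing and unit time step:

```python
import numpy as np

def diffuse(c, D=0.1):
    """One explicit step: c(n+1) = c(n) + D * laplacian(c(n))."""
    lap = (np.roll(c, 1, 0) + np.roll(c, -1, 0) +
           np.roll(c, 1, 1) + np.roll(c, -1, 1) +
           np.roll(c, 1, 2) + np.roll(c, -1, 2) - 6 * c)
    return c + D * lap                  # stable for D <= 1/6 in 3D

c = np.random.rand(32, 32, 32)          # placeholder concentration field c_ijk
c = diffuse(c)
```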
62
TWO PHASE MICROSTRUCTURE CLASS HIERARCHY
3D microstructures are classified in two levels.
Level 1 feature: autocorrelation function.
Level 2 feature vector: three-point probability function.
[Figure: autocorrelation g vs. r (mm) for classes 1 and 2]
63
EXAMPLE: 3D RECONSTRUCTION USING SVMS
Ag-W composite (Umekawa, 1969) and a reconstructed 3D
microstructure, matched on the autocorrelation function and the
3-point probability function.
64
MICROSTRUCTURE ELASTIC PROPERTIES
A 3D image derived through pattern recognition, compared with the
experimental image.
65
WHAT IS MICROSTRUCTURE DESIGN?
Direct problem: given an initial microstructure and known operating
conditions, find the property (use finite elements, experiments etc.).
Design problems:
  • Design for best processes: given the initial microstructure,
    the final microstructure/property and known operating
    conditions, find the processing sequence.
  • Design for best microstructure: given known operating
    conditions and known property limits, find the microstructure.
66
SUPERVISED VS. UNSUPERVISED LEARNING
  • Supervised classification for design:
  • Classify microstructures based on known process
    sequence classes
  • Given a desired microstructure, identify the
    processing stages required through classification
  • Drawback: identifies a unique process sequence,
    but we find that many processing paths lead to
    similar properties!
  • Unsupervised classification:
  • Identify classes purely based on structural
    attributes
  • Associate processes and properties through
    databases
  • Explores the structural attribute space for
    similarities and unearths non-unique processing
    paths leading to similar microstructural
    properties

67
K-MEANS
Suppose the coordinates of points drawn randomly
from this dataset are transmitted, and you can
install decoding software at the receiver. You're
only allowed to send two bits per point, so it'll
have to be a lossy transmission. Loss = sum
squared error between the decoded coordinates and
the original coordinates. What encoder/decoder will
lose the least information?
68
K-MEANS
Idea One: break into a grid, and decode each bit-pair (00, 01, 10,
11) as the middle of its grid-cell.
Idea Two: break into a grid, and decode each bit-pair as the
centroid of all the data in that grid-cell.
  • Questions:
  • What are we trying to optimize?
  • Are we sure it will find an optimal clustering?
69
K-MEANS
Find the cluster centers c1, c2, …, ck such that
the sum of the squared 2-norm distance between
each feature xi, i = 1,…,n, and its nearest
cluster center ch is minimized.
Cost function:
  J = Σ_i min_h ‖x_i − c_h‖²
The cost function is minimized by transmitting the centroids.
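A compact NumPy sketch of the resulting algorithm (Lloyd's iteration,
described on a later slide; the 2D data is a random placeholder):

```python
import numpy as np

def kmeans(X, k, n_iter=100, rng=None):
    rng = rng or np.random.default_rng(0)
    centers = X[rng.choice(len(X), size=k, replace=False)]  # random init
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        owner = d.argmin(axis=1)           # each point joins its nearest center
        new = np.array([X[owner == h].mean(axis=0) if np.any(owner == h)
                        else centers[h] for h in range(k)])
        if np.allclose(new, centers):      # converged
            break
        centers = new
    cost = (d.min(axis=1) ** 2).sum()      # sum of squared 2-norm distances
    return centers, owner, cost

X = np.random.rand(200, 2)
centers, owner, cost = kmeans(X, k=4)
```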
70
THE EXPECTATION-MAXIMIZATION (EM) ALGORITHM
  • What properties must the centers c1,
    c2, …, ck have when the distortion is
    minimized?
  • Expectation step: compute the expected centers.
  • (1) Change the encoding so that each xi is encoded
    by its nearest center.
  • Maximization step: compute the maximum likelihood
    values of the centers.
  • (2) Set each center to the centroid of the points it
    owns.
  • There's no point applying either operation twice
    in succession.
  • But it can be profitable to alternate.
  • And that's K-means!

The EM algorithm will be dealt with later.
71
K-MEANS
  1. Ask the user how many clusters they'd like (e.g.
    k = 5)

72
K-MEANS
  1. Ask the user how many clusters they'd like (e.g.
    k = 5)
  2. Randomly guess k cluster center locations

73
K-MEANS
  1. Ask the user how many clusters they'd like (e.g.
    k = 5)
  2. Randomly guess k cluster center locations
  3. Each datapoint finds out which center it's
    closest to. (Thus each center owns a set of
    datapoints)

74
K-MEANS
  1. Ask the user how many clusters they'd like (e.g.
    k = 5)
  2. Randomly guess k cluster center locations
  3. Each datapoint finds out which center it's
    closest to.
  4. Each center finds the centroid of the points it
    owns

75
K-MEANS
  1. Ask the user how many clusters they'd like (e.g.
    k = 5)
  2. Randomly guess k cluster center locations
  3. Each datapoint finds out which center it's
    closest to.
  4. Each center finds the centroid of the points it
    owns…
  5. …and jumps there
  6. Repeat until terminated!

k is often unknown (it depends on the features used
for microstructure representation).
76
SHORTCOMINGS OF K-MEANS AND REMEDIES
  • K-means gives hyper-spherical clusters: not
    always the case with data
  • The number of classes must be known a priori: this
    defeats the reasoning for unsupervised clustering,
    since we do not know anything about the classes in
    the data
  • May converge to local optima (not so bad)
  • We will discuss new strategies to get
    improved clusters of microstructural features:
  • Gaussian mixture models and Bayesian clustering
  • Later, an improved k-means algorithm called
    X-means which uses a Bayesian information
    criterion

77
PROBABILITY PRELIMINARIES
  • A is a Boolean-valued random variable if A
    denotes an event, and there is some degree of
    uncertainty as to whether A occurs.
  • Examples:
  • A = you win the toss
  • A = a structure fails

Discrete random variables:
0 ≤ P(A) ≤ 1
P(True) = 1, P(False) = 0
P(A or B) = P(A) + P(B) − P(A and B)
P(A) + P(¬A) = 1
P(B) = P(B, A) + P(B, ¬A)
78
PROBABILITY PRELIMINARIES
Definition of conditional probability:
P(A | B) = P(A, B) / P(B)
Corollary, the chain rule:
P(A, B) = P(A | B) P(B)
Bayes rule:
P(B | A) = P(A, B) / P(A) = P(A | B) P(B) / P(A)
79
PROBABILITY PRELIMINARIES
  • MLE (maximum likelihood estimator):
    class of data = argmax_i P(data | class i)
    What if Y = v itself is very unlikely?
  • MAP (maximum a-posteriori estimator):
    class of data = argmax_i P(class i | data)
    Includes the P(Y = v) information through Bayes rule
    (P(Y = v) is called the prior)
80
PROBABILITY PRELIMINARIES
  • MAP (maximum a-posteriori estimator):
    argmax_v P(Y = v | X) = argmax_v P(X | Y = v) P(Y = v) / P(X)
81
PROBABILITY PRELIMINARIES
Bayes classifiers in a nutshell:
1. Learn the distribution over the inputs for each
value of Y. 2. This gives P(X1, X2, …, Xm | Y = vi).
3. Estimate P(Y = vi) as the fraction of records
with Y = vi. 4. For a new prediction:
Y_predict = argmax_v P(Y = v) P(X1, …, Xm | Y = v)
82
NAÏVE BAYES CLASSIFIER
In the case of the naive Bayes classifier this
can be simplified using the independent-features assumption:
P(X1, …, Xm | Y) = Π_j P(Xj | Y)
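With that assumption, each class-conditional factor is estimated
independently per feature. A sketch (assuming scikit-learn, on
illustrative data) of a Gaussian naive Bayes classifier:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

X = np.random.rand(100, 3)                  # feature vectors
y = (X[:, 0] > 0.5).astype(int)             # two classes
clf = GaussianNB().fit(X, y)                # fits P(Xj | Y) per feature
print(clf.predict_proba(X[:2]))             # P(class | features) per record
```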
83
NAÏVE BAYES CLASSIFIER IS AN SVM?
The naive Bayes classifier can, after a notation
change, be rewritten as a "new" Bayes classifier.
84
NAÏVE BAYES CLASSIFIER IS AN SVM?
Bayes classifier with feature weighting: a two-class
classifier whose decision is given by the sign of
f_WBC, with per-feature weights w_j.
w_j = 1 (for naive Bayes). But features may be
correlated!
85
NAÏVE BAYES CLASSIFIER IS AN SVM?
An SVM classifier! It operates in the feature space of
a naive Bayes classifier.
86
INTRO TO BAYESIAN UNSUPERVISED CLASSIFICATION
Gaussian mixture models
Assume that each feature is generated as follows: pick a
class at random, choosing class i with probability
P(ωi); then sample the feature from a Gaussian
distribution N(μi, Σi).
87
GAUSSIAN MIXTURE MODEL
Probabilistic extension of k-means:
  • There are k components. The ith component is
    called yi
  • Component yi has an associated mean vector μi
  • Each component generates data from a Gaussian
    with mean μi and covariance matrix Σi

Assuming the features in each class can be modeled by
a Gaussian distribution, identify the parameters
(means, variances etc.) of the distributions.
88
GAUSSIAN MIXTURE MODEL
  • We have features x1, x2, …, xn of a microstructure
  • We have P(y1), …, P(yk). We have σ.
  • We can define, for any x, P(x | yi, μ1, μ2, …, μk)
  • Can we define P(x | μ1, μ2, …, μk)?
  • Can we define P(x1, x2, …, xn | μ1, μ2, …, μk)?

89
GAUSSIAN MIXTURE MODEL
Given a guess at μ1, μ2, …, μk, we can obtain the
probability of the unlabeled data given those μ's.
Inverse problem: find the μ's given the points
x1, x2, …, xn.
The normal max-likelihood trick: set
∂ log Prob(…) / ∂μi = 0
and solve for the μi's. Using gradient descent is
slow but doable. Instead, use a much faster and
recently very popular method: EM.
90
EM ALGORITHM REVISITED
  • We have unlabeled microstructural features x1, x2,
    …, xR
  • We know there are k classes
  • We know P(y1), P(y2), P(y3), …, P(yk)
  • We don't know μ1, μ2, …, μk
  • We can write P(data | μ1, …, μk)

Maximize this likelihood.
91
GAUSSIAN MIXTURE MODEL
This gives n nonlinear equations in the μj's.
If, for each xi, we knew the probability that xi
is in class yj, P(yj | xi, μ1, …, μk), then
we could easily compute μj. If we knew each μj,
then we could easily compute P(yj | xi, μ1, …, μk)
for each yj and xi.
92
GAUSSIAN MIXTURE MODEL
  • Iterate. On the t'th iteration, let our
    estimates be μ1(t), μ2(t), …, μc(t)
  • E-step:
  • Compute the expected classes of all datapoints for
    each class, P(yj | xk, μ1(t), …, μc(t)): just
    evaluate a Gaussian at xk
  • M-step:
  • Compute the maximum likelihood μ's given our
    data's class membership distributions
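A minimal EM sketch under the simplifications used here (known equal
class priors and unit spherical covariances, estimating only the
means); the two-blob dataset is a placeholder:

```python
import numpy as np

def em_means(X, k, n_iter=50, rng=None):
    rng = rng or np.random.default_rng(0)
    mu = X[rng.choice(len(X), size=k, replace=False)]       # initial guesses
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        resp = np.exp(-0.5 * d2)                  # unnormalized N(x; mu_j, I)
        resp /= resp.sum(axis=1, keepdims=True)   # E-step: P(y_j | x_i, mu)
        mu = (resp.T @ X) / resp.sum(axis=0)[:, None]   # M-step: weighted means
    return mu

X = np.vstack([np.random.randn(100, 2) + [4, 0],
               np.random.randn(100, 2) - [4, 0]])
print(em_means(X, k=2))
```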
93
GAUSSIAN MIXTURE MODEL: DENSITY ESTIMATION
A GMM models a complex PDF over the feature space (e.g. features
in 2D). Uses: classification, probabilistic quantification of
results, ambiguity and anomaly detection. Very popular in genome
mapping.
94
DATABASE FOR POLYCRYSTAL MICROSTRUCTURES
[Pipeline: multi-scale microstructure evolution models → feature
extraction → statistical learning, driven by distance-based (or)
probabilistic clustering → database. Meso-scale database components:
divisive clustering, class hierarchies, class prediction.]
97
ORIENTATION DISTRIBUTION FUNCTION
The orientation distribution function (ODF), A(r, t):
  • Determines the volume fraction of crystals within
    a region R' of the fundamental region R
  • Probability of finding a crystal orientation within
    a region R' of the fundamental region
  • Characterizes texture evolution

ODF evolution equation (Eulerian description), with
reorientation velocity v:
∂A/∂t + ∇ · (A v) = 0
Any macroscale property ⟨χ⟩ can be expressed
as an expectation value if the corresponding
single-crystal property χ(r, t) is known:
⟨χ⟩ = ∫_R χ(r, t) A(r, t) dr
98
FEATURES OF AN ODF: ORIENTATION FIBERS
Example fiber: crystal axis h = [1,2,3], sample axis y = [1,0,1].
For a particular h, the pole figure takes values
P(h, y) at locations y on a unit sphere, e.g. the
(1,2,3) pole figure at the point y = (1,0,1).
P(h, y) is integrated over all fibers corresponding to
crystal direction h and sample direction y, i.e. over the
points r of the (h, y) fiber in the fundamental region.
99
SIGNIFICANCE OF ORIENTATION FIBERS
Important fiber families: <110> for uniaxial
compression, plane strain compression and simple
shear; <111> for torsion; <100> and <411> fibers for
tension; the α fiber (ND <110>) and β fiber for FCC
metals under plane strain compression.
Fibers have a close affiliation with processes, and their
development is predictable, e.g. for a uniaxial (z-axis)
compression texture: the z-axis <100> fiber (A-A), the
z-axis <110> fiber (B-B) and the z-axis <111> fiber (C-C).
100
LIBRARY FOR TEXTURES
Uniaxial (z-axis) compression texture, <110> fiber family.
Feature: the fiber path corresponding to crystal
direction h and sample direction y, e.g. the z-axis
<110> fiber (B-B).
101
SUPERVISED CLASSIFICATION USING SUPPORT VECTOR
MACHINES
Multi-stage classification, with each class
affiliated with a unique process, e.g. tension (T)
at stage 1, followed by stages 2 and 3.
Given an ODF/texture, this identifies a unique
processing sequence, but fails to capture the
non-uniqueness in the solution.
102
UNSUPERVISED CLASSIFICATION
Find the cluster centers c1, c2, …, ck such that
the sum of the squared 2-norm distance between
each feature xi, i = 1,…,n, and its nearest
cluster center ch is minimized.
Each class is affiliated with multiple processes.
Cost function:
  J = Σ_i min_h ‖x_i − c_h‖²
[Pipeline: database of ODFs → feature space → identify clusters]
103
ODF CLASSIFICATION
  • Automatic class discovery without class labels.
  • Hierarchical classification model.
  • Association of classes with processes, to
    facilitate data-mining.
  • Can be used to identify multiple process routes
    for obtaining a desired ODF.

Example: ODFs 2, 12, 32, 97. One ODF, several
process paths: data-mining for process information
with ODF classification.
104
PROCESS PARAMETERS LEADING TO DESIRED PROPERTIES
[Pipeline: database for ODFs → ODF classification → property
extraction → identify multiple solutions (velocity gradients)]
Different processes, similar properties.
105
K-MEANS ALGORITHM FOR UNSUPERVISED CLASSIFICATION
  • Lloyd's algorithm:
  • Start with k randomly initialized centers
  • 1. Change the encoding so that each xi is owned by
    its nearest center.
  • 2. Reset each center to the centroid of the points
    it owns.
  • Alternate steps 1 and 2 until converged.
  • The user needs to provide k, the number of
    clusters.

But the number of clusters is unknown for the
texture classification problem.
106
SCHWARZ CRITERION FOR IDENTIFYING THE NUMBER OF
CLUSTERS
The criterion combines the log-likelihood of the data in each
cluster, computed from the probability of a point in cluster i
under a Gaussian data distribution whose variance is the
maximum-likelihood estimate, with a penalty on the number of
free parameters.
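A sketch of the criterion (an assumed reconstruction following Pelleg
and Moore's X-means formulation, with an identical spherical Gaussian
model per cluster; not the presentation's code). X-means, on the next
slide, accepts a centroid split in a region when the two-center model
attains the higher BIC.

```python
import numpy as np

def bic(X, centers, owner):
    R, M = X.shape                       # no. of points, feature dimension
    K = len(centers)
    var = ((X - centers[owner]) ** 2).sum() / (R - K)   # ML variance estimate
    ll = 0.0                             # log-likelihood of the data
    for h in range(K):
        Rh = int(np.sum(owner == h))     # points owned by center h
        if Rh == 0:
            continue
        ll += (-Rh / 2 * np.log(2 * np.pi)
               - Rh * M / 2 * np.log(var)
               - (Rh - K) / 2
               + Rh * np.log(Rh) - Rh * np.log(R))
    p = K * (M + 1)                      # free parameters: means + variance
    return ll - p / 2 * np.log(R)        # Schwarz penalty
```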
107
CENTROID SPLIT TESTS
  • X-means algorithm:
  • Start with the k clusters found through the
    k-means algorithm
  • Split each centroid into two centroids, and move
    the new centroids along a distance proportional
    to the cluster size in an arbitrarily chosen
    direction
  • Run local k-means (k = 2) in each cluster
  • Accept the split in a region if BIC(k = 1) <
    BIC(k = 2)
  • Test various initial values of k and
    select the k with the maximum overall BIC
108
COMPARISON OF K-MEANS AND X-MEANS
  • Local optimum produced by the k-means algorithm
    with k = 4.
  • Cluster configuration produced by k-means with
    k = 6: over-estimates the natural number of clusters.
  • Configuration produced by the x-means algorithm
    with an input range of k = 2 to 15: x-means found 4
    clusters in the data-set based on the Bayesian
    information criterion.
109
MULTIPLE PROCESS ROUTES
Desired Young's modulus distribution, two routes found through
classification:
  • Stage 1: tension (α = 0.9495), or tension (α = 0.9699)
  • Stage 2: rotation-1 (α = -0.2408), or shear-1 (α = 0.3384)
Desired magnetic hysteresis loss distribution, two routes:
  • Stage 1: shear-1 (α = 0.9580), or shear-1 (α = 0.9454)
  • Stage 2: plane strain compression (α = -0.1597), or
    rotation-1 (α = -0.2748)
110
LIMITATIONS OF STATISTICAL LEARNING BASED DESIGN
SOLUTIONS
  • Classification alone does not yield the final
    design solution.
  • Why? Because it is impossible to explore the
    infinite design space within a database of
    reasonable size.
  • Use statistical learning to provide an initial
    class of solutions.
  • Use local optimization schemes (details not
    given in this presentation) to identify the exact
    solutions.

[Figure: response surface of the objective to be minimized over the
microstructure attributes, with the statistical-learning design
solutions as starting points]
111
DESIGN FOR A DESIRED ODF: A MULTI-STAGE PROBLEM
Desired ODF, with optimal reduced-order control:
Stage 1: plane strain compression (α1 = 0.9472)
Stage 2: compression (α2 = -0.2847)
Initial guess: α1 = 0.65, α2 = -0.1.
The full-order ODF is computed from the reduced-order
control parameters.
112
DESIGN FOR DESIRED MAGNETIC PROPERTY
The crystal <100> direction is the easy direction of
magnetization (zero power loss) relative to the external
magnetization direction h.
Stage 1: shear-1 (α1 = 0.9745)
Stage 2: tension (α2 = 0.4821)
113
DESIGN FOR DESIRED YOUNG'S MODULUS
Using the stiffness of FCC Cu in the crystal frame, the elastic
modulus is found from the polycrystal average ⟨C⟩ over the ODF.
Stage 1: shear (α1 = -0.03579)
Stage 2: tension (α2 = 0.17339)
114
WHAT WE SHOULD KNOW
Appreciate the uses, and understand the
limitations, of statistical learning applied to
materials:
  • How to learn microstructure/process/property
    relationships given computational and
    experimental data
  • Be comfortable with probabilistic tools: Bayesian
    analytics and Gaussian mixture models
  • Understand simple tools like k-means that can be
    readily used
  • Understand SVMs as a versatile statistical
    learning tool for both feature selection and
    classification
  • Apply statistical learning to perform real-time
    decisions under high degrees of uncertainty

115
USEFUL REFERENCES
  • Andrew Moore's statistical learning course
    online:
  • http://www-2.cs.cmu.edu/~awm/tutorials/
  • Books:
  • R.O. Duda, P.E. Hart and D.G. Stork, Pattern
    Classification (2nd ed.), John Wiley and Sons, New
    York (2001).
  • Example papers on microstructure/materials
    related applications for the tools presented in
    this talk:
  • V. Sundararaghavan and N. Zabaras, "A dynamic
    material library for the representation of single
    phase polyhedral microstructures", Acta
    Materialia, Vol. 52/14, pp. 4111-4119, 2004.
  • V. Sundararaghavan and N. Zabaras,
    "Classification of three-dimensional
    microstructures using support vector machines",
    Computational Materials Science, Vol. 32, pp.
    223-239, 2005.
  • V. Sundararaghavan and N. Zabaras, "On the
    synergy between classification of textures and
    deformation process sequence selection", Acta
    Materialia, Vol. 53/4, pp. 1015-1027, 2005.
  • T.J. Sabin, C.A.L. Bailer-Jones and P.J. Withers,
    "Accelerated learning using Gaussian process
    models to predict static recrystallization in an
    Al-Mg alloy", Modelling Simul. Mater. Sci. Eng. 8
    (2000) 687-706.
  • C.A.L. Bailer-Jones, H.K.D.H. Bhadeshia and
    D.J.C. MacKay, "Gaussian Process Modelling of
    Austenite Formation in Steel", Materials Science
    and Technology, Vol. 15, 1999, 287-294.

116
THANK YOU