Title: STATISTICAL LEARNING METHODS FOR MICROSTRUCTURES
1. STATISTICAL LEARNING METHODS FOR MICROSTRUCTURES
Veera Sundararaghavan and Prof. Nicholas Zabaras
Materials Process Design and Control Laboratory
Sibley School of Mechanical and Aerospace Engineering
188 Frank H. T. Rhodes Hall, Cornell University, Ithaca, NY 14853-3801
Email: vs85_at_cornell.edu, zabaras_at_cornell.edu
URL: http://mpdc.mae.cornell.edu/
2. WHAT IS STATISTICAL LEARNING?
- Statistical learning automates the search for patterns in large-scale statistics.
- Which patterns are interesting?
- Mathematical techniques for associating input data with desired attributes and identifying correlations.
- A powerful tool for designing materials.
3. FOR MICROSTRUCTURES?
- Properties of a material are affected by the underlying microstructure.
- Microstructural attributes are related to specific properties. Examples: correlation functions → elastic moduli; orientation distribution → yield stress in polycrystals.
- Attributes evolve during processing (thermomechanical, chemical, solidification, etc.).
- Can we identify specific patterns in these relationships?
- Is it possible to probabilistically predict the best microstructure and the best processing paths for optimizing properties, based on available structural attributes?
4. TERMINOLOGY
- A microstructure can be represented in terms of typical attributes.
- Examples: volume fractions, probability functions, shape/size attributes, grain orientations, cluster functions, lineal measures, and so on.
- All these attributes affect physical properties.
- Attributes evolve during processing of a microstructure.
- Attributes are represented in discrete (vector) form as features: a feature is a vector x_k, k = 1,…,n, where n is the dimensionality of the feature.
- Each distinct feature is written x_k^(i), where the superscript denotes the i-th feature of interest.
5. TERMINOLOGY
Given a data set of computational or experimental microstructures, can we learn the functional differences between them based on features? We denote microstructures that are similar in attributes by a class label y, y = 1,…,k, where k is the number of classes. Classes are organized into hierarchies, with each level represented by a feature x^(i). Structure-based classes are affiliated with processes and properties: a powerful tool for exploring the complex microstructure design space.
6. APPLICATIONS
7. MICROSTRUCTURE LIBRARIES FOR REPRESENTATION
Sundararaghavan and Zabaras, Acta Materialia, 2004
Pipeline: input microstructure → pre-processing → feature detection (employ lower-order features) → classifier (identify and add new classes) → quantification and mining of associations.
8. MICROSTRUCTURE RECONSTRUCTION
Sundararaghavan and Zabaras, Computational Materials Science, 2005
Flow: process → 2D imaging techniques → feature extraction → pattern recognition / vision → database (with microstructure evolution models) → 3D realizations → microstructure analysis (FEM / bounding theory); the process parameters can also be reverse-engineered.
9. STATISTICAL LEARNING TOOLBOX
Training samples come from numerical simulation of material response (multi-length-scale analysis, polycrystalline plasticity); inputs may be images, ODFs, or pole figures.
Toolbox functions: classification methods; identification of new classes.
Outputs: associate data with a class, update classes, update the data in the library; drive a process controller.
10. DESIGNING PROCESSES FOR MICROSTRUCTURES
Sundararaghavan and Zabaras, Acta Materialia, 2005
DATABASE: process sequence 1 (process parameters, ODF history, reduced basis); process sequence 2 (new process parameters, ODF history, reduced basis); new datasets are added over time.
Given a desired texture/property: classifier → probable process sequences and initial parameters → adaptive basis selection → reduced-basis process optimization (stage 1, stage 2) → optimum parameters.
11. THIS LECTURE WILL COVER…
- The mathematics behind statistical learning, through two very useful techniques: support vector machines and Bayesian clustering.
- Applications to microstructure representation, reconstruction, and process design.
- We will only skim the physics and some of the important computational tools behind these problems.
12. STATISTICAL LEARNING TECHNIQUES
13. STATISTICAL LEARNING TECHNIQUES (this lecture's topics highlighted)
14. STATISTICAL LEARNING TECHNIQUES (this lecture's topics highlighted)
Function approximation: useful for prediction in regions that are computationally unreachable (not covered in this lecture).
15. PRELIMINARIES OF SUPERVISED CLASSIFIERS
A two-class problem: the classes of the test specimens are known a priori (+1 denotes high strength, −1 denotes low strength), plotted against features such as pore density and volume fraction. Aim: to predict the strength of a new microstructure.
16. SUPPORT VECTOR MACHINES
f(x, w, b) = sign(w · x − b), with classes denoted +1 and −1.
How would you classify this data?
17. OCCAM'S RAZOR
"Plurality should not be assumed without necessity." William of Ockham, Surrey (England), 1285-1347 AD, theologian.
- Simpler models are more likely to be correct than complex ones.
- Nature prefers simplicity.
- A principle of uncertainty maximization.
18. SUPPORT VECTOR MACHINES
f(x, w, b) = sign(w · x − b)
How would you classify this data?
19. SUPPORT VECTOR MACHINES
f(x, w, b) = sign(w · x − b)
Any of these would be fine… but which is best?
20. SUPPORT VECTOR MACHINES
f(x, w, b) = sign(w · x − b)
Define the margin of a linear classifier as the width by which the boundary could be increased before hitting a datapoint.
21. SUPPORT VECTOR MACHINES
f(x, w, b) = sign(w · x − b)
The maximum-margin linear classifier is the linear classifier with, um, the maximum margin. This is the simplest kind of SVM (called a linear SVM, or LSVM). Support vectors are the datapoints that the margin pushes up against.
22. SUPPORT VECTOR MACHINES
M = margin width. How do we compute M in terms of w and b?
- Plus-plane: {x : w · x + b = +1}, the boundary of the "predict class +1" zone.
- Minus-plane: {x : w · x + b = −1}, the boundary of the "predict class −1" zone; the decision boundary itself is w · x + b = 0.
- The vector w is perpendicular to the plus-plane. Why? Let u and v be two vectors on the plus-plane; what is w · (u − v)? (Zero, so w is normal to the plane.) And so of course the vector w is also perpendicular to the minus-plane.
- Let x⁻ be a point on the minus-plane and x⁺ the closest point to it on the plus-plane. Claim: x⁺ = x⁻ + λw for some value of λ. Why? (The shortest path between the planes runs along the normal w.)
23. SUPPORT VECTOR MACHINES
Computing the margin width. What we know:
- w · x⁺ + b = +1
- w · x⁻ + b = −1
- x⁺ = x⁻ + λw
- |x⁺ − x⁻| = M
Then w · (x⁻ + λw) + b = 1 ⇒ (w · x⁻ + b) + λ w·w = 1 ⇒ −1 + λ w·w = 1 ⇒ λ = 2/(w·w).
It's now easy to get M in terms of w and b: M = |x⁺ − x⁻| = λ|w| = 2/√(w·w).
24. SUPPORT VECTOR MACHINES
Learning the maximum-margin classifier: since M = 2/√(w·w), maximizing the margin means minimizing w·w. What are the constraints?
- w · x_k + b ≥ +1 if y_k = +1
- w · x_k + b ≤ −1 if y_k = −1
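Collecting the pieces, the learning problem on this slide in standard form (just a restatement of the minimization and constraints above, with the margin derived on slide 23; the conventional factor 1/2 does not change the minimizer):

```latex
\min_{\mathbf{w},\,b}\ \tfrac{1}{2}\,\mathbf{w}\cdot\mathbf{w}
\quad\text{s.t.}\quad y_k\,(\mathbf{w}\cdot\mathbf{x}_k + b)\ \ge\ 1 \ \ \text{for all } k,
\qquad M = \frac{2}{\sqrt{\mathbf{w}\cdot\mathbf{w}}}
```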
25. SUPPORT VECTOR MACHINES
(A dataset that is not linearly separable.) This is going to be a problem! What should we do? Minimize w·w + C × (distance of error points to their correct place).
26. SUPPORT VECTOR MACHINES
Introduce a slack variable ε_k for each error point (the figure labels ε2, ε7, and ε11). Minimize w·w/2 + C Σ_k ε_k. Constraints?
- w · x_k + b ≥ +1 − ε_k if y_k = +1
- w · x_k + b ≤ −1 + ε_k if y_k = −1
- ε_k ≥ 0 for all k
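A minimal sketch of this soft-margin trade-off using scikit-learn (the two-blob dataset is hypothetical; the parameter C is exactly the error-penalty weight above: small C tolerates more slack, large C penalizes violations heavily):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Two overlapping Gaussian blobs: not perfectly separable, so slack is needed
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.array([-1] * 50 + [+1] * 50)

for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    # smaller C -> wider margin, more support vectors, more tolerated errors
    print(C, clf.n_support_, clf.score(X, y))
```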
27. SUPPORT VECTOR MACHINES
A harder, 1-dimensional dataset (no single threshold separates the classes). What can be done about this?
28. SUPPORT VECTOR MACHINES
Quadratic basis functions: map each point x to (x, x²). In the lifted feature space the harder 1-D dataset becomes linearly separable. (The x = 0 mark on the slide is the origin of the original axis.)
29. SUPPORT VECTOR MACHINES WITH KERNELS
Map the features through a basis expansion Φ: x → φ(x). Minimize w·w/2 + C Σ_k ε_k. Constraints?
- w · Φ(x_k) + b ≥ +1 − ε_k if y_k = +1
- w · Φ(x_k) + b ≤ −1 + ε_k if y_k = −1
- ε_k ≥ 0 for all k
30. SUPPORT VECTOR MACHINES: QUADRATIC PROGRAMMING
Maximize the dual objective (the standard form, reconstructed here since the slide's equations were figures): Σ_k α_k − ½ Σ_k Σ_l α_k α_l y_k y_l Φ(x_k)·Φ(x_l), subject to 0 ≤ α_k ≤ C and Σ_k α_k y_k = 0.
Then define w = Σ_k α_k y_k Φ(x_k). Datapoints with α_k > 0 will be the support vectors.
Then classify with f(x, w, b) = sign(w · Φ(x) − b).
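Because the QP involves the data only through dot products Φ(x_k)·Φ(x_l), a kernel K(x_k, x_l) can replace them without ever forming Φ explicitly. A sketch using scikit-learn on a 1-D dataset like slide 27's (the data are hypothetical; a degree-2 polynomial kernel plays the role of the quadratic basis functions of slide 28):

```python
import numpy as np
from sklearn.svm import SVC

# 1-D data where the middle band is one class: no linear threshold separates it,
# but a quadratic feature map (x, x^2) does
X = np.array([[-3.], [-2.], [-1.], [0.], [1.], [2.], [3.]])
y = np.array([+1, +1, -1, -1, -1, +1, +1])

clf = SVC(kernel="poly", degree=2, coef0=1, C=10.0).fit(X, y)
print(clf.predict([[-2.5], [0.5], [2.5]]))  # expect outer points +1, middle -1
print(clf.support_)                         # indices of support vectors (alpha_k > 0)
```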
31. MULTI-CLASS CLASSIFICATION (p = 3)
(Figure: pairwise decision boundaries separating data labeled A, B, and C into Class-A, Class-B, and Class-C regions.)
32. MULTIPLE FEATURES: HIERARCHICAL LIBRARIES (a.k.a. DIVISIVE CLUSTERING)
33. DYNAMIC MICROSTRUCTURE LIBRARY CONCEPTS
- The space of all possible microstructures is partitioned, using distance measures, into classes of microstructures (e.g., equiaxial grains).
- Hierarchical sub-classes (e.g., medium grains); expandable class partitions (retraining); new class partitions added when new classes are discovered.
- Dynamic representation: when a new microstructure is added, the axes used for representation are updated.
34. QUANTIFICATION OF DIVERSE MICROSTRUCTURES
A common framework for quantification of diverse microstructures:
- Qualitative representation: "equiaxed grains, grain size small".
- Lower-order descriptor approach: a grain size distribution (number of grains versus grain size number) within the space of equiaxial grain microstructures.
- Quantitative approach: the microstructure is represented by a set of numbers (e.g., 1.4, 2.6, 4.0, 0.9, …) in a representation space of all possible polyhedral microstructures.
35. BENEFITS
- A data-abstraction layer for describing microstructural information.
- An unbiased representation for comparing simulations and experiments, and for evaluating correlations between microstructure and properties.
- A self-organizing database of valuable microstructural information that can be associated with processes and properties.
- Data mining: process-sequence selection for obtaining desired properties; identification of multiple process paths leading to the same microstructure.
- Adaptive selection of a basis for reduced-order microstructural simulations.
- Hierarchical libraries for real-time 3D microstructure reconstruction by matching multiple lower-order features.
- Quality control: allows machine inspection and unambiguous quantitative specification of microstructures.
36. PRINCIPAL COMPONENT ANALYSIS
- Let I_1, …, I_n be n images.
- Vectorize the input images.
- Create an average image, and subtract it to generate the training images.
- Create the correlation matrix (L_mn).
- Find the eigenbasis (v_i) of the correlation matrix.
- Eigen-microstructures (u_i) are generated from the basis (v_i).
- Any new image can be transformed into its eigen-microstructure components through n coefficients (w_k).
The data points are thus represented by coefficients over a reduced basis.
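A minimal NumPy sketch of the procedure above, using the small n×n correlation-matrix trick familiar from eigenfaces (the function names and the choice of 5 components are illustrative, not from the original):

```python
import numpy as np

def pca_basis(images, n_components=5):
    """Eigen-microstructure basis via the small (n x n) correlation matrix."""
    X = np.stack([im.ravel().astype(float) for im in images])  # n x p
    mean = X.mean(axis=0)
    A = X - mean                       # training images (average removed)
    L = A @ A.T                        # n x n correlation matrix
    evals, V = np.linalg.eigh(L)       # eigenbasis of L
    order = np.argsort(evals)[::-1][:n_components]
    U = A.T @ V[:, order]              # p x k eigen-microstructures
    U /= np.linalg.norm(U, axis=0)     # normalize each basis column
    return mean, U

def project(image, mean, U):
    """Representation coefficients w_k of a new image over the reduced basis."""
    return U.T @ (image.ravel().astype(float) - mean)
```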
37. REQUIREMENTS OF A REPRESENTATION SCHEME
A set of numbers that completely represents a microstructure within its class (e.g., 2.7, 3.6, 1.2, 0.1, … versus 8.4, 2.1, 5.7, 1.9, …) and that differentiates it from other cases (i.e., is statistically representative). We need a technique that is autonomous, applicable to a variety of microstructures, computationally feasible, and provides complete representation.
38. PCA REPRESENTATION OF MICROSTRUCTURE: AN EXAMPLE
Input microstructures are quantified by representation coefficients (× 0.001) over the eigen-microstructures (basis 1 through basis 5). For example, image 1 is quantified by 5 coefficients over the eigen-microstructures.

Image   | Basis 1 | Basis 2 | Basis 3 | Basis 4 | Basis 5
Image-1 |  0.0125 |  1.3142 | −4.23   |  4.5429 | −1.6396
Image-2 | −0.8406 |  0.8463 | −3.0232 |  0.3424 |  2.6752
Image-3 |  3.943  | −4.2162 | −0.6817 | −0.9718 |  1.9268
Image-4 |  1.1796 | −1.3354 | −2.8401 |  6.2064 | −3.2106
Image-5 |  5.8294 |  5.2287 | −3.7972 | −3.6095 | −3.6515
39. EIGENVALUES AND RECONSTRUCTION OVER THE BASIS
Significant eigenvalues capture most of the image features. Reconstruction of microstructures over fractions of the basis:
1. Reconstruction with 100% of the basis
2. Reconstruction with 80% of the basis
3. Reconstruction with 60% of the basis
4. Reconstruction with 40% of the basis
40. INCREMENTAL PCA METHOD
- Used to update the representation basis in real time as new microstructures are added.
- The basis update is driven by an error measure between the original microstructure and its reconstruction over the existing basis.
IPCA: given the eigenbasis for 9 microstructures, the update in the basis for the 10th microstructure is based on a PCA of 10 × 1 coefficient vectors instead of 16384 × 1 microstructure vectors.
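The authors' incremental update is described only at a high level here, so the following is an assumption-laden sketch: scikit-learn's IncrementalPCA provides the same kind of real-time basis refresh, and the reconstruction error can drive the decision to update:

```python
import numpy as np
from sklearn.decomposition import IncrementalPCA

# Maintain a 10-component reduced basis for vectorized microstructure images.
ipca = IncrementalPCA(n_components=10)

def add_batch(images):
    """Update the basis with a new batch (each batch needs >= n_components images)."""
    X = np.stack([im.ravel() for im in images])
    ipca.partial_fit(X)

def reconstruction_error(image):
    """Relative error of the image reconstructed over the current basis."""
    x = image.ravel()[None, :]
    x_hat = ipca.inverse_transform(ipca.transform(x))
    return np.linalg.norm(x - x_hat) / np.linalg.norm(x)
```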
41. ROSE OF INTERSECTIONS: FEATURE ALGORITHM
(Saltykov, 1974)
- Identify the intercepts of lines with grain boundaries, plotted within a circular domain.
- Count the number of intercepts over several lines placed at various angles.
- The total number of intercepts of lines at each angle is given as a polar plot, called the rose of intersections.
42. GRAIN SHAPE FEATURE EXAMPLES
43. GRAIN SIZE PARAMETER
Several lines are superimposed on the microstructure, and the intercept lengths of the lines with the grain boundaries are recorded (Vander Voort, 1993). The histogram of intercept length (x-axis) versus number of lines (y-axis) is used as the measure of grain size.
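A sketch of this intercept-length feature on a binary grain-boundary image (horizontal scan lines only, for brevity; the original uses lines at several angles, and the function name and bin count are illustrative):

```python
import numpy as np

def intercept_histogram(boundary, n_lines=64, bins=20):
    """Grain-size feature: histogram of intercept lengths along scan lines
    of a binary grain-boundary image (True = boundary pixel)."""
    rows = np.linspace(0, boundary.shape[0] - 1, n_lines).astype(int)
    lengths = []
    for r in rows:
        cuts = np.flatnonzero(boundary[r])    # boundary crossings on this line
        if len(cuts) > 1:
            lengths.extend(np.diff(cuts))     # intercept lengths between crossings
    hist, _ = np.histogram(lengths, bins=bins, range=(0, boundary.shape[1]))
    return hist  # feature vector: number of intercepts per length bin
```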
44. GRAIN SIZE FEATURE EXAMPLES
45. SVM TRAINING FORMAT
Grain features given as input to the SVM training algorithm (one data point per row):

Class | Feature number | Feature value | Feature number | Feature value
1     | 1              | 23.32         | 2              | 21.52
2     | 1              | 24.12         | 2              | 31.52

Classification success:

Total images | Number of classes | Training images | Highest success rate | Average success rate
375          | 11                | 40              | 95.82%               | 92.53%
375          | 11                | 100              | 98.54%               | 95.80%
46. CLASS HIERARCHY
Level 1: grain shapes (Class 1, Class 2).
Level 2: subclasses based on grain sizes (Class 1(a), 1(b), 1(c); Class 2(a), 2(b), 2(c)).
New classes are detected from the distance of an image's feature vector from the average feature vector of each class.
47. IPCA QUANTIFICATION WITHIN CLASSES
The library stores, per class, the quantification and image representation. Example classes: Class-i microstructures (elongated 45 degrees, small grain size) and Class-j microstructures (equiaxial grains, medium grain size).

Representation matrix (components in the basis vectors):
Basis vector  | Image-1 | Image-2 | Image-3
1             | 123     | 23      | 38
2             | 91      | 54      | −85
3             | −54     | 90      | 12
Average image | 21      | 23      | 24

Eigenbasis stored as vectors (e.g., 0.9, 0.84, 0.23, …; 0.54, 0.21, 0.74, …).
48. REPRESENTATION FORMAT FOR MICROSTRUCTURE
Example record: Date 1/12, 02:23 PM, basis updated. Shape class 3 (oriented 40 degrees, elongated); size class 1 (large grains); coefficients in the basis: 2.42, 12.35, −4.14, 1.95, 1.96, −1.25.
Classification improves microstructure representation: compare a reconstruction with 6 coefficients (24% of the basis) in a class with 25 images against a reconstruction with the same 6 coefficients (only 10% of the basis) in a class of 60 images, and against the original image and a reconstruction over 15 coefficients.
49. A DYNAMIC LIBRARY APPROACH
- Classify microstructures based on lower-order descriptors.
- Create a common basis for representing images in each class at the last level of the class hierarchy.
- Represent 3D microstructures as coefficients over a reduced basis in the base classes.
- Dynamically update the basis and the representation as new microstructures arrive.
(Plot annotation: the eigenvalue spectrum does not decay to zero.)
A COMMON BASIS FOR MICROSTRUCTURE REPRESENTATION
50. Basis components: project a microstructure onto the basis and reconstruct it using two basis components with coefficients 5.89 and 14.86, followed by pixel-value round-off. Representation using just 2 coefficients: (5.89, 14.86).
51. Creation of 3D microstructure models from 2D images
- 3D imaging requires time and effort; we need real-time methodologies for generating 3D realizations.
- Make intelligent use of available information from computational models and experiments.
Flow: 2D imaging techniques → pattern recognition / vision → database → microstructure analysis.
52. Available methods are optimization-based: features of a 2D image are matched to those of a 3D microstructure by posing an optimization problem. Drawbacks: (1) they do not make use of available information (experimental/simulated data); (2) they cannot perform reconstructions in real time. One needs to take into account the processes that create these microstructures (Oren and Bakke, 2003) to correctly model the geometric connectivity.
Key assumptions employed for 3D image reconstruction from a single 2D image:
- Randomness assumption (Ohser and Mücklich, 2000).
- Grains in a polyhedral microstructure are assumed to be of similar shape but of different sizes.
- Two-phase microstructures can be characterized using rotationally-invariant probability functions.
53. PATTERN RECOGNITION: A DATA-DRIVEN OPTIMIZATION TOOL
Real-time feature matching for the reconstruction of 3D microstructures:
- DATABASE CREATION: datasets of microstructures from experiments or physical models.
- FEATURE EXTRACTION: extraction of statistical features from the database.
- TRAINING: creation of a microstructure class hierarchy using classification methods.
- PREDICTION: prediction of 3D reconstructions, process paths, etc.
54. Algorithm (one Monte Carlo step):
- Compute the free energy of a randomly selected node (H_i).
- Randomly choose a new crystallographic orientation for the node.
- Recompute the free energy of the element (H_f).
- The orientation that minimizes the energy (min(H_f, H_i)) is retained.
Potts Hamiltonian (standard form, consistent with the slide's definitions): H = Σ_{i=1}^{N_s} Σ_{j=1}^{N_n(i)} (1 − δ(q_i, q_j)), where N_s is the total number of nodes and N_n(i) the number of neighbors of node i.
The resulting microstructure database is organized into classes based on the grain-size feature.
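A minimal sketch of one such Monte Carlo step on a voxelized orientation field (the zero-temperature accept rule follows the list above; the 6-neighbor Potts energy and periodic boundaries are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def potts_energy(spins, i, j, k):
    """Local Potts energy of voxel (i,j,k): one unit per unlike 6-neighbor."""
    s, (I, J, K) = spins[i, j, k], spins.shape
    nbrs = [spins[(i + d) % I, j, k] for d in (-1, 1)] + \
           [spins[i, (j + d) % J, k] for d in (-1, 1)] + \
           [spins[i, j, (k + d) % K] for d in (-1, 1)]
    return sum(1 for q in nbrs if q != s)

def mc_step(spins, n_orient):
    """One MC step: flip a random voxel to a random new orientation,
    keeping the flip only if the local energy does not increase."""
    i, j, k = (rng.integers(n) for n in spins.shape)
    h_i = potts_energy(spins, i, j, k)
    old = spins[i, j, k]
    spins[i, j, k] = rng.integers(n_orient)  # trial orientation
    h_f = potts_energy(spins, i, j, k)
    if h_f > h_i:                            # keep the lower-energy state
        spins[i, j, k] = old
```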
55. Slice: the intercept lengths of a parallel network of lines with the grain boundaries are recorded at several angles. The histogram of intercept length (x-axis) versus number of lines (y-axis) is the measure of grain size (the Heyn intercept histogram).
56. FEATURE-BASED CLASSIFICATION
3D microstructures are classified in two levels: Level 1 uses the rose of intersections (Class 1, Class 2), and Level 2 uses the Heyn intercept histogram (Class 1, Class 2, Class 3, Class 4).
57. RECONSTRUCTION OF POLYHEDRAL MICROSTRUCTURE
Polarized-light micrographs of aluminum alloy AA3002 representing the rolling plane (Wittridge and Knutsen, 1999); a reconstructed 3D image; comparison of the average feature of the 3D class with that of the 2D image.
58. The stereological integral equation estimates the 3D grain-size distribution from a 2D image for polyhedral microstructures.
- N_a, F_a(s): density of grains and grain-size distribution in the 2D image.
- N_v, F_v(u): density of grains and grain-size distribution in the 3D microstructure.
- The rotation average of the size of a particle with maximum size 1.
- G_u(s): size distribution function of the section profiles under the condition that a random size U equals the 3D particle mean size (u).
Remark: sizes are defined as the maximum calliper diameter of a grain.
59. STEREOLOGICAL DISTRIBUTIONS (GEOMETRICAL)
(Figure: a 2D grain profile versus a 3D grain, and a 3D reconstruction.) N_a, F_a(s): density of grains and grain-size distribution in the 2D image; N_v, F_v(u): density of grains and grain-size distribution in the 3D microstructure.
60. Rotationally invariant probability functions (S_N^i) can be interpreted as the probability of finding the N vertices of a polyhedron, separated by relative distances x1, x2, …, xN, in phase i when the polyhedron is tossed into the microstructure without regard to orientation.
MC sampling: computing the three-point probability function S3(r, s, t) of a 3D microstructure (40×40×40 microns) with r = s = t = 2; 5000 initial points, 4 samples at each initial point.
61. SOLIDIFICATION MODEL
- The microstructure is represented using voxels.
- The probability of solidification (P) depends on:
  1) the net weight (w) of the number of neighbors of a solid voxel: if w > 8.6568 the voxel solidifies (P = 1); if 3.8284 < w < 8.6568, P = 0.1; if w < 3.8284, the voxel remains liquid (P = 0);
  2) the solute concentration: a linear probability distribution with P = 0 at the critical concentration and P = 1 at zero concentration.
When a voxel solidifies, liquid is expelled to its neighbors, creating solute-concentration (c_{i,j,k}) gradients. The movement of solute to minimize concentration gradients is modeled using Fick's law, where (i,j,k) is a voxel coordinate, n is the time step, and D is the diffusion coefficient. The final state is the resulting two-phase microstructure.
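The discrete Fick's-law update itself was a figure on the original slide; a plausible explicit form, assuming a forward-Euler step over the six face-neighbors N(i,j,k) of each voxel:

```latex
c^{\,n+1}_{i,j,k} \;=\; c^{\,n}_{i,j,k} \;+\; D\,\Delta t
\sum_{(i',j',k')\,\in\,\mathcal{N}(i,j,k)} \bigl( c^{\,n}_{i',j',k'} - c^{\,n}_{i,j,k} \bigr)
```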
62. TWO-PHASE MICROSTRUCTURE CLASS HIERARCHY
Level 1 feature: the autocorrelation function (plotted as g versus distance r), separating the 3D microstructures into Class 1 and Class 2. Level 2 feature vector: the three-point probability function.
63. EXAMPLE: 3D RECONSTRUCTION USING SVMs
Ag-W composite (Umekawa, 1969) and a reconstructed 3D microstructure; features: the autocorrelation function and the three-point probability function.
64. MICROSTRUCTURE ELASTIC PROPERTIES
Comparison of the experimental image and the 3D image derived through pattern recognition.
65. WHAT IS MICROSTRUCTURE DESIGN?
Direct problem: known operating conditions + initial microstructure → property? (Use finite elements, experiments, etc.)
Design problems:
- Design for best processes: given an initial microstructure and a final microstructure/property, what is the processing sequence?
- Design for best microstructure: given known operating conditions and known property limits, what is the microstructure?
66. SUPERVISED VS. UNSUPERVISED LEARNING
Supervised classification for design:
- Classify microstructures based on known process-sequence classes.
- Given a desired microstructure, identify the required processing stages through classification.
- Drawback: this identifies a unique process sequence, but we find that many processing paths lead to similar properties!
Unsupervised classification:
- Identify classes purely based on structural attributes.
- Associate processes and properties through databases.
- Explores the structural-attribute space for similarities and unearths non-unique processing paths leading to similar microstructural properties.
67. K-MEANS
Suppose the coordinates of points drawn randomly from this dataset are transmitted, and you can install decoding software at the receiver. You're only allowed to send two bits per point, so it'll have to be a lossy transmission. Loss = sum-squared error between the decoded coordinates and the original coordinates. What encoder/decoder will lose the least information?
68. K-MEANS
Idea one: break the plane into a 2×2 grid and decode each bit-pair (00, 01, 10, 11) as the middle of its grid cell.
Questions:
- What are we trying to optimize?
- Are we sure it will find an optimal clustering?
A better idea: break into a grid and decode each bit-pair as the centroid of all data in that grid cell.
69. K-MEANS
Find the cluster centers c_1, c_2, …, c_k such that the sum of the squared 2-norm distances between each feature x_i, i = 1,…,n, and its nearest cluster center c_h is minimized. Cost function (minimized by transmitting centroids):
J = Σ_{i=1}^{n} min_h ‖x_i − c_h‖²
70. THE EXPECTATION-MAXIMIZATION (EM) ALGORITHM
- What can be changed about the centers c_1, c_2, …, c_k when the distortion is not minimized?
- Expectation step: compute expected centers, i.e., change the encoding so that each x_i is encoded by its nearest center.
- Maximization step: compute maximum-likelihood values of the centers, i.e., set each center to the centroid of the points it owns.
- There's no point applying either operation twice in succession, but it can be profitable to alternate. And that's k-means!
(The EM algorithm proper will be dealt with later.)
71-75. K-MEANS (the algorithm, built up over slides 71-75)
- Ask the user how many clusters they'd like (e.g., k = 5).
- Randomly guess k cluster-center locations.
- Each datapoint finds out which center it's closest to (thus each center "owns" a set of datapoints).
- Each center finds the centroid of the points it owns… and jumps there.
- Repeat until terminated!
Note: k is often unknown (it depends on the features used for microstructure representation).
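A plain-NumPy sketch of the loop just described (Lloyd's algorithm; initializing by sampling k data points is one common choice, not prescribed by the slides):

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Lloyd's algorithm: X is (n, d); returns cluster centers and labels."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]  # random initial centers
    for _ in range(n_iter):
        # E-step: each point is owned by its nearest center
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(axis=1)
        # M-step: move each center to the centroid of the points it owns
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):   # converged: centers stopped moving
            break
        centers = new
    return centers, labels
```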
76. SHORTCOMINGS OF K-MEANS AND REMEDIES
- K-means gives hyper-spherical clusters, which is not always the case with data.
- The number of classes must be known a priori, which defeats the purpose of unsupervised clustering: we do not know anything about the classes in the data.
- It may converge to local optima (not so bad).
We will discuss new strategies for obtaining improved clusters of microstructural features: Gaussian mixture models and Bayesian clustering, and later an improved k-means algorithm called x-means, which uses a Bayesian information criterion.
77. PROBABILITY PRELIMINARIES
- A is a Boolean-valued random variable if A denotes an event and there is some degree of uncertainty as to whether A occurs.
- Examples: A = "you win the toss"; A = "a structure fails".
Discrete random variables:
0 ≤ P(A) ≤ 1; P(True) = 1; P(False) = 0
P(A or B) = P(A) + P(B) − P(A and B)
P(A) + P(¬A) = 1
P(B) = P(B, A) + P(B, ¬A)
78. PROBABILITY PRELIMINARIES
Definition of conditional probability: P(A|B) = P(A, B) / P(B).
Corollary, the chain rule: P(A, B) = P(A|B) P(B).
Bayes' rule: P(B|A) = P(A, B) / P(A) = P(A|B) P(B) / P(A).
79. PROBABILITY PRELIMINARIES
- MLE (maximum likelihood estimator): class of data = argmax_i P(data | class_i). But what if Y = v itself is very unlikely?
- MAP (maximum a-posteriori estimator): class of data = argmax_i P(class_i | data). Includes the P(Y = v) information through Bayes' rule (P(Y = v) is called the prior).
80. PROBABILITY PRELIMINARIES
MAP (maximum a-posteriori estimator), continued: class of data = argmax_i P(class_i | data) = argmax_i P(data | class_i) P(class_i).
81. PROBABILITY PRELIMINARIES
Bayes classifiers in a nutshell:
1. Learn the distribution over inputs for each value of Y.
2. This gives P(X1, X2, …, Xm | Y = v_i).
3. Estimate P(Y = v_i) as the fraction of records with Y = v_i.
4. For a new prediction: Y_pred = argmax_v P(Y = v | X1, …, Xm) = argmax_v P(X1, …, Xm | Y = v) P(Y = v).
82. NAÏVE BAYES CLASSIFIER
In the case of the naïve Bayes classifier, this can be simplified using the independent-features assumption: P(X1, …, Xm | Y) = Π_j P(Xj | Y).
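A minimal sketch with scikit-learn's Gaussian naïve Bayes (the two features and the tiny dataset are hypothetical, echoing the strength example of slide 15):

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Features: [pore density, volume fraction]; labels: 0 = low, 1 = high strength
X = np.array([[0.12, 0.30], [0.15, 0.28], [0.02, 0.55], [0.03, 0.60]])
y = np.array([0, 0, 1, 1])

clf = GaussianNB().fit(X, y)            # per-class mean/variance for each feature
print(clf.predict([[0.05, 0.50]]))      # expected: class 1 (high strength)
print(clf.predict_proba([[0.05, 0.50]]))
```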
83. IS THE NAÏVE BAYES CLASSIFIER AN SVM?
The naïve Bayes classifier, after a notation change, can be rewritten as a new Bayes classifier.
84. IS THE NAÏVE BAYES CLASSIFIER AN SVM?
Bayes classifier with feature weighting: a two-class classifier whose decision is given by the sign of f_WBC, a weighted combination of the per-feature terms with weights w_j. Setting w_j = 1 recovers naïve Bayes; but features may be correlated!
85. IS THE NAÏVE BAYES CLASSIFIER AN SVM?
Learning the weights in the feature space of a naïve Bayes classifier with a maximum-margin criterion yields an SVM classifier!
86. INTRO TO BAYESIAN UNSUPERVISED CLASSIFICATION
Gaussian mixture models: assume that each feature vector is generated as follows. Pick a class at random, choosing class i with probability P(w_i); then sample the feature from a Gaussian distribution N(µ_i, Σ_i).
87. GAUSSIAN MIXTURE MODEL
A probabilistic extension of k-means:
- There are k components; the i-th component is called y_i.
- Component y_i has an associated mean vector µ_i.
- Each component generates data from a Gaussian with mean µ_i and covariance matrix Σ_i.
Assuming the features in each class can be modeled by a Gaussian distribution, identify the parameters (means, variances, etc.) of the distributions.
88. GAUSSIAN MIXTURE MODEL
- We have features x_1, x_2, …, x_n of a microstructure.
- We have P(y_1), …, P(y_k), and we have the σ's.
- We can define, for any x, P(x | y_i, µ_1, µ_2, …, µ_k).
- Can we define P(x | µ_1, µ_2, …, µ_k)? (Yes: sum over classes, Σ_i P(x | y_i, µ_i) P(y_i).)
- Can we define P(x_1, x_2, …, x_n | µ_1, µ_2, …, µ_k)? (Yes: for independent draws, the product over i.)
89. GAUSSIAN MIXTURE MODEL
Given a guess at µ_1, µ_2, …, µ_k, we can obtain the probability of the unlabeled data given those µ's. Inverse problem: find the µ's given the points x_1, x_2, …, x_n.
The normal max-likelihood trick: set d log Prob(·)/dµ_i = 0 and solve for the µ_i's. Using gradient descent is slow but doable; instead we use a much faster and recently very popular method: EM.
90. THE EM ALGORITHM REVISITED
- We have unlabeled microstructural features x_1, x_2, …, x_R.
- We know there are k classes, and we know P(y_1), P(y_2), …, P(y_k).
- We don't know µ_1, µ_2, …, µ_k.
- We can write P(data | µ_1, …, µ_k) and maximize this likelihood.
91. GAUSSIAN MIXTURE MODEL
Maximizing the likelihood gives n nonlinear equations in the µ_j's. If, for each x_i, we knew the probability P(y_j | x_i, µ_1, …, µ_k) that it belongs to class y_j, then we could easily compute µ_j. Conversely, if we knew each µ_j, then we could easily compute P(y_j | x_i, µ_1, …, µ_k) for each y_j and x_i. EM alternates between these two half-solved problems.
92. GAUSSIAN MIXTURE MODEL
Iterate. On the t-th iteration let our estimates be µ_1(t), µ_2(t), …, µ_c(t).
E-step: compute the expected class memberships of all datapoints for each class, P(y_j | x_k, µ(t)); this is just a Gaussian evaluated at x_k, weighted by the class prior and normalized.
M-step: compute the maximum-likelihood µ given our data's class-membership distributions, µ_j(t+1) = Σ_k P(y_j | x_k) x_k / Σ_k P(y_j | x_k).
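A compact sketch of the E- and M-steps above for a full-covariance Gaussian mixture (the initialization and the small regularization term are implementation choices, not from the slides):

```python
import numpy as np
from scipy.stats import multivariate_normal

def gmm_em(X, k, n_iter=50, seed=0):
    """EM for a Gaussian mixture: returns means, covariances, weights, responsibilities."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mu = X[rng.choice(n, k, replace=False)]          # initial means
    cov = [np.cov(X.T) + 1e-6 * np.eye(d)] * k       # initial covariances
    w = np.full(k, 1.0 / k)                          # mixing weights P(y_j)
    for _ in range(n_iter):
        # E-step: responsibilities P(y_j | x_i) via Bayes rule
        R = np.column_stack([w[j] * multivariate_normal.pdf(X, mu[j], cov[j])
                             for j in range(k)])
        R = np.clip(R, 1e-300, None)                 # guard against underflow
        R /= R.sum(axis=1, keepdims=True)
        # M-step: maximum-likelihood update of weights, means, covariances
        Nj = R.sum(axis=0)
        w = Nj / n
        mu = (R.T @ X) / Nj[:, None]
        cov = [((R[:, j, None] * (X - mu[j])).T @ (X - mu[j])) / Nj[j]
               + 1e-6 * np.eye(d) for j in range(k)]
    return mu, cov, w, R
```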
93. GAUSSIAN MIXTURE MODEL: DENSITY ESTIMATION
A mixture captures a complex PDF of the feature space (illustrated with features in 2D). Uses: classification, probabilistic quantification of results, ambiguity and anomaly detection. Very popular in genome mapping.
94. DATABASE FOR POLYCRYSTAL MICROSTRUCTURES
Multi-scale microstructure evolution models feed a meso-scale database. Components: feature extraction → statistical learning (driven by distance-based or probabilistic clustering) → divisive clustering → class hierarchies → class prediction.
95-96. DATABASE FOR POLYCRYSTAL MICROSTRUCTURES
(The same pipeline, with successive components highlighted.)
97. ORIENTATION DISTRIBUTION FUNCTION
The orientation distribution function (ODF), A(r, t):
- Determines the volume fraction of crystals within a region R′ of the fundamental region R.
- Gives the probability of finding a crystal orientation within a region R′ of the fundamental region.
- Characterizes texture evolution.
ODF evolution equation (Eulerian description): conservation of the ODF under the reorientation velocity.
Any macroscale property ⟨χ⟩ can be expressed as an expectation value if the corresponding single-crystal property χ(r, t) is known.
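Written out, the two relations referenced above (standard forms consistent with the slide's definitions; v(r, t) denotes the reorientation velocity, and the evolution equation is the usual conservation statement for the ODF):

```latex
\frac{\partial A(\mathbf{r},t)}{\partial t} + \nabla\cdot\bigl(A(\mathbf{r},t)\,\mathbf{v}(\mathbf{r},t)\bigr) = 0,
\qquad
\langle \chi \rangle(t) = \int_{R} \chi(\mathbf{r},t)\,A(\mathbf{r},t)\,dv
```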
98. FEATURES OF AN ODF: ORIENTATION FIBERS
For a particular crystal direction h, the pole figure takes values P(h, y) at locations y on a unit sphere (e.g., the {1,2,3} pole figure at the sample-axis point y = (1,0,1)). P(h, y) is obtained by integrating over all points r of the (h, y) fiber in the fundamental region, i.e., over all orientations for which crystal direction h is aligned with sample direction y. Example fiber: h = (1,2,3), y = (1,0,1).
99. SIGNIFICANCE OF ORIENTATION FIBERS
Important fiber families: <110> for uniaxial compression, plane-strain compression, and simple shear; <111> for torsion; <100> and <411> fibers for tension; the α fiber (ND <110>) and β fiber for FCC metals under plane-strain compression. Fibers have a close affiliation with processes, and fiber development is predictable: a uniaxial (z-axis) compression texture produces the z-axis <110> (B-B), z-axis <111> (C-C), and z-axis <100> (A-A) fibers.
100. LIBRARY FOR TEXTURES
Uniaxial (z-axis) compression texture: the feature is the <110> fiber family, i.e., the fiber path q corresponding to crystal direction h and sample direction y (the z-axis <110> fiber, B-B).
101. SUPERVISED CLASSIFICATION USING SUPPORT VECTOR MACHINES
Multi-stage classification, with each class affiliated with a unique process (e.g., stage 1: tension (T); stage 2; stage 3). Given an ODF/texture, this identifies a unique processing sequence, but it fails to capture the non-uniqueness in the solution.
102. UNSUPERVISED CLASSIFICATION
Find the cluster centers C_1, C_2, …, C_k such that the sum of the squared 2-norm distances between each feature x_i, i = 1,…,n, and its nearest cluster center C_h is minimized (the k-means cost function). Identify clusters in the feature space of the database of ODFs; each class is affiliated with multiple processes.
103. ODF CLASSIFICATION
- Automatic class discovery without class labels.
- A hierarchical classification model.
- Association of classes with processes, to facilitate data mining.
- Can be used to identify multiple process routes for obtaining a desired ODF: e.g., ODFs 2, 12, 32, and 97 fall in one class, so one ODF corresponds to several process paths. Data mining for process information with ODF classification.
104. PROCESS PARAMETERS LEADING TO DESIRED PROPERTIES
ODF classification over the database of ODFs, followed by property extraction, identifies multiple solutions (velocity gradients): different processes, similar properties.
105. K-MEANS ALGORITHM FOR UNSUPERVISED CLASSIFICATION
Lloyd's algorithm:
1. Start with k randomly initialized centers.
2. Change the encoding so that each x_i is owned by its nearest center.
3. Reset each center to the centroid of the points it owns.
4. Alternate steps 2 and 3 until converged.
The user needs to provide k, the number of clusters; but the number of clusters is unknown for the texture classification problem.
106. SCHWARZ CRITERION FOR IDENTIFYING THE NUMBER OF CLUSTERS
Compute the maximum-likelihood estimate of the variance assuming a Gaussian data distribution, the probability of a point belonging to cluster i, and the log-likelihood of the data in a cluster; the Schwarz (Bayesian information) criterion penalizes this likelihood by the number of free parameters (a sketch follows slide 107).
107. CENTROID SPLIT TESTS
X-means algorithm:
- Start with the k clusters found through the k-means algorithm.
- Split each centroid into two centroids, and move the new centroids a distance proportional to the cluster size in an arbitrarily chosen direction.
- Run local k-means (k = 2) in each cluster.
- Accept the split in a region if BIC(k = 1) < BIC(k = 2) for that region.
- Test various initial values of k and select the k with the maximum overall BIC.
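A sketch of the BIC score of slide 106 and the split test above, under the x-means-style spherical-Gaussian assumption (the function names are illustrative):

```python
import numpy as np

def bic_kmeans(X, centers, labels):
    """BIC for a k-means clustering viewed as a spherical Gaussian mixture
    (higher BIC = better model)."""
    n, d = X.shape
    k = len(centers)
    # shared spherical variance, maximum-likelihood estimate
    var = max(((X - centers[labels]) ** 2).sum() / (n * d), 1e-12)
    # log-likelihood: each point drawn from its cluster's Gaussian, size-weighted
    ll = sum((labels == j).sum() * np.log((labels == j).sum() / n)
             for j in range(k) if (labels == j).sum())
    ll += -0.5 * n * d * np.log(2 * np.pi * var) \
          - ((X - centers[labels]) ** 2).sum() / (2 * var)
    n_params = k * d + (k - 1) + 1          # means + weights + shared variance
    return ll - 0.5 * n_params * np.log(n)

def accept_split(X_cluster, c_parent, centers2, labels2):
    """Split test: keep two children if they score a higher BIC than the parent."""
    one = bic_kmeans(X_cluster, c_parent[None, :], np.zeros(len(X_cluster), int))
    two = bic_kmeans(X_cluster, centers2, labels2)
    return two > one
```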
108. COMPARISON OF K-MEANS AND X-MEANS
- A local optimum produced by the k-means algorithm with k = 4.
- A cluster configuration produced by k-means with k = 6 over-estimates the natural number of clusters.
- The configuration produced by the x-means algorithm (input range of k: 2 to 15): x-means found 4 clusters in the data set based on the Bayesian information criterion.
109. MULTIPLE PROCESS ROUTES (found via classification)
Desired Young's modulus distribution:
- Route 1: stage 1, tension (α = 0.9495); stage 2, rotation-1 (α = −0.2408).
- Route 2: stage 1, tension (α = 0.9699); stage 2, shear-1 (α = 0.3384).
Desired magnetic hysteresis-loss distribution:
- Route 1: stage 1, shear-1 (α = 0.9580); stage 2, plane-strain compression (α = −0.1597).
- Route 2: stage 1, shear-1 (α = 0.9454); stage 2, rotation-1 (α = −0.2748).
110. LIMITATIONS OF STATISTICAL-LEARNING-BASED DESIGN SOLUTIONS
- Classification alone does not yield the final design solution. Why? Because it is impossible to explore the infinite design space within a database of reasonable size.
- Use statistical learning to provide an initial class of solutions.
- Then use local optimization schemes (details not given in this presentation) to identify the exact solutions.
(Figure: a response surface of the objective to be minimized over microstructure attributes, with the statistical-learning design solutions as starting points.)
111. DESIGN FOR A DESIRED ODF: A MULTI-STAGE PROBLEM
Optimal reduced-order control for a desired ODF: stage 1, plane-strain compression (α1 = 0.9472); stage 2, compression (α2 = −0.2847); initial guess α1 = 0.65, α2 = −0.1. The full-order ODF is then computed from the reduced-order control parameters.
112. DESIGN FOR A DESIRED MAGNETIC PROPERTY
The crystal <100> direction is the easy direction of magnetization (zero power loss) relative to the external magnetization direction h. Solution: stage 1, shear-1 (α1 = 0.9745); stage 2, tension (α2 = 0.4821).
113. DESIGN FOR A DESIRED YOUNG'S MODULUS
Given the stiffness of FCC Cu in the crystal frame, the elastic modulus is found using the polycrystal average ⟨C⟩ over the ODF. Solution: stage 1, shear (α1 = −0.03579); stage 2, tension (α2 = 0.17339).
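The polycrystal average referenced above has the same expectation-value form as the property average on slide 97:

```latex
\langle \mathbf{C} \rangle(t) \;=\; \int_{R} \mathbf{C}(\mathbf{r})\,A(\mathbf{r},t)\,dv
```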
114. WHAT WE SHOULD KNOW
Appreciate the uses, and understand the limitations, of statistical learning applied to materials:
- How to learn microstructure/process/property relationships given computational and experimental data.
- Be comfortable with probabilistic tools: Bayesian analytics and Gaussian mixture models.
- Understand simple tools, like k-means, that can be readily used.
- Understand SVMs as a versatile statistical learning tool for both feature selection and classification.
- Apply statistical learning to make real-time decisions under high degrees of uncertainty.
115. USEFUL REFERENCES
- Andrew Moore's statistical learning course online: http://www-2.cs.cmu.edu/~awm/tutorials/
- Books:
  - R.O. Duda, P.E. Hart and D.G. Stork, Pattern Classification (2nd ed.), John Wiley and Sons, New York (2001).
- Example papers on microstructure/materials-related applications of the tools presented in this talk:
  - V. Sundararaghavan and N. Zabaras, "A dynamic material library for the representation of single-phase polyhedral microstructures", Acta Materialia, Vol. 52/14, pp. 4111-4119, 2004.
  - V. Sundararaghavan and N. Zabaras, "Classification of three-dimensional microstructures using support vector machines", Computational Materials Science, Vol. 32, pp. 223-239, 2005.
  - V. Sundararaghavan and N. Zabaras, "On the synergy between classification of textures and deformation process sequence selection", Acta Materialia, Vol. 53/4, pp. 1015-1027, 2005.
  - T.J. Sabin, C.A.L. Bailer-Jones and P.J. Withers, "Accelerated learning using Gaussian process models to predict static recrystallization in an Al-Mg alloy", Modelling Simul. Mater. Sci. Eng., 8 (2000), pp. 687-706.
  - C.A.L. Bailer-Jones, H.K.D.H. Bhadeshia and D.J.C. MacKay, "Gaussian Process Modelling of Austenite Formation in Steel", Materials Science and Technology, Vol. 15, 1999, pp. 287-294.
116. THANK YOU