COMP 538: Introduction to Bayesian Networks
Lecture 16: Wrap-Up
Recap
- Latent class models
  - Clustering
  - Clustering criterion: conditional independence
  - Drawback: assumption too strong
- Hierarchical latent class (HLC) models
  - Identifiability issues: regularity, equivalence
  - Hill-climbing algorithm
Today
- Phylogenetic (evolution) trees
  - Closely related to HLC models
  - An example of viewing existing models in the framework of BNs
  - Another example: HMMs
- Interesting because
  - Eases understanding
  - Techniques from one field can be applied to another
    - Structural EM for phylogenetic trees
    - Dynamic BNs for speech understanding
  - Development of general-purpose algorithms
- Bayesian networks for classification
  - Hand waving only
Phylogenetic Trees: Outline
- Introduction to phylogenetic trees
- Probabilistic models of evolution
- Tree reconstruction
Phylogenetic Trees
- Assumption
  - All organisms on Earth have a common ancestor
  - This implies that any set of species is related.
- Phylogeny
  - The relationship between any set of species.
- Phylogenetic tree
  - Usually the relationship can be represented by a tree, which is called a phylogenetic (evolution) tree; this is not always true.
Phylogenetic Trees
(Figure: example phylogenetic tree, with current-day species at the bottom.)
Phylogenetic Trees
- Taxa (sequences) identify species
- Edge lengths represent evolution time
- Assumption: bifurcating tree topology
Probabilistic Models of Evolution
- Characterize the relationship between taxa using substitution probabilities
  - P(x | y, t): probability that an ancestral sequence y evolves into sequence x along an edge of length t
  - P(X7), P(X5 | X7, t5), P(X6 | X7, t6), P(S1 | X5, t1), P(S2 | X5, t2), ...
Probabilistic Models of Evolution
- What should P(x | y, t) be?
- Two assumptions of commonly used models
  - There are only substitutions, no insertions/deletions (sequences are aligned): one-to-one correspondence between sites in different sequences
  - Each site evolves independently and identically
    - P(x | y, t) = Π_{i=1..m} P(x(i) | y(i), t), where m is the sequence length
- Example sequence: AAGGCAT
Probabilistic Models of Evolution
- What should P(x(i) | y(i), t) be?
- Jukes-Cantor (character evolution) model (1969)
  - Rate of substitution: α (constant or parameter?)
  - Multiplicativity (lack of memory): P(x | y, t1 + t2) = Σ_z P(x | z, t2) P(z | y, t1)
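The slides contain no code; the following is a minimal Python sketch, assuming a single substitution rate α, of the Jukes-Cantor per-site probability on this slide combined with the product-over-sites factorization from the previous slide, plus a numeric check of the multiplicativity property. The sequences and edge lengths in the example calls are made up.

```python
import math

def jc_site_prob(x, y, t, alpha=1.0):
    """Jukes-Cantor: probability that nucleotide y evolves into x over time t."""
    e = math.exp(-4.0 * alpha * t)
    return 0.25 * (1.0 + 3.0 * e) if x == y else 0.25 * (1.0 - e)

def jc_sequence_prob(x, y, t, alpha=1.0):
    """P(x | y, t) under site independence: product of per-site probabilities."""
    assert len(x) == len(y)            # aligned sequences, no insertions/deletions
    prob = 1.0
    for xi, yi in zip(x, y):
        prob *= jc_site_prob(xi, yi, t, alpha)
    return prob

# Multiplicativity: evolving for t1 and then t2 is the same as evolving for
# t1 + t2 once the intermediate nucleotide z is summed out.
lhs = jc_site_prob("A", "A", 0.3)
rhs = sum(jc_site_prob("A", z, 0.2) * jc_site_prob(z, "A", 0.1) for z in "ACGT")
print(abs(lhs - rhs) < 1e-12)          # True

print(jc_sequence_prob("AAGGCAT", "AAGGCTT", t=0.1))
```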
Tree Reconstruction
- Given: a collection of current-day taxa
- Find: a tree
  - Tree topology T
  - Edge lengths t
- Maximum likelihood
  - Find the tree that maximizes P(data | tree)
- Example taxa: AGGGCAT, TAGCCCA, TAGACTT, AGCACAA, AGCGCTT
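To make "maximize P(data | tree)" concrete, here is a rough brute-force sketch of the likelihood of one fixed candidate tree, using the four-taxon topology suggested on the earlier slide (root X7 with children X5 and X6; leaves S1, S2 under X5 and S3, S4 under X6). The uniform root prior, the particular edge lengths, and the use of only four of the five example taxa are assumptions for illustration: each site's likelihood sums out the latent ancestral nucleotides, and sites multiply because they are i.i.d.

```python
import math
from itertools import product

NUCS = "ACGT"

def jc(x, y, t, alpha=1.0):
    """Jukes-Cantor per-site substitution probability P(x | y, t)."""
    e = math.exp(-4.0 * alpha * t)
    return 0.25 * (1.0 + 3.0 * e) if x == y else 0.25 * (1.0 - e)

def site_likelihood(s1, s2, s3, s4, t):
    """P(s1, s2, s3, s4 | tree) for one site: sum over latent ancestors X7, X5, X6.
    Assumed topology: X7 -> X5 -> {S1, S2}, X7 -> X6 -> {S3, S4}; uniform P(X7)."""
    total = 0.0
    for x7, x5, x6 in product(NUCS, repeat=3):
        total += (0.25
                  * jc(x5, x7, t["t5"]) * jc(x6, x7, t["t6"])
                  * jc(s1, x5, t["t1"]) * jc(s2, x5, t["t2"])
                  * jc(s3, x6, t["t3"]) * jc(s4, x6, t["t4"]))
    return total

def log_likelihood(taxa, t):
    """log P(data | tree): sites are i.i.d., so per-site log-likelihoods add up."""
    return sum(math.log(site_likelihood(*site, t=t)) for site in zip(*taxa))

# Four of the example taxa; the edge lengths are made-up values.
taxa = ["AGGGCAT", "TAGCCCA", "TAGACTT", "AGCACAA"]
edges = {"t1": 0.1, "t2": 0.1, "t3": 0.2, "t4": 0.2, "t5": 0.3, "t6": 0.3}
print(log_likelihood(taxa, edges))
```

A real reconstruction algorithm searches over topologies and optimizes the edge lengths (e.g., with EM or structural EM) rather than evaluating a single fixed tree.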
Tree Reconstruction
- When restricted to one particular site, a phylogenetic tree is an HLC model where
  - The structure is a binary tree and the variables share the same state space.
  - The conditional probabilities come from the character evolution model, parameterized by edge lengths instead of the usual parameterization.
  - The model is the same for different sites.
Tree Reconstruction
- Current-day taxa: AGGGCAT, TAGCCCA, TAGACTT, AGCACAA, AGCGCTT
- Samples for the HLC model: one sample per site; the samples are i.i.d.
  - 1st site: (A, T, T, A, A)
  - 2nd site: (G, A, A, G, G)
  - 3rd site: (G, G, G, C, C)
  - ...
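As a quick illustration of this slicing (using only the example data already on the slide), zipping the taxa sequences yields one i.i.d. sample per site:

```python
taxa = ["AGGGCAT", "TAGCCCA", "TAGACTT", "AGCACAA", "AGCGCTT"]

# One HLC-model sample per site: the i-th sample is the tuple of the i-th
# characters of all current-day taxa.
samples = list(zip(*taxa))
print(samples[0])  # ('A', 'T', 'T', 'A', 'A')
print(samples[1])  # ('G', 'A', 'A', 'G', 'G')
print(samples[2])  # ('G', 'G', 'G', 'C', 'C')
```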
Tree Reconstruction
- Finding the ML phylogenetic tree = finding the ML HLC model
- Model space
  - Model structures: binary trees where all variables share the same state space, which is known.
  - Parameterization: one parameter for each edge. (In general, a conditional table P(x | y) has |Y|(|X| - 1) free parameters.)
Bayesian Networks for Classification
- The problem
  - Given: data
  - Find: a mapping (A1, A2, ..., An) → C
- Possible solutions
  - ANNs
  - Decision trees (Quinlan)
Bayesian Networks for Classification
- Naïve Bayes model
- From data, learn
  - P(C), P(Ai | C)
- Classification
  - arg max_c P(C=c | A1=a1, ..., An=an)
- Very good in practice
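A minimal sketch of the Naïve Bayes recipe above (Python and the helper names are assumptions; discrete attributes, complete data, no smoothing): estimate P(C) and P(Ai | C) by frequency counts, then classify with arg max_c P(C=c) Π_i P(Ai=ai | C=c).

```python
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Estimate P(C) and P(Ai | C) from complete, discrete data by counting."""
    n = len(labels)
    prior = {c: cnt / n for c, cnt in Counter(labels).items()}
    cond = defaultdict(lambda: defaultdict(Counter))   # cond[i][c][value] -> count
    for row, c in zip(rows, labels):
        for i, value in enumerate(row):
            cond[i][c][value] += 1
    return prior, cond

def classify(row, prior, cond):
    """Return arg max_c P(C=c) * prod_i P(Ai=ai | C=c)."""
    best_c, best_score = None, -1.0
    for c, p_c in prior.items():
        score = p_c
        for i, value in enumerate(row):
            counts = cond[i][c]
            total = sum(counts.values())
            score *= counts[value] / total if total else 0.0
        if score > best_score:
            best_c, best_score = c, score
    return best_c

# Toy usage with made-up attribute vectors and class labels.
rows = [("sunny", "hot"), ("sunny", "mild"), ("rain", "mild"), ("rain", "hot")]
labels = ["no", "no", "yes", "no"]
prior, cond = train_naive_bayes(rows, labels)
print(classify(("rain", "mild"), prior, cond))
```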
Bayesian Networks for Classification
- Drawback of NB
  - Assumes attributes are mutually independent given the class variable
  - Often violated, leading to double counting.
- Fixes
  - General BN classifiers
  - Tree-augmented Naïve Bayes (TAN) models
  - Hierarchical NB
Bayesian Networks for Classification
- General BN classifier
  - Treat the class variable just as another variable
  - Learn a BN.
  - Classify the next instance based on the values of variables in the Markov blanket of the class variable.
  - Pretty bad because it does not utilize all available information
Bayesian Networks for Classification
- TAN model
  - Friedman, N., Geiger, D., and Goldszmidt, M. (1997). Bayesian network classifiers. Machine Learning, 29: 131-163.
  - Captures dependence among attributes using a tree structure.
- During learning,
  - First learn a tree among the attributes using the Chow-Liu algorithm (sketched below)
  - Add the class variable and estimate parameters
- Classification
  - arg max_c P(C=c | A1=a1, ..., An=an)
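A rough sketch of the tree-building step referenced above, assuming discrete attributes, edge weights given by the class-conditional mutual information I(Ai; Aj | C) used by Friedman et al., and a simple Prim-style greedy loop for the maximum-weight spanning tree. Parameter estimation and the final arg-max classification are omitted; all function names are illustrative.

```python
from collections import Counter
from math import log

def cond_mutual_info(xs, ys, cs):
    """Empirical conditional mutual information I(X; Y | C) from parallel value lists."""
    n = len(cs)
    joint = Counter(zip(xs, ys, cs))
    xc = Counter(zip(xs, cs))
    yc = Counter(zip(ys, cs))
    c_cnt = Counter(cs)
    mi = 0.0
    for (x, y, c), nxyc in joint.items():
        mi += (nxyc / n) * log((nxyc * c_cnt[c]) / (xc[(x, c)] * yc[(y, c)]))
    return mi

def chow_liu_tree(columns, labels):
    """Maximum-weight spanning tree over attributes with weights I(Ai; Aj | C).
    Returns undirected edges (i, j); built with a simple Prim-style greedy loop."""
    k = len(columns)
    weight = {(i, j): cond_mutual_info(columns[i], columns[j], labels)
              for i in range(k) for j in range(i + 1, k)}
    in_tree = {0}
    edges = []
    while len(in_tree) < k:
        best = max(((i, j) for i in range(k) for j in range(k)
                    if (i in in_tree) != (j in in_tree)),
                   key=lambda e: weight[(min(e), max(e))])
        edges.append((min(best), max(best)))
        in_tree.update(best)
    return edges

# Toy usage: columns[i] lists the values of attribute Ai over all instances.
columns = [["a", "a", "b", "b"], ["x", "y", "x", "y"], ["p", "q", "q", "p"]]
labels = ["0", "0", "1", "1"]
print(chow_liu_tree(columns, labels))
```

The resulting undirected tree is then rooted at an arbitrary attribute, the edges are directed away from the root, and the class variable is added as a parent of every attribute before the conditional probability tables are estimated.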
Bayesian Networks for Classification
- Hierarchical Naïve Bayes models
  - N. L. Zhang, T. D. Nielsen, and F. V. Jensen (2002). Latent variable discovery in classification models. Artificial Intelligence in Medicine, to appear.
  - Capture dependence among attributes using latent variables
  - Detect interesting latent structures besides classification
  - Currently slow