Title: Supervised Learning
1. Lecture 6
2. Unsupervised Learning
- Learning From Unlabeled Data
- Clustering, Correlation, PCA
- Identify relationships between the features
3. Supervised Learning
- Learning From Labeled Data
- Neural Networks, Support Vector Machines, Decision Trees
- Identify relationships between the features and the categories
4. Supervised Learning
- Given
- Examples whose feature values are known and
- Whose categories are known
- Do
- Predict the categories of examples whose feature values are known but whose categories are not
5. 4 ways of representing gene-chip data
From Molla et al., AI Magazine 25, 2004
6. Typical Methodology
- N-fold cross-validation (sketched below)
- Split labeled data into N (usually 10) folds
- Train on all but one (N-1) of the folds
- Test on the left-out fold (ignoring the category label)
- Repeat until all N folds have been tested
- Note: this methodology can be (and is) used to compare different predictive statistical models as well.
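A minimal sketch of the N-fold procedure above, in Python; `train` and `predict` are hypothetical placeholders for whichever learner is being evaluated, and the shuffling and accuracy metric are assumptions for illustration.

```python
import random

def n_fold_cross_validation(examples, train, predict, n_folds=10, seed=0):
    """examples: list of (features, label) pairs; returns mean held-out accuracy."""
    examples = list(examples)
    random.Random(seed).shuffle(examples)            # shuffle before splitting
    folds = [examples[i::n_folds] for i in range(n_folds)]

    accuracies = []
    for i in range(n_folds):
        test_fold = folds[i]
        train_folds = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        model = train(train_folds)                   # train on the other N-1 folds
        correct = sum(predict(model, x) == y for x, y in test_fold)
        accuracies.append(correct / len(test_fold))

    return sum(accuracies) / n_folds                 # average over the N held-out folds
```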
7. Example
8. Oligonucleotide Microarrays
- Specific probes synthesized at known spots on the chip's surface
- Probes complementary to RNA of genes to be measured
- Typical gene (1 kb) MUCH longer than typical probe (24 bases)
9. Probes: Good vs. Bad
[Figure: probes (blue) hybridizing to sample RNA (red), contrasting a good probe with a bad probe]
10. Probe-Picking Method Needed
- Hybridization characteristics differ between probes
- A probe set represents a very small subset of the gene
- Accurate measurement of expression requires a good probe set
11. Related Work
- Use known hybridization characteristics
- Lockhart et al. 1996
- Melting point (Tm) predictions
- Kurata and Suyama 1999
- Li and Stormo 2001
- Stable secondary structure
- Kurata and Suyama 1999
12. Our Approach
- Apply established machine-learning algorithms
- Train on categorized examples
- Test on examples with category hidden
- Choose features to represent probes
- Categorize probes as good or bad
13. The Features
14. The Data
- Tilings of 8 genes (from E. coli and B. subtilis); tiling sketched below
- Every possible probe (10,000 probes)
- Genes known to be expressed in sample
- Gene sequence: GTAGCTAGCATTAGCATGGCCAGTCATG
- Complement:    CATCGATCGTAATCGTACCGGTCAGTAC
- Probe 1: CATCGATCGTAATCGTACCGGTCA
- Probe 2: ATCGATCGTAATCGTACCGGTCAG
- Probe 3: TCGATCGTAATCGTACCGGTCAGT
- ...
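A minimal sketch of how such a tiling is produced: complement the gene sequence and slide a 24-base window along it one position at a time (using the example sequence from this slide; function names are illustrative).

```python
PROBE_LENGTH = 24
BASE_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(seq):
    return "".join(BASE_COMPLEMENT[base] for base in seq)

def tile_probes(gene_seq, probe_length=PROBE_LENGTH):
    comp = complement(gene_seq)
    return [comp[i:i + probe_length] for i in range(len(comp) - probe_length + 1)]

gene = "GTAGCTAGCATTAGCATGGCCAGTCATG"
for i, probe in enumerate(tile_probes(gene), start=1):
    print(f"Probe {i}: {probe}")   # Probe 1: CATCGATCGTAATCGTACCGGTCA, ...
```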
15. Our Microarray
16. Defining our Categories
- Low-intensity probes: BAD (45)
- Mid-intensity probes: not used in the training set (23)
- High-intensity probes: GOOD (32)
[Histogram: frequency vs. normalized probe intensity, with marks at 0, 0.05, 0.15, and 1.0]
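A minimal sketch of the category assignment implied above, assuming the cut-offs fall at the 0.05 and 0.15 marks on the intensity axis; those thresholds are an assumption read from the slide, not values stated in the text.

```python
def categorize(normalized_intensity, low_cutoff=0.05, high_cutoff=0.15):
    """Assign a training category from a probe's normalized intensity (assumed cut-offs)."""
    if normalized_intensity <= low_cutoff:
        return "BAD"      # low-intensity probe
    if normalized_intensity >= high_cutoff:
        return "GOOD"     # high-intensity probe
    return None           # mid-intensity: excluded from the training set
```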
17. The Machine Learning Techniques
- Naïve Bayes (Mitchell 1997)
- Neural Networks (Rumelhart et al. 1995)
- Decision Trees (Quinlan 1996)
- Can interpret predictions of each learner probabilistically
18. Naïve Bayes
- Assumes conditional independence between features
- Makes judgments about test-set examples based on conditional probability estimates made on the training set
19. Naïve Bayes
For each example in the test set, evaluate the following:
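A minimal sketch of the standard naïve Bayes evaluation (Mitchell 1997), assuming the usual rule of scoring each class c ∈ {GOOD, BAD} by P(c) · Π_i P(f_i | c) with probabilities estimated by counting on the training set; the dictionary-based feature representation and the Laplace smoothing constant are assumptions for illustration.

```python
import math
from collections import defaultdict

def train_naive_bayes(examples):
    """examples: list of (feature_dict, label) pairs with labels 'GOOD' or 'BAD'."""
    class_counts = defaultdict(float)
    feature_counts = defaultdict(lambda: defaultdict(float))
    for features, label in examples:
        class_counts[label] += 1
        for name, value in features.items():
            feature_counts[label][(name, value)] += 1
    return class_counts, feature_counts

def classify(model, features, smoothing=1.0):
    class_counts, feature_counts = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = math.log(count / total)            # log P(c); log space avoids underflow
        for name, value in features.items():
            numerator = feature_counts[label][(name, value)] + smoothing
            denominator = count + 2 * smoothing    # assumes roughly binary-valued features
            score += math.log(numerator / denominator)
        scores[label] = score
    return max(scores, key=scores.get)             # argmax over GOOD / BAD
```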
20. Neural Network (1-of-n encoding with probe length 3)
[Network diagram: each probe position encoded as four input units (A1 C1 G1 T1, A2 C2 G2 T2, A3 C3 G3 T3); example probe sequence CAG; weighted connections to an output unit labeling the probe Good or Bad; activation flows forward and error propagates backward]
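A minimal sketch of the 1-of-n (one-hot) input encoding pictured above; the network layers and training loop are not sketched here, and the function name is illustrative.

```python
BASES = "ACGT"

def one_of_n_encode(probe):
    """Encode each base of the probe as four binary inputs, one per nucleotide."""
    encoding = []
    for base in probe:
        encoding.extend(1 if base == b else 0 for b in BASES)
    return encoding

# The slide's example: the 3-base probe "CAG" becomes 12 input units.
print(one_of_n_encode("CAG"))
# [0, 1, 0, 0,  1, 0, 0, 0,  0, 0, 1, 0]  ->  C, then A, then G
```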
21. Decision Tree
- Automatically builds a tree of rules
[Tree diagram: internal nodes test composition features (fracC, fracT, fracG, fracTC, fracAC) with High/Low branches, plus a base-position test (n14) branching on A/C/G/T; leaves label the probe as a Good Probe or a Bad Probe]
22. Decision Tree
The information gain of a feature, F, is:
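Assuming the standard Quinlan-style definition, Gain(S, F) = Entropy(S) − Σ_{v ∈ Values(F)} (|S_v| / |S|) · Entropy(S_v), here is a minimal sketch of computing it over discretized feature values (the High/Low discretization is an assumption here).

```python
import math
from collections import Counter

def entropy(labels):
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(feature_values, labels):
    """feature_values: one value per probe (e.g. 'High'/'Low'); labels: 'GOOD'/'BAD'."""
    total = len(labels)
    gain = entropy(labels)
    for value in set(feature_values):
        subset = [lab for val, lab in zip(feature_values, labels) if val == value]
        gain -= (len(subset) / total) * entropy(subset)
    return gain
```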
23. Information Gain per Feature
[Bar chart: normalized information gain for probe-composition features, base-position features (by base position), and dimer-position features]
24. Cross-Validation
- Leave-one-out testing
- For each gene (of the 8)
- Train on all but this gene
- Test on this gene
- Record result
- Forget what was learned
- Average results across 8 test genes
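A minimal sketch of this leave-one-gene-out loop; `train` and `evaluate` are hypothetical placeholders for whichever learner (naïve Bayes, neural network, or decision tree) is being tested.

```python
def leave_one_gene_out(probes_by_gene, train, evaluate):
    """probes_by_gene: dict mapping gene name -> list of labeled probe examples."""
    results = {}
    for held_out_gene, test_probes in probes_by_gene.items():
        training_probes = [p for gene, probes in probes_by_gene.items()
                           if gene != held_out_gene
                           for p in probes]
        model = train(training_probes)        # retrain from scratch: "forget what was learned"
        results[held_out_gene] = evaluate(model, test_probes)
    return sum(results.values()) / len(results)   # average across the 8 held-out genes
```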
25. Typical Probe-Intensity Prediction Across Short Region
[Plot: actual normalized probe intensity vs. starting nucleotide position of the 24-mer probe]
26. Typical Probe-Intensity Prediction Across Short Region
[Plot: normalized probe intensity vs. starting nucleotide position of the 24-mer probe; actual intensity compared with neural network, naïve Bayes, and decision tree predictions]
27. Probe-Picking Results
[Plot: number of selected probes with intensity above the 90th percentile vs. number of probes selected; perfect selector shown for reference]
28. Probe-Picking Results
[Plot: number of selected probes with intensity above the 90th percentile vs. number of probes selected, comparing the perfect selector, neural network, naïve Bayes, decision tree, and primer melting point]
29. A couple of final notes for the class
30. What is Principal Component Analysis (PCA)? In 1 slide
31. PCA Tutorial
- http://csnet.otago.ac.nz/cosc453/student_tutorials/principal_components.pdf
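A minimal "PCA in a few lines" sketch to go with the one-slide summary: center the data, take the eigenvectors of the covariance matrix, and project onto the top components (numpy only; names are illustrative).

```python
import numpy as np

def pca(data, n_components=2):
    """data: (n_samples, n_features) array. Returns (projected data, components)."""
    centered = data - data.mean(axis=0)              # center each feature
    cov = np.cov(centered, rowvar=False)             # feature covariance matrix
    eigenvalues, eigenvectors = np.linalg.eigh(cov)  # eigh: covariance is symmetric
    order = np.argsort(eigenvalues)[::-1]            # sort by decreasing variance
    components = eigenvectors[:, order[:n_components]]
    return centered @ components, components
```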
32. Vocal Tract → Slide Whistle