Title: Designing M-estimators for expression analysis: PLIER
1. Designing M-estimators for expression analysis: PLIER
- Earl Hubbell
- Principal Statistician
- Affymetrix
2. Outline
- Drive our intuition (basic data)
- Formalize the intuition
- Check functionality
- Look at results
- Bonus tricks and stunts
3. Wafers, Chips, and Features
(Figure: chips per wafer)
4. Expression Probes
(Figure: sequence, probes, perfect match / mismatch, chip)
5. Components of Stray Signal
6. Components of Bound Target Signal and Noise
7. Hybridization is mostly linear, with some stray signal and saturation
8. One probe (pair): PM-MM reduces bias
9. Probes not very informative about concentration near background!
10. Probes have systematic differences
11. Affinity compensates for first-order probe differences
12. Likelihood summarizes knowledge of expression
13. A pause before jumping into equations
- "Statistics, whatever their mathematical sophistication and elegance, cannot make bad variables into good ones." (H.T. Reynolds, Analysis of Nominal Data)
14. Fun with Statistics
- Money: What should I estimate?
- M-estimators: Statistics by Optimization
- Model: Linking Intensity to Concentration
- Mismatches: Faking Subtraction
- Mayhem: Does it work?
- More: Tricks!
15. Estimator Goals
- Handle zero/near-zero concentrations
- Handle arithmetic noise at low end
- Minimum bias (avoid sample trouble)
- can always variance stabilize later
- Resist outliers
- Avoid lots of parameters!
16. How to estimate?
- (-5 + 373 + 473)/3 ≈ 280.3 (mean)
- 280.3 is the value minimizing (x+5)^2 + (x-373)^2 + (x-473)^2
- median(-5, 373, 473) = 373
- 373 is the value minimizing |x+5| + |x-373| + |x-473| (numerical check below)
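A minimal numeric check (my code, not from the slides) that the mean minimizes the sum of squared deviations and the median minimizes the sum of absolute deviations for this example:

```python
# Brute-force check over a grid for the data (-5, 373, 473):
# the mean minimizes squared deviations, the median minimizes absolute deviations.
data = [-5, 373, 473]

def sum_sq(x):
    return sum((x - d) ** 2 for d in data)

def sum_abs(x):
    return sum(abs(x - d) for d in data)

grid = [i / 10 for i in range(-100, 6001)]
print(min(grid, key=sum_sq))   # ~280.3, the mean
print(min(grid, key=sum_abs))  # 373.0, the median
```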
17. M-estimator
- Optimizes some function of the data, sum(f(y, xi)), over y (spelled out below)
- y is then an estimate of some interesting property of the data (we hope)
- Looks like Maximum Likelihood estimates (but can be tuned for utility)
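In symbols (my notation; the special cases restate the mean/median example and the maximum-likelihood remark above):

```latex
\hat{y} = \arg\min_{y} \sum_{i} f(y, x_i),
\qquad
f(y,x_i) = (y - x_i)^2 \Rightarrow \hat{y} = \text{mean},
\quad
f(y,x_i) = |y - x_i| \Rightarrow \hat{y} = \text{median},
\quad
f(y,x_i) = -\log L(x_i \mid y) \Rightarrow \hat{y} = \text{MLE}.
```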
18. Designing the M-estimator: PLIER
- M-estimator minimizes some function of the data and the estimator(s)
- Our case: sum(f(PM, MM, a, c, z))
- Choose f to model reasonable error
- Choose the tail of f to handle outliers
- PLIER = Probe Logarithmic Intensity ERror
19. Assumptions (approximations?)
- Concentration never negative: c >= 0
- Linear link between true signal and concentration: T = a*c
- Background (not constant) adds to signal: I = T + B
- Background same for PM and MM
- Multiplicative intensity error: log(I) ~ Normal(log(T+B), s^2)
20. Assumption: Multiplicative Error
- Widely agreed that replicate observations of probes (PM, MM) are approximately log-normal
- I.e., PM varies by ~10% of PM
- Does not imply that derived quantities (PM-MM or PM-B) are also log-normal!
- I.e., PM-MM varies by ~7% of (PM+MM), not by 10% of (PM-MM)!
21. No obvious need for arithmetic noise for raw intensities
22. Simplified model: PM-MM
- PM = a*c + MM
- MM = a2*c + B
- If B can vary wildly (experiment-to-experiment, probe-to-probe), we are left with:
- PM - MM = a*c
- Incorporating multiplicative error:
- e1*PM - e2*MM = a*c
23. Key concept: good fits have small multiplicative errors
- Trying to minimize log(e1)^2 + log(e2)^2
- The actual minimum is a complicated function, so we (I) don't want to solve for it
- And we don't have to: M-estimators can be chosen for computational convenience
- Therefore, let log(e1)^2 = log(e2)^2
24. How good is the fit? 2 possible choices
- log(e1) = log(e2) (log transform): no solution for MM > PM, always a worse fit than
- log(e1) = -log(e2) (PLIER)
- => e = (a*c + sqrt((a*c)^2 + 4*PM*MM)) / (2*PM)
- log(e) exists for any PM, MM > 0 and any a, c
- The effective error model changes from arithmetic near zero to multiplicative far from zero (a numeric check of the closed form follows below)
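A quick numeric check (my code and example numbers, not from the slides): with e1 = e and e2 = 1/e, the closed form above is the positive root of e^2*PM - e*(a*c) - MM = 0, so e*PM - MM/e recovers a*c.

```python
import math

def plier_error(pm, mm, a, c):
    # Positive root of e^2*PM - e*(a*c) - MM = 0.
    y = a * c
    return (y + math.sqrt(y * y + 4.0 * pm * mm)) / (2.0 * pm)

pm, mm, a, c = 900.0, 150.0, 2.5, 300.0
e = plier_error(pm, mm, a, c)
print(e * pm - mm / e)  # prints 750.0, which equals a*c
```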
25. PM-MM: Goodness of fit
26. Define the center of f
- Residual: r = log(e)
- Under the log-normal assumption, fit for least r^2
- But we should fix the tails (where outliers show up and the approximation breaks down)
27. Robustness
- Want to discount outliers compared to sum-of-squares
- Off-the-shelf: the Geman-McClure transformation f(r, z) = r^2 / (1 + r^2/z)
- Looks like least-squares for small r
- Bounds the influence of a residual to at most z (formulas below)
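The transformation and its derivative (the influence function), in my notation:

```latex
f(r, z) = \frac{r^2}{1 + r^2/z},
\qquad
f'(r) = \frac{2r}{\left(1 + r^2/z\right)^2},
\qquad
f(r,z) \approx r^2 \;\text{for}\; r^2 \ll z,
\qquad
f(r,z) \to z \;\text{as}\; |r| \to \infty .
```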
28. Transformation f(r) and its Influence Function
29. PLIER: Goodness of fit for 11 probe pairs
30. PLIER on a t-shirt
- y = a*c
- e = (y + sqrt(y^2 + 4*PM*MM)) / (2*PM)
- r = log(e)
- f(r, z) = r^2 / (1 + r^2/z)
- argmin(sum(f(r, z))) over all a, c >= 0
- yields the PLIER estimates of affinity and concentration (sketched in code below)
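A minimal sketch of this objective in code (my Python, not the Affymetrix SDK; PM and MM are probe-by-chip matrices, a and c are the affinity and concentration vectors):

```python
import math

def plier_objective(PM, MM, a, c, z=1.0):
    """Sum of robustified log-errors over all (probe pair, chip) cells."""
    total = 0.0
    for i in range(len(a)):                  # probe pairs
        for j in range(len(c)):              # chips / experiments
            y = a[i] * c[j]
            e = (y + math.sqrt(y * y + 4.0 * PM[i][j] * MM[i][j])) / (2.0 * PM[i][j])
            r = math.log(e)
            total += r * r / (1.0 + r * r / z)   # Geman-McClure f(r, z)
    return total
```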
31. Optimizing (finding minima)
- Many ways to find the best fit
- Easiest to explain is cyclic coordinate ascent, aka "polishing" the data
- Can start anywhere (but best to start with a good guess)
32. Finding affinity/concentration: Don't I need to know one to start?
33. Observed PM/MM values
34. Compare observed to predicted (find where to improve predictions)
35. Polishing the table
- Guess initial values (a = 1.0, c = 0)
- Find the best concentrations (with current affinities)
- Find the best affinities (for current concentrations)
- Repeat until minimized (or bored)
- Remember: values are non-negative! (see the sketch below)
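A rough sketch of the polishing loop (my code; a coarse grid search over non-negative values stands in for the real one-dimensional solver, and the z and grid settings are illustrative only). Note that without an additional scale constraint on the affinities, the product a*c is only determined up to a constant factor.

```python
import math

def cell_cost(pm, mm, a, c, z=1.0):
    # Robustified log-error for one (probe pair, chip) cell.
    y = a * c
    e = (y + math.sqrt(y * y + 4.0 * pm * mm)) / (2.0 * pm)
    r = math.log(e)
    return r * r / (1.0 + r * r / z)

def polish(PM, MM, n_iter=20, z=1.0):
    n_probe, n_chip = len(PM), len(PM[0])
    a = [1.0] * n_probe                       # initial affinity guess
    c = [0.0] * n_chip                        # initial concentration guess
    grid = [k * 0.05 for k in range(400)]     # candidate non-negative values
    for _ in range(n_iter):
        for j in range(n_chip):               # best concentration, affinities fixed
            c[j] = min(grid, key=lambda v: sum(
                cell_cost(PM[i][j], MM[i][j], a[i], v, z) for i in range(n_probe)))
        for i in range(n_probe):              # best affinity, concentrations fixed
            a[i] = min(grid, key=lambda v: sum(
                cell_cost(PM[i][j], MM[i][j], v, c[j], z) for j in range(n_chip)))
    return a, c
```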
36. How does it work on real data?
- Gold-standard data generated by spiking in known transcripts
- Example is one of the transcripts (the 6th)
- Look at residuals to find outliers
37. Latin Square Experimental Design
(Table: spike groups by experiment)
38. Model fit: A = 1.0, C = 0.0
(Figure: C-values and A-values)
39. Residuals: Fit Concentration
40. Residuals: Fit Probes
41. Fit concentration and affinities: data fits except for outliers, which are clearly revealed
42. What are the outliers?
43. Final results (value)
44. Know everything (approximately)
(Figure: likelihood vs. estimate; values 1.0, 2.0, 4.0 and 0, .25, .5)
45. Trick: P/A calls by fit
46. Trick: models are good for residuals!
47. Optimization (harder to illustrate)
- Current implementation uses descent optimization (Newton-like)
- Start with a good initial guess (median polish)
- Improve by descent
- Try jumps to escape local minima
48. Evaluating performance
- MvA plots (unbiased/biased)
- Receiver Operating Characteristic (ROC)
- Area Under Curve (AUC) (global/stratified)
- Benchmark results
49. MvA plots
- Scatterplot turned 45 degrees
- Plotting A vs. B becomes:
- M = log(A) - log(B)
- A = (log(A) + log(B))/2 (average)
- Allows easy visualization of changes (sketched in code below)
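A minimal sketch (my code; log base 2 is assumed here, the slides do not specify a base):

```python
import math

def mva(sig_a, sig_b):
    m = math.log2(sig_a) - math.log2(sig_b)          # M: log-ratio (change)
    a = (math.log2(sig_a) + math.log2(sig_b)) / 2.0  # A: average log intensity
    return m, a

print(mva(400.0, 100.0))   # M = 2.0 (a 4-fold change), A ~ 7.6
```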
50. MvA (bias added for stabilization)
51. Receiver Operating Characteristic
- ROC curves measure the separation of distributions for two states
- Changed or unchanged between pair(s) of experiments
- Depends on the variation of the signal within an experiment, and the separation between the two states
- Note that just measuring variation or just measuring separation can be misleading!
- One popular method of defining "changed" is a fold-change threshold
- ROC curves can be summarized by the area under the curve (see the sketch below)
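A minimal sketch of the area-under-curve summary (my code and example numbers): AUC equals the probability that a randomly chosen "changed" score exceeds a randomly chosen "unchanged" score (the rank / Mann-Whitney form).

```python
def auc(changed, unchanged):
    # Fraction of (changed, unchanged) pairs ranked correctly; ties count half.
    wins = 0.0
    for c in changed:
        for u in unchanged:
            if c > u:
                wins += 1.0
            elif c == u:
                wins += 0.5
    return wins / (len(changed) * len(unchanged))

# Hypothetical fold-change scores for spiked (changed) vs. unchanged probesets.
print(auc([2.1, 1.8, 0.9], [0.2, 1.0, 0.4]))   # ~0.89
```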
52. Overall performance good (ROC)
53. Specific performance regimes are of interest
- Low, medium, high concentrations
- Relatively small fold-changes (2-fold, 4-fold)
- Thresholds defined by fold-change
- Thresholds defined by change relative to
variation (t-like statistic)
54. PLIER (+16) has excellent discrimination at the low end for differential change
(Panels: FC and TSTAT thresholds)
55. Output characteristics of some standard methods
- MAS 5.0: not variance stabilized (*), some bias, runs on single chips
- PLIER: not variance stabilized (*), minimal bias, reduced variance, runs on multiple chips
- RMA: variance stable, noticeable bias, low variance, runs on multiple chips
- (*) Can always apply a stabilizing transformation
56. PLIER unbiased: can always add bias to stabilize variance
57. MvA plots show the distribution of variation in the data, but also show (some of) the compression of the dynamic range
58. Drawing fold-change lines leads to ROC curves which are heavily biased towards results with stable variance
59. Differential change can be detected using the variation in the data by intensity
60. Thresholds based on the variance (t-stat analog) yield a better estimate of the relative performance of methods
61. Works fine on U133 too
62. Bonus M-estimator tricks
- Handle PM-only (PM-MM, PM-B) just fine by replacing the error model in f
- Play Bayesian games (affinity penalties, concentration penalties)
63. M-estimator: PM only
- PM - B = a*c
- (background estimate assumed perfect)
- e*PM - B = a*c
- e = (a*c + B)/PM
- Proceed in the same framework using e (sketched below)
- Note that B can be zero for a*c > 0
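A minimal sketch of the swapped-in error model (my code; B is an externally supplied background estimate, and everything downstream of r = log(e) is unchanged):

```python
import math

def pm_only_cost(pm, b, a, c, z=1.0):
    e = (a * c + b) / pm               # from e*PM - B = a*c
    r = math.log(e)
    return r * r / (1.0 + r * r / z)   # same Geman-McClure f as before
```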
64. PM-only with a global background is biased
65. Can play Bayesian games
- Probe affinities are likely to be log-normally distributed
- Add a penalty term to avoid overweighting any single probe
- Good when there is insufficient data
- sum(log(e)^2) + (penalty)*sum(log(a)^2)
- Can do the same for concentration (in symbols below)
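The penalized objective in symbols (my notation; lambda stands in for the penalty weight on the affinity prior):

```latex
\min_{a,\,c \,\ge\, 0}\;
\sum_{i,j} f\!\bigl(\log e_{ij},\, z\bigr)
\;+\; \lambda \sum_i \bigl(\log a_i\bigr)^2
```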
66. Bayesian prior on probes
Avoids overweighting probes in problem cases
Doesn't affect real data
67. PLIER
- M-estimators form a very flexible framework for analysis
- Can handle PM-B, PM-MM, and PM-only approaches in the same framework
- Handles zero/near-zero concentrations and affinities directly in the model
- Seems to produce good results
68. PLIER: obtaining an implementation
- The PLIER algorithm SDK is now available under a GPL open-source license.
- The code is available as C without Windows dependencies. Documentation is included at the site.
- All of us at Affymetrix hope that releasing PLIER in this manner promotes all of the values that the Bioconductor community embraces.
- http://www.affymetrix.com/support/developer/index.affx
69. Thanks
- David Kulp
- Sejal Shah
- Simon Cawley
- David Finkelstein
- Mike Lelivelt
- Teresa Webster
- Rui Mei
- Suzanne Dee
- Stefan Bekiranov
- Xiaojun Di
- Alex Cheung
- Steve Lincoln
- Many, many others!