Designing M-estimators for expression analysis: PLIER

Provided by: earlhu
Learn more at: http://mbi.osu.edu

1
Designing M-estimators for expression analysis
PLIER
  • Earl Hubbell
  • Principal Statistician
  • Affymetrix

2
Outline
  • Drive our intuition (basic data)
  • Formalize the intuition
  • Check functionality
  • Look at results
  • Bonus tricks and stunts

3
Wafers, Chips, and Features
Chips / wafer
4
Expression Probes
Sequence
Probes
Perfect Match Mismatch
Chip
5
Components of Stray Signal
6
Components of Bound Target Signal and Noise
7
Hybridization is mostly linear, with some stray
signal saturation
8
One probe (pair) PM-MM reduces bias
9
Probes not very informative about concentration
near background!
10
Probes have systematic differences
11
Affinity compensates for first-order probe
differences
12
Likelihood summarizes knowledge of expression
13
A pause before jumping into equations
  • statistics, whatever their mathematical
    sophistication and elegance, cannot make bad
    variables into good ones.
  • H.T. Reynolds, Analysis of Nominal Data

14
Fun with Statistics
  • Money: What should I estimate?
  • M-estimators: Statistics by Optimization
  • Model: Linking Intensity to Concentration
  • Mismatches: Faking Subtraction
  • Mayhem: Does it work?
  • More Tricks!

15
Estimator Goals
  • Handle zero/near-zero concentrations
  • Handle arithmetic noise at low end
  • Minimum bias (avoid sample trouble)
  • can always variance stabilize later
  • Resist outliers
  • Avoid lots of parameters!

16
How to estimate?
  • (-5 + 373 + 473)/3 = 280.3 (mean)
  • 280.3 is the value minimizing
    (x+5)^2 + (x-373)^2 + (x-473)^2
  • median(-5, 373, 473) = 373
  • 373 is the value minimizing |x+5| + |x-373| + |x-473|
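
The two claims on this slide can be checked numerically. The following is a minimal sketch that brute-forces both objectives over a grid; the grid range and step are arbitrary choices made for illustration.

```python
from statistics import mean, median

data = [-5, 373, 473]

def sum_sq(x):
    """Sum of squared deviations from x."""
    return sum((x - d) ** 2 for d in data)

def sum_abs(x):
    """Sum of absolute deviations from x."""
    return sum(abs(x - d) for d in data)

# Candidate values from -10.0 to 500.0 in steps of 0.1 (illustrative grid).
grid = [i / 10 for i in range(-100, 5001)]
best_sq = min(grid, key=sum_sq)
best_abs = min(grid, key=sum_abs)

print("least squares minimizer:", best_sq, "mean:", mean(data))
print("least absolute minimizer:", best_abs, "median:", median(data))
```

Swapping the penalty from squared to absolute deviation is what moves the answer from the mean to the median, which is the sense in which both are M-estimators.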

17
M-estimator
  • Optimizes some function of the data, sum(f(y, x_i)),
    over y
  • y is then an estimate of some interesting
    property of the data (we hope)
  • Looks like Maximum Likelihood estimates (but
    can tune for utility)

18
Designing the M-estimator: PLIER
  • M-estimator minimizes some function of the data
    and the estimator(s)
  • Our case: sum(f(PM, MM, a, c, z))
  • Choose f to model reasonable error
  • Choose tail of f to handle outliers
  • PLIER: Probe Logarithmic Intensity ERror

19
Assumptions (approximations?)
  • Concentration never negative! c > 0
  • Linear link between true signal and concentration:
    T = a*c
  • Background (not constant) adds to signal:
    I = T + B
  • Background same for PM and MM
  • Multiplicative intensity error:
    log(I) ~ normal(log(T + B), s^2)

20
Assumption: Multiplicative Error
  • Widely agreed that replicate observations of
    probes (PM, MM) are approximately log-normal
  • I.e. PM varies by 10% of PM
  • Does not imply that derived quantities (PM-MM or
    PM-B) are also log-normal!
  • I.e. PM-MM varies by 7% of (PM+MM), not by 10% of
    (PM-MM)!
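
The 7% figure follows from simple error propagation. The sketch below assumes that the PM and MM errors are independent, that a 10% relative standard deviation is an adequate stand-in for the log-normal error, and that PM and MM are of similar size; the intensities are hypothetical.

```python
import math

def sd_of_difference(pm, mm, rel_err=0.10):
    """SD of PM-MM when PM and MM carry independent errors of rel_err * value."""
    return math.sqrt((rel_err * pm) ** 2 + (rel_err * mm) ** 2)

pm, mm = 1000.0, 950.0  # hypothetical intensities with PM close to MM
sd = sd_of_difference(pm, mm)

frac_of_sum = sd / (pm + mm)   # about 0.1 / sqrt(2) when PM is close to MM
frac_of_diff = sd / (pm - mm)  # far larger than 10% near background

print(f"SD as a fraction of PM+MM: {frac_of_sum:.3f}")
print(f"SD as a fraction of PM-MM: {frac_of_diff:.2f}")
```

Near background the noise in PM-MM is set by the size of the raw intensities, not by the size of the difference, which is why the difference cannot be treated as log-normal.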

21
No obvious need for arithmetic noise for raw
intensities
22
Simplified model: PM-MM
  • PM = a*c + MM
  • MM = a2*c + B
  • If B can vary wildly (experiment to experiment,
    probe to probe), left with
  • PM - MM = a*c
  • Incorporating multiplicative error:
  • e1*PM - e2*MM = a*c

23
Key concept: good fits have small multiplicative
errors
  • Trying to minimize log(e1)^2 + log(e2)^2
  • The actual minimum is a complicated function, so
    we (I) don't want to solve for it
  • And we don't have to: M-estimators can be chosen
    for computational convenience
  • Therefore, let log(e1)^2 = log(e2)^2

24
How good is the fit? 2 possible
  • log(e1) = log(e2) (log transform): no solution
    for MM > PM, always worse fit than
  • log(e1) = -log(e2) (PLIER)
  • => e = (a*c + sqrt((a*c)^2 + 4*PM*MM)) / (2*PM)
  • log(e) exists for any PM, MM > 0, any a, c
  • Effective error model changes from arithmetic
    near zero to multiplicative far from zero
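
The PLIER choice log(e1) = -log(e2) means e1 = e and e2 = 1/e. Substituting into e1*PM - e2*MM = a*c gives the quadratic PM*e^2 - (a*c)*e - MM = 0, whose positive root is the expression for e. The sketch below checks this numerically, including a case with MM > PM; the function name and the intensities are illustrative.

```python
import math

def plier_error(pm, mm, a, c):
    """Positive root e of PM*e^2 - (a*c)*e - MM = 0 (requires PM, MM > 0)."""
    y = a * c
    return (y + math.sqrt(y * y + 4.0 * pm * mm)) / (2.0 * pm)

pm, mm = 120.0, 300.0  # hypothetical probe pair with MM > PM
for a, c in [(1.0, 0.0), (1.5, 40.0), (0.8, 500.0)]:
    e = plier_error(pm, mm, a, c)
    # Check the defining relation e*PM - MM/e = a*c.
    lhs = e * pm - mm / e
    print(f"a*c = {a * c:7.1f}  e = {e:.4f}  log(e) = {math.log(e):+.4f}  "
          f"e*PM - MM/e = {lhs:.4f}")
```

Because the square root is always positive, e is positive and log(e) is defined even when a*c is zero or MM exceeds PM.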

25
PM - MM Goodness of fit
26
Define center of f
  • Residual r = log(e)
  • Under log-normal assumption, fit for least r^2
  • But we should fix the tails (where outliers show
    up, and the approximation breaks down)

27
Robustness
  • Want to discount outliers compared to
    sum-of-squares
  • Off-the-shelf Geman-McClure transformation:
    f(r, z) = r^2 / (1 + r^2/z)
  • Looks like least-squares for r small
  • Bounds the contribution of any single residual to
    at most z
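
A short sketch of the transformation shows both properties: near zero it tracks r^2, and for large residuals it levels off below z. The value z = 4.0 is an arbitrary choice for illustration, not a setting taken from the slides.

```python
def geman_mcclure(r, z):
    """Geman-McClure transformation f(r, z) = r^2 / (1 + r^2 / z)."""
    return r * r / (1.0 + r * r / z)

z = 4.0  # illustrative tuning value
for r in (0.01, 0.1, 1.0, 10.0, 1000.0):
    print(f"r = {r:8.2f}   r^2 = {r * r:12.4f}   f(r, z) = {geman_mcclure(r, z):.6f}")
```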

28
Transformation f(r) and its Influence Function
29
PLIER Goodness of fit for 11 probe pairs
30
PLIER on a t-shirt
  • y = a*c
  • e = (y + sqrt(y^2 + 4*PM*MM)) / (2*PM)
  • r = log(e)
  • f(r, z) = r^2 / (1 + r^2/z)
  • argmin(sum(f(r, z))) over all a, c > 0
  • yields PLIER estimate of affinity and
    concentration
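
The t-shirt recipe can be written as a single objective over a probe-by-chip table. This is a sketch under stated assumptions: the function name, the table layout, the default z = 4.0 and the noise-free toy data are all illustrative and are not taken from the PLIER SDK.

```python
import math

def plier_objective(pm, mm, a, c, z=4.0):
    """Sum of f(r, z) over a probe x chip table.

    pm[i][j], mm[i][j]: intensities for probe i on chip j (all > 0).
    a[i]: probe affinities, c[j]: chip concentrations (all >= 0).
    """
    total = 0.0
    for i, a_i in enumerate(a):
        for j, c_j in enumerate(c):
            y = a_i * c_j
            e = (y + math.sqrt(y * y + 4.0 * pm[i][j] * mm[i][j])) / (2.0 * pm[i][j])
            r = math.log(e)
            total += r * r / (1.0 + r * r / z)
    return total

# Noise-free toy table: PM = a*c + MM with a constant MM of 50.
a_true = [1.0, 2.0]
c_true = [100.0, 400.0]
mm = [[50.0, 50.0], [50.0, 50.0]]
pm = [[a_i * c_j + 50.0 for c_j in c_true] for a_i in a_true]

print("objective at the true values:", plier_objective(pm, mm, a_true, c_true))
print("objective at a = 1, c = 0:   ", plier_objective(pm, mm, [1.0, 1.0], [0.0, 0.0]))
```

The PLIER estimate is the pair (a, c) that makes this sum as small as possible.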

31
Optimizing (finding minima)
  • Many ways to find best fit
  • Easiest to explain is cyclic coordinate ascent
    aka polishing the data
  • Can start anywhere (but best to start with a good
    guess)

32
Finding affinity/concentration: Don't I need to
know one to start?
33
Observed PM/MM values
34
Compare observed to predicted (find where to
improve predictions)
35
Polishing the table
  • Guess initial values (a = 1.0, c = 0)
  • Find best concentrations (with current
    affinities)
  • Find best affinities (for current concentrations)
  • Repeat until minimized (or bored)
  • Remember: values are non-negative!
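
The polishing loop can be sketched as alternating one-dimensional searches. This is an illustration, not the SDK implementation: the golden-section search, the search bounds, the rescaling of affinities to mean 1 (used here only to pin down the scale shared by a and c), z = 4.0 and the toy data are all assumptions.

```python
import math

def cell_loss(pm, mm, y, z):
    """f(log(e), z) for one probe pair, where y = a*c."""
    e = (y + math.sqrt(y * y + 4.0 * pm * mm)) / (2.0 * pm)
    r = math.log(e)
    return r * r / (1.0 + r * r / z)

def golden_min(g, lo, hi, iters=60):
    """Golden-section search for a minimum of g on [lo, hi]."""
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    x1, x2 = hi - phi * (hi - lo), lo + phi * (hi - lo)
    f1, f2 = g(x1), g(x2)
    for _ in range(iters):
        if f1 < f2:
            hi, x2, f2 = x2, x1, f1
            x1 = hi - phi * (hi - lo)
            f1 = g(x1)
        else:
            lo, x1, f1 = x1, x2, f2
            x2 = lo + phi * (hi - lo)
            f2 = g(x2)
    return (lo + hi) / 2.0

def polish(pm, mm, z=4.0, sweeps=20):
    """Alternate between best concentrations and best affinities."""
    n_probes, n_chips = len(pm), len(pm[0])
    a = [1.0] * n_probes            # initial guess a = 1.0
    c = [0.0] * n_chips             # initial guess c = 0
    c_max = 10.0 * max(max(row) for row in pm)  # illustrative upper bound
    a_max = 10.0                                # illustrative upper bound
    for _ in range(sweeps):
        for j in range(n_chips):    # best concentrations, affinities fixed
            c[j] = golden_min(
                lambda x: sum(cell_loss(pm[i][j], mm[i][j], a[i] * x, z)
                              for i in range(n_probes)), 0.0, c_max)
        for i in range(n_probes):   # best affinities, concentrations fixed
            a[i] = golden_min(
                lambda x: sum(cell_loss(pm[i][j], mm[i][j], x * c[j], z)
                              for j in range(n_chips)), 0.0, a_max)
        mean_a = sum(a) / n_probes  # a*c is unchanged by this rescaling
        if mean_a > 0.0:
            a = [v / mean_a for v in a]
            c = [v * mean_a for v in c]
    return a, c

# Noise-free toy table, including one chip with zero concentration.
a_true = [0.5, 1.0, 1.5]
c_true = [0.0, 100.0, 400.0]
mm = [[50.0] * len(c_true) for _ in a_true]
pm = [[a_i * c_j + 50.0 for c_j in c_true] for a_i in a_true]

a_fit, c_fit = polish(pm, mm)
print("affinities:    ", [round(v, 3) for v in a_fit])
print("concentrations:", [round(v, 3) for v in c_fit])
```

Both searches run on intervals starting at zero, so the non-negativity constraint is enforced by construction.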

36
How does it work on real data?
  • Gold-standard data generated by spiking in known
    transcript
  • Example is one of the transcripts (6th)
  • Look at residuals to find outliers

37
Latin Square Experimental Design
Spike Groups
Exp
Affymetrix Confidential
38
Model fit: A = 1.0, C = 0.0
[Figure panels: C-values, A-values]
39
Residuals: Fit Concentration
40
Residuals: Fit Probes
41
Fit concentration and affinities - data fits
except for outliers clearly revealed
42
What are the outliers?
43
Final results (value)
44
Know everything (approx)
[Figure: Likelihood plotted against Estimate, two panels]
45
Trick: P/A calls by fit
46
Trick: models are good for residuals!
47
Optimization (harder to illustrate)
  • Current implementation uses descent optimization
    (Newton-like)
  • Start with a good initial guess (median polish)
  • Improve by descent
  • Try jumps to escape local minima

48
Evaluating performance
  • MvA plots (unbiased/biased)
  • Receiver Operating Characteristic (ROC)
  • Area Under Curve (AUC) (global/stratified)
  • Benchmark results

49
MvA plots
  • Scatterplot turned 45 degrees
  • Plotting A vs B
  • M = log(A) - log(B)
  • A = (log(A) + log(B))/2 (average)
  • Allows easy visualization of changes
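
The M and A coordinates are simple to compute for a pair of chips. In this sketch the base-2 logarithm is an assumption (the slide only says log), and the signal values are made up.

```python
import math

def mva(a_vals, b_vals):
    """Return (M, A) pairs for paired positive signals from two experiments."""
    points = []
    for a, b in zip(a_vals, b_vals):
        la, lb = math.log2(a), math.log2(b)
        points.append((la - lb, (la + lb) / 2.0))
    return points

chip_a = [100.0, 400.0, 64.0]   # hypothetical signals, experiment A
chip_b = [100.0, 200.0, 256.0]  # hypothetical signals, experiment B
points = mva(chip_a, chip_b)

for m, avg in points:
    print(f"M = {m:+.3f}   A = {avg:.3f}")
```

An unchanged transcript sits on M = 0, and with base-2 logs a two-fold change sits at M = +1 or -1 regardless of its average intensity.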

50
MVA (bias added for stabilization)
51
Receiver Operating Characteristic
  • ROC curves measure separation of distributions
    for two states
  • Changed or unchanged between pair(s) of
    experiments
  • Depends on the variation of the signal within an
    experiment, and the separation between the two
    states
  • Note that just measuring variation or just
    measuring separation can be misleading!
  • One popular method of defining changed is a
    fold-change threshold
  • ROC curves can be summarized by area under
    curve
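
A minimal sketch of building an ROC curve and its area from two sets of scores follows. The scores are made up and stand for any change statistic, such as an absolute log fold-change or a t-like statistic; higher is taken to mean "more likely changed".

```python
def roc_points(changed, unchanged):
    """(false positive rate, true positive rate) pairs, sweeping the threshold down."""
    thresholds = sorted(set(changed) | set(unchanged), reverse=True)
    pts = [(0.0, 0.0)]
    for t in thresholds:
        tpr = sum(s >= t for s in changed) / len(changed)
        fpr = sum(s >= t for s in unchanged) / len(unchanged)
        pts.append((fpr, tpr))
    return pts

def area_under_curve(pts):
    """Trapezoidal area under the ROC points."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))

changed = [2.1, 1.4, 0.9, 3.0]    # hypothetical scores, truly changed transcripts
unchanged = [0.1, 0.3, 1.0, 0.2]  # hypothetical scores, unchanged transcripts

pts = roc_points(changed, unchanged)
auc = area_under_curve(pts)
print("AUC =", auc)
```

An AUC of 1.0 means the two distributions are perfectly separated and 0.5 means no separation; here one unchanged score outranks one changed score, giving 15/16.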

52
Overall performance good (ROC)
53
Specific performance regimes are of interest
  • Low, medium, high concentrations
  • Relatively small fold-changes (2-fold, 4-fold)
  • Thresholds defined by fold-change
  • Thresholds defined by change relative to
    variation (t-like statistic)

54
PLIER (16) has excellent discrimination at the
low end for differential change
[Figure panels: FC, TSTAT]
55
Output characteristics of some standard methods
  • MAS 5.0: Not variance stabilized (*), some bias,
    runs on single chips
  • PLIER: Not variance stabilized (*), minimal bias,
    reduced variance, runs on multiple chips
  • RMA: Variance stable, noticeable bias, low
    variance, runs on multiple chips
  • (*) Can always apply stabilizing transformation

56
PLIER unbiased - Can always add bias to
stabilize variance
57
MVA plots show the distribution of variation in
the data, but also show (some) of the compression
of the dynamic range
58
Drawing fold-change lines leads to ROC curves
which are heavily biased towards results with
stable variance
59
Differential change can be detected using the
variation in the data by intensity
60
Thresholds based on the variance (t-stat analog)
yield a better estimate of the relative
performance of methods
61
Works fine on U133 too
62
Bonus M-estimator tricks
  • Handle PM-only (PMMM, PM-B) just fine by
    replacing error model in f
  • Play Bayesian games (affinity penalties,
    concentration penalties)

63
M-estimator: PM only
  • PM - B = a*c
  • background estimate perfect
  • e*PM - B = a*c
  • e = (a*c + B)/PM
  • proceed in the same framework using e
  • Note that B can be zero for (a*c > 0)
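
For the PM-only case the residual has a closed form with no square root. A small sketch, with a hypothetical function name and made-up intensities:

```python
import math

def pm_only_residual(pm, b, a, c):
    """r = log(e) with e = (a*c + B) / PM; requires PM > 0 and a*c + B > 0."""
    return math.log((a * c + b) / pm)

# A perfect fit gives r = 0, with or without background.
print(pm_only_residual(pm=150.0, b=50.0, a=1.0, c=100.0))
print(pm_only_residual(pm=150.0, b=0.0, a=1.5, c=100.0))
# Predicting too little signal gives a negative residual.
print(pm_only_residual(pm=150.0, b=50.0, a=1.0, c=25.0))
```

The residual then goes through the same f(r, z) and the same optimization as in the PM-MM case.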

64
PM-only global background biased
65
Can play Bayesian games
  • Probe affinities likely to be log-normal
    distributed
  • Add a penalty term to avoid overweighting any
    single probe
  • Good when insufficient data
  • sum(log(e)^2) + (penalty)*sum(log(a)^2)
  • Can do the same for concentration
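
The penalized objective can be sketched directly from the formula. The function name, the penalty weight and the inputs are illustrative; affinities must be strictly positive for log(a) to exist.

```python
import math

def penalized_objective(log_errors, affinities, penalty):
    """sum(log(e)^2) + penalty * sum(log(a)^2)."""
    fit_term = sum(r * r for r in log_errors)
    prior_term = sum(math.log(a) ** 2 for a in affinities)
    return fit_term + penalty * prior_term

log_errors = [0.1, -0.2]    # hypothetical residuals r = log(e)
affinities = [1.0, math.e]  # log(a) = 0 and 1
value = penalized_objective(log_errors, affinities, penalty=0.5)
print(value)
```

The penalty term is zero when every affinity equals 1 and grows as any single affinity drifts away from 1 on the log scale, which is what keeps one probe from dominating when data are scarce.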

66
Bayesian prior on probes
Avoid overweighting probes in problem cases
Doesn't affect real data
67
PLIER
  • M-estimators form a very flexible framework for
    analysis
  • Can handle PM-B, PM-MM, PM-only approaches in
    same framework
  • Handles zero/near-zero concentration affinities
    in model directly
  • Seems to produce good results

68
PLIER: obtaining an implementation
  • PLIER algorithm SDK is now available under a GPL
    open source license.
  • The code is available as C++ without Windows
    dependencies. Documentation is included at the
    site.
  • All of us at Affymetrix hope that releasing PLIER
    in this manner promotes all of the values that
    the Bioconductor community embraces.
  • http://www.affymetrix.com/support/developer/index.affx

69
Thanks
  • David Kulp
  • Sejal Shah
  • Simon Cawley
  • David Finkelstein
  • Mike Lelivelt
  • Teresa Webster
  • Rui Mei
  • Suzanne Dee
  • Stefan Bekiranov
  • Xiaojun Di
  • Alex Cheung
  • Steve Lincoln
  • Many, many others!