Analysis of time-course gene expression data - PowerPoint PPT Presentation

1
Analysis of time-course gene expression data
  • Shyamal D. Peddada
  • Biostatistics Branch, National Inst. Environmental
    Health Sciences (NIH)
  • Research Triangle Park, NC

2
Outline of the talk
  • Some objectives for performing long series
    time-course experiments
  • Single cell-cycle experiment
  • A nonlinear regression model
  • Phase angle of a cell cycle gene
  • Inference
  • Open research problems
  • Multiple cell-cycle experiments
  • Coherence between multiple cell-cycle
    experiments
  • Illustration
  • Open research problems

3
Objectives
  • Some genes play an important role during the
    cell division cycle process. They are known as
    cell-cycle genes.
  • Objectives: Investigate various characteristics
    of cell-cycle and/or circadian genes, such as
  • Amplitude of initial expression
  • Period
  • Phase angle of expression (angle of maximum
    expression for a cell cycle gene)

4
Phases in cell division cycle
5
A brief description
  • G1 phase
  • "GAP 1". For many cells, this phase is the
    major period of cell growth during its lifespan.
  • S ("Synthesis") phase
  • DNA replication occurs.

6
A brief description
  • G2 phase
  • "GAP 2". Cells prepare for M phase. The G2
    checkpoint prevents cells from entering mitosis
    when DNA has been damaged since the last division,
    providing an opportunity for DNA repair and
    stopping the proliferation of damaged cells.
  • M (Mitosis) phase
  • Nuclear (chromosomes separate) and cytoplasmic
    (cytokinesis) division occur. Mitosis is further
    divided into 4 phases.

7
Single, long series experiment
8
Whitfield et al. (Molecular Biology of the Cell,
2002)
  • Basic design is as follows:
  • Experimental units: human cancer cells (HeLa)
  • Microarray platform: cDNA chips with approx.
    43,000 probes (i.e. roughly 29,000 genes)
  • 3 different patterns of time points (i.e. 3
    different experiments)
  • One of the goals of these experiments was to
    identify periodically expressed genes.

9
Whitfield et al. (Molecular Biology of the Cell,
2002)
  • Experiment 1 (26 time points)
  • HeLa cancer cells arrested in the S-phase using a
    double thymidine block.
  • Sampling times after arrest (hrs):
  • 0 1 2 3 4 5 6 7 8 9 10 11 12 14 15 16 18 20 22
    24 26 28 32 36 40 44.

10
Whitfield et al. (2002)
  • Experiment 2 (47 time points)
  • HeLa cancer cells arrested in the S-phase using a
    double thymidine block.
  • Sampling times after arrest (hrs):
  • every hour between 0 and 46.

11
Whitfield et al. (2002)
  • Experiment 3 (19 time points)
  • HeLa cancer cells arrested in the M-phase using
    thymidine and then nocodazole.
  • Sampling times after arrest (hrs):
  • 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34
    36.

12
Whitfield et al. (2002): Phase marker genes
  Cell Cycle Phase   Genes
  ----------------   --------------------------
  G1/S               CCNE1, CDC6, PCNA, E2F1
  S                  RFC4, RRM2
  G2                 CDC2, TOP2A, CCNA2, CCNF
  G2/M               STK15, CCNB1, PLK, BUB1
  M/G1               VEGFC, PTTG1, CDKN3, RAD21

13
Questions
  • Can we describe the gene expression of a
    cell-cycle gene as a function of time?
  • Can we determine the phase angle for a given
    cell-cycle gene? i.e. can we quantify the
    previous table in terms of angles on a circle?
  • What is the period of expression for a given
    gene?
  • Can we test the hypothesis that all cell-cycle
    genes share the same time period?
  • Etc.

14
Profile of PCNA based on experiment 2 data
15
Some important observations
  1. Gene expression has a sinusoidal shape
  2. Gene expression for a given gene is an average
    value of mRNA levels across a large number of
    cells
  3. Duration of cell cycle varies stochastically
    across cells
  4. Initially cells are synchronized but over time
    they fall out of synchrony
  5. Gene expression of a cell-cycle gene is expected
    to decrease/decay over time. This is because
    of items 2 and 4 listed above!

16
Random Periods Model (PNAS, 2004)
  • a and b: background drift parameters
  • K: the initial amplitude
  • T: the average period
  • the attenuation parameter
  • the phase angle
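
The model equation itself did not survive in this transcript. As a rough illustration of observations 2 to 4 on the previous slide, the sketch below averages cosine curves over many cells whose cycle durations vary around T. The log-normal spread of the period, the names `sigma` (attenuation) and `phi` (phase angle), and all numeric values are assumptions made for illustration, not the published parameterisation.

```python
import math
import random

def rpm_mean_expression(t, a, b, K, T, sigma, phi, n_cells=5000, seed=0):
    """Approximate mean expression at time t, averaged over n_cells cells.

    Each simulated cell has its own period T * exp(sigma * z), z ~ N(0, 1)
    (an assumed log-normal spread); a + b*t is the background drift.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_cells):
        period = T * math.exp(sigma * rng.gauss(0.0, 1.0))
        total += math.cos(2.0 * math.pi * t / period + phi)
    return a + b * t + K * total / n_cells

# Hypothetical parameter values, for illustration only.
params = dict(a=0.0, b=0.0, K=1.0, T=15.0, sigma=0.1, phi=0.0)
for t in (0.0, 15.0, 30.0, 45.0):
    print(t, round(rpm_mean_expression(t, **params), 3))
```

With a positive `sigma` the successive peaks shrink, because cells with slightly different periods drift out of synchrony and their cosines partly cancel in the average.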

17
Fitted curves for some phase marker genes
18
(No Transcript)
19
(No Transcript)
20
(No Transcript)
21
Whitfield et al. (2002): Phase marker genes
  Phase   Genes                        Phase angles (radians)
  -----   --------------------------   ----------------------
  G1/S    CCNE1, CDC6, PCNA, E2F1      0.56, 5.96, 5.87, 5.83
  S       RFC4, RRM2                   5.47, 5.36
  G2      CDC2, TOP2A, CCNA2, CCNF     4.24, 3.74, 3.55, 3.25
  G2/M    STK15, CCNB1, PLK, BUB1      3.06, 2.67, 2.61, 2.51
  M/G1    VEGFC, PTTG1, CDKN3, RAD21   2.66, 2.40, 2.25, 1.81
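
Because angles wrap at 2π, CCNE1 (0.56) is in fact close to the other G1/S genes (about 5.9). A minimal sketch of summarising each phase by its circular mean, using the angles listed on this slide; the helper name `circular_mean` is ours and is not part of the talk.

```python
import math

def circular_mean(angles):
    """Mean direction of angles (radians), returned in [0, 2*pi)."""
    s = sum(math.sin(x) for x in angles)
    c = sum(math.cos(x) for x in angles)
    return math.atan2(s, c) % (2.0 * math.pi)

# Phase angles (radians) copied from the slide.
phase_angles = {
    "G1/S": [0.56, 5.96, 5.87, 5.83],
    "S":    [5.47, 5.36],
    "G2":   [4.24, 3.74, 3.55, 3.25],
    "G2/M": [3.06, 2.67, 2.61, 2.51],
    "M/G1": [2.66, 2.40, 2.25, 1.81],
}
means = {phase: circular_mean(a) for phase, a in phase_angles.items()}
for phase, m in means.items():
    print(f"{phase:5s} {m:.2f}")
```

An ordinary arithmetic mean would place G1/S near 4.6 radians, in the middle of the G2 genes, which is why the circular version is used here.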

22
A hypothesis of biological interest
  • Do all cell cycle genes have the same T and the same
    attenuation parameter, while the other 4 parameters
    are gene specific?
  • i.e.

23
An Important Feature
  • Correlated data
  • Temporal correlation within gene
  • Gene-to-gene correlations

24
Test Statistic
  • Wald statistic for heteroscedastic linear and
    non-linear models
  • Zhang, Peddada and Rogol (2000)
  • Shao (1992)
  • Wu (1986)

25
The Null Distribution
  • Due to the underlying correlation structure, the
    asymptotic approximation is not appropriate.
  • Use the moving-blocks bootstrap technique on the
    residuals of the nonlinear model.
  • Künsch (1989)

26
Moving-blocks Bootstrap
  • Step 1: Fit the null model to the data and
    compute the residuals.
  • Step 2: Draw a simple random sample (with
    replacement) from all possible blocks of
    consecutive residuals of a specific size.

27
Moving-blocks Bootstrap
  • Step 3: Add these residuals to the fitted curve
    under the null hypothesis to obtain the bootstrap
    data set.
  • Step 4: Using the bootstrap data, fit the model
    under the alternative hypothesis and compute the
    Wald statistic.

28
Moving-blocks Bootstrap
  • Step 5: Repeat the above steps a large number of
    times.
  • Step 6: The bootstrap p-value is the proportion
    of the above Wald statistics that exceed the Wald
    statistic determined from the actual data.
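
Steps 2 to 6 can be sketched as follows. This is a simplified outline under stated assumptions: the nonlinear model fit and the Wald statistic are replaced by a user-supplied `statistic` callable, and the function names, block size and toy data are hypothetical.

```python
import random

def moving_blocks_resample(residuals, block_size, rng):
    """Step 2: concatenate randomly chosen blocks of consecutive residuals."""
    n = len(residuals)
    starts = range(n - block_size + 1)          # all possible block starts
    out = []
    while len(out) < n:
        s = rng.choice(starts)                  # sampling with replacement
        out.extend(residuals[s:s + block_size])
    return out[:n]                              # trim to the original length

def bootstrap_p_value(fitted_null, residuals, statistic, observed,
                      block_size=3, n_boot=1000, seed=0):
    """Steps 3-6: rebuild data under the null, recompute the statistic,
    and report the proportion of bootstrap values exceeding `observed`."""
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_boot):
        resampled = moving_blocks_resample(residuals, block_size, rng)
        y_star = [f + e for f, e in zip(fitted_null, resampled)]
        if statistic(y_star) > observed:
            exceed += 1
    return exceed / n_boot

# Toy usage: a flat null fit and a stand-in statistic (NOT the Wald statistic).
fitted = [0.0] * 12
resid = [0.3, -0.1, 0.2, -0.4, 0.1, 0.0, -0.2, 0.5, -0.3, 0.1, -0.1, 0.2]
stat = lambda y: abs(sum(y)) / len(y)
p = bootstrap_p_value(fitted, resid, stat, observed=stat(resid))
print(p)
```

Resampling whole blocks rather than single residuals keeps the short-range temporal correlation within each block, which is the reason for using this variant of the bootstrap on time-course data.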

29
Analysis of experiment 2
  • The bootstrap p-value for testing this hypothesis,
    using Experiment 2 data of Whitfield et al.
    (2002), is 0.12.
  • Thus the hypothesis is not rejected, and our model
    is biologically plausible.

30
Statistical inferences on the phase angle:
Multiple experiments
31
Some questions of interest
  • How to evaluate or combine results from multiple
    cell division cycle experiments?
  • Are the results consistent across experiments?
  • How to evaluate this?
  • What could be a possible criterion?

32
Data
  • RPM (Random Periods Model) estimate of the phase
    angle of a cell-cycle gene g from each experiment.

33
Representation using a circle
  • Consider 4 cell cycle genes A, B, C, D. The
    vertical line in the circle denotes the reference
    line. The angles are measured in a
    counter-clockwise direction.
  • Thus the sequential order of expression in this
    example is A, B, D, C.

(Figure: circle with genes A, B, C and D marked around it)
34
Coherence in multiple cell-cycle experiments
  • A group of cell cycle genes are said to be
    coherent across experiments if their sequential
    order of the phase angles is preserved across
    experiments.

(Figure: genes A, B, C and D placed on three circles labelled Exp 1, Exp 2 and Exp 3)
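
A minimal sketch of checking this definition on error-free angles, treating the sequential order as cyclic (an assumption on our part: rotating a whole experiment should not change the order). The angle values and function names are hypothetical.

```python
import math

def cyclic_order(angles):
    """Genes sorted by phase angle (counter-clockwise), rotated so the
    alphabetically first gene leads; this makes cyclic orders comparable."""
    order = sorted(angles, key=lambda g: angles[g] % (2.0 * math.pi))
    k = order.index(min(order))
    return order[k:] + order[:k]

def coherent(*experiments):
    """True if every experiment has the same cyclic order of genes."""
    first = cyclic_order(experiments[0])
    return all(cyclic_order(e) == first for e in experiments[1:])

# Hypothetical phase angles (radians) for genes A-D in three experiments.
exp1 = {"A": 0.5, "B": 1.5, "D": 3.0, "C": 4.5}
exp2 = {"A": 2.0, "B": 3.1, "D": 4.4, "C": 6.0}   # different angles, same order
exp3 = {"A": 0.5, "B": 3.0, "D": 1.5, "C": 4.5}   # B and D swapped
print(coherent(exp1, exp2), coherent(exp1, exp3))
```

This exact check only suits true angles; with estimated angles, small errors can swap neighbouring genes, so a statistical criterion is needed.
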
35
Geometric Representation
  • We shall represent phase angles from multiple
    cell cycle experiments using concentric circles.
  • Each circle represents an experiment.
  • The same gene in a pair of experiments is
    connected by a line segment.
  • A figure with non-intersecting lines indicates
    perfect coherence.
  • If there is no coherence at all then there will
    be many intersecting lines.

36
Example: Perfectly coherent
37
Example: Perfectly coherent
38
Example: No coherence
39
Estimated Phase Angles
  • Due to statistical errors in estimation, the
    estimated phase angles from multiple cell cycle
    experiments need not preserve the sequential
    order even though the true phase angles are in a
    sequential order.

40
How to evaluate coherence?
41
Some background on regression for circular data
42
(Figure: two circles, labelled Experiment A and Experiment B)
Question: Can we determine a rotation matrix A
such that we can rotate the circle representing
Experiment A to obtain the circle representing
Experiment B?
43
Angle of rotation for a rigid body
  • Yes! By solving the following minimization problem:
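
The minimization problem itself is missing from the transcript. One standard choice, assumed here, is least squares on unit vectors: pick the angle θ that minimises the sum over genes of |v_g − R(θ) u_g|², where u_g and v_g are the unit vectors of gene g in Experiments A and B. That criterion has a closed-form solution, sketched below with hypothetical data.

```python
import math

def rotation_angle(alpha, beta):
    """Angle theta minimising sum_g |v_g - R(theta) u_g|^2, where u_g and v_g
    are the unit vectors at angles alpha[g] and beta[g]."""
    # |v - R(theta) u|^2 = 2 - 2*cos(beta - alpha - theta), so the minimiser
    # is the mean direction of the differences beta - alpha.
    s = sum(math.sin(b - a) for a, b in zip(alpha, beta))
    c = sum(math.cos(b - a) for a, b in zip(alpha, beta))
    return math.atan2(s, c)

# Hypothetical angles: experiment B is experiment A rotated by 0.7 radians.
exp_a = [0.2, 1.1, 2.9, 4.0]
exp_b = [(x + 0.7) % (2.0 * math.pi) for x in exp_a]
theta = rotation_angle(exp_a, exp_b)
print(round(theta, 3))
```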

44
Determination of Coherence Across k Experiments
45
The Basic Idea
  • Consider a rigid body rotating in a plane.
    Suppose the body is perfectly rigid with no
    deformations.
  • Consider the 2x2 rotation matrices from
    experiment i to experiment i+1 (with k+1 ≡ 1). Then
  • Alternatively

46
The Basic Idea
  • Equivalently, if
  • Then under perfect rigid body motion we should
    have

47
Problem!
  • In the present context we do NOT necessarily have
    a rigid body!
  • Not all experiments are performed with same
    precision.
  • The time axis may not be constant across
    experiments.
  • Number of time points may not be same across
    experiments.
  • Etc.

48
Example: Not a rigid motion but perfectly
coherent
49
Consequence
  • Rotation matrix A alone may not be enough to
    bring two circles to congruence!
  • An additional association/scaling parameter may
    be needed, as seen in the previous figure!

50
Circular-Circular regression model for a pair of
experiments (Downs and Mardia, 2002)
  • For , let
    denote a pair of
  • angular variables.
  • Suppose is von-Mises distributed
    with
  • mean direction and concentration
    parameter

51
Circular-Circular Regression Model (Downs and
Mardia, 2002)
The regression model is given by the link function
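
The link function is missing from the transcript. The form usually quoted for the Downs and Mardia model is tan((μ − β)/2) = ω tan((x − α)/2) with −1 ≤ ω ≤ 1; it is treated here as an assumption and should be checked against the paper. The function and parameter names below are ours.

```python
import math

def dm_mean_direction(x, alpha, beta, omega):
    """Mean direction mu solving tan((mu - beta)/2) = omega * tan((x - alpha)/2).

    Written with atan2 so that it stays finite when (x - alpha)/2 = pi/2.
    """
    half = (x - alpha) / 2.0
    mu = beta + 2.0 * math.atan2(omega * math.sin(half), math.cos(half))
    return mu % (2.0 * math.pi)

x = 1.2
print(dm_mean_direction(x, alpha=0.3, beta=1.0, omega=1.0))   # pure rotation
print(dm_mean_direction(x, alpha=0.3, beta=1.0, omega=0.5))   # rotation plus scaling
```

With ω = 1 the map reduces to a rigid rotation by β − α; other values of ω stretch or compress angles around the circle while preserving their order, which is the extra association/scaling parameter mentioned earlier.
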
52
Back to the toy examples
53
(No Transcript)
54
(No Transcript)
55
Determination Of Coherence
  • Suppose we have K experiments, labeled
    1, 2, 3, ..., K.
  • Consider the angle of rotation for the regression
    of experiment i on experiment j for a group of g
    genes.
  • Compute
  • Note .

56
Determination Of Coherence
  • We expect this statistic under no coherence to
    be stochastically larger than under coherence.

57
Comparison of Cumulative Distribution Functions
Blue line: Coherence. Pink line: No coherence.
58
Determination Of Coherence
  • For a given data set, compute the statistic.
  • Generate its bootstrap distribution under the
    null hypothesis of no coherence.

59
Bootstrap P-value For Coherence
  • Let denote the angle of rotation
    using
  • the bootstrap sample. Then the P-value is
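
The P-value formula is missing from the transcript. Since the statistic is expected to be stochastically larger under no coherence, small observed values are the evidence for coherence, so one reading (an assumption) is the proportion of null bootstrap statistics at or below the observed value. The numbers below are hypothetical.

```python
def coherence_p_value(observed, null_stats):
    """Proportion of null bootstrap statistics at or below the observed one."""
    return sum(1 for s in null_stats if s <= observed) / len(null_stats)

# Hypothetical values, for illustration only.
null_stats = [0.8, 1.9, 2.4, 0.3, 1.1, 2.9, 1.6, 0.05, 2.2, 1.3]
print(coherence_p_value(0.1, null_stats))
```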

60
Illustration: Whitfield et al. data
  • There are 3 experiments. The phase angle of each
    gene was estimated using the model of Liu et al.
    (2004).
  • A total of 47 common cell-cycling genes were
    selected from the three experiments.

61
Estimates
  • The estimated values of interest are
  • Note that

62
Conclusion
  • Since the bootstrap P-value < 0.05, we conclude
    that the three experiments are coherent.

63
(No Transcript)
64
Statistical inferences on the phase angle: Some
open problems
65
Estimation subject to inequality constraints
  • It is reasonable to hypothesize that for a normal
    cell division cycle, the p phase marker genes
    must express in an order around the unit circle.
  • Thus they must satisfy

66
Open problems: data from a single experiment
  • How to estimate the phase angles subject to the
    simple order restriction?
  • More generally, how to estimate the phase angles
    subject to an isotropic simple order restriction?
  • How to test the above hypothesis? What are the
    null and alternative hypotheses?

67
Open problems: data from multiple experiments
  • How do we estimate the phase angles from multiple
    experiments under the order restriction on the
    phase angles of cell cycle genes?
  • What are the statistical errors associated with
    such an estimator?
  • How to construct confidence intervals and test
    hypotheses?

68
Acknowledgments
  • Delong Liu (former Post-doc at NIEHS)
  • David Umbach (NIEHS)
  • Leping Li (NIEHS)
  • Clare Weinberg (NIEHS)
  • Pat Crocket (Constella Group)
  • Cristina Rueda (Univ. of Valladolid, Spain)
  • Miguel Fernandez (Univ. of Valladolid, Spain)