Title: Differential Gene Expression: Ischemic vs. Nonischemic
1Differential Gene Expression: Ischemic vs. Nonischemic
- Jing Hu
- Dongmei Li
- Shuyan Wan
- Richard Yamada
- Jeong-Me Yoon
- Zailong Wang (Mentor)
2Outline for Our Talk
- Introduction and summary of previous work (Richard)
- Exploratory Analysis of Data (Jeong-Mi)
- Statistical Methods (Shuyan)
- Selected Gene Analysis (Jing)
- Conclusions and Further Work (Dongmei)
3Human Heart Function
4Arteries
5What is Ischemic Cardiomyopathy?
- Ischemic: lack of blood and oxygen
- Cardio: refers to the heart
- Myopathy: muscle-related disease
- Ischemic cardiomyopathy is a medical term that doctors use to describe patients who have congestive heart failure as a result of coronary artery disease (blocked coronary arteries).
6Ischemic Cardiac Myopathy
- Risk factors: genetics, smoking, high-fat diet, obesity, and prior heart problems
- Incidence: 1 in 100, typically male, starting in middle age
- Symptoms include chest pain, shortness of breath, irregular/rapid pulse, and the sensation of feeling the heart beat
- Treatment regimens: ACE inhibitors, beta blockers, angioplasty (to improve blood flow to the damaged or weakened heart muscle), and heart transplant (severe cases)
7The Basic Scientific Question
- What kinds of changes in cardiac transcription profiles are brought about by heart failure?
- Two ways to attack the question: molecular biology (hypothesis-based) vs. high-throughput techniques (i.e., microarrays followed by confirmation of gene expression with qPCR)
8Differential Expression between ischemic and
non-ischemic cardiomyopathy patients
- "Gene expression analysis of ischemic and nonischemic cardiomyopathy: shared and distinct genes in the development of heart failure"
- M. Kittleson, K. Minhas, R. Irizarry, S. Ye, G. Edness, E. Breton, J. Conte, G. Tomaselli, J. Garcia, and J. Hare. Physiol. Genomics, 21: 299-307, 2005
9Methods of Kittleson et al
- 31 cardiomyopathy vs. 6 normal patients (clinical characteristics were reasonably similar within groups)
- Tissue taken from cardiomyopathy patients at the time of LVAD or cardiac transplantation
- Identified differentially expressed genes in 2 comparisons, NICM (hypertrophic, valvular, alcoholic) vs. NF hearts and ICM vs. NF hearts, using significance analysis of microarrays (NICM: nonischemic cardiomyopathy; ICM: ischemic cardiomyopathy; NF: nonfailing)
- Identified genes with FDR < 5% and absolute fold change greater than 2.0
10Conclusions of Kittleson et al
- No prior hypothesis; the microarray experiment was used to generate hypotheses
- Types of genes differentially expressed (41 total): cell growth/maintenance (9), signal transduction (7), metabolism (3), cell adhesion/cell communication (2), binding (2), catalytic activity (2), nucleus (3), other (13)
11Conclusions of Kittleson et al
- Predominance of fatty acid metabolic genes: the genesis of NICM might be metabolic in nature
- Predominance of abnormalities in catalytic activity with ICM (serine proteinase inhibitors)
- TNFRSF11B (a member of the TNF receptor superfamily) is significantly downregulated in ICM
12Experimental Procedure for Data that We are Using
- Collected myocardial samples from patients undergoing cardiac transplantation whose failure arises from ischemic cardiomyopathy and from "normal" organ donors whose hearts cannot be used for transplants
- The transcriptional profile of the mRNA in these samples was measured with gene array technology.
- Changes in transcriptional profiles can be correlated with the physiologic profile of heart-failure hearts acquired at the time of transplantation.
13Working Hypothesis?
- Because of the results of Kittleson et al., we can generate a simple working hypothesis:
- The genes we find to be differentially expressed, using our methods of statistical analysis, should be roughly the same as those Kittleson et al. obtained in their paper.
14Exploratory Analysis of Data
- Goal: identify genes that are differentially expressed between Ischemic and Normal.
- Affymetrix data with two populations
- Expression of 54,675 genes is measured for 32 Ischemic samples and 14 Normal samples
- How do we compare?
15Pre-processing
- Only obtain the expression measurement of the data (i.e., put it into an exprSet) using the default of the justRMA method
- bgcorrect.method = rma
- normalize.method = quantiles
- summary.method = liwong
16- Histogram of Ischemic/Normal
  - The distribution is skewed right.
  - The range is from 4 to 14.
  - Both histograms have similar shapes.
- Boxplot of Ischemic/Normal
  - There are many outliers among the upper values.
  - The intensity of Ischemic is higher than Normal.
- Histogram of MAD (Median Absolute Deviation)
- Cut-off method by MAD
  - Keep genes with MAD > 0.1.
  - This filters out 675 genes from a total of 54,675 genes.
- Quantile-quantile plot
  - A visual aid for identifying genes with unusual test statistics.
  - It shows a large deviation at the right tail.
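The MAD cut-off can be sketched in a few lines. This is only an illustration under stated assumptions: the gene names and values are made-up toy data, the helper names `mad` and `filter_by_mad` are hypothetical, and the MAD here is unscaled (R's `mad()` multiplies by 1.4826 by default, and the slides do not say which form was used).

```python
from statistics import median


def mad(values):
    """Unscaled median absolute deviation of one gene's expression values."""
    center = median(values)
    return median([abs(v - center) for v in values])


def filter_by_mad(expression, cutoff=0.1):
    """Keep only genes whose MAD across all samples exceeds the cutoff."""
    return {gene: vals for gene, vals in expression.items() if mad(vals) > cutoff}


# Hypothetical toy data: gene name -> log2 expression across five samples.
toy = {
    "flat_gene": [7.00, 7.02, 6.99, 7.01, 7.00],   # barely varies, filtered out
    "variable_gene": [5.1, 8.3, 6.7, 9.0, 4.8],    # varies, kept
}
kept = filter_by_mad(toy)
print(sorted(kept))
```

Genes that hardly vary across samples carry little information about group differences, so dropping them first reduces the number of hypotheses tested later.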
22- t-Test for the mean difference between Ischemic and Normal
- $H_0: \mu_I = \mu_N$ vs. $H_1: \mu_I \neq \mu_N$
- We are testing 54,675 genes simultaneously, so we adjust for multiple testing when assessing the statistical significance of the observed associations, in order to control the false positive rate.
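A per-gene t-statistic for the Ischemic vs. Normal comparison might look like the sketch below. The slides do not say which variant of the t-test was used, so the unequal-variance (Welch) form is an assumption here, as are the function name `welch_t` and the toy values. Converting the statistic to a p-value needs the t distribution, which is outside the standard library and is left out.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Two-sample t-statistic allowing unequal variances (Welch form)."""
    na, nb = len(group_a), len(group_b)
    # Standard error of the difference in means, from the sample variances.
    se = sqrt(variance(group_a) / na + variance(group_b) / nb)
    return (mean(group_a) - mean(group_b)) / se


# Hypothetical log2 expression values for one gene.
ischemic = [8.1, 8.4, 7.9, 8.6]
normal = [6.0, 6.3, 5.8, 6.1]
print(welch_t(ischemic, normal))
```

Applied to every gene, this yields one test statistic (and one raw p-value) per gene, which is what the multiple-testing adjustment then operates on.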
23Multiple Hypothesis Testing
- Motivation: to identify as many differentially expressed genes as possible while incurring a relatively low proportion of false positives.
- H0: no differential gene expression (between the Ischemic and Normal groups)
- Large multiplicity problem: more than fifty thousand hypotheses are tested simultaneously.
- How can we control the false positive rate genome-wide? FDR or pFDR.
24Table 1. Possible outcomes from thresholding m genes for significance (m p-values with some cutoff point applied).

                                Called significant (reject H0)   Called not significant (accept H0)   Total
True null (H0 is true)          F (# of false positives)         m0 - F                               m0
True alternative (Ha is true)   T (# of true positives)          m1 - T                               m1
Total                           S (# of significant features)    m - S                                m
25False Discovery Rate
- $\mathrm{FDR} = E[F/S]$
- Because $S$ may be 0, FDR is defined to be $E[F/S \mid S > 0]\,P(S > 0)$; equivalently, define $F/S = 0$ if $S = 0$.
- Alternatively, define $\mathrm{pFDR} = E[F/S \mid S > 0]$. When $m$ is large, $P(S > 0)$ is approximately 1 and FDR is approximately equal to pFDR.
- FDR is a measure of the overall accuracy of a set of significant features.
26Linear Step-Up Procedure
27Steps
- Select desired limit q on E(FDR)
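The remaining steps of this slide did not survive in the transcript. For orientation, the standard Benjamini-Hochberg linear step-up rule is: sort the p-values, find the largest rank k with p_(k) <= k*q/m, and reject the k hypotheses with the smallest p-values. The sketch below implements that textbook rule; the function name `bh_step_up` and the example p-values are hypothetical, and it is not taken from the presenters' code.

```python
def bh_step_up(pvalues, q=0.05):
    """Indices of hypotheses rejected by the linear step-up rule at level q."""
    m = len(pvalues)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        # Remember the LARGEST rank whose p-value is under its threshold.
        if pvalues[idx] <= rank * q / m:
            k = rank
    return sorted(order[:k])


# Hypothetical p-values: the second-smallest misses its own threshold,
# but larger ranks pass, so the step-up rule still rejects all four.
print(bh_step_up([0.01, 0.04, 0.03, 0.035], q=0.05))
```

The "step-up" name reflects that a p-value can be rejected even if it exceeds its own threshold, provided some larger-ranked p-value falls below its threshold.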
28FDR Adjusted P-Values
- For an individual hypothesis,
29Data inter-dependencies
- Between genes
- Between measurement errors of expression levels
- co-regulation
- spatial effects
- RNA source
- normalization process
- pooled variability estimation
Multiple testing of such data will produce correlated test statistics!
30Correlated Test Statistics
Positive dependency (Benjamini & Yekutieli, 2001; Yekutieli, 2002).
- The linear step-up procedure controls the FDR for positively dependent test statistics.
- This condition is satisfied by
  - positively correlated one-sided normal and t test statistics.
  - absolute values of normal and t test statistics, when all null hypotheses are true.
31BH and BY procedure
- BH: adjusted p-values for the Benjamini & Hochberg (1995) step-up FDR-controlling procedure (independent and positive regression dependent test statistics).
- BY: adjusted p-values for the Benjamini & Yekutieli (2001) step-up FDR-controlling procedure (general dependency structures).
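Both kinds of adjusted p-values can be computed with one routine. The sketch below uses the standard definitions: the BH adjusted p-value is the running minimum of m*p_(k)/k taken from the largest p-value down, and BY multiplies that by the extra factor c(m) = 1 + 1/2 + ... + 1/m. The function name `adjust_pvalues` and the inputs are hypothetical; this is an illustration, not the software used for the talk.

```python
def adjust_pvalues(pvalues, method="BH"):
    """Step-up FDR adjusted p-values; method is "BH" or "BY"."""
    m = len(pvalues)
    # BY pays a penalty c(m) = sum of 1/i to cover general dependency.
    c = sum(1.0 / i for i in range(1, m + 1)) if method == "BY" else 1.0
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0  # adjusted p-values are capped at 1
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m * c / rank)
        adjusted[idx] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.035]  # hypothetical raw p-values
bh = adjust_pvalues(raw, "BH")
by = adjust_pvalues(raw, "BY")
print(bh, by)
```

Because c(m) is at least 1, BY adjusted p-values are never smaller than the BH ones, so BY selects fewer genes at any given cutoff.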
32Our Results
Number of genes with p-value at or below each cutoff (raw and adjusted):

Cutoff   rawp    BH      BY
0        17577   17577   17577
1e-04    38400   37960   35207
2e-04    39239   38833   35935
3e-04    39717   39334   36373
4e-04    40053   39690   36714
5e-04    40321   39972   36966
6e-04    40565   40174   37166
7e-04    40786   40389   37370
8e-04    40948   40569   37513
9e-04    41096   40739   37661
0.01     41226   40885   37781
33Plot of sorted adjusted p-values
34Plot of adjusted p-values vs. test statistics
35Gene Selection Analysis
- Further select genes based on the fold change between the two conditions (Ischemic vs. Normal)
- The fold change for each gene is calculated as the average expression over all Ischemic samples divided by the average expression over all Normal samples.
37Fold change cutoff value
- There are 1495 genes with log2(fold change) > 1, and 26 genes with log2(fold change) < -1
- There are only 43 genes with log2(fold change) > 2, and 3 genes with log2(fold change) < -2
- We choose the first option
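The fold-change selection can be sketched as follows. It follows the definition on the previous slide (mean Ischemic expression divided by mean Normal expression) and therefore assumes the values are on the original intensity scale; for log2-scale values such as RMA output, the log2 fold change would instead be a difference of means. The gene names, values, and helper names are hypothetical.

```python
from math import log2
from statistics import mean


def log2_fold_change(ischemic, normal):
    """log2 of (mean Ischemic expression / mean Normal expression)."""
    return log2(mean(ischemic) / mean(normal))


def select_by_fold_change(data, cutoff=1.0):
    """Split genes into up- and down-regulated lists by |log2 FC| > cutoff."""
    up, down = [], []
    for gene, (isch, norm) in data.items():
        lfc = log2_fold_change(isch, norm)
        if lfc > cutoff:
            up.append(gene)
        elif lfc < -cutoff:
            down.append(gene)
    return up, down


# Hypothetical intensity-scale data: gene -> (Ischemic values, Normal values).
toy = {
    "gene_up": ([400, 440, 360], [100, 110, 90]),
    "gene_down": ([50, 60, 40], [200, 220, 180]),
    "gene_flat": ([100, 105, 95], [100, 98, 102]),
}
up, down = select_by_fold_change(toy, cutoff=1.0)
print(up, down)
```

A cutoff of 1 on the log2 scale corresponds to a two-fold change in either direction, which is the first option chosen above.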
39Discussion
- Among the 54,675 mRNA transcripts present on the Affymetrix microarray platform, 675 housekeeping genes were filtered out.
- After selecting genes with an adjusted p-value less than 0.0001, 35,207 genes are left for the analysis of fold change.
- After fold change selection, 1,521 genes are left for further selection.
- Finally, 74 up-regulated genes and 26 down-regulated genes are selected from the microarray analysis for further biological verification and study.
40Summary of the Selected Genes
- Of the 100 genes, there are 53 genes that have
known biological functions. The functions of the
other 47 genes are unknown.
41Gene Classification
- Based on the biological process of the genes, the
100 genes can be classified in several
categories.
42Biological Function Classification
46Differentially Expressed Genes in ISC-Normal Comparisons
- Among the 100 genes that are differentially expressed between ischemic and normal, the majority fall into cell adhesion, cell growth and maintenance, signal transduction, muscle contraction and development, immune response, and regulation of transcription.
- Most of the genes in the above processes are up-regulated, except for one or two genes in cell growth and maintenance and in cell adhesion.
- Few genes belong to metabolism, inflammatory response, acute phase response, and oncogenesis.
47An important gene for Ischemic Cardiomyopathy
- Serine proteinase inhibitors have an anti-ischemic protective effect, which has previously been observed in pigs subjected to experimentally induced myocardial ischemia (Khan 2004): Aprotinin reduces reperfusion injury after regional ischemia and cardioplegic arrest. Protease inhibition may represent a molecular strategy to prevent postoperative myocardial injury after surgical revascularization with cardiopulmonary bypass.
- It was hypothesized to be an important gene in Kittleson's paper (Physiol. Genomics, 2005).
48The significance of the results
- The differential expression analysis identifies genes that are either up-regulated or down-regulated in ischemic patients. These genes can be correlated with clinical parameters in heart failure patients, and the results support ongoing efforts to incorporate expression profiling-based biomarkers in determining prognosis and response to therapy in heart failure.
49Comparison with Kittleson et al.'s Paper
- Although only one common gene is found in the analysis, this is not unexpected considering the differences in sample size, tissue, and statistical analysis method.
- However, most of the genes identified in our analysis fall into the same biological function categories.
50Limitation
- Because circumstances causing a donor heart to be
ineligible for cardiac transplantation, such as
infection or prolonged hypotension, can also
affect gene expression, a normal functional
unused donor heart is not the same as a normal
heart.
51Future Work
- First, the gene expression profile of these 100 genes needs to be verified by Northern blot or real-time RT-PCR (qPCR).
- After verification, some genes of unknown function with high fold change can be chosen so that biologists can study their functions.
52Acknowledgements
- MBI (Prof. Friedman and staff)
- Professors Shili Lin and Joseph Verducci
- Dr. Zailong Wang
- Dr. Nusrat Rabbee