Study of Gene Expression: Statistics, Biology, and Microarrays

Provided by: statUcla7
Learn more at: http://www.stat.ucla.edu

Transcript and Presenter's Notes

Title: Study of Gene Expression: Statistics, Biology, and Microarrays


1
(No Transcript)
2
Study of Gene Expression: Statistics, Biology,
and Microarrays
  • Ker-Chau Li
  • Statistics Department
  • UCLA
  • kcli_at_stat.ucla.edu

3
PART I. Cellular Biology
  • Macromolecules: DNA, mRNA, protein

4
Why Biology?
5
Human Genome Project
Begun in 1990, the U.S. Human Genome Project is a
13-year effort coordinated by the U.S. Department
of Energy and the National Institutes of Health.
The project originally was planned to last 15
years, but effective resource and technological
advances have accelerated the expected completion
date to 2003. Project goals are to
  • identify all of the approximately 30,000 genes in
    human DNA,
  • determine the sequences of the 3 billion chemical
    base pairs that make up human DNA,
  • store this information in databases,
  • improve tools for data analysis,
  • transfer related technologies to the private
    sector, and
  • address the ethical, legal, and social issues
    (ELSI) that may arise from the project.

Recent Milestones
  • June 2000: completion of a working draft of the
    entire human genome
  • February 2001: analyses of the working draft are
    published
Human Genome Program, U.S. Department of Energy,
Genomics and Its Impact on Medicine and Society
A 2001 Primer, 2001
6
Future Challenges: What We Still Don't Know
  • Gene number, exact locations, and functions
  • Gene regulation
  • DNA sequence organization
  • Chromosomal structure and organization
  • Noncoding DNA types, amount, distribution,
    information content, and functions
  • Coordination of gene expression, protein
    synthesis, and post-translational events
  • Interaction of proteins in complex molecular
    machines
  • Predicted vs. experimentally determined gene
    function
  • Evolutionary conservation among organisms
  • Protein conservation (structure and function)
  • Proteomes (total protein content and function)
    in organisms
  • Correlation of SNPs (single-base DNA variations
    among individuals) with health and disease
  • Disease-susceptibility prediction based on gene
    sequence variation
  • Genes involved in complex traits and multigene
    diseases
  • Complex systems biology, including microbial
    consortia useful for environmental restoration
  • Developmental genetics, genomics
7
Medicine and the New Genomics
  • Gene Testing
  • Gene Therapy
  • Pharmacogenomics

Anticipated Benefits
  • improved diagnosis of disease
  • earlier detection of genetic predispositions to
    disease
  • rational drug design
  • gene therapy and control systems for drugs
  • personalized, custom drugs

8
Anticipated Benefits
Molecular Medicine
  • improved diagnosis of disease
  • earlier detection of genetic predispositions to
    disease
  • rational drug design
  • gene therapy and control systems for drugs
  • pharmacogenomics "custom drugs"

Microbial Genomics
  • rapid detection and treatment of pathogens
    (disease-causing microbes) in medicine
  • new energy sources (biofuels)
  • environmental monitoring to detect pollutants
  • protection from biological and chemical warfare
  • safe, efficient toxic waste cleanup
9
Anticipated Benefits
Agriculture, Livestock Breeding, and Bioprocessing
  • disease-, insect-, and drought-resistant crops
  • healthier, more productive, disease-resistant
    farm animals
  • more nutritious produce
  • biopesticides
  • edible vaccines incorporated into food products
  • new environmental cleanup uses for plants like
    tobacco
10
(No Transcript)
11
Human Genome Program, U.S. Department of Energy,
Genomics and Its Impact on Medicine and Society
A 2001 Primer, 2001
12
What is a gene?
13
(No Transcript)
14
SNP and Genetic Disease
15
(No Transcript)
16
(No Transcript)
17
(No Transcript)
18
Mitochondrial ATP Synthase
E. coli ATP Synthase
These images depicting models of ATP synthase
subunit structure were provided by John Walker.
Some equivalent subunits from different organisms
have different names.
19
(No Transcript)
20
PART II. Microarray
  • Genome-wide expression profiling

21
Differential gene expression: tissues, organs
22
(No Transcript)
23
Next Step in Genomics
  • Transcriptomics involves large-scale analysis of
    messenger RNAs (molecules that are transcribed
    from active genes) to follow when, where, and
    under what conditions genes are expressed.
  • Proteomics, the study of protein expression and
    function, can bring researchers closer than gene
    expression studies to what's actually happening
    in the cell.
  • Structural genomics initiatives are being
    launched worldwide to generate the 3-D
    structures of one or more proteins from each
    protein family, thus offering clues to function
    and biological targets for drug design.
  • Knockout studies are one experimental method for
    understanding the function of DNA sequences and
    the proteins they encode. Researchers inactivate
    genes in living organisms and monitor any changes
    that could reveal the function of specific genes.
  • Comparative genomics, analyzing DNA sequence
    patterns of humans and well-studied model
    organisms side by side, has become one of the
    most powerful strategies for identifying human
    genes and interpreting their function.
24
Microarray
25
MicroArray
  • Allows measuring the mRNA levels of thousands of
    genes in one experiment (a system-level response)
  • The data generation can be fully automated by
    robots
  • Common experimental themes:
      • Time course
      • Mutation/knockout response

26
Reverse transcription
Color: Cy3 (green), Cy5 (red)
27
Exploring the Metabolic and Genetic Control of
Gene Expression on a Genomic Scale
Joseph L. DeRisi, Vishwanath R. Iyer, Patrick O. Brown
28
(No Transcript)
29
PART III. Statistics
  • Low-level analysis
  • Comparative expression
  • Feature extraction
  • Classification, clustering
  • Pearson correlation
  • Liquid association

30
Image analysis
  • Convert an image into a number representing the
    ratio of the levels of expression between red and
    green channels
  • Color bias
  • Spatial, tip, spot effects
  • Background noise
  • cDNA and oligonucleotide arrays
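
The first bullet above, turning a spot into a single red-to-green number, might be sketched as follows. This is only an illustration, not the slide author's pipeline: the function name, the per-spot foreground/background inputs, the `floor` guard, and the intensity values are all assumptions made up for the example. The log base 2 is a common convention, not something the slide states.

```python
import math

def log_ratio(red_fg, red_bg, green_fg, green_bg, floor=1.0):
    """Return log2(red / green) for one spot after background subtraction.

    `floor` is an assumed guard: background-corrected intensities at or
    below zero are clamped to it so the logarithm stays defined.
    """
    red = max(red_fg - red_bg, floor)
    green = max(green_fg - green_bg, floor)
    return math.log2(red / green)

# Hypothetical spots:
# (red foreground, red background, green foreground, green background)
spots = [
    (1100.0, 100.0, 600.0, 100.0),  # red twice as strong as green
    (300.0, 100.0, 900.0, 100.0),   # green four times as strong as red
    (90.0, 100.0, 500.0, 100.0),    # red below background, clamped to floor
]

ratios = [log_ratio(*spot) for spot in spots]
for spot, value in zip(spots, ratios):
    print(spot, round(value, 2))
```

Working on the log scale makes a two-fold increase and a two-fold decrease symmetric around zero. Color bias and the spatial, tip, and spot effects listed above would call for further normalization, which this sketch does not attempt.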

31
Genome-wide expression profile: A basic structure
            cond1   cond2   ...   condp
  Gene1     x11     x12     ...   x1p
  Gene2     x21     x22     ...   x2p
  ...
  Genen     xn1     xn2     ...   xnp
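
As a rough sketch, the genes-by-conditions layout above can be held as a list of rows, one row per gene. The gene and condition labels and all values below are invented for illustration. The `pearson` helper is a hand-written version of the Pearson correlation listed under Part III, shown here only as one thing the row layout makes easy.

```python
import math

conditions = ["cond1", "cond2", "cond3", "cond4"]
genes = ["Gene1", "Gene2", "Gene3"]

# x[i][j] is the expression level of gene i under condition j (made-up values).
x = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [4.0, 3.0, 2.0, 1.0],
]

def profile(gene):
    """Return one row: the expression profile of a gene across conditions."""
    return x[genes.index(gene)]

def condition_column(cond):
    """Return one column: all genes measured under a single condition."""
    j = conditions.index(cond)
    return [row[j] for row in x]

def pearson(u, v):
    """Pearson correlation between two equal-length profiles."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (norm_u * norm_v)

print(condition_column("cond2"))
print(pearson(profile("Gene1"), profile("Gene2")))
print(pearson(profile("Gene1"), profile("Gene3")))
```

In this toy data Gene1 and Gene2 rise together across conditions while Gene3 falls, so the two correlations come out close to +1 and -1.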

32
Cond1, cond2, ..., condp denote various
environmental conditions, time points, cell
types, etc. under which mRNA samples are taken.
Note: numerous cells are involved.
Data quality issues:
  1. chip (manufacturer)
  2. mRNA sample (user)
It is important to have a homogeneous sample so
that cellular signals can be amplified.
  - Yeast cell cycle data: ideally all cells are
    engaged in the same activities
  - Synchronization
33
Example 1
  • Comparative expression
  • Normal versus cancer cells
  • ALL versus AML

34
E. Lander's group at MIT
  • Cancer classification (leukemia)
  • ALL vs. AML (arising from lymphoid or myeloid
    precursors, respectively)
  • Require different treatments
  • Traditional methods: nuclear morphology
  • Enzyme-based histochemical analysis (1960)
  • Antibodies (1970)
  • Genome-wide expression comparison

35
ALL (acute lymphoblastic leukemia)
AML (acute myeloid leukemia)
36
Gene selection
  • For each gene (row), compute a score defined by
  • (sample mean of X - sample mean of Y)
  • divided by
  • (standard deviation of X + standard deviation
    of Y)
  • X = ALL, Y = AML
  • Genes (rows) with the highest scores are selected.
  • Does it work?
  • 34 new leukemia samples
  • 29 are predicted with 100% accuracy; 5 are weak
    prediction cases
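
The per-gene score above could be computed along these lines. This is a minimal sketch and not the original analysis code. It assumes the denominator is the sum of the two sample standard deviations (the signal-to-noise form); the gene names, class labels, and expression values are invented for illustration.

```python
from statistics import mean, stdev

def score(x, y):
    """(mean of X - mean of Y) / (sd of X + sd of Y), using sample sds."""
    return (mean(x) - mean(y)) / (stdev(x) + stdev(y))

def top_genes(matrix, labels, k):
    """Rank genes by score (X = ALL, Y = AML) and return the k highest."""
    scored = []
    for gene, row in matrix.items():
        x = [v for v, lab in zip(row, labels) if lab == "ALL"]
        y = [v for v, lab in zip(row, labels) if lab == "AML"]
        scored.append((score(x, y), gene))
    scored.sort(reverse=True)
    return [gene for _, gene in scored[:k]]

# Hypothetical data: one class label per sample (column), one row per gene.
labels = ["ALL", "ALL", "ALL", "AML", "AML", "AML"]
matrix = {
    "geneA": [10.0, 11.0, 12.0, 1.0, 2.0, 3.0],   # higher in ALL
    "geneB": [5.0, 6.0, 7.0, 5.0, 6.0, 7.0],      # no difference
    "geneC": [1.0, 2.0, 3.0, 10.0, 11.0, 12.0],   # higher in AML
}

selected = top_genes(matrix, labels, k=1)
print(selected)
```

With X = ALL, a large positive score marks a gene that is higher in ALL. A strongly negative score marks one that is higher in AML, so ranking by the lowest scores would pick out AML markers in the same way.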