Title: Cancer Genomics


1
Cancer Genomics
Richard K. Wilson, Ph.D.
Washington University School of Medicine
rwilson_at_watson.wustl.edu
2
Human Genome v1.0
Cancer Genomics
Ancillary genomes: mouse, chimp, etc.
Discovery
Technology, software tools, infrastructure
Cancer, other diseases
3
PCR-based re-sequencing
list of candidate genes
large collection of patient samples
4
EGFR mutations in NSCLC
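[Figure: EGFR domain diagram with the EGF ligand-binding, transmembrane (TM), tyrosine kinase, and autophosphorylation (autophos) regions; labeled motifs GXGXXG, K, DFG, and LREA; labeled positions 718, 745, 776, 835, 858, Y869, 947, 964]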
Most TKI responders have EGFR mutations
  • Study 1: 8/9 (89%) vs. 0/7 controls
  • Study 2: 5/5 (100%) vs. 0/4 controls
  • Study 3: 19/24 (79%) vs. 0/20 controls
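
The percentages on this slide follow directly from the counts. A minimal sketch that recomputes them is shown here; the pooled figure at the end is only an illustrative sum across the three studies and is not a number from the slides.

```python
# Counts transcribed from the slide: (patients with an EGFR mutation, patients tested).
studies = {
    "Study 1": {"responders": (8, 9), "controls": (0, 7)},
    "Study 2": {"responders": (5, 5), "controls": (0, 4)},
    "Study 3": {"responders": (19, 24), "controls": (0, 20)},
}


def percent(mutated, total):
    """Return the share of mutated cases as a percentage."""
    return 100.0 * mutated / total


for name, groups in studies.items():
    r_mut, r_n = groups["responders"]
    c_mut, c_n = groups["controls"]
    print(f"{name}: responders {r_mut}/{r_n} ({percent(r_mut, r_n):.0f}%), "
          f"controls {c_mut}/{c_n} ({percent(c_mut, c_n):.0f}%)")

# Illustrative pooling only (an assumption of this sketch, not a slide result).
pooled_mut = sum(g["responders"][0] for g in studies.values())
pooled_n = sum(g["responders"][1] for g in studies.values())
print(f"Pooled responders: {pooled_mut}/{pooled_n} ({percent(pooled_mut, pooled_n):.0f}%)")
```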
5
Tumor Sequencing Project
600 genes of interest
200 lung adenocarcinoma samples
  • Sequencing Centers: BCM-HGSC, BI, WUGSC
  • Cancer Centers: MSKCC, DFCI, SCC, MDA

6
TSP Target List
  • Too expensive to sequence the whole genome;
    therefore, focus on druggable targets.
  • For lung adenocarcinoma TSP: 600 genes (exons
    only)
  • Receptor tyrosine kinases (e.g. EGFR)
  • Selected serine-threonine kinases
  • Known oncogenes
  • Known tumor suppressor genes
  • EGFR pathway genes
  • DNA repair genes
  • Etc.

7
SNP Arrays
8
SNP Arrays
9
DNA Chips/SNP Arrays
10
Lung Adeno Genomic Events: SNP Array Analysis
Weir et al. Nature (2007)
11
Lung Adeno Genomic Events
Weir et al. Nature (2007)
12
Lung Adeno Genomic Events
Weir et al. Nature (2007)
13
Lung Adenocarcinoma Amplifications
Weir et al. Nature (2007)
14
Mutations in lung adenocarcinoma
  • KRAS and TP53 Are Mutated in About 1/3 of Tumor
    Samples
  • Indels have not been included in the analysis

15
Mutations in TP53, ERBB3, and AKT3 appear to
correlate with tumor grade
[Figure: mutations by tumor grade; group sizes N=24, N=85, N=71; axis label: Mutation]
16
Correlations between mutations and clinical
features
  • Mutations in PDGFRA, PTEN, NTRK1 and PRKDC show
    positive correlation with tumor stage.
  • Mutations in LRP1B, PRKDC, TP53, and APC
    correlate with the solid tumor histological
    subtype of lung adenocarcinoma.
  • High correlation of mutations in EGFR and MYO3B
    with never-smokers, and of mutations in KRAS and
    LRP1B with smokers.

17
EGFR mutations in glioblastoma
  • Screen of kinase domains in glioblastoma: no
    recurrent mutations
  • But...

119 lung tumors: no EC mutations
270 HapMap normals: no EC mutations
18
Genomic Studies of Cancer
  • Hypothesis-driven (biased)
      • Gene sets with related functions: kinome,
        phosphatome
      • Genes mutated in other cancers
      • Closely related genes
      • Investigator-driven ideas
  • Data-driven (unbiased)
      • Use genomic platforms to identify loci with
        recurrent somatic alterations
          • Array-based RNA profiling
          • Array CGH
          • Array-based SNP genotyping

© R.K. Wilson 2007
19
Acute myelogenous leukemia
  • Project initiated in 2002.
  • Primary tumors, matched normal tissue (i.e.,
    germline variants vs. somatic mutations)
  • Discovery set (46 tumors); validation set
    (94 tumors)
  • Initial target list: 450 genes
  • Orthogonal technologies (CGH arrays, expression
    profiling, etc.) for genome characterization and
    to detect additional sequencing targets.

20
Acute myelogenous leukemia
  • FLT3: 29%
  • NPM1: 25%
  • NRAS: 9.6%
  • PTPN11: 4%
  • RUNX1: 4%
  • GCSFR: 4%
  • Others: 2-3%

21
Is there a better approach?
  • What are we missing outside of the exons?
  • PCR-based re-sequencing
  • Relatively expensive
  • Diploid (at best) low coverage

22
Solexa/Illumina 1G Analyzer
23
Solexa/Illumina 1G Analyzer
Illumina flow cell
  • Acts as the microfluidic conduit for cluster
    generation and sequencing reagents.
  • 8-lane flow cell configuration.
  • Separate libraries can be sequenced in each lane,
    or the same library in all.
  • 60M clusters are sequenced per flow cell.

24
Next Generation Sequencing Technologies
25
AML Whole Genome Sequencing
Data types
  • Whole genome sequence (tumor genome): Solexa
  • FL cDNA normalized library: Solexa, 454
  • Whole genome sequence (epidermal genome): Solexa

Analysis plans
  • Compare sequence to previously identified
    mutations.
  • Compare increasing coverage levels to
    heterozygous SNPs from Affy/Illumina arrays for
    coverage evaluation.
  • Devise strategic approaches to find novel
    variants; validate and characterize.
26
933124
  • 57 y/o Caucasian female
  • De novo M1 AML
  • 100% blasts in initial BM sample
  • Relapsed and died at 11 months
  • Normal cytogenetics
  • No LOH on Affy 500K SNP array
  • Informed consent for whole genome sequencing

27
28
(No Transcript)
29
AML Whole Genome Sequencing
  • As of 1/28/08
  • 75 Solexa runs completed (32 bp reads)
  • 62 billion bp (22X haploid coverage)
  • 2,123,143 sequence variants detected (Q30)
  • 492,569 (23.2%) are previously undiscovered SNPs
  • 46,320 heterozygous (informative) SNPs from Affy
    and Illumina SNP arrays.
  • For 77% of informative SNPs, both WT and variant
    alleles were detected in the genome sequence.
  • For 97.4% of informative SNPs, at least one allele
    was detected in the genome sequence.
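
The coverage and novel-variant figures on this slide can be checked with simple arithmetic. The sketch below uses the slide's own numbers; the genome length `GENOME_SIZE_BP` is an assumption of this example (an approximate haploid human genome length), not a value given in the talk.

```python
# Values transcribed from the slide.
TOTAL_BASES = 62e9          # 62 billion bp sequenced
READ_LENGTH_BP = 32         # 32 bp reads
RUNS = 75                   # Solexa runs completed
TOTAL_VARIANTS = 2_123_143  # Q30 sequence variants
NOVEL_VARIANTS = 492_569    # previously undiscovered SNPs

# ASSUMPTION (not from the slides): approximate haploid genome length in bp.
GENOME_SIZE_BP = 2.85e9

# Average haploid coverage = total sequenced bases / genome length.
haploid_coverage = TOTAL_BASES / GENOME_SIZE_BP

# Approximate read count and average yield per run implied by the totals.
approx_reads = TOTAL_BASES / READ_LENGTH_BP
bases_per_run = TOTAL_BASES / RUNS

# Share of detected variants that are novel.
novel_pct = 100.0 * NOVEL_VARIANTS / TOTAL_VARIANTS

print(f"Haploid coverage: {haploid_coverage:.1f}x")
print(f"Approximate reads: {approx_reads:.3g}")
print(f"Average yield per run: {bases_per_run:.3g} bp")
print(f"Novel variants: {novel_pct:.1f}%")
```

With this assumed genome length the coverage comes out close to the 22X quoted on the slide; a different assumed length shifts the result proportionally.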

30
AML Whole Genome Sequencing
933124 genome sequence
Total: 2,123,143 variants
Intergenic: 145,092
Genic: 334,477
dbSNP: 1,630,574
Splice_site: 99
Other: 329,322
Coding: 5,056
Synonymous: 1,222
Missense: 3,402
Nonsense: 320
Nonstop: 9
Only reporting Q30 variants. Genic region: gene boundary +/- 50 kb.
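
The original slide presumably drew these categories as a tree, and that layout is lost in the transcript. The sketch below tests one assumed nesting (total = dbSNP + novel; genic = splice site + other + coding; coding = the four coding classes) against the transcribed counts. The novel count of 492,569 is taken from the preceding slide. The nesting itself is an assumption of this example, so the code reports where the sums agree and where they do not.

```python
# Counts transcribed from the slide.
counts = {
    "total": 2_123_143,
    "dbSNP": 1_630_574,
    "intergenic": 145_092,
    "genic": 334_477,
    "splice_site": 99,
    "other": 329_322,
    "coding": 5_056,
    "synonymous": 1_222,
    "missense": 3_402,
    "nonsense": 320,
    "nonstop": 9,
}
NOVEL = 492_569  # previously undiscovered SNPs, from the preceding slide

# Each check: (label, sum of assumed children, reported parent value).
checks = [
    ("dbSNP + novel vs. total", counts["dbSNP"] + NOVEL, counts["total"]),
    ("splice_site + other + coding vs. genic",
     counts["splice_site"] + counts["other"] + counts["coding"], counts["genic"]),
    ("intergenic + genic vs. novel",
     counts["intergenic"] + counts["genic"], NOVEL),
    ("coding classes vs. coding",
     counts["synonymous"] + counts["missense"] + counts["nonsense"] + counts["nonstop"],
     counts["coding"]),
]

results = {}
for label, child_sum, parent in checks:
    results[label] = parent - child_sum  # 0 means the assumed nesting adds up
    print(f"{label}: sum={child_sum:,} reported={parent:,} difference={parent - child_sum:,}")
```

Under this assumed nesting the first two sums match exactly, while the last two leave a remainder; the transcript does not say which category accounts for it.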
31
AML Transcriptome Sequencing
Various cDNA library construction procedures and
normalization schemes
  • 454 cDNA sequencing: 306,267 mapped cDNA reads
  • Solexa cDNA sequencing: 47,153,784 mapped reads
32
Expressed genes: variant/germline frequencies
AML Transcriptome Sequencing
  • MYCBP2 1188345
  • HSP90B1 6941347
  • BCCIP 391394
  • NCOR1 256268
  • CHFR 23052
  • DNAJ 2180
  • PTPN11 1981
  • NUMA1 1572
  • CASPASE 7 145147
  • HOX C6 1182
  • PLEKHC1 11214
  • NTRK3 11210
  • CDC2 9682

33
V194M (C to T) in FLT3
[Figure: two sequence views, one from the cDNA sequence and one from the tumor genome sequence, each labeled CT]
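
In the standard genetic code, valine is encoded by GTT, GTC, GTA and GTG, and methionine only by ATG. The sketch below enumerates which single-base substitutions turn a valine codon into ATG, then reports the same change as seen on the complementary strand. Reading the slide's "C to T" as the complementary-strand view of a coding-strand change is an assumption of this example, not something the transcript states.

```python
VAL_CODONS = ["GTT", "GTC", "GTA", "GTG"]  # valine codons, standard genetic code
MET_CODON = "ATG"                          # the only methionine codon
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def single_base_paths(source_codons, target_codon):
    """List (source, target, position, ref_base, alt_base) for codons that
    differ from the target at exactly one position."""
    paths = []
    for codon in source_codons:
        diffs = [(i, a, b) for i, (a, b) in enumerate(zip(codon, target_codon)) if a != b]
        if len(diffs) == 1:
            position, ref_base, alt_base = diffs[0]
            paths.append((codon, target_codon, position, ref_base, alt_base))
    return paths


paths = single_base_paths(VAL_CODONS, MET_CODON)
for source, target, position, ref_base, alt_base in paths:
    opposite = (COMPLEMENT[ref_base], COMPLEMENT[alt_base])
    print(f"{source} -> {target}: coding strand {ref_base}>{alt_base} at codon "
          f"position {position + 1}; complementary strand {opposite[0]}>{opposite[1]}")
```

Only one single-base route from valine to methionine exists, which is why a Val-to-Met change can be described as C to T when viewed from the complementary strand.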
34
AML Whole Genome Sequencing
  • Currently using SXOligoSearchG (Synamatix) to
    detect small (1-2 bp) indels.
  • Evaluating software tools for detection of larger
    indels.

35
AML: Current status
thirsty for knowledge?
36
AML: Current status
  • Diploid coverage was obtained for 77% of an AML
    M1 tumor genome with 22x haploid coverage.
  • 2.1M sequence variants found (similar to other
    whole genomes already finished).
  • 495,000 novel variants: SNPs vs. somatic
    mutations?
  • 10x coverage of epidermis (normal) genome just
    completed; may identify >90% of variants as rare
    SNPs.
  • Remaining 50,000 variants are being prioritized
    by detection in cDNA; should be <1,000.
  • Very rare somatic mutations in cDNA thus far (only
    2 validated).
  • No mutator (driver) phenotype is readily
    apparent for this AML case; passenger mutations
    appear to be rare.
  • We continue to sift through the data.
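
To see why haploid coverage and diploid coverage differ, consider an idealized model in which reads land uniformly at random. This is a simplifying assumption of the sketch and not the analysis used in the talk. At a heterozygous site with total average depth c, the reads from each allele are then roughly Poisson-distributed with mean c/2, and both alleles are seen only if each one is sampled at least `min_reads` times. The `min_reads` threshold is a hypothetical parameter introduced for illustration.

```python
import math


def poisson_cdf(k, mean):
    """P(X <= k) for X ~ Poisson(mean); returns 0.0 when k < 0."""
    return sum(math.exp(-mean) * mean ** i / math.factorial(i) for i in range(k + 1))


def p_both_alleles(total_depth, min_reads=1):
    """Probability that both alleles of a heterozygous site are each covered
    by at least `min_reads` reads, under uniform random sampling."""
    per_allele_mean = total_depth / 2.0
    p_one_allele = 1.0 - poisson_cdf(min_reads - 1, per_allele_mean)
    return p_one_allele ** 2  # the two alleles are sampled independently


for depth in (2, 5, 10, 22):
    print(f"depth {depth:>2}x: P(both alleles seen) = {p_both_alleles(depth):.4f}, "
          f"with at least 3 reads each = {p_both_alleles(depth, min_reads=3):.4f}")
```

The model ignores uneven coverage, mapping difficulty in repeats, and quality filtering, so it is best read as an optimistic upper bound. The 77% reported on the slide is the observed figure.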

37
Cancer Genomics
  • Exon-targeted sequencing (TSP, glioblastoma) is
    revealing useful and interesting findings, but is
    expensive and slow!
  • Next Gen sequencing is here and will have a
    substantial near-term impact on the study of
    cancer genomes!
  • Ancillary genome-based technologies (expression
    profiling, SNP arrays, cDNA sequencing) are
    crucial for understanding the target genome
    before considering WGS.
  • The dream is not hype: a comprehensive
    understanding of the cancer genome is probable,
    and will change the way that you diagnose and
    treat your patients.

38
Acknowledgments
  • WU Genome Sequencing Center
      • Elaine Mardis, Li Ding, Dave Dooling, Tracy
        Miner, Mike McLellan, Ginger Fewell, Jim Eldred,
        Asif Chinwalla, Yumi Kasai, Lucinda Fulton, Vince
        Magrini, Matt Hickenbotham, Lisa Cook, Michael
        Wendl, Michael Province
  • WU Siteman Cancer Center
      • Tim Ley, Mark Watson, Matt Walter, Rhonda Ries,
        Jackie Payton, John DiPersio, Dan Link, Michael
        Tomasson, Tim Graubert, Sharon Heath
  • TSP/TCGA Colleagues
      • Baylor HGSC, Broad Institute, many others
  • Funding sources
      • NHGRI (Wilson), NCI (Ley), Alvin J. Siteman (AML
        WGS)

genome.wustl.edu