Title: RARE Germline variability in pediatric leukemia.
1. RARE Germline variability in pediatric leukemia.
- Cancer Biology Series
- January 29, 2013
- Todd Druley, MD, PhD
- Assistant Professor of Pediatrics and Genetics
2. Presenter Disclosure Information
Todd E. Druley, M.D., Ph.D.
Druley Lab / WUSM CGSSB
In compliance with ACCME policy, WU requires the following disclosures to the session audience:
Research Support/P.I.: No relevant conflicts of interest to declare
Employee: No relevant conflicts of interest to declare
Consultant: No relevant conflicts of interest to declare
Major Stockholder: No relevant conflicts of interest to declare
Speakers Bureau: No relevant conflicts of interest to declare
Scientific Advisory Board: No relevant conflicts of interest to declare
3. Why study rare variation?
- Whole genomes show 2-4 million variants PER PERSON!
- Only about 25-33% of these are common (>2% MAF).
- There are roughly 22,000 human genes
- This equals 40,000,000 nucleotides total for all of our genes.
- 1.5% of the entire genome
- If 2 individual genomes differ by
  - 2M x 0.67 = 1,340,000 nucleotides
- There are 1.8 x 10^12 possible combinations between the two genomes!!
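The arithmetic on this slide can be reproduced in a few lines. This is a minimal sketch: the 2 million figure is the lower bound quoted above, 0.67 is taken as the fraction of variants that are not common, and reading "possible combinations" as the product of the two genomes' rare-variant counts is an assumption about how the 1.8 x 10^12 figure was derived.

```python
# Assumed inputs, taken from the slide's own numbers.
variants_per_genome = 2_000_000   # lower bound of the 2-4 million range
rare_fraction = 0.67              # share of variants that are not common

rare_per_genome = round(variants_per_genome * rare_fraction)

# Assumption: "combinations" means every pairing of one rare variant
# from genome A with one rare variant from genome B.
pairwise_combinations = rare_per_genome ** 2

print(f"{rare_per_genome:,} rare variants per genome")
print(f"{pairwise_combinations:.1e} pairwise combinations")
```

Under these assumptions the product rounds to the figure quoted on the slide.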
4. Common vs. Rare Variants
- Critical differences between common and rare variant analysis include:
  - Rare variants have greater effect sizes: average OR = 3.7 (Bodmer, Nat Genet 2008)
  - Disruptive rare variants are more likely to act dominantly (Fearnhead, Cell Cycle 2005)
  - Rare variants are individually rare, but collectively common when collapsed (binned) within a genetic locus or metabolic pathway (Cohen, Science 2004; Ji, Nat Genet 2008)
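Collapsing (binning) can be illustrated with a small sketch. The variant calls, sample names, and gene names below are hypothetical, and the 2% MAF cutoff is borrowed from the earlier slide; this is one simple way to count carriers per gene, not the specific method used in the cited papers.

```python
from collections import defaultdict

# Hypothetical variant calls: (sample_id, gene, minor_allele_frequency).
calls = [
    ("case1", "GENE_A", 0.001),
    ("case2", "GENE_A", 0.004),
    ("case3", "GENE_A", 0.300),   # common, ignored by the collapse
    ("case3", "GENE_B", 0.002),
    ("case4", "GENE_A", 0.0005),
]

MAF_CUTOFF = 0.02  # "rare" here means below the 2% MAF line


def collapse_by_gene(variant_calls, maf_cutoff=MAF_CUTOFF):
    """Return {gene: set of samples carrying at least one rare variant}."""
    carriers = defaultdict(set)
    for sample, gene, maf in variant_calls:
        if maf < maf_cutoff:
            carriers[gene].add(sample)
    return dict(carriers)


carriers = collapse_by_gene(calls)
for gene, samples in sorted(carriers.items()):
    print(gene, len(samples), sorted(samples))
```

Each rare variant in GENE_A is seen in a single sample, yet three of the four samples carry one once the gene is treated as a single bin.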
5. Antonarakis SE et al. Nature Rev Genet 2009.
Private
6. Antonarakis SE et al. Nature Rev Genet 2009.
We're operating here
Private
7. Example
- Cystic Fibrosis
- Originally thought that only the ΔF508 mutation was causative for CF.
- Sequencing of the CFTR gene was initiated.
- Now over 1000 mutations in CFTR have been documented.
- Cause various severities of cystic fibrosis.
http://www.ccb.sickkids.ca/index.php/cystic-fibrosis-mutation-database.html
8. Complex diseases demonstrating increased rare variation
AJHG 80, 779-791 (2007)
- Sequenced two groups of 128 individuals each
- Psychiatric illness, cancer, autoimmune disorders, heart disease, height, extreme longevity, many others
9. What about pediatric cancer?
- Early onset cancer is defined as cancer at <50 years old
- Germline cancer-causing gene alleles (TP53, APC, BRCA1): average age of disease onset is 20s
- Cannot explain the incidence of pediatric cancer by somatic mutation.
- Epi studies have failed to explain exposures causing these cancers.
- Almost all pediatric cancer patients have a negative family history.
- So why do we see 3 children/week with a new cancer??
10. Infant acute leukemia: worst outcomes
- 50% mortality, 67% with MLL rearrangements
- MLL regulates developmental transcription (HOX genes)
- Survivors often left with developmental problems
- COG AE24: Epidemiology of Infant Leukemia
- Largest case-control study to date looking for pre/perinatal exposures associated with infant leukemia
- Topoisomerase II inhibitor exposure during pregnancy
- Only associated with AML, but didn't impact survival
- Ross JA, J Natl Cancer Inst Monogr 2008
11. Pilot exome sequencing experiment
- GERMLINE exome sequencing from 25 pairs of mothers and infants with MLL-negative acute leukemia
- Julie Ross, PhD (PI) and Amy Linabery, PhD.
- We are looking at genes with rare variants in affected infants, but also inherited from mothers
- These parents typically don't have leukemia or other cancers.
- We hypothesize a combinatorial effect from parental variants contributes to the early onset/short latency of leukemia.
12. Demographics
25 pairs of Caucasian mothers and infants: 12 ALL, 13 AML
13. Validated bioinformatics
- We analyzed exome data using a validated bioinformatic pipeline
- Align using Novoalign
- Call variants with SAMtools
- Sensitivity: 97%
- Specificity: 99.8%
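For reference, sensitivity and specificity are computed from validation counts as sketched below. The counts are hypothetical, chosen only so the results match the percentages on the slide; the actual validation counts are not given here.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of real variants that the pipeline called."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of non-variant positions correctly left uncalled."""
    return true_neg / (true_neg + false_pos)


# Hypothetical validation counts (not from the study).
tp, fn = 97, 3
tn, fp = 998, 2

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```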
14. Variant calls in COSMIC genes
- Prioritize by comparing our variant calls in genes already associated with hematologic malignancies in the COSMIC database.
- http://www.sanger.ac.uk/genetics/CGP/cosmic/
- ALL (126 ALL-associated genes)
  - Infants: 695 total variants (481 known, 214 novel)
  - Mothers: 728 total (588 known, 140 novel; 65%)
- AML (657 AML-associated genes)
  - Infants: 5517 total (3961 known, 1556 novel)
  - Mothers: 4735 total (4264 known, 471 novel; 30%)
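The counts on this slide can be checked for internal consistency. The sketch below uses only the numbers listed above. Reading the trailing 65 and 30 figures as the maternal novel count expressed as a percentage of the infant novel count is an inference from the arithmetic, not something the slide states.

```python
counts = {
    ("ALL", "infants"): {"total": 695, "known": 481, "novel": 214},
    ("ALL", "mothers"): {"total": 728, "known": 588, "novel": 140},
    ("AML", "infants"): {"total": 5517, "known": 3961, "novel": 1556},
    ("AML", "mothers"): {"total": 4735, "known": 4264, "novel": 471},
}

# Known + novel should add up to the reported total in every group.
totals_ok = all(c["known"] + c["novel"] == c["total"] for c in counts.values())
print("totals consistent:", totals_ok)

# Assumed interpretation: maternal novel count relative to infant novel count.
novel_ratio = {}
for disease in ("ALL", "AML"):
    mothers = counts[(disease, "mothers")]["novel"]
    infants = counts[(disease, "infants")]["novel"]
    novel_ratio[disease] = round(100 * mothers / infants)
    print(disease, f"{novel_ratio[disease]}%")
```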
15. Permutation testing
Average: ALL 5 variant genes/infant; AML 6 variant genes/infant
[Figures: null distributions]
Both sets of infants have a statistically significant (P < 10^-7) enrichment of novel, non-synonymous, deleterious germline variants in genes associated with hematopoietic malignancies (COSMIC).
Mark Valentine
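A permutation test of this kind can be sketched as follows. Everything in the example is an assumption for illustration: the gene universe, the disease gene set, and the variant-bearing genes are synthetic, and the null is built by drawing random gene sets of the same size as the disease set. The slides do not say how the reported P value was obtained, and resolving P < 10^-7 empirically would need far more permutations than are run here.

```python
import random


def permutation_p_value(observed, all_genes, variant_genes, set_size,
                        n_permutations=2000, seed=0):
    """Empirical P value: how often does a random gene set of the same
    size contain at least `observed` variant-bearing genes?"""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_permutations):
        random_set = rng.sample(all_genes, set_size)
        count = sum(1 for gene in random_set if gene in variant_genes)
        if count >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Synthetic data (hypothetical gene names).
all_genes = [f"G{i}" for i in range(2000)]
disease_set = all_genes[:100]                      # stand-in for a COSMIC list
variant_genes = set(all_genes[:30]) | set(all_genes[1000:1020])

observed = sum(1 for gene in disease_set if gene in variant_genes)
p_value = permutation_p_value(observed, all_genes, variant_genes,
                              set_size=len(disease_set))
print(f"observed = {observed}, empirical P = {p_value:.4f}")
```

With only 2000 permutations the smallest reportable P value is 1/2001, which is why the permutation count bounds the resolution of the test.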
16. Validation
- No significant enrichment in randomly chosen gene sets in infants
- No significant enrichment in random or leukemia gene sets in Caucasian unaffected exomes
- Unlikely to see the same novel variant only in related mother-infant pairs by chance.
  - 45% in ALL; 23% in AML
  - Consistent with maternal totals of 65% and 30%, respectively
- Sanger validation of other variants is ongoing
17. micro-RNA regulation?
- Many variant candidate genes are regulated by
MIRs independently associated with leukemia and
cell cycle regulation
Nick Sanchez
18. Pathway Analysis
- ABC transporters
- Developmental defects
- Chloride channel regulator activity
- Transcription factor dysregulation
  - YY1, Cdx, HNF1, MAF, EA2
- TDG glycosylase mediated binding and cleavage of a thymine, uracil or ethenocytosine opposite a guanine
19. Implications / Conclusions
- Supports the hypothesis that infants with leukemia are born with a putatively functional enrichment of variation in genes associated with leukemogenesis.
- Infants with AML have an excess of novel, nonsynonymous, deleterious variation not from mother.
  - Paternal age: de novo mutation during spermatogenesis?
  - De novo mutation during embryogenesis?
- Can we identify discrete biological/developmental and regulatory mechanisms leading to early onset leukemia?
  - MIRs
  - ABC transporters
  - Specific transcription factors
20. Future work
- SHORT TERM
  - Complete the bioinformatic analysis
  - Compare to existing data (TARGET and PCGP)
  - Exome sequencing of 25 MLL-positive pairs
- LONG TERM
  - Validate results in a second cohort of triads
  - Establish model systems to study complex genetic interactions
  - Integrate information into clinical trials?
21. High-risk pediatric ALL: Pooled sequencing
- Patient germline (N = 96)
- Patient leukemia (N = 96)
- Unaffected controls (N = 93)
55 genes per pool
22. Candidate genes for pooled sequencing
- 55 genes selected for pooled sequencing
- All genes have been published in relation to pediatric ALL
- 43 were identified near significant tagged-SNPs on the prior array (asterisks)
- Various cellular functions
23. Pooled sequencing pilot project
- Sequenced 94.5% of coding regions from all three pools.
- 420 kb per person = 1.2 x 10^8 total bases covered
Pool | Total Variants | Coverage/Allele
Unaffected | 4209 | 80-fold
Germline | 3929 | 86-fold
Leukemia | 3822 | 101-fold
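The total-bases figure follows from the pool sizes given on the earlier pooled-sequencing slide. A minimal sketch, assuming 1 kb = 1,000 bases and that every individual in all three pools was covered over the same 420 kb:

```python
# Pool sizes as listed for the pooled sequencing design.
pool_sizes = {"germline": 96, "leukemia": 96, "unaffected": 93}
kb_per_person = 420

total_people = sum(pool_sizes.values())
total_bases = kb_per_person * 1_000 * total_people

print(f"{total_people} individuals, {total_bases:.1e} bases covered")
```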
24.
- Validation at 384 base positions by custom Illumina GoldenGate array
25. Overlap
- 49% of called variants are unique to the ALL Germline pool
- Only 2.5% of Leukemia variants were NOT seen in the Germline pool (97.5% overlap)
  - Somatic mutations
 | Germline pool NOT in Unaffected | Leukemia pool NOT in Germline
Total variants | 1915 (49%) | 96 (2.5%)
Coding substitutions | 233 (12%) | (20%); 22 novel mutations in UTRs, 5 within putative splice site
Novel | 175 (75%) | 15 (79%)
Non-synonymous / Synonymous | 162 (70%) / 71 | 14 (74%) / 5
Damaging (per SIFT) | 89 (38%): 84 missense, 5 nonsense | 9 (47%): all missense
Coding Insertions/Deletions | 9 | 7; 11 in UTR or splice site
Causes protein dysfunction (per SIFT)? | 6: 3 MLL, 1 ATM, 1 PAX5, 1 LEF1 | 7: 6 MLL, 1 TCF3
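The percentages in the Germline column can be re-derived from the counts. The sketch below assumes the denominators are the pool totals from the pooled sequencing pilot slide (3929 Germline and 3822 Leukemia variants) for the top row, the 1915 unique variants for coding substitutions, and the 233 coding substitutions for the rows beneath. Those denominator choices are inferred from the arithmetic rather than stated. The Leukemia coding-substitution count is missing from this transcript, so that column is not recomputed.

```python
def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


germline_pool_total = 3929
leukemia_pool_total = 3822

germline_only = 1915      # Germline pool NOT in Unaffected
leukemia_only = 96        # Leukemia pool NOT in Germline
coding = 233              # coding substitutions among germline_only

summary = {
    "germline_unique": pct(germline_only, germline_pool_total),
    "leukemia_unique": pct(leukemia_only, leukemia_pool_total),
    "coding": pct(coding, germline_only),
    "novel": pct(175, coding),
    "non_synonymous": pct(162, coding),
    "damaging": pct(89, coding),
}

for name, value in summary.items():
    print(f"{name}: {value}%")
```

The non-synonymous and synonymous counts (162 and 71) also sum to the 233 coding substitutions, which supports reading the rows below "Coding substitutions" as subsets of it.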
26. Visualizing the dataset
[Figure legend: Leukemia SNPs (x), Germline SNPs, Amplicons, Control SNPs. Scale: Conservation Across Species, High to Low.]
Joe Giacalone, Mark Valentine
27. Visualizing the dataset
[Figure legend: Leukemia SNPs (x), Germline SNPs, Amplicons, Control SNPs. Scale: Conservation Across Species, High to Low.]
- No variants in control group
- Multiple variants in affected germline
- Overlap with highly conserved region
Joe Giacalone, Mark Valentine
28. Mark Valentine
29. Exome variant server overlay
Drew Hughes
30.
- All looking at known ancestral polymorphisms and the incidence of acute leukemia.
- None involve sequencing to demonstrate novel/rare variants in the same genes.
31. Overexpressed genes
- ATM
- CDKN1A
- CYP1A1
- CYP3A5
- IKZF1
- MDM2
- MLL
- MTHFR
- NAT2
- NQO1
- PAX5
- PTPN11
- TCF3
- TPMT
32. Overexpressed genes
6 of 14 overexpressed genes (43%) are involved in drug metabolism.
33. Additional gene expression profiles
- Similar expression differences in 18 additional genes (5 overexpressed CYPs).
- All genes possess 1 novel coding variant in P9906 patients.
- No clear connection between genetic variation and gene expression.
Drew Hughes
34. Implications / Conclusions
- Overexpression of specific genes involved in metabolism of anti-leukemia agents identifies a subgroup of children with inferior EFS.
- Private sequence variation in drug/energy metabolism genes is not coupled to expression profiles, but may predispose to leukemia or modulate therapeutic response through defective metabolism.
  - Pathogenesis vs. pharmacogenomics?
- Therapeutic implications
  - Can look for these genomic signatures at diagnosis (existing precedent)
  - Dose modification or direct to bone marrow transplant
35. Future work
- Validation and identification of individual profiles.
- Delve more into the underexpressed genes as well.
- Analyze sequencing results of 700 additional drug/energy metabolism genes.
- Functional iPSC-based assays from patient fibroblasts.
- Introduction into immune-deficient mice for functional study.
36. Acknowledgements and Funding
- Wash U
- Bob Hayashi
- Alan Schwartz
- Rob Mitra
- F. Sessions Cole
- COG
- Julie Ross
- Logan Spector
- Mignon Loh
- Rick Harvey
- Druley Lab
- Nick Sanchez
- Mark Valentine
- Joe Giacalone
- Drew Hughes
- Andrew Young
1K08CA140720-01A1
Eli Seth Matthews Leukemia Foundation