Designing a high quality metabolomics experiment

Author: GPage. Last modified by: Page, Grier. Created: 7/6/2006 3:30:06 PM. Document presentation format: On-screen Show (4:3).

Learn more at: https://www.uab.edu

Transcript and Presenter's Notes


1
Designing a high quality metabolomics experiment
  • Grier P. Page, Ph.D., Senior Statistical Geneticist
  • RTI International
  • Atlanta Office
  • gpage_at_rti.org
  • 770-407-4907

2
Metabolomics is Powerful and Central
3
  • Designing a good study

4
Errors Errors Everywhere
5
(No Transcript)
6
UMSA Analysis
[Figure labels: Day 1, Day 2; Insulin Resistant, Insulin Sensitive]
7
Primary consideration of good experimental
design
  • Understand the strengths and weaknesses of each
    step of the experiments.
  • Take these strengths and weaknesses into account
    in your design.

8
(No Transcript)
9
From Drug Discov Today. 2005 Sep 1;10(17):1175-82.
10
  • State the Question and Articulate the Goals

11
The Myth That Metabolomics does not need a
Hypothesis
  • There always needs to be a biological question in
    the experiment. If there is not even a question,
    don't bother.
  • The question could be nebulous: "What happens to
    the metabolome of this tissue when I apply Drug
    A?"
  • The purpose of the question is to drive the
    experimental design.
  • Make sure the samples answer the question: cause
    vs. effect.

12
(No Transcript)
13
Design Issues
  • Known sources of non-biological error (not
    exhaustive) that must be addressed
  • Technician / post-doc
  • Reagent lot
  • Temperature
  • Protocol
  • Date
  • Location
  • Cage / field positions

14
  • Experimental Design

15
Biological replication is essential.
  • Two types of replication:
  • Biological replication: samples from different
    individuals are analyzed.
  • Technical replication: the same sample is measured
    repeatedly.
  • Technical replicates allow only the effects of
    measurement variability to be estimated and
    reduced, whereas biological replicates allow this
    to be done for both measurement variability and
    biological differences between cases. Almost all
    experiments that use statistical inference
    require biological replication.
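
A minimal sketch of why technical replicates cannot replace biological ones, assuming a simple nested model in which each measurement is the group mean plus an individual effect plus measurement error. The function name and the variance values are hypothetical and chosen only for illustration.

```python
def variance_of_group_mean(var_bio, var_tech, n_bio, n_tech):
    """Variance of a group mean with n_bio individuals, each measured n_tech times.

    Assumes a nested model: measurement = group mean
    + individual effect (variance var_bio) + measurement error (variance var_tech).
    """
    if n_bio < 1 or n_tech < 1:
        raise ValueError("need at least one biological and one technical replicate")
    return var_bio / n_bio + var_tech / (n_bio * n_tech)


# Hypothetical variance components, for illustration only.
VAR_BIO, VAR_TECH = 1.0, 0.25

# 40 measurements spent on 4 individuals vs. 10 measurements on 10 individuals.
more_technical = variance_of_group_mean(VAR_BIO, VAR_TECH, n_bio=4, n_tech=10)
more_biological = variance_of_group_mean(VAR_BIO, VAR_TECH, n_bio=10, n_tech=1)

# With 4 individuals the variance can never drop below this value,
# no matter how many technical replicates are run.
floor_with_4_individuals = VAR_BIO / 4
```

Under these assumed values, ten individuals measured once give a more precise group mean than four individuals measured ten times each, because only the measurement term shrinks with technical replication.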

16
How many replicates?
  • Controlled experiments (cell lines, mice, rats):
    8-12 per group.
  • Human studies (discovery): 20 per group.
  • For predictive models: 100 per group; need
    model-building and validation sets.
  • The more the better, always.

17
  • Experimental Conduct
  • All experiments are subject to non-biological
    variability that can confound any study

18
Control Everything!
  • Know what you are doing
  • Practice!
  • Practice!

19
  • What if you can't control everything or make all
    things uniform?
  • Randomize
  • Orthogonalize

20
What are Orthogonalization and Randomization?
  • Orthogonalization: spreading the biological
    sources of error evenly across the non-biological
    sources of error.
  • Maximally powerful for known sources of error.
  • Randomization: spreading the biological sources of
    error at random across the non-biological sources
    of error.
  • Useful for controlling for unknown sources of
    error.

21
Examples of Orthogonalization and Randomization

Randomize
Order  Sample
1      7
2      6
3      4
4      1
5      2
6      8
7      5
8      3

The experiment
Sample  Treatment  Variety
1       1          1
2       1          2
3       1          1
4       1          2
5       2          1
6       2          2
7       2          1
8       2          2

Orthogonalize
Order  Sample
1      1
2      2
3      5
4      6
5      8
6      7
7      4
8      3
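
A minimal sketch of both ideas, using the eight samples and their treatment and variety codes from this slide. The function names, the seed, and the choice of "run day" as the known nuisance factor are assumptions made for illustration; this is not the procedure that produced the orders on the slide.

```python
import random
from collections import defaultdict

# The eight samples from the slide: sample -> (treatment, variety).
SAMPLES = {1: (1, 1), 2: (1, 2), 3: (1, 1), 4: (1, 2),
           5: (2, 1), 6: (2, 2), 7: (2, 1), 8: (2, 2)}


def randomized_order(samples, seed):
    """Return all sample IDs in a random run order."""
    rng = random.Random(seed)
    order = list(samples)
    rng.shuffle(order)
    return order


def orthogonal_blocks(samples, n_blocks, seed):
    """Spread every treatment-by-variety combination evenly over blocks.

    A block stands for a known nuisance factor such as run day (an
    assumption of this sketch). Run order inside each block is randomized.
    """
    rng = random.Random(seed)
    by_combo = defaultdict(list)
    for sample_id, combo in samples.items():
        by_combo[combo].append(sample_id)

    blocks = [[] for _ in range(n_blocks)]
    for combo in sorted(by_combo):
        members = by_combo[combo]
        if len(members) % n_blocks != 0:
            raise ValueError(f"combination {combo} cannot be split evenly")
        rng.shuffle(members)
        for i, sample_id in enumerate(members):
            blocks[i % n_blocks].append(sample_id)
    for block in blocks:
        rng.shuffle(block)
    return blocks


run_order = randomized_order(SAMPLES, seed=2024)
day_blocks = orthogonal_blocks(SAMPLES, n_blocks=2, seed=2024)
```

Each entry of `day_blocks` holds exactly one sample from every treatment-by-variety combination, so a day effect cannot be mistaken for a treatment or variety effect. Fixing the seed keeps the design reproducible and easy to record.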
22
Statistical analyses have assumptions too
23
Statistical analyses
  • Supervised analyses: linear models, etc.
  • Assume IID (independent and identically
    distributed) observations
  • Normality
  • Sometimes can rely on the central limit theorem
  • Weird variances
  • Using fold change alone as a statistic is not
    valid.
  • Shrinkage and/or use of Bayesian methods can be a
    good thing.
  • False-discovery rate is a good alternative to
    conventional multiple-testing approaches.
  • Pathway testing is desirable.
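
A minimal sketch of one common false-discovery-rate procedure, the Benjamini-Hochberg step-up adjustment. The slide does not say which FDR method is meant, so this choice is an assumption, and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for five metabolites, for illustration only.
p_raw = [0.001, 0.008, 0.039, 0.041, 0.60]
p_adj = benjamini_hochberg(p_raw)
significant = [p <= 0.05 for p in p_adj]
```

In this example four raw p-values are below 0.05, but only the first two remain below 0.05 after adjustment.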

24
Classification
  • Supervised classification
  • Supervised-classification procedures require
    independent cross-validation.
  • See MAQC-II recommendations: Nat Biotechnol. 2010
    Aug;28(8):827-838. doi:10.1038/nbt.1665.
  • Wholly separate model building and validation
    stages. Can be three-stage, with multiple models
    tested.
  • Unsupervised classification
  • Unsupervised classification should be validated
    using resampling-based procedures.
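
A minimal sketch of keeping model building and validation wholly separate: a stratified hold-out split made once, before any analysis. The function name, the 30% validation fraction, and the labels are assumptions for illustration, not a recommendation from MAQC-II.

```python
import random
from collections import defaultdict


def stratified_split(labels, validation_fraction, seed):
    """Split sample indices into model-building and validation sets.

    Each class contributes about the same fraction to the validation set.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    build, validate = [], []
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        n_val = max(1, round(len(members) * validation_fraction))
        validate.extend(members[:n_val])
        build.extend(members[n_val:])
    return sorted(build), sorted(validate)


# Hypothetical labels: 10 insulin-resistant and 10 insulin-sensitive subjects.
labels = ["resistant"] * 10 + ["sensitive"] * 10
build_idx, validation_idx = stratified_split(labels, validation_fraction=0.3, seed=7)
```

Every step that looks at the outcome, including feature selection and parameter tuning, should use only `build_idx`. The samples in `validation_idx` are touched once, to score the final model.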

25
Unsupervised classification - continued
  • Unsupervised analysis methods
  • Cluster analysis
  • Principal components
  • Separability analysis
  • All have assumptions and input parameters, and
    changing them can result in very different answers

26
(No Transcript)
27
(No Transcript)
28
  • Sample size estimation for metabolomics studies

29
There is strength in numbers: power and sample
size
  • Unsupervised analyses
  • Principal components, clustering, heat maps and
    variants
  • These are actually data transformations or data
    displays rather than hypothesis tests, so it is
    unclear whether sample size estimation is
    appropriate or even possible.
  • Stability of clustering may be appropriate to
    think about. Garge et al. (2005) suggested 50
    samples for any stability.

30
Sample size in supervised experiments
  • Supervised analyses
  • Linear models and variants
  • Methods are still evolving, but we suggest the
    approach we developed for microarrays may be
    appropriate for metabolomics (being evaluated)
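
For orientation only, a minimal sketch of a textbook sample-size approximation for comparing one metabolite between two groups. This is not the microarray-based approach referred to above. It assumes a two-sided test, equal group sizes, a common standard deviation, and the normal approximation; the function name and inputs are hypothetical.

```python
import math
from statistics import NormalDist


def samples_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison of means.

    effect_size is the standardized difference (difference in means divided
    by the common standard deviation). The normal approximation slightly
    underestimates the n a t-test needs when n is small.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


n_single_test = samples_per_group(effect_size=1.0)
# A stricter per-test alpha is a crude stand-in for testing many metabolites.
n_many_tests = samples_per_group(effect_size=1.0, alpha=0.001)
```

With a standardized effect of 1.0 this gives 16 per group at alpha 0.05 and 35 per group at alpha 0.001. The required sample size grows quickly once many metabolites are tested at a stricter threshold.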

31
(No Transcript)
32
(No Transcript)
33
Metabolomics does not reveal everything and
different technologies show different things
34
  • Technology and detection evolve over time.

35
Technologies are not in perfect agreement
36
The human urine metabolome
37
  • Sample, Image and Data Quality Checking

38
(No Transcript)
39
(No Transcript)
40
(No Transcript)
41
(No Transcript)
42
(No Transcript)
43
Metabolite quality
  • Still evolving field
  • RTI is one of the Metabolomics Reference
    Standards Synthesis Centers

44
  • Know your data: what should it look like?

45
These are OK
46
These are not OK
47
  • One bad sample can contaminate an experiment

48
Histogram of p-values
49
Potentially Bad Data
50
Histogram of p-values with bad data removed
51
  • Quality of Database, Bioinformatics and
    Interpretative tools

52
Understand what databases include, what they don't
include, and their assumptions
  • Just because a database says something does not
    mean it is right. Read the evidence.
  • Databases are biased.
  • Databases are incomplete.
  • Databases have lots of data.
  • Understand data before you use it.
  • Databases are useful!

53
Issues in the annotation of genes, proteins, and
metabolites
54
Annotation is inconsistent across sources
55
Issues with pathway data
56
(No Transcript)
57
TCA cycle from Ingenuity
58
TCA cycle from GenMAPP
59
TCA cycle from Ingenuity
60
Share Your Data
  • Use shared data!

61
Metabolomics WorkBench
  • http://www.metabolomicsworkbench.org/

62
MetaboLights
63
Overshare your data and show work
  • Practice compendium research to allow others to
    replicate your work
  • Many high profile omic studies are not even
    technically reproducible

64
Use metabolomics databases
  • Limited in the literature so far. Some work on
    tissue and species metabolomes.

65
Summary
  • Design your experiment well
  • Conduct your experiment well
  • Control for non-biological sources of error
  • Know what is good and bad quality data at each
    stage including metabolite, image, data, and
    annotation
  • If you are aware of these issues and control for
    them, highly powerful and reproducible metabolite
    experimentation is possible.
  • Otherwise you get garbage.
  • Share your data and use shared data

66
References
  • The MicroArray Quality Control (MAQC)-II study of
    common practices for the development and
    validation of microarray-based predictive models.
    Nat Biotechnol. 2010 Aug;28(8):827-838.
  • Microarray data analysis: from disarray to
    consolidation and consensus. Nat Rev Genet. 2006
    Jan;7(1):55-65.
  • Baggerly K. "Disclose all data in publications."
    Nature. 2010 Sep 23;467(7314):401. PMID: 20864982.
  • Repeatability of published microarray gene
    expression analyses. Nat Genet. 2009
    Feb;41(2):149-55.
  • A design and statistical perspective on
    microarray gene expression studies in nutrition:
    the need for playful creativity and scientific
    hard-mindedness. Nutrition. 2003
    Nov-Dec;19(11-12):997-1000.
  • 39 Steps. Drug Discov Today. 2005 Sep
    1;10(17):1175-82.

67
If time allows
68
RTI Regional Comprehensive Metabolomics Resource
Core (RTI RCMRC)
  • Susan Sumner, PhD
  • Director RTI RCMRC
  • Discovery Sciences
  • Proteomics and Metabolomics Programs
  • RTI International

69
Contact Information for the RTI RCMRC
  • Susan C.J. Sumner, PhD
  • Director RTI RCMRC
  • Senior Scientist nanoSafety
  • RTI International
  • Discovery Sciences
  • 3040 Cornwallis Drive
  • Research Triangle Park
  • North Carolina 27709
  • ssumner_at_rti.org
  • 919-541-7479 (office)
  • 919-622-4456 (cell)
  • Jason P. Burgess, PhD
  • Program Coordinator, RTI RCMRC
  • Associate Director, Discovery Sciences
  • RTI International
  • 3040 Cornwallis Drive
  • Research Triangle Park
  • North Carolina 27709
  • jpb_at_rti.org
  • 919-541-6700 (office)

70
MS and NMR Instruments at RTI and DHMRI
Instrument               RTI  DHMRI
Mass spectrometers (38)
  LC-MS                   13      6
  GC-MS                    4      3
  GC x GC-TOF-MS           1      1
  ICP-MS                   6      1
  MALDI ToF/ToF            2      1
NMR (6)                    2      4
71
Some RTI Metabolomics Applications and Pilots
  • Experience with adolescent and adult human
    subject research, animal model and cell based
    research, e.g.,
  • Apoptosis - cells
  • Drug-induced liver injury - animal models
  • In utero exposure to chemicals and fetal
    imprinting - animal models
  • Dietary exposure and imprinting - animal models
  • NAFLD - pediatric obesity microbiome
  • Weight loss - pediatric obesity
  • Preterm delivery - human subjects
  • Response to vaccine - human subjects
  • Nicotine withdrawal - human subjects
  • Colon cancer - human subjects

72
Pilot and Feasibility Studies
  • The aim of the pilot and feasibility program is
    to foster collaborations and promote the use of
    metabolomics.
  • Studies will be selected through an application
    process.
  • Application involves an abstract, a description of
    samples available (matrix type, volume, type and
    duration of storage, sample processing,
    freeze-thaw cycles, etc.), a description of
    phenotypes, and a plan for subsequent
    grant/contract submissions for metabolomics
    analysis beyond the initial pilot study.
  • Applications may also include technology
    development.
  • Applicants must agree to deposit data in DRCC,
    coauthor publications, and submit joint
    grant/contract proposals.
  • Deadlines being defined