ArrayTrack: Data management, analysis and interpretation tool for DNA microarray and beyond

1
ArrayTrack: Data management, analysis and interpretation tool for DNA microarray and beyond

Weida Tong
Director, Center for Toxicoinformatics, NCTR/FDA
Weida.tong_at_fda.hhs.gov

2
ArrayTrack: A brief history of the past 5 years
Development Cycle

AT version 1 (2001)
Filter array data management tool
AT version 2 (2002): in-house microarray core facility
Customized two-color array data management, analysis and interpretation
Opened to the public (late 2003)
AT version 3.1 (2004): VGDS
Affymetrix analysis capability enhanced
AT version 3.2 (2005): MAQC
Tested on 7 commercial platforms (Affy, Agilent one- and two-color arrays, ABI, CodeLink, Illumina, ...)
Integrated with other software (IPA, MetaCore, DrugMatrix, CEBS, SAS/JMP, ...)
AT version 4 (2006-present)
CDISC/SEND standard
VGDS → VXDS

3
ArrayTrack: Client-Server Architecture
CLIENT
Analysis Tools
Public data (gene annotation, pathways, ...)
Study data (clinical and non-clinical data)
Microarray, proteomics, metabolomics
SERVER
CDISC/SEND
MIAME
NCBI, KEGG, GO
4
ArrayTrack: An Integrated Solution
Clinical and non-clinical data
Chemical data
ArrayTrack
5
ArrayTrack: Freely Available to the Public
Web access
Local installation
Number of unique users accessing the locally installed version of ArrayTrack
Number of unique users accessing the web version of ArrayTrack
6
ArrayTrack Website
http://www.fda.gov/nctr/science/centers/toxicoinformatics/ArrayTrack/
7
DNA Microarray
Key advantage: simultaneously measure tens of thousands of transcripts in a single experiment
Also called an array, chip or slide
Spot (DNA probe): oligo (25-80 mer) or cDNA
Principle: hybridization of known DNA probes on the chip with complementary DNA sequences from the sample
Substrate: glass, nylon or plastic
8
Application 1 - Mechanistic Study
Treated rats
Untreated rats
Comparing
Identify the genes affected by the treatment, i.e., identify differentially expressed genes (DEGs)
Mechanism
9
Data Format of DNA Microarray
[Figure: data matrix of genes (1 ... N) by experiments (1 ... m), labeled "Differential expression"]
11
Complexity of Microarray Experiment: An Array of Options
Affymetrix (GeneChip, 25-mer, in-situ synthesis, one-color)
Agilent (60-mer, in-situ synthesis, two-color)
Applied Biosystems (60-mer, chemiluminescence)
Clontech (70-80-mer)
GE Healthcare (CodeLink, 30-mer, one-color)
Illumina (BeadArray, 70-mer)
MWG (50-mer)
NimbleGen (MAS in-situ synthesis)
Operon (Qiagen) (70-mer)
Customized long oligo or cDNA arrays ...
12
ArrayTrack for Microarray Data Management and
Analysis
Hypothesis
Exp Design
Microarray Exp
Data management
Data analysis
Data interpretation
13
MicroarrayDB: Storing data associated with a microarray experiment

Microarray database
Handles both one- and two-channel data, including Affy data
Only the CEL file is required for Affy data
Supports toxicogenomics research by storing tox parameters, e.g., dose schedule, treatment and sacrifice time
MIAME-supportive, to capture the key data of a microarray experiment
Will be MAGE-ML compliant to ensure interexchangeability between ArrayTrack and other public databases

Microarray DB
14
LIB Component: Containing functional information for microarray data interpretation

Functional data
Individual gene analysis
Pathway-based analysis
Gene Ontology based analysis
Linking expression data to the traditional
toxicological data

Microarray DB
LIB
15
TOOL Component: Containing functionality for microarray data analysis

Analysis tools
Four normalization methods
Mean/median scaling for Affy data
LOWESS for two-color arrays
Gene selection methods
T-test, permutation t-test, ...
Filtering using fold change, intensity, flag information
Volcano plot, p-value plot
Data exploring (e.g., HCA, PCA)
Many visualization tools (e.g., flexible scatter plot, bar chart viewer, ...)

TOOL
Microarray DB
LIB
16
TOOL
Microarray DB
LIB
18
Supporting Eight Platforms

Affy, Agilent, ABI, Combimatrix, Eppendorf, GE
Healthcare, Illumina and customized arrays
Affy data
Probe data (.cel file)
Probe-set data

Individual hyb import
Batch import
19
Comparing ArrayTrack-derived Gene Lists with those reported by the sponsor

The gene lists presented in the submitted report

20
Normalization Methods
Four common normalization methods for converting Affy probe data to probe-set data: MAS5, dChip, RMA and PLIER
Five common normalization methods for other
platforms, including LOWESS
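The slide names the normalization methods without describing them. As a rough illustration of what normalization does, the sketch below applies median scaling, which this deck lists earlier among the ArrayTrack tools for Affy data. It is not ArrayTrack's implementation: the function name `median_scale`, the target value of 1000 and the intensity values are all assumptions made for the example.

```python
from statistics import median

def median_scale(intensities, target=1000.0):
    """Rescale one array's intensities so that their median equals `target`."""
    m = median(intensities)
    if m <= 0:
        raise ValueError("median intensity must be positive")
    factor = target / m
    return [x * factor for x in intensities]

# Two made-up arrays that differ only in overall brightness.
array_a = [200.0, 400.0, 800.0]
array_b = [100.0, 200.0, 400.0]

scaled_a = median_scale(array_a)
scaled_b = median_scale(array_b)
```

After scaling, both arrays share the same median, so a gene's value can be compared across arrays without the overall brightness difference dominating.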
21
Gene Selection

Significant genes can be identified based on:
T-test (with or without Bonferroni correction) and permutation t-test
False discovery rate (FDR) (e.g., Benjamini-Hochberg, p-value plot)
Volcano plot (considering both p-value and fold change)
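To make the FDR option concrete, here is a minimal sketch of the Benjamini-Hochberg adjustment in plain Python. It shows the standard step-up procedure, not ArrayTrack's own code; the function name, the p-values and the 0.05 cutoff are assumptions chosen for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

# Made-up per-gene p-values.
p_values = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(p_values)

# Genes (by index) passing an assumed FDR cutoff of 0.05.
selected = [i for i, q in enumerate(adjusted) if q <= 0.05]
```

With these inputs only the first gene passes the cutoff, although three of the four raw p-values are below 0.05.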

22
Microarray Experiment Results

Treated group
Replicates
Cancer cell lines
Tumor tissues

Control group: a set of control samples
Fold change: up-regulated or down-regulated
P: statistical significance
23
Gene Selection: T-test, Bonferroni adjustment and beyond
Two types of experiment and the error rate for each:
Single testing (1 gene): P < 0.05, low error rate
Multiple testing (n genes): P = 1 - (1 - P_i)^n; if P_i = 0.05, high error rate
e.g., if n = 10 and P_i = 0.05, P = 0.401
Select a gene list based on:
P value
Bonferroni criterion
Low sensitivity
Low power
False discovery rate (e.g., Benjamini-Hochberg, p-value plot)
Permutation t-test (e.g., SAM)
Volcano plot (combination of p-value and fold change)
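The multiple-testing arithmetic on this slide can be checked with a short sketch. It assumes the n tests are independent, which is what the slide's formula implies, and the function names are illustrative only.

```python
def family_wise_error_rate(p_i, n):
    """Chance of at least one false positive across n independent tests."""
    return 1.0 - (1.0 - p_i) ** n

def bonferroni_threshold(alpha, n):
    """Per-gene cutoff that keeps the family-wise error rate at or below alpha."""
    return alpha / n

# The slide's example: 10 genes, each tested at 0.05.
fwer_10 = family_wise_error_rate(0.05, 10)

# The Bonferroni cutoff for the same 10 genes.
cutoff_10 = bonferroni_threshold(0.05, 10)
```

The per-gene cutoff shrinks in proportion to the number of genes tested, which is why the slide notes the low sensitivity and low power of the Bonferroni criterion on arrays with thousands of genes.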
24
Data Interpretation

Pathway-based tools
Ingenuity Pathways Analysis
KEGG
PathArt

GOFFA: Gene Ontology-based tool
Gene Annotation
25
Data Interpretation: Pathway-based analysis using KEGG Library

KEGG: Kyoto Encyclopedia of Genes and Genomes (http://www.genome.jp/kegg/)
It provides a free database of metabolic, regulatory and disease pathways; most of them are metabolic pathways
ArrayTrack contains pathways for human (134), rat (116) and mouse (124)
Click KEGG in GeneLib and the genes are regrouped by the pathways they are involved in
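Regrouping a gene list by pathway amounts to inverting a gene-to-pathway annotation table. The sketch below shows the idea only; it does not reflect how ArrayTrack stores KEGG data, and the gene and pathway names are placeholders rather than real KEGG entries.

```python
from collections import defaultdict

# Hypothetical annotations: placeholder names, not real KEGG identifiers.
gene_to_pathways = {
    "gene_A": ["pathway_1", "pathway_2"],
    "gene_B": ["pathway_1"],
    "gene_C": [],  # a gene with no pathway annotation
}

def group_by_pathway(gene_list, annotations):
    """Invert gene -> pathways into pathway -> sorted list of genes."""
    grouped = defaultdict(list)
    for gene in gene_list:
        for pathway in annotations.get(gene, []):
            grouped[pathway].append(gene)
    return {pathway: sorted(genes) for pathway, genes in grouped.items()}

by_pathway = group_by_pathway(["gene_B", "gene_A", "gene_C"], gene_to_pathways)
```

Genes without any annotation simply drop out of the grouped view, and a gene involved in several pathways appears under each of them.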

26
Data Interpretation- Pathway-based analysis
using PathArt

PathArt (Jubilant) is a pathway database that
contains over 600 mammalian disease and signaling
pathways.
The pathways are collated through manual curation
from literature and public domain databases.
ArrayTrack contains PathArt pathways for human
(276), rat (116) and mouse (77)
Click PathArt in GeneLib and the genes are regrouped by the pathways they are involved in (see next slide)

27
Ingenuity Pathways Analysis (IPA)
Ingenuity Pathways Analysis
Conduct statistical analysis
Interrogate genes or proteins on omics scale
Elucidate functional pathways
Understand markers of efficacy and safety

KEGG and PathArt provide canonical pathways
IPA provides both canonical and de-novo pathways

28
Data Interpretation- Gene Ontology Analysis
(GOFFA)
29
Data Interpretation- GO-based analysis using
GOFFA

GOFFA: Gene Ontology For Functional Analysis
It is built on the Gene Ontology (GO) database
Important for grouping genes into functional classes
GO: three ontologies
Molecular function: activities performed by individual gene products at the molecular level, such as catalytic activity, transporter activity, binding
Biological process: broad biological goals accomplished by ordered assemblies of molecular functions, such as cell growth, signal transduction, metabolism
Cellular component: the place in the cell where a gene product is found, such as nucleus, ribosome, proteasome

30
Data Interpretation- GO-based analysis using
GOFFA

Each ontology (e.g., mol. function) is presented
as a hierarchical tree structure
Each node is a GO term that contains several
known genes
Levels represent the specificity of terms

P-Path View
Genes in a specific GO term (node)
Tree view
Hierarchical tree
Genes are searched against GO
Fisher test
Number of genes
Searching panel
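The slide mentions a Fisher test for GO terms without giving details. A common way to score a GO term is a one-sided Fisher exact test, which asks how likely it is to see at least k annotated genes in a selected list by chance. The sketch below assumes that formulation; whether GOFFA computes it exactly this way is not stated in the slides, and the counts are made up.

```python
from math import comb

def enrichment_p_value(k, n, K, N):
    """One-sided Fisher exact p-value, P(X >= k).

    N: genes on the array, K: genes annotated with the GO term,
    n: genes in the selected list, k: selected genes with the term.
    """
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, x) * comb(N - K, n - x) for x in range(k, upper + 1))
    return tail / total

# Made-up counts: 20 genes on the array, 4 carry the term,
# 5 genes were selected and 3 of those carry the term.
p = enrichment_p_value(k=3, n=5, K=4, N=20)
```

A small p-value suggests the term is over-represented in the selected list. Because many GO terms are tested at once, the multiple-testing concerns raised in the gene selection slides apply here as well.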
32
Data Exploring

Before gene selection: remove outliers
Mixed scatter plot
Principal component analysis (PCA)

After gene selection: drill-down analysis
Bar chart
Hierarchical clustering analysis (HCA)

33
Data Exploring
Expression profile
PCA
HCA
35
Toxicogenomics Study

Toxicology parameters: clinical pathology data (clinical chemistry, hematology), histopathology, liver weight
Gene expression data
Other omics data

37
Study Data Management and Analysis

FDA eSubmission efforts
Clinical data: Clinical Data Interchange Standards Consortium (CDISC)
Non-clinical data: Standard for Exchange of Nonclinical Data (SEND)
Subject, treatment, clinical pathology, histopathology, ...
Conforming to SDTM, which is used for CDISC/SEND
Microarray data management and analysis are handled in the Array Domain, and the findings can be correlated with data in the Study Domain

38
ArrayTrack Tutorial

41
Topic 1: Comparing two groups (e.g., treated vs. control groups)

The array data are uploaded into ArrayTrack and normalized. Now what?
You are going to learn how to determine the differentially expressed genes (DEGs) and make sense of them using the ArrayTrack analysis and library functions

Select a set of saved arrays
Divide the arrays into treated and control groups
T-test
Determine differentially expressed genes (DEGs)
Biological interpretation
Individual gene analysis
Pathway analysis
Gene Ontology analysis
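For the T-test step of this workflow, the per-gene calculation can be sketched as below. The slides do not say which t-test variant ArrayTrack uses, so Welch's unequal-variance statistic is an assumption here. The expression values are made up, and turning the statistic into a p-value would additionally need a t-distribution from a statistics library.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(treated, control):
    """Welch's t statistic for two groups of log-scale expression values.

    Raises ZeroDivisionError if both groups have zero variance.
    """
    se = sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    return (mean(treated) - mean(control)) / se

# Made-up log2 intensities for one gene, three arrays per group.
treated = [9.0, 10.0, 11.0]
control = [7.0, 8.0, 9.0]

t_stat = welch_t(treated, control)

# On a log2 scale the fold change is a difference of group means.
log2_fold_change = mean(treated) - mean(control)
```

Running this for every gene gives the two quantities that a volcano plot combines: a significance measure and a fold change.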
42
Examination of Transcriptional Fingerprints of
Primary Rat Hepatocytes Exposed to Cadmium Acetate

Examine the effect of Cd treatment on rat hepatocytes at multiple doses and time points
Affy chip RT-U34 (1030 genes)
Only one dose and one time point are used: 2 mg and 12 hrs
12 hybridizations: 6 treated vs. 6 control

Hepatocytes (B, C, D)
Control
Treated with Cd (2 mg)
12 hrs later: 4 hybs for each animal, 12 hybs in total
Naming: Dose_Time_BioRep_TechRep
D0_T12_C_a
D0_T12_C_b
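The Dose_Time_BioRep_TechRep naming convention makes it possible to recover the experimental design from the array names alone. A minimal parsing sketch follows; the class and function names are hypothetical and not part of ArrayTrack.

```python
from typing import NamedTuple

class HybName(NamedTuple):
    dose: str
    time: str
    bio_rep: str
    tech_rep: str

def parse_hyb_name(name):
    """Split a Dose_Time_BioRep_TechRep array name into its four fields."""
    parts = name.split("_")
    if len(parts) != 4:
        raise ValueError(f"expected 4 underscore-separated fields: {name!r}")
    return HybName(*parts)

parsed = parse_hyb_name("D0_T12_C_a")
```

Grouping the parsed names by `dose` is one way to perform the treated/control split in the Topic 1 workflow.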
