Systems Biology Visualization - PowerPoint PPT Presentation

Slides: 48
Provided by: BioSc3
Transcript and Presenter's Notes

1
Systems Biology Visualization
There has been a rapid accumulation of data from
protein interaction, gene expression and
metabolic pathway analysis. To derive meaningful
information out of this data, we need to develop
integrative visualization techniques, which
provide an insight into its biological relevance.
  • Surabhi Agarwal

2
Definition of the Problem
As a case study, we will consider glioma, a group of brain
tumors. In the first part of the animation, gene expression
data analysis gives an insight into the regulation of genes
in glioma, showing which genes are modulated (up- or
down-regulated) in the disease. In the second part, we use
protein interaction data to find the metabolic pathways
involved in glioma. In the third part, we explore pathway
databases and their features to study the pathways retrieved
from the gene expression and protein interaction studies.
Audio Narration
Action
Description of the action

Static Image
Display image and read narration
3
Master Layout (Part 1)
This animation consists of 3 parts:
  • Part 1: Gene Expression Data Analysis
  • Part 2: Protein Interaction Data Analysis
  • Part 3: Metabolic Profile Databases
Steps in Part 1:
  • Choose the problem to study and extract relevant data
  • Send the gene expression profile data as input to the tool
  • Compute the features related to gene regulation
  • Genes up- or down-regulation
http://www.genome.jp/kegg/
4
Definitions of the components: Part 1 Gene
expression data analysis
  • Interaction Data: Interaction data refers to
    information regarding the nature and type of
    bonding between various biological components. It
    can be Protein Interaction Data, Gene Expression
    Data or Metabolic Pathway Data.
  • Visualization tools: Software tools that are
    capable of reading interaction data and
    representing it in a graphical format, thereby
    providing a simplified biological insight, e.g.
    Cytoscape for Protein Interaction data and
    Genespring for Gene Expression Data.
  • Microarray: Microarrays are printed on a solid
    surface, typically glass, and are used to study
    and analyze a large number of samples
    simultaneously in high throughput.
5
Gene Expression Profile Data: Option
1
DATA GENERATION
INPUT
VISUALIZATION
2
3
Proceed to Full Animation
4
Audio Narration
Action
Description of the action

Option for user to view Input or Output
The Data Generation box should be linked to Step 1. The
Input box should be linked to the Step 2 input slides. The
same goes for output: the Output slides should be linked to
Step 3. The Visualization slide should be linked to Step 4.
This slide gives the user the option to go through only
specific content from the animation.
To view the protocol for submitting files, click on Input.
To view the protocol for retrieving and analyzing output
files, click on Output. To proceed to the full animation,
click on the arrow.
6
Step 1.a - Gene Expression Profile Data: Data
Extraction from Experiments
1
2
Biological Samples e.g. gliomas
3
Microarray Chips
Scanned Slides
4
Audio Narration
Action
Description of the action
Schematic for extracting the data for defined
problem
Follow the animation. Re-draw the figures.
Users can extract gene microarray data from
Microarray Experiments. The normalized microarray
data gives an insight into the regulation of the
genes. This regulation is checked by studying the
microarray data through Gene Expression Profile
Data Analysis software. For a detailed insight
into the Microarray Technique, study the OSCAR
animation for Microarray Technologies.
5
Biochemistry by A.L.Lehninger et al., 3rd edition
7
Step 1.b - Gene Expression Profile Data: Data
Extraction from Databases
1
B Input - Extracting microarray data For analysis
Microarray Data Repository
Query Term
High-Grade glioma
2
3
Microarray Data file
PMID | ACCESSION NUMBER | PROTEIN NAME | GLIOMA TYPE | VALIDATION | FOLD CHANGE | p-VALUE
4
Audio Narration
Action
Description of the action
Schematic for extracting the data for defined
problem
Follow the animation and show storage of files in
Local System
Users can extract microarray data directly from experiments
or from public repositories such as GEO DataSets from NCBI.
Premier microarray research institutes have their own
dedicated databases for the microarray data generated in
their labs. Because of its large size, this data is
distributed as compressed files, which need to be stored on
a local computer. Here, as an example, we'll study the
regulation of genes in a brain tumor known as glioma. Gene
expression data analysis will give us a picture of the
genes that are modulated (up- or down-regulated) during
glioma.
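Since repository downloads arrive compressed, unpacking them locally is usually the first practical step. The following is a minimal sketch, assuming a single gzip-compressed file; the helper name `decompress_gz` and the example paths are hypothetical, and real archives may use other formats (for example tar bundles) that need different handling.

```python
import gzip
import shutil
from pathlib import Path


def decompress_gz(src: Path, dest_dir: Path) -> Path:
    """Unpack one .gz file into dest_dir and return the path of the result."""
    if src.suffix != ".gz":
        raise ValueError(f"expected a .gz file, got {src.name}")
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Drop the trailing ".gz" to obtain the output file name.
    target = dest_dir / src.name[: -len(".gz")]
    with gzip.open(src, "rb") as fin, open(target, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return target
```

A call such as `decompress_gz(Path("downloads/sample.CEL.gz"), Path("local_data"))` would leave `local_data/sample.CEL` ready for upload to the analysis tool.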
5
8
Step 2 Gene Expression Profile Data - Input
1
?
The technology used in Microarray Experiments
refers to the reference organism used for making
the microarray chip
ADD PROJECT
Glioma
ADD EXPERIMENT
Select Experimental Type
Affymetrix Expression
  • Agilent Single Color
  • Agilent Two Color
  • Affymetrix Copy Number
  • Affymetrix Expression
  • Illumina Association Analysis
  • Illumina Copy Number
  • Illumina Single Color
  • Real-Time PCR

SELECT PLATFORM
Select Technology (if applicable)
Human
  • Barley
  • Bovine
  • E. coli
  • B. subtilis
  • Drosophila
  • Human
  • Mouse
  • Maize
  • Human

3
UPLOAD DATA
Folder A/GSE123/GSM456.CEL
4
Action
Audio Narration
Description of the action
Schematic for entering data and setting parameters
Follow the animation and re-draw images to
replicate the working of a software environment
The software follows the input procedure in a sequential
manner. The initial steps are to add a new project and a
new experiment. While adding an experiment, the user needs
to define the type of experiment. Due to a lack of
standardization, microarray data is saved in various file
formats such as CEL, GPR, GAL and CDT. Each tool supports
one or more of these formats.
9
Step 3.a - Gene Expression Profile Data - Output
1
?
High cutoff to give significant results.
Comparison: GSM34580.CEL vs GSM34586.CEL
Probe Set ID | Fold change | Regulation | Gene Symbol
34517_at | 16.870739 | up | HMGCS1
37513_at | 14.440558 | up | SCD
33369_at | 9.3396635 | up | SC4MOL
34375_at | 10.11749 | down | CCL2
35372_r_at | 12.105057 | down | IL8
35766_at | 8.585363 | down | KRT18
38427_at | 12.070478 | down | COL15A1
1369_s_at | 11.556258 | down | IL8
875_g_at | 9.369903 | down | CCL2
695_at | 9.015739 | down | TNC
266_s_at | 8.460315 | up | CD24

2
Filter data - Fold Change
Heat Map
3
Summary Statistics
Functional Analysis - GO
4
Action
Audio Narration
Description of the action
Schematic for interpreting the results of Gene
Expression Data Analysis
A high cutoff is applied so that only significant results
are reported. During the comparison, probe sets that
satisfy a fold-change cutoff of more than 8 in at least one
condition pair are displayed in the result. Regulation is
reported by comparing the ratio of conditions 1 and 2.
Thus, the highlighted gene HMGCS1 is up-regulated in sample
GSM34580 as compared to GSM34586.
Show the simulation of the software. In each
slide, the tab that is high-lighted is ACTIVE. In
the animation format, the tab should highlight
when you click on it followed by the content of
the slide. Then the mouse should move to the
second tab and click on it leaving the first tab
inactive and second tab active. Activity of tabs
can be differentiated by separate Colors
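The fold-change filter described in the narration for this slide can be sketched in a few lines of Python. This is an illustration only, not the reference software's implementation: the rows are copied from the output table on this slide, while the names `ProbeResult` and `filter_by_fold_change` are hypothetical.

```python
from typing import NamedTuple


class ProbeResult(NamedTuple):
    probe_set_id: str
    fold_change: float  # magnitude of change between the two samples
    regulation: str     # "up" or "down" (GSM34580 relative to GSM34586)
    gene_symbol: str


# Rows copied from the output table on the slide.
RESULTS = [
    ProbeResult("34517_at", 16.870739, "up", "HMGCS1"),
    ProbeResult("37513_at", 14.440558, "up", "SCD"),
    ProbeResult("33369_at", 9.3396635, "up", "SC4MOL"),
    ProbeResult("34375_at", 10.11749, "down", "CCL2"),
    ProbeResult("35372_r_at", 12.105057, "down", "IL8"),
    ProbeResult("35766_at", 8.585363, "down", "KRT18"),
    ProbeResult("38427_at", 12.070478, "down", "COL15A1"),
    ProbeResult("1369_s_at", 11.556258, "down", "IL8"),
    ProbeResult("875_g_at", 9.369903, "down", "CCL2"),
    ProbeResult("695_at", 9.015739, "down", "TNC"),
    ProbeResult("266_s_at", 8.460315, "up", "CD24"),
]


def filter_by_fold_change(rows, cutoff=8.0):
    """Keep only probe sets whose fold change exceeds the cutoff."""
    return [row for row in rows if row.fold_change > cutoff]


# Every row on the slide passes the cutoff of 8; a stricter cutoff keeps fewer.
strict = filter_by_fold_change(RESULTS, cutoff=12.0)
print([row.gene_symbol for row in strict])
```

Raising the cutoff trades sensitivity for a shorter, more confident gene list, which is why the slide calls a high cutoff a way to obtain significant results.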
10
Step 3.b - Gene Expression Profile Data - Output
1
?
upregulated
downregulated
2
Filter data - Fold Change
Heat Map
3
Summary Statistics
Legend for color coding of regulation
Functional Analysis - GO
4
Audio Narration
Action
Description of the action
Schematic for interpreting the results of Gene
Expression Data Analysis
Animator needs to re-draw all screenshots, as they have
been taken from the reference software. The animator must
not copy the image, or a part thereof, in the final
animation. Show the simulation of the software. In each
slide, the tab that is highlighted is ACTIVE. In the
animation format, the tab should highlight when you click
on it, followed by the content of the slide. Then the mouse
should move to the second tab and click on it, leaving the
first tab inactive and the second tab active. Activity of
tabs can be differentiated by separate colors.
The heat map is a graphical visualization of the regulation
of genes, as determined by the fold-change cut-off value
provided by the user. Up-regulated genes are marked in red
and down-regulated genes in blue, as explained in the
figure legend.
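The colour coding in the legend can be expressed as a small mapping. The sketch below assumes that colour intensity scales linearly with fold change relative to the largest value; that scaling, and the function name `heat_map_cells`, are assumptions for illustration and not taken from the reference software.

```python
# Legend from the slide: up-regulated genes in red, down-regulated in blue.
COLOURS = {"up": "red", "down": "blue"}


def heat_map_cells(rows):
    """Turn (gene, fold_change, regulation) rows into (gene, colour, intensity).

    Intensity is the fold change divided by the largest fold change in the
    input, so it lies in (0, 1]. This linear scaling is an assumption.
    """
    max_fold = max(fold for _, fold, _ in rows)
    return [(gene, COLOURS[reg], fold / max_fold) for gene, fold, reg in rows]


cells = heat_map_cells([
    ("HMGCS1", 16.870739, "up"),
    ("CCL2", 10.11749, "down"),
    ("TNC", 9.015739, "down"),
])
```

Each resulting cell could then be handed to any plotting library to draw one coloured tile per gene.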
11
Step 3.c - Gene Expression Profile Data - Output
1
?
Property | GSM34580.CEL | GSM34586.CEL
No. of Observations | 11 | 11
No. of Missing Values | 0 | 0
Minimum | -1.798769 | -2.0382257
Maximum | 2.0382257 | 1.798769
Mean | -0.42409563 | 0.42409545
Median | -1.5862231 | 1.5862226
Std. Deviation | 1.7535444 | 1.7535444
2
Filter data - Fold Change
Heat Map
3
Summary Statistics
Functional Analysis - GO
4
Action
Audio Narration
Description of the action
Schematic for interpreting the results of Gene
Expression Data Analysis
Show the simulation of the software. In each
slide, the tab that is high-lighted is ACTIVE. In
the animation format, the tab should highlight
when you click on it followed by the content of
the slide. Then the mouse should move to the
second tab and click on it leaving the first tab
inactive and second tab active. Activity of tabs
can be differentiated by separate Colors
The summary statistics give a statistical overview of the
genes that remain after a cut-off has been specified in the
gene expression analysis software. This includes the number
of genes observed to be regulated and the statistical
significance of the corresponding fold change.
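The properties listed in the summary table (observations, missing values, minimum, maximum, mean, median, standard deviation) can be computed with Python's standard library. This is a sketch under stated assumptions: the input values are hypothetical and are not the data behind the slide, `None` is assumed to mark a missing value, and the count of observations is assumed to include missing entries.

```python
import statistics


def summarize(values):
    """Summary statistics for one sample; None marks a missing value."""
    observed = [v for v in values if v is not None]
    return {
        "observations": len(values),
        "missing": len(values) - len(observed),
        "minimum": min(observed),
        "maximum": max(observed),
        "mean": statistics.mean(observed),
        "median": statistics.median(observed),
        "std_deviation": statistics.stdev(observed),  # sample standard deviation
    }


# Hypothetical normalized expression values for illustration only.
example = summarize([-1.5, -0.5, None, 0.5, 1.5])
```

Whether a given tool reports the sample or the population standard deviation is not stated on the slide; `statistics.pstdev` would be the population variant.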
12
Step 3.d - Gene Expression Profile Data - Results
1
?
Molecular Functions
  • catalytic activity
  • hydroxymethylglutaryl-CoA synthase activity
  • cytokine activity
  • protein binding
  • chemokine activity
  • G-protein-coupled receptor binding
  • signal transducer activity
Cellular Components affected
  • endoplasmic reticulum
  • extracellular region
  • soluble fraction
  • cytoplasm
  • membrane fraction
Biological Functions
  • lipid metabolic process
  • fatty acid metabolic process
  • positive regulation of endothelial cell
    proliferation
  • angiogenesis
  • apoptosis
  • cell adhesion
  • response to hypoxia

Filter data - Fold Change
Heat Map
3
Summary Statistics
Functional Analysis - GO
4
Audio Narration
Action
Description of the action
Schematic for interpreting the results of Gene
Expression Data Analysis
Show the simulation of the software. In each
slide, the tab that is high-lighted is ACTIVE. In
the animation format, the tab should highlight
when you click on it followed by the content of
the slide. Then the mouse should move to the
second tab and click on it leaving the first tab
inactive and second tab active. Activity of tabs
can be differentiated by separate colors
The functional analysis tool reports the functions in which
the regulated genes are involved at the molecular and
biological levels, and the cellular components they affect.
13
Step 4. - Gene Expression Profile Data -
Visualization
1
?
2
3
4
5
http://www.ingenuity.com/
14
Step 4 - Gene Expression Profile Data -
Visualization
1
Audio Narration
Action
Description of the action
The pathway information relevant to glioma studies can be
extracted from the input data. Here we show the merged gene
regulatory pathway. We zoom into the pathway titled Cell
Cycle, Cellular Assembly and Organization, DNA Replication,
Recombination, and Repair and see the interactions of the
TP53 pathway.
Static Slide
Animator needs to re-draw all screenshots, as they have
been taken from the reference software. The animator must
not copy the image, or a part thereof, in the final
animation. Show the image with audio narration. Show the
zooming effect as shown in the animation.
15
Master Layout (Part 2)
This animation consists of 3 parts:
  • Part 1: Gene Expression Data Analysis
  • Part 2: Protein Interaction Data Analysis
  • Part 3: Metabolic Profile Databases
Steps in Part 2:
  • Retrieve protein interaction data from experiments or
    public repositories
  • Input the data in the software tool in the right format
  • View, download and interpret the results
16
Definitions of the components: Part 2 Protein
Interaction Data Analysis
  • Knowledgebase: The Protein Interaction Network
    tools accept the user data and map it to their
    repository. These storage units of the tools are
    called their knowledgebase.
  • Accession Number: The accession number of a
    protein is a unique identifier, which acts as a
    common link relating the data provided as input
    by the users to the knowledgebase of the tool.
  • Protein microarray: These are miniaturized
    arrays, commonly printed on glass, polyacrylamide
    gel pads or microwells, onto which small
    quantities of thousands of proteins can be
    simultaneously immobilized for high-throughput
    assaying.
17
Gene Expression Profile Data: Option
1
DATA GENERATION
INPUT
VISUALIZATION
2
3
Proceed to Full Animation
4
Audio Narration
Action
Description of the action

Option for user to view Input or Output
The Data Generation box should be linked to Step 1. The
Input box should be linked to the Step 2 input slides. The
same goes for output: the Output slides should be linked to
Step 3. The Visualization slide should be linked to Step 4.
This slide gives the user the option to go through only
specific content from the animation.
To view the protocol for submitting files, click on Input.
To view the protocol for retrieving and analyzing output
files, click on Output. To proceed to the full animation,
click on the arrow.
18
Step 1.a - Protein Molecular Interaction Network
Data Extraction
1
2
Protein Samples
3
Protein Microarray Chips
Scanned Slides
4
Audio Narration
Action
Description of the action
Schematic for extracting the data for defined
problem
Follow the animation. Re-draw the figures.
Users can extract protein microarray data from
Microarray Experiments. The normalized microarray
data gives an insight into the regulation of the
genes. This regulation is checked by studying the
microarray data through Gene Expression Profile
Data Analysis software. For a detailed insight
into the Microarray Technique, study the OSCAR
animation for Microarray Technologies.
5
19
Step 1.b - Protein Molecular Interaction Network
Data Extraction
1
?
Extract Data from Literature sources and store it
in a spreadsheet
Literature Resource
Query Term
High-Grade glioma
Rawdata.xls
2
PMID | ACCESSION NUMBER | PROTEIN NAME | GLIOMA TYPE | VALIDATION | FOLD CHANGE | p-VALUE
Extract data from Microarray Data repositories
3
4
Audio Narration
Action
Description of the action
  • Protein molecular interaction software is used
    to build and analyze networks of proteins, given
    their accession numbers. The networks are built
    by mapping the input data to the software's
    knowledgebase. Here, we explain with a list of
    proteins modulated in the disease condition
    called glioma, which are extracted from
    literature resources.
  • Microarray databases: As an output, we get a
    spreadsheet containing microarray data.

The first panel is about extracting information from a web
resource. Show the required PDFs being downloaded and read
through to extract data. Follow this with a screenshot of
microarray databases. At the end, show the Rawdata.xls file
being formed.
Schematic for extracting the data for defined
problem
5
20
Step 1.c - Protein Molecular Interaction Network
Data Extraction
1
2
Extract data from Microarray Data repositories
3
Rawdata.xls
PMID | ACCESSION NUMBER | PROTEIN NAME | GLIOMA TYPE | VALIDATION | FOLD CHANGE | p-VALUE
4
Audio Narration
Action
Description of the action
Schematic for extracting the data for defined
problem
The first panel is about extracting information
from web resource. Show the required PDFs getting
downloaded and read through to store specific
data in spreadsheets
Protein molecular interaction software is used to build and
analyze networks of proteins, given their accession
numbers. The networks are built by mapping the input data
to the software's knowledgebase. Here, we explain with a
list of proteins modulated in the disease condition called
glioma, which are extracted from literature resources or
databases.
5
21
Step 2.a - Protein Molecular Interaction Network
Input
1
?
CREATE PROJECT
UPLOAD
MAP DATA
Project Glioma
Enter Project Name
2
Core Analysis
Enter Experiment Type
  • Biomarker Analysis
  • Core Analysis
  • Toxicology Analysis
  • Metabolic Analysis
3
4
Audio Narration
Action
Description of the action
Schematic for Input
Show the simulation of the software. In each
slide, the tab that is high-lighted is ACTIVE. In
the animation format, the tab should highlight
when you click on it followed by the content of
the slide. Then the mouse should move to the
second tab and click on it leaving the first tab
inactive and second tab active. Activity of tabs
can be differentiated by separate Colors
The name of the project and experiments must be
entered by the user in the software for the
purpose of saving the current status of the work.
In the experiment type, the user must select the
type of analysis that needs to be conducted on
the dataset. For this Glioma case study, we
undertake core analysis of the data to identify
its network.
5
22
Step 2.b - Protein Molecular Interaction Network
Input
1
?
CREATE PROJECT
UPLOAD DATA
MAP DATA
Folder1/Rawdata.xls
Upload Excel File
2
PMID | Protein Name | Accession Number | Glioma Type
17653765 | Fructose bisphosphate aldolase | 78070601 | anaplastic oligodendroglioma
17653765 | Phosphoglycerate mutase 1 | 56081766 | anaplastic oligodendroglioma
17653765 | Carbonic anhydrase ii | 443135 | anaplastic oligodendroglioma
- | Enolase 1 | 4503571 | Glioblastoma multiforme
- | Enolase | 693933 | Glioblastoma multiforme
- | a-Enolase like 1 | 3282243 | Glioblastoma multiforme
- | Enolase 1 | 4503571 | Glioblastoma Multiforme
- | Aldolase C, fructose biphosphate | P09972 | glioblastoma,Grade II,III,IV
- | Enolase 1 | P06733 | glioblastoma,Grade II,III,IV
- | Enolase 2 | P09104 | glioblastoma,Grade II,III,IV
- | Glyceraldehyde-3-phosphate dehydrogenase, liver | P04406 | glioblastoma,Grade II,III,IV
- | Lactate dehydrogenase B | P07195 | glioblastoma,Grade II,III,IV
- | Phosphoglycerate kinase 1 | P00558 | glioblastoma,Grade II,III,IV
- | Phosphoglycerate mutase 1, brain | Q6P6D7 | glioblastoma,Grade II,III,IV
- | Pyruvate kinase, isozymes M1/M2 | P14618-2 | glioblastoma,Grade II,III,IV
- | Pyruvate kinase, isozymes M1/M2, splice isoform M1 | P14618 | glioblastoma,Grade II,III,IV
- | Triosephosphate isomerase | P60174 | glioblastoma,Grade II,III,IV
- | Pyruvate kinase | NI | Malignant Glioma
- | Glyceraldehyde 3-phosphate dehydrogenase | P04406 | Malignant Glioma
- | Triosephosphate isomerase | P60174 | Malignant Glioma
- | Enolase 1 | P06733 | Malignant Glioma
- | Aldolase A | NI | Malignant Glioma
19109410 | GAPDH | P16858 | Glioma gradeIII,IV
19109410 | Pyruvate kinase isozyme M1/M2 | P52480 | Glioma gradeIII,IV
19109410 | Alpha-Enolase | P17182 | Glioma gradeIII,IV
19109410 | Phosphoglycerate kinase 1 | P09411 | Glioma gradeIII,IV
19109410 | GAPDH | P16858 | Glioma gradeIII,IV
3
4
MENTION THE TYPE OF IDENTIFIER, SUCH AS UNIPROT,
GENBANK ID, REFSEQ ID, ENTREZ GENE, ETC.
5
23
Step 2.c - Protein Molecular Interaction Network
Input
1
Audio Narration
Action
Description of the action
Upload the raw data file that was created after
scrutinizing the papers. The format of the raw data file to
be uploaded varies among different software. Although most
software recognizes spreadsheet formats, some tools have
their own specific input file format, such as the .sif file
for Cytoscape. Once the raw data file is uploaded, the tool
displays all columns. The user needs to select the columns
that are to be given to the tool. Of all the columns, it is
compulsory to enter the ACCESSION NUMBER (OR ANY OTHER
PROTEIN IDENTIFIER). This column is highlighted in red.
These identifiers can be of multiple types, which need to
be defined so that the tool can match the user's data to
its dictionary of identifier terms, called the
knowledgebase. All other information is optional, and users
can provide it depending on the nature of the analysis.
Schematic for Input
Show the simulation of the software. In each
slide, the tab that is high-lighted is ACTIVE. In
the animation format, the tab should highlight
when you click on it followed by the content of
the slide. Then the mouse should move to the next
tab and click on it leaving the first tab
inactive and second tab active. Activity of tabs
can be differentiated by separate colors.
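Because the identifier column can mix several identifier types, it can help to classify each entry before upload. The sketch below is a rough heuristic, not a feature of any particular tool: the regular expression follows the published UniProt accession format with an optional isoform suffix, purely numeric entries are only assumed to be NCBI-style numeric identifiers, and "NI" is assumed to mark a missing identifier as in the slide's table. The function name `identifier_type` is hypothetical.

```python
import re

# UniProt accession format, plus an optional isoform suffix such as "-2".
UNIPROT_RE = re.compile(
    r"^(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})(?:-\d+)?$"
)


def identifier_type(value: str) -> str:
    """Guess the identifier type of one accession-number cell."""
    value = value.strip()
    if value in ("", "NI"):  # "NI" is assumed to mean no identifier available
        return "missing"
    if UNIPROT_RE.match(value):
        return "UniProt accession"
    if value.isdigit():
        return "numeric identifier"  # assumed to be an NCBI numeric ID
    return "unknown"


for cell in ["78070601", "P09972", "P14618-2", "NI"]:
    print(cell, "->", identifier_type(cell))
```

Rows classified as "missing" or "unknown" are the ones most likely to fail mapping, so they are worth checking by hand before the file is submitted.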
2
3
4
5
24
Step 2.d - Protein Molecular Interaction Network
Input
1
?
CREATE PROJECT
UPLOAD DATA
MAP DATA
2
3
4
5
25
Step 2.d - Protein Molecular Interaction Network
Input
1
Audio Narration
Action
Description of the action
The input raw data is mapped to the knowledgebase of the
software to provide a uniform set of IDs for building a
network. The IDs from the input file that do not match the
knowledgebase are highlighted in red.
Schematic for Input
This file is the same as the input file. Only the entries
that are not mapped need to be highlighted in the
animation.
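The mapping step can be pictured as a dictionary lookup. The sketch below uses a tiny stand-in knowledgebase whose entries are taken from the mapped output table in Step 2.e; a real tool's knowledgebase is far larger, and the function name `map_identifiers` is hypothetical.

```python
# Tiny stand-in for a tool's knowledgebase: identifier -> gene symbol.
KNOWLEDGEBASE = {
    "78070601": "ALDOC",
    "P09972": "ALDOC",
    "P06733": "ENO1",
    "P04406": "GAPDH",
    "P14618": "PKM2",
}


def map_identifiers(input_ids, knowledgebase):
    """Split input identifiers into mapped (id -> symbol) and unmapped lists."""
    mapped, unmapped = {}, []
    for identifier in input_ids:
        if identifier in knowledgebase:
            mapped[identifier] = knowledgebase[identifier]
        else:
            unmapped.append(identifier)  # these would be highlighted in red
    return mapped, unmapped


mapped, unmapped = map_identifiers(
    ["78070601", "443135", "P06733", "NI"], KNOWLEDGEBASE
)
print(mapped)    # identifiers that received a uniform gene symbol
print(unmapped)  # identifiers the knowledgebase does not recognize
```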
2
3
4
5
26
Step 2.e - Protein Molecular Interaction Network
Input
1
?
CREATE PROJECT
UPLOAD
MAP DATA
Data gets mapped to Knowledgebase of software to
produce output files
ID | Gene | Description | Location | Family
78070601 | ALDOC | aldolase C, fructose-bisphosphate | Cytoplasm | enzyme
56081766 | PGAM1 | phosphoglycerate mutase 1 (brain) | Cytoplasm | phosphatase
4503571 | ENO1 | enolase 1, (alpha) | Cytoplasm | transcription regulator
693933 | ENO1 | enolase 1, (alpha) | Cytoplasm | transcription regulator
P09972 | ALDOC | aldolase C, fructose-bisphosphate | Cytoplasm | enzyme
P06733 | ENO1 | enolase 1, (alpha) | Cytoplasm | transcription regulator
P09104 | ENO2 | enolase 2 (gamma, neuronal) | Cytoplasm | enzyme
P04406 | GAPDH (includes EG2597) | glyceraldehyde-3-phosphate dehydrogenase | Cytoplasm | enzyme
P07195 | LDHB | lactate dehydrogenase B | Cytoplasm | enzyme
P00558 | PGK1 | phosphoglycerate kinase 1 | Cytoplasm | kinase
Q6P6D7 | PGAM1 | phosphoglycerate mutase 1 (brain) | Cytoplasm | phosphatase
P14618-2 | PKM2 | pyruvate kinase, muscle | Cytoplasm | kinase
P14618 | PKM2 | pyruvate kinase, muscle | Cytoplasm | kinase
P60174 | TPI1 | triosephosphate isomerase 1 | Cytoplasm | enzyme
P04406 | GAPDH (includes EG2597) | glyceraldehyde-3-phosphate dehydrogenase | Cytoplasm | enzyme
P60174 | TPI1 | triosephosphate isomerase 1 | Cytoplasm | enzyme
P06733 | ENO1 | enolase 1, (alpha) | Cytoplasm | transcription regulator
P16858 | GAPDH (includes EG14433) | glyceraldehyde-3-phosphate dehydrogenase | Plasma Membrane | enzyme
P52480 | PKM2 | pyruvate kinase, muscle | Cytoplasm | kinase
P17182 | ENO1 | enolase 1, (alpha) | Cytoplasm | transcription regulator
2
3
4
5
27
Step 2.e - Protein Molecular Interaction Network
Input
1
Audio Narration
Action
Description of the action
The tool also extracts other relevant information
from its knowledgebase corresponding to that ID.
The uniform IDs and the new columns are displayed
in the form of a new spreadsheet which has the
refined data. The columns highlighted in blue
are the ones that are newly added. The red column
is provided for uniformity by taking one specific
naming scheme for identifiers.
Schematic for Input
Show the simulation of the software. In each
slide, the tab that is high-lighted is ACTIVE. In
the animation format, the tab should highlight
when you click on it followed by the content of
the slide. Then the mouse should move to the next
tab and click on it leaving the first tab
inactive and second tab active. Activity of tabs
can be differentiated by separate Colors
2
3
4
5
28
Step 3 - Protein Interaction Data Analysis -
Output
1
?
BUILD PATHWAY
OUTPUT NETWORK
OUTPUT PATHWAY
TOP NETWORK FUNCTIONS
  • Genetic Disorder, Neurological Disease, Nucleic
    Acid Metabolism
  • Cell-To-Cell Signaling and Interaction, Nervous
    System Development and Function, Cellular
    Assembly and Organization
  • Cancer, Reproductive System Disease,
    Gastrointestinal Disease
TOP DISEASE NETWORK
  1. Cancer
  2. Gastrointestinal Disease
  3. Neurological Disease
TOP PHYSIOLOGICAL NETWORK
  1. Nervous System Development and Function
  2. Hematological System Development and Function
  3. Immune Cell Trafficking
TOP CANONICAL PATHWAY
  1. Glycolysis/Gluconeogenesis
  2. Mitochondrial Dysfunction
  3. 14-3-3-mediated Signaling

4
Audio Narration
Action
Description of the action
The tools provide a summary of results showing the top
networks produced in each category. The ranking is based on
the number of mappings from the user's input dataset to the
software's knowledgebase. The prediction of Neurological
Disease, Cancer and Nervous System as top networks
reinforces our data analysis. The analysis from this tool
also shows that Glycolysis/Gluconeogenesis is the pathway
being modulated by our list of proteins.
Schematic for Output summary
Follow the animation
5
29
Step 4.a -Protein Molecular Interaction Network -
Output
1
?
BUILD PATHWAY
OUTPUT NETWORK
OUTPUT PATHWAY
2
Select the number of networks to be constructed
1
Select the maximum number of Molecules in the
network
70
3
Select endogenous chemicals
No
4
Audio Narration
Action
Description of the action
Users can adjust the parameters that define the number and
size of the networks to be formed. Users can also choose
whether molecules other than genes, proteins or RNA are
included. The molecules that have shown relationships with
other genes or proteins in the knowledgebase are mapped
into the network. Repeated IDs point to the same node in
the network.
Schematic for Output summary
Follow the animation
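The remark that repeated IDs point to the same node can be illustrated by grouping identifiers under their mapped gene symbol. This is a minimal sketch, assuming the identifier-to-symbol pairs come from the mapped output table in Step 2.e; the function name `network_nodes` is hypothetical.

```python
def network_nodes(mapped_ids):
    """Collapse repeated or synonymous identifiers onto one node per gene symbol."""
    nodes = {}
    for identifier, symbol in mapped_ids:
        nodes.setdefault(symbol, []).append(identifier)
    return nodes


nodes = network_nodes([
    ("P06733", "ENO1"),
    ("4503571", "ENO1"),
    ("P06733", "ENO1"),   # repeated ID, still the same node
    ("P60174", "TPI1"),
])
print(len(nodes))  # number of distinct nodes in the network
```

Four input rows yield only two nodes here, which is why the network can be much smaller than the uploaded spreadsheet.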
5
30
Step 4.b - Protein Molecular Interaction Network
- Output
1
?
BUILD PATHWAY
OUTPUT NETWORK
OUTPUT PATHWAY
2
3
Seed Molecules
Molecular interaction
Another Small Interaction Network
Network interaction
4
Audio Narration
Action
Description of the action
From the input given by users, the tool analyzes
the set of molecules, which are present in its
database of metabolic network. The molecules that
are found to occur most frequently are used as
seeds which connect to other such molecules.
Networks are also extended based on interactions
between two small networks to produce a larger
network. Such analysis will depend on the
parameters set by the user in the initial steps.
Based on this information, the tool will predict
the pathway to which the molecules are most
likely to belong. Further analysis of these
pathways can be carried out using metabolic
profile databases.
Schematic for Output summary
Follow the animation. Highlight the yellow boxes
in animation as well.
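The seed-and-merge idea in the narration above can be sketched as follows. This is a simplified illustration and not the algorithm used by any specific tool: seeds are picked by counting how often each molecule appears in an interaction list, and two small networks are merged only if some interaction links them and the result stays within a size limit (70, as on the parameter slide). The interaction pairs are hypothetical placeholders, not measured interactions.

```python
from collections import Counter


def pick_seeds(interactions, top_n=1):
    """Rank molecules by the number of interactions they take part in."""
    counts = Counter()
    for a, b in interactions:
        counts[a] += 1
        counts[b] += 1
    return [molecule for molecule, _ in counts.most_common(top_n)]


def merge_networks(net_a, net_b, interactions, max_size=70):
    """Join two networks if an interaction links them and the result fits."""
    linked = any(
        (a in net_a and b in net_b) or (a in net_b and b in net_a)
        for a, b in interactions
    )
    merged = net_a | net_b
    return merged if linked and len(merged) <= max_size else None


# Hypothetical placeholder interactions for illustration only.
INTERACTIONS = [
    ("GAPDH", "ENO1"), ("GAPDH", "PGK1"), ("GAPDH", "TPI1"),
    ("ENO1", "PKM2"), ("LDHB", "PKM2"),
]

seeds = pick_seeds(INTERACTIONS)
bigger = merge_networks({"GAPDH", "ENO1"}, {"PKM2", "LDHB"}, INTERACTIONS)
```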
5
31
Step 4.c - Protein Molecular Interaction Network
- Visualization
1
?
2
3
4
32
Step 3.d - Gene Expression Profile Data -
Visualization
1
Audio Narration
Action
Description of the action
Zoom effect
Animator needs to re-draw all screenshots, as they have
been taken from the reference software. The animator must
not copy the image, or a part thereof, in the animation.
Show the image with each part zooming in and then appearing
as a zoomed image.
The pathway information relevant to glioma studies can be
extracted from the input data. In this pathway, we can
observe the role of isocitrate dehydrogenase (IDH) in the
regulation of metabolism during glioma. A recently
published study has also shown the involvement of IDH in
glioma-related pathways. Most such software is linked to
protein pathway interaction software, which is described in
detail in the next part of the animation.
33
Master Layout (Part 3)
This animation consists of 3 parts:
  • Part 1: Gene Expression Data Analysis
  • Part 2: Protein Interaction Data Analysis
  • Part 3: Metabolic Profile Databases
Steps in Part 3:
  1. Select the level of organization of the
     biological system to study
  2. Select from one of the publicly available
     databases
  3. Select the relevant options in the database to
     view the pathway network and interaction data of
     the system under consideration
34
Definitions of the componentsPart 3 Metabolic
profile databases
1
  • 1. Biological System: In the biological context,
    a system refers to an entity that exists through
    mutual interactions between its components.
  • 2. Level of organization: The level of
    organization describes the complexity of the
    biological system being studied. Components of
    one system can be made up of constituent parts,
    which in turn form another system at a different
    level of organization. For example, a cell is a
    system in itself. For larger physiological
    systems, however, a cell is only a component.
  • 3. Visualization: To explore protein-protein
    interactions, researchers must make sense of
    lists of protein interaction data, which are
    retrieved as elaborate spreadsheets that make
    the analysis cumbersome. Mapping such data in
    diagrammatic form makes it easier for
    scientists to develop biological insight into
    the interaction data.
  • 4. Functional annotation: By examining maps of
    protein-protein interaction data, researchers
    can discover new biological relationships between
    proteins or predict their functions based on
    specific interactions.
  • 5. Graphical Notation: The first step in the
    analysis of protein interaction data is the
    identification of protein complexes and groups of
    complexes. In a simple graphical notation, a
    node represents a protein, while an edge
    represents the interaction between two
    proteins.
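The node-and-edge notation can be sketched in a few lines of Python. This is a minimal illustration and not part of the original slides: the function name `build_network` is hypothetical, and the interaction pairs are placeholders chosen only to show the notation.

```python
from collections import defaultdict


def build_network(interactions):
    """Return an undirected adjacency map: protein -> set of partner proteins."""
    network = defaultdict(set)
    for protein_a, protein_b in interactions:
        # Each pair is one edge; store it in both directions.
        network[protein_a].add(protein_b)
        network[protein_b].add(protein_a)
    return dict(network)


# Placeholder interaction pairs (illustrative, not results from the Glioma study).
example_edges = [("TP53", "MDM2"), ("MDM2", "CDKN2A"), ("CDK4", "CDKN2A")]
network = build_network(example_edges)

node_count = len(network)
# Every edge is stored twice (once per endpoint), so halve the total.
edge_count = sum(len(partners) for partners in network.values()) // 2
```

Storing each edge in both directions keeps neighbour lookups simple, at the cost of counting every edge twice when totals are computed.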

35
Definitions of the components: Part 3 Metabolic
profile databases
  • 6. Pathway: A pathway in biology refers to a
    series of inter-related metabolic reactions,
    which depicts the order of conversion of one
    entity to another.
  • 7. Meta node: A single node onto which all
    members of a protein cluster are collapsed. Meta
    nodes help in deciphering the biological
    applications of networks whose members are
    collapsed into one.
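Collapsing a cluster into a meta node can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `collapse_to_meta_node`, the node labels A to E and the meta node name `META_1` are all hypothetical, and the network is a toy adjacency map rather than real interaction data.

```python
def collapse_to_meta_node(network, cluster, meta_name):
    """Collapse every node in `cluster` into a single node named `meta_name`."""
    cluster = set(cluster)
    collapsed = {}
    for node, partners in network.items():
        source = meta_name if node in cluster else node
        collapsed.setdefault(source, set())
        for partner in partners:
            target = meta_name if partner in cluster else partner
            if source == target:
                continue  # edges internal to the cluster disappear
            collapsed[source].add(target)
            collapsed.setdefault(target, set()).add(source)
    return collapsed


# Toy network: A, B and C form a tightly connected cluster.
network = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
collapsed = collapse_to_meta_node(network, {"A", "B", "C"}, "META_1")
```

Edges between cluster members are dropped, while edges that leave the cluster are re-attached to the meta node, so the surrounding network keeps its shape.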

36
Step 1 - Pathway Databases: Input
Choose the system
ORGANISM: Prokaryotes, Protists, Fungi, Plants, Animals
ENZYMES: Enzyme Name, EC Number, Synonyms
DISEASE: Cancer, Immune System Disease, Neurodegenerative Disease, Cardiovascular Disease, Metabolic Diseases, Infectious Diseases
PATHWAY: Metabolism, Genetic Information Processing, Environmental Information Processing, Cellular Processes, Organismal Systems
Action
Audio Narration
Description of the action
  • Pathway databases are repositories that give
    visual insight into the biological interactions
    of genes and proteins. The general features of
    these databases include searching by:
  • Pathway: The entire network information in the
    web-based database can be searched by selecting
    the pathway of interest, such as cellular
    processes, genetic information flow, etc.
  • Diseases: Here all the networks are grouped by
    the diseases that are caused by their
    modulation.
  • Enzymes: The enzymes in the pathway database
    are grouped, and pathways can be searched by
    giving enzyme information as a query.
  • Organism: Each organism is given a unique
    identifier. Users can select an organism and
    then study the pathway as it occurs in that
    organism.
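The four search modes can be pictured as filters over a set of pathway records. The sketch below is a toy in-memory model, not the interface of any real pathway database: the `PathwayRecord` class, the `search` function, the identifiers P1 to P3 and the record contents are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathwayRecord:
    identifier: str        # unique ID of the entry (placeholder values below)
    name: str
    category: str          # e.g. "Metabolism" or "Cellular Processes"
    organism: str
    diseases: tuple = ()   # diseases linked to modulation of the pathway
    enzymes: tuple = ()    # EC numbers of enzymes in the pathway


# Toy records for illustration only; not taken from any real database.
RECORDS = [
    PathwayRecord("P1", "Citrate cycle", "Metabolism", "human", (), ("1.1.1.42",)),
    PathwayRecord("P2", "Cell cycle", "Cellular Processes", "human", ("Glioma",)),
    PathwayRecord("P3", "Cell cycle", "Cellular Processes", "mouse"),
]


def search(records, category=None, disease=None, enzyme=None, organism=None):
    """Return identifiers of records matching every filter that was supplied."""
    hits = []
    for record in records:
        if category is not None and record.category != category:
            continue
        if disease is not None and disease not in record.diseases:
            continue
        if enzyme is not None and enzyme not in record.enzymes:
            continue
        if organism is not None and record.organism != organism:
            continue
        hits.append(record.identifier)
    return hits


glioma_hits = search(RECORDS, disease="Glioma")
mouse_cell_cycle = search(RECORDS, category="Cellular Processes", organism="mouse")
```

Filters can be combined, which mirrors choosing an organism first and then browsing a pathway category within it.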

Animation of the input search strategies for
pathway databases
Follow the steps in the animation. Redraw the
images. The audio narration must be read as the
cursor in the animation moves to each of the 4
headings of the web page.
http://www.genome.jp/kegg/
37
Step 2.a - Pathway Databases: Visualization of
Pathways for Glioma
Nodes
Edges
Audio Narration
Description of the action
Action
Animator needs to redraw all screenshots, as
they have been taken from the referenced
software. Animator must not copy the image, or any
part thereof, in the final animation. Display the
image. Highlight the nodes and edges as
shown in the animation. The red box marks the
area of the network that is being zoomed into.
This is followed by the zoomed image of that part
of the network. Each zoomed image is followed by
the narration in the order given.
We use pathway databases to study one of the
pathways from our Glioma studies in Protein
Interaction Networks, namely Cell Cycle,
Cellular Assembly and Organization, DNA
Replication, Recombination, and Repair. Here we
highlight the nodes and edges within the pathway.
The nodes are the corresponding genes and the
edges are the interactions between them. Users can
also obtain images from such visualization tools
for the interactions of a specific gene. In this
case, we depict the interactions of TP53, derived
from the Glioma studies.
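Focusing on a single gene such as TP53 amounts to extracting its immediate neighbourhood from the network. The following is a hedged sketch: the function name `neighborhood` is hypothetical, and the edge list is a small placeholder set, not the network obtained in the Glioma study.

```python
def neighborhood(edges, gene):
    """Return the edges that touch `gene` and the set of its direct partners."""
    touching = [edge for edge in edges if gene in edge]
    partners = {other for edge in touching for other in edge if other != gene}
    return touching, partners


# Placeholder edges for illustration only.
edges = [
    ("TP53", "MDM2"),
    ("TP53", "CDKN1A"),
    ("CDK4", "RB1"),
    ("MDM2", "CDKN2A"),
]
tp53_edges, tp53_partners = neighborhood(edges, "TP53")
```

Only edges with TP53 at one end are kept, which corresponds to the zoomed view of a single node and its direct interactions.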
Zoomed Images
http://www.ingenuity.com/, http://www.cytoscape.org/
38
Step 2.c - Pathway Databases: Interpretation
Audio Narration
Action
Description of the action
Options given once you click on a particular
entity of the pathway
Pathways can also be obtained for protein
interaction networks. In such networks, the
metabolites are the nodes and the reactions
between them are the edges. Each node, such as a
substrate, reactant or an enzyme, is hyperlinked
to another page which gives detailed
information about that entity. Each
element of the pathway, including the pathway
itself, is assigned an identifier so that it can
be referred to from anywhere in the database.
The page also gives all the information related to
the molecule or reaction, such as its orthology,
the pathways it belongs to and the corresponding
gene IDs.
In each slide, the tab that is highlighted is
ACTIVE. In the animation format, the tab should
highlight when you click on it, followed by the
content of the slide. Then the mouse should move
to the second tab and click on it, leaving the
first tab inactive and the second tab active.
The activity of tabs can be differentiated by
separate colors.
http://www.genome.jp/kegg/
39
Step 2.d - Pathway Databases: Interpretation
Action
Audio Narration
Description of the action
Options given once you click on a particular
entity of the pathway
It also gives all the enzyme-related information
for the reaction, such as the enzyme nomenclature,
Enzyme Commission (EC) number, class of enzyme,
substrates and products.
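The class of an enzyme can be read directly from the first field of its EC number. The sketch below assumes the standard seven top-level EC classes; the function name `enzyme_class` is hypothetical and is not part of any database interface.

```python
# Top-level Enzyme Commission classes, keyed by the first field of the EC number.
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
    7: "Translocases",
}


def enzyme_class(ec_number):
    """Return the top-level class name for an EC number such as 'EC 1.1.1.42'."""
    text = ec_number.strip()
    if text.upper().startswith("EC"):
        # Drop the "EC" prefix and any separator that follows it.
        text = text[2:].lstrip(": ")
    first_field = text.split(".")[0]
    return EC_CLASSES[int(first_field)]


idh_class = enzyme_class("EC 1.1.1.42")  # isocitrate dehydrogenase (NADP+)
```

For example, isocitrate dehydrogenase (NADP+), EC 1.1.1.42, falls in class 1, the oxidoreductases.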
In each slide, the tab that is highlighted is
ACTIVE. In the animation format, the tab should
highlight when you click on it, followed by the
content of the slide. Then the mouse should move
to the second tab and click on it, leaving the
first tab inactive and the second tab active.
The activity of tabs can be differentiated by
separate colors.
http://www.genome.jp/kegg/
40
Step 2.e - Pathway Databases: Interpretation
Action
Audio Narration
Description of the action
Options given once you click on a particular
entity of the pathway
Redraw the equation. In each slide, the tab that
is highlighted is ACTIVE. In the animation
format, the tab should highlight when you click
on it, followed by the content of the slide. Then
the mouse should move to the second tab and click
on it, leaving the first tab inactive and the
second tab active. The activity of tabs can be
differentiated by separate colors.
The metabolic reaction that the enzyme is
involved in is also provided in equation form,
along with the structures of the reaction substrates.
http://www.genome.jp/dbget-bin/www_bget?R00960RP00303RC00078
41
Interactivity option 1: Step No 1 - Assignment
Type of Input Data: .gal Files, .cel Files, .gpr Files, .sif Files, .cdt Files, Name of Enzyme, Name of Disease, Name of Pathway, List of Protein Identifiers
Type of Analysis Tools
Results
Boundary/limits
Interactivity Type Options
Drag the yellow buttons into one of the 3
Analysis Tools. The correct results are given in
the next slide.
If the user drags a button into the right box, the
animation should flash a tick sign. If the box
is incorrect, flash a cross sign and ask the
user to try again.
Drag and Drop.
42
Interactivity option 1: Step No 2 - Results
1
.cdt Files
Name of Enzyme
2
.gpr Files
Name of Pathway
.sif Files
.gal Files
Name of Disease
List of Protein Identifiers
.cel Files
3
4
Results
Boundary/limits
Interactivity Type Options
Drag the yellow buttons into one of the 3
Analysis Tools. The correct results are given in
the next slide.
If the user drags a button into the right box, the
animation should flash a tick sign. If the box
is incorrect, flash a cross sign and ask the
user to try again.
Drag and Drop.
43
Questionnaire - 1
1. Which amongst these is not a feature of a Protein network?
   a. Edges  b. Nodes  c. Metanodes  d. Antinodes
2. What are the results of Gene Expression Analysis?
   a. Heat Map  b. Fold Change  c. P-value  d. All of the Above
3. Protein Pathways can be studied using?
   a. Stand-alone tools  b. Web-based tools  c. Both  d. None
44
Questionnaire - 2
4. Which is a mandatory entry to study Protein Interaction Pathways?
   a. Fold Change  b. p-Value  c. Unique Identifier like Accession Number  d. All of the Above
5. In case of Gene Expression Data Analysis, Heat Map represents?
   a. Significance of the Gene  b. Fold Change  c. p-value  d. Gene Ontology
6. Which amongst these is a valid Microarray File Extension?
   a. GAL  b. GPR  c. CEL  d. All of the Above
45
Links for further reading
  • Books
  • Systems Biology: An Approach. P Kohl, EJ
    Crampin, TA Quinn and D Noble
  • An Introduction to Systems Biology: Design
    Principles of Biological Circuits, by Uri Alon.
    June 2006, Chapman & Hall/CRC, Taylor and Francis
    Group
  • Introduction to Systems Biology. Choi, Sangdun
    (California Institute of Technology). July 2007,
    Humana Press
  • Research Papers
  • Visualizing biological pathways: requirements
    analysis, systems evaluation, and research
    agenda. Saraiya, P., North, C. & Duca, K. (2005).
  • Tools for visually exploring biological networks.
    Suderman, M. & Hallett, M. (2007).
  • A survey of visualization tools for biological
    network analysis. Pavlopoulos, G.A., Wegener,
    A.L. & Schneider, R. (2008).
  • Visualization of omics data for systems biology.
    Nils Gehlenborg, Seán I O'Donoghue, Nitin S
    Baliga, Alexander Goesmann, Matthew A Hibbs,
    Hiroaki Kitano, Oliver Kohlbacher, Heiko
    Neuweger, Reinhard Schneider, Dan Tenenbaum &
    Anne-Claude Gavin. Nature Methods (2010)

46
Links for further reading
  • Webliography
  • http://www.genome.jp/kegg/
  • http://www.chem.agilent.com/Library/usermanuals/Public/GeneSpring-manual.pdf
  • http://www.moleculardevices.com/pages/software/gn_genepix_pro.html
  • http://www.cytoscape.org/
  • http://www.ingenuity.com/
  • http://www.genego.com/metacore.php
  • http://www.ece.cmu.edu/brunos/Lecture3.pdf
  • http://pathways.embl.de/
  • http://www.biocyc.org/
  • http://www.arena3d.org/
  • http://spotfire.tibco.com/
  • http://www.bioconductor.org/
  • http://www.chem.agilent.com/en-US/Products/software/lifesciencesinformatics/genespringgx/pages/gp34727.aspx
  • http://www.cytoscape.org/download.php

47
Links for further reading
  • The following sources are used for the animations
  • http://www.genome.jp/kegg/
  • Biochemistry by A. L. Lehninger et al., 3rd edition
  • http://www.ingenuity.com/
  • http://www.cytoscape.org/
  • http://www.genome.jp/dbget-bin/www_bget?R00960RP00303RC00078
  • http://www.genego.com/metacore.php
  • http://www.ece.cmu.edu/brunos/Lecture3.pdf
  • http://pathways.embl.de/
  • http://www.chem.agilent.com/Library/usermanuals/Public/GeneSpring-manual.pdf
  • http://www.moleculardevices.com/pages/software/gn_genepix_pro.html