Bioinformatics Master Course: DNA/Protein Structure-Function Analysis and Prediction - PowerPoint PPT Presentation

Slides: 55

Provided by: AntonFe9

Transcript and Presenter's Notes

1
Bioinformatics Master Course: DNA/Protein
Structure-Function Analysis and Prediction

Lecture 13: Protein Function

2
Sequence-Structure-Function
Sequence -> Structure -> Function
Sequence to structure: ab initio prediction and folding (impossible but for the smallest structures), threading, homology searching (BLAST)
Structure to function: function prediction from structure (very difficult)
3
Metabolomics fluxomics
4
Systems Biology

is the study of the interactions between the components of a biological system, and how these interactions give rise to the function and behaviour of that system (for example, the enzymes and metabolites in a metabolic pathway).
The aim is to quantitatively understand the system and to be able to predict the system's time processes
the interactions are nonlinear
the interactions give rise to emergent properties, i.e. properties that cannot be explained by the individual components in the system
Biological processes include many time-scales, many compartments and many interconnected network levels (e.g. regulation, signalling, expression, ...)

5
Systems Biology

understanding is often achieved through modeling and simulation of the system's components and interactions.
Often the four M's cycle is adopted:
Measuring
Mining
Modeling
Manipulating
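
As a minimal sketch of the modeling step, the Python code below simulates a hypothetical two-step pathway S -> I -> P by forward-Euler integration. The pathway, the saturating rate law used for each step, and all parameter values are assumptions chosen for illustration; none of them comes from the lecture.

```python
def saturating_rate(vmax, km, conc):
    """Saturating (Michaelis-Menten-type) rate; nonlinear in the concentration."""
    return vmax * conc / (km + conc)


def simulate_pathway(s0=10.0, dt=0.01, steps=5000,
                     vmax1=1.0, km1=2.0, vmax2=0.5, km2=1.0):
    """Forward-Euler simulation of a hypothetical pathway S -> I -> P.

    All parameter values are illustrative assumptions.
    Returns a list of (time, S, I, P) tuples.
    """
    s, i, p = s0, 0.0, 0.0
    trajectory = [(0.0, s, i, p)]
    for step in range(1, steps + 1):
        v1 = saturating_rate(vmax1, km1, s)  # rate of S -> I
        v2 = saturating_rate(vmax2, km2, i)  # rate of I -> P
        s -= v1 * dt
        i += (v1 - v2) * dt
        p += v2 * dt
        trajectory.append((step * dt, s, i, p))
    return trajectory


if __name__ == "__main__":
    # Print every 1000th time point of the simulated time course.
    for t, s, i, p in simulate_pathway()[::1000]:
        print(f"t={t:5.1f}  S={s:6.3f}  I={i:6.3f}  P={p:6.3f}")
```

In the four M's cycle, such a model would be fitted to measured data and then used to suggest the next manipulation; a production model would normally use an adaptive ODE solver rather than a fixed Euler step.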

6
The silicon cell (some people think
silly-con cell)
7
(No Transcript)
8
A system response
Apoptosis: programmed cell death
Necrosis: accidental cell death
9
Comparative metabolomics
(Figure: pathway diagrams for human, left, and yeast, right)
We need to be able to do automatic pathway comparison (pathway alignment)
This pathway diagram shows a comparison of pathways in (left) Homo sapiens (human) and (right) Saccharomyces cerevisiae (baker's yeast). Changes in controlling enzymes (square boxes in red) and in the pathway itself have occurred (yeast has one altered (overtaking) path in the graph).
10
The citric-acid cycle
http://en.wikipedia.org/wiki/Krebs_cycle
11
The citric-acid cycle
Fig. 1. (a) A graphical representation of the reactions of the citric-acid cycle (CAC), including the connections with pyruvate and phosphoenolpyruvate, and the glyoxylate shunt. When there are two enzymes that are not homologous to each other but that catalyse the same reaction (non-homologous gene displacement), one is marked with a solid line and the other with a dashed line. The oxidative direction is clockwise. The enzymes with their EC numbers are as follows:
1, citrate synthase (4.1.3.7);
2, aconitase (4.2.1.3);
3, isocitrate dehydrogenase (1.1.1.42);
4, 2-ketoglutarate dehydrogenase (solid line; 1.2.4.2 and 2.3.1.61) and 2-ketoglutarate ferredoxin oxidoreductase (dashed line; 1.2.7.3);
5, succinyl-CoA synthetase (solid line; 6.2.1.5) or succinyl-CoA:acetoacetate-CoA transferase (dashed line; 2.8.3.5);
6, succinate dehydrogenase or fumarate reductase (1.3.99.1);
7, fumarase (4.2.1.2), class I (dashed line) and class II (solid line);
8, bacterial-type malate dehydrogenase (solid line) or archaeal-type malate dehydrogenase (dashed line) (1.1.1.37);
9, isocitrate lyase (4.1.3.1);
10, malate synthase (4.1.3.2);
11, phosphoenolpyruvate carboxykinase (4.1.1.49) or phosphoenolpyruvate carboxylase (4.1.1.32);
12, malic enzyme (1.1.1.40 or 1.1.1.38);
13, pyruvate carboxylase or oxaloacetate decarboxylase (6.4.1.1);
14, pyruvate dehydrogenase (solid line; 1.2.4.1 and 2.3.1.12) and pyruvate ferredoxin oxidoreductase (dashed line; 1.2.7.1).
M. A. Huynen, T. Dandekar and P. Bork, "Variation and evolution of the citric acid cycle: a genomic approach", Trends Microbiol, 7, 281-29 (1999)
12
The citric-acid cycle
(b) Individual species might not have a complete CAC. This diagram shows the genes for the CAC for each unicellular species for which a genome sequence has been published, together with the phylogeny of the species. The distance-based phylogeny was constructed using the fraction of genes shared between genomes as a similarity criterion (ref. 29 in the original article). The major kingdoms of life are indicated in red (Archaea), blue (Bacteria) and yellow (Eukarya). Question marks represent reactions for which there is biochemical evidence in the species itself or in a related species but for which no genes could be found. Genes that lie in a single operon are shown in the same color. Genes were assumed to be located in a single operon when they were transcribed in the same direction and the stretches of non-coding DNA separating them were less than 50 nucleotides in length.
M. A. Huynen, T. Dandekar and P. Bork, "Variation and evolution of the citric acid cycle: a genomic approach", Trends Microbiol, 7, 281-29 (1999)
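
The operon criterion in the caption (same transcription direction, less than 50 nucleotides of non-coding DNA between neighbouring genes) can be sketched in Python as follows. The `Gene` record, its field names, the coordinate convention and the example coordinates are hypothetical; only the two conditions come from the caption.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    name: str
    start: int   # 1-based inclusive start coordinate (assumed convention)
    end: int     # 1-based inclusive end coordinate
    strand: str  # "+" or "-"


def predict_operons(genes, max_gap=50):
    """Group neighbouring genes into putative operons.

    Two adjacent genes are joined when they lie on the same strand and the
    non-coding stretch between them is shorter than max_gap nucleotides.
    """
    operons = []
    for gene in sorted(genes, key=lambda g: g.start):
        if operons:
            prev = operons[-1][-1]
            gap = gene.start - prev.end - 1  # nucleotides between the two genes
            if gene.strand == prev.strand and gap < max_gap:
                operons[-1].append(gene)
                continue
        operons.append([gene])
    return operons


# Hypothetical coordinates, for illustration only.
genes = [
    Gene("geneA", 100, 900, "+"),
    Gene("geneB", 920, 1500, "+"),   # 19 nt after geneA, same strand
    Gene("geneC", 1700, 2300, "+"),  # 199 nt after geneB: gap too large
    Gene("geneD", 2310, 2900, "-"),  # close to geneC but on the opposite strand
]
operon_names = [[g.name for g in operon] for operon in predict_operons(genes)]
print(operon_names)
```

With these example coordinates only geneA and geneB end up in the same putative operon.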
13
Experimental

Structural genomics
Functional genomics
Protein-protein interaction
Metabolic pathways
Expression data

14
Communicability: Functional Genomics

Interpretation of genome-scale gene expression
data

External Program
DNA-chip data

Cluster of coregulated genes
gene 1
gene 2
...
gene n

PFMP query

Pathways affected
pathway 1
pathway 2

15
Communicability: Functional Genomics

Interpretation of genome-scale gene expression
data

External Programs
DNA-chip data

Cluster of coregulated genes
gene 1
gene 2
...
gene n

Pattern discovery
gene 1
gene 2
...
(putative regulatory sites)

Similarities with known regulatory sites
site 1 Factor 1
site 2 Factor 2
...

PFMP query
16
Other Issues

Partial information (indirect interactions) and
subsequent filling of the missing steps
Negative results (elements that have been shown
not to interact, enzymes missing in an organism)
Putative interactions resulting from
computational analyses

17
Protein function categories

Catalysis (enzymes)
Binding, transport (active/passive)
Protein-DNA/RNA binding (e.g. histones, transcription factors)
Protein-protein interactions (e.g. antibody-lysozyme) (experimentally determined by yeast two-hybrid (Y2H) or bacterial two-hybrid (B2H) screening)
Protein-fatty acid binding (e.g. apolipoproteins)
Protein-small molecule interactions (drug interaction, structure decoding)
Structural component (e.g. ?-crystallin)
Regulation
Signalling
Transcription regulation
Immune system
Motor proteins (actin/myosin)

18
Catalytic properties of enzymes
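Michaelis-Menten equation
$$V = \frac{V_{max}\,[S]}{K_m + [S]}$$

Reaction scheme
$$E + S \overset{K_m}{\rightleftharpoons} ES \overset{k_{cat}}{\longrightarrow} E + P$$

E: enzyme
S: substrate
ES: enzyme-substrate complex (transition state)
P: product
Km: Michaelis constant
kcat: catalytic rate constant (turnover number)
kcat/Km: specificity constant (useful for comparison)

(Plot: reaction rate V in moles/s against substrate concentration [S]; the curve levels off at Vmax and reaches Vmax/2 at [S] = Km)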
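
The Michaelis-Menten rate law on this slide can be evaluated with a few lines of Python. This is a minimal sketch: the function names and the numerical values of Vmax and Km are assumptions made for illustration.

```python
def michaelis_menten_rate(s, vmax, km):
    """Reaction rate V = Vmax*[S] / (Km + [S]) at substrate concentration s."""
    if s < 0 or km <= 0:
        raise ValueError("s must be non-negative and km positive")
    return vmax * s / (km + s)


def specificity_constant(kcat, km):
    """kcat/Km, the constant the slide suggests for comparing enzymes."""
    return kcat / km


VMAX = 2.0  # hypothetical maximum rate, moles/s
KM = 0.5    # hypothetical Michaelis constant, same units as [S]

# The rate is 0 without substrate, Vmax/2 at [S] = Km, and approaches Vmax
# as [S] becomes much larger than Km.
for s in (0.0, KM, 10 * KM, 1000 * KM):
    print(f"[S]={s:8.2f}  V={michaelis_menten_rate(s, VMAX, KM):.4f}")
```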
19
Protein interaction domains
http://pawsonlab.mshri.on.ca/html/domains.html
20
Energy difference upon binding

Examples of protein interactions (and functional importance) include
Protein-protein (pathway analysis)
Protein-small molecules (drug interaction, structure decoding)
Protein-peptides, DNA/RNA (function analysis)
The change in Gibbs free energy of the protein-ligand binding interaction can be monitored and expressed by the following:
$$\Delta G = \Delta H - T\,\Delta S$$
(H = enthalpy, S = entropy and T = temperature)
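
As a worked illustration of the Gibbs free energy relation on this slide, the Python sketch below evaluates the free energy change from the enthalpy change, the temperature and the entropy change, and converts it into a dissociation constant with the standard relation ΔG = RT ln(Kd) (Kd relative to a 1 M standard state). That conversion and all numerical values are added here as assumptions for illustration; they do not come from the slide.

```python
import math

R = 8.314  # gas constant in J/(mol*K)


def gibbs_free_energy(delta_h, temperature, delta_s):
    """Free energy change dG = dH - T*dS (dH in J/mol, T in K, dS in J/(mol*K))."""
    return delta_h - temperature * delta_s


def dissociation_constant(delta_g, temperature):
    """Kd in mol/l from the standard binding free energy, using dG = R*T*ln(Kd)."""
    return math.exp(delta_g / (R * temperature))


# Hypothetical binding event: enthalpically favourable, entropically unfavourable.
delta_h = -60_000.0   # J/mol
delta_s = -50.0       # J/(mol*K)
temperature = 298.15  # K

delta_g = gibbs_free_energy(delta_h, temperature, delta_s)
kd = dissociation_constant(delta_g, temperature)
print(f"dG = {delta_g / 1000:.1f} kJ/mol, Kd = {kd:.2e} M")
```

A more negative free energy change corresponds to a smaller Kd, i.e. tighter binding.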

21
Protein function

Many proteins combine functions
Some immunoglobulin structures are thought to
have more than 100 different functions (and
active/binding sites)
Alternative splicing can generate (partially)
alternative structures

22
Protein function Interaction
Active site / binding cleft
Shape complementarity
23
Protein function evolution
Chymotrypsin
24
How to infer function

Experiment
Deduction from sequence
Multiple sequence alignment: conservation patterns
Homology searching
Deduction from structure
Threading
Structure-structure comparison
Homology modelling

25
Cholesterol Biosynthesis

Cholesterol biosynthesis primarily occurs in eukaryotic cells. Cholesterol is necessary for membrane synthesis, and is a precursor for steroid hormone production as well as for vitamin D. While the pathway had previously been assumed to be localized in the cytosol and ER, more recent evidence suggests that many of the enzymes in the pathway exist largely, if not exclusively, in the peroxisome (the enzymes listed in blue in the pathway to the left are thought to be at least partly peroxisomal). Patients with peroxisome biogenesis disorders (PBDs) have a variable deficiency in cholesterol biosynthesis.

26
Cholesterol Biosynthesis: from acetyl-CoA to mevalonate
Mevalonate plays a role in epithelial cancers: it can inhibit EGFR
27
Epidermal Growth Factor as a Clinical Target in
Cancer

A malignant tumour is the product of uncontrolled cell proliferation. Cell growth is controlled by a delicate balance between growth-promoting and growth-inhibiting factors. In normal tissue the production and activity of these factors result in differentiated cells growing in a controlled and regulated manner that maintains the normal integrity and functioning of the organ. The malignant cell has evaded this control: the natural balance is disturbed (via a variety of mechanisms) and unregulated, aberrant cell growth occurs. A key driver for growth is the epidermal growth factor (EGF), and the receptor for EGF (the EGFR) has been implicated in the development and progression of a number of human solid tumours, including those of the lung, breast, prostate, colon, ovary, head and neck.

28
Energy housekeeping

Adenosine diphosphate (ADP)
Adenosine triphosphate (ATP)

29
Chemical Reaction
30
Enzymatic Catalysis
31
Gene Expression
32
Inhibition
33
Metabolic Pathway: Proline Biosynthesis
34
Transcriptional Regulation
35
Methionine Biosynthesis in E. coli
36
Shortcut Representation
37
High-level Interaction
38
Levels of Resolution
39
Cholesterol Biosynthesis
40
SREBP Pathway
41
Signal Transduction
Important signalling pathways: MAP-kinase (MAPK) signalling pathway, or TGF-β pathway
42
Transport
43
Phosphate Utilization in Yeast
44
Multiple Levels of Regulation

Gene expression
Protein activity
Protein intracellular location
Protein degradation
Substrate transport

45
Graphical Representation: Gene Expression
46
Experimental Data: Gene Expression
47
Experimental Data: Transcriptional Regulation
48
Experimental Data: Transcriptional Regulation
49
Transcriptional Regulation: Integrated View
50
Pathways and Pathway Diagrams

Pathways
Set of nodes (entities) and edges (associations)
Pathway Diagrams
XY coordinates
Node splitting allowed
Multiple views of the same pathway
Different abstraction levels
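
The distinction drawn on this slide between a pathway (nodes and edges) and a pathway diagram (a drawn view with XY coordinates) can be sketched as two separate data structures. The class names, field names and the toy pathway below are hypothetical and only illustrate the idea; abstraction levels are not modelled.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    kind: str  # free-text association label, e.g. "catalysis" (assumed)


@dataclass
class Pathway:
    """A pathway: a set of nodes (entities) and edges (associations)."""
    name: str
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)

    def add_edge(self, source, target, kind):
        self.nodes.update((source, target))
        self.edges.add(Edge(source, target, kind))


@dataclass
class PathwayDiagram:
    """One view of a pathway: XY coordinates for the drawn copies of its nodes."""
    pathway: Pathway
    # Node splitting: a node may be drawn several times, so each node maps
    # to a list of (x, y) positions rather than to a single position.
    positions: dict = field(default_factory=dict)

    def place(self, node, x, y):
        if node not in self.pathway.nodes:
            raise KeyError(f"{node!r} is not in pathway {self.pathway.name!r}")
        self.positions.setdefault(node, []).append((x, y))


# Toy example: two views (diagrams) of the same pathway.
pathway = Pathway("toy pathway")
pathway.add_edge("enzyme E", "reaction R", "catalysis")
pathway.add_edge("ATP", "reaction R", "substrate")

overview = PathwayDiagram(pathway)
overview.place("ATP", 10, 20)

detail = PathwayDiagram(pathway)
detail.place("ATP", 10, 20)
detail.place("ATP", 200, 40)  # the same node drawn a second time (split node)
```

Keeping the layout out of the `Pathway` object is what allows several diagrams of the same pathway to coexist.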

51

Metabolic networks: Glycolysis and Gluconeogenesis
KEGG database (Japan)
52
Gene Ontology (GO)

Not a genome sequence database
Developing three structured, controlled
vocabularies (ontologies) to describe gene
products in terms of
biological process
cellular component
molecular function
in a species-independent manner
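
A rough Python sketch of how the three vocabularies can be used: each term has an identifier, a name and is_a parents, and every term traces back to one of the three root ontologies. The five terms are a small hand-picked excerpt (the three root terms and two of their children), the dictionary layout and function names are assumptions, and the annotated gene product is hypothetical.

```python
# Illustrative excerpt only: term id -> (name, list of is_a parent ids).
GO_TERMS = {
    "GO:0008150": ("biological_process", []),
    "GO:0005575": ("cellular_component", []),
    "GO:0003674": ("molecular_function", []),
    "GO:0008152": ("metabolic process", ["GO:0008150"]),
    "GO:0003824": ("catalytic activity", ["GO:0003674"]),
}


def ancestors(term_id, terms=GO_TERMS):
    """All terms reachable from term_id by following is_a parents."""
    found = set()
    stack = list(terms[term_id][1])
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(terms[parent][1])
    return found


def ontology_of(term_id, terms=GO_TERMS):
    """Name of the root ontology (one of the three vocabularies) of a term."""
    candidates = ancestors(term_id, terms) | {term_id}
    roots = [t for t in candidates if not terms[t][1]]
    return terms[roots[0]][0]


# Hypothetical annotation of one gene product; the same term ids would be
# used for gene products of any species.
annotations = {"hypothetical enzyme X": ["GO:0003824", "GO:0008152"]}

for product, term_ids in annotations.items():
    for term_id in term_ids:
        name = GO_TERMS[term_id][0]
        print(f"{product}: {name} ({ontology_of(term_id)})")
```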

53
The GO ontology
54
Gene Ontology Members

FlyBase - database for the fruit fly Drosophila melanogaster
Berkeley Drosophila Genome Project (BDGP) - Drosophila informatics, GO database software, Sequence Ontology development
Saccharomyces Genome Database (SGD) - database for the budding yeast Saccharomyces cerevisiae
Mouse Genome Database (MGD) and Gene Expression Database (GXD) - databases for the mouse Mus musculus
The Arabidopsis Information Resource (TAIR) - database for the brassica family plant Arabidopsis thaliana
WormBase - database for the nematode Caenorhabditis elegans
EBI GOA project - annotation of UniProt (Swiss-Prot/TrEMBL/PIR) and InterPro databases
Rat Genome Database (RGD) - database for the rat Rattus norvegicus
DictyBase - informatics resource for the slime mold Dictyostelium discoideum
GeneDB S. pombe - database for the fission yeast Schizosaccharomyces pombe (part of the Pathogen Sequencing Unit at the Wellcome Trust Sanger Institute)
GeneDB for protozoa - databases for Plasmodium falciparum, Leishmania major, Trypanosoma brucei, and several other protozoan parasites (part of the Pathogen Sequencing Unit at the Wellcome Trust Sanger Institute)
Genome Knowledge Base (GK) - a collaboration between Cold Spring Harbor Laboratory and EBI
TIGR - The Institute for Genomic Research
Gramene - A Comparative Mapping Resource for Monocots
Compugen (with its Internet Research Engine)
The Zebrafish Information Network (ZFIN) - reference datasets and information on Danio rerio