1
Transmembrane protein annotation
Rita Casadio
BIOCOMPUTING GROUP Interdepartmental Centre for
Biotechnological Research/Department of
Biology University of Bologna, Italy
2
The omic era
Genome Sequencing Projects
Archaea: 26 species
Bacteria: 286 species
Eukaryotic: Complete - 21, Assembly - 86, In Progress - 171
http://www.ncbi.nlm.nih.gov/
Update: March 2006
3
The Data Bases of Biological Sequences and Structures
GenBank: 54,584,635 sequences; 59,750,386,305 nucleotides
>BGAL_SULSO BETA-GALACTOSIDASE Sulfolobus solfataricus.
MYSFPNSFRFGWSQAGFQSEMGTPGSEDPNTDWYKWVHDPENMAAGLVSG
DLPENGPGYWGNYKTFHDNAQKMGLKIARLNVEWSRIFPNPLPRPQNFDE
SKQDVTEVEINENELKRLDEYANKDALNHYREIFKDLKSRGLYFILNMYH
WPLPLWLHDPIRVRRGDFTGPSGWLSTRTVYEFARFSAYIAWKFDDLVDE
YSTMNEPNVVGGLGYVGVKSGFPPGYLSFELSRRHMYNIIQAHARAYDGI
KSVSKKPVGIIYANSSFQPLTDKDMEAVEMAENDNRWWFFDAIIRGEITR
GNEKIVRDDLKGRLDWIGVNYYTRTVVKRTEKGYVSLGGYGHGCERNSVS
LAGLPTSDFGWEFFPEGLYDVLTKYWNRYHLYMYVTENGIADDADYQRPY
YLVSHVYQVHRAINSGADVRGYLHWSLADNYEWASGFSMRFGLLKVDYNT
KRLYWRPSALVYREIATNGAITDEIEHLNSVPPVKPLRH
NR: 2,638,494 sequences; 847,653,699 residues
SwissProt: 211,104 sequences; 77,361,893 residues
PDB: 35,460 structures
Membrane proteins: < 1%
Update: March 2006
4
Why membrane proteins? So many important functions.
5
Different architectures
Outer membrane proteins (all-β transmembrane proteins)
Inner membrane proteins (all-α transmembrane proteins)
6
Membrane classification and membrane protein
structures
Structural type of membrane proteins
All-β
All-α
7

Outer Membrane
Inner Membrane
β-barrel
α-helices
Bilayer
Bacteriorhodopsin (Halobacterium salinarum)
Porin (Rhodobacter capsulatus)
8
Functional annotation in silico by homology search
ADH1_SULSO  ----------MRAVRLVEIGKP--LSLQEIGVPKPKGPQVLIKVEAAGVCHSDVHMRQGRFGNLRIVE
ADH_CLOBE   ----------MKGFAMLGINKLG---WIEKERPVAGSYDAIVRPLAVSPCTSDIHTVFEGA-------
ADH_THEBR   ----------MKGFAMLSIGKVG---WIEKEKPAPGPFDAIVRPLAVAPCTSDIHTVFEGA-------
ADH1_SOLTU  MSTTVGQVIRCKAAVAWEAGKP--LVMEEVDVAPPQKMEVRLKILYTSLCHTDVYFWEAKG-------
ADH2_LYCES  MSTTVGQVIRCKAAVAWEAGKP--LVMEEVDVAPPQKMEVRLKILYTSLCHTDVYFWEAKG-------
ADH1_ASPFL  ----MSIPEMQWAQVAEQKGGP--LIYKQIPVPKPGPDEILVKVRYSGVCHTDLHALKGDW-------

Sequence comparison is performed with alignment programs
Sequence identity ≥ 30%: similar function
Methods for similarity searches
BLAST, Psi-BLAST (http://www.ncbi.nlm.nih.gov/BLAST/)
Altschul et al. (1990) J Mol Biol 215:403-410
Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402
Pfam (http://pfam.wustl.edu/hmmsearch.shtml)
Bateman et al. (2000) Nucleic Acids Research 28:263-266
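As a rough illustration of the identity criterion on this slide, the sketch below computes percent identity between two rows of the ADH alignment. The function name `percent_identity` and the choice of denominator (columns where both rows have a residue) are assumptions made for this example; other conventions exist, and this is not how BLAST itself reports identity.

```python
def percent_identity(row_a: str, row_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither row has a gap.

    The denominator convention is an assumption of this sketch.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(row_a, row_b):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either row
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Two rows taken from the ADH alignment on this slide.
adh_clobe = "----------MKGFAMLGINKLG---WIEKERPVAGSYDAIVRPLAVSPCTSDIHTVFEGA-------"
adh_thebr = "----------MKGFAMLSIGKVG---WIEKEKPAPGPFDAIVRPLAVAPCTSDIHTVFEGA-------"

identity = percent_identity(adh_clobe, adh_thebr)
print(f"{identity:.1f}% identity; at or above the 30% threshold: {identity >= 30}")
```

A pair that clears the threshold would be a candidate for function transfer by homology; pairs below it are where topology prediction becomes the more useful annotation route.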
9
Our strategy: annotation by predicting membrane protein topology
10
Predictors of the Topology of Membrane Proteins
11
Tools from machine learning approaches: Neural Networks (NNs) and/or Hidden Markov Models (HMMs)
[Diagram labels: Training, Testing, NN/HMM, General rules, Prediction]
12
Annotation/prediction of all-alpha transmembrane proteins
13
Biosapiens network of excellence: annotation of all-alpha membrane proteins
14
Our starting Data Base
UniProt (September 22, 2004): 33,135 unique human proteins
http://www.ebi.ac.uk/integr8/FtpSearch.do?orgProteomeID=25
In UniProt:
4002 sequences are annotated as Transmembrane (12%)
8897 sequences are annotated as Hypothetical (27%)
15
Methods
Prediction of the signal peptide
SPepLip. Method: NNs. Input: single sequence. Fariselli P, Finocchiaro G, Casadio R (2003) Bioinformatics 19:2498-2499
Prediction of transmembrane α-helices
MEMSAT. Method: dynamic programming. Input: sequence profiles. Jones DT, Taylor WR, Thornton JM (1994) Biochemistry 33:3038-3049
TMHMM2.0. Method: HMMs. Input: single sequence. Krogh A, Larsson B, von Heijne G, Sonnhammer ELL (2001) JMB 305:567-580
ENSEMBLE 1.0. Method: NNs and HMMs. Input: sequence profiles. Martelli PL, Fariselli P, Casadio R (2003) Bioinformatics 19:i205-i211
ENSEMBLE 1.0 FILTER. Filtering procedure for reducing false positives. Martelli PL, Fariselli P, Casadio R (2003) Bioinformatics 19:i205-i211
New methods
TMHMMdomfix. Method: HMMs. Input: single profiles, SMART domains. Bernsel A, von Heijne G (2005) Prot Sci 14:1723-1728
PRODIV_TMHMM. Method: HMMs. Input: sequence profiles. Viklund H, Elofsson A (2004) Prot Sci 13:1908-1917
In progress
ENSEMBLE 2.0. Alternative topology assignment; 2 preliminary versions
16
Performance of the high-scoring methods on the 121 high-resolution chains (from PDB)
Correct Topography: correct position of TM helices along the sequence
Correct Topology: correct Topography AND correct orientation with respect to the membrane plane
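The two scoring criteria on this slide can be sketched as follows. The segment representation, the `min_overlap` threshold of 9 residues, and the "in"/"out" label for the N-terminus side are assumptions made for illustration; the slides do not state the exact overlap rule used in the benchmark.

```python
from typing import List, Tuple

Segment = Tuple[int, int]  # inclusive (start, end) residue positions of one TM helix


def overlap(a: Segment, b: Segment) -> int:
    """Number of residues shared by two inclusive segments."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def correct_topography(pred: List[Segment], obs: List[Segment],
                       min_overlap: int = 9) -> bool:
    """Same number of helices, each predicted helix overlapping its observed partner.

    min_overlap is a hypothetical threshold chosen for this sketch.
    """
    if len(pred) != len(obs):
        return False
    return all(overlap(p, o) >= min_overlap
               for p, o in zip(sorted(pred), sorted(obs)))


def correct_topology(pred: List[Segment], pred_nterm: str,
                     obs: List[Segment], obs_nterm: str,
                     min_overlap: int = 9) -> bool:
    """Correct topography AND the same side of the membrane for the N-terminus."""
    return correct_topography(pred, obs, min_overlap) and pred_nterm == obs_nterm


# Toy example: helices are placed correctly, but the orientation is flipped.
observed = [(10, 30), (45, 65)]
predicted = [(12, 31), (44, 63)]
print(correct_topography(predicted, observed))              # helix positions agree
print(correct_topology(predicted, "out", observed, "in"))   # orientation disagrees
```

Keeping the two checks separate makes it easy to report both scores for each predictor, since a method can locate every helix and still assign the wrong orientation.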
17
A new annotation server: http://pongo.biocomp.unibo.it/pongo
18
A new annotation server: http://pongo.biocomp.unibo.it/pongo
For retrieving results stored in the data base...
19
A new annotation server: http://pongo.biocomp.unibo.it/pongo
...and for predicting new sequences
22
A new annotation server: http://pongo.biocomp.unibo.it/pongo
23
TM protein annotation of the human genome: annotation of the UniProt data base (33,135 sequences)
24
  • Out of 33,135 unique sequences of the
    Ensemble35a1:
  • 19.5% of the sequences are predicted as membrane
    proteins by TMHMM2.0 (single-sequence based);
    19% are predicted by all three predictors.
  • Prodiv: 17% are predicted by all predictors.
  • 33.5% and 32.2% are predicted as membrane
    proteins by MEMSAT and ENSEMBLE, respectively
    (25% are predicted by both predictors).
  • These results set the lower and upper bounds for
    the membrane protein content of the human genome
    and provide a list of putative membrane proteins
    for further applications.
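The idea of bracketing the membrane protein content with several predictors can be sketched with plain set operations. The protein identifiers, the toy proteome size, and the choice of the union as upper bound are all assumptions of this example; the slide may instead take the largest single-predictor fraction as its upper bound.

```python
# Hypothetical toy data: sets of protein IDs each predictor calls "membrane".
predictions = {
    "TMHMM2.0": {"P1", "P2"},
    "MEMSAT": {"P1", "P2", "P3", "P4"},
    "ENSEMBLE": {"P1", "P2", "P3", "P5"},
}
total_proteins = 10  # size of the toy proteome (assumption)

# Proteins called by every predictor: a conservative lower bound.
consensus = set.intersection(*predictions.values())
# Proteins called by at least one predictor: a permissive upper bound.
any_call = set.union(*predictions.values())

lower_bound = 100 * len(consensus) / total_proteins
upper_bound = 100 * len(any_call) / total_proteins

print(f"lower bound: {lower_bound:.1f}%  upper bound: {upper_bound:.1f}%")
print("consensus list:", sorted(consensus))
```

The consensus set doubles as the list of putative membrane proteins with the strongest support.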

25
Distribution of the number of predicted α-helices
26
Distribution of predicted TM proteins among the
chromosomes
28
Annotation/prediction of all-beta transmembrane proteins
29
Prediction Server Page of The Biocomputing Group
http://gpcr.biocomp.unibo.it/
30
Trample
http://gpcr.biocomp.unibo.it/biodec
31
Strand
Helix
SignalPep
Fariselli et al. NAR 33, 2005
32
TRAMPLE www.biocomp.unibo.it
33
Performance of HMM-B2TMR on 21 high-resolution TM beta-barrel proteins compared to other predictors

34
A brand new prediction with Trample
Omp32 anion-selective porin Delftia
acidovorans, 5 Å (2FGR) and 1.45 Å (2FGQ)
Zachariae et al. (2006)
35
Rate of false positives for ENSEMBLE (all-alpha) and B2-TMR (all-beta)
The predictors are tested on 809 globular proteins with sequence identity ≤ 25%
0.5% have at least 1 α-TM helix predicted
5.6% have at least 2 β-TM strands predicted
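To make the false-positive figures concrete, the sketch below converts them into approximate protein counts. It assumes the two figures are percentages of the 809 globular proteins; the counts are back-computed here for illustration and are not stated in the slides.

```python
def implied_count(percent: float, n: int) -> int:
    """Approximate number of proteins corresponding to a percentage of n."""
    return round(n * percent / 100)


def rate_percent(count: int, n: int) -> float:
    """False-positive rate in percent: wrongly flagged proteins over all negatives."""
    return 100 * count / n


n_globular = 809  # size of the globular (negative) test set, from the slide

alpha_fp = implied_count(0.5, n_globular)  # at least 1 alpha-TM helix predicted
beta_fp = implied_count(5.6, n_globular)   # at least 2 beta-TM strands predicted

print(f"all-alpha: about {alpha_fp} proteins ({rate_percent(alpha_fp, n_globular):.1f}%)")
print(f"all-beta:  about {beta_fp} proteins ({rate_percent(beta_fp, n_globular):.1f}%)")
```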
36
A software system for genome annotation
37
[Flowchart: PROTEOME into HUNTER. Decision nodes: Signal peptide (Yes/No), All-α TM, All-β TM]
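One plausible reading of the flowchart labels is a cascade of three yes/no predictors. The sketch below is an interpretation of the diagram, not the published Hunter code: the function names, the class labels, and the order of the checks are assumptions, and the three predictors are passed in as placeholder callables.

```python
from typing import Callable

Predictor = Callable[[str], bool]  # takes a protein sequence, returns yes/no


def classify(sequence: str,
             has_signal_peptide: Predictor,
             is_all_alpha_tm: Predictor,
             is_all_beta_tm: Predictor) -> str:
    """Assign a protein to a class by chaining three predictors (hypothetical order)."""
    if has_signal_peptide(sequence):
        if is_all_alpha_tm(sequence):
            return "inner membrane (all-alpha TM)"
        if is_all_beta_tm(sequence):
            return "outer membrane (all-beta TM)"
        return "globular"
    # No signal peptide: only the all-alpha check applies in this reading.
    if is_all_alpha_tm(sequence):
        return "inner membrane (all-alpha TM)"
    return "globular"


# Toy stand-ins for the real predictors, keyed on marker letters (illustration only).
toy_signal = lambda s: s.startswith("S")
toy_alpha = lambda s: "A" in s
toy_beta = lambda s: "B" in s

for seq in ["SxxAxx", "SxxBxx", "xxBxx", "Sxxxx"]:
    print(seq, "->", classify(seq, toy_signal, toy_alpha, toy_beta))
```

In a real pipeline the three callables would wrap a signal peptide predictor, an all-alpha predictor, and an all-beta predictor, followed by a filter to limit false positives.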
38
Escherichia coli K12, complete genome
Completed: Oct 13, 1998. Total bases: 4,639,221 bp
NCBI (www.ncbi.nlm.nih.gov): Protein coding genes: 4,289; Structural RNAs: 115
EcoGene/EcoProt (bmb.med.miami.edu/EcoGene): Protein coding genes: 4,173; Structural RNAs: 120
39
Classification of the non-annotated proteins: 1253 non-annotated proteins
40
Experimental validation on 8 new outer membrane proteins (Protein Science, March 2006 / von Heijne and Casadio Labs)
Outer Membrane Fraction
41
Predicting globular, inner and outer membrane
proteins in genomes of Gram-negative bacteria
with Hunter
  • the number of new proteins predicted in the class
    with Hunter, out of the non-annotated region
  • Lists available at www.biocomp.unibo.it

42
[Flowchart: PROTEOME into MANHUNTER (human genome annotation). Decision nodes: Subcellular Localization/SPEP (Yes/No), All-α TM, All-β TM]
43
Some preliminary results: distribution of the different protein structures among the chromosomes in Homo sapiens
44
3D structure prediction of proteins
[Diagram: prediction approaches (building by homology, threading/fold recognition, ab initio prediction) for existing folds, new folds, and membrane proteins; axis: Homology (%), 0 to 100]
45
- On the basis of predicted TM topology it may be possible to select a template for 3D structure prediction, even when sequence identity in the alignment is < 30%
- Modeling the 3D structure of all-alpha membrane proteins
- Modeling the 3D structure of eukaryotic β-barrel proteins (VDAC) on prokaryotic porins
46
Some examples...
A VDAC in Neurospora
A carrier in mitochondria
Casadio et al., FEBS Lett (2002)
A VDAC in Drosophila
OGC_BOVIN (20%, 1OKC)
Morozzo Della Rocca et al., JMBiol, 2005
pori_drome (15%, 2OMF)
Aiello et al., JBC (2004)
47
The Biocomputing Group of the University of
Bologna
Remo
Piero
Gianluca
Emidio
Pier Luigi
Ludovica
Paola
Ivan
Rita
Lisa
Alberto