Title: PowerPoint Presentation Author: blatter Last modified by: marie-claude blatter Created Date: 1/3/2005 2:21:26 PM Document presentation format – PowerPoint PPT presentation
Slides: 84
Provided by: blat2
Transcript and Presenter's Notes
1
Corrections
 
2
(No Transcript)
3
N-linked glycosylation (GlcNAc): look at the
Swiss-Prot annotation (in a random glycosylated
entry)
4
Query
annotation:(type:carbohyd "N-linked (GlcNAc...)" confidence:experimental) reviewed:yes
5
Taxonomic distribution
6
TPNLINDTME
7
Multiple alignment (ClustalW)
[LAPIQ]-N-[HAYRCS]-[ST]-[KLESGM]
8
(No Transcript)
9
(No Transcript)
10
(No Transcript)
11
N-glycosylation does not occur in Bacteria:
false positive!
12
301 proteins (within the set of 1000 proteins) are
N-glycosylated according to the UniProtKB
annotation!
13
(No Transcript)
14
Scan Prosite with the official pattern.
The official pattern also matches bacterial
sequences (false positives).
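Why the official pattern cannot discriminate by taxonomy can be sketched in a few lines of Python. This assumes the official pattern is PROSITE PS00001, N-{P}-[ST]-{P}, which the slides do not spell out; the function name is illustrative only, and the input is the peptide shown on slide 6.

```python
import re

# Assumption: the "official pattern" is PROSITE PS00001, N-{P}-[ST]-{P}.
# The lookahead lets overlapping candidate sites be reported as well.
SEQUON = re.compile(r"(?=(N[^P][ST][^P]))")

def find_sequons(sequence):
    """Return (1-based position, matched residues) for each candidate site."""
    return [(m.start() + 1, m.group(1))
            for m in SEQUON.finditer(sequence.upper())]

# Peptide from slide 6
print(find_sequons("TPNLINDTME"))  # [(6, 'NDTM')]
```

The regular expression only reads the residues, so any bacterial sequence containing such a motif is reported too, which is the false-positive issue noted on this slide.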
15
(No Transcript)
16
(No Transcript)
17
PRATT pattern with 20 sequences:
D-K-T-G-T-[IL]-T-x(3)-[ILMV]-x-[FILV]
18
(No Transcript)
19
AT31_HUMAN SIMILARITY: Belongs to the cation
transport ATPase (P-type) family. Type V
subfamily. The pattern is a discriminator for the
cation-transporting ATPase family.
20
(No Transcript)
21
(No Transcript)
22
(No Transcript)
23
(No Transcript)
24
(No Transcript)
25
(No Transcript)
26
(No Transcript)
27
C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H
28
Pattern scan
29
(No Transcript)
30
(No Transcript)
31
The pattern missed some Zn fingers in the same
protein, e.g. Q24174
Pattern
Profile
Not found with the pattern
32
The pattern C-X(2,4)-C-X(3)-[LIVMFYWC]-X(8)-H-X(3,5)-H
should include YRCVLCGTVAKSRNSLHSHMSrQHRGIST:
C-X(2,4)-C-X(3)-[LIVMFYWCA]-X(8)-H-X(3,5)-H
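The effect of adding A to the residue class can be checked with a small Python sketch. It assumes the residue classes are written in PROSITE bracket notation, [LIVMFYWC] and [LIVMFYWCA]; the converter is a hypothetical helper that covers only the pattern elements used on this slide, not the full PROSITE syntax.

```python
import re

ELEMENT = re.compile(
    r"(\[[A-Za-z]+\]|\{[A-Za-z]+\}|[A-Za-z])(?:\((\d+)(?:,(\d+))?\))?"
)

def prosite_to_regex(pattern):
    """Translate a simple PROSITE pattern into a Python regex string.

    Supported: single residues, x, [..] classes, {..} exclusions,
    and (n) or (n,m) repeat counts.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        element = element.strip()
        match = ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported element: {element!r}")
        core, low, high = match.groups()
        if core.lower() == "x":
            core = "."                              # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1].upper() + "]"  # excluded residues
        else:
            core = core.upper()
        if low and high:
            core += "{%s,%s}" % (low, high)
        elif low:
            core += "{%s}" % low
        parts.append(core)
    return "".join(parts)

sequence = "YRCVLCGTVAKSRNSLHSHMSrQHRGIST".upper()
original = prosite_to_regex("C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H")
modified = prosite_to_regex("C-x(2,4)-C-x(3)-[LIVMFYWCA]-x(8)-H-x(3,5)-H")

print(re.search(original, sequence))           # None
print(re.search(modified, sequence).group())   # CVLCGTVAKSRNSLHSHMSRQH
```

In this sequence the residue following C-x(2)-C-x(3) is an alanine, so the original class rejects it and the extended class accepts it.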
33
(No Transcript)
34
Yes!
But:
The pattern becomes less restrictive. You get
more sequences that should not be there. (As the
results are limited to 1000, the number of hits
is not the same.)
35
Discriminators (signatures, descriptors) for the
zinc finger C2H2-type domain can be found in
Prosite (pattern and profile) and Pfam (HMM).
36
(No Transcript)
37
Step 1: Scan UniProtKB/Swiss-Prot with the
pattern. Use the ScanProsite tool at
http://www.expasy.org/tools/scanprosite/
38
(No Transcript)
39
Step 2: Retrieve the matched human entries at
UniProt (go to the end of the ScanProsite result
page and click on "Matched UniProtKB entries").
40
Step 3: Retrieve the sequences annotated as being
phosphorylated on a Thr
41
Step 3: Retrieve the sequences annotated as being
phosphorylated on a Thr
-> 19 candidates to be manually checked.
42
(No Transcript)
43
InterPro scan results
44
InterPro: other schema (graphical view from
UniProtKB)
45
InterPro schema
PFAM graphical view
46
Prosite Graphical view
47
BLAST at NCBI against Swiss-Prot
NCBI: color key for alignment scores
48
At NCBI, Swiss-Prot does not contain the
alternative sequences (e.g. P28175-2)! NCBI gives
the version number of the Swiss-Prot sequence
(e.g. Q8BU25.2).
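The two suffixes are easy to confuse: a hyphen introduces an isoform (alternative sequence), while a dot introduces a sequence version. A minimal Python sketch of the distinction follows; the regular expression is a deliberately simplified assumption, not the full UniProtKB accession format, and the function name is hypothetical.

```python
import re

# Simplified assumption: accession, optional "-isoform", optional ".version".
IDENTIFIER = re.compile(
    r"^(?P<accession>[A-Z0-9]+)"
    r"(?:-(?P<isoform>\d+))?"
    r"(?:\.(?P<version>\d+))?$"
)

def split_identifier(identifier):
    """Split an identifier into accession, isoform and sequence version."""
    match = IDENTIFIER.match(identifier)
    if match is None:
        raise ValueError(f"unrecognised identifier: {identifier!r}")
    return match.groupdict()

print(split_identifier("P28175-2"))
# {'accession': 'P28175', 'isoform': '2', 'version': None}
print(split_identifier("Q8BU25.2"))
# {'accession': 'Q8BU25', 'isoform': None, 'version': '2'}
```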
49
UniProt: color code for identity scores (not
alignment scores!)
50
(No Transcript)
51
UniProt: color code for identity scores (not
alignment scores!)
52
ProDom database: list of proteins sharing at least
one common domain
53
(No Transcript)
54
1) BLAST at www.uniprot.org
55
(No Transcript)
56
(No Transcript)
57
(No Transcript)
58
2) PROSITE tools
59
(No Transcript)
60
You are lucky: domains are rarely left unannotated
in the different domain/family databases!
61
3) Construct a profile with MyHits at SIB. Use
PSI-BLAST.
62
Run a PSI-BLAST search against UniProtKB
63
(No Transcript)
64
Select the sequences with an E-value > 0.001 and
run a second cycle
65
Look at the MSA
66
(No Transcript)
67
Construct a profile with the MSA
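As a rough intuition for what a profile captures, the sketch below builds per-column residue frequencies from an alignment and scores a segment against them. This is a toy model under stated assumptions: real profiles, such as those built with MyHits, also use sequence weighting, substitution matrices and gap penalties, and the alignment shown here is invented for illustration.

```python
from collections import Counter

def column_frequencies(alignment):
    """Per-column residue frequencies of an MSA ('-' marks a gap)."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for column in zip(*alignment):
        residues = [c for c in column if c != "-"]   # ignore gaps
        counts = Counter(residues)
        total = len(residues)
        profile.append(
            {aa: n / total for aa, n in counts.items()} if total else {}
        )
    return profile

def score(profile, segment):
    """Sum of the column frequencies of the residues in `segment`."""
    return sum(col.get(aa, 0.0) for col, aa in zip(profile, segment))

# Hypothetical toy alignment, not taken from the slides
toy_msa = ["DKTGT", "DKTGS", "DRTGT", "D-TGT"]
toy_profile = column_frequencies(toy_msa)
print(toy_profile[4])                 # {'T': 0.75, 'S': 0.25}
print(score(toy_profile, "AAAAA"))    # 0.0
```

Unlike a pattern, which either matches or does not, such a matrix gives every candidate segment a graded score, which is why a profile can recover domains that a strict pattern misses.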
68
(No Transcript)
69
(No Transcript)
70
(No Transcript)
71
The profile
72
The profile hits
73
Construct an HMM with the MSA
74
The HMM
75
The HMM hits
76
- Look at the GoLoco data in InterPro. How
many proteins (and/or hits) are found by the
different methods?
77
http://www.ebi.ac.uk/interpro/
78
According to InterPro, the GoLoco domain is
described by at least one of the different methods
(PFAM, Prosite, SMART):
PFAM: 167 proteins
Prosite: 192 proteins
SMART: 1 protein
These different numbers are the consequence of the
interval between the releases of the different
databases (including the sequence database,
UniProtKB). They may also be due to the different
methods used (HMM, profile).
79
Look for the HMM for the GoLoco domain in PFAM
80
Look for the HMM for the GoLoco domain in PFAM
81
Download the HMM matrix
82
The HMM matrix
83
(No Transcript)