Title:
1Corrections
2(No Transcript)
3N-linked glycosylation (GlcNAc): look at the
Swiss-Prot annotation (in a random glycosylated
entry)
4Query
annotation:(type:carbohyd "N-linked (GlcNAc...)"
confidence:experimental) reviewed:yes
5Taxonomic distribution
6TPNLINDTME
7Multiple alignment (ClustalW)
-LAPIQ-N-HAYRCS-ST-KLESGM
8(No Transcript)
9(No Transcript)
10(No Transcript)
11N-glycosylation does not occur in Bacteria,
so a bacterial hit is a false positive!
12Of the set of 1000 proteins, 301 are
N-glycosylated according to the UniProtKB
annotation!
13(No Transcript)
14Scan Prosite with the official pattern
The official pattern also matches bacterial
sequences (false positives)
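The slide does not spell out the official pattern. Assuming it is the PROSITE entry PS00001 (N-{P}-[ST]-{P}), the scan can be sketched with a regular expression. The function name is illustrative, and the test peptide is the one shown on slide 6. Such a short pattern matches by chance in any proteome, which is why bacterial sequences are hit as well.

```python
import re

# Assumed pattern: PROSITE PS00001, N-{P}-[ST]-{P}.
# The lookahead lets overlapping candidate sites all be reported.
SEQUON = re.compile(r"(?=(N[^P][ST][^P]))")

def find_sequons(sequence):
    """Return (1-based position, matched text) for each candidate site."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(sequence.upper())]

print(find_sequons("TPNLINDTME"))  # [(6, 'NDTM')]
```

A match only says that the sequence motif is present, not that the site is glycosylated.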
15(No Transcript)
16(No Transcript)
17PRATT pattern with 20 sequences:
D-K-T-G-T-[IL]-T-x(3)-[ILMV]-x-[FILV]
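To try such a pattern outside ScanProsite, it can be translated into a regular expression. The sketch below reads the multi-letter elements of the PRATT pattern as bracketed residue classes and handles only the basic PROSITE syntax (x, [..], {..}, repeat counts); the `<` and `>` anchors are not supported. The function name and the test fragment are illustrative and not taken from the exercise.

```python
import re

def prosite_to_regex(pattern):
    """Translate a basic PROSITE-style pattern into a Python regex string."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        m = re.fullmatch(r"\s*([^()\s]+)\s*(?:\((\d+(?:,\d+)?)\))?\s*", element)
        if m is None:
            raise ValueError(f"cannot parse pattern element: {element!r}")
        core, repeat = m.groups()
        if core.lower() == "x":
            core = "."                        # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"    # excluded residues
        parts.append(core + ("{" + repeat + "}" if repeat else ""))
    return "".join(parts)

regex = prosite_to_regex("D-K-T-G-T-[IL]-T-x(3)-[ILMV]-x-[FILV]")
print(regex)  # DKTGT[IL]T.{3}[ILMV].[FILV]
print(bool(re.search(regex, "AADKTGTLTQNRMTVAH")))  # True (illustrative fragment)
```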
18(No Transcript)
19AT31_HUMAN SIMILARITY: Belongs to the cation
transport ATPase (P-type) family. Type V
subfamily. The pattern is a discriminator for
the cation-transporting ATPase family.
20(No Transcript)
21(No Transcript)
22(No Transcript)
23(No Transcript)
24(No Transcript)
25(No Transcript)
26(No Transcript)
27C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H
28Pattern scan
29(No Transcript)
30(No Transcript)
31The pattern missed some Zn fingers in the same
protein, e.g. Q24174
Pattern
Profile
Not found with the pattern
32The pattern C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H
should include
YRCVLCGTVAKSRNSLHSHMSrQHRGIST
C-x(2,4)-C-x(3)-[LIVMFYWCA]-x(8)-H-x(3,5)-H
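The effect of adding A to the residue class can be checked directly. In this sketch both patterns are written by hand as Python regular expressions (the multi-letter element is read as a bracketed residue class), and the helper name is illustrative. The residue after C-x(2)-C-x(3) in this segment is an alanine, which only the relaxed class accepts.

```python
import re

ORIGINAL = re.compile(r"C.{2,4}C.{3}[LIVMFYWC].{8}H.{3,5}H")
RELAXED = re.compile(r"C.{2,4}C.{3}[LIVMFYWCA].{8}H.{3,5}H")

# Segment from the slide; the lowercase 'r' is upper-cased before matching.
segment = "YRCVLCGTVAKSRNSLHSHMSrQHRGIST".upper()

def matched_text(regex, sequence):
    """Return the first matching substring, or None when there is no match."""
    m = regex.search(sequence)
    return m.group(0) if m else None

print(matched_text(ORIGINAL, segment))  # None
print(matched_text(RELAXED, segment))   # CVLCGTVAKSRNSLHSHMSRQH
```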
33(No Transcript)
34Yes!
But the pattern becomes less restrictive: you get
more sequences that should not be there. (As the
results are limited to 1000, the number of hits
is not the same.)
35Discriminators (Signatures, descriptors) for the
Zinc finger C2H2 type domain can be found in
Prosite (Pattern and Profile) and Pfam (HMM)
36(No Transcript)
37Step 1: scan UniProtKB/Swiss-Prot with the
pattern. Use the ScanProsite tool at
http://www.expasy.org/tools/scanprosite/
38(No Transcript)
39Step 2: retrieve the matched human entries at
UniProt (go to the end of the ScanProsite result
page and click on "Matched UniProtKB entries")
40Step 3: retrieve the sequences annotated as being
phosphorylated on a Thr
41Step 3: retrieve the sequences annotated as being
phosphorylated on a Thr
-> 19 candidates to be manually checked.
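Steps 2 and 3 amount to two filters applied to the matched entries: keep the human ones, then keep those annotated with a phosphothreonine. A minimal sketch, assuming the entries have already been loaded into plain dictionaries; the records, identifiers, and function name below are hypothetical placeholders, not real UniProtKB data.

```python
# Hypothetical, hand-written records standing in for the retrieved entries.
entries = [
    {"id": "EXA1_HUMAN", "features": [("MOD_RES", 42, "Phosphothreonine")]},
    {"id": "EXA2_HUMAN", "features": [("MOD_RES", 10, "Phosphoserine")]},
    {"id": "EXA3_MOUSE", "features": [("MOD_RES", 7, "Phosphothreonine")]},
]

def phosphothreonine_candidates(entries, species="HUMAN"):
    """Keep entries of one species that carry a phosphothreonine annotation."""
    selected = []
    for entry in entries:
        is_species = entry["id"].endswith("_" + species)
        has_pthr = any(
            key == "MOD_RES" and note.startswith("Phosphothreonine")
            for key, _position, note in entry["features"]
        )
        if is_species and has_pthr:
            selected.append(entry["id"])
    return selected

print(phosphothreonine_candidates(entries))  # ['EXA1_HUMAN']
```

The filter only narrows the list; each remaining candidate still has to be inspected by hand.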
42(No Transcript)
43InterPro scan results
44InterPro: other schema (graphical view from
UniProtKB)
45InterPro schema
Pfam graphical view
46Prosite Graphical view
47BLAST at NCBI against Swiss-Prot
NCBI: color key for alignment scores
48NCBI: the Swiss-Prot set at NCBI does not contain
the alternative sequences (e.g. P28175-2)! NCBI
gives the version number of the Swiss-Prot
sequence (e.g. Q8BU25.2).
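The two suffixes are easy to confuse: "-2" denotes an isoform, while ".2" denotes a sequence version. A small sketch that tells them apart, using the accession format published by UniProt; the function and group names are illustrative.

```python
import re

# UniProtKB accession format as documented by UniProt.
ACCESSION = r"(?:[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})"
IDENTIFIER = re.compile(
    rf"^(?P<accession>{ACCESSION})(?:-(?P<isoform>\d+)|\.(?P<version>\d+))?$"
)

def parse_identifier(text):
    """Split an identifier into accession, isoform number, and sequence version."""
    m = IDENTIFIER.match(text)
    if m is None:
        raise ValueError(f"not a recognised UniProtKB identifier: {text!r}")
    return {
        "accession": m.group("accession"),
        "isoform": int(m.group("isoform")) if m.group("isoform") else None,
        "version": int(m.group("version")) if m.group("version") else None,
    }

print(parse_identifier("P28175-2"))  # isoform 2 of P28175
print(parse_identifier("Q8BU25.2"))  # sequence version 2 of Q8BU25
```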
49UniProt: color code for identity scores (not
alignment scores!)
50(No Transcript)
51UniProt: color code for identity scores (not
alignment scores!)
52ProDom database: list of proteins sharing at least
one common domain
53(No Transcript)
541) BLAST at www.uniprot.org
55(No Transcript)
56(No Transcript)
57(No Transcript)
582) PROSITE tools
59(No Transcript)
60You are lucky: domains are rarely left unannotated
in the different domain/family databases!
613) Construct a profile with MyHits at SIB. Use
PSI-BLAST
62Do a PSI-BLAST against UniProtKB
63(No Transcript)
64Select sequences with an E-value > 0.001 and do a
second cycle
65Look at the MSA
66(No Transcript)
67Construct a profile with the MSA
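In essence, a profile turns each alignment column into a set of residue scores. The sketch below builds a simple log-odds matrix with a flat pseudocount and a uniform background; both are simplifying assumptions, and the real profile construction used in the exercise also involves sequence weighting and gap penalties. The toy alignment and function names are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_pssm(alignment, pseudocount=1.0):
    """Return one {residue: log-odds score} dict per alignment column."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    pssm = []
    for col in range(length):
        # Gap characters are ignored when counting residues.
        counts = Counter(row[col] for row in alignment if row[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        pssm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm

def score(pssm, segment):
    """Sum the column scores for an ungapped segment as long as the profile."""
    return sum(column[aa] for column, aa in zip(pssm, segment))

# Hypothetical toy alignment, not taken from the exercise.
toy_msa = ["DKTGT", "DKSGT", "DRTGT", "DKTGS"]
pssm = build_pssm(toy_msa)
print(score(pssm, "DKTGT") > score(pssm, "AAAAA"))  # True
```

Unlike a pattern, the profile gives every residue a score at every position, so a near miss lowers the total score instead of ruling the sequence out.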
68(No Transcript)
69(No Transcript)
70(No Transcript)
71The profile
72The profile hits
73Construct an HMM with the MSA
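A first step in building a profile HMM from an alignment is to decide which columns become match states and which are treated as insertions. The sketch below uses a common gap-fraction heuristic with an assumed threshold of 0.5, then shows the match/delete/insert path each aligned sequence follows. It is not the procedure of any particular tool, and the toy alignment and function names are hypothetical.

```python
def assign_columns(alignment, max_gap_fraction=0.5):
    """Label each column 'M' (match) or 'I' (insert) by its gap fraction."""
    n = len(alignment)
    labels = []
    for col in range(len(alignment[0])):
        gaps = sum(1 for row in alignment if row[col] in "-.")
        labels.append("M" if gaps / n < max_gap_fraction else "I")
    return "".join(labels)

def state_path(row, labels):
    """Return the M/D/I state path of one aligned sequence through the model."""
    path = []
    for residue, label in zip(row, labels):
        is_gap = residue in "-."
        if label == "M":
            path.append("D" if is_gap else "M")  # gap in a match column = delete
        elif not is_gap:
            path.append("I")                     # residue in an insert column
    return "".join(path)

# Hypothetical toy alignment, not taken from the exercise.
toy_msa = ["AC-DE", "AC-DE", "ACWDE", "A--DE"]
labels = assign_columns(toy_msa)
print(labels)                                        # MMIMM
print([state_path(row, labels) for row in toy_msa])  # ['MMMM', 'MMMM', 'MMIMM', 'MDMM']
```

Counting the residues seen in each match state and the transitions along these paths would then give the emission and transition probabilities of the model.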
74The HMM
75The HMM hits
76Look at the GoLoco data in InterPro. How
many proteins (and/or hits) are found by the
different methods?
77http://www.ebi.ac.uk/interpro/
78According to InterPro, the GoLoco domain is described
by at least one of the different methods (Pfam,
PROSITE, SMART):
Pfam: 167 proteins
PROSITE: 192 proteins
SMART: 1 protein
These different numbers are a consequence of the
interval between the releases of the different
databases (including the sequence database,
UniProtKB). They may also be due to the different
methods used (HMM, profile).
79Look for the HMM for the GoLoco domain in Pfam
80Look for the HMM for the GoLoco domain in Pfam
81Download the HMM matrix
82the HMM matrix
83(No Transcript)