Protein structure viewing

Slides: 29
Provided by: kag7

1
  • Lecture 9
  • 1. Protein structure viewing:
  •    PDB database, PDBsum, QuickPDB.
  •    NCBI MMDB (Molecular Modeling DataBase), Cn3D.
  • 2. Protein Classification: SCOP and CATH databases.
  • 3. Families, Patterns, Motifs - InterPro.
  •    Databases: PROSITE, Blocks, Pfam, eMOTIF.
  • 4. Other interesting protein features.
  • 5. Other Diseases Related to Protein Folding.

2
How Many Folds Are There?
  • Structural Classification of Proteins (SCOP)
  • Status (1 Mar 2002): based on 13220 PDB entries.
  • How many more folds are there?
  • Estimation: number of possible folds: 4,000.
  • A database of 930 folds covers 90% of protein families.

http://scop.berkeley.edu/count.html
3
Clustering of Structures
Structural Classification of Proteins (SCOP):
http://scop.mrc-lmb.cam.ac.uk/scop/
Nearly all proteins have structural similarities
with other proteins and sometimes share a common
evolutionary origin. The SCOP database aims to
classify proteins according to structural and
evolutionary relationships.
[Figure: SCOP hierarchy example: all α > Globin-like > Globin-like > globins > myoglobin]
4
Clustering of Structures
  • Class - similar secondary structures
    (all α, all β, α/β, etc.). Example: all α.
  • Fold - major structural similarity
    (similar secondary structures). Example: Globin-like.
  • Super-family - low sequence identity,
    probable common ancestry. Example: Globin-like.
  • Family - clear evolutionary relationship
    (usually sequence identity > 30%). Example: globins.
  • Individual protein. Example: myoglobin.
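The five SCOP levels can be pictured as a simple record, one field per level. The sketch below is only an illustration: the class name `ScopLineage`, the helper `shared_level`, and the hemoglobin entry are assumptions added here, not part of SCOP or of the lecture.

```python
from dataclasses import dataclass

# Levels in the order used on the slide, from most general to most specific.
SCOP_LEVELS = ("class", "fold", "superfamily", "family", "protein")


@dataclass(frozen=True)
class ScopLineage:
    """One protein's position in the SCOP hierarchy (illustrative only)."""
    scop_class: str
    fold: str
    superfamily: str
    family: str
    protein: str

    def as_tuple(self):
        return (self.scop_class, self.fold, self.superfamily,
                self.family, self.protein)


def shared_level(a, b):
    """Return the most specific level at which two lineages agree, or None."""
    deepest = None
    for level, x, y in zip(SCOP_LEVELS, a.as_tuple(), b.as_tuple()):
        if x != y:
            break
        deepest = level
    return deepest


# The myoglobin lineage from the slide.
myoglobin = ScopLineage("all alpha", "Globin-like", "Globin-like",
                        "globins", "myoglobin")
# Hypothetical second entry, added here for comparison.
hemoglobin = ScopLineage("all alpha", "Globin-like", "Globin-like",
                         "globins", "hemoglobin")

print(shared_level(myoglobin, hemoglobin))
```

Two proteins that agree down to the family level, as here, are the ones SCOP treats as having a clear evolutionary relationship.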
5
SCOP - Results
6
http://www.biochem.ucl.ac.uk/bsm/cath_new/index.html
http://www.biochem.ucl.ac.uk/bsm/cath_new/cath_info.html
CATH: classification of protein domain structures.
  • C - Class - determined according to secondary
    structure composition.
  • A - Architecture - describes the overall shape
    of the domain structure.
  • T - Topology (FOLD) - major structural
    similarities.
  • H - Homology Super-family - protein domains
    which share a common ancestor.
[Figure: 3D view, click to open]
Domains for 1fupA2
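A domain name such as 1fupA2 packs three pieces of information into one string. The sketch below assumes the usual layout (a four-character PDB code, a one-character chain identifier, then a domain number); the function name `parse_cath_domain` and the `CathDomain` type are made up for this example and are not a CATH API.

```python
import re
from typing import NamedTuple


class CathDomain(NamedTuple):
    pdb_id: str   # four-character PDB entry code
    chain: str    # one-character chain identifier
    domain: int   # domain number within the chain


# Assumed layout: PDB code (digit + 3 alphanumerics), chain, 1-2 digit domain number.
_DOMAIN_RE = re.compile(r"^([0-9][A-Za-z0-9]{3})([A-Za-z0-9])(\d{1,2})$")


def parse_cath_domain(name):
    """Split a domain name like '1fupA2' into PDB code, chain and domain number."""
    match = _DOMAIN_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognised domain name: {name!r}")
    pdb_id, chain, number = match.groups()
    return CathDomain(pdb_id.lower(), chain, int(number))


print(parse_cath_domain("1fupA2"))
```

Under that assumption, 1fupA2 reads as PDB entry 1fup, chain A, domain 2.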
8
Higher Level Structures: Motifs and Domains
  • Family: a set of sequences that are related
    (functionally/structurally).
  • Motif: a simple combination of a few secondary
    structures that appears in several different
    proteins in nature. A collection of motifs forms
    a domain.
  • Domain: a more complex combination of secondary
    structures that is common in a family
    (consensus pattern). It has a very specific
    function (contains an active site). A protein
    may contain more than one domain.
For further reading:
http://www.expasy.org/swissmod/course/text/chapter4.htm
http://www.ii.uib.no/inge/talks/sverige00/sld003.htm
9
Grouping of Secondary Structure Elements -
Super-secondary Structures or Motifs.
  • β-hairpin
  • βαβ
  • αα
  • β-barrels
http://www.expasy.org/swissmod/course/text/chapter4.htm
10
Example: DNA Pattern Search
Patterns most often examined in DNA sequences.
Examples:
  • Recognition sites of restriction enzymes.
  • Codons specifying the amino acid sequence of a protein.
  • Intron splice sites.
  • Promoters.
  • Binding sites for regulatory proteins.

http://www.blc.arizona.edu/courses/bioinformatics/patterns.html
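The first item in the list, restriction enzyme recognition sites, is the simplest pattern search to try by hand. The sketch below scans a DNA string for the EcoRI site GAATTC; the function name `find_sites` and the toy sequence are assumptions made up for this example.

```python
import re

ECORI_SITE = "GAATTC"   # EcoRI recognition sequence
BAMHI_SITE = "GGATCC"   # BamHI recognition sequence


def find_sites(dna, site):
    """Return 0-based start positions of every occurrence of site in dna.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    pattern = re.compile("(?=" + re.escape(site) + ")")
    return [m.start() for m in pattern.finditer(dna.upper())]


# Toy sequence, invented for illustration.
dna = "ttgaattcggatccgaattcaa"
print(find_sites(dna, ECORI_SITE))
print(find_sites(dna, BAMHI_SITE))
```

Both sites are palindromic (each reads the same on the reverse-complement strand), so scanning one strand is enough for them; a non-palindromic site would also need a search for its reverse complement.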
11
Example: Calmodulin-Binding Motif
(calcium-binding proteins)
12
Example: Leucine Zipper Motif
(Transcription factor)
http://www.blc.arizona.edu/courses/bioinformatics/patt-lab.html
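A leucine zipper shows up in sequence as a leucine at every seventh position. The sketch below searches for four such leucines, which corresponds to the PROSITE-style pattern L-x(6)-L-x(6)-L-x(6)-L. The function name and the test sequence are assumptions invented for this example, and a match is only suggestive, since the pattern is very permissive.

```python
import re

# L-x(6)-L-x(6)-L-x(6)-L as a regular expression.
# The lookahead lets overlapping candidates be reported too.
LEUCINE_ZIPPER = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")


def find_leucine_zippers(protein):
    """Return (start, matched_segment) for each candidate heptad-leucine run."""
    return [(m.start(), m.group(1))
            for m in LEUCINE_ZIPPER.finditer(protein.upper())]


# Invented toy sequence: two residues, then leucines spaced seven apart.
toy = "MK" + "LAAAAAA" * 3 + "L" + "GG"
print(find_leucine_zippers(toy))
```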
13
Example: Zinc-Finger Motif
(DNA-binding proteins)
http://www.blc.arizona.edu/courses/bioinformatics/patt-lab.html
14
Example: Zinc-Finger Motif
http://www.ii.uib.no/inge/talks/ebi-nov-99/sld006.htm
15
Motifs in Protein Analysis
http://www.ii.uib.no/inge/talks/ebi-nov-99/sld009.htm
16
Protein Sequence Motif Databases
http://www.ii.uib.no/inge/talks/sverige00/sld003.htm
17
Conserved Protein Regions: Profiles, Motifs and
Domains
We will tour various web servers and databases
that identify conserved regions within protein
families.
Warning! Definitions, formats and outputs vary
significantly from one server to the next. The
field is still relatively young and very
dynamic, so no standards have been established
yet!
18
http://www.expasy.ch/prosite/
PROSITE helps determine the function of an
uncharacterized protein and the known family of
proteins to which it belongs. A pattern
describes a group of amino acids that constitutes
a usually short but characteristic motif within
a protein sequence.
For example: the pattern [AC]-x-V-x(4)-{ED}.
is interpreted as [Ala or Cys]-any-Val-
any-any-any-any-{any but Glu or Asp}.
Note: search by full text search.
19
PROSITE SYNTAX
For example: the pattern [AC]-x-V-x(4)-{ED}.
is interpreted as [Ala or Cys]-any-Val-
any-any-any-any-{any but Glu or Asp}.
  • The standard one-letter code for amino acids.
  • 'x': any amino acid.
  • '[ ]': residues allowed at the position.
  • '{ }': residues forbidden at the position.
  • '( )': repetition of a pattern element is
    indicated in parentheses.
  • x(n) or x(n,m) indicates the number or
    range of repetitions.
  • '-': separates each pattern element.
  • '<': indicates an N-terminal restriction of
    the pattern.
  • '>': indicates a C-terminal restriction of
    the pattern.
  • '.': the period ends the pattern.

20
http://www.blocks.fhcrc.org/
Blocks are multiply aligned ungapped segments
corresponding to the most highly conserved
regions of proteins. The Blocks Database is a
collection of blocks representing known protein
families.
Input: the amino acid sequence of a protein.
Outputs:
  • (1) Protein families with similar block structure.
  • (2) Blocks inside families.
21
http://blocks.fhcrc.org/blocks/blocks_search.html
Blocks: segments corresponding to the most highly
conserved regions of proteins documented in
PROSITE.
InterPro (IP) families
4 Blocks for saposin - IPB003119A-D
Help: http://blocks.fhcrc.org/blocks/help/tutorial/tutorial.html
22
Pfam: a collection of protein families and domains.
http://www.sanger.ac.uk/Pfam/
Query: amino acid sequence of human prosaposin.
View graphics
24
Protein families database
Representative SAPA family proteins
25
The eBLOCKS server
Search a sequence
http://eblocks.stanford.edu/
  • eBLOCKs is a database of protein sequence
    blocks: ungapped alignments of highly conserved
    regions among a protein family or superfamily.
  • eBLOCKs is generated automatically from
    PSI-BLAST results, using protein sequences
    contained in SWISS-PROT.
  • The PSI-BLAST result is then analysed by a
    clustering algorithm to build protein groups
    with different levels of similarity.
  • Each group of sequences is aligned and
    trimmed into blocks.
  • The current eBLOCKs database contains 81,413
    eBLOCKs.
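To make "aligned and trimmed into blocks" concrete, the sketch below cuts a gapped multiple alignment into its gap-free column runs. This is not the eBLOCKs procedure, which the slide does not describe in detail; it only illustrates what "ungapped" means. The function names, the `min_width` parameter, and the toy alignment are all assumptions.

```python
def ungapped_blocks(alignment, min_width=3):
    """Return (start, end) column ranges in which no sequence has a gap ('-').

    Ranges are half-open and only runs of at least min_width columns are kept.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    blocks, start = [], None
    # Iterate one column past the end so a trailing run is closed properly.
    for col in range(length + 1):
        gap_free = col < length and all(row[col] != "-" for row in alignment)
        if gap_free and start is None:
            start = col
        elif not gap_free and start is not None:
            if col - start >= min_width:
                blocks.append((start, col))
            start = None
    return blocks


def cut_blocks(alignment, min_width=3):
    """Return the aligned segments of each ungapped block."""
    return [[row[s:e] for row in alignment]
            for s, e in ungapped_blocks(alignment, min_width)]


# Invented toy alignment.
aln = [
    "MKV-LLAGC--TPEW",
    "MKVALLA-CGHTPEW",
    "MRV-LLSGC--TPDW",
]
print(ungapped_blocks(aln))
print(cut_blocks(aln))
```

A real block database would also score conservation within each run before keeping it; here only the absence of gaps is checked.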

26
http://eblocks.stanford.edu/eblocks/kwsearch.html
Logos: select display format (GIF, PDF, PostScript)
27
eBLOCKS - Results (cont.)
Sequence Logos: a graphical way to display
consensus sequences. Amino acids are colored
according to their chemical and physical
characteristics:
  • Red for acidic amino acids: Glu, Asp
  • Blue for basic amino acids: Lys, Arg, His
  • White (light grey) for polar OH/SH amino acids:
    Ser, Thr, Cys
  • Green for amide amino acids: Asn and Gln
  • Yellow (sulphur) for Methionine
  • Black for hydrophobic amino acids: Ala, Val, Leu, Ile
  • Orange for aromatic amino acids: Tyr, Phe, Trp
  • Purple for proline: Pro
  • Grey for glycine: Gly
  • All other letters are light blue so that they stand out
[Figure: sequence logo with the consensus marked]
The motif of saposin, found using a PSSM (from
PSI-BLAST analysis). Larger letters indicate more
significant amino acid positions. The consensus
is the top amino acid in each column.
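The two ideas in this caption, "the consensus is the top amino acid in each column" and "larger letters indicate more significant positions", can be reproduced on a small ungapped block. The sketch below uses plain residue counts rather than a PSI-BLAST PSSM, measures significance as information content in bits (the usual basis for letter height in sequence logos), and applies no small-sample correction. The function name and the toy block are assumptions.

```python
import math
from collections import Counter


def column_profile(block):
    """Return a list of (consensus_residue, information_bits), one per column.

    block is a list of equal-length, ungapped sequences.
    """
    max_bits = math.log2(20)  # maximum for a 20-letter amino acid alphabet
    profile = []
    for column in zip(*block):
        counts = Counter(column)
        total = len(column)
        # Shannon entropy of the observed residue frequencies in this column.
        entropy = -sum((n / total) * math.log2(n / total)
                       for n in counts.values())
        top_residue = counts.most_common(1)[0][0]
        profile.append((top_residue, max_bits - entropy))
    return profile


# Invented toy block: column 1 is fully conserved, columns 2 and 3 are not.
block = ["CKA", "CRA", "CKG", "CKA"]
profile = column_profile(block)
print("".join(residue for residue, _ in profile))
for residue, bits in profile:
    print(residue, round(bits, 2))
```

A fully conserved column reaches the maximum of log2(20) bits, so its letter would be drawn tallest; the more mixed a column, the lower its value.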
28
http://motif.stanford.edu/emotif/
Discrete motifs represent specific functions.
[Figure: eMOTIF search, results, and motif]