The Genome Access Course: Protein Structure

Provided by: jamesg61

1
The Genome Access Course: Protein Structure
HSP 70 (1DKG, 1DKZ) and prefoldin (1FXK)
2
Protein Structural Elements
  • 2° Structural Elements
  • α-Helix
  • β-Sheet
  • Globular regions
  • Domains
  • SH2
  • Leucine Zipper

3
Domains
  • Discrete structural units
  • Can infer boundaries from sequence analysis
  • 25-500 residues long
  • Most < 200 residues
  • Domains under 50 residues are usually stabilized by
    disulfide (S-S) bonds or metal ions

4
Lipoxygenase Domain
> 500 residues
5
WW Domain
33 residues
6
Domain Determination
  • Internal duplications
  • Detect with a dotplot
  • Transmembrane segments
  • Hydrophobic, 15-35 residues
  • Segments easy to predict
  • Topology and multiple segments harder to predict
  • PHD, TMHMM, TMpred
  • Low complexity segments
  • Composition typically non-random
  • Non-compact folds: coiled coils, rods, flexible
    domain linkers
  • Complexity function (SEG)
  • Small-pitch overlapping repeats (XNU)
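
The three checks listed above (dotplot for internal duplications, hydrophobicity for transmembrane segments, composition for low-complexity segments) can be sketched in a few lines of Python. This is a minimal illustration, not the algorithm used by TMHMM, TMpred, SEG, or XNU: the function names, the exact-match window, and the default window sizes are assumptions made for the example; the hydropathy values are the Kyte-Doolittle scale.

```python
import math
from collections import Counter

# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def repeat_offsets(seq, window=4):
    """Self-dotplot summary: count exact window matches per off-diagonal.

    A strongly populated offset d means the sequence resembles itself
    shifted by d residues, i.e. a candidate internal duplication.
    """
    hits = Counter()
    n = len(seq) - window + 1
    for i in range(n):
        for j in range(i + 1, n):
            if seq[i:i + window] == seq[j:j + window]:
                hits[j - i] += 1
    return hits


def hydropathy_profile(seq, window=19):
    """Mean Kyte-Doolittle hydropathy for each sliding window."""
    scores = [KD[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def shannon_entropy(segment):
    """Compositional complexity in bits; low values suggest low complexity."""
    n = len(segment)
    counts = Counter(segment)
    return -sum(c / n * math.log2(c / n) for c in counts.values())
```

For example, `repeat_offsets("MKTAYIAKQRGGSLDMKTAYIAKQR")` puts all of its hits on offset 15, the distance between the two copies of the repeated segment. The functions expect the 20 standard one-letter residue codes; any other character raises a `KeyError` in `hydropathy_profile`.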

7
Protein Sequence Databases
  • GenPept
  • Swiss-Prot
  • TrEMBL

8
Protein Domain Databases
  • Pfam
  • PROSITE
  • BLOCKS
  • PRINTS
  • CDD
  • ProDom
  • SMART
  • InterPro

9
Pfam
  • HMM family profiles constructed by hand
  • Structural data in alignments
  • No hierarchy
  • No specific compositional bias
  • Good graphical output

10
Pfam-A and Pfam-B
  • Pfam-A (75)
  • Curated, annotated families
  • Pfam-B (30)
  • Families derived automatically from ProDom
  • Other

11
PROSITE
  • Protein fingerprint database (fingerprints are
    groups of conserved motifs that characterize a
    protein family)
  • Regular grammar for describing patterns (e.g.
    [EDQ]-x-G-x-[DN]-A-x-x-[GALI])
  • Profile search is sensitive, but low coverage
    (signaling)
  • Pattern search has high false positive rate
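
Because the pattern grammar is regular, a pattern can be translated into an ordinary regular expression. The sketch below uses the slide's example in bracket notation, [EDQ]-x-G-x-[DN]-A-x-x-[GALI], and covers only the common PROSITE-style elements (`x`, `[...]`, `{...}`, `x(n,m)`, and the `<`/`>` anchors). The function name `prosite_to_regex` and the test sequence are hypothetical, chosen for illustration.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # {P} means "any residue except P"  ->  [^P]
        element = element.replace("{", "[^").replace("}", "]")
        # x(2,4) means "2 to 4 of any residue"  ->  x{2,4}
        element = element.replace("(", "{").replace(")", "}")
        # x means "any residue"  ->  .
        element = element.replace("x", ".")
        # < and > anchor the pattern to the N- and C-terminus
        element = element.replace("<", "^").replace(">", "$")
        parts.append(element)
    return "".join(parts)


regex = prosite_to_regex("[EDQ]-x-G-x-[DN]-A-x-x-[GALI]")
match = re.search(regex, "MKEAGLDAKRGTT")  # made-up test sequence
```

Here `regex` is `[EDQ].G.[DN]A..[GALI]` and `match.group()` is `EAGLDAKRG`. A short pattern like this matches many unrelated sequences by chance, which is the high false positive rate noted above.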

12
BLOCKS
  • Highly conserved, ungapped MSAs
  • Derived from PROSITE

13
PRINTS
  • Fingerprints are sets of ungapped weight matrices
  • Hierarchical classification for important
    families
  • Families, domains, and proteins

14
CDD
  • Conserved Domain Database (NCBI)
  • Linked into other NCBI resources
  • Includes Pfam and SMART domains (but does not
    give the same answer)

15
SMART
  • Simple Modular Architecture Research Tool
  • Collected by Ponting and Bork (641 HMMs)
  • Focuses on
  • Signaling Domains
  • Extracellular domains
  • Nuclear domains
  • High quality; nice graphics

16
Alignment of Representative Members
Profile-HMM built with HMMer 2.0
Search Protein DB
Description
Full alignment
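
The workflow on this slide (seed alignment, then profile, then database search) can be illustrated with a much simpler model than a profile HMM. The sketch below builds an ungapped position-specific scoring matrix with pseudocounts and slides it along a sequence. It is not HMMER and has no insert or delete states; the function names, the uniform background, and the toy alignment are assumptions made for the example.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(alignment, pseudocount=1.0):
    """Log-odds score per residue per column of an ungapped alignment."""
    background = 1.0 / len(ALPHABET)  # assumed uniform background
    total = len(alignment) + pseudocount * len(ALPHABET)
    profile = []
    for col in range(len(alignment[0])):
        counts = Counter(seq[col] for seq in alignment)
        profile.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in ALPHABET
        })
    return profile


def best_hit(profile, sequence):
    """Return (score, start) of the highest-scoring window in sequence."""
    width = len(profile)
    return max(
        (sum(profile[k][sequence[i + k]] for k in range(width)), i)
        for i in range(len(sequence) - width + 1)
    )


# Toy seed alignment of "representative members" (made up for illustration).
seed = ["GDSGG", "GDSGA", "GDAGG"]
profile = build_profile(seed)
score, start = best_hit(profile, "MKTLGDSGGPLV")
```

With this toy data the best window starts at index 4, where the query contains the consensus `GDSGG`. A real profile HMM additionally models insertions and deletions, so it can score sequences whose motif length varies.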
17
ProDom
  • Profiles automatically built from PSI-BLAST
    alignments of Swiss-Prot
  • No annotation
  • As with other automated DBs (Pfam-B, DOMO),
    useful for seeing if region appears in different
    contexts

18
InterPro
  • Pfam, SMART, ProDom, PRINTS, and PROSITE domains
  • High quality annotation

19
Comparison of Protein Family DBs
Pfam
SMART
CDD
PROSITE
SRS
20
Protein Sequence Analysis
  • Biochemical/biophysical properties
  • Secondary Structure
  • Super-secondary (signal peptides, domains,
    motifs)
  • 3D prediction (Threading)

21
Amphipathic Helix
Edge Strand
Buried Strand
24
Viewing 3D Structures
  • Cn3D
  • Chime
  • RasMol
  • Protein Explorer

27
Predicting Structure from Sequence
  • A 100 amino acid protein has 3^200 possible backbone
    configurations
  • Threading
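
To give a sense of scale for the backbone count, the arithmetic can be checked directly. One common reading of the figure, assumed here because the slide does not spell it out, is two backbone dihedral angles per residue with about three favoured states each, giving 3^(2 × 100) = 3^200 for 100 residues. The sampling rate in the sketch is a hypothetical value chosen only for illustration.

```python
residues = 100
angles_per_residue = 2   # assumed: phi and psi
states_per_angle = 3     # assumed: three favoured states per angle

# Python integers are arbitrary precision, so this value is exact.
configurations = states_per_angle ** (angles_per_residue * residues)
digits = len(str(configurations))

# Hypothetical sampling rate, for illustration only.
samples_per_second = 1e13
seconds_per_year = 365.25 * 24 * 3600
years_to_enumerate = configurations / samples_per_second / seconds_per_year

print(f"3^200 has {digits} digits (about {configurations:.2e})")
print(f"Exhaustive search at the assumed rate: about {years_to_enumerate:.1e} years")
```

The count is a 96-digit number, roughly 2.66 × 10^95, so exhaustive enumeration is out of reach. Methods such as threading avoid this by restricting the search to known folds.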

28
Protein Structure Prediction is Quite a Challenge
29
3D Structure Prediction
  • UCLA-DOE
  • SwissMODEL
  • CPHmodels

30
Methods for Aligning Structures
  • (Double) Dynamic Programming
  • Distance Matrix
  • Gibbs sampling
  • Branch-and-bound searching
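
Of the approaches listed, the distance-matrix idea is the easiest to sketch: the matrix of pairwise distances within a structure does not change when the structure is rotated or translated, so two structures can be compared without superimposing them first. The example below is a toy under stated assumptions: the coordinates are made up, the two structures have equal length with residues already paired, and the function names are hypothetical. Real methods also have to search for which residues correspond.

```python
import math


def distance_matrix(coords):
    """All-against-all distances between points (e.g. C-alpha atoms)."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def mean_abs_difference(m1, m2):
    """Average absolute difference between two equal-sized distance matrices."""
    n = len(m1)
    return sum(
        abs(m1[i][j] - m2[i][j]) for i in range(n) for j in range(n)
    ) / (n * n)


# Made-up three-residue "structures" with 3.8 Å spacing.
bent = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
# The same shape rotated 90 degrees about z and then translated.
bent_moved = [(10.0, 5.0, 1.0), (10.0, 8.8, 1.0), (6.2, 8.8, 1.0)]
# A different shape: the three residues in a straight line.
straight = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]

same = mean_abs_difference(distance_matrix(bent), distance_matrix(bent_moved))
different = mean_abs_difference(distance_matrix(bent), distance_matrix(straight))
```

`same` is zero up to floating-point rounding, while `different` is clearly positive, even though no superposition was computed. `math.dist` requires Python 3.8 or later.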

31
HMMSTR: Local Sequence-Structure Correlations
  • Constructed using motif clustering
  • No gaps or insertion states
  • Non-linear, highly branching model
  • Models local motifs common to all proteins