Structural Bioinformatics - PowerPoint PPT Presentation

About This Presentation
Title:

Structural Bioinformatics

Description:

Primary structure => Secondary structure, Tertiary structure => Function ... Level of abstraction highly indicative of function. Databases: PDB, SCOP, CATH, FSSP ...

Slides: 13
Provided by: deendayald

Transcript and Presenter's Notes



1
Structural Bioinformatics
  • Primary structure
    • 20^100 possible sequences (for a 100-residue chain), but only about 10^3 to 10^4 families
    • High spatial locality
  • Secondary structure
    • Helix, Sheet, Loop, Coil
    • Intermediate spatial locality
  • Tertiary structure
    • Only so many folds
      • Evolutionary bias?
      • Only so many structures? Smaller folding space?
    • Low spatial locality
  • Primary structure => Secondary structure,
    Tertiary structure => Function
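
To give a feel for the size of sequence space on this slide, here is a minimal sketch. It assumes the 20 standard amino acids and a chain length of 100 residues (the reading of "20^100" taken here); the function name is hypothetical.

```python
# Size of sequence space for a chain of a given length,
# assuming an alphabet of the 20 standard amino acids.
def sequence_space(length: int, alphabet_size: int = 20) -> int:
    return alphabet_size ** length

n_sequences = sequence_space(100)

# Python integers are arbitrary precision, so the count is exact.
n_digits = len(str(n_sequences))
print(f"20^100 has {n_digits} decimal digits")
```

Set against roughly 10^3 to 10^4 families, the count shows how small a part of sequence space the known families represent.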

2
Structural Bioinformatics
  • Macromolecular complex >
  • Protein > (No. of species: 10^4)
  • Fold > (10^3)
  • Domain > (10^3 to 10^4)
  • Module >
  • Motif >
  • Residue >
  • Atom

Level of abstraction (figure label)

3
Primary structure
  • Determined
    • Experimentally
      • Sequencing
    • Computationally
      • Proteome prediction from genome
  • Finite number of real-world families based on
    sequence similarity
  • Significance: sine qua non
  • Databases: Swiss-Prot, PIR, GenPept

4
Secondary structure
  • Determined
    • Experimentally
      • Circular dichroism, NMR, Raman spectroscopy
    • Computationally
      • Sliding window context analysis
      • Periodicity analysis
  • Significance
    • Higher order building block
    • Mechanistic significance in protein folding
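
The sliding-window idea can be sketched as follows. This is a toy illustration, not a published predictor: the set of helix-favoring residues, the window width, and the threshold are all assumptions chosen for the example.

```python
# Toy assumption: treat A, E, L and M as helix-favoring residues.
HELIX_FORMERS = set("AELM")

def window_scores(seq: str, window: int = 5) -> list:
    """Fraction of helix-favoring residues in the window centered on each position."""
    half = window // 2
    scores = []
    for i in range(len(seq)):
        lo = max(0, i - half)                 # window is truncated at the chain ends
        hi = min(len(seq), i + half + 1)
        context = seq[lo:hi]
        scores.append(sum(r in HELIX_FORMERS for r in context) / len(context))
    return scores

def predict(seq: str, window: int = 5, threshold: float = 0.5) -> str:
    """Label a residue 'H' when its window score reaches the threshold, else '-'."""
    return "".join("H" if s >= threshold else "-" for s in window_scores(seq, window))

print(predict("AAAAAGGGGG"))
```

Each residue is classified from its local sequence context rather than from its own identity alone, which is the point of the window.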

5
Tertiary structure
  • Determined
    • Experimentally
      • X-ray crystallography, NMR
    • Computationally
      • Based on similarity to known structures (homology
        modeling)
      • A priori (ab initio) prediction
  • Significance
    • Level of abstraction highly indicative of
      function
  • Databases: PDB, SCOP, CATH, FSSP

6
Structural representation
  • Average (crystallography) or set (NMR) of
    structures specified
  • Cartesian coordinates (x, y, z) for each atom in
    the structure: one vector for each atom
  • Internal coordinate representation (edges,
    angles): set of inter-atom vectors
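
The relation between the two representations can be sketched by deriving internal coordinates (bond length, bond angle, torsion angle) from Cartesian ones. This is a minimal pure-Python sketch; the function names are hypothetical, and the torsion sign is assumed to follow the usual IUPAC convention.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(dot(a, a))

def bond_length(p, q):
    """Distance between two atoms."""
    return norm(sub(q, p))

def bond_angle(p, q, r):
    """Angle at atom q between bonds q-p and q-r, in degrees."""
    u, v = sub(p, q), sub(r, q)
    c = max(-1.0, min(1.0, dot(u, v) / (norm(u) * norm(v))))  # guard against rounding
    return math.degrees(math.acos(c))

def torsion(p, q, r, s):
    """Signed torsion angle about the q-r bond, in degrees."""
    b1, b2, b3 = sub(q, p), sub(r, q), sub(s, r)
    n1, n2 = cross(b1, b2), cross(b2, b3)      # normals of the two planes
    b2_hat = tuple(x / norm(b2) for x in b2)
    return math.degrees(math.atan2(dot(b2_hat, cross(n1, n2)), dot(n1, n2)))

print(bond_length((0, 0, 0), (3, 4, 0)))
print(bond_angle((1, 0, 0), (0, 0, 0), (0, 1, 0)))
print(torsion((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Internal coordinates do not change when the whole molecule is rotated or translated, which is why they are convenient for comparing structures.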

7
Comparing structures
  • Align structures
    • Exhaustive search: search space is vast (complexity?)
    • Algorithms typically use a subset of coordinates
  • Compute similarity
    • Root mean square deviation (RMSD)
    • Other normalized measures
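
The root mean square deviation over N equivalent atom pairs is sqrt((1/N) * sum of squared distances). The sketch below assumes the two structures are already superposed and that atom i in one corresponds to atom i in the other; it does not perform the optimal superposition itself.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) tuples.

    Assumes the structures are already superposed and the atoms
    are paired by index.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
print(rmsd(a, b))
```

RMSD grows with the number and spread of the atoms compared, which is one reason normalized measures are also used.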

8
Structural alignment
  • Vector based strategies
    • Compare summarized vector representations (VAST)
    • Compare nearest-neighbor vectors (SSAP) by
      dynamic programming
  • Distance matrix comparisons (à la dot-matrix)
    • Subset of internal coordinate space (DALI)

9
VAST alignment
  • Use only subset of atom coordinates
  • Replace atom coordinates with vector coordinates
    corresponding to secondary structural elements
  • Compare sets of vectors to assess similarity
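
The idea of replacing atom coordinates with one vector per secondary structural element can be sketched as follows. This illustrates the representation only and is not the VAST implementation: the end-to-end C-alpha vector, the function names, and the toy coordinates are all assumptions.

```python
import math

def sse_vector(ca_coords, start, end):
    """Vector from the first to the last C-alpha of one element (inclusive indices)."""
    first, last = ca_coords[start], ca_coords[end]
    return tuple(l - f for f, l in zip(first, last))

def angle_deg(u, v):
    """Angle between two element vectors, in degrees."""
    d = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(a * a for a in v))
    c = max(-1.0, min(1.0, d / (nu * nv)))  # guard against rounding
    return math.degrees(math.acos(c))

# Hypothetical toy C-alpha trace: one element along x, the next along y.
trace = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0),
         (3.0, 1.5, 0.0), (3.0, 3.0, 0.0)]
element_a = sse_vector(trace, 0, 2)
element_b = sse_vector(trace, 2, 4)
print(element_a, element_b, angle_deg(element_a, element_b))
```

A protein with a few hundred atoms collapses to a handful of such vectors, so comparing sets of vectors is far cheaper than comparing all atoms.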

10
Double dynamic programming
  • First level
    • Represent each residue by neighborhood vector
    • Compare n versus m neighborhood vectors
    • Generate optimal alignment based on vector
      differences and dynamic programming
  • Second level
    • Add matrix scores if paths cross in a cumulative
      matrix
    • Generate optimal alignment based on the
      cumulative matrix
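
Both levels rely on the same dynamic programming step: find the best-scoring path through an n by m score matrix. The sketch below shows only that step, as a plain global alignment with a linear gap penalty. It does not build the cumulative matrix, and the scoring scheme (2 minus the absolute difference of toy one-dimensional descriptors) and the gap value are assumptions for illustration.

```python
def align(score, gap=-1.0):
    """Global alignment over a precomputed n x m score matrix.

    Returns the best total score and the list of aligned (i, j) index pairs.
    """
    n, m = len(score), len(score[0])
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        best[i][0] = i * gap
    for j in range(1, m + 1):
        best[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(best[i - 1][j - 1] + score[i - 1][j - 1],  # pair i with j
                             best[i - 1][j] + gap,                      # skip residue i
                             best[i][j - 1] + gap)                      # skip residue j
    # Trace back from the bottom-right corner to recover the aligned pairs.
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        if best[i][j] == best[i - 1][j - 1] + score[i - 1][j - 1]:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif best[i][j] == best[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best[n][m], pairs

# Toy descriptors standing in for neighborhood vectors (hypothetical values).
desc_a = [1.0, 2.0, 3.0]
desc_b = [1.0, 3.0]
score = [[2.0 - abs(x - y) for y in desc_b] for x in desc_a]
total, pairs = align(score)
print(total, pairs)
```

In a double dynamic programming scheme, paths found this way at the first level would contribute their scores to a cumulative matrix, and the same routine would then be run once more on that matrix.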

11
Distance matrix based alignment
  • Generate dot-matrix of inter-residue distances,
    using threshold
  • Pick out secondary structure elements based on
    matrix patterns
  • Compare two matrices to generate structural
    alignment
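
The first step, a thresholded matrix of inter-residue distances, can be sketched as below. The sketch assumes one representative point per residue (for example the C-alpha atom); the threshold and the toy coordinates are hypothetical.

```python
import math

def distance_matrix(coords):
    """All-against-all distances between residue positions."""
    return [[math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q))) for q in coords]
            for p in coords]

def contact_map(coords, threshold):
    """Dot-matrix: True where two residues lie within the threshold distance."""
    return [[d <= threshold for d in row] for row in distance_matrix(coords)]

# Hypothetical toy positions, one point per residue.
residues = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
for row in contact_map(residues, threshold=5.0):
    print("".join("#" if hit else "." for hit in row))
```

The matrix is symmetric and depends only on internal distances, so it is unchanged by rotating or translating the structure.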

12
Summary
  • Sequence similarity (> 50% identity) implies
    structural similarity. Converse not necessarily
    true (evolutionary convergence/information
    convergence)
  • Structural similarity algorithms are heuristic
    ways to assess structural similarity
    independent of sequence similarity
  • The number of distinct structures is much smaller
    than the number of possible sequences
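
Percent identity, as used in the first point, can be computed from a pairwise alignment. This minimal sketch assumes two already aligned, equal-length strings with "-" marking gaps, and counts only columns where neither sequence has a gap; other denominators are also in use, so treat that choice as an assumption.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over the gap-free columns of an alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != "-" and y != "-"]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)

print(percent_identity("ACDE-G", "ACDQFG"))
```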