Databases: Navigating the MSD - PowerPoint PPT Presentation

Provided by: exter89
Transcript and Presenter's Notes

1
Databases: Navigating the MSD
2
MSDlite
  • A simple form-based query system to search the
    MSD Databases
  • Allows multiple search fields to be combined
  • Relatively fast, despite performing complex SQL
    queries

3
MSDlite
  • Strengths
    • simple, easy to use form
    • allows multiple search fields to be combined
    • relatively fast, despite performing quite complex
      SQL queries
  • Weaknesses
    • not exposing the power of a relational database
    • user can't specify the relationship between
      search fields:
      • "name" AND "title" AND "keyword"
      • "name" OR "title" OR "keyword"
      • ( "name" OR "title" ) AND NOT "keyword"
    • the search form is defined by the authors of the
      search system, not the author of a query
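
The three relationships between search fields listed above map directly onto SQL WHERE clauses. The following is a minimal sketch using Python's built-in sqlite3 module; the table `entries`, its columns, and the sample rows are hypothetical and are not the MSD schema.

```python
import sqlite3

# Hypothetical table, columns and rows, chosen only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (id TEXT, name TEXT, title TEXT, keyword TEXT)")
conn.executemany(
    "INSERT INTO entries VALUES (?, ?, ?, ?)",
    [
        ("e1", "lysozyme", "lysozyme at 1.5 A", "hydrolase"),
        ("e2", "kinase", "lysozyme-like fold", "transferase"),
        ("e3", "lysozyme", "mutant lysozyme", "mutant"),
    ],
)

def search(where, params):
    """Return the sorted ids of the rows that satisfy a WHERE clause."""
    sql = f"SELECT id FROM entries WHERE {where} ORDER BY id"
    return [row[0] for row in conn.execute(sql, params)]

term = "%lysozyme%"

# "name" AND "title" AND "keyword"
all_and = search("name LIKE ? AND title LIKE ? AND keyword LIKE ?", (term, term, term))

# "name" OR "title" OR "keyword"
any_or = search("name LIKE ? OR title LIKE ? OR keyword LIKE ?", (term, term, term))

# ( "name" OR "title" ) AND NOT "keyword"
mixed = search(
    "(name LIKE ? OR title LIKE ?) AND keyword NOT LIKE ?",
    (term, term, "%mutant%"),
)
```

A fixed form can offer only one of these combinations; letting the query author choose the operators is what distinguishes the three results.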

4
Complex Searches (Advanced Users)
  • To give the user complete control over their
    query, MSDpro was developed
  • Uses an applet to provide a dynamic "form" that
    lets the user
    • choose the fields to be searched
    • specify relationships between search fields
    • choose result fields and how they are presented
    • perform "complex" sub-queries, e.g. SSM, FASTA
  • MSDpro uses an applet for constructing queries
    and a server to execute them
  • The user describes their query entirely
    graphically, including logical operations such as
    AND, OR and NOT

5
MSD Atlas Pages
You can download coordinates and structure
factors (if available)
6
Structural Similarity: MSDfold
7
If you have to ask...
  • Are there any structures in the PDB that are
    similar to mine?
  • What SCOP and/or CATH family could my structure
    belong to?
  • Can I get some idea about the possible function
    of my protein based on structural similarity with
    others?
  • How do I get a multiple alignment of many of my
    structures?

8
Structure Alignment
Structure alignment may be defined as the
identification of residues occupying equivalent
geometrical positions.
  • Unlike in sequence alignment, residue type is
    neglected
  • Used for
    • measuring structural similarity
    • protein classification and functional analysis
    • database searches

9
Methods
  • Many methods are known
    • Distance matrix alignment (DALI, Holm & Sander,
      EBI)
    • Vector alignment (VAST, Bryant et al., NCBI)
    • Depth-first recursive search on SSEs (DEJAVU,
      Madsen & Kleywegt, Uppsala)
    • Combinatorial extension (CE, Shindyalov & Bourne,
      SDSC)
    • Dynamic programming on Cα (Gerstein & Levitt)
    • Dynamic programming on SSEs (SSA, Singh &
      Brutlag, Stanford University)
    • many others
  • MSDfold (SSM) employs a 2-step procedure
    • Initial structure alignment and superposition
      using SSE graph matching
    • Cα-alignment

10
Graph representation of SSEs
E. M. Mitchell et al. (1990) J. Mol. Biol. 212:151
SSE graphs differ from conventional chemical
graphs only in that they are labelled by vectors
of properties. In graph matching, the labels are
compared with tolerances chosen empirically.
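
As a rough illustration of comparing vector labels with tolerances, the sketch below treats each SSE as a graph vertex and each pair of SSEs as an edge. The label fields (SSE type, length, distance, angle) and every tolerance value are assumptions made for this example; they are not the labels or thresholds used by MSDfold.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Vertex:
    kind: str      # "H" for helix, "E" for strand (assumed encoding)
    length: int    # number of residues in the SSE

@dataclass(frozen=True)
class Edge:
    distance: float  # distance between the two SSEs, in angstroms
    angle: float     # angle between their axes, in degrees

def vertices_match(a, b, length_tol=0.3):
    """Vertices match if the SSE type is identical and the lengths
    differ by at most a relative tolerance (hypothetical value)."""
    if a.kind != b.kind:
        return False
    return abs(a.length - b.length) <= length_tol * max(a.length, b.length)

def edges_match(a, b, dist_tol=2.0, angle_tol=30.0):
    """Edges match if distance and angle both agree within
    absolute tolerances (hypothetical values)."""
    return (
        abs(a.distance - b.distance) <= dist_tol
        and abs(a.angle - b.angle) <= angle_tol
    )
```

A graph-matching search would accept a pairing of two SSEs only if the vertex labels match and every edge to an already matched SSE matches as well.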
11
SSE graph matching
Matching the SSE graphs yields a correspondence
between secondary structure elements, that is,
groups of residues. The correspondence may be
used as an initial guess for structure superposition
and alignment of individual residues.
12
Cα-alignment
  • SSE-alignment is used as an initial guess for
    Cα-alignment
  • Cα-alignment is an iterative procedure based on
    the expansion of shortest contacts at best
    superposition of structures
  • Cα-alignment is a compromise between the
    alignment length N_align and the r.m.s.d. The
    longest contacts are unmapped in order to maximise
    the Q-score
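
The compromise between alignment length and r.m.s.d. can be made concrete with the Q-score. The slides do not give the formula; the sketch below assumes the definition from the published SSM description, Q = N_align² / ((1 + (r.m.s.d./R0)²) · N1 · N2) with R0 = 3 Å, where N1 and N2 are the residue counts of the two structures. The function name and the example numbers are illustrative only.

```python
import math

R0 = 3.0  # angstroms; empirical constant in the assumed Q-score definition

def q_score(n_align, rmsd, n1, n2, r0=R0):
    """Q = N_align^2 / ((1 + (rmsd / r0)^2) * N1 * N2).

    Q equals 1 only when all residues of both structures are aligned
    with zero r.m.s.d., and decreases as residues are left unaligned
    or the r.m.s.d. grows.
    """
    return n_align ** 2 / ((1.0 + (rmsd / r0) ** 2) * n1 * n2)

# Two candidate alignments of the same pair of 100-residue structures:
# a shorter, tighter one and a longer, looser one.
tight = q_score(80, 1.5, 100, 100)
loose = q_score(90, 3.0, 100, 100)
```

In this made-up case the shorter alignment scores higher, which is why dropping the longest contacts can raise Q even though it lowers the alignment length.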

13
Using MSDfold
Discover hitherto unknown relationships
88% structural identity
11% sequence identity
14
MSDfold Search Interface
15
MSDfold Output
  • Table of matched Secondary Structure Elements
  • Table of matched backbone Cα-atoms with distances
    between them at best structure superposition
  • Rotation-translation matrix of best structure
    superposition
  • Visualisation in Jmol and Rasmol
  • r.m.s.d. of Cα-alignment
  • Length of Cα-alignment N_align
  • Number of gaps in Cα-alignment
  • Number of gaps in Ca-alignment
  • Quality score Q
  • Statistical significance scores P(S), Z
  • Sequence identity
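
Two of the outputs above fit together: applying the rotation-translation matrix to one structure and measuring distances between the matched atoms gives the r.m.s.d. The sketch below assumes the matrix is supplied as a 3x3 rotation R plus a translation vector t, applied as x' = R·x + t; the actual layout of the MSDfold output is not shown in the slides, and the coordinates are invented.

```python
import math

def transform(point, rotation, translation):
    """Apply x' = R x + t to a single 3D point."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )

def rmsd(coords_a, coords_b):
    """Root-mean-square distance over paired points (no fitting is done)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

# Invented example: a 90-degree rotation about z followed by a shift along x.
rotation = [[0.0, -1.0, 0.0],
            [1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0]]
translation = (1.0, 0.0, 0.0)

moving = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]   # matched atoms, structure 1
target = [(1.0, 1.0, 0.0), (0.0, 0.0, 1.0)]   # matched atoms, structure 2

superposed = [transform(p, rotation, translation) for p in moving]
value = rmsd(superposed, target)
```

Only the matched atom pairs from the alignment table enter the sum, so the result corresponds to the r.m.s.d. of the alignment rather than of the whole structures.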

16
Results Page For Pairwise Alignment
17
Specific Pairwise Results
18
Structural Alignment
Residue by residue structural alignment result
19
Multiple 3D Alignment
  • More than 2 structures are aligned simultaneously
  • Multiple alignment is not equal to the set of
    all-to-all pairwise alignments
  • Helps to identify common structure motifs for a
    whole family of structures

20
Multiple 3D Alignment Interface
21
Results From Multiple 3D Alignment
22
Conclusions from use of MSDfold
  • residue identity may play a much less significant
    role in protein structure than is often believed
  • as a consequence, the role of residue identity in
    protein function may often be overestimated
  • using sequence identity to assess structural or
    functional features may give more false negatives
    than expected
  • physical-chemical properties of residues should
    be given preference over residue identity in
    structure and function analysis
  • modern methods for structure alignment are
    efficient, so there is little sense in using
    sequence alignment in structure-related studies