Computational Biology Protein Threading

1
Computational Biology: Protein Threading
Instructor: Prof. Jesús A. Izaguirre
Textbook: Introduction to Computational Molecular Biology, Setubal and Meidanis, Chapter 8.3
References:
R. H. Lathrop and T. F. Smith, Global Optimum Threading with Gapped Alignment and Empirical Pair Score Functions, J. Mol. Biol. (1996) 255, 641-665
Z. Luthey-Schulten, Bioinformatics, NSF Summer School 2003, http://www.ks.uiuc.edu/Training
2
Motivation
  • Study sequence-structure alignment
  • Proteins with similar amino acid sequences
    often have similar structures
  • However, some structurally and functionally
    similar proteins have vastly different sequences

6
Mathematical problem
  • The optimal threading problem is NP-hard when
    variable-length gaps and pairwise amino acid
    interaction energy functions are allowed
  • Lathrop and Smith (1996) used a branch-and-bound
    technique to solve the problem

7
Building the Scaffold I
  • (A) Two structurally similar proteins and a
    common core of four secondary structure segments
    (dark lines I-L)

8
Building the Scaffold II
  • (B) Abstract model showing the interactions
    (spatial adjacencies)

9
One possible threading with novel sequence
  • (C) Small circles represent amino acid "holes"
    in the abstract scaffold. The t_i are the
    indices of the target residues placed into the
    first element of each core segment i (I-L)

10
Defining and splitting sets of threadings (branching)
  • (D) Sets of threadings are defined by lower and
    upper bounds b, d on the index of the sequence
    residue placed in the first element of each core
    segment. A set is split by choosing a core
    segment (here, I), a splitting point (dark
    interior arrow), and subintervals (<, =, >)
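
The splitting step can be sketched in a few lines of Python. This is only an illustration, not code from Lathrop and Smith: the function name `split_threading_set` and the tuple representation of the bounds b and d are assumptions made for the example.

```python
def split_threading_set(lower, upper, segment, point):
    """Split a set of threadings on one core segment.

    lower[i] and upper[i] play the role of the bounds b and d on the
    sequence index placed in the first element of core segment i.
    The chosen segment's interval is cut into the parts below (<),
    at (=), and above (>) the splitting point; empty parts are dropped.
    """
    b, d = lower[segment], upper[segment]
    if not b <= point <= d:
        raise ValueError("splitting point must lie inside the interval")
    pieces = [(b, point - 1), (point, point), (point + 1, d)]
    subsets = []
    for lo, hi in pieces:
        if lo <= hi:
            new_lower, new_upper = list(lower), list(upper)
            new_lower[segment], new_upper[segment] = lo, hi
            subsets.append((tuple(new_lower), tuple(new_upper)))
    return subsets


# Hypothetical bounds for three core segments, split on segment 0 at index 2.
for sub_lower, sub_upper in split_threading_set((1, 5, 9), (4, 8, 12), 0, 2):
    print(sub_lower, sub_upper)
```

The three subsets are disjoint and together cover the parent set, so no threading is lost or counted twice when the search space is divided.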

11
Illustration of branch and bound technique
  • Illustration of the splitting of search space
    into subsets
  • Numbers closest to the set indicate the current
    lower bound
  • The list of current subsets is kept on a priority
    queue (heap), ordered by lower bound, lowest first

12
Threading Algorithm I
  • Initialization
  • Compute a lower bound for the set of all
    threadings
  • Initialize priority queue to contain one entry,
    the set of all threadings with its lower bound

13
Threading Algorithm II
  • Iteration
  • Get the set with the lowest bound from the
    priority queue
  • If the set contains a single threading, it is
    the solution
  • Otherwise, split the set into smaller subsets,
    compute a lower bound for each, and insert them
    into the priority queue
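
The initialization and iteration steps can be sketched as a best-first search in Python. This is a minimal illustration under stated assumptions, not the Lathrop and Smith implementation: the score functions from `make_toy_scores`, the helper names, and the simple lower bound (each score term minimized independently over its interval) are all hypothetical choices made for this example.

```python
import heapq

HYDROPHOBIC = set("AVLIMFWC")  # toy residue classes, an assumption for this sketch


def make_toy_scores(seq, lengths, ideal_gap=6):
    """Return hypothetical singleton and pairwise scores (lower is better)."""
    def g1(i, t):
        # Reward hydrophobic residues placed inside core segment i.
        return -sum(seq[t + k] in HYDROPHOBIC for k in range(lengths[i]))

    def g2(i, j, ti, tj):
        # Penalize consecutive segments whose starts are not ideal_gap apart.
        return 0.5 * abs((tj - ti) - ideal_gap) if j == i + 1 else 0.0

    return g1, g2


def tighten(lo, hi, lengths):
    """Shrink the bounds so segments cannot overlap; None if the set is empty."""
    lo, hi = list(lo), list(hi)
    for i in range(1, len(lo)):
        lo[i] = max(lo[i], lo[i - 1] + lengths[i - 1])
    for i in range(len(hi) - 2, -1, -1):
        hi[i] = min(hi[i], hi[i + 1] - lengths[i])
    if any(l > h for l, h in zip(lo, hi)):
        return None
    return tuple(lo), tuple(hi)


def lower_bound(lo, hi, g1, g2):
    """Minimize every term independently; exact when the set has one threading."""
    m = len(lo)
    total = sum(min(g1(i, t) for t in range(lo[i], hi[i] + 1)) for i in range(m))
    for i in range(m):
        for j in range(i + 1, m):
            total += min(g2(i, j, ti, tj)
                         for ti in range(lo[i], hi[i] + 1)
                         for tj in range(lo[j], hi[j] + 1))
    return total


def split(lo, hi):
    """Split the widest interval into <, =, > parts around its midpoint."""
    i = max(range(len(lo)), key=lambda k: hi[k] - lo[k])
    s = (lo[i] + hi[i]) // 2
    for a, b in [(lo[i], s - 1), (s, s), (s + 1, hi[i])]:
        if a <= b:
            yield lo[:i] + (a,) + lo[i + 1:], hi[:i] + (b,) + hi[i + 1:]


def branch_and_bound(n, lengths, g1, g2):
    """Return (score, segment start indices) of a minimum-score threading."""
    box = tighten([0] * len(lengths), [n - c for c in lengths], lengths)
    if box is None:
        return None
    # Initialization: one entry, the set of all threadings with its bound.
    heap = [(lower_bound(*box, g1, g2), box)]
    while heap:
        bound, (lo, hi) = heapq.heappop(heap)  # set with the lowest bound
        if lo == hi:  # a single threading: its bound is its exact score
            return bound, lo
        for child in split(lo, hi):
            child = tighten(*child, lengths)
            if child is not None:
                heapq.heappush(heap, (lower_bound(*child, g1, g2), child))
    return None


seq = "MKVLAAGIVGLLLAQPSFA"  # made-up sequence for illustration
lengths = [3, 4, 3]          # made-up core segment lengths
g1, g2 = make_toy_scores(seq, lengths)
score, starts = branch_and_bound(len(seq), lengths, g1, g2)
print(score, starts)
```

Because every bound is a true lower bound and is exact for a single threading, the first single-threading set removed from the queue cannot be beaten by anything still waiting in it.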

14
Performance of the B&B threading algorithm
  • Log_10 of search time vs. Log_10 of search space
    size for 60 proteins from the PDB database (ref.
    Lathrop & Smith, 1996, Fig. 2)
  • The line marks a 2 hour computational limit
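
To get a feel for the search space sizes on the horizontal axis, the number of possible threadings can be counted directly. The sketch below assumes the simplest setting, in which core segments keep their order, may not overlap, and loops may have any length from zero upward; the function name `count_threadings` and the example numbers are illustrative, and the paper's own models may impose loop-length limits that shrink the count.

```python
from math import comb, log10


def count_threadings(seq_len, core_lengths):
    """Count the placements of ordered, non-overlapping core segments.

    With m segments, the residues not covered by any segment are
    distributed over m + 1 loop regions (before, between, and after
    the segments), which gives a binomial coefficient.
    """
    m = len(core_lengths)
    free = seq_len - sum(core_lengths)
    if free < 0:
        return 0  # the sequence is too short to hold the core
    return comb(free + m, m)


# Hypothetical example: a 200-residue sequence and six 10-residue segments.
size = count_threadings(200, [10] * 6)
print(size, round(log10(size), 1))
```

The count grows roughly as a polynomial of degree m in the number of free residues, which is why exhaustive enumeration quickly becomes impractical as the number of core segments grows.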

15
Experiment to motivate need for threading
  • Compare sequence vs. sequence alignment and
    sequence vs. structure alignment (threading) for
    two structurally related proteins that have
    little sequence similarity

16
Bioinformatics Tutorial I
  • We will compare the sequences and structures of
    two proteins.
  • Go to the PDB database in two separate windows:
    http://www.rcsb.org/pdb/
  • In one window, search for PDB ID 1sbp and in the
    other 1pot.
  • In the Sulfate Binding Protein window, click
    Sequence Details on the left then Download in
    FASTA Format.

17
Bioinformatics Tutorial II
  • In a new window, go to the BLAST page:
    http://www.ncbi.nlm.nih.gov/blast/
  • Click Standard protein-protein BLAST (blastp).
  • Paste in the sequence for 1sbp.
  • Select pdb from the database menu.

18
Bioinformatics Tutorial III
  • Deselect Do CD-Search.
  • Set an expect threshold of 1000.
  • In the Format section of the options, specify
    1000 descriptions. It is important to set this in
    the Format section of the first page of BLAST
    options, not in the subsequent page that appears.
  • Now click BLAST then Format.

19
Bioinformatics Tutorial IV
  • Approximately at what ranking in the results did
    1pot appear? You can search for 1pot on the web
    page of results using the browser's Find command.
  • What is the ratio between the score for 1pot and
    the maximum possible score for a result in this
    BLAST search?
  • We will now perform threading of the 1sbp
    sequence. Go to the PredictProtein page at
    http://cubic.bioc.columbia.edu/predictprotein/predictprotein.html.

20
Discussion
  • At what rank does 1pot appear in the results?
    Just after the start of the AGAPE section, you
    will find a useful explanation of the columns in
    the table.
  • What is the ratio between the score for 1pot and
    the maximum possible score for a result in the
    threading search?
  • How can you explain the difference between the
    rankings and score ratios seen in the BLAST and
    threading search?