Title: Computational Biology Protein Threading
1 Computational Biology Protein Threading
Instructor: Prof. Jesús A. Izaguirre
Textbook: Introduction to Computational Molecular
Biology, Setubal and Meidanis, Chapter 8.3
References:
- R. H. Lathrop and T. F. Smith, Global Optimum
Threading with Gapped Alignment and Empirical Pair
Score Functions, J. Mol. Biol. (1996) 255, 641-665
- Z. Luthey-Schulten, Bioinformatics, NSF Summer
School 2003, http://www.ks.uiuc.edu/Training
2 Motivation
- Study sequence-structure alignment
- Proteins with similar amino acid sequences
often have similar structures
- However, some structurally and functionally
similar proteins have vastly different sequences
6 Mathematical problem
- This problem is NP-hard when variable gaps and
amino acid interaction energy functions are
allowed
- Lathrop and Smith (1996) used a branch and bound
technique to solve the problem
7 Building the Scaffold I
- (A) Two structurally similar proteins and a
common core of four secondary structure segments
(dark lines I-L)
8 Building the Scaffold II
- (B) Abstract model showing the interactions
(spatial adjacencies)
9 One possible threading with a novel sequence
- (C) Small circles represent amino acid positions
(holes) in the abstract scaffold. The t_i are the
indices of the target residues placed into the
first element of each core segment (I-L)
10 Defining and splitting sets of threadings
(branching)
- (D) Sets of threadings are defined by lower and
upper bounds [b, d] on the sequence index of the
residue placed into the first element of each core
segment. A set is split by choosing a core segment
(here, I), a splitting point (dark interior arrow)
and subintervals (<, =, >)
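The splitting step can be sketched in a few lines. This is a minimal illustration, not the implementation from Lathrop and Smith: the names `ThreadingSet` and `split_set` and the example bounds are assumptions made up here, and the ordering and gap constraints between consecutive core segments are ignored.

```python
from typing import List, Tuple

Interval = Tuple[int, int]           # inclusive bounds (b, d) on one start index
ThreadingSet = Tuple[Interval, ...]  # one interval per core segment


def split_set(tset: ThreadingSet, segment: int, point: int) -> List[ThreadingSet]:
    """Split `tset` on one core segment into the <, =, > subintervals.

    Empty subintervals are dropped, so one to three subsets are returned.
    """
    b, d = tset[segment]
    if not b <= point <= d:
        raise ValueError("split point must lie inside the segment's interval")
    pieces = [(b, point - 1), (point, point), (point + 1, d)]
    return [
        tset[:segment] + (piece,) + tset[segment + 1:]
        for piece in pieces
        if piece[0] <= piece[1]
    ]


# Hypothetical set: four core segments (I-L) with made-up bounds.
example = ((1, 5), (8, 12), (15, 20), (24, 30))
for subset in split_set(example, segment=0, point=3):
    print(subset)
```

Each returned subset differs from the parent only in the interval of the chosen segment, so the three subsets are disjoint and together cover the parent set.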
11 Illustration of branch and bound technique
- Illustration of the splitting of the search space
into subsets
- The number closest to each set indicates its
current lower bound
- The list of current subsets is kept on a priority
queue or heap, ordered from the lowest bound
12 Threading Algorithm I
- Initialization
- Compute a lower bound for the set of all
threadings
- Initialize the priority queue to contain one
entry, the set of all threadings with its lower
bound
13 Threading Algorithm II
- Iteration
- Get the set with the lowest bound from the
priority queue
- If the set contains a single threading, it is the
solution
- Else, split the set into smaller subsets, compute
a lower bound for each, and insert them into the
priority queue
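The initialization and iteration steps above can be sketched as a best-first search over a heap. This is a toy illustration under stated assumptions, not the scoring or bounding scheme of Lathrop and Smith: the `COST` table, the `|t0 - t1|` pair term, the midpoint splitting rule, and all function names are invented for the example, and segment ordering constraints are ignored. The search returns the global optimum only because the lower bound never exceeds the true score of any threading in a set and equals the true score for a set of one threading.

```python
import heapq
from itertools import count
from typing import Callable, List, Tuple

Interval = Tuple[int, int]           # inclusive bounds on one start index
ThreadingSet = Tuple[Interval, ...]  # one interval per core segment

# Made-up cost of placing each of three segments at index 0..3.
COST = [
    [4, 2, 7, 1],
    [3, 6, 0, 5],
    [8, 1, 2, 9],
]


def toy_lower_bound(tset: ThreadingSet) -> int:
    """Lower bound for the toy score: per-segment costs plus |t0 - t1|."""
    single = sum(min(COST[k][lo:hi + 1]) for k, (lo, hi) in enumerate(tset))
    (lo0, hi0), (lo1, hi1) = tset[0], tset[1]
    pair = max(0, lo0 - hi1, lo1 - hi0)  # smallest |t0 - t1| over both intervals
    return single + pair


def split(tset: ThreadingSet) -> List[ThreadingSet]:
    """Split the widest interval at its midpoint into <, =, > pieces."""
    k = max(range(len(tset)), key=lambda i: tset[i][1] - tset[i][0])
    lo, hi = tset[k]
    mid = (lo + hi) // 2
    pieces = [(lo, mid - 1), (mid, mid), (mid + 1, hi)]
    return [tset[:k] + (p,) + tset[k + 1:] for p in pieces if p[0] <= p[1]]


def branch_and_bound(
    start: ThreadingSet, lower_bound: Callable[[ThreadingSet], int]
) -> Tuple[int, Tuple[int, ...]]:
    """Return (score, threading) for the best threading in `start`."""
    tie = count()  # tie-breaker so equal bounds pop in insertion order
    heap = [(lower_bound(start), next(tie), start)]
    while heap:
        bound, _, tset = heapq.heappop(heap)
        if all(lo == hi for lo, hi in tset):  # set holds one threading
            return bound, tuple(lo for lo, _ in tset)
        for sub in split(tset):
            heapq.heappush(heap, (lower_bound(sub), next(tie), sub))
    raise ValueError("empty search space")


START = ((0, 3), (0, 3), (0, 3))  # the set of all threadings in the toy problem
print(branch_and_bound(START, toy_lower_bound))
```

Because the heap always yields the set with the lowest bound, every set still on the queue when a single threading is popped has a bound at least as large, so no better threading can remain unexplored.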
14 Performance of the B&B threading algorithm
- Log_10 of search time vs. Log_10 of search space
size for 60 proteins from the PDB database (ref.
Lathrop and Smith, 1996, Fig. 2)
- The line marks a 2 hour computational limit
15 Experiment to motivate the need for threading
- Compare sequence vs. sequence alignment and
sequence vs. structure alignment (threading) for
two structurally related proteins that have
little sequence similarity
16 Bioinformatics Tutorial I
- We will compare the sequences and structures of
two proteins.
- Go to the PDB database in two separate windows:
http://www.rcsb.org/pdb/
- In one window, search for PDB ID 1sbp and in the
other 1pot.
- In the Sulfate Binding Protein window, click
Sequence Details on the left, then Download in
FASTA Format.
17 Bioinformatics Tutorial II
- In a new window, go to the BLAST page:
http://www.ncbi.nlm.nih.gov/blast/
- Click Standard protein-protein BLAST (blastp).
- Paste in the sequence for 1sbp.
- Select pdb from the database menu.
18 Bioinformatics Tutorial III
- Deselect Do CD-Search.
- Set an expect threshold of 1000.
- In the Format section of the options, specify
1000 descriptions. It is important to set this in
the Format section of the first page of BLAST
options, not in the subsequent page that appears.
- Now click BLAST then Format.
19 Bioinformatics Tutorial IV
- Approximately at what ranking in the results did
1pot appear? You can search for 1pot on the web
page of results using the browser's Find command.
- What is the ratio between the score for 1pot and
the maximum possible score for a result in this
BLAST search?
- We will now perform threading of the 1sbp
sequence. Go to the PredictProtein page at
http://cubic.bioc.columbia.edu/predictprotein/predictprotein.html.
20 Discussion
- At what rank does 1pot appear in the results?
Just after the start of the AGAPE section, you
will find a useful explanation of the columns in
the table.
- What is the ratio between the score for 1pot and
the maximum possible score for a result in the
threading search?
- How can you explain the difference between the
rankings and score ratios seen in the BLAST and
threading searches?
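If the hit table from either search is copied into a list of (ID, score) pairs, the rank and score ratio asked for above can be computed with a small helper. This is only a sketch: `rank_and_ratio` is a hypothetical name, the scores below are placeholders and not real BLAST or threading output, and taking the top-ranked hit's score as the "maximum possible score" is an assumption (it holds when the query's own structure, here 1sbp, is the best hit).

```python
from typing import List, Tuple


def rank_and_ratio(hits: List[Tuple[str, float]], target: str) -> Tuple[int, float]:
    """Return (1-based rank, score / top score) for the first hit matching `target`.

    Hits are ranked by descending score; IDs are matched by case-insensitive prefix.
    """
    if not hits:
        raise ValueError("no hits given")
    ranked = sorted(hits, key=lambda h: h[1], reverse=True)
    top_score = ranked[0][1]
    for rank, (hit_id, score) in enumerate(ranked, start=1):
        if hit_id.lower().startswith(target.lower()):
            return rank, score / top_score
    raise KeyError(f"{target} not found in hits")


# Placeholder values only; substitute the table from your own search.
placeholder_hits = [("1sbp", 600.0), ("xxxx", 90.0), ("yyyy", 45.0), ("1pot", 30.0)]
rank, ratio = rank_and_ratio(placeholder_hits, "1pot")
print(f"rank {rank}, score ratio {ratio:.3f}")
```

Running the helper on both the BLAST table and the threading table gives two (rank, ratio) pairs that can be compared directly in the discussion.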