Homework grades - PowerPoint PPT Presentation

About This Presentation
Title:

Homework grades

Description:

3. Insertions and deletions. insertion. AHYATPTTT. AH---TPSS. deletion. Occur mostly between secondary structures, in the loop regions. ... – PowerPoint PPT presentation

Number of Views:20
Avg rating:3.0/5.0
Slides: 26
Provided by: Pan92
Category:

less

Transcript and Presenter's Notes

Title: Homework grades


1
Homework grades
0-14 (D) 15-19 (C) 20-24 (B) 25-30(A)
HW1 5 1 9 5
HW2 0 6 4 9
2
Protein folding.Anfinsens experiments.
Assumption amino acid sequence completely and
uniquely determines the protein tertiary
structure. Protein folding problem find native
conformation among the large number of
alternative conformations. Ex polypeptide chain
of 100 residues can have 9100 different
conformations.
3
Protein folding step by step.
Disordered globule hydrophobic inside,
hydrophilic - outside
Extended chain
Native highly ordered conformation
  • Factors which influence the thermodynamics and
    kinetics of protein folding
  • size, amino acid content, hydrophobic/hydrophilic
    content
  • strength of intramolecular interactions, number
    of S-S bonds
  • domain architecture

4
Fold recognition.
  • Unsolved problem direct prediction of protein
    structure from the physico-chemical principles.
  • Solved problem to recognize, which of known
    folds are similar to the fold of unknown protein.
  • Fold recognition is based on observations/assumpti
    ons
  • The overall number of different protein folds is
    limited (1000-3000 folds)
  • The native protein structure is in its ground
    state (minimum energy)

5
Definition of protein folds.
  • Protein fold arrangement of secondary
    structures into a unique topology/tertiary
    structure.
  • Example of alphabeta proteins
  • TIM beta/alpha-barrel contains parallel
    beta-sheet barrel, closed n8, S8
  • strand order 12345678, surrounded by
    alpha-helices
  • NAD(P)-binding Rossmann-fold domains core 3
    layers, a/b/a parallel beta-sheet of 6 strands,
  • order 321456

6
Protein structure prediction.
  • Prediction of three-dimensional structure from
    its protein sequence. Different approaches
  • Homology modeling (predicted structure has a very
    close homolog in the structure database).
  • Fold recognition (predicted structure has an
    existing fold).
  • Ab initio prediction (predicted structure has a
    new fold).

7
Homology modeling.
  • Aims to produce protein models with accuracy
    close to experimental and is used for
  • Protein structure prediction
  • Drug design
  • Prediction of functionally important sites
    (active or binding sites)

8
Steps of homology modeling.
  1. Template recognition initial alignment.
  2. Backbone generation.
  3. Loop modeling.
  4. Side-chain modeling.
  5. Model optimization.
  6. Model validation.

9
1. Template recognition.
  • Recognition of similarity between the target and
    template.
  • Target protein with unknown structure.
  • Template protein with known structure.
  • Main difficulty deciding which template to
    pick, multiple choices/template structures.
  • Template structure can be found by searching for
    structures in PDB using sequence-sequence
    alignment methods.

10
Two zones of sequence alignment.Two sequences
are guaranteed to fold into the same structure if
their length and sequence identity fall into
safe zone.
Sequence identity
100
Safe homology modeling zone
50
Twilight zone
50
100
150
200
Alignment length
11
2. Backbone generation.
  • If alignment between target and template is
    ready, copy the backbone coordinates of those
    template residues that are aligned.
  • If two aligned residues are the same, copy their
    side chain coordinates as well.

12
3. Insertions and deletions.
  • insertion
  • AHYATPTTT
  • AH---TPSS
  • deletion
  • Occur mostly between secondary structures, in the
    loop regions. Loop conformations difficult to
    predict.
  • Approaches to loop modeling
  • Knowledge-based searches the PDB for loops with
    known structure
  • Energy-based an energy function is used to
    evaluate the quality of a loop. Energy
    minimization or Monte Carlo.

13
4. Side chain modeling.
  • Side chain conformations rotamers. In similar
    proteins - side chains have similar
    conformations.
  • If identity is high - side chain conformations
    can be copied from template to target.
  • If identity is not very high - modeling of side
    chains using libraries of rotamers and different
    rotamers are scored with energy functions.

E2
E3
E1
E min(E1, E2, E3)
14
5. Model optimization.
  • Energy optimization of entire structure.
  • Since conformation of backbone depends on
    conformations of side chains and vice versa -
    iteration approach

Predict rotamers
Shift in backbone
15
6. Model validation.
  • Correct bond length and bond angles
  • Correct placement of functionally important sites

gtgt 3.8 Angstroms
16
Classwork I Homology modeling.
  • Go to NCBI Entrez, search for gi461699
  • Do Blast search against PDB
  • Do CD-search.

17
Fold recognition.
  • Goal to find in PDB a fold which best matches a
    given sequence.
  • Since similarity between target and the closest
    to it template is not high, sequence-sequence
    alignment methods fail to find a closest match.
  • Solution threading sequence-structure
    alignment method.

18
Threading method for structure prediction.
  • Sequence-structure alignment, target sequence is
    compared to all structural templates from the
    database.
  • Requires
  • Alignment method (dynamic programing, Monte
    Carlo,)
  • Scoring function, which yields relative score for
    each alternative alignment

19
Scoring function for threading.
Contact-based scoring function depends on the
amino acid types of two residues and distance
between them. Sequence-sequence alignment
scoring function does not depend on the distance
between two residues. If distance between two
non-adjacent residues in the template is less
than 8 Å, these residues make a contact.
20
Scoring function for threading.
Ala
Trp
Tyr
Ile
w is calculated from the frequency of amino acid
contacts in protein structures ai amino acid
type of target sequence aligned with the position
i of the template N- number of contacts
21
Classwork II calculate the score for target
sequence ATPIIGGLPY aligned to template
structure which is defined by the contact matrix.
A T P Y I G L
A -0.2 -0.1 0 -0.1 0.5 -0.2 0.2
T 0.3 -0.1 -0.2 -0.3 0.1 0
P -0.2 -0.4 -0.1 0.1 -0.2
Y -0.4 -0.2 -0.1 -0.2
I 0.3 0.2 0.4
G 0.4 0.2
L 0.3
1 2 3 4 5 6 7 8 9 10
1
2
3
4
5
6
7
8
9
10
22
GenThreader http//bioinf.cs.ucl.ac.uk/psipred.
  1. Predicts secondary structures for target
    sequence.
  2. Makes sequence profiles (PSSMs) for each template
    sequence.
  3. Uses threading scoring function to find the best
    matching profile.
  4. Evaluates sequence-structure quality using neural
    networs.

23
Evaluation of model accuracy
Root Mean Square Deviation
dij (A) - distance between residues i and j in
template structure A dij (B) distance
between residues iand j in predicted structure
of target sequence B. Residues i and j in
template structure A are aligned to residues
iand j in predicted structure B
24
CASP prediction competitions.
25
Classwork III.
  • Go to http//bioinf.cs.ucl.ac.uk/psipred
  • Go over the options of protein structure
    prediction program
  • http//bioinf2.cs.ucl.ac.uk/psiout/271020e9dc74fea
    d.mgen.html
Write a Comment
User Comments (0)
About PowerShow.com