FASTA and BLAST - PowerPoint PPT Presentation

1
FASTA and BLAST
  • Chitta Baral

2
FASTA Basic Steps
  • Step 1
  • Set a word size (usually 6 for DNA and 2 for
    proteins).
  • Make a dot plot of the word matches between the
    two sequences.
  • Find the long diagonals (or high-scoring regions).
  • Step 2
  • Score the 10 best diagonal runs using a scoring
    matrix. (allow mismatches, end extensions,
    joining of two diagonals but no gaps)
  • (init1 = score of the single best sub-alignment
    found in this stage.)
  • Step 3
  • Merge non-overlapping diagonal runs to allow gaps
    (ins/del).
  • Score of joined regions = sum of individual
    scores - penalty
  • Score of the highest scoring region at the end of
    this step is called initn.
  • Step 4
  • Use a variant of the Smith-Waterman algorithm on a
    narrow band around the region that scored initn,
    and construct an optimal alignment of this region.
  • Modifications
  • In Step 4, use a band around init1.
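
Steps 1 and 3 above can be sketched in a few lines of Python. This is only an illustrative sketch, not the FASTA program's code: the function names are hypothetical, words are matched exactly, diagonals are ranked by hit count rather than by a scoring matrix, and the joining penalty is assumed to be a fixed amount per join.

```python
from collections import defaultdict


def word_positions(seq, k):
    """Map each k-letter word to the positions where it starts in seq."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def diagonal_hits(query, target, k):
    """Count exact k-word matches per diagonal (target pos - query pos)."""
    index = word_positions(query, k)
    hits = defaultdict(int)
    for j in range(len(target) - k + 1):
        for i in index.get(target[j:j + k], ()):
            hits[j - i] += 1
    return dict(hits)


def best_diagonals(query, target, k, top=10):
    """Return up to `top` (diagonal, hit count) pairs, most hits first."""
    hits = diagonal_hits(query, target, k)
    return sorted(hits.items(), key=lambda item: (-item[1], item[0]))[:top]


def joined_score(run_scores, join_penalty):
    """Step 3: sum of run scores minus a penalty per join (assumed form)."""
    return sum(run_scores) - join_penalty * (len(run_scores) - 1)
```

For example, `best_diagonals("ACGTACGT", "TTACGTACGTTT", 4)` ranks diagonal 2 first with 5 word hits, which is where the query sits inside the target. A real implementation would rescore these runs with a scoring matrix (Step 2) before joining them.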

3
BLAST basic steps
  • Step 1
  • Set a word size (3 for protein and 11 for DNA).
  • Create a word list for the query sequence.
  • E.g. (with 2-letter words) qlnfsagw → ql, ln, nf,
    fs, sa, ag, gw
  • Expand the list (using a threshold T, say 8)
  • ql → ql, qm, hl, zl
  • ln → ln, lb
  • nf → nf, af, ny, df, qf, ef, gf, hf, kf, sf, tf,
    bf, zf
  • fs → fs, fa, fn, fd, fg, fp, ft, fb, ys
  • sa → none
  • ag → ag
  • gw → gw, aw, rw, nw, dw, qw, ew, hw, iw, kw, mw,
    pw, sw, tw, vw, bw, zw, xw
  • Step 2
  • Scan through the database sequence and, whenever a
    word in the list is found, try to extend the match
    in both directions (no gaps) to reach a score
    above a threshold S. While extending, use a
    parameter L that defines how long an extension
    will be tried to raise the score over S.
  • Modifications of Step 2
  • Original BLAST: extension is continued as long as
    the score continues to increase.
  • Another version: extension is stopped when the
    accumulated score stops increasing and has
    fallen a certain amount below the best score
    found.
  • Blast2 (gapped BLAST)
  • A lower value of T is used.
  • After extension, try to combine (allowing gaps).
  • Find the maximal scoring segment. Use the
    Smith-Waterman algorithm on a band around this
    segment (as in FASTA).
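
The word list of Step 1 and the ungapped extension of Step 2 can be sketched as follows. This is a minimal sketch under stated assumptions, not BLAST's actual code: the function names are hypothetical, `pair_score` is a toy match/mismatch score standing in for a real substitution matrix, the neighbor expansion with threshold T is left out, and `x_drop` plays the role of the "certain amount below the best score" from the second extension rule.

```python
def query_words(query, w):
    """List every overlapping word of length w in the query, in order."""
    return [query[i:i + w] for i in range(len(query) - w + 1)]


def pair_score(a, b, match=1, mismatch=-1):
    """Toy substitution score (assumption); BLAST uses a scoring matrix."""
    return match if a == b else mismatch


def extend_one_way(query, subject, qi, si, step, x_drop):
    """Extend without gaps in one direction; return (best gain, its length)."""
    score = best = best_len = length = 0
    qi, si = qi + step, si + step
    while 0 <= qi < len(query) and 0 <= si < len(subject):
        score += pair_score(query[qi], subject[si])
        length += 1
        if score > best:
            best, best_len = score, length
        elif best - score > x_drop:
            break  # fell too far below the best score seen so far
        qi, si = qi + step, si + step
    return best, best_len


def extend_hit(query, subject, qi, si, w, x_drop=2):
    """Extend a word hit both ways; return (score, query start, query end)."""
    seed = sum(pair_score(query[qi + n], subject[si + n]) for n in range(w))
    right, right_len = extend_one_way(query, subject, qi + w - 1, si + w - 1, 1, x_drop)
    left, left_len = extend_one_way(query, subject, qi, si, -1, x_drop)
    return seed + left + right, qi - left_len, qi + w + right_len
```

`query_words("qlnfsagw", 2)` reproduces the word list from the slide. The extended segment would then be kept only if its score reaches the threshold S.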

4
Homework (due 3/31/03)
  • Compare BLAST and FASTA.
  • (Hint: Read the external pointers in the class
    notes page.)