Sequence Alignment - PowerPoint PPT Presentation

Title: PowerPoint Presentation
Author: narcis
Last modified by: malboobi
Created Date: 9/9/2002 10:41:39 AM
Document presentation format: On-screen Show

Slides: 18
Provided by: nar53

Transcript and Presenter's Notes

1
Sequence Alignment
2
Two general methods for sequence alignment
  • Global alignment considers similarity across the
    full extent of the sequences, e.g. MegAlign
  • Local alignment focuses on regions of similarity
    in parts of the sequences only, e.g. BLAST
    programs.
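The difference between the two methods can be sketched with a small dynamic programming routine. This is an illustration only, not how MegAlign or BLAST are implemented; the function name `alignment_score` and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for the example.

```python
def alignment_score(a, b, local=False, match=1, mismatch=-1, gap=-1):
    """Best global score (whole sequences) or best local score (any region)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment: gaps at the start of either sequence are penalized.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if local:
                cell = max(cell, 0)      # a local alignment may restart anywhere
                best = max(best, cell)   # and may end anywhere
            score[i][j] = cell
    return best if local else score[-1][-1]


print(alignment_score("ACGT", "TTACGTTT"))              # global: flanking gaps cost points
print(alignment_score("ACGT", "TTACGTTT", local=True))  # local: only the shared region counts
```

With these assumed scores, the short sequence scores higher locally than globally because the unmatched flanking residues of the longer sequence are ignored rather than charged as gaps.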

3
Questions
  • How similar are two sequences?
  • What is the best alignment between the two
    sequences?
  • How should alignments be scored?
  • If gaps are allowed, how should they be
    scored?
Three things are required
  • a means of scoring matches and mismatches,
  • a means of scoring gaps, and
  • a method of using the two to evaluate numerous
    possible alignments.

4
Sequence 1  ALCPQCDIE
            ALC  CD E
Sequence 2  ALCAKCDVE
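As a rough sketch of the simplest scoring scheme, the ungapped alignment above can be scored position by position. The values match +1 and mismatch -1 and the helper names are assumptions for illustration; the slides do not specify a scheme.

```python
def score_ungapped(seq1, seq2, match=1, mismatch=-1):
    """Sum a fixed match/mismatch score over two equal-length sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("ungapped scoring needs sequences of equal length")
    return sum(match if x == y else mismatch for x, y in zip(seq1, seq2))


def match_line(seq1, seq2):
    """Middle line of the alignment: the residue where both agree, else a space."""
    return "".join(x if x == y else " " for x, y in zip(seq1, seq2))


seq1, seq2 = "ALCPQCDIE", "ALCAKCDVE"
print(seq1)
print(match_line(seq1, seq2))
print(seq2)
print(score_ungapped(seq1, seq2))  # 6 matches and 3 mismatches give 3
```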
5
Grouping of amino acids based on
physico-chemical properties important in protein
structures.
6
Commonly used substitution matrices are
  • Point Accepted Mutation (PAM) matrices,
    e.g. PAM250
  • BLOcks SUBstitution Matrix (BLOSUM) matrices,
    e.g. BLOSUM62
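A substitution matrix replaces the single mismatch score with a score for every residue pair, so that conservative substitutions are penalized less. The sketch below uses a toy stand-in, not PAM250 or BLOSUM62: the residue groups and the scores (+2 identical, +1 same group, -1 otherwise) are invented for illustration, whereas the real matrices are derived from observed substitution frequencies.

```python
# Hypothetical physico-chemical groups, used only for this toy example.
GROUPS = ["AVLIM", "FWY", "STNQ", "KRH", "DE", "CGP"]
GROUP_OF = {aa: idx for idx, members in enumerate(GROUPS) for aa in members}


def toy_substitution_score(x, y):
    """Toy score: identical > same group > different group."""
    if x == y:
        return 2
    if GROUP_OF[x] == GROUP_OF[y]:
        return 1
    return -1


def score_with_matrix(seq1, seq2, score_fn=toy_substitution_score):
    """Score an ungapped alignment using a pairwise substitution score."""
    return sum(score_fn(x, y) for x, y in zip(seq1, seq2))


# I/V is a conservative change (same toy group), P/A and Q/K are not.
print(score_with_matrix("ALCPQCDIE", "ALCAKCDVE"))
```

To use a published matrix instead, `score_fn` would look up the pair in that matrix; the rest of the scoring stays the same.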

7
Gap penalties
  • Mutational events include not only substitutions
    but also insertions and deletions.
  • Affine gap penalties impose an 'opening' penalty
    for starting a gap and a smaller 'extension'
    penalty for each additional position in an
    already opened gap, so one long gap costs less
    than several short ones.

Sequence 1  ALCPQCDIE
            ALC CDE
Sequence 2  ALCA--DVE
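A minimal sketch of affine gap scoring, applied to the gapped alignment above. The penalty values (opening -4, extension -1), the match and mismatch scores, and the function names are assumptions; here the opening penalty is charged for the first gap position and the extension penalty for each further position.

```python
def affine_gap_cost(length, gap_open=-4, gap_extend=-1):
    """Cost of one gap: opening penalty once, extension for each extra position."""
    if length <= 0:
        return 0
    return gap_open + (length - 1) * gap_extend


def score_gapped(seq1, seq2, match=1, mismatch=-1, gap_open=-4, gap_extend=-1):
    """Score an alignment in which '-' marks a gap position.

    Simplification: consecutive gap columns count as one gap even if the
    gap switches from one sequence to the other.
    """
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have equal length")
    total, in_gap = 0, False
    for x, y in zip(seq1, seq2):
        if x == "-" or y == "-":
            total += gap_extend if in_gap else gap_open
            in_gap = True
        else:
            total += match if x == y else mismatch
            in_gap = False
    return total


print(affine_gap_cost(1), affine_gap_cost(2))   # one position vs. two positions
print(score_gapped("ALCPQCDIE", "ALCA--DVE"))
```

Under these assumed values, two single-position gaps would cost -8, while one two-position gap costs -5, which is the behaviour the affine scheme is meant to produce.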
8
Sequence Search
9
Sensitivity versus Speed
  • FASTA looks for exactly matching 'words'.
  • BLAST uses a scoring matrix, so similar words
    can match as well as identical ones.

10
BLAST (Basic Local Alignment Search Tool)
  • The BLAST programs have been designed for speed,
    with a minimal sacrifice of sensitivity.
  • They include a set of similarity search programs
    designed to explore all of the available sequence
    databases regardless of whether the query is
    protein or DNA.
  • The scores assigned in a BLAST search have a
    well-defined statistical interpretation, making
    real matches easier to distinguish from random
    background hits.
  • Local alignment may produce more biologically
    meaningful and sensitive results.
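The slides do not give the formula behind the statistical interpretation. In the standard Karlin-Altschul form, the expected number of chance hits with raw score at least S is E = K·m·n·e^(-λS), where m and n are the query and database lengths. The sketch below assumes that form; the values of λ and K are placeholders for illustration, since the real values depend on the scoring matrix and gap penalties.

```python
import math

# Placeholder parameters for illustration only; real values depend on
# the scoring system used in the search.
LAMBDA = 0.318
K = 0.134


def bit_score(raw_score, lam=LAMBDA, k=K):
    """Normalize a raw score so that it is independent of the scoring system."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def e_value(raw_score, query_len, db_len, lam=LAMBDA, k=K):
    """Expected number of chance hits with at least this raw score."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


for s in (30, 50, 70):
    print(s, round(bit_score(s), 1), e_value(s, query_len=100, db_len=1_000_000))
```

The E-value falls exponentially as the score rises, which is why high-scoring matches stand out from random background hits.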

11
Dynamic programming
  • First described in the 1950s.
  • First applied to sequence alignment by Needleman
    and Wunsch in 1970.
  • Works by breaking the original problem into
    smaller and smaller subproblems until the
    subproblems have a trivial solution, and then
    using those solutions to construct solutions for
    larger and larger portions of the original
    problem.
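The idea can be sketched as a Needleman-Wunsch style global alignment: each cell of a table holds the best score for aligning two prefixes, and the alignment is recovered by tracing back from the last cell. The linear gap penalty and the scores (match +1, mismatch -1, gap -1) are simplifying assumptions, not values taken from the slides.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for a global alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Trivial subproblems: a prefix aligned against nothing is all gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    # Larger subproblems are built from the three neighbouring smaller ones.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Traceback: walk from the last cell to the origin, re-deriving each choice.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


best, top, bottom = needleman_wunsch("ACGT", "AGT")
print(best)
print(top)
print(bottom)
```

The table has (len(a)+1) × (len(b)+1) cells, so the work grows with the product of the sequence lengths instead of with the number of possible alignments.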

12
All BLAST programs take the following steps
  • The query is divided into short, overlapping
    words (e.g. 3 residues for an amino acid
    sequence, 11 for a nucleotide sequence).
  • Words with simple (low-complexity) compositions
    are filtered out.
  • The remaining words are searched for in the
    databases.
  • After the sequences matching each word are
    found, each match is extended in both directions
    until the high-scoring segment pairs (HSPs) are
    found.
  • HSPs are reported to the client.

Query     MNPLSSSGQPHTLM
Words     MNP NPL PLS LSS SSS SSG SGQ GQP QPH PHT HTL TLM
Database  MNGPLSSSGQTSTSPH (matching word: LSS)
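The word and extension steps can be sketched on the example above. This is a deliberately simplified illustration, not the BLAST implementation: it uses exact word matches only, omits the low-complexity filter, scores with an assumed match +1 and mismatch -1, and stops extending once the running score falls an assumed `drop_off` of 2 below its best value. All function names are hypothetical.

```python
def split_into_words(query, word_size=3):
    """All overlapping words of the query, with their start positions."""
    return [(i, query[i:i + word_size]) for i in range(len(query) - word_size + 1)]


def find_word_hits(query, subject, word_size=3):
    """(query_pos, subject_pos) for every exact word match."""
    index = {}
    for j in range(len(subject) - word_size + 1):
        index.setdefault(subject[j:j + word_size], []).append(j)
    return [(i, j) for i, word in split_into_words(query, word_size)
            for j in index.get(word, [])]


def _extend(query, subject, qi, si, step, match, mismatch, drop_off):
    """Walk from (qi, si) in direction `step`; return (best gain, residues kept)."""
    run = best = kept = n = 0
    while 0 <= qi < len(query) and 0 <= si < len(subject):
        run += match if query[qi] == subject[si] else mismatch
        n += 1
        if run > best:
            best, kept = run, n
        elif best - run >= drop_off:
            break
        qi += step
        si += step
    return best, kept


def extend_hit(query, subject, q_pos, s_pos, word_size=3,
               match=1, mismatch=-1, drop_off=2):
    """Ungapped extension of one word hit in both directions."""
    right_gain, right_len = _extend(query, subject, q_pos + word_size,
                                    s_pos + word_size, 1, match, mismatch, drop_off)
    left_gain, left_len = _extend(query, subject, q_pos - 1, s_pos - 1,
                                  -1, match, mismatch, drop_off)
    score = word_size * match + left_gain + right_gain
    q_seg = query[q_pos - left_len:q_pos + word_size + right_len]
    s_seg = subject[s_pos - left_len:s_pos + word_size + right_len]
    return score, q_seg, s_seg


query, subject = "MNPLSSSGQPHTLM", "MNGPLSSSGQTSTSPH"
print([word for _, word in split_into_words(query)])
print(find_word_hits(query, subject))
print(extend_hit(query, subject, 3, 4))  # extend the LSS hit
```

In this example the LSS hit grows into the segment PLSSSGQ, which both sequences share.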
13
BLAST Programs
  • BLASTN: compares a nucleotide query sequence
    against a nucleotide sequence database.
  • BLASTP: compares an amino acid query sequence
    against a protein sequence database.
  • BLASTX: compares a nucleotide query sequence
    translated in all six reading frames against a
    protein sequence database.
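Translation in all reading frames means three frames on the given strand and three on its reverse complement. A minimal sketch, assuming the standard genetic code and an unambiguous A/C/G/T sequence; the function names and the frame labels (+1 to +3, -1 to -3) are illustrative conventions, not BLAST output.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Sequence of the opposite strand, read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate from the first base; a trailing partial codon is dropped
    and any codon not in the table becomes 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frame_translations(dna):
    """Map frame label to protein for all six reading frames."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


for label, protein in six_frame_translations("ATGGCCTAA").items():
    print(label, protein)
```

Stop codons appear as `*`; each of the six translations would then be compared against the protein database.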

14
  • TBLASTN: compares a protein query sequence
    against a nucleotide sequence database
    dynamically translated in all reading frames.
  • TBLASTX: compares the six-frame translations of
    a nucleotide query sequence against the six-frame
    translations of a nucleotide sequence database.
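The five programs can be summarized as a lookup on query type and database type. The helper below is a hypothetical convenience for illustration, not part of any BLAST distribution; it returns the names in lowercase, as the NCBI BLAST+ command-line programs are named.

```python
def choose_blast_program(query_type, database_type, translated=False):
    """Pick a BLAST program from the sequence types.

    `translated` only matters for nucleotide vs. nucleotide searches, where
    it selects a protein-level comparison of both sequences.
    """
    key = (query_type.lower(), database_type.lower())
    if key == ("nucleotide", "nucleotide"):
        return "tblastx" if translated else "blastn"
    programs = {
        ("protein", "protein"): "blastp",
        ("nucleotide", "protein"): "blastx",
        ("protein", "nucleotide"): "tblastn",
    }
    if key not in programs:
        raise ValueError(f"unsupported combination: {key}")
    return programs[key]


print(choose_blast_program("nucleotide", "protein"))
print(choose_blast_program("protein", "nucleotide"))
print(choose_blast_program("nucleotide", "nucleotide", translated=True))
```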

15
If your sequence is NUCLEOTIDE

17
BLAST search examples