S1 - PowerPoint PPT Presentation

About This Presentation
Title:

S1

Slides: 13
Provided by: cs146
Tags: alignment

Transcript and Presenter's Notes

Title: S1


1
[Figure: guide tree built up step by step over sequences S1 to S5; branch-length labels in the final tree: 0.75, 0.5, 1, 2.75, 1.5, 1]
2
$$\text{Score} = \frac{1}{8}\bigl[\, s(T,V)\,w_1 w_5 + s(T,I)\,w_1 w_6 + s(L,V)\,w_2 w_5 + s(L,I)\,w_2 w_6 + s(K,V)\,w_3 w_5 + s(K,I)\,w_3 w_6 + s(K,V)\,w_4 w_5 + s(K,I)\,w_4 w_6 \,\bigr]$$
  • 1. PEEKSAVTAL
  • 2. GEEKAAVLAL
  • 3. K
  • 4. K

  • 5. V
  • 6. I
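
The score above is an average over all residue pairs between the two aligned groups, with each pair scaled by the two sequence weights. A minimal Python sketch of that calculation follows. The function names and `toy_score` are hypothetical; `toy_score` is a made-up stand-in for a real substitution matrix, and the weights in the usage lines are placeholders rather than values from the slides.

```python
from itertools import product

def toy_score(x, y):
    """Hypothetical residue score, NOT a real substitution matrix:
    2 for identical residues, 1 if both are in the set V/I/L, else 0."""
    if x == y:
        return 2.0
    if x in "VIL" and y in "VIL":
        return 1.0
    return 0.0

def weighted_column_score(col_a, weights_a, col_b, weights_b, s=toy_score):
    """Average of s(x, y) * w_x * w_y over all residue pairs (x from group A, y from group B)."""
    total = 0.0
    for (x, wx), (y, wy) in product(zip(col_a, weights_a), zip(col_b, weights_b)):
        total += s(x, y) * wx * wy
    return total / (len(col_a) * len(col_b))

# Column T, L, K, K (sequences 1-4) against V, I (sequences 5-6): 4 x 2 = 8 pairs.
equal = weighted_column_score("TLKK", [1.0, 1.0, 1.0, 1.0], "VI", [1.0, 1.0])
upweighted = weighted_column_score("TLKK", [1.0, 2.0, 1.0, 1.0], "VI", [1.0, 1.0])
```

With the toy scores, only the L/V and L/I pairs contribute, so doubling the weight of sequence 2 doubles the column score.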
3
CLUSTALW
  • Each sequence is weighted by how similar it is to the
    others
  • - corrects for overrepresentation of a subfamily
  • - weights are derived from the guide tree
  • (in the example, the weights for S5 and S4 are 2.75 and
    1.9375)
  • Details: handling of gaps
  • Problem situations
  • - sequences similar only in smaller regions
  • - large insertion in a sequence
  • - repetitive elements in one sequence
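
The two example weights can be reproduced by walking from each leaf to the root and sharing every branch length equally among the sequences below that branch. The sketch below does this in Python. The tree topology and the internal node names (`n12`, `n34`, `n1234`, `root`) are assumptions: they are a reconstruction from the branch lengths in the slide text, chosen because they yield the stated 2.75 and 1.9375, not something the slides spell out.

```python
# Assumed guide tree: node -> (parent, branch length to parent).
# Reconstructed topology: (((S1, S2), (S3, S4)), S5).
TREE = {
    "S1": ("n12", 1.0), "S2": ("n12", 1.0),
    "S3": ("n34", 1.5), "S4": ("n34", 1.5),
    "n12": ("n1234", 1.0), "n34": ("n1234", 0.5),
    "n1234": ("root", 0.75), "S5": ("root", 2.75),
}

def sequence_weights(tree):
    """Weight of a leaf = sum over its path to the root of
    (branch length / number of leaves sharing that branch)."""
    parents = {parent for parent, _ in tree.values()}
    leaves = [node for node in tree if node not in parents]

    # Count how many leaves sit below each branch.
    below = {node: 0 for node in tree}
    for leaf in leaves:
        node = leaf
        while node in tree:
            below[node] += 1
            node = tree[node][0]

    weights = {}
    for leaf in leaves:
        weight, node = 0.0, leaf
        while node in tree:
            parent, length = tree[node]
            weight += length / below[node]
            node = parent
        weights[leaf] = weight
    return weights

weights = sequence_weights(TREE)
```

For S4 this gives 1.5 + 0.5/2 + 0.75/4 = 1.9375, and S5 keeps its whole branch of 2.75 because nothing shares it.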

4
Deriving the Sequence of a Genome
  • Need a rapid and accurate sequencing method
  • Normal sequencing technique: about 700 b per read
  • BAC: 100-400 kb
  • Random small fragments: 2-10 kb
  • Assemble them
  • http://seqcore.brcf.med.umich.edu/doc/educ/dnapr/sequencing.html

5
Genome alignment and comparison
  • Sequencing: fast, genomes available
  • Annotation: not so fast
  • - identify gene structure, function
  • - cross-species linking
  • Comparative genomics
  • - potential
  • - coding and non-coding regions
  • (e.g. human and mouse)
  • - species-specific regions
  • - selective pressure, evolution, rearrangement

6
Genome alignment and comparison - An example
  • Mouse chromosome 16 and human genome
  • (Mural et al., 2002, Science, 296, 1661-71)

7
Genome alignment
  • Previously discussed alignment methods
  • - most target a single gene
  • - not accurate enough at genome scale
  • - or computationally demanding
  • Situation with genome comparisons
  • - rearrangements, repeats

8
Methods in Genome Alignment (>1998)
  • DIALIGN: Diagonal Alignment
  • ASSIRC: Accelerated Search for Similarity
    Regions in Chromosomes
  • DBA: DNA Block Aligner
  • MUMmer: Maximal Unique Match(er)
  • PipMaker: Percent Identity Plot Maker
  • GLASS: Global Alignment System
  • WABA: Wobble Aware Bulk Aligner
  • LSH-ALL-PAIRS: Locality-Sensitive Hashing in All
    Pairs

9
Suffix Tree
  • A suffix tree for an m-character string S
  • Has exactly m leaves, numbered 1 to m
  • Each internal node (other than the root) has at
    least 2 children
  • Each edge is labeled with a non-empty substring of S
  • No two edges out of the same node have labels
    beginning with the same character
  • From the root to leaf i, the concatenation of the
    edge labels on the path is the suffix of S
    starting at position i.

10
An Example of a Suffix Tree
String: atgtgtgtc
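
The tree for this string can be built and checked against the definition with a short Python sketch. It uses naive quadratic insertion of one suffix at a time, which is an assumption made for brevity and not the linear-time construction used by real tools. The names `Node`, `build_suffix_tree`, and `leaf_paths` are hypothetical. The sketch also assumes the last character occurs only once (true for atgtgtgtc), so that no suffix is a prefix of another.

```python
class Node:
    def __init__(self, leaf=None):
        self.children = {}   # first character of edge label -> (edge label, child Node)
        self.leaf = leaf     # 1-based suffix start position for leaves, None otherwise

def build_suffix_tree(s):
    """Naive O(m^2) construction: insert each suffix in turn, splitting edges as needed."""
    assert s and s.count(s[-1]) == 1, "sketch assumes a unique final character"
    root = Node()
    for i in range(len(s)):
        node, suffix = root, s[i:]
        while True:
            first = suffix[0]
            if first not in node.children:
                node.children[first] = (suffix, Node(leaf=i + 1))
                break
            label, child = node.children[first]
            k = 0  # length of the common prefix of the edge label and the suffix
            while k < len(label) and k < len(suffix) and label[k] == suffix[k]:
                k += 1
            if k == len(label):          # edge fully matched: continue below it
                node, suffix = child, suffix[k:]
                continue
            mid = Node()                 # mismatch inside the edge: split it
            mid.children[label[k]] = (label[k:], child)
            mid.children[suffix[k]] = (suffix[k:], Node(leaf=i + 1))
            node.children[first] = (label[:k], mid)
            break
    return root

def leaf_paths(node, prefix=""):
    """Map each leaf number to the concatenated edge labels on its root-to-leaf path."""
    if not node.children:
        return {node.leaf: prefix}
    paths = {}
    for label, child in node.children.values():
        paths.update(leaf_paths(child, prefix + label))
    return paths

s = "atgtgtgtc"
paths = leaf_paths(build_suffix_tree(s))
```

Here `paths` has one entry per suffix, and the path to leaf i spells the suffix starting at position i, as the definition requires.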
11
MUMmer
Delcher AL et al., 2002, Nucleic Acids Research 30(11);
Delcher AL et al., 1999, Nucleic Acids Research 27(11)
  • Maximal Unique Match (MUM)
  • - occurs once in genome A and once in genome B
  • - cannot be extended in either direction
  • Difference between a MUM and a common subsequence
  • Idea: long enough MUMs are part of the alignment

12
MUMmer
Delcher AL et al., 2002, Nucleic Acids Research 30(11)
Example: A = acat, B = acaa
  • Find MUMs
  • - build a suffix tree for A and B
  • - find unique matches, then maximal matches
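
For the small example, the MUMs can be found by brute force, which makes the definition concrete. The Python sketch below deliberately swaps the suffix tree for a naive comparison of all start positions, so it illustrates what a MUM is and not how MUMmer computes it. The names `count_occurrences`, `find_mums`, and `min_len` are hypothetical, and positions are 0-based.

```python
def count_occurrences(text, pattern):
    """Count occurrences of pattern in text, including overlapping ones."""
    count, start = 0, text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count

def find_mums(a, b, min_len=1):
    """Return (start in a, start in b, match) for every maximal unique match."""
    mums = []
    for i in range(len(a)):
        for j in range(len(b)):
            if a[i] != b[j]:
                continue
            # Skip matches that could be extended to the left.
            if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
                continue
            # Extend to the right as far as the characters agree.
            k = 0
            while i + k < len(a) and j + k < len(b) and a[i + k] == b[j + k]:
                k += 1
            match = a[i:i + k]
            # Keep the match only if it occurs exactly once in each sequence.
            if (k >= min_len
                    and count_occurrences(a, match) == 1
                    and count_occurrences(b, match) == 1):
                mums.append((i, j, match))
    return mums

mums = find_mums("acat", "acaa")
```

For A = acat and B = acaa the only MUM is aca at the start of both strings. The single-character matches on a are maximal at some positions but are rejected because a occurs more than once in each sequence.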