Multiple Sequence Alignment and Molecular Evolution - PowerPoint PPT Presentation

Provided by: stephe78

1
Multiple Sequence Alignment and Molecular
Evolution
2
Quiz
  • What do the following have in common?
  • BLAST
  • FASTA
  • Protein motif searching
  • Multiple sequence alignment

3
Key Concepts
  • Appreciate the foundation of sequence alignment:
    evolutionary theory
  • Appreciate the significance of unique and
    non-unique sequence characters
  • Realize potential uses of multiple sequence
    alignments
  • Understand the basics of the most popular
    multiple sequence alignment method
  • Appreciate importance of accurate multiple
    alignments and the need for manual editing in
    some cases

4
18th and 19th centuries: The evolution of a theory
  • Erosion, sediment deposition, strata: present
    earth conditions provide keys to the past

5
18th and 19th centuries: The evolution of a theory
  • Discoveries of fossils accumulated
  • Remains of unknown but still living species that
    are elsewhere on the planet?
  • Cuvier (circa 1800): the deeper the strata, the
    less similar fossils were to existing species

7
(No Transcript)
8
Part of Darwin's Theory
  • The world is not constant, but changing
  • All organisms are derived from common ancestors
    by a process of branching.

9
Part of Darwin's Theory
  • This explained:
  • Fossil record
  • Similarities of organisms classified together
    (shared traits inherited from common ancestor)
  • Similar species in the same geographic region

10
  • What is evolution?
  • Dynamic changes with selective pressure
  • Punctuated equilibrium
  • Progressive generational adaptation
  • Environmentally imposed mutational success
  • E (mutation, selective pressure) / time
  • Staying fit

11
Characters
  • Heritable changes in features (morphology,
    DNA sequence, etc.)
  • The more similar characters you have, the more
    related you are
  • However, characters can be unique or non-unique

12
Evolution and characters
time
13
A Unique Character: Hair for Mammals
  • Hair evolved only once and is unreversed
  • Presence of hair → strong indication that
    organism is a mammal

14
Homoplasy: The formation of tails
  • Tails evolved independently in the ancestors of
    frogs and humans
  • Presence of a tail → no useful conclusions

15
Unique and non-unique characters
(Figure: the word "bioinformatics" changing over time, with changes
labeled "Non-unique" or "Unique". Forms shown: bioinformatics,
bioinfortatics, bioinfortatios, oinformatios, informatios,
infortation, information)
16
Unique and non-unique characters
  • Example: sequence analysis of functionally
    similar transporters
  • All share the same deleted sequence region, which
    is not found in any other transporter examined to
    date
  • Unique character?
  • Further investigate for possible functional
    significance, or use for classification

17
Unique and non-unique characters
  • Example: sequence analysis of functionally
    similar transporters
  • All have isoleucine at the third position in the
    sequence; however, some other transporters have
    isoleucine there too, while others have valine
    at that position
  • Non-unique.
  • Changes from I → V → I are common (see BLOSUM or
    PAM matrices). Not a high priority for further
    analysis of significance and not useful for
    classification.
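
The two transporter examples can be sketched in code. This is a minimal illustration, assuming a hypothetical toy alignment (the sequences, the function name classify_column, and the in-group/out-group split are invented for the example, not taken from the slides): a character state is called unique when every in-group sequence shares it and no out-group sequence has it.

```python
def classify_column(ingroup, outgroup, col):
    """Classify the in-group character state at one alignment column."""
    in_states = {row[col] for row in ingroup}
    out_states = {row[col] for row in outgroup}
    if len(in_states) != 1:
        return "not shared"          # in-group members disagree
    if in_states.isdisjoint(out_states):
        return "unique"              # shared by the group, absent elsewhere
    return "non-unique"              # shared, but also seen outside the group

# Hypothetical aligned sequences (made up for illustration).
ingroup = ["MKI-AL", "MRI-AL"]              # share a deletion at column 3
outgroup = ["MKIGAL", "MKVGAL", "MRIGAV"]   # some have I, some V, at column 2

deletion = classify_column(ingroup, outgroup, 3)
isoleucine = classify_column(ingroup, outgroup, 2)
print(deletion, isoleucine)
```

Here the shared deletion comes out as unique, while the isoleucine at the third position comes out as non-unique because out-group sequences carry it too.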

18
Classification according to characters: more
characters can be good
Chicken most similar to Tofu?
19
Classification according to characters
20
Classification according to characters:
increasing the number of characters
Chicken most similar to Duck?
21
Multiple Sequence Alignment: The power of many
Many characters
VTISCTGSSSNIGAG-NHVKWYQQLPG
VTISCTGTSSNIGS--ITVNWYQQLPG
LRLSCSSSGFIFSS--YAMYWVRQAPG
LSLTCTVSGTSFDD--YYSTWVRQPPG
PEVTCVVVDVSHEDPQVKFNWYVDG--
ATLVCLISDFYPGA--VTVAWKADS--
AALGCLVKDYFPEP--VTVSWNSG---
VSLTCLVKGFYPSD--IAVEWESNG--
22
Evolution and characters: the importance of
comparing characters with common origins
(homologous)
bioinformatics
bioinformatics
bioinformatios
oinformatios
informatios
information
information
time
23
Evolution and characters
  • Gaps represent non-homologous positions in the
    sequence.
  • They reflect the occurrence of insertions/deletions
    or other rearrangements during the evolutionary
    process.

bioinformatics
bioinformatics
bioinformatios
--oinformatios
---informatios
---information
---information
time
24
Multiple Sequence Alignment
VTISCTGSSSNIGAG-NHVKWYQQLPG
VTISCTGTSSNIGS--ITVNWYQQLPG
LRLSCSSSGFIFSS--YAMYWVRQAPG
LSLTCTVSGTSFDD--YYSTWVRQPPG
PEVTCVVVDVSHEDPQVKFNWYVDG--
ATLVCLISDFYPGA--VTVAWKADS--
AALGCLVKDYFPEP--VTVSWNSG---
VSLTCLVKGFYPSD--IAVEWESNG--
The sole purpose of multiple sequence alignments
is to place homologous positions of homologous
sequences into the same column.
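
Once homologous positions share a column, the alignment can be read column by column. The sketch below is only an illustration (the function names are made up for this example): it uses the eight aligned sequences from this slide and lists the columns where every sequence carries the same residue and no gap.

```python
alignment = [
    "VTISCTGSSSNIGAG-NHVKWYQQLPG",
    "VTISCTGTSSNIGS--ITVNWYQQLPG",
    "LRLSCSSSGFIFSS--YAMYWVRQAPG",
    "LSLTCTVSGTSFDD--YYSTWVRQPPG",
    "PEVTCVVVDVSHEDPQVKFNWYVDG--",
    "ATLVCLISDFYPGA--VTVAWKADS--",
    "AALGCLVKDYFPEP--VTVSWNSG---",
    "VSLTCLVKGFYPSD--IAVEWESNG--",
]

def columns(aln):
    """Return each alignment column as a string (rows must be equal length)."""
    assert len({len(row) for row in aln}) == 1, "rows must have equal length"
    return ["".join(col) for col in zip(*aln)]

def conserved_columns(aln):
    """Return 0-based indices of columns with a single residue and no gaps."""
    return [i for i, col in enumerate(columns(aln))
            if "-" not in col and len(set(col)) == 1]

conserved = conserved_columns(alignment)
print(conserved)                                  # [4, 20]
print([columns(alignment)[i][0] for i in conserved])  # ['C', 'W']
```

For this alignment the only fully conserved columns are a cysteine and a tryptophan.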
25
On further with Multiple Sequence Alignment
Questions?
26
Multiple Sequence Alignment - uses
  • Powerful tool
  • Detect trends/patterns in homologous sequences
    (motifs, domains, indels)
  • Indels (insertions and deletions) are of evolutionary
    interest, yet are not incorporated into some
    phylogenetic tree algorithms
- ATTYNETCITRTQ -
- SITYNETCVTITQ -
- SVTY-----CIVR -

27
  • Multiple sequence alignments and phylogenetic
    analysis
  • First step in any phylogenetic analysis
  • Phylogenetic analysis only as good as the
    alignment
  • in → out!

28
  • Multiple alignments not just sequence
  • insertions and deletions in sequences

29
  • Automated Analysis or Manual Intervention?
  • Automated is more explicit or objective than manual
  • Leads to a false sense of security
  • Aligns residues that are likely similar only by
    chance
ILPITSPSKEGYESGKAPDEFSSGG
ILPEH--IKDDGELGAAPHSFSTAG
VLPLD-----S--AGRPADSFSAAG
VLPVDR-------DGQARDEYTKVG
VLPVDN-------KGEARDEYTKVG
LLPYDD-------QGRPQDDYSRAG
GIVSRSG---SNFDGEPKDSYGKVG

30
  • Clustal
  • Thompson, J.D., Higgins, D.G. and Gibson,
    T.J. (1994)
  • CLUSTAL W: improving the sensitivity of
    progressive multiple sequence alignment through
    sequence weighting, position-specific gap
    penalties and weight matrix choice. Nucleic
    Acids Research, 22:4673-4680.

31
  • Clustal: incorporation of phylogenetic criteria
    into multiple sequence alignment algorithms
  • 1. Pairwise alignments: calculate a distance
    matrix
  • 2. Guide tree constructed
  • 3. Sequences progressively aligned according to
    guide tree hierarchy
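
Steps 1 and 2 can be sketched as follows. This is a simplified illustration, not Clustal's implementation: it assumes a plain Needleman-Wunsch global alignment with made-up scores (match 1, mismatch -1, gap -2), takes distance as 1 minus fractional identity, and builds the guide tree by average-linkage clustering, whereas CLUSTAL W builds its guide tree by neighbor joining. The toy "sequences" are words from the earlier slides, and the names nw_identity and guide_tree are hypothetical.

```python
from itertools import combinations

def nw_identity(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align a and b; return the fraction of identical columns."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the corner, counting identical columns.
    i, j, same, cols = n, m, 0, 0
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            same += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
        cols += 1
    return same / cols

def guide_tree(seqs):
    """Step 1: distance matrix. Step 2: average-linkage guide tree."""
    dist = {frozenset(pair): 1 - nw_identity(seqs[pair[0]], seqs[pair[1]])
            for pair in combinations(seqs, 2)}

    def average(c1, c2):
        total = sum(dist[frozenset((x, y))] for x in c1[1] for y in c2[1])
        return total / (len(c1[1]) * len(c2[1]))

    clusters = [(name, [name]) for name in seqs]   # (subtree, member names)
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: average(*p))
        clusters = [c for c in clusters if c is not c1 and c is not c2]
        clusters.append(((c1[0], c2[0]), c1[1] + c2[1]))
    return clusters[0][0]

seqs = {"s1": "bioinformatics", "s2": "bioinformatios",
        "s3": "informatios", "s4": "information"}
tree = guide_tree(seqs)
print(tree)  # (('s1', 's2'), ('s3', 's4'))
```

Step 3 would then align the sequences in the order given by the nested tuple, starting with the closest pair.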

32
(No Transcript)
33
(No Transcript)
34
(No Transcript)
35
Clustal Incorporating Biology into Sequence
Alignment Algorithms
  • Substitution matrices are varied at different
    alignment stages according to the divergence of
    the sequences
  • For proteins, gap penalties are reduced in
    hydrophilic (water-loving) sequence regions to
    encourage new gaps in potential loop regions on
    the protein surface (which is usually exposed to
    water)
  • Gapped positions in early alignments have reduced
    gap penalties to encourage the opening up of new
    gaps at these positions
  • Gaps are not penalized as much at the ends of
    proteins
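
The idea of position-specific gap penalties can be sketched as below. Everything numeric here is an assumption for illustration (the hydrophilic residue set, the minimum run length of 5, the base penalty of 10 and the halving factor), not CLUSTAL W's published parameters, and gap_open_penalties is a hypothetical name.

```python
HYDROPHILIC = set("DEGKNQPRS")  # assumed residue set, for illustration only

def gap_open_penalties(seq, base=10.0, min_run=5, factor=0.5):
    """Return one gap-opening penalty per residue.

    Positions inside a run of at least `min_run` hydrophilic residues get
    a reduced penalty, so new gaps open more readily in likely loops.
    """
    penalties = [base] * len(seq)
    start = None
    for i, residue in enumerate(seq + "#"):   # "#" sentinel ends a final run
        if residue in HYDROPHILIC:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_run:
                for k in range(start, i):
                    penalties[k] = base * factor
            start = None
    return penalties

penalties = gap_open_penalties("LLVDEGKNQILV")
print(penalties)
```

In this toy sequence the six-residue stretch DEGKNQ gets the lowered penalty, while the flanking hydrophobic residues keep the base value.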

36
ClustalX
  • Subset of sequences in alignment can be selected
    and realigned. Useful when trying to align very
    divergent sequences.
  • A range of the sequence alignment can be selected
    for realignment. Guide tree built based only on
    the residue range selected.

37
Differences between Clustal and BLAST?
  • Clustal has full-length (global) alignment
  • Gap penalty differences
  • Input differences: selection of sequences
  • Clustal varies gap penalties and matrices
  • Many to many
  • Speed: Clustal slower
  • Align pro-pro or nuc-nuc
Similarities?
  • Identifies conserved domains
  • Use the same matrices
38
Algorithms in Molecular Biology
http://www.math.tau.ac.il/rshamir/algmb/00/algmb00.html
39
ClustalX features
  • 'Alignment Quality Score' below the alignment.

40
MACAW - a program for semi-manual local multiple
alignment of DNA and protein sequences.
  • User delimits the sequences and regions in which
    to search for blocks, or specifies blocks directly
  • User decides which blocks to keep; each block is
    given a statistical significance value.

41
Genedoc - for editing and flexible display of
alignments
  • view your alignment with different forms of
    shading that you customize
  • edit your alignment (add or remove gaps) or the
    sequence order or the sequences themselves
  • print directly, or export a graphic of your
    alignment

42
Genedoc - for editing and flexible display of
alignments
43
  • Statistics Report
  • 1: residues identical
  • 2: residues with score > 0 (similar residues)
  • 3: residues lined up with a gap

          human    rat  rabbit  turtle
human      1870     97      96      22
              0     98      96      28
              0      0       2      61
rat        1830   1874      94      22
           1846      0      95      28
             18      0       2      61
rabbit     1818   1793    1863      22
           1828   1815       0      28
             45     53       0      61
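
The three statistics in the report can be computed for any pair of aligned rows. The sketch below is an assumption-laden illustration, not Genedoc's method: the similarity groups stand in for a real substitution matrix (such as BLOSUM or PAM), and the names toy_score and pair_statistics are hypothetical. The two rows come from the alignment shown on an earlier slide.

```python
# Hypothetical similarity groups, for illustration only; a real report
# would score residue pairs with a substitution matrix.
SIMILAR_GROUPS = ["ILVM", "DE", "KR", "ST", "FYW", "NQ"]

def toy_score(a, b):
    """Return 1 for identical or same-group residues, otherwise 0."""
    if a == b:
        return 1
    return 1 if any(a in g and b in g for g in SIMILAR_GROUPS) else 0

def pair_statistics(row_a, row_b, score=toy_score):
    """Count (identical, score > 0, residue-versus-gap) columns."""
    identical = similar = gapped = 0
    for a, b in zip(row_a, row_b):
        if a == "-" and b == "-":
            continue                      # column empty in both rows
        if a == "-" or b == "-":
            gapped += 1
        else:
            identical += a == b
            similar += score(a, b) > 0
    return identical, similar, gapped

stats = pair_statistics("ILPITSPSKEGYESGKAPDEFSSGG",
                        "ILPEH--IKDDGELGAAPHSFSTAG")
print(stats)  # (11, 13, 2)
```

Identical residues also count as similar here, since their score is above zero.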

44
Standard multiple sequence alignment approach
  • Be as sure as possible that the sequences
    included are homologous
  • Know as much as possible about the gene/protein
    in question before trying to create an alignment
    (secondary structure, domains, etc.)
  • Start with an automated alignment, preferably one
    that utilizes some evolutionary theory, such as
    Clustal

45
  • Examine alignment
  • Are you confident that aligned residues/bases
    evolved from a common ancestor?
  • Are domains of the proteins/predicted secondary
    structures, etc. aligning correctly?
  • → No? May need to edit sequences and redo
  • → Yes? Move on!
  • Note indels (insertions and deletions)
  • Possible insights into functionally important
    regions

46
  • Use in subsequent analyses (identifying a consensus
    or other pattern recognition, HMM
    construction, phylogenetic analysis, etc.)
  • For phylogenetic analysis: remove unreliably
    aligned regions
ILPITSPSKEGYESGKAPDEFSSGG
ILPEH--IKDDGELGAAPHSFSTAG
VLPLD-----S--AGRPADSFSAAG
VLPVDR-------DGQARDEYT-VG
VLPVDN-------KGEARDEYT-VG
LLPYDD-------QGRPQDDYSRAG
GIVSRSG---SNFDGEPKDSYGKVG

Delete?
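
One simple way to flag candidate regions for deletion is by gap content. The slide does not state a criterion, so the rule below (drop any column in which more than half of the sequences have a gap) and the name trim_gappy_columns are assumptions for illustration; manual judgment may well choose differently.

```python
aligned = [
    "ILPITSPSKEGYESGKAPDEFSSGG",
    "ILPEH--IKDDGELGAAPHSFSTAG",
    "VLPLD-----S--AGRPADSFSAAG",
    "VLPVDR-------DGQARDEYT-VG",
    "VLPVDN-------KGEARDEYT-VG",
    "LLPYDD-------QGRPQDDYSRAG",
    "GIVSRSG---SNFDGEPKDSYGKVG",
]

def trim_gappy_columns(aln, max_gap_fraction=0.5):
    """Drop columns whose gap fraction exceeds the (assumed) threshold."""
    n = len(aln)
    dropped = [i for i, col in enumerate(zip(*aln))
               if col.count("-") / n > max_gap_fraction]
    skip = set(dropped)
    trimmed = ["".join(c for i, c in enumerate(row) if i not in skip)
               for row in aln]
    return trimmed, dropped

trimmed, dropped = trim_gappy_columns(aligned)
print(dropped)      # 0-based indices of the removed columns
print(trimmed[0])
```

With this threshold the gap-rich middle of the alignment is removed, while columns with only two or three gaps are kept.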
47
  • If aligning DNA sequences for phylogenetic
    analysis, you may remove the third position of
    every codon

MET GLY SER GLY    ATG GGA AGT GGA
MET GLY SER GLY    ATG GGG AGC GGG
MET ARG CYS ARG    ATG AGG TGC AGG
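
Removing third codon positions is a short string operation. The sketch below assumes each sequence starts in frame and that any gaps come in whole codons; the function name drop_third_positions is made up for this example, and the three sequences are the ones on this slide.

```python
def drop_third_positions(dna):
    """Keep codon positions 1 and 2; assumes the sequence starts in frame."""
    return "".join(dna[i:i + 2] for i in range(0, len(dna), 3))

sequences = ["ATGGGAAGTGGA", "ATGGGGAGCGGG", "ATGAGGTGCAGG"]
reduced = [drop_third_positions(s) for s in sequences]
print(reduced)  # ['ATGGAGGG', 'ATGGAGGG', 'ATAGTGAG']
```

The first two sequences, which encode the same peptide and differ only at third positions, become identical, while the third sequence remains distinct.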
48
Key Concepts
  • Appreciate the foundation of sequence alignment:
    evolutionary theory
  • Appreciate the significance of unique and
    non-unique sequence characters
  • Realize potential uses of multiple sequence
    alignments
  • Understand the basics of the most popular
    multiple sequence alignment method
  • Appreciate importance of accurate multiple
    alignments and the need for manual editing in
    some cases

49
  • Questions?
