Incorporating Bioinformatics in an Algorithms Course

About This Presentation
Slides: 17
Provided by: lda7

Transcript and Presenter's Notes

1
Incorporating Bioinformatics in an Algorithms
Course
  • Lawrence D'Antonio
  • Ramapo College of New Jersey

2
What is Bioinformatics?
  • Algorithms to analyze DNA, RNA, or protein
    sequences
  • Database searches to find homologous sequences
  • Construction of evolutionary trees
  • Structure prediction
  • Human Genome Project

3
Why use Bioinformatics in an Algorithms Course?
  • Real-life applications of algorithms
  • Variety of string processing algorithms
  • Use of similarity instead of exact matching
  • Dynamic programming examples
  • Theory vs. Practice Issues

4
Models for Incorporating Bioinformatics
  • Infusion: include material from bioinformatics
    in computer science courses
  • Paired Courses: have joint lectures and projects
    from, e.g., Algorithms and Genetics courses
  • Tracked Courses: have a separate Algorithms for
    Bioinformatics course

5
Biology Basics
  • Primary DNA structure: an oriented
    character string
  • Double strand constructed through base pairing
  • Central Dogma: information passes in one
    direction, from DNA to RNA to protein
  • Amino acids are encoded by triples of bases, called
    codons
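
The string view of DNA described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names are hypothetical, and `CODON_TABLE` is deliberately partial (the standard genetic code has 64 codons).

```python
# Watson-Crick base pairing: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Partial codon table for illustration only (standard genetic code,
# RNA alphabet). A complete table has 64 entries.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "GCU": "Ala", "UAA": "Stop",
}


def reverse_complement(dna):
    """Return the paired strand, read in its own 5'->3' orientation."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def transcribe(dna):
    """Coding-strand DNA to mRNA: T is replaced by U."""
    return dna.replace("T", "U")


def translate(rna):
    """Read codons left to right until a stop codon or the end."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino = CODON_TABLE[rna[i:i + 3]]
        if amino == "Stop":
            break
        protein.append(amino)
    return protein


print(reverse_complement("AATAAGC"))
print(translate(transcribe("ATGTTTGGCTAA")))
```

The three functions mirror the slide: orientation and base pairing give the reverse complement, and the DNA to RNA to protein direction is transcription followed by codon-by-codon translation.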

6
Bonding along a strand

7
Bonding between strands

8
Complexity of DNA Problems
  • 3 billion base pairs in human genome
  • Many NP-complete problems
  • 10^600 possible alignments for two 1000-character
    sequences
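
A figure of this size can be reproduced under one counting convention, which is an assumption here since the slide does not say how the number was derived: count the ways to choose which characters of one sequence are paired with which characters of the other, keeping order. For lengths n and m that count is the binomial coefficient C(n + m, n). The function name below is hypothetical.

```python
import math


def pairing_count(n, m):
    """Order-preserving ways to choose matched pairs between sequences
    of lengths n and m: sum over k of C(n, k) * C(m, k) = C(n + m, n)."""
    return math.comb(n + m, n)


count = pairing_count(1000, 1000)
# Print the number of decimal digits rather than the full integer.
print(len(str(count)))
```

Whatever convention is used, the point for an algorithms course is the same: exhaustive enumeration is hopeless, while dynamic programming fills a table of about a million cells.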

9
Sequence Alignment
  • Determine the alignment of two sequences that
    maximizes similarity (global alignment)
  • Determine substrings of two sequences with
    maximum similarity (local alignment)
  • Determine the alignment for several sequences
    that maximizes the sum of pairs similarity
    (multiple alignment)
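
The sum-of-pairs objective for a multiple alignment can be sketched as follows. The scoring values (+1 match, -1 mismatch, -2 for a character against a space) are defaults chosen for illustration, and scoring two aligned spaces as 0 is an assumption; the function names are hypothetical.

```python
from itertools import combinations


def pair_score(x, y, match=1, mismatch=-1, space=-2):
    """Score one pair of symbols taken from the same alignment column."""
    if x == "-" and y == "-":
        return 0  # assumption: two aligned spaces contribute nothing
    if x == "-" or y == "-":
        return space
    return match if x == y else mismatch


def sum_of_pairs(rows):
    """Add pair_score over every pair of rows in every column."""
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all rows of an alignment must have equal length")
    total = 0
    for column in zip(*rows):
        for x, y in combinations(column, 2):
            total += pair_score(x, y)
    return total


print(sum_of_pairs(["AATAAGC", "ATTAAGC", "AA-AAGC"]))
```

This only evaluates a given multiple alignment; finding the alignment that maximizes the score is the hard part of the problem.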

10
Edit Operations

Substitution    Insertion    Deletion
AATAAGC         AAT-AAGC     AATAAGC
ATTAAGC         AATTAAGC     AA-AAGC

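Counting the fewest substitutions, insertions, and deletions needed to turn one string into another gives the edit distance. The sketch below uses unit cost for every operation, which is an assumption and not a scoring scheme taken from the slides; the function name is hypothetical.

```python
def edit_distance(s, t):
    """Minimum number of substitutions, insertions, and deletions
    that transform s into t (unit cost per operation)."""
    prev = list(range(len(t) + 1))  # distances from the empty prefix of s
    for i, a in enumerate(s, start=1):
        curr = [i]  # deleting the first i characters of s
        for j, b in enumerate(t, start=1):
            curr.append(min(
                prev[j] + 1,             # delete a
                curr[j - 1] + 1,         # insert b
                prev[j - 1] + (a != b),  # substitute, or keep if equal
            ))
        prev = curr
    return prev[-1]


print(edit_distance("AATAAGC", "ATTAAGC"))   # one substitution
print(edit_distance("AATAAGC", "AATTAAGC"))  # one insertion
print(edit_distance("AATAAGC", "AAAAGC"))    # one deletion
```
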
11
Dynamic Programming Alignment Algorithm
(Needleman-Wunsch)
If a_1, a_2, ..., a_i and b_1, b_2, ..., b_j have been
aligned, there are three possible next moves:
  • Match a_{i+1} with b_{j+1}
  • Match a_{i+1} with a space
  • Match b_{j+1} with a space

Choose the move that maximizes the similarity of
the two sequences
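
The three moves translate directly into a table-filling recurrence. The following is a minimal sketch of global alignment in the Needleman-Wunsch style, with a simple traceback; the default scores (+1, -1, -2) follow the scoring slide, and the function name and tie-breaking order in the traceback are assumptions.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, space=-2):
    """Return (score, aligned_a, aligned_b) for one optimal global alignment."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * space
    for j in range(1, m + 1):
        score[0][j] = j * space

    def sub(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # pair a_i with b_j
                score[i - 1][j] + space,          # pair a_i with a space
                score[i][j - 1] + space,          # pair b_j with a space
            )

    # Trace back from the bottom-right corner, preferring diagonal moves.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + space:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


print(needleman_wunsch("AATAAGC", "ATTAAGC"))
```

The table has (n + 1)(m + 1) cells and each is filled in constant time, so the running time is O(nm).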
12
Alignment Scoring System
  • +1 for a character match
  • -1 for a mismatch (substitution)
  • -2 for using a space (indel)
  • or
  • a penalty of a + bk for a gap of k spaces (affine gap penalty)
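
To see what an affine gap penalty changes, the sketch below scores an alignment that is already given, charging a + b*k for each maximal run of k spaces. The values a = 2 and b = 1 are hypothetical, as is the function name; this evaluates an alignment and does not search for the optimal one.

```python
import re


def affine_alignment_score(top, bottom, match=1, mismatch=-1, a=2, b=1):
    """Score a gapped alignment; each run of k spaces costs a + b*k."""
    if len(top) != len(bottom):
        raise ValueError("alignment rows must have equal length")
    score = 0
    # Columns that pair two characters score as a match or a mismatch.
    for x, y in zip(top, bottom):
        if x != "-" and y != "-":
            score += match if x == y else mismatch
    # Each maximal run of '-' in either row is one gap.
    for row in (top, bottom):
        for gap in re.finditer(r"-+", row):
            k = gap.end() - gap.start()
            score -= a + b * k
    return score


print(affine_alignment_score("AATTAAGC", "AA--AAGC"))  # one gap of 2 spaces
print(affine_alignment_score("AATTAAGC", "A-T-AAGC"))  # two gaps of 1 space
```

Both alignments contain six matches and two spaces, yet the single long gap scores higher because the opening cost a is paid once instead of twice.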

13
Global Alignment Matrix

14
Optimal Alignment

15
Other Bioinformatics Algorithms
  • Palindromes
  • Tandem Repeats
  • Longest Common Subsequence
  • Double Digest (NP-complete)
  • Shortest Common Superstring (NP-complete)
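
Of the problems listed, Longest Common Subsequence is the closest relative of the alignment recurrence and makes a compact classroom example. A minimal sketch, with a hypothetical function name, returning one longest common subsequence:

```python
def longest_common_subsequence(s, t):
    """Return one longest common subsequence of s and t."""
    n, m = len(s), len(t)
    # length[i][j] = LCS length of s[:i] and t[:j]
    length = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if s[i - 1] == t[j - 1]:
                length[i][j] = length[i - 1][j - 1] + 1
            else:
                length[i][j] = max(length[i - 1][j], length[i][j - 1])

    # Trace back from the bottom-right corner to recover the characters.
    out = []
    i, j = n, m
    while i > 0 and j > 0:
        if s[i - 1] == t[j - 1]:
            out.append(s[i - 1])
            i -= 1
            j -= 1
        elif length[i - 1][j] >= length[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


print(longest_common_subsequence("AATAAGC", "ATTAAGC"))
```

This is the alignment recurrence with +1 for a match and 0 for everything else, which is one way to connect it to the scoring-based algorithms earlier in the talk.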
