Bioinformatics PhD. Course - PowerPoint PPT Presentation

Slides: 9
Provided by: LCL6
Transcript and Presenter's Notes

1
Bioinformatics PhD. Course
Summary (approximate)
  • 1. Biological introduction
  • 2. Comparison of short sequences (< 10,000 bp)
  • 3. Comparison of large sequences (up to 250,000,000 bp)
  • 4. Sequence assembly
  • 5. Efficient data search structures and algorithms
  • 6. Proteins...

2
2. Comparison of short sequences (< 10,000 bp)
Summary (more or less)
  • 2.1 Dot matrix
  • 2.2 Pairwise alignment.
  • 2.3 Hash algorithms.
  • 2.4 Multiple alignment.

3
2.1 Dot matrix
Given two sequences, how can we analyse their
degree of identity?
By searching for the parts that match
Each cell of the matrix is 1 or 0:
1 if both characters coincide, 0 otherwise
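The 1/0 matrix described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `dot_matrix` and `render` are hypothetical and not part of the course material.

```python
def dot_matrix(s1, s2):
    """Return a len(s1) x len(s2) matrix of 1/0 values.

    Cell [i][j] is 1 if s1[i] and s2[j] coincide, 0 otherwise.
    """
    return [[1 if a == b else 0 for b in s2] for a in s1]


def render(matrix):
    """Draw the matrix as text: '*' for a match, '.' for a mismatch."""
    return "\n".join(
        "".join("*" if cell else "." for cell in row) for row in matrix
    )


if __name__ == "__main__":
    print(render(dot_matrix("GATTACA", "GATCACA")))
```

Runs of matches along a diagonal correspond to stretches that the two sequences share.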
4
2.1 Dot matrix
Given two sequences, how can we analyse their
degree of identity?
By searching for the parts that match
5
2.1 Dot matrix
L = window length
What is the cost of the algorithm?
When are the matchings relevant?
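A minimal sketch of the windowed variant, under the assumption that a dot is drawn at (i, j) when at least `min_matches` of the L aligned characters coincide. The threshold `min_matches` and the function name are assumptions for illustration; the slides only name the window length L.

```python
def windowed_dot_matrix(s1, s2, L, min_matches):
    """Naive windowed dot matrix.

    Cell [i][j] is 1 when the windows s1[i:i+L] and s2[j:j+L]
    coincide in at least min_matches positions.
    Every cell compares L characters, so the cost is
    roughly len(s1) * len(s2) * L comparisons.
    """
    rows = len(s1) - L + 1
    cols = len(s2) - L + 1
    if rows <= 0 or cols <= 0:
        return []
    dots = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            matches = sum(1 for k in range(L) if s1[i + k] == s2[j + k])
            if matches >= min_matches:
                dots[i][j] = 1
    return dots
```

A larger L with a high threshold filters out isolated chance matches, at the price of more comparisons per cell in this naive form.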
6
2.1. Dot matrix algorithm cost
  • long(S1) × long(S2) × L, in other words O(n^2 · L)
  • can a cost of long(S1) × long(S2) be achieved?
  • can we then say that O(n^2) is independent of L?
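One possible way to approach these questions, sketched here as an assumption rather than as the course's own solution, is to slide the window along each diagonal: the count for cell (i, j) is the count for (i-1, j-1) minus the character pair that leaves the window plus the pair that enters it. The function name `window_match_counts` is hypothetical.

```python
def window_match_counts(s1, s2, L):
    """Match counts for every window pair, updated incrementally.

    counts[i][j] is the number of positions k in range(L) where
    s1[i + k] == s2[j + k]. Only cells in the first row or first
    column are computed from scratch (L comparisons each); every
    other cell is derived from its diagonal predecessor in O(1).
    """
    rows = len(s1) - L + 1
    cols = len(s2) - L + 1
    if rows <= 0 or cols <= 0:
        return []
    counts = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            if i == 0 or j == 0:
                # Start of a diagonal: count the whole window.
                counts[i][j] = sum(s1[i + k] == s2[j + k] for k in range(L))
            else:
                # Drop the pair leaving the window, add the pair entering it.
                counts[i][j] = (
                    counts[i - 1][j - 1]
                    - (s1[i - 1] == s2[j - 1])
                    + (s1[i + L - 1] == s2[j + L - 1])
                )
    return counts
```

In this sketch the interior cells cost a constant amount of work each, so the total is on the order of long(S1) × long(S2) plus the L comparisons spent at the start of each diagonal.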

7
2.1. Dot matrix signals
A: transposons
When are signals statistically significant?
8
2.1. Dot matrix statistical significance
We need to define a random model against which
to compare the signals
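We define the random variable X = number of characters that
coincide in a window of length L,
then Prob(X = k) = comb(L, k) p^k (1 - p)^(L - k),
where p is the probability that two characters coincide by chance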
What is its expected value?
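The binomial model can be explored numerically with a short sketch. It assumes that positions are independent and, in the usage example, that p = 0.25 (four equally likely nucleotides); both are illustrative assumptions, and the function names are hypothetical. For a binomial variable the expected value is L · p.

```python
from math import comb


def prob_matches(L, k, p):
    """Prob(X = k): exactly k of the L aligned characters coincide."""
    return comb(L, k) * p**k * (1 - p) ** (L - k)


def prob_at_least(L, k, p):
    """Prob(X >= k): chance of seeing k or more coincidences at random."""
    return sum(prob_matches(L, j, p) for j in range(k, L + 1))


def expected_matches(L, p):
    """Expected value of a binomial random variable: L * p."""
    return L * p


if __name__ == "__main__":
    L, p = 10, 0.25  # assumed window length and match probability
    print("expected matches:", expected_matches(L, p))
    print("P(X >= 8):", prob_at_least(L, 8, p))
```

A signal whose match count has a very small `prob_at_least` value under this random model is unlikely to be a chance coincidence.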