BIOINFORMATICS AND GENE DISCOVERY

1
BIOINFORMATICS AND GENE DISCOVERY
UNIVERSITY OF NORTH CAROLINA AT CHAPEL HILL
Bioinformatics Tutorials
  • Iosif Vaisman

1998
2
(No Transcript)
3
From genes to proteins
4
From genes to proteins
DNA -> RNA: TRANSCRIPTION (signal: PROMOTER ELEMENTS)
RNA -> mRNA: SPLICING (signal: SPLICE SITES)
mRNA -> PROTEIN: TRANSLATION (signals: START CODON, STOP CODON)
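The path from gene to protein can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names `splice` and `translate`, the half-open intron coordinates, and the example sequence are all assumptions, and the mRNA is kept in the DNA alphabet (T instead of U) for simplicity. The codon table is the standard genetic code.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def splice(sequence, introns):
    """Remove introns given as (start, end) half-open index pairs (hypothetical format)."""
    kept, position = [], 0
    for start, end in sorted(introns):
        kept.append(sequence[position:start])
        position = end
    kept.append(sequence[position:])
    return "".join(kept)


def translate(mrna):
    """Translate from the first start codon (ATG) up to the first in-frame stop codon."""
    start = mrna.find("ATG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


# Made-up gene: exon, intron (begins GT, ends AG), exon.
gene = "CCATGGCT" + "GTAAGTCCAG" + "TGGTAAGG"
mrna = splice(gene, [(8, 18)])
print(mrna, translate(mrna))
```

Real gene finders must locate the promoter, splice sites, and codons themselves; here the intron position is simply given.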
5
From genes to proteins
6
Comparative Sequence Sizes (bases)
  • Yeast chromosome 3
    350,000
  • Escherichia coli (bacterium) genome
    4,600,000
  • Largest yeast chromosome now mapped
    5,800,000
  • Entire yeast genome
    15,000,000
  • Smallest human chromosome (Y)
    50,000,000
  • Largest human chromosome (1)
    250,000,000
  • Entire human genome
    3,000,000,000

7
Low-resolution physical map of chromosome 19
8
Chromosome 19 gene map
9
Computational Gene Prediction
  • Where are genes unlikely to be located?
  • How do transcription factors know where to bind a
    region of DNA?
  • Where are the transcription, splicing, and
    translation start and stop signals?
  • What do coding regions do that non-coding
    regions do not?
  • Can we learn from examples?
  • Does this sequence look familiar?

10
Artificial Intelligence in Biosciences
  • Neural Networks (NN)
  • Genetic Algorithms (GA)
  • Hidden Markov Models (HMM)
  • Stochastic context-free grammars (CFG)
11
Information Theory
[Figure: two states, 0 and 1; choosing between them takes 1 bit]
12
Information Theory
[Figure: four states, 00, 01, 11, 10; each of the two binary choices takes 1 bit]
13
Information Theory
[Figure: 1 bit + 1 bit]
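The bit counts on these slides follow from the base-2 logarithm of the number of equally likely states. A minimal sketch, with the function names `bits_needed` and `shannon_entropy` chosen here for illustration:

```python
import math


def bits_needed(n_states):
    """Bits required to pick one of n equally likely states."""
    return math.log2(n_states)


def shannon_entropy(probabilities):
    """Average information, in bits, of a source with the given probabilities."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


print(bits_needed(2))               # two states: 0 or 1
print(bits_needed(4))               # four states: 00, 01, 11, 10
print(shannon_entropy([0.25] * 4))  # four equally likely nucleotides
```

By the same arithmetic, one position in a DNA sequence with four equally likely nucleotides carries 2 bits.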
14
Scientific Models
Physical models -- Mathematical models
15
Neural Networks
  • interconnected assembly of simple processing
    elements (units or nodes)
  • node functionality is similar to that of the
    animal neuron
  • processing ability is stored in the inter-unit
    connection strengths (weights)
  • weights are obtained by a process of adaptation
    to, or learning from, a set of training patterns
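The idea that weights are learned from training patterns can be illustrated with the smallest possible network, a single threshold unit trained with the perceptron rule. This is a sketch under stated assumptions: the perceptron rule, the logical-AND training set, the learning rate, and the number of epochs are illustrative choices, not taken from the slides.

```python
def predict(weights, bias, inputs):
    """Threshold unit: fire (1) if the weighted sum of inputs exceeds zero."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def train_perceptron(patterns, epochs=20, rate=1):
    """Adapt weights to the training patterns with the perceptron learning rule."""
    weights = [0] * len(patterns[0][0])
    bias = 0
    for _ in range(epochs):
        for inputs, target in patterns:
            error = target - predict(weights, bias, inputs)
            # Strengthen or weaken each connection in proportion to its input.
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias


# Training patterns for logical AND (an assumed toy task).
AND_PATTERNS = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_perceptron(AND_PATTERNS)
print(weights, bias)
print([predict(weights, bias, inputs) for inputs, _ in AND_PATTERNS])
```

After training, everything the unit "knows" about AND is held in the two weights and the bias.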

16
Genetic Algorithms
Search or optimization methods that use simulated
evolution. A population of potential solutions is
subjected to natural selection, crossover, and
mutation.
choose initial population
evaluate each individual's fitness
repeat
    select individuals to reproduce
    mate pairs at random
    apply crossover operator
    apply mutation operator
    evaluate each individual's fitness
until terminating condition
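The loop above can be sketched as a small runnable program. Everything specific here is an assumption made for illustration: the fitness function (count of 1 bits), tournament selection of size two, single-point crossover, the mutation rate, an even population size, and a fixed number of generations as the terminating condition.

```python
import random


def fitness(bits):
    """Toy fitness: the number of 1s in the bit string."""
    return sum(bits)


def select(population, rng):
    """Tournament selection: the fitter of two random individuals reproduces."""
    a, b = rng.sample(population, 2)
    return a if fitness(a) >= fitness(b) else b


def crossover(a, b, rng):
    """Single-point crossover producing two children."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in bits]


def run_ga(n_bits=20, pop_size=30, generations=40, rate=0.02, seed=0):
    rng = random.Random(seed)
    # Choose initial population and evaluate it.
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(population, key=fitness)
    history = [fitness(best)]
    for _ in range(generations):  # terminating condition: fixed generation count
        parents = [select(population, rng) for _ in range(pop_size)]
        rng.shuffle(parents)  # mate pairs at random
        children = []
        for i in range(0, pop_size - 1, 2):
            child_1, child_2 = crossover(parents[i], parents[i + 1], rng)
            children += [mutate(child_1, rate, rng), mutate(child_2, rate, rng)]
        population = children
        best = max(population + [best], key=fitness)  # remember best seen so far
        history.append(fitness(best))
    return best, history


best, history = run_ga()
print(fitness(best), history[0], history[-1])
```

The `history` list records the best fitness seen so far after each generation, so it can only stay level or rise.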
17
Crossover
Mutation
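The two operators can be shown deterministically on short strings. In this sketch the crossover point and mutation position are passed in explicitly rather than drawn at random, and the names `child_ab` and `child_ba` follow the presentation's labels for the two children of Parent A and Parent B; the bit-string encoding is an assumption.

```python
def crossover(parent_a, parent_b, point):
    """Swap the tails of two parents at the crossover point."""
    child_ab = parent_a[:point] + parent_b[point:]
    child_ba = parent_b[:point] + parent_a[point:]
    return child_ab, child_ba


def mutate(individual, position):
    """Flip the single bit at the given position."""
    flipped = "1" if individual[position] == "0" else "0"
    return individual[:position] + flipped + individual[position + 1:]


child_ab, child_ba = crossover("000000", "111111", 2)
print(child_ab, child_ba)
print(mutate(child_ab, 0))
```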
18
Markov Model (or Markov Chain)
[Figure: chain of nucleotides; letters shown: A, G, A, T, C, T]
The probability of each character is based only on
several preceding characters in the sequence.
Number of preceding characters = order of the Markov
Model.
Probability of a sequence: $P(s) = P_{A}\, P_{A,T}\, P_{A,T,C}\, P_{T,C,T}\, P_{C,T,A}\, P_{T,A,G}$
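A sequence probability of this kind can be computed by multiplying, position by position, the probability of each character given its preceding characters. The sketch below estimates those probabilities from counts in training sequences; the function names, the two toy training sequences, and the handling of unseen contexts (probability zero) are assumptions made for illustration.

```python
from collections import defaultdict


def train(sequences, order):
    """Count how often each character follows each context of up to `order` characters."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for i, char in enumerate(seq):
            context = seq[max(0, i - order):i]
            counts[context][char] += 1
    return counts


def sequence_probability(seq, counts, order):
    """Multiply the conditional probabilities of each character given its context."""
    prob = 1.0
    for i, char in enumerate(seq):
        context = seq[max(0, i - order):i]
        seen = counts.get(context)
        if not seen or char not in seen:
            return 0.0  # context or character never observed in training
        prob *= seen[char] / sum(seen.values())
    return prob


# Second-order model trained on two made-up sequences.
counts = train(["ATCTAG", "ATGTAG"], order=2)
print(sequence_probability("ATCTAG", counts, order=2))
```

With `order=2`, the factors for ATCTAG are the probability of A, of T after A, of C after AT, of T after TC, of A after CT, and of G after TA, matching the six terms in the slide's product.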
19
Hidden Markov Models
States -- well-defined conditions
Edges -- transitions between the states
[Figure: model generating the sequences ATGAC, ATTAC, ACGAC, ACTAC]
Each transition is assigned a probability.
Probability of the sequence:
  • single path with the highest probability -- Viterbi path
  • sum of the probabilities over all paths -- forward
    algorithm (the calculation that Baum-Welch training
    builds on)
20
Hidden Markov Model of Biased Coin Tosses
  • States (Si): two biased coins, C1 and C2
  • Outputs (Oj): two possible outputs, H and T
  • p(Outputs Oij): p(C1, H), p(C1, T), p(C2, H),
    p(C2, T)
  • Transitions from state X to Y: A11, A22, A12,
    A21
  • p(Initial Si): p(I, C1), p(I, C2)
  • p(End Si): p(C1, E), p(C2, E)
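The two-coin model can be written down directly and used to compute both the total probability of a toss sequence (summing over all state paths) and the single most probable path. This is a sketch under stated assumptions: all numeric probabilities are invented for illustration, and the end-state probabilities p(C1, E) and p(C2, E) are omitted for simplicity.

```python
states = ("C1", "C2")
# Hypothetical parameters: C1 favours heads, C2 favours tails, coins rarely switch.
initial = {"C1": 0.5, "C2": 0.5}
transition = {"C1": {"C1": 0.9, "C2": 0.1}, "C2": {"C1": 0.1, "C2": 0.9}}
emission = {"C1": {"H": 0.9, "T": 0.1}, "C2": {"H": 0.1, "T": 0.9}}


def forward(observations):
    """Probability of the observations, summed over all state paths."""
    alpha = {s: initial[s] * emission[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {
            s: emission[s][obs] * sum(alpha[r] * transition[r][s] for r in states)
            for s in states
        }
    return sum(alpha.values())


def viterbi(observations):
    """Probability and state sequence of the single most probable path."""
    best = {s: (initial[s] * emission[s][observations[0]], [s]) for s in states}
    for obs in observations[1:]:
        updated = {}
        for s in states:
            prob, path = max(
                ((best[r][0] * transition[r][s], best[r][1]) for r in states),
                key=lambda item: item[0],
            )
            updated[s] = (prob * emission[s][obs], path + [s])
        best = updated
    return max(best.values(), key=lambda item: item[0])


tosses = "HHTT"
print(forward(tosses))
print(viterbi(tosses))
```

The Viterbi probability can never exceed the forward probability, because the best path is only one of the paths included in the sum.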

21
Hidden Markov Model for Exon and Stop Codon (VEIL
Algorithm)
22
GRAIL gene identification program
23
Suboptimal Solutions for the Human Growth Hormone
Gene (GeneParser)
24
Measures of Prediction Accuracy
Nucleotide Level
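Nucleotide-level accuracy is usually reported as sensitivity and specificity over per-base labels. The sketch below uses the convention common in the gene-prediction literature, where sensitivity is TP / (TP + FN) and specificity is TP / (TP + FP); the slide itself does not spell out its formulas, so these definitions, the function name, and the 'C'/'N' labelling (coding / non-coding) are assumptions.

```python
def nucleotide_accuracy(real, predicted):
    """Compare per-base labels: 'C' for coding, 'N' for non-coding."""
    pairs = list(zip(real, predicted))
    tp = sum(r == "C" and p == "C" for r, p in pairs)  # coding, predicted coding
    fp = sum(r == "N" and p == "C" for r, p in pairs)  # non-coding, predicted coding
    fn = sum(r == "C" and p == "N" for r, p in pairs)  # coding, predicted non-coding
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tp / (tp + fp) if tp + fp else 0.0
    return sensitivity, specificity


print(nucleotide_accuracy("NNCCCCNN", "NCCCNNNN"))
```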
25
Measures of Prediction Accuracy
Exon Level
[Figure: REALITY and PREDICTION tracks compared, marking a WRONG EXON, a CORRECT EXON, and a MISSING EXON]
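The three exon categories can be counted by comparing predicted exon coordinates against the real ones. The definitions used here are assumptions based on common practice rather than on the slide: a correct exon matches a real exon at both boundaries, a wrong exon is a predicted exon overlapping no real exon, and a missing exon is a real exon overlapping no predicted exon. Exons are (start, end) pairs with inclusive coordinates.

```python
def overlaps(a, b):
    """True if two inclusive (start, end) intervals share at least one base."""
    return a[0] <= b[1] and b[0] <= a[1]


def exon_level_counts(real, predicted):
    """Classify exons as correct, wrong, or missing under the assumed definitions."""
    correct = [e for e in predicted if e in real]
    wrong = [e for e in predicted if not any(overlaps(e, r) for r in real)]
    missing = [r for r in real if not any(overlaps(r, e) for e in predicted)]
    return correct, wrong, missing


real = [(10, 20), (40, 60), (80, 90)]
predicted = [(10, 20), (25, 30), (42, 60)]
print(exon_level_counts(real, predicted))
```

A predicted exon that overlaps a real one without matching both boundaries, such as (42, 60) here, falls into none of the three categories.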
26
GeneMark Accuracy Evaluation
27
Bibliography
  • http://linkage.rockefeller.edu/wli/gene/list.html
  • http://www-hto.usc.edu/software/procrustes/fans_ref/
Gene Discovery Exercise
  • http://metalab.unc.edu/pharmacy/Bioinfo/Gene