Lecture 6: Gene Prediction

Provided by: MICHELLE6

Transcript and Presenter's Notes

Title: Lecture 6: Gene Prediction


1
Lecture 6: Gene Prediction
  • Chapter 6
  • Part 1: Prokaryotic gene organization and gene
    prediction

2
Review of Molecular Genetics
3
  • Promoter
  • -10 and -35 regions upstream of the TSS, where the
    sigma factor recognizes a promoter
  • Operator (where regulator binds) found between
    TSS and SD
  • Transcriptional start site (TSS) occurs at the
    tip of the yellow arrow.
  • Shine-Dalgarno (SD) sequence
  • Start codon and stop codon and concept of ORF
  • UTR of the transcript.

4
  • Which strand?
  • Template strand
  • Coding (Sense) Strand
  • Recognizing theoretical ORFs
  • Start and Stop in same frame

5
Operon Model
  • Operon
  • Polycistronic mRNA is characteristic of prokaryotes
  • Example: the lac operon. Beta-galactosidase, lactose
    permease, and galactoside transacetylase are all
    under control of the same promoter, so they are
    regulated together via the LacI repressor
  • Weak versus strong promoters

6
Review of Terms
  • Template strand
  • Coding strand (sense strand)
  • Software typically looks for ORFs on the coding
    strand in the 5' to 3' direction: 5'-ATG ... TGA-3'
  • (or TAA as the stop codon)
  • (or TAG as the stop codon)
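
The relationship between the two strands and the mRNA can be sketched in a few lines of Python. This is only an illustration: the helper names and the 9-base toy sequence are assumptions made for this example, not part of the lecture.

```python
# Toy illustration of coding strand, template strand, and mRNA.
# Helper names and the example sequence are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna: str) -> str:
    """Return the opposite DNA strand, written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]

def transcribe(template_5to3: str) -> str:
    """Return the mRNA (5' to 3') copied from a template strand given 5' to 3'."""
    return reverse_complement(template_5to3).replace("T", "U")

coding = "ATGGCCTGA"                    # coding (sense) strand, 5' to 3'
template = reverse_complement(coding)   # template strand, 5' to 3'
mrna = transcribe(template)

print(template)  # TCAGGCCAT
print(mrna)      # AUGGCCUGA
```

The mRNA reads the same as the coding strand with U in place of T, which is why software can search the coding strand directly for ATG and the stop codons.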

7
Terminology
  • ORF
  • A series of DNA codons, including a 5' initiation
    codon and a termination codon, that encodes a
    putative or known gene.
  • Exons
  • Portions of the ORF that are transcribed and, when
    combined, form the coding sequence (CDS) for the
    gene.
  • Introns
  • Portions of the ORF that are transcribed and then
    spliced out of the pre-mRNA before translation.
  • Untranslated regions (UTRs)
  • Non-coding regions that are transcribed and flank
    the ORF (in DNA) or the CDS (in mRNA)
  • 5' end (relative to mRNA) UTRs (leader,
    regulatory sites)
  • 3' end UTRs (terminator sites, trailer)
  • cDNA (complementary DNA)
  • Get cDNA by reverse transcription of the mRNA that
    carries the CDS
  • CDS
  • How to get it in the laboratory.
  • How to get it on paper.
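
One plausible reading of "on paper" is to remove the introns from the gene sequence and join what remains. The sketch below assumes the exon coordinates are already known and that the exons listed are coding only (no UTRs); the function name, the coordinates, and the toy sequence are all made up for illustration.

```python
# Minimal sketch: build a CDS by joining exons.
# build_cds, the coordinates, and the sequence are hypothetical.
def build_cds(gene: str, exons: list[tuple[int, int]]) -> str:
    """Concatenate exon segments; coordinates are 0-based, end-exclusive."""
    return "".join(gene[start:end] for start, end in sorted(exons))

# Uppercase = exon, lowercase = intron (toy example only).
gene = "ATGGCC" + "gtaagtttag" + "AAATGA"
exons = [(0, 6), (16, 22)]

cds = build_cds(gene, exons)
print(cds)  # ATGGCCAAATGA
```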

8
Eukaryotic Gene
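[Figure: eukaryotic gene structure. DNA template strand (nucleus), upstream to downstream: promoter, transcription (trc) start site, Exon, intron, Exon, intron, Exon, terminator. Transcription gives pre-mRNA (nucleus); RNA processing gives mRNA with a 5' cap (G-P-P-P), leader, start codon, stop codon, trailer, and a 3' poly(A) tail (AAA).]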
9
Another Example
10
[Figure: making cDNA from mRNA]
  • Start with mRNA: 5' ... AAAAAA 3'
  • Add an oligo(dT) primer (5' TTTTTT 3'), which
    complements the poly(A) tail
  • Bottom strand (DNA) is synthesized by reverse
    transcriptase, giving an mRNA/DNA hybrid
  • Ribonuclease H degrades the RNA.
  • Second strand of DNA is synthesized by DNA
    Polymerase I
  • Result: double-stranded cDNA
11
Finding Genes in Prokaryotes
  • Prediction Strategies

12
Prediction Strategies for Prokaryotes
  • Start and stop codons
  • 83% of E. coli start codons are AUG in mRNA (UUG
    and GUG occur less often)
  • Start in DNA coding strand (5' to 3')?
  • Stop in DNA?
  • Size
  • Stop codons occur at random about once every 21
    codons in noncoding DNA (3 of the 64 codons are
    stop codons)
  • If you have a run of more than 30 sense codons,
    then you may have a coding region (an ORF is
    possible)
  • The average coding region is 317 codons long;
    fewer than 1.8% of all genes are shorter than
    60 codons
  • -35 and -10 recognition sites for sigma factors
  • Transcriptional termination signals
  • Inverted repeats followed by a run of uracils
    found at the 3' end
  • Forms a stem loop structure, which signifies
    termination
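
The size argument above can be checked with a little arithmetic. The sketch assumes, purely for illustration, that noncoding DNA behaves like a random sequence in which all 64 codons are equally likely; real genomes have biased base composition, so the numbers are rough.

```python
# Rough arithmetic behind the "every 21 codons" and "more than 30 codons" rules.
# Assumption: all 64 codons are equally likely in noncoding DNA.
STOP_CODONS = 3          # TAA, TAG, TGA
ALL_CODONS = 64

p_stop = STOP_CODONS / ALL_CODONS
expected_spacing = 1 / p_stop   # average codons between random stops

def p_open_run(n_codons: int) -> float:
    """Chance that n consecutive random codons contain no stop codon."""
    return (1 - p_stop) ** n_codons

print(f"expected spacing between stops: {expected_spacing:.1f} codons")
for n in (30, 60, 317):
    print(f"P(no stop in {n} codons) = {p_open_run(n):.2e}")
```

Under this assumption a stop-free run of 30 codons is still fairly common by chance, while a run as long as the 317-codon average is extremely unlikely, which is why length is used together with the other criteria rather than alone.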

13
Prediction Strategies for Prokaryotes
  • Comparison to a database of known sequences
  • Look for homology: if a sequence shows homology to
    a known gene, it is unlikely to be junk
  • Problem: the database is incomplete
  • Lack of homology doesn't mean it isn't a true
    gene
  • Shine-Dalgarno recognition
  • AGGAGGU (in mRNA)
  • Upstream from the start codon, downstream from
    the transcription (trc) start site
  • Regulatory sites are found upstream from the trc
    start site
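
A Shine-Dalgarno check can be sketched as a search for AGGAGGT (the slide's AGGAGGU written as DNA) in a short window upstream of a candidate start codon. The 20-base window, the mismatch scoring, the function name, and the toy sequence are assumptions for this example; real predictors use more careful scoring.

```python
# Minimal sketch: look for an SD-like site upstream of a start codon.
# Window size, scoring, and the toy sequence are hypothetical choices.
SD_DNA = "AGGAGGT"  # AGGAGGU from the slide, written as DNA

def best_sd_match(seq: str, start_codon_pos: int, window: int = 20):
    """Return (mismatches, position) of the closest SD-like site that lies
    entirely upstream of the start codon, or None if the window is too short."""
    seq = seq.upper()
    lo = max(0, start_codon_pos - window)
    best = None
    for i in range(lo, start_codon_pos - len(SD_DNA) + 1):
        site = seq[i:i + len(SD_DNA)]
        mismatches = sum(a != b for a, b in zip(site, SD_DNA))
        if best is None or mismatches < best[0]:
            best = (mismatches, i)
    return best

toy = "TTGACAAGGAGGTAACATATGAAACGCTAA"
start = toy.find("ATG")
print(best_sd_match(toy, start))  # (0, 6)
```

A result with few mismatches a short distance upstream of the ATG adds support to a candidate ORF; a poor match does not rule the ORF out.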

14
Where does the Reading Frame Start?
  • Use a six-frame search because we don't know
  • 3 reading frames on each strand
  • Double-stranded DNA
  • 5'-tacgtactcaacaatcatgagctggccattttaa-3'
  • 3'-atgcatgagttgttagtactcgaccggtaaaatt-5'
  • Search for atg in the 5' to 3' direction, which
    will represent your start codon
  • Top strand
  • 5'-tacgtactcaacaatcatgagctggccattttaa-3'
  • Bottom strand
  • 5'-ttaaaatggccagctcatgattgttgagtacgta-3'
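
The bottom strand and the six reading frames for the slide's sequence can be reproduced with a short script. The function names and the frame labels printed here (+1 to +3, -1 to -3, counted by offset from the 5' end of each strand) are assumptions of this sketch.

```python
# Sketch: derive the bottom strand and list the codons in all six frames.
# Function names and frame labels are illustrative choices.
COMPLEMENT = str.maketrans("acgt", "tgca")

def reverse_complement(dna: str) -> str:
    """Return the opposite strand, written 5' to 3'."""
    return dna.lower().translate(COMPLEMENT)[::-1]

def codons(strand: str, offset: int) -> list[str]:
    """Split a strand into complete codons starting at the given offset."""
    return [strand[i:i + 3] for i in range(offset, len(strand) - 2, 3)]

top = "tacgtactcaacaatcatgagctggccattttaa"
bottom = reverse_complement(top)
print(bottom)  # ttaaaatggccagctcatgattgttgagtacgta

for label, strand in (("+", top), ("-", bottom)):
    for offset in range(3):
        print(f"{label}{offset + 1}", " ".join(codons(strand, offset)))
```

Reading the output, atg appears as an in-frame codon in the second frame of the top strand, followed in the same frame by the stop codon taa.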

15
Top strand: 5'-tacgtactcaacaatcatgagctggccattttaa-3'
Bottom strand: 5'-ttaaaatggccagctcatgattgttgagtacgta-3'
  • In reality there need to be enough codons between
    the start and stop to represent a real ORF. How
    many?
  • At least 30 codons (90 bases).
  • Average: 317 codons (951 bases).
  • Fewer than 2% are shorter than 60 codons (180
    bases).
  • ORF Finder will designate frames. It assumes the
    single strand you put into the program is the
    coding (sense) strand in the 5' to 3' direction
    and calls this the plus strand. It works out the
    opposite strand and calls this the minus strand.
  • Frames: +1, +2, +3, -1, -2, -3
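
A six-frame ORF scan with a minimum-length filter can be sketched as follows. This is not ORF Finder's actual algorithm: it only considers ATG starts, the frame numbering is this sketch's own convention (offset from the 5' end of each strand), and the function names are hypothetical.

```python
# Minimal six-frame ORF scan (ATG ... stop) with a length cutoff.
# Not ORF Finder's implementation; names and frame labels are assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOPS = {"TAA", "TAG", "TGA"}

def reverse_complement(dna: str) -> str:
    return dna.upper().translate(COMPLEMENT)[::-1]

def find_orfs(seq: str, min_codons: int = 30):
    """Return (frame, start, end, orf) tuples, longest first.
    start/end are 0-based on the strand being read (end exclusive);
    min_codons counts sense codons and excludes the stop codon."""
    results = []
    seq = seq.upper()
    for sign, strand in ((1, seq), (-1, reverse_complement(seq))):
        for offset in range(3):
            start = None
            for i in range(offset, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i                      # first ATG opens the ORF
                elif start is not None and codon in STOPS:
                    if (i - start) // 3 >= min_codons:
                        results.append((sign * (offset + 1), start, i + 3,
                                        strand[start:i + 3]))
                    start = None                   # look for the next ORF
    return sorted(results, key=lambda r: len(r[3]), reverse=True)

lecture_seq = "tacgtactcaacaatcatgagctggccattttaa"
print(find_orfs(lecture_seq, min_codons=0))
print(find_orfs(lecture_seq))   # default 30-codon cutoff
```

With the cutoff switched off, the slide's short sequence gives one small ORF on each strand; with the 30-codon cutoff neither survives, which illustrates why a length threshold is needed before calling something a real ORF.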
16
Question to Ponder!
  • What if you find an ORF in a prokaryote with
    several supporting criteria, but you don't find
    the promoter region close by upstream?

17
Gene Prediction Software
  • ORF Finder
  • http://www.ncbi.nlm.nih.gov/gorf/gorf.html
  • Finds all start and stop codons
  • Sorts by size
  • Links for easy BLAST
  • Useful for Prokaryote ORF finding
  • Not very useful for Eukaryote DNA
  • Prok Practice together

18
Practice
  • Download the prok practice sequence from the
    course web page.
  • Copy the sequence and open ORF Finder.
  • Paste the sequence into ORF Finder and run.
  • Look at the BLAST output of each possible ORF.
  • Look at sizes of putative ORFs