Genome Composition Questions - PowerPoint PPT Presentation

About This Presentation
Slides: 7
Provided by: OHUI

Transcript and Presenter's Notes

Title: Genome Composition Questions


1
Genome Composition Questions
Q1. How many sites for EcoRI (a six-cutter)?
Assume uniform base composition of 25% each A, C, G, T. For any position the probability of a particular nucleotide is 0.25. One quarter of sites have G. Of those with G, another quarter have A immediately following, so one sixteenth of sites have GA.
For any exact sequence ABCDE..., the probability of occurrence is $p(A) \cdot p(B) \cdot p(C) \cdot p(D) \cdot p(E) \cdots$, where $p(A)$ indicates the probability of A, etc. In this case $p(A) = p(C) = p(G) = p(T) = 0.25$.
(1) EcoRI sites occur with probability $(0.25)^6 = 2.4 \times 10^{-4}$. 48.6 kb of sequence should contain $48600 \times 2.4 \times 10^{-4} \approx 12$ EcoRI sites. Likewise GGTNACC occurs with the same probability, because N matches any base and only six positions are specified.
(2) EcoRI sites are underrepresented in lambda because the host, E. coli, produces the EcoRI enzyme, which must be avoided by the phage if it is to survive.
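The counting argument above can be sketched in a few lines of Python. This is only an illustration under the slide's assumptions (independent positions, 25% per base); the function name `expected_sites` is hypothetical, and GAATTC is used as the EcoRI recognition sequence. Like the slide, it multiplies by the full sequence length rather than by the slightly smaller number of possible start positions.

```python
def expected_sites(site: str, sequence_length_bp: int, base_prob: float = 0.25) -> float:
    """Expected number of occurrences of a recognition site in random sequence.

    Assumes every position is independent and every base has probability
    base_prob. 'N' matches any base, so it contributes a factor of 1.
    """
    specified = sum(1 for base in site.upper() if base != "N")
    p_site = base_prob ** specified  # e.g. 0.25**6 for a six-cutter
    return sequence_length_bp * p_site


if __name__ == "__main__":
    # Lambda-sized sequence of 48.6 kb, as in the slide
    print(expected_sites("GAATTC", 48_600))   # six specified bases
    print(expected_sites("GGTNACC", 48_600))  # six specified bases plus N
```

Both calls give the same value, since only the six specified bases constrain the match.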
2
Q2.
The spacers could differ in size because promoters are present and interference between genes must be avoided. Similarly, DNA unwinding needs to occur ahead of the gene to facilitate transcription.
The mean spacer size is $(300 + 600)/2 = 450$ bp.
182 genes give $450 \times 182 = 81.9$ kb of spacer.
Therefore $315 - 81.9 = 233$ kb are coding.
Average gene length is $233/182 = 1.28$ kb. This can encode 427 amino acids (1280 bp / 3 bp per codon).
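The same arithmetic as a small Python sketch. The function name and parameter names are hypothetical, and it assumes, as the slide does, one mean-sized spacer per gene and three bases per amino acid.

```python
def mean_gene_length(total_bp: float, n_genes: int,
                     spacer_min_bp: float, spacer_max_bp: float) -> tuple[float, float]:
    """Return (average gene length in bp, amino acids encoded).

    Assumes one spacer per gene, with the mean spacer taken as the
    midpoint of the observed spacer size range.
    """
    mean_spacer_bp = (spacer_min_bp + spacer_max_bp) / 2
    coding_bp = total_bp - mean_spacer_bp * n_genes
    gene_bp = coding_bp / n_genes
    return gene_bp, gene_bp / 3  # three bases per codon


if __name__ == "__main__":
    gene_bp, amino_acids = mean_gene_length(315_000, 182, 300, 600)
    print(f"{gene_bp:.0f} bp per gene, about {amino_acids:.0f} amino acids")
```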
3
Q3
The mean GC content of the whole unit is determined from the length-weighted combination of the GC contents of the component units (lengths in bp, GC contents in %):
$8200 \times 67.3 = 1825 \times 53.8 + 162 \times 59.9 + 4082 \times 65.4 + 2131 \times X$
therefore $X \approx 83$.
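A minimal sketch of solving this weighted mean for the unknown component, using the numbers from the slide. The helper name `missing_gc` is hypothetical; the length of the unknown component is taken as whatever remains after the known components are subtracted from the total.

```python
def missing_gc(total_bp: float, total_gc_percent: float,
               known: list[tuple[float, float]]) -> float:
    """Solve a length-weighted mean for one unknown GC content.

    known is a list of (length_bp, gc_percent) pairs for the components
    whose GC content is given.
    """
    known_bp = sum(bp for bp, _ in known)
    known_gc_sum = sum(bp * gc for bp, gc in known)
    missing_bp = total_bp - known_bp
    return (total_bp * total_gc_percent - known_gc_sum) / missing_bp


if __name__ == "__main__":
    components = [(1825, 53.8), (162, 59.9), (4082, 65.4)]
    print(f"X = {missing_gc(8200, 67.3, components):.1f}% GC")
```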
4
Q4
The average GC in genes is 73%. Let $n$ = proportion of noncoding DNA.
Overall GC = GC-coding + GC-noncoding:
$0.72 = 0.73 \times (1 - n) + 0.67n$
therefore
$0.72 = 0.73 - 0.73n + 0.67n$
$-0.01 = -0.06n$
$n = 1/6$ noncoding.
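Rearranging the mixing equation gives n = (GC-coding - GC-overall) / (GC-coding - GC-noncoding). A short Python sketch of that rearrangement follows; the function name is hypothetical and the values are those used on the slide.

```python
def noncoding_fraction(gc_overall: float, gc_coding: float, gc_noncoding: float) -> float:
    """Fraction n of noncoding DNA from a two-component GC mixture.

    Solves gc_overall = gc_coding * (1 - n) + gc_noncoding * n for n.
    """
    if gc_coding == gc_noncoding:
        raise ValueError("coding and noncoding GC must differ to solve for n")
    return (gc_coding - gc_overall) / (gc_coding - gc_noncoding)


if __name__ == "__main__":
    n = noncoding_fraction(gc_overall=0.72, gc_coding=0.73, gc_noncoding=0.67)
    print(f"noncoding fraction = {n:.3f}")
```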
5
Q5
For a specific region, cDNA has detected a proportion 125/483 of the genes. Therefore the 4615 unique cDNAs represent only the 125/483 part of the complete transcriptome.
There must be $4615 \times 483/125 \approx 17800$ genes in C. elegans.
483 genes are found in 2.181 Mb of DNA; 17800 genes are found in X Mb of DNA.
Then $X = (17800 \times 2.181)/483 \approx 80.4$ Mb (80.5 Mb if the unrounded gene count of 17832 is used).
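The two scaling steps can be sketched as follows. The function and parameter names are hypothetical, and the sketch assumes, as the slide does, that the sampled region is representative of the whole genome in both cDNA coverage and gene density. It carries the unrounded gene count through to the genome size, which is why it lands on 80.5 Mb.

```python
def estimate_genome(unique_cdnas: int, genes_detected: int,
                    genes_in_region: int, region_mb: float) -> tuple[float, float]:
    """Return (estimated total genes, estimated genome size in Mb).

    Step 1: the cDNAs cover genes_detected / genes_in_region of all genes.
    Step 2: the genome has the same gene density as the sampled region.
    """
    total_genes = unique_cdnas * genes_in_region / genes_detected
    genome_mb = total_genes * region_mb / genes_in_region
    return total_genes, genome_mb


if __name__ == "__main__":
    genes, size_mb = estimate_genome(4615, 125, 483, 2.181)
    print(f"about {genes:.0f} genes in about {size_mb:.1f} Mb")
```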
6
Q6.
To obtain any 3 bp sequence, the probability is 1/64.
To obtain ATG (initiator), the probability is 1/64.
To have an ORF (no TAG, TGA or TAA), the probability is 61/64 per codon.
To have a 199-codon ORF, the probability is $(61/64)^{199}$.
To have a stop codon in the 200th codon position, the probability is 3/64.
Combined probability is $\frac{1}{64} \times (61/64)^{199} \times \frac{3}{64} = 3 \times 61^{199}/64^{201}$.
(i) This can be rewritten as $3 \times 10^{(199 \log 61 - 201 \log 64)}$.
(ii) is given by $(61/64)^{199} = 7.1 \times 10^{-5}$.
(iii) is given by $1 - 7.1 \times 10^{-5}$.
(iv) Let $L$ be this length; then $(61/64)^L = 1 - (61/64)^L$, so $(61/64)^L = 0.5$ and $L = \log(0.5)/\log(61/64) \approx 14.4$ codons.
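These quantities are easy to check numerically. The sketch below is an illustration only, assuming independent codons with uniform base composition as on the slide; the function names are hypothetical.

```python
import math

P_START = 1 / 64   # ATG at a given codon position
P_SENSE = 61 / 64  # any codon other than TAG, TGA or TAA
P_STOP = 3 / 64    # one of the three stop codons


def p_open(n_codons: int) -> float:
    """Probability that n_codons consecutive codons contain no stop."""
    return P_SENSE ** n_codons


def p_exact_orf(n_sense_codons: int) -> float:
    """Probability of ATG, then n_sense_codons without a stop, then a stop."""
    return P_START * p_open(n_sense_codons) * P_STOP


def half_open_length() -> float:
    """Length L (in codons) at which a stretch is open with probability 0.5."""
    return math.log(0.5) / math.log(P_SENSE)


if __name__ == "__main__":
    print(f"combined probability: {p_exact_orf(199):.2e}")
    print(f"199 codons open:      {p_open(199):.2e}")
    print(f"at least one stop:    {1 - p_open(199):.6f}")
    print(f"L for p = 0.5:        {half_open_length():.1f} codons")
```

The ratio of logarithms is the same in any base, so `math.log` (natural log) gives the same L as the base-10 logs on the slide.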