Gene Prediction in Eukaryotes Simplified

Provided by: Ole105

1
Gene Prediction in Eukaryotes Simplified
  • For highly conserved proteins
  • Translate DNA sequence in all 6 reading frames
  • BLASTX or FASTX to compare the sequence to a
    protein sequence database
  • Or
  • Protein compared against nucleic acid database
    including genomic sequence that is translated in
    all six possible reading frames by TBLASTN,
    TFASTX/TFASTY programs.
  • Note: this gives only an approximation of the gene
    structure.
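
As a rough illustration of the "translate in all 6 reading frames" step, here is a minimal Python sketch. It assumes the standard genetic code and plain A/C/G/T input; the function names are made up for this example and are not part of BLASTX or any other tool named above.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frame_translations(dna):
    """Map frame labels (+1..+3, -1..-3) to their conceptual translations."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


frames = six_frame_translations("ATGGCCTAA")
print(frames["+1"], frames["-1"])
```

Each of the six translated strings would then be compared against a protein database, which is what BLASTX-style searches do internally.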

2
  • Transcript-based prediction
  • How it works
  • Align transcript data to genomic sequence using
    a pair-wise sequence comparison

Gene Model
EST
cDNA
3
  • Transcript-based gene prediction algorithms
  • BLAST (Altschul) (36 hours)
  • Widely used and understood
  • HSPs often have ragged ends, so alignments must be
    extended to the ends of the introns
  • EST_GENOME (Mott) (3 days)
  • Dynamic programming post-process of BLAST
  • Slow and sometimes cryptic
  • BLAT (Kent) (1/2 hour)
  • Next generation of alignment algorithm
  • Designed for aligning nearly identical sequences
  • Faster and more accurate than BLAST

4
  • Peptide-based gene prediction algorithms
  • BLAST (Altschul)
  • Widely used and understood
  • Smith-Waterman
  • Preliminary to further processing
  • Used in preference to DNA-based similarities for
    evolutionarily diverged species, as peptide
    conservation is significantly higher than
    nucleotide conservation

5
Gene prediction in eukaryotes
  • When assessing a gene prediction program, two
    criteria are used
  • Sensitivity: proportion of true sites (e.g.
    exons or donor splice sites) that are predicted
    correctly
  • Specificity: proportion of predicted sites that
    are correct
  • Most gene prediction programs concentrate on the
    prediction of protein-coding exons and
    distinguish 4 types of exons
  • 1. Initial exon: initiation codon to first 5'
    splice junction
  • 2. Internal exon: 3' splice site to 5' splice site
  • 3. Terminal exon: 3' splice site to stop codon
  • 4. Single exon: intronless gene
  • The ideal gene-finding program would perfectly
    mimic the cell's transcription, splicing and
    translation machinery. This is not yet possible,
    but a number of biologically important signal
    sequences inform today's algorithms

6
Unfortunately, only 70% of human promoters
actually contain the core signal sequences cited
above. Moreover, the AATAAA polyadenylation
signal is absent from 50% of 3' untranslated
regions. Hence it is hard to determine the
beginning and the end of a gene.

Translational Signals
Three signals are important here:
  • 1. Start codon (ATG)
  • 2. The optimal context for initiation of
    translation in vertebrate mRNA is GCCACCatgG.
    This is sometimes referred to as the Kozak
    signal.
  • 3. Termination codons TGA, TAA, TAG: they should
    be absent, in frame, from the interior of coding
    exons.

Splicing Signals
Nuclear pre-mRNA introns are excised by
spliceosomes. These are large ribonucleoprotein
complexes that recognize three kinds of sites:
  • 1. 5' donor site: GT
  • 2. 3' acceptor site: AG
  • 3. Branch point: internal site
In addition, upstream of the acceptor site
there is a bias towards pyrimidines (T, C). The
rules about the donor and acceptor sites are
almost universal. However, splice-site usage is
often influenced by exonic and intronic signals
that are located away from the splice
junctions.
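
A small Python sketch of two of the translational signals described above: checking a candidate coding region for in-frame stop codons, and counting how many positions around an ATG agree with the Kozak context GCCACCatgG. The function names and the simple match count are illustrative assumptions, not a published scoring scheme.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
KOZAK = "GCCACCATGG"  # optimal vertebrate context; the ATG starts at offset 6


def has_in_frame_stop(coding_seq, frame=0):
    """True if any complete codon read from `frame` is a stop codon."""
    seq = coding_seq.upper()
    return any(seq[i:i + 3] in STOP_CODONS
               for i in range(frame, len(seq) - 2, 3))


def kozak_matches(dna, atg_pos):
    """Count positions (out of 10) that agree with GCCACCATGG around an ATG."""
    start = atg_pos - 6
    if start < 0:
        return 0  # not enough upstream sequence to evaluate the context
    window = dna.upper()[start:start + len(KOZAK)]
    return sum(1 for a, b in zip(window, KOZAK) if a == b)


print(has_in_frame_stop("ATGGCCTGG"))       # no stop codon in frame 0
print(has_in_frame_stop("ATGTAAGCC"))       # TAA is the second codon
print(kozak_matches("TTGCCACCATGGTT", 8))   # ATG at index 8, perfect context
```

A candidate exon that contains an in-frame stop in every frame cannot be an internal coding exon, which is one of the cheapest filters a gene finder can apply.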
7
Gene Finding Challenges
  • Need the correct reading frame
  • Introns can interrupt an exon in mid-codon
  • There is no hard and fast rule for identifying
    donor and acceptor splice sites
  • Signals are very weak

8
(No Transcript)
9
Overpredicting Genes
  • Easy to predict all exons
  • Report all sequences flanked by ..AG and GT.. as
    exons
  • Sensitivity: 100%
  • Specificity: ~0%
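
The naive strategy above can be sketched in a few lines of Python. This is only an illustration of why it overpredicts: every AG followed somewhere downstream by a GT yields a candidate, so the number of candidates grows roughly quadratically with sequence length. The function name and the half-open (start, end) convention are choices made for this example.

```python
import re


def naive_exon_candidates(dna, min_len=1):
    """Return (start, end) for every interval preceded by AG and followed by GT.

    Intervals are half-open: dna[start:end] is the candidate exon.
    """
    dna = dna.upper()
    # A candidate exon starts right after an acceptor AG ...
    starts = [m.start() + 2 for m in re.finditer(r"(?=AG)", dna)]
    # ... and ends right before a donor GT.
    ends = [m.start() for m in re.finditer(r"(?=GT)", dna)]
    return [(s, e) for s in starts for e in ends if e - s >= min_len]


print(naive_exon_candidates("CAGAAAGTCC"))
```

On real genomic sequence almost all of these candidates are false positives, which is why the specificity collapses.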

10
Methods for GENE Identification
  • Homology based (e.g. Procrustes)
  • sequence similarity with known proteins (need
    close homologs)
  • coding regions fairly well conserved
  • average identity at AA level of human and mouse
    > 85%
  • TBLASTX used to find exons
  • does not attempt to find complete gene structure
    (i.e. doesn't effectively find actual splice
    boundaries)
  • Similarity searches miss some genes!
  • HMMs (GenScan, HMMgene, VEIL)
  • probabilistic model
  • uses description of gene structure (e.g. splice
    junctions, coding regions, start/stop codons)
  • mixed HMMs and other probabilistic models
  • Neural Nets (GRAIL, NetGene2 (splice sites))

11
HMMgene 1.1
http://www.cbs.dtu.dk/services/HMMgene/
The methods used are described in the paper: A.
Krogh. Two methods for improving performance of
an HMM and their application for gene finding.
In Proc. of Fifth Int. Conf. on Intelligent
Systems for Molecular Biology, ed. Gaasterland,
T. et al., Menlo Park, CA: AAAI Press, 1997, pp.
179-186.
  • The program predicts whole genes, so the
    predicted exons always splice correctly.
  • It can predict several whole or partial genes in
    one sequence, so it can be used on whole cosmids
    or even longer sequences. HMMgene can also be
    used to predict splice sites and start/stop
    codons.
  • The program is based on a hidden Markov model,
    which is a probabilistic model of the gene
    structure.
  • Apart from reporting the best prediction, HMMgene
    can also report the N best gene predictions for
    a sequence. This is useful if there are several
    equally likely gene structures and may even
    indicate alternative splicing.
  • HMMgene takes an input file with one or more DNA
    sequences in FASTA format.
  • It also has a few options for changing the
    default behavior of the program.
  • The output is a prediction of partial or complete
    genes in the sequences.
  • The output is in a standardized format that is
    easily read by other programs, which specifies
    the location of all the predicted genes and
    their coding regions and

Sequence  Source      Feature  Start  End    Score  Frame  Group
SEQ1      HMMgene1.1  firstex  692    702    0.347  2      bestparse:cds_1
SEQ1      HMMgene1.1  exon_1   2473   2711   0.421  1      bestparse:cds_1
SEQ1      HMMgene1.1  exon_2   2897   3081   0.544  0      bestparse:cds_1
SEQ1      HMMgene1.1  exon_3   10376  10563  0.861  2      bestparse:cds_1
SEQ1      HMMgene1.1  exon_4   11841  11891  0.857  2      bestparse:cds_1
SEQ1      HMMgene1.1  exon_5   12387  12483  0.993  0      bestparse:cds_1
SEQ1      HMMgene1.1  exon_6   13076  13211  0.970  1      bestparse:cds_1
SEQ1      HMMgene1.1  exon_7   13332  13415  0.926  1      bestparse:cds_1
SEQ1      HMMgene1.1  exon_8   13515  13603  1.000  0      bestparse:cds_1
SEQ1      HMMgene1.1  exon_9   14180  14235  1.000  2      bestparse:cds_1
SEQ1      HMMgene1.1  exon_10  14321  14408  0.999  0      bestparse:cds_1
SEQ1      HMMgene1.1  exon_11  14483  14579  0.877  1      bestparse:cds_1
SEQ1      HMMgene1.1  exon_12  14697  14764  0.639  0      bestparse:cds_1
SEQ1      HMMgene1.1  exon_13  14901  15030  0.835  1      bestparse:cds_1
SEQ1      HMMgene1.1  lastex   15643  15704  0.987  0      bestparse:cds_1
SEQ1      HMMgene1.1  CDS      692    15704  0.132  .      bestparse:cds_1
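
Because the output is easily read by other programs, a short Python sketch can post-process it. The example below copies the feature name, start, end and score of each exon from the prediction above, sums the coding length, and lists exons with a low score. Treating the score column as a confidence value and using 0.5 as the cutoff are assumptions made for illustration.

```python
# Feature, start, end and score of each predicted exon, copied from the
# HMMgene example output above (inclusive, 1-based coordinates).
PREDICTION = """\
firstex 692 702 0.347
exon_1 2473 2711 0.421
exon_2 2897 3081 0.544
exon_3 10376 10563 0.861
exon_4 11841 11891 0.857
exon_5 12387 12483 0.993
exon_6 13076 13211 0.970
exon_7 13332 13415 0.926
exon_8 13515 13603 1.000
exon_9 14180 14235 1.000
exon_10 14321 14408 0.999
exon_11 14483 14579 0.877
exon_12 14697 14764 0.639
exon_13 14901 15030 0.835
lastex 15643 15704 0.987
"""


def parse_exons(text):
    """Parse 'feature start end score' lines into typed tuples."""
    exons = []
    for line in text.splitlines():
        feature, start, end, score = line.split()
        exons.append((feature, int(start), int(end), float(score)))
    return exons


exons = parse_exons(PREDICTION)
coding_length = sum(end - start + 1 for _, start, end, _ in exons)
weak = [name for name, _, _, score in exons if score < 0.5]

print(coding_length, coding_length % 3)  # a whole gene should be a multiple of 3
print(weak)
```

The summed exon length is a multiple of three, consistent with the claim that the predicted exons of a whole gene splice into one open reading frame.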
12
(No Transcript)
13
  • GENSCAN
  • differs from the majority of gene-finding
    algorithms as it can identify complete, partial
    and multiple genes on both DNA strands.
  • The program is based on a probabilistic model of
    gene structure/compositional properties and does
    not make use of protein sequence homology
    information.
  • The program is suitable for vertebrate, maize and
    Arabidopsis sequences.
  • The vertebrate version also works fairly well
    for Drosophila sequences.

http://genome.dkfz-heidelberg.de/cgi-bin/GENSCAN/genscan.cgi
14
http://genes.mit.edu/GENSCAN.html
15
(No Transcript)
16
(No Transcript)
17
TIGRscan has now been replaced by a new gene
finder called GeneZilla
www.genezilla.org
18
GenLang
  • GenLang is a syntactic pattern recognition
    system, which uses the tools and techniques of
    computational linguistics to find genes and
    other higher-order features in biological
    sequence data.
  • Patterns are specified by means of rule sets
    called grammars, and a general purpose parser,
    implemented in the logic programming language
    Prolog, then performs the search.

http://arete.ibb.waw.pl/PL/html/gene_lang.html
19
VEIL (Viterbi Exon-Intron Locator) (Henderson et
al.)
  • Uses Expectation Maximization (E-M) to train the
    model
  • Uses the Viterbi algorithm (dynamic programming)
    to align new sequences, i.e. finds the most
    likely sequence of states
EXPERIMENTAL RESULTS
  • Correctly located both ends of 53% of coding
    exons
  • 49% of the exons that VEIL predicted were exactly
    correct
http://www.tigr.org/salzberg/veil.html
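
To make "finds the most likely sequence of states" concrete, here is a minimal Viterbi sketch in Python for a toy two-state HMM (E = exon-like, I = intron-like). The states, probabilities and test sequence are invented for illustration; VEIL's actual model has many more states and trained parameters.

```python
import math


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely state path for a non-empty `obs` under a first-order HMM.

    Works in log space to avoid underflow on long sequences.
    """
    scores = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]])
               for s in states}]
    back = []
    for symbol in obs[1:]:
        column, pointers = {}, {}
        for s in states:
            # Best predecessor state for being in state s at this position.
            prev, best = max(
                ((p, scores[-1][p] + math.log(trans_p[p][s])) for p in states),
                key=lambda item: item[1],
            )
            column[s] = best + math.log(emit_p[s][symbol])
            pointers[s] = prev
        scores.append(column)
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: scores[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return "".join(reversed(path))


# Toy parameters (assumed): exons GC-rich, introns AT-rich, sticky states.
START = {"E": 0.5, "I": 0.5}
TRANS = {"E": {"E": 0.9, "I": 0.1}, "I": {"E": 0.1, "I": 0.9}}
EMIT = {"E": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
        "I": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4}}

print(viterbi("GCGCGCATATAT", "EI", START, TRANS, EMIT))
```

The returned string labels every base with its most probable state, which is how an HMM gene finder turns a raw sequence into exon and intron coordinates.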
20
Exon and Stop Codon models in VEIL
2 blank states on either side can output any base
(allow alignment to proper reading frame)
21
Intron model (VEIL)
22
Neural Networks
23
  • GrailEXP is a software package that predicts
    exons, genes, promoters, polyAs, CpG islands,
    EST similarities, and repetitive elements within
    DNA sequence.
  • GrailEXP is used by the Computational Biosciences
    Section at Oak Ridge National Laboratory to
    annotate the entire known portion of the human
    genome (including both finished and draft data).

GrailPro
http://compbio.ornl.gov/grailexp/
Not in our package
[Figure: neural network for exon scoring. Input
layer: score of 6-mers in candidate, score of
6-mers in flanks, Markov model score, flanks GC,
candidate GC, score for splicing/acceptor, GC-rich
regions preference score correction. Hidden
layer. Output: exon score.]
24
(No Transcript)
25
(No Transcript)
26
GeneParser
Snyder, E. E., Stormo, G. D. (1995)
Identification of Coding Regions in Genomic
DNA. J. Mol. Biol. 248: 1-18.
  • The program scores all subintervals in a sequence
    for content statistics indicative of introns and
    exons and for sites which identify their
    boundaries.
  • This information is weighted by a neural network
    to approximate the log-likelihood that each
    subinterval exactly represents an intron or exon
    (first, internal or last).
  • A dynamic programming (DP) algorithm is then
    applied to this data to find the combination of
    introns and exons which maximizes the likelihood
    function.

Display of suboptimal solutions for the human
growth hormone gene.
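
The dynamic programming step can be sketched as follows. This is a simplified illustration, not GeneParser's actual recursion: it assumes a caller-supplied (hypothetical) function score(kind, i, j) giving a log-likelihood that the interval [i, j) is exactly one exon or one intron, and it finds the best alternating exon/intron parse of the whole sequence.

```python
def best_parse(n, score):
    """Best alternating exon/intron segmentation of positions [0, n), n >= 1.

    score(kind, i, j) is a hypothetical log-likelihood that the half-open
    interval [i, j) is exactly one 'exon' or one 'intron'.
    """
    kinds = ("exon", "intron")
    other = {"exon": "intron", "intron": "exon"}
    best = {k: [float("-inf")] * (n + 1) for k in kinds}
    back = {k: [0] * (n + 1) for k in kinds}
    for j in range(1, n + 1):
        for kind in kinds:
            for i in range(j):
                # Segment [i, j) of this kind follows a parse that ends at i
                # with the other kind, or opens the parse when i == 0.
                previous = 0.0 if i == 0 else best[other[kind]][i]
                total = previous + score(kind, i, j)
                if total > best[kind][j]:
                    best[kind][j] = total
                    back[kind][j] = i
    kind = max(kinds, key=lambda k: best[k][n])
    total, segments, j = best[kind][n], [], n
    while j > 0:
        i = back[kind][j]
        segments.append((kind, i, j))
        kind, j = other[kind], i
    return total, segments[::-1]


# Toy scoring (assumed): +1 per position agreeing with a hidden labelling,
# -1 per disagreement, and a 0.5 penalty per segment.
TRUTH = "EEEIIEE"


def toy_score(kind, i, j):
    letter = "E" if kind == "exon" else "I"
    matches = sum(1 for c in TRUTH[i:j] if c == letter)
    return matches - (j - i - matches) - 0.5


print(best_parse(len(TRUTH), toy_score))
```

In the real program the interval scores come from the neural network, and keeping the runner-up parses gives the suboptimal solutions mentioned above.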
27
Integrated Systems
Adding Homology
  • Can try to include information from databases of
    known proteins to help decide whether an exon is
    coding
  • For each candidate exon, increase the score if
    there is homology with a known protein
  • This approach used by Genie, GeneID,
    GeneParser3, Grail

Adding ESTs
  • Can try to include information from EST databases
  • EST (Expressed Sequence Tag) databases show
    sequences that are known to be present in mRNA
    (cDNA)
  • For each candidate exon, increase the score if it
    matches to an EST
  • Used by AAT, Grail

Drawbacks
Using homology or ESTs may bias results toward
genes similar to known genes (homology) or
highly expressed genes (ESTs)
28
Homology-based gene prediction
TWAIN is a new syntenic genefinder which employs
a Generalized Pair Hidden Markov Model (GPHMM) to
predict genes in two closely related eukaryotic
genomes simultaneously.
http://www.tigr.org/software/pirate/twain/twain.html
29
GeneSeqer (Brendel et al.)
http://deepc2.psi.iastate.edu/cgi-bin/gs.cgi
Spliced Alignment Algorithm
  • Perform pairwise alignment with large gaps in
    one sequence (introns)
  • Align genomic DNA with cDNA, EST or protein
  • Score semi-conserved sequences at splice
    junctions
  • Score coding constraints in translated exons

[Figure: GeneSeqer workflow. Labels: genomic
sequence, EST or protein database, fast search,
spliced alignment, assembly, output.]
30
(No Transcript)
31
(No Transcript)
32
(No Transcript)
33
Evaluation of Gene Prediction Methods
  • What to consider when comparing
  • type of analysis (neural network, linear
    discriminant etc.)
  • types of sequences used for training and
    testing
  • Also, parameters affect the predictions
  • An ideal method should use
  • A known set of gene structures (training)
  • A different set for testing
  • Evaluation is more stringent when
  • The test set includes a gene and its neighboring
    sequence, rather than only the sequence between
    the first and the last exons

34
Evaluation of Gene Prediction
Am I finding the things that I'm supposed to
find?
What fraction of my predictions are true?
35
Ideal Distribution of Scores
More Realistically
36
  • Number of actual positives: AP = TP + FN
  • Number of actual negatives: AN = FP + TN
  • Number of predicted positives: PP = TP + FP
  • Number of predicted negatives: PN = TN + FN
  • Sensitivity: SN = TP/AP = TP/(TP + FN)
  • Specificity: SP = TP/PP = TP/(TP + FP)
  • Correlation coefficient: ranges over [-1, 1]
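
These measures are easy to compute from the four counts. The Python sketch below uses the definitions on this slide for SN and SP. The slide does not give a formula for the correlation coefficient, so the sketch assumes the commonly used Matthews form, (TP·TN − FP·FN) / sqrt(AP·AN·PP·PN). The counts in the example are invented.

```python
import math


def prediction_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, correlation coefficient).

    Specificity follows the gene-finding convention used on this slide,
    TP / (TP + FP). Assumes at least one actual and one predicted positive.
    """
    sn = tp / (tp + fn)
    sp = tp / (tp + fp)
    # Assumed Matthews-style correlation coefficient, in [-1, 1].
    denom = math.sqrt((tp + fn) * (tn + fp) * (tp + fp) * (tn + fn))
    cc = (tp * tn - fp * fn) / denom if denom else float("nan")
    return sn, sp, cc


# Invented nucleotide-level counts for illustration.
sn, sp, cc = prediction_metrics(tp=80, fp=20, tn=880, fn=20)
print(f"SN={sn:.2f} SP={sp:.2f} CC={cc:.3f}")
```

Unlike SN and SP taken alone, the correlation coefficient also rewards true negatives, so a program cannot score well simply by calling everything an exon.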

37
  • In a later study (Zhang 97)
  • Programs including protein sequence DB searches
    (GeneID, GeneParser3) achieved substantially
    greater accuracy (Burset 96)
  • Gene prediction programs reliably locate genomic
    regions, but provide only an approximation of
    gene structure

38
  • 2001: Rogic redid the comparison with a better
    test set of 195 genes
  • http://www.cse.ucsc.edu/rogic/evaluation.html

REFERENCES
A. Krogh. 1998. Gene finding: putting the
parts together.
http://www.cbs.dtu.dk/krogh/publications/ps/Krogh98b.pdf
D. Haussler. 1998. Computational genefinding.
http://www.cse.ucsc.edu/haussler/grpaper.pdf
39
Exons Predicted in an Arabidopsis Genomic Sequence
Note: the Arabidopsis UVH1 gene (with approx. 250 bp
upstream from the first exon and 200 bp
downstream from the last exon) was used. The
results are NOT to be taken as a measure of the
reliability of these programs.
x not predicted includes the termination
codon
40
About GeneZilla
http://www.genezilla.org/
About TWAIN
http://www.tigr.org/software/pirate/twain/twain.html
41
  • Some of the best programs
  • GenScan http://genes.mit.edu/GENSCAN.html
  • GeneMark http://opal.biology.gatech.edu/GeneMark/
  • Other programs
  • AAT
  • EcoParse
  • Fexeh
  • Fgeneh
  • Fgenes
  • Finex
  • GeneHacker
  • GeneID-3
  • GeneParser 2
  • GeneScope
  • Genie
  • GenLang
  • Glimmer, GlimmerM
  • Grail II

42
BCM GeneFinder
Baylor College of Medicine, Houston, TX
http://searchlauncher.bcm.tmc.edu/seq-search/gene-search.html
http://www.genefinding.org/software.html
http://www.tigr.org/software/
http://www.fruitfly.org/seq_tools/genie.html
INDEX SITES
http://restools.sdsc.edu/biotools/biotools16.html
http://www.bioinformatics.vg/index.shtml
43
PROMOTER PREDICTION IN EUKARYOTES
44
(No Transcript)
45
  • Transcriptional Regulation in Eukaryotes
  • Transcription involves the interaction of TFs
    (transcription factors, protein complexes) with
  • Each other
  • DNA-binding sites in the promoter region
  • Degree of expression of a gene is influenced by
  • the region upstream from the transcription start
    point
  • the region downstream
  • A TATA box is present in most eukaryotes (75% in
    vertebrates)
  • A TATA box HMM trained for vertebrates has the
    consensus sequence TATAWDR starting at 17 bp
    from TSS
  • W = A/T, D = not C, R = G/A
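
The degenerate consensus can be turned into a simple pattern search. The Python sketch below expands the IUPAC symbols used above (W, D, R) into a regular expression and reports every match, including overlapping ones. It is only a literal consensus match, not the HMM mentioned on the slide, and the example sequence is invented.

```python
import re

# IUPAC codes needed for TATAWDR: W = A/T, D = not C, R = G/A.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "W": "[AT]", "D": "[AGT]", "R": "[AG]"}


def consensus_to_regex(consensus):
    """Compile an IUPAC consensus into a lookahead regex (overlaps allowed)."""
    body = "".join(IUPAC[symbol] for symbol in consensus.upper())
    return re.compile("(?=(" + body + "))")


def find_consensus(dna, consensus="TATAWDR"):
    """Return (position, matched sequence) for each consensus match."""
    pattern = consensus_to_regex(consensus)
    return [(m.start(), m.group(1)) for m in pattern.finditer(dna.upper())]


print(find_consensus("GGCTATAAAAGGC"))
```

A plain consensus search like this treats every allowed base as equally good, which is why weight matrices and HMMs are preferred for real promoter prediction.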

46
  • INR also influences the start position of
    transcription.
  • a loosely defined sequence around TSS
  • may be recognized by other protein subunits of
    TFIID (a TF that recognizes and binds to the
    promoter DNA)
  • CCAAT and GC boxes also discovered around TSS (at
    variable distances)
  • Many different TFs may be involved in the
    regulation of a particular eukaryotic gene.
    DNA-binding sites for many of these TFs are
    unknown, which limits promoter prediction.

47
  • Gene expression is also influenced by the region
    upstream of the core promoter and other enhancer
    sites.
  • Eukaryotic sequences show variation not only b/w
    species but also among genes within a species.
    Hence, a set of promoters in an organism that
    share a common regulatory response is analyzed
  • The programs can predict 13-54% of the TSSs
    correctly, but each program also predicted a
    number of false-positive TSSs.

48
Finding Less-conserved Binding Sites
  • In E. coli the sequences could be aligned by TSS,
    -10 and -35 regions. In many cases, it is not
    possible to find a conserved binding site by
    aligning the sequences.
  • The problem is similar to finding patterns common
    to a set of protein sequences that cannot be
    aligned, but more difficult.
  • Methods
  • Expectation maximization
  • Guess an initial scoring matrix of estimated
    length.
  • Scan each sequence, calculate the probability of
    a match at each position, update the scoring
    matrix (sequence position x probability), then
    repeat until there is no change.
  • Hidden Markov Models
  • Statistical Method of Finding Patterns
  • A dinucleotide analysis is performed to reduce
    background noise. A Gibbs sampling method
    considering inverted repeats (e.g. for lexA) is
    applied
  • Hertz, Stormo and Hartzell Method
49
Hertz, Stormo and Hartzell Method (for
DNA-binding Sites)
  • Objective: find the 4-mer in each sequence that
    comes as close as possible to a pattern shared
    by ALL sequences

Information content
Consensus sequence
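
Information content and the consensus sequence can both be read off a set of aligned sites. The Python sketch below computes, for each column, the most frequent base and the information content in bits, assuming a uniform background (2 bits maximum for DNA) and no small-sample correction. The four example sites are invented.

```python
import math
from collections import Counter


def column_information(sites, alphabet="ACGT"):
    """Return [(consensus base, information in bits)] for each column.

    Information is log2(4) + sum(p * log2(p)), i.e. relative to a uniform
    background; columns are assumed to contain only alphabet symbols.
    """
    result = []
    for i in range(len(sites[0])):
        counts = Counter(site[i] for site in sites)
        total = sum(counts[b] for b in alphabet)
        info = math.log2(len(alphabet))
        for b in alphabet:
            p = counts[b] / total
            if p > 0:
                info += p * math.log2(p)
        consensus_base = max(alphabet, key=lambda b: counts[b])
        result.append((consensus_base, info))
    return result


columns = column_information(["TATA", "TATA", "TACA", "TATA"])
print("".join(base for base, _ in columns))
print([round(info, 3) for _, info in columns])
```

A fully conserved column carries 2 bits, while a column with mixed bases carries less, so the sum over columns measures how specific the candidate pattern is.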
50
  • Methods
  • Neural network trained on TATA and Inr sites,
    allowing a variable spacing between sites. NN-GA
    approach to identify conserved patterns in RNA
    Pol II promoters and conserved spacing among
    them (PROMOTER2.0).
  • TATA box recognition using weight matrix and
    density analysis of TF sites.
  • Usage of linear (TSSD and TSSW) / quadratic
    (CorePromoter) discriminant functions. The
    function is based on
  • TATA box score
  • Base-pair frequencies around TSS (triplet)
  • Frequencies in consecutive 100-bp upstream
    regions
  • TF binding site prediction
  • Searches of weight matrices for different
    organisms against a test sequence (TFSearch/
    TESS). MatInspector and ConInspector allow
    user-provided limits on type of weight matrix,
    generation of new matrices etc.
  • Testing for presence of clustered groups (or
    modules) of TF binding sites which are
    characteristic of a given pattern of gene
    regulation.
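
Several of the methods above rest on scanning a weight matrix along a test sequence. The Python sketch below builds a log-odds matrix from a handful of aligned sites and reports windows scoring above a threshold. The sites, pseudocount, uniform background and threshold are all assumptions chosen for illustration; they are not taken from TFSearch, TESS or MatInspector.

```python
import math


def log_odds_matrix(sites, pseudocount=1.0, background=0.25):
    """Build a position weight matrix (log2 odds) from equal-length sites."""
    matrix = []
    for i in range(len(sites[0])):
        column = [site[i] for site in sites]
        total = len(column) + 4 * pseudocount
        matrix.append({
            base: math.log2(((column.count(base) + pseudocount) / total)
                            / background)
            for base in "ACGT"
        })
    return matrix


def scan(dna, matrix, threshold):
    """Return (start, window, score) for every window scoring >= threshold."""
    dna = dna.upper()
    width = len(matrix)
    hits = []
    for start in range(len(dna) - width + 1):
        window = dna[start:start + width]
        if all(base in "ACGT" for base in window):
            score = sum(col[base] for col, base in zip(matrix, window))
            if score >= threshold:
                hits.append((start, window, round(score, 2)))
    return hits


# Invented TATA-like training sites.
matrix = log_odds_matrix(["TATAAA", "TATAAA", "TATATA", "TATAAA"])
print(scan("GGGTATAAAGGG", matrix, threshold=4.0))
```

Lowering the threshold finds more true sites but also more false positives, which is the same sensitivity/specificity trade-off discussed for gene prediction.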

51
[Figure labels: brain tissue; functional
promoters; scoring matrices; test and selection;
logit value of the promoter (0-1)]
Neural Networks (PROMOTER 2.0)
http://www.cbs.dtu.dk/services/promoter/
Density of TF from EPD (PromoterScan)
http://bimas.dcrt.nih.gov/molbio/proscan/
exercise
Searches of weight matrices against a test
sequence (TFSearch/TESS)
http://www.cbil.upenn.edu/cgi-bin/tess/tess
52
Promoter Databases
TRANSFAC is a database on eukaryotic cis-acting
regulatory DNA elements and trans-acting factors.
It covers the whole range from yeast to human.
Biological Databases/Biologische Datenbanken GmbH
In release 4.0, it contains 8415 entries, 4504 of
them referring to sites within 1078 eukaryotic
genes, from species ranging from yeast to human.
Additionally, this table comprises 3494
artificial sequences which resulted from
mutagenesis studies, in vitro selection
procedures starting from random oligonucleotide
mixtures, or specific theoretical
considerations. Finally, there are 417
entries with consensus binding sequences given in
the IUPAC code.
http://www.gene-regulation.com/
Free registration
  • MatInspector: Search for potential transcription
    factor binding sites in your own sequences
    with the matrix search program MatInspector
    using the TRANSFAC 4.0 matrices.
  • FastM: A program for the generation of models
    for regulatory regions in DNA sequences, using
    the TRANSFAC 3.4 matrices.
  • PatSearch: Search for potential transcription
    factor binding sites in your own sequences with
    the pattern search program using TRANSFAC 3.5
    and TRRD 3.5 sites.
  • FunSiteP: Run FunSiteP interactively.
    Recognition and classification of eukaryotic
    promoters by searching for transcription factor
    binding sites using a collection of
    transcription factor consensus sequences.
53
TESS: Transcription Element Search System
Computational Biology and Informatics
Laboratory, School of Medicine, University of
Pennsylvania, 1997
http://www.cbil.upenn.edu/cgi-bin/tess/tess33?WELCOME
Eukaryotic Promoter Database
Swiss Institute for Experimental Cancer Research
  • The Eukaryotic Promoter Database is an annotated
    non-redundant collection of eukaryotic POL II
    promoters, for which the transcription start
    site has been determined experimentally.
  • The annotation part of an entry includes a
    description of the initiation site mapping data,
    cross-references to other databases, and
    bibliographic references.
  • EPD is structured in a way that facilitates
    dynamic extraction of biologically meaningful
    promoter subsets for comparative sequence
    analysis.
  • EPDEX is a complementary database which allows
    users to view available gene expression data for
    human EPD promoters.
  • EPDEX is also accessible from the ISREC-TRADAT
    database entry server.

http://www.epd.isb-sib.ch/
AliBaba
Prediction of transcription factor binding sites
by constructing matrices on the fly from
TRANSFAC 4.0 sites.
http://darwin.nmsu.edu/molb470/fall2003/Projects/solorz/
http://www.epd.isb-sib.ch/TRADAT.html
54
McPromoter MMII -- The Markov Chain Promoter
Prediction Server
Massachusetts Institute of Technology
http://genes.mit.edu/McPromoter.html
55
(No Transcript)
56
(No Transcript)
57
(No Transcript)
58
(No Transcript)
59
Predicting Genes - Basic steps
  • Obtain genomic DNA sequence
  • Translate in all 6 reading frames
  • Compare with protein sequence database
  • Also perform database similarity search
    with EST and cDNA databases, if available
  • Use gene prediction programs to locate
    genes
  • Analyze gene regulatory sequences

60
AACGGACTTCCACTGAGCGATGTGAAAACGTTACAGGTTCAGTACTTCCA
AAGGAAGAAACCTCCAAACCCAAAAAAGAATAAA TATGAATTTGTATTT
TTGAAGAATGTGAAATAATGGTGTTTGCTTAATTGCTCATTTTGTATAAA
CTTAATATTGTACTTTAAAATATCTGCTAAAAAGTGAAAATTTAACTTTT
TGGAATTGAAAAAGCAATATTAAATACTAATGAAATCCTAATTAAATGCT
TATTTAAATCTGGTAGTATCTGTGGCATTTCTTACCAACCCTGCCCATAG
TTGACATTTTTCCACCACTCCCCCCTTCCCAGCCATCAGTCTTGGAGAGG
GGACAGAAAGGAAACGTCGGTCACCAGGAGAGTCTGCAGGTTTCCTTTTA
ATCAAGGCTCTACTGAAGGTGTTTTGTGGGGCTAAAAGCCCCCAAAACAT
GAAATGGACATGTAACACCACCTGGATCCCCCATAGCAGGCCAGACCACT
CTGGCGAGCACTGCTGGTCTGCCCAAATCTGGGTAATCAGACTGGGTATT
CATTGGCTGCATTTCAAAGCACAGCACTGCTTTCAGCCAGGATGAAGTGG
GAGTGAACCCAGCTGCTAGCAGAGCTGCCACTCCAGGCTGAGAGCCAAGT
ACCAGCCACTGCCAGTGAAGACTGGCCCCTTTACTGAAGGGAGTTGTTCA
GAGTCCAGCCACCGGCCCTGGGGAGGGAGAGAAGTCAGGGTATTCTGCTC
GGGGATGGTCAGGGCTCCGCAGCTCCATCGCCAGCATCCTTTGGAAAGCC
GCCTCTGGCGGAGACAGCCGGCTGGGGGGGCGCTCCAGGTTTGGCTGAGA
CGTTCTAGTTGGAACAGAAAGGAAAAAAGTGAGGCTGGGAGGCAAGGCCT
TGGATTAGGCCCCACAAGGATGTGGCCATTTGGCATTTGGATAGTATTAA
CTTTTTCGAAACCTCTCACCAGATCAAAGGAGGTTAGGGATAAAGCGGCG
GAGACATACTTCCCCCCTCCAGGGTAAGCTAGGGCTTGGCCAGCCTAGCC
AGTGGGCAGACCCCACCCCACCCCAGCCCAGCCCAGGGTGGGCACTAACC
CCGCCACCAGCCGGCTCCGGGCGCCGGCGGCCCAGCTGCCGTAACATCTC
CTCGCAGGCTGCGATGGTGTCCAGGAGCTGCCGCTGCCGCTGCTCCACCG
CGTCCAGCAGCTGCTGGGCGCGCTCCTCCCGGGGCGGCTGTGGGGGTGGC
CTCCCGCCGAGCCCCAGCCCCGCCTTCCCGCGGTCCACGCCGGCAGCCTC
CCTGCCCGGGAGAGAGCGAGAGACAGACGGTCAGGGCCGGCGCTTGCGCG
GGGCCAAGCCCCTTCCTCCCGCCCCGACGGGCCCCCTCTCACCCCCGTGA
CCAGTCTGAGCCCGGGCCCCATTCCATCTCCGCTTGCGCGGCCCGACCAC
CGCCCCCCTTTCGGCCGCCCCCCTCCCCAGCGCTGCGTTAGGGCTTCGCA
AGGCTGCGCCCCGCCCCGTCCCCACCGGTCTCCTTCAATCCTCCTGGGGG
TCGTGGTCCCTTTAAGCTGCCCGGCGCAGAGGCGGGGCCGAGTCTCCTGG
ACCGGAAGCTGGCTGGGAGCGTCACTTCCTCCCGGAAGCGGGCCTGGGCG
G