Title: In silico analysis of expressed sequence tags (EST) from Trichostrongylus vitrinus (Nematoda): comparison of the automated ESTExplorer workflow platform with database searches.
1In silico analysis of expressed sequence tags (EST) from Trichostrongylus vitrinus (Nematoda): comparison of the automated ESTExplorer workflow platform with database searches.
- Shivashankar H. Nagaraj and Shoba Ranganathan
- Professor and Chair, Bioinformatics, Biotechnology Research Institute and Dept. of Chemistry & Biomolecular Sciences, Macquarie University, Sydney, Australia (shoba.ranganathan_at_mq.edu.au)
- Adjunct Professor, Dept. of Biochemistry, National University of Singapore, Singapore (shoba_at_bic.nus.edu.sg)
2Expressed Sequence Tags (ESTs)
- Unedited, short, single-pass sequences generated from the 5' or 3' end of randomly selected clones from cDNA libraries of the desired cells, tissues or organs.
- Length: 200-700 bp (average 360 bp)
- Can be generated quickly at low cost ("poor man's genome")
- EST data are highly fragmented
- EST annotations carry very little biological information
- High-throughput in nature
3EST Applications
- Gene Discovery
- Gene Structure Prediction
- Expression Maps
- Alternative Splicing
- Identification and characterization of SNPs
- Gene expression studies
- tissue or disease specific
- developmental stage
- Proteomics (for example, peptide mass fingerprinting)
- Identification of drug and vaccine candidates
4Properties of ESTs
[Figure: Genomic DNA, mRNA, cDNA and ESTs (5' ESTs and 3' ESTs). Schematic of an EST sequence showing vector, repeats and high-quality sequence regions; segment labels: 1-50 bp, 50-500 bp (high quality) and 500-700 bp.]
5EST data resources
- Available in plenty
- Several dedicated databases
- Fragmented
- Quality dubious
- Need cleaning
- Clustering
- Annotation!
6EST data repositories
dbEST release 061507 (June 2007), www.ncbi.nlm.nih.gov/dbEST/
43,396,096 ESTs from 659 different organisms
- Homo sapiens (human): 8,119,106
- Mus musculus (mouse): 4,850,243
- Danio rerio (zebrafish): 1,350,105
- Bos taurus (cattle): 1,318,208
- Arabidopsis thaliana (thale cress): 1,276,692
- Xenopus tropicalis: 1,271,375
- Oryza sativa (rice): 1,211,418
- Zea mays (maize): 1,161,241
- Triticum aestivum (wheat): 1,050,267
8Overview of EST sequence analysis
Submit data, run the analysis, visualize results
- Pre-processing: raw EST sequence data, contamination check, vector clipping, poly-A removal, repeat masking
- Assembly: clustering, assembly, consensus generation
- Nucleotide-level analysis: gene annotation, RNAi, gene mapping, alternative splicing, SNPs
- Conceptual translation
- Protein-level analysis: peptide annotation, protein interactors, Gene Ontologies, KEGG
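As an illustration of the pre-processing part of this overview, the following minimal Python sketch trims a poly-A tail (or the matching poly-T head of a reverse-strand read) and then drops reads that end up too short. It is not the code used by any of the tools named in this talk: the 10-base run length, the 100 bp cutoff and the function names are assumptions chosen for the example.

```python
import re

# Assumed thresholds for illustration only; the slides do not state them.
MIN_LENGTH = 100
POLY_A_TAIL = re.compile(r"A{10,}$")  # run of 10 or more A at the 3' end
POLY_T_HEAD = re.compile(r"^T{10,}")  # the same tail read from the other strand


def trim_poly_a(seq: str) -> str:
    """Remove a trailing poly-A run and a leading poly-T run, if present."""
    seq = seq.upper()
    seq = POLY_A_TAIL.sub("", seq)
    return POLY_T_HEAD.sub("", seq)


def preprocess(ests: dict[str, str], min_length: int = MIN_LENGTH) -> dict[str, str]:
    """Trim every EST and keep only those at least min_length bases long."""
    cleaned = {}
    for est_id, seq in ests.items():
        trimmed = trim_poly_a(seq)
        if len(trimmed) >= min_length:
            cleaned[est_id] = trimmed
    return cleaned


if __name__ == "__main__":
    demo = {"est1": "ACGT" * 30 + "A" * 15, "est2": "ACGT" * 5}
    for est_id, seq in preprocess(demo).items():
        print(est_id, len(seq))
```

Contamination checks, vector clipping and repeat masking need reference databases, so they are left out of this sketch.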
9Evolution of ESTExplorer
10 Comparison of current methods for EST analysis
Critical evaluation of contemporary tools and
EST analysis pipelines
Benchmarking of tools using EST datasets
Lack of downstream functional annotation at
DNA and protein levels
ESTExplorer
11Description of ESTExplorer
12ESTExplorer features
- Suite of programs to pre-process, assemble and functionally annotate ESTs
- User-defined input and analysis parameter control
- Species-specific analysis
- Input: ESTs or assembled contigs
- Output: assembled ESTs, Gene Ontologies, mapping to domains/motifs, pathway mapping
13 Phase I (EST pre-processing)
Input Option 1: EST sequences
SeqClean, RepeatMasker, CAP3 (with quality values, .qual); short sequences are removed from the analysis
Input Option 2: assembled ESTs
Phase II (DNA-level annotation)
BLASTX, BLAST2GO
Phase III (protein-level annotation)
ESTScan, InterProScan, KOBAS
Final output: annotation summary for assembled ESTs
ESTExplorer analysis and annotation workflow,
showing Phase I (pre-processing and assembly),
Phase II (nucleotide-level annotation) and Phase
III (protein-level annotation).
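The two input options can be read as two entry points into the same three-phase workflow. The sketch below models only that control flow in Python; it does not call any of the tools. The grouping of tools by phase is an assumption read off the workflow diagram, and the names `Phase` and `plan` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str
    tools: tuple[str, ...]


# Assumed grouping of the tools named on the slide; not confirmed by the source.
PHASE_I = Phase("Phase I (pre-processing and assembly)",
                ("SeqClean", "RepeatMasker", "CAP3"))
PHASE_II = Phase("Phase II (nucleotide-level annotation)",
                 ("BLASTX", "BLAST2GO"))
PHASE_III = Phase("Phase III (protein-level annotation)",
                  ("ESTScan", "InterProScan", "KOBAS"))


def plan(input_kind: str) -> list[Phase]:
    """Return the phases to run for a given kind of input.

    'raw_ests' corresponds to Input Option 1, 'assembled' to Input Option 2.
    """
    if input_kind == "raw_ests":
        return [PHASE_I, PHASE_II, PHASE_III]
    if input_kind == "assembled":
        # Already assembled sequences skip pre-processing and assembly.
        return [PHASE_II, PHASE_III]
    raise ValueError(f"unknown input kind: {input_kind!r}")


if __name__ == "__main__":
    for phase in plan("assembled"):
        print(phase.name, "->", ", ".join(phase.tools))
```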
14estexplorer.biolinfo.org
15Annotation summary page
16The worm in question
- Trichostrongylus vitrinus (order Strongylida) is a parasitic nematode.
- Principal causative nematode associated with parasitic diseases in sheep and cattle
- Current treatment for the disease: chemotherapeutic agents (anthelmintics)
- Disadvantages of current treatments:
- a. Expensive and only partially effective
- b. Anthelmintic drug resistance over the last decade
- c. Residue problems in meat and milk
- Possible alternative: the development of anti-parasite drugs and/or vaccines
Nisbet AJ, et al. Int J Parasitol, 2004
17Creation of cDNA libraries and EST generation from the parasite Trichostrongylus vitrinus
Phase I
- Bioinformatics analysis of the ESTs
Phase II
- Comparative genomics with nematodes
- Categorization of differentially expressed ESTs
- Subset of potential drug target genes
Phase III
- Isolation of full-length genes
- Functional genomics via RNAi
- Biochemical activity assays
- Proteomics
Phase IV
- Virtual and high-throughput screening
- Pre-clinical and clinical evaluation
18EST analysis schema
19EST analysis schema
20Results of overall EST analysis
- Number of ESTs analysed: 1776 (male 910, female 866)
- Caenorhabditis elegans homologues: 290 (41)
- Homologues in parasitic nematodes: 329 (42)
- Homologues in non-nematodes: 202 (28)
- No significant match to any sequence in the current databases: 218 (31)
- Gene Ontologies (GO) assigned: 267 (38)
- Pathway associations established: 230 (33)
Of the C. elegans homologues, 90 entries had observed non-wildtype RNAi phenotypes, including embryonic lethality, maternal sterility, sterile progeny, larval arrest and slow growth.
21Results from BLAST vs. ESTExplorer
Manual annotation using BLAST
EST ID | E-value | BLAST results
TVm02_C07 | 2.00E-37 | PP1-gamma serine/threonine protein phosphatase
Trichostrongylus vitrinus (Nematoda: Strongylida): Molecular characterization and transcriptional analysis of Tv-stp-1, a serine/threonine phosphatase gene. Hu M, Abs El-Osta YG, Campbell BE, Boag PR, Nisbet AJ, Beveridge I, Gasser RB. Exp Parasitol. 2007 Mar 24
22Results from BLAST vs. ESTExplorer
EST ID: TVm02_C07
Manual annotation using BLAST
- BLAST result: PP1-gamma serine/threonine protein phosphatase
- E-value: 2.00E-37
Annotations obtained automatically from ESTExplorer
- BLAST result: protein phosphatase catalytic gamma isoform isoform 1
- E-value: 1.00E-36
- Gene Ontologies: chromatin modification, protein amino acid dephosphorylation, embryonic cleavage, cytokinesis, meiosis, oviposition, manganese ion binding, protein phosphatase type 1 activity, mitochondrial outer membrane, protein binding, mitosis, glycogen metabolic process, iron ion binding, nucleus
- Metabolic pathway mapping: Long-term potentiation, Regulation of actin cytoskeleton, Focal adhesion, Insulin signaling pathway
- Domain/motif data: Metallophosphoesterase; Serine/threonine-specific protein phosphatase and bis(5'-nucleosyl)-tetraphosphatase
23Redefining parameters for possible drug/vaccine targets in parasitic nematodes
Secreted Proteins
- Parasites must secrete biologically active mediators to manipulate the host environment in order to survive immune attack
- Inhibit host antigen-processing pathways
- Examples
- Aspartyl protease inhibitor (API-1)
- Cystatin (cysteine protease inhibitor)
- Acetylcholinesterase (AChE)
Absence of homologues in mammalian host
(nematode specific genes)
Strong RNAi phenotypes in C. elegans
- Embryonic lethality
- Larval lethality
- Sterile progeny
- Larval arrest
- Maternal sterility
- Slow growth
- Genes with specificity to nematodes may serve as excellent targets for drugs/vaccines with low toxicity to humans and other vertebrates.
- Better understanding of the unusual nematode biochemistry can also have industrial or therapeutic value.
Harcus YM, et al. Genome Biol, 2004; Delaney A, et al. Int J Parasitol, 2005; Vanholme B, et al. Gene, 2004
24T. vitrinus male EST data comparison
Venn diagram of homologues, male dataset
- C. elegans: 169 (39.21)
- Parasitic nematodes: 191 (44.31)
- Non-nematodes: 100 (23.20)
- Region counts: 19, 6, 55, 89, 3, 45, 2
T. vitrinus female EST data comparison
Venn diagram of homologues, female dataset
- C. elegans: 121 (45.6)
- Parasitic nematodes: 138 (52.1)
- Non-nematodes: 102 (38.4)
- Region counts: 6, 24, 6, 85, 3, 8, 26
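The transcript of the two Venn diagrams keeps the seven region counts but not their positions. Since each circle total must equal the sum of its four regions, the positions can be recovered by trying every arrangement. The Python sketch below does that. The pairing of totals with circles is an assumption, chosen so that male plus female totals match the overall results (290 C. elegans, 329 parasitic nematode and 202 non-nematode homologues); the region and function names are hypothetical.

```python
from itertools import permutations

REGIONS = ("ce_only", "ce_pn", "ce_nn", "all_three",
           "pn_only", "pn_nn", "nn_only")


def consistent_assignments(counts, ce_total, pn_total, nn_total):
    """Return every placement of the counts that reproduces the circle totals.

    ce = C. elegans, pn = parasitic nematodes, nn = non-nematodes.
    """
    found = set()
    for perm in permutations(counts):
        ce_only, ce_pn, ce_nn, all_three, pn_only, pn_nn, nn_only = perm
        if (ce_only + ce_pn + ce_nn + all_three == ce_total
                and ce_pn + all_three + pn_only + pn_nn == pn_total
                and ce_nn + all_three + pn_nn + nn_only == nn_total):
            found.add(perm)  # a set removes duplicates caused by repeated counts
    return [dict(zip(REGIONS, perm)) for perm in sorted(found)]


male = consistent_assignments([19, 6, 55, 89, 3, 45, 2], 169, 191, 100)
female = consistent_assignments([6, 24, 6, 85, 3, 8, 26], 121, 138, 102)

if __name__ == "__main__":
    print("male:", male)
    print("female:", female)
```

Under these assumptions each dataset has exactly one consistent arrangement, with 89 (male) and 85 (female) sequences matching all three groups.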
25SimiTri: visualizing similarity relationships for groups of sequences
SimiTri provides a two-dimensional display of relative similarity relationships among three different datasets.
[Figure: the query dataset (EST sequences in this study) is compared by BLAST against Database 1, Database 2 and Database 3, followed by visualization.]
- Java/Perl-based application
- Display of relative similarity relationships
- Analysis of relative similarity relationships
- Based on raw bit score from BLAST output
Parkinson J, et al. Bioinformatics, 2003; Parkinson J, et al. Nat Genetics, 2004
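One way to picture a two-dimensional display of relative similarity to three databases is to place each query sequence inside a triangle, with each vertex standing for one database and the three BLAST scores acting as weights. The Python sketch below uses plain barycentric weighting. It is an assumption made for illustration, not SimiTri's actual code or exact formula; the vertex layout and the name `triangle_position` are hypothetical.

```python
import math

# Vertices of an equilateral triangle, one per database (assumed layout).
VERTICES = ((0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2))


def triangle_position(score_db1: float, score_db2: float, score_db3: float):
    """Return (x, y) for a sequence, weighting each vertex by its BLAST score.

    Returns None when the sequence has no hit in any of the three databases.
    """
    scores = (score_db1, score_db2, score_db3)
    total = sum(scores)
    if total <= 0:
        return None
    x = sum(s * vx for s, (vx, _) in zip(scores, VERTICES)) / total
    y = sum(s * vy for s, (_, vy) in zip(scores, VERTICES)) / total
    return (x, y)


if __name__ == "__main__":
    print(triangle_position(120, 0, 0))     # hit in database 1 only: at its vertex
    print(triangle_position(100, 100, 0))   # equal hits in two databases: on an edge
    print(triangle_position(80, 80, 80))    # equal hits in all three: at the centre
```

With this weighting, sequences that match a single database sit at a vertex, those that match two sit on an edge, and the rest fall in the interior, closer to the database they resemble most.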
26SimiTri results
T. vitrinus ESTs are closer to parasitic nematodes and C. elegans than to other non-nematode organisms.
a. SimiTri, male dataset
- 431 male ESTs; no match for 114 ESTs
- Homologues: C. elegans 169 (39.21), parasitic nematodes 191 (44.31), non-nematodes 100 (23.20)
- Tiles colored by maximal BLAST score (color scale: 100, 150, 200, 250, 300)
b. SimiTri, female dataset
- 265 female ESTs; no match for 78 ESTs
- Homologues: C. elegans 121 (45.6), parasitic nematodes 138 (52.1), non-nematodes 102 (38.4)
- Tiles colored by maximal BLAST score (color scale: 100, 150, 200, 250, 300)
27 BLAST vs. ESTExplorer
Both approaches were applied to the same 1776 ESTs.
Analysis of individual ESTs using BLAST
- Slow (took several weeks)
- BLAST results are the only evidence for functional assignment
- Peripheral annotation
Analysis using semi-automated approach via ESTExplorer
- Fast (took a few minutes)
- Multiple lines of evidence for annotation, supported by GO, InterProScan and pathway mapping
- In-depth annotation
ESTExplorer reliably and rapidly annotated 301 ESTs with pathway and GO information, eliminating 60 low-quality hits from database searches.
28Secreted protein analysis
Number of putative secreted proteins: 40
Immune-response related genes
Signalling molecules
Ion channels
Proteases
Protease inhibitors
29Candidate target genes in Trichostrongylus vitrinus
Tvmale_Contig9
- Sequence length (aa): 113
- Homology (Wormpep): Translation initiation factor 3, subunit f (eIF-3f)
- RNAi phenotype (WormBase): embryonic lethal (Emb), larval arrest (Lva), sterile progeny (Stp), slow growth (Gro)
- Gene Ontology: GO:0003743 translation initiation factor activity
- Mammalian homolog: NO
- Secreted protein: YES
Tvfemale_Contig105
- Sequence length (aa): 115
- Homology (Wormpep): pbs-2 (Proteasome Beta Subunit)
- RNAi phenotype (WormBase): embryonic lethal (Emb), locomotion abnormal, larval arrest (Lva), maternal sterile, larval lethal (Let)
- Gene Ontology: GO:0005839 proteasome core; GO:0006511 ubiquitin-dependent protein catabolism; GO:0008233 peptidase activity; GO:0004175 endopeptidase activity
- Mammalian homolog: YES (weakly similar)
- Secreted protein: YES
Tvmale 04_F02
- Sequence length (aa): 96
- Homology (Wormpep): asb-2 (ATP Synthase B homolog)
- RNAi phenotype (WormBase): embryonic lethal (Emb), larval arrest (Lva), sterile progeny (Stp), slow growth (Gro), maternal sterile
- Gene Ontology: GO:0046933 ATP synthase activity
- Mammalian homolog: YES (weakly similar)
- Secreted protein: YES
Tvmale 02_C01
- Sequence length (aa): 136
- Homology (Wormpep): RNA splicing factor, Slu7p
- RNAi phenotype (WormBase): embryonic lethal (Emb), early emb (Emb), molt defect (Mlt), adult early lethal (Adl), larval arrest (Lva)
- Gene Ontology: GO:0006375 nuclear mRNA splicing
- Mammalian homolog: NO
- Secreted protein: YES
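The selection criteria discussed earlier (secreted protein, strong RNAi phenotype in C. elegans, no mammalian homologue) can be applied to these four candidates as a simple filter. The Python sketch below is one possible reading, not the authors' procedure: it treats "weakly similar" as having a mammalian homologue, and the class and function names are hypothetical. The data are copied from the table on this slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    est_id: str
    rnai_phenotypes: tuple[str, ...]
    mammalian_homolog: bool  # assumption: "YES (weakly similar)" counts as True
    secreted: bool


CANDIDATES = [
    Candidate("Tvmale_Contig9", ("Emb", "Lva", "Stp", "Gro"), False, True),
    Candidate("Tvfemale_Contig105",
              ("Emb", "locomotion abnormal", "Lva", "maternal sterile", "Let"),
              True, True),
    Candidate("Tvmale 04_F02",
              ("Emb", "Lva", "Stp", "Gro", "maternal sterile"), True, True),
    Candidate("Tvmale 02_C01", ("Emb", "early Emb", "Mlt", "Adl", "Lva"),
              False, True),
]


def nematode_specific_targets(candidates):
    """Keep secreted candidates with an RNAi phenotype and no mammalian homologue."""
    return [c for c in candidates
            if c.secreted and c.rnai_phenotypes and not c.mammalian_homolog]


if __name__ == "__main__":
    for candidate in nematode_specific_targets(CANDIDATES):
        print(candidate.est_id, len(candidate.rnai_phenotypes), "RNAi phenotypes")
```

A softer ranking could keep the weakly similar candidates and score them lower instead of discarding them.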
32ESTExplorer applications so far ..
1. In silico analysis of expressed sequence tags (EST) from Trichostrongylus vitrinus (Nematoda): comparison of the automated ESTExplorer workflow platform with database searches. Nagaraj SH, Gasser RB, Ranganathan S.
2. A transcriptomic analysis of the adult stage of the bovine lungworm, Dictyocaulus viviparus. Ranganathan S, Nagaraj SH, Hu M, Strube C, Schnieder T and Gasser RB. BMC Genomics, 2007, accepted.
3. Gender-enriched transcripts in adult Haemonchus contortus (Nematoda): predicted functions and genetic interactions based on comparative analyses with Caenorhabditis elegans. Campbell BE, Nagaraj SH, Hu M, Zhong W, Sternberg PW, Ong EK, Loukas A, Ranganathan S, Beveridge A and Robin B. Gasser.
4. Transcriptional changes in the third-stage larva of Ancylostoma caninum (Nematoda) following in vitro serum stimulation, employing a suppressive-subtractive hybridisation-based microarray approach. Datu BJD, Gasser RB, Nagaraj SH, Eng K. Ong, O'Donoghue P, McInnes R, Ranganathan S and Loukas A.
5. Trichostrongylus vitrinus (Nematoda: Strongylida): Molecular characterization and transcriptional analysis of Tv-stp-1, a serine/threonine phosphatase gene. Hu M, Abs El-Osta YG, Campbell BE, Boag PR, Nisbet AJ, Beveridge I, Gasser RB. Exp Parasitol. 2007, accepted.
33Ref papers
34Acknowledgements
- Prof. Robin Gasser (University of Melbourne)
- Genetics Technologies Pty. Ltd.
- Australian Research Council LINKAGE PROJECT (LP0667795)
36Some more examples of secreted proteins
- M41 family metalloprotease / mitochondrial membrane proteinase (Schistosoma)
- Pathogenesis-related protein similar to helminth venom allergen homologues (Schistosoma)