Title: Viruses in ?? : How can ICTVdB facilitate the construction of a virus database in TaiBNET?
1Viruses coming to TaiBNET
- Viruses in ?? How can ICTVdB facilitate the
construction of a virus database in TaiBNET?
Cornelia Büchen-Osmond Australian National
University Columbia University
2How did we learn about viruses
- symptoms known for thousands of years
- as devastating diseases
- in humans (small pox, polio)
- in animals (foot-and-mouth disease)
- in plants
- tulip color breaking (potyvirus)
- grapevine chrome mosaic (comovirus)
3How old are viruses?
- determination of true age of viruses
- no fossils to determine
- genome sequence mutation, pair-wise comparison
- phylogenetic tree analysis
- 1 decade in potyviruses
- molecular clock says old
- potyvirus in Australia introduced 60,000 years
- dsDNA animal viruses much older
4Virology began with plant pathology
Mosaic disease in tobacco plants
5110 years of virology
- In 1892 the Russian scientist Iwanowski
recognized that an infectious agent passing
through a filter with a pore size of less than
250 nm was responsible for the mosaic disease
affecting tobacco plants.
- Beijerinck was the first to associate the term
virus with the filterable infectious agent in
tobacco plants.
- he proposed that a virus was a culturable
contagium vivum fluidum which multiplied in close
association with the host's metabolism and was
distributed in phloem vessels together with plant
nutrients
- his theory was in stark contradiction to the
prevailing germ theory based on the metabolic
pattern of bacterial diseases
- only in the mid-1930s was the true nature of
viruses as nucleoproteins revealed
6110 years of virology
- The first filterable infectious agent isolated
from animals was the Foot-and-mouth-disease virus
reported by Loeffler and Frosch in 1898
- By the beginning of the twentieth century, the
concept of viruses as agents of human disease was
established when Reed and Carroll recognized
Yellow fever virus (Panama Canal)
- Bacterial viruses were discovered independently
by Twort in 1915 and by d'Hérelle in 1917, who
coined the term bacteriophage, meaning "bacteria
eater," to describe the agent's bactericidal
ability.
7Discovery of the causative agent
- The cause of smallpox was understood much later
- 1886 elementary bodies visualized in LM
- 1925 multiplication of poxvirus in cultured cells
and chick embryo chorioallantoic membranes
(Parker and Nye Goodpasture)
- 1935 purification and chemical composition of
vaccinia virus (Smadel Hoagland)
- 1943 EM of negatively stained particles (Ruska,
Siemens)
- 1954 EM thin sections of virus-infected cells
(Morgan)
- 1967 RNA polymerase in infectious particles
- 1974 structure of poxvirus genome
- 1994 complete genome sequence of Variola virus
8What is a virus
- viruses are found in all forms of life
- subcellular entities consisting of
- protein capsids
- may have a lipid envelope
- nucleoprotein/genome
- dsDNA, ssDNA, dsDNA-RT, dsRNA, ssRNA, ssRNA-RT
- totally dependent on the host
- for genome transcription and replication
- for assembly, maturation and egression
9Virus infection is host specific
- they can only infect a specific host
- one or more host families
- species specific
- they can have high mutation rates
- they can recombine
- they can acquire genes from the host
- they can transfer genes
Although much reduced forms of life, viruses are
master explorers of the evolutionary space and
are perhaps even a driving force in evolution and
speciation.
10Classification of Organisms
- Traditional Taxonomy
- based on morphology (using the naked eye
and handheld lens)
- currently attempting to use molecular data
(resulting in unclear relationships)
- Virus Taxonomy
- based also on morphology (using EM, x-ray
diffraction and crystal structure)
- currently mainly using genomic sequence data
11Early Classification Systems
- In 1927 the need for a system of virus
nomenclature and classification was recognized
- Initially the classification scheme was based on
plant, animal, and bacterial viruses
- The earliest efforts to classify within a host
group were based on
- common pathogenic properties (symptoms)
- common organ tropisms (liver, leaves etc)
- common ecological and transmission
characteristics
- Viruses causing hepatitis were simply lumped
together as the hepatitis viruses
- This approach is still retained in the
International Classification of Diseases, in which
all virus diseases causing hepatitis are still
lumped together under one basic code
12Taxonomic Virus Properties
- Since the founding of ICTV (1961) the taxonomic
status of a virus has been defined by
- Virion properties
- morphology
- genome, protein, carbohydrates and lipids
- Genome organisation and replication
- metabolic interaction between virus and host
- sequence annotations
- Biological properties
- host range and vectors
- cyto- and histopathology (disease expression)
- transmission, epidemiology, geographic
distribution
13Taxonomy of emerging viruses
- 1971 1st Report: 2 virus families, 24 floating
genera, 16 plant virus groups
- 1990 5th Report: 38 virus families, 138
genera/groups
- 6th Report: 1 order, 50 families, 164 genera
- 7th Report: 3 orders, 56 families, 233 genera
- 8th Report: 3 orders, 73 families, 287 genera
- 2008 ICTVweb: 5 orders, 84 families, 314 genera
14Virus nomenclature
- The International Committee on Taxonomy of
Viruses
- rules on classification and nomenclature
- does not accept Linnaean style binomial
nomenclature (genus name followed by species
name)
- recognizes taxonomic levels of Order, Family,
Subfamily, Genus and Species with standardized
Latinized endings
- includes host, symptom, and/or location in
species names
- italicizes only a species name ending with
virus
15Examples of virus species names
- species name: Tobacco mosaic virus
- alt. name: Tobacco mosaic tobamovirus
- virus name: Tobacco mosaic virus
- species name: Cercopithecine herpesvirus 1
- synonym: Herpesvirus simiae (early attempt for
true binomial nomenclature)
- virus name: Cercopithecine herpesvirus 1
- species name: Tomato yellow leaf curl Sardinia
virus
- synonym: Tomato leaf curl virus-Sardinia
- synonym: Tomato leaf curl virus - Sardinia
- synonym: Tomato leaf curl virus - Spain
- synonym: Tomato leaf curl virus Sardinia Spain
16ICTV-online since 2007
- a new database maintained by ICTV
- each year, after final approval by all ICTV
members, the latest Master Species list will be
published online by ICTV
- links to the ICTVdB Index of Viruses and virus
descriptions
17ICTV-online entry for White spot syndrome virus,
first reported in shrimp aquaculture from Taiwan
in 1992. This entry is based on the ICTVdB Index
of Viruses and was updated this year by ICTV
18Index of Viruses in ICTVdB
- Family Names in Taxonomic (genomic) Order
19ICTVdB uses a decimal code to uniquely identify
each virus
- The decimal code
- gives every virus in ICTVdB a unique IP number
- indicates its taxonomic status and level
- serves as a link within the whole database
- serves as a surrogate accession number in ICTVdB
on the web and as hyperlink from other databases
e.g., NCBI and SWISS-PROT or taxonomic databases
such as Species2000 and GBIF
- records changing taxonomic decisions by ICTV
expert Study Groups, but retains old codes to
chart the history of virus taxonomy
20The decimal code in ICTVdB
The decimal code for White spot syndrome virus
indicates its taxonomic context
- Order (-virales): 00.
- Family Nimaviridae: 00.103.
- Subfamily (-virinae): 00.103.0.
- Genus Whispovirus: 00.103.0.01.
- Species White spot syndrome virus: 00.103.0.01.001.
- Isolate WSSV-1-TW (1992): 00.103.0.01.001.00.003.
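The way the decimal code encodes taxonomic context can be illustrated with a minimal Python sketch. It assumes only what the slide shows: each taxonomic level adds one dot-separated segment, so a taxon's ancestors are the prefixes of its code. The function names are hypothetical, and the sixth segment (00), which the slide does not label, is treated here simply as part of the isolate code.

```python
# Number of dot-separated segments -> taxonomic level, as read off the
# White spot syndrome virus example. The 6th segment is not labelled on
# the slide, so no level is assigned to a 6-segment prefix (assumption).
LEVEL_BY_SEGMENTS = {
    1: "Order",
    2: "Family",
    3: "Subfamily",
    4: "Genus",
    5: "Species",
    7: "Isolate",
}


def split_code(code):
    """Split a decimal code into segments, ignoring spaces and the trailing dot."""
    return [seg for seg in code.replace(" ", "").split(".") if seg]


def code_lineage(code):
    """Return (level, prefix code) pairs from the order down to the given code."""
    segments = split_code(code)
    lineage = []
    for count, level in LEVEL_BY_SEGMENTS.items():
        if count <= len(segments):
            lineage.append((level, ".".join(segments[:count]) + "."))
    return lineage


def is_descendant(child, ancestor):
    """True if `ancestor` is a proper prefix of `child`, segment by segment."""
    c, a = split_code(child), split_code(ancestor)
    return len(c) > len(a) and c[:len(a)] == a


if __name__ == "__main__":
    for level, prefix in code_lineage("00.103.0.01.001.00.003."):
        print(f"{level:10s} {prefix}")
```

Because ancestry is a prefix test, a link from an isolate to its species, genus or family description needs no lookup table, only the code itself.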
22Virus descriptions in ICTVdB
23Interoperability in ICTVdB
- Interoperability is achieved in descriptions
- via decimal code within ICTVdB
- from other databases to ICTVdB
- on species level and above via
- NCBI TaxID to retrieve nucleotide sequences,
genomes and PubMed references
- below species level via
- sequence accession numbers
- specific accession codes to
- Databases: CDC, VIPERdB, VIDEdB, DPV (CMI/AAB)
- Catalogs: ATCC, DSMZ, dHerelle
- Publications: ProMed, journals
26and ICTVdB lists are the accepted world standard
for virus names
27ICTVdB in DELTA Format
- three basic flat files plus many directives
- character list (> 3000 questions to describe a
virus)
- specification file (specifies types of characters
and dependencies)
- Items file (coded data of > 4000 virus
descriptions)
- dependencies make characters applicable or
inapplicable, depending on choice and correspond
to tables in relational databases
- character list can be translated into other
languages, including Chinese
- easy transport of data set from
- one language to another
- one database to another
The new ICTVdB platform will be in a relational
database format using MySQL
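How a dependency makes characters applicable or inapplicable can be shown with a small Python sketch. The three characters and the single dependency rule below are invented for illustration and are not taken from the ICTVdB character list or specification file. The data layout is likewise an assumption, not the DELTA file syntax.

```python
# Hypothetical character list: character number -> question text.
CHARACTERS = {
    1: "virion enveloped (yes/no)",
    2: "envelope surface projections (present/absent)",
    3: "genome nucleic acid (DNA/RNA)",
}

# Hypothetical dependencies: controlling character -> state -> characters
# that become inapplicable when that state is chosen.
DEPENDENCIES = {
    1: {"no": [2]},  # no envelope, so envelope projections cannot be scored
}


def applicable_characters(answers):
    """Return the character numbers still applicable given the answers so far."""
    blocked = set()
    for character, state in answers.items():
        blocked.update(DEPENDENCIES.get(character, {}).get(state, []))
    return [c for c in sorted(CHARACTERS) if c not in blocked]


if __name__ == "__main__":
    print(applicable_characters({1: "no"}))
    print(applicable_characters({1: "yes"}))
```

In a relational schema the same rule would typically become a row in a dependency table, which is one way to read the remark that dependencies correspond to tables in relational databases.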
28Regional data sets
- virus descriptions on isolate level with links
- to species/genus level descriptions
- to fact sheets
- to sequence data
- to host databases
- to distribution maps for virus, host, vector
- to images of virus, host, vector
- to references
29Viruses of Plants in Australia
- DELTA formatted database
- regional data on viruses
- on hosts and agronomic impacts
- introduction to Australia
- distribution in Australia
- extensive host lists
- on the WWW since 1992
- links to generic descriptions
30Before viruses are entered in TaiBNET we need to
- prepare lists of viruses in Taiwan
- in humans
- in agriculture
- in husbandry
- in aquaculture
- in nature
- in all forms of life
- obtain data from the taxonomic hierarchy tree in
ICTV or GBIF
- prepare short descriptions of isolate data
- customize links to ICTVdB and genomic databases
31Current ICTVdB DELTA System
- The current system is based on the DELTA format
(DEscription Language for TAxonomy). At its core,
Delta is based on linear lists of information
(flat files) which specify taxa and their
defining characteristics. The Delta system has
been engineered to uniquely suit the needs of the
worldwide taxonomic community and is used for the
classification of plants, animals, viruses,
etc. Unfortunately, this taxonomic format and
the associated software are no longer being
developed. Updates to the Delta database require
a highly trained curator with an in-depth
knowledge of the system. Publication of the
database to the web is done using a mixture of
specialized programs, scripts and hand editing.
As a result the web-based ICTVdB is actually a
set of static web pages which must be regenerated
each time data are released. Interactive virus
identification is currently through the Windows
application Intkey. Intkey is the interactive
taxonomic keying system that is shipped with
Delta. It allows a user to identify an organism
by successive pruning of taxa. As details are
entered about the organism, the number of taxa
matching the specified information is listed.
This is a valuable tool but the lack of
cross-platform compatibility (Windows only) is a
major complaint. Isolate data, describing
viruses found around the world, are submitted
through EntVir, a MySQL database system fed by
PHP-based forms which also must be regenerated
each time the database is published. Isolate data
ultimately end up as an email that is imported
into the Delta system manually. In the current
system, all isolate review and database entry are
handled by a single curator.
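The successive pruning of taxa described for Intkey can be sketched in a few lines of Python. This is an illustration of the idea only, not Intkey's algorithm or data format. The four viruses are ones named in these slides, each reduced to two simplified characters, and the function names are hypothetical.

```python
# Simplified character states for four viruses mentioned in the slides.
# Real DELTA descriptions use thousands of characters; two are enough here.
TAXA = {
    "Tobacco mosaic virus": {"genome": "ssRNA", "envelope": "no"},
    "Foot-and-mouth disease virus": {"genome": "ssRNA", "envelope": "no"},
    "Yellow fever virus": {"genome": "ssRNA", "envelope": "yes"},
    "White spot syndrome virus": {"genome": "dsDNA", "envelope": "yes"},
}


def prune(candidates, character, state):
    """Keep only the taxa whose recorded state matches the observation."""
    return {
        name: states
        for name, states in candidates.items()
        if states.get(character) == state
    }


def identify(observations):
    """Apply observations one by one, reporting how many taxa remain."""
    remaining = dict(TAXA)
    for character, state in observations:
        remaining = prune(remaining, character, state)
        print(f"{character} = {state}: {len(remaining)} taxa remain")
    return sorted(remaining)


if __name__ == "__main__":
    print(identify([("genome", "ssRNA"), ("envelope", "yes")]))
```

Each answer shrinks the candidate set, and the running count is what the user sees as details are entered.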
32Proposed ICTVdB System
- The proposed replacement architecture utilizes a
relational database (MySQL) where the flat files
have been translated to their equivalents in a
relational database schema. The relational
database will capture the taxonomic hierarchy,
descriptive data for taxa, and isolate
descriptions. Users will interact with the MySQL
database through a custom web application with
the following functions:
- Browse - A taxonomic tree will be used to
navigate through viral taxa. This will allow
visual browsing of the taxonomic hierarchy.
Viruses will also be indexed by name, host and
genome organization.
- Query - A basic search will allow users to query
the taxonomic hierarchy, virus names and other
data. The stand-alone IntKey application will be
replaced by an advanced search function with a
flexible system of forms and search refinement.
- Data Entry - An improved data entry system will
be used to keep the ICTVdB up to date and to make
data entry as simple as possible. Until this
system is fully functional, the current EntVir
system will remain in place.
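A relational schema can capture the taxonomic hierarchy and support the Browse and Query functions roughly as in the sketch below. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server. The table name, column names and function names are assumptions, not the ICTVdB schema. The three rows follow the White spot syndrome virus example from the earlier slides.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database holding one small branch of the taxonomy."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """CREATE TABLE taxon (
               id INTEGER PRIMARY KEY,
               parent_id INTEGER REFERENCES taxon(id),
               level TEXT NOT NULL,
               name TEXT NOT NULL)"""
    )
    db.executemany(
        "INSERT INTO taxon VALUES (?, ?, ?, ?)",
        [
            (1, None, "Family", "Nimaviridae"),
            (2, 1, "Genus", "Whispovirus"),
            (3, 2, "Species", "White spot syndrome virus"),
        ],
    )
    return db


def taxon_lineage(db, name):
    """Browse: walk from a taxon up to the root, returned top-down."""
    return db.execute(
        """WITH RECURSIVE up(id, parent_id, level, name, depth) AS (
               SELECT id, parent_id, level, name, 0
               FROM taxon WHERE name = ?
               UNION ALL
               SELECT t.id, t.parent_id, t.level, t.name, up.depth + 1
               FROM taxon AS t JOIN up ON t.id = up.parent_id)
           SELECT level, name FROM up ORDER BY depth DESC""",
        (name,),
    ).fetchall()


def search_names(db, term):
    """Query: basic substring search over taxon names."""
    rows = db.execute(
        "SELECT name FROM taxon WHERE name LIKE ? ORDER BY name",
        (f"%{term}%",),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    db = build_demo_db()
    print(taxon_lineage(db, "White spot syndrome virus"))
    print(search_names(db, "spot"))
```

A self-referencing parent column is only one of several ways to store a tree in a relational database. It is chosen here because it maps directly onto the order, family, genus, species nesting.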
34The database will be populated with data from
several sources
- One-time conversion of current Delta format
flat files to the database
- Annual taxonomy and nomenclature updates from
the ICTV executive committee
- Virus annotations made by Curators
- Contributor submissions of virus isolates
- An important feature of the revised system is
the concept of decentralized data entry and
review. Isolate submissions will be reviewed by
the Head Curator and/or other specialists with
knowledge of particular viral families. These
Curators will be given the ability to review
Pending isolate submissions for correctness and
approve them for transfer from a Pending status
to Approved isolates for release in the next
ICTVdB version. Curators will be volunteers and
will have the ability to decline to review
isolates, in a manner similar to the peer review
system used by journals. Yearly updates to the
taxonomy will be made using the Master Species
List maintained by the ICTV. The Master Species
List contains the current description of virus
taxonomy down to the species level and is
updated, as needed, by the ICTV executive
committee (EC). Decentralization is expected to
greatly improve the accuracy and speed of the
ICTVdB update process.
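The review flow described above, in which a submission moves from Pending to Approved and a curator may decline to review, can be sketched as follows. The class, field and function names are hypothetical, as are the curator identifiers. Only the two status values and the option to decline come from the slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Submission:
    """One isolate submission awaiting curator review (hypothetical fields)."""
    isolate: str
    family: str
    status: str = "Pending"
    reviewer: Optional[str] = None


def review(submission, curator, decision):
    """Record a curator's decision; return True if the isolate was approved.

    A curator who declines leaves the submission Pending for someone else.
    """
    if submission.status != "Pending":
        raise ValueError("only Pending submissions can be reviewed")
    if decision == "decline":
        return False
    if decision != "approve":
        raise ValueError(f"unknown decision: {decision!r}")
    submission.status = "Approved"
    submission.reviewer = curator
    return True


def release_list(submissions):
    """Isolates that would go into the next ICTVdB version."""
    return [s.isolate for s in submissions if s.status == "Approved"]


if __name__ == "__main__":
    sub = Submission("WSSV-1-TW", "Nimaviridae")
    review(sub, "curator_a", "decline")   # stays Pending
    review(sub, "curator_b", "approve")   # moves to Approved
    print(release_list([sub]))
```

Keeping declined submissions in Pending status means no isolate is lost when a volunteer curator passes on it.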