Title: DNA Barcoding and the Information Infrastructure of Taxonomy
1. DNA Barcoding and the Information Infrastructure of Taxonomy
- David E. Schindel, Executive Secretary
- Consortium for the Barcode of Life
- National Museum of Natural History
- Smithsonian Institution
- Schindeld_at_si.edu http://www.barcoding.si.edu
- 202/633-0812 fax 202/633-2938
2. Biodiversity Informatics: Fragmented, Unconnected
3. Growth of Biodiversity Databases
Museum databases of associated data
Authority files of taxonomic names
4. What? Where? When?
Distributed Database serving data through nodes and partners
5. Census of Marine Life (CoML) and the Ocean Biogeographic Information System (OBIS)
Museum databases of associated data
Databases of species occurrences and distribution
Authority files of taxonomic names
6. DNA Barcodes: A Key Variable for Biodiversity Informatics
Museum databases of associated data
Databases of species occurrences and distribution (OBIS)
Authority files of taxonomic names
7. Background on:
- DNA Barcoding
- The DNA Barcode Initiative
- Consortium for the Barcode of Life (CBOL)
- Current and planned activities
- Data standards for barcode data
- Impact on the information infrastructure of taxonomy
8. A DNA barcode is a short gene sequence taken from standardized portions of the genome, used to identify species
9. Characteristics of Barcode Regions
- Low intraspecies variability
- Discontinuous variation between species
- Flanked by conserved regions
- Easy to amplify
- Long enough to work in all groups
- Short enough for single reads
10. When species have disjunct ranges of variation, barcodes are an efficient diagnostic tool
From Meyer and Paulay, PLoS Biology, 2005
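The point of this slide can be illustrated with a small sketch. Assuming aligned sequences of equal length and a simple p-distance (the proportion of differing sites), the code below compares the largest within-species distance with the smallest between-species distance. When the first is smaller than the second, the two ranges of variation are disjunct. The function names and the toy sequences are hypothetical and are not taken from Meyer and Paulay.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def variation_ranges(records):
    """records: list of (species, aligned sequence) pairs.

    Returns (largest within-species distance, smallest between-species distance).
    """
    within, between = [], []
    for (sp1, seq1), (sp2, seq2) in combinations(records, 2):
        target = within if sp1 == sp2 else between
        target.append(p_distance(seq1, seq2))
    return max(within, default=0.0), min(between, default=float("inf"))


# Toy data, invented for illustration only.
records = [
    ("Species A", "ACGTACGTAC"),
    ("Species A", "ACGTACGTAT"),
    ("Species B", "ACGAACCTGC"),
    ("Species B", "ACGAACCTGT"),
]

max_within, min_between = variation_ranges(records)
ranges_are_disjunct = max_within < min_between
print(max_within, min_between, ranges_are_disjunct)
```

When the ranges overlap, a distance alone cannot separate the species, which is the situation the slide contrasts against.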
11. The Mitochondrial Genome
12. Reactions to Barcoding: 2004
- From ecologists and other users: "This is what we need! How soon can we get started?"
- From traditional taxonomists: "Species should be based on lots of characters, not just barcodes."
- From forward-looking taxonomists: "Using molecular data as species diagnostics isn't new, but standardization and broad implementation are great!"
- From barcoding practitioners: "I had my doubts at the beginning, but it really works as a tool for identification (96% accurate in a recent mollusc paper) and it is at least as good as traditional approaches to discovering new species."
13. Uses of DNA Barcodes
- Applied tool for identifying regulated species
- Disease vectors, agricultural pests, invasives
- Environmental indicators, protected species
- Using minimal samples, damaged specimens, gut contents, droppings
- Research tool for improving species-level taxonomy
- Associating all life history stages, genders
- Testing species boundaries, finding new variants
- Triage tool for flagging potential new species
- Undescribed and cryptic species
15. Associating Life Stages, Processed Parts, Dimorphic Genders
16. Steatogenini until the early 90s
Hypopygus lepturus Hoedeman 1962
Steatogenys elegans
Steatogenys duidae
17. Color patterns in Hypopygus
Nijssen & Isbrücker 1972
18. Steatogenini during the 90s
Hypopygus lepturus Hoedeman 1962
Hypopygus neblinae Mago-Leccia 1994
Steatogenys
19. Steatogenini during the 90s / today
Hypopygus lepturus Hoedeman 1962
Hypopygus neblinae Mago-Leccia 1994
Stegostenopos Triques 1997
Steatogenys
20. R. Bernhard, 2004
21. RAG 1 MP/ML/Dist
[Tree figure; labels: Stegostenopos, Hypopygus neblinae, A, H. lepturus, C, D, Steatogenys]
22. 12S/16S Strict of ML/MP/Dist
[Tree figure; labels: Stegostenopos, H. neblinae, A, C, H. lepturus, D, E, Steatogenys]
23. D-loop MP/ML/Dist
[Tree figure; labels: H. lepturus, D, E]
24. COI - BARCODE MP
[Tree figure; labels: H. lepturus, Eigenmannia sp.]
26. Using DNA Barcodes
- Establish reference library of barcodes from identified voucher specimens
- If necessary, revise species limits
- Then:
- Identify unknowns by searching against reference sequences
- Look for matches (mismatches) against library on a chip
- Before long: Analyze relative abundance in multi-species samples
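As a rough illustration of "identify unknowns by searching against reference sequences", the sketch below compares a query with every entry in a small reference library and reports the closest species only if it falls within a distance cutoff. It assumes pre-aligned sequences of equal length. The `identify` function, the `max_distance` cutoff and the toy library are all hypothetical, and a production search would more likely use an alignment-based search tool than this direct comparison.

```python
def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def identify(query, library, max_distance):
    """Return (species, distance) for the closest reference sequence.

    The species is None when even the closest reference is farther away
    than max_distance, i.e. the query has no acceptable match.
    """
    best_species, best_distance = None, float("inf")
    for species, reference in library:
        d = p_distance(query, reference)
        if d < best_distance:
            best_species, best_distance = species, d
    if best_distance <= max_distance:
        return best_species, best_distance
    return None, best_distance


# Toy reference library, invented for illustration only.
library = [
    ("Species A", "ACGTACGTACGTACGTACGT"),
    ("Species B", "ACGAACCTGCGTTCGTACCT"),
]

close_query = "ACGTACGTACGTACGTACGA"   # one site differs from Species A
distant_query = "T" * 20               # far from both references

close_match = identify(close_query, library, max_distance=0.05)
distant_match = identify(distant_query, library, max_distance=0.05)
print(close_match, distant_match)
```

A query with no acceptable match is the case that the triage use flags for closer taxonomic study.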
27. Analytical chain
- Databasing
- Labeling
- Imaging
- Tissue sampling
- DNA extraction
- PCR
- PCR check
- Sequencing reaction
- Sequencing cleanup
- Sequencing
- Trace editing and submission
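One way to keep track of where a specimen sits in this chain is an ordered enumeration of the steps. The sketch below is only an illustration of that idea: the `Stage` names mirror the list above (the last step is read here as trace editing followed by submission), while `next_stage` and `remaining` are hypothetical helpers, not part of any barcoding software.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Steps of the analytical chain, in the order listed on the slide."""
    DATABASING = 1
    LABELING = 2
    IMAGING = 3
    TISSUE_SAMPLING = 4
    DNA_EXTRACTION = 5
    PCR = 6
    PCR_CHECK = 7
    SEQUENCING_REACTION = 8
    SEQUENCING_CLEANUP = 9
    SEQUENCING = 10
    TRACE_EDITING_AND_SUBMISSION = 11


def next_stage(current: Stage):
    """Return the step after `current`, or None after the final step."""
    if current == max(Stage):
        return None
    return Stage(current + 1)


def remaining(current: Stage):
    """List the steps still to be done after `current`."""
    return [stage for stage in Stage if stage > current]


print(next_stage(Stage.PCR).name)
print([stage.name for stage in remaining(Stage.SEQUENCING_CLEANUP)])
```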
28. Methods
29. Barcode of Life Database
31. Current Norm: High throughput
Large capacity PCR and sequencing reactions
ABI 3100 capillary automated sequencer
32. Cost of Reagents and Disposables
33. Producing Barcode Data 2008: Faster, more portable
Hundreds of samples per hour
Integrated DNA microchips
Table-top microfluidic systems
34. Producing Barcode Data 2010? Barcode data anywhere, instantly
- Data in seconds to minutes
- Pennies per sample
- Link to reference database
- A taxonomic GPS
- Usable by non-specialists
36. What DNA Barcoding is NOT
- Barcoding is not DNA taxonomy: no single gene (or character) is adequate
- Barcoding is not Tree of Life: barcode clusters are not phylogenetic trees
- Barcoding is not just COI: standardizing on one region has benefits and limits
- Using molecules in taxonomy is not new, but large scale and standardization are new
- Barcoding can help to create a 21st century research environment for taxonomy
40. Consortium for the Barcode of Life (CBOL)
- First barcoding publications in 2002
- Cold Spring Harbor planning workshops in 2003
- Sloan Foundation grant, launch in May 2004
- Secretariat opens at Smithsonian, September 2004
- First international conference, February 2005
- Now an international affiliation of:
- 120 Member Orgs, 40 countries, 6 continents
- Natural history museums, biodiversity organizations
- Users: e.g., government agencies
- Private sector: biotech companies, database providers
41. Consortium for the Barcode of Life (CBOL)
Smithsonian Institution/Sloan Foundation
125 Member Organizations, 40 Countries as of June 2006
42. Outreach Activities
- Regional meetings in:
- Cape Town, South Africa, 7-8 April 2006, SANBI
- Nairobi, Kenya, 18-19 October 2006
- Brazil, February 2007
- Taiwan, September 2007
- Second International Barcode Conference
- Taiwan, September 2007
- Support from CBOL, host governments and international development agencies
43. Goals of Regional Meetings
- Raise awareness
- Explore potential applications in the region
- Assess greatest needs and opportunities in the region
- Identify highest priorities, construct national and regional action plans
- Start intra-regional networks and intercontinental partnerships
44. CBOL-Initiated Projects
- Fish Barcode of Life (FISH-BOL)
- 30,000 marine/freshwater species by 2010
- All Birds Barcoding Initiative (ABBI)
- 10,000 species by 2010
- Tephritid fruit flies
- 2,000 pest/beneficial species and relatives by 2008
- Mosquitoes
- 3,300 species by 2008
- African Scale Insects
- Endangered vertebrates (bushmeat)
45. Projects initiated by others
- CMarZ: Marine habitat, multiple taxa
- All-Leps: Multiple regions/habitats, single taxon
- BioCode, Moorea: Single location, multiple habitats, multiple taxa
46. CBOL's Working Groups
- Database: Designing/constructing the Barcode Section of GenBank
- DNA: Protocols for formalin-fixed and old museum specimens; producing LIMS for dissemination
- Data Analysis: Beyond phenetic methods; population genetics perspective
- Plants: Identify gene region(s) for barcoding
47. Barcode Data Standards
- Consensus results of Front Royal meeting
- GBIF, ITIS, GRIN
- NBII, Species2000, IPNI
- ICZN, ZooRecord, OBIS
- Structured link to voucher specimen
- Species name selected from authority
- Online access to metadata
- Trace files and quality scores
- Minimum sequence length
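The five record requirements listed above lend themselves to a simple completeness check. The sketch below is one hypothetical way to express them: the `BarcodeRecord` fields, the `missing_elements` function, the example identifiers and the 500-base cutoff are all placeholders chosen for illustration, since the slide names the requirements but gives no field names or numeric minimum.

```python
from dataclasses import dataclass, field


@dataclass
class BarcodeRecord:
    sequence: str
    species_name: str
    voucher_id: str       # structured link to the voucher specimen
    metadata_url: str     # where the specimen metadata can be read online
    trace_files: list = field(default_factory=list)
    quality_scores: list = field(default_factory=list)


def missing_elements(record, authority_names, min_length):
    """Return the listed requirements that the record does not meet."""
    problems = []
    if not record.voucher_id:
        problems.append("structured link to voucher specimen")
    if record.species_name not in authority_names:
        problems.append("species name selected from authority")
    if not record.metadata_url:
        problems.append("online access to metadata")
    if not record.trace_files or not record.quality_scores:
        problems.append("trace files and quality scores")
    if len(record.sequence) < min_length:
        problems.append("minimum sequence length")
    return problems


# Placeholder values, invented for illustration only.
authority_names = {"Hypopygus lepturus"}
MIN_LENGTH = 500  # assumed cutoff; the slide does not state a number

complete = BarcodeRecord(
    sequence="ACGT" * 150,
    species_name="Hypopygus lepturus",
    voucher_id="MUSEUM:FISH:12345",
    metadata_url="https://example.org/specimen/12345",
    trace_files=["forward.ab1", "reverse.ab1"],
    quality_scores=[40, 38],
)
incomplete = BarcodeRecord(
    sequence="ACGT" * 50, species_name="Unknown sp.", voucher_id="", metadata_url=""
)

print(missing_elements(complete, authority_names, MIN_LENGTH))
print(missing_elements(incomplete, authority_names, MIN_LENGTH))
```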
48. BARCODE Records in INSDC
Voucher Specimen
Species Name
- Indices: Catalog of Life, GBIF/ECAT
- Nomenclators: Zoo Record, IPNI, NameBank
- Publication links: New species
- Databases: Provisional sp.
Specimen Metadata
- Georeference, Habitat, Character sets, Images, Behavior, Other genes
Barcode Sequence
- Trace files
- Primers
Other Databases
- Phylogenetic, Population Genetics, Ecological
Literature (link to content or citation)
49. Digitizing Taxonomic Literature
- CBOL's catalytic efforts:
- Library-Laboratory meeting in London on electronic access to taxonomic literature
- Led to formation of Biodiversity Heritage Library initiative
- Proactive steps with PubMed to add taxonomic journals to online abstracts
- Aggressive negotiation with publishers of barcoding papers
51. CBOL Formalin Workshop
- May 8-9 workshop in Washington, National Research Council
- Chemists, biochemists, biophysicists, biomedical researchers
- Literature survey of DNA recovery protocols from formalin-fixed specimens
- Create a new research agenda
- Workshop report in July
- Follow-on with SPNHC as major partner
52. The Barcode Assembly Line 2008: Opening the museum treasure-trove
Freshly collected specimens
Formalin-fixed specimens
Older museum specimens
Young museum specimens
Frozen tissue
DNA Barcode Data
54. Some Technical Challenges
- Using character-based barcodes
- Optimizing sample size
- Specimen identification versus species discovery
- Measuring confidence
- Shrinking the barcode
Workshop, July 2006, NMNH Paris; Rutgers/DIMACS, European Science Foundation
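To make "character-based barcodes" concrete, the sketch below looks for diagnostic sites: alignment positions where every sequence of one species carries the same base and no sequence of any other species carries it. This is one simple reading of the idea, not a description of the workshop's methods. The `diagnostic_sites` function and the toy alignment are hypothetical.

```python
from collections import defaultdict


def diagnostic_sites(records):
    """records: list of (species, aligned sequence) pairs of equal length.

    Returns {species: [(position, base), ...]} where the base is fixed
    within the species and absent from all other species at that position.
    """
    by_species = defaultdict(list)
    for species, seq in records:
        by_species[species].append(seq)

    length = len(records[0][1])
    result = {}
    for species, seqs in by_species.items():
        sites = []
        for i in range(length):
            own_bases = {s[i] for s in seqs}
            if len(own_bases) != 1:
                continue  # variable within the species, so not diagnostic
            base = next(iter(own_bases))
            other_bases = {
                s[i]
                for other, other_seqs in by_species.items()
                if other != species
                for s in other_seqs
            }
            if base not in other_bases:
                sites.append((i, base))
        result[species] = sites
    return result


# Toy alignment, invented for illustration only.
records = [
    ("Species A", "ACGTACGTAC"),
    ("Species A", "ACGTACGTAT"),
    ("Species B", "ACGAACCTGC"),
    ("Species B", "ACGAACCTGT"),
]

sites = diagnostic_sites(records)
print(sites)
```

A shorter barcode remains usable under this approach only if it still spans enough diagnostic sites, which ties it to the "shrinking the barcode" challenge in the same list.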
55. CBOL's Working Groups
- Database: Designing/constructing the Barcode Section of GenBank
- DNA: Protocols for formalin-fixed and old museum specimens; producing LIMS for dissemination
- Data Analysis: Beyond phenetic methods; population genetics perspective
- Plants: Identify gene region(s) for barcoding by December 2006
56. Progress toward Plant Barcode
- Kress 2005 proposal for ITS and trnH-psbA
- Kew Garden receives Sloan/Moore Foundation support
- Phase 1 screens 100 genes across 50 sibling species pairs
- Phase 2 tests of matK, rpoC1, rpoB, ndhJ, and accD
- Canadian proposal for rbcL
- CBOL protocols for approving barcode regions
57. Wider Impacts of Barcoding
- Catalyzing interoperability of databases
- Barcode data standards link sequences, specimens, species names and publications
- Improving the information infrastructure
- Digital library initiative in taxonomy
- Renewing the mission of museums
- DNA recovery from formalin-fixed specimens
- Promoting the growth of DNA banks
- Expanding analytical toolbox for taxonomy
58. Taipei Barcode Conference
- Academia Sinica, week of 17 September
- Second International Barcode Conference
- Regional Barcode Meeting for South/SE Asia
- CBOL Working Groups
- FISH-BOL/Marine Fisheries workshop
- Short course on biodiversity informatics