Title: DNA Barcoding: An Emerging Global Standard for Species Identification
1. DNA Barcoding: An Emerging Global Standard for Species Identification
- Consortium for the Barcode of Life
- National Museum of Natural History
- Smithsonian Institution
- http://www.barcoding.si.edu
- 202/633-0808, fax 202/633-2938
2. A DNA barcode is a short gene sequence, taken from standardized portions of the genome, used to identify species
3. Characteristics of Barcode Regions
- Flanked by conserved regions
- Easy to amplify
- Low intraspecies variability
- Discontinuous variation between species
- Long enough to work in all groups
- Short enough for single reads
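Two of the characteristics above, low intraspecies variability and discontinuous variation between species, can be checked numerically. The sketch below is only an illustration under stated assumptions: the sequences are made-up toy data, the species labels are hypothetical, and the uncorrected p-distance is one simple choice of distance, not a method prescribed by the slides.

```python
from itertools import combinations

def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = sum(1 for x, y in zip(a, b) if x != y)
    return diffs / len(a)

def barcode_gap(records):
    """records: list of (species, sequence) pairs.

    Returns (largest within-species distance, smallest between-species distance).
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    return max(intra), min(inter)

# Hypothetical toy data: two specimens for each of two invented species.
toy = [
    ("species_A", "ACGTACGTAC"),
    ("species_A", "ACGTACGTAT"),
    ("species_B", "ACGAACCTGC"),
    ("species_B", "ACGAACCTGC"),
]

max_intra, min_inter = barcode_gap(toy)
print(f"max within-species distance:  {max_intra:.2f}")
print(f"min between-species distance: {min_inter:.2f}")
print("gap present:", max_intra < min_inter)
```

A region behaves well as a barcode for a group when the largest within-species distance stays clearly below the smallest between-species distance.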
4. The Mitochondrial Genome
5. Using DNA Barcodes
- Establish reference library of barcodes from identified voucher specimens
- If necessary, revise species limits
- Then:
- Identify unknowns by searching against reference sequences
- Look for matches (mismatches) against library on a chip
- Before long: Analyze relative abundance in multi-species samples
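The step of identifying unknowns by searching against reference sequences can be sketched as a nearest-match lookup. This is a minimal illustration, not the search used by any real barcode database: the library contents, the `identify` function, and the `threshold` value are all hypothetical, and the sequences are assumed to be already aligned to the same length.

```python
def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y) / len(a)

def identify(query: str, library, threshold: float):
    """Find the closest reference sequence for a query.

    library: list of (species, reference_sequence) pairs.
    Returns (species, distance) when the best match is within threshold,
    otherwise (None, distance) to flag the query as unmatched.
    """
    best_species, best_dist = None, float("inf")
    for species, ref in library:
        d = p_distance(query, ref)
        if d < best_dist:
            best_species, best_dist = species, d
    if best_dist <= threshold:
        return best_species, best_dist
    return None, best_dist

# Hypothetical reference library built from identified voucher specimens.
library = [
    ("species_A", "ACGTACGTAC"),
    ("species_B", "ACGAACCTGC"),
]

# The threshold here is an arbitrary illustrative value.
close_match = identify("ACGTACGTAT", library, threshold=0.1)
no_match = identify("TTTTTTTTTT", library, threshold=0.1)
print(close_match)
print(no_match)
```

Queries that fall outside the threshold are the ones a triage workflow would flag for closer taxonomic attention.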
6. Analytical chain
- Databasing
- Labeling
- Imaging
- Tissue sampling
- DNA extraction
- PCR
- PCR check
- Sequencing reaction
- Sequencing cleanup
- Sequencing
- Trace editing and submission
7. BoLD Data System
- Developed/hosted by Univ. Guelph
- Workbench for most barcode projects
- Laboratory Information Management System (LIMS) for assembling data
- Management and Analysis System
- Identification system for matching unknowns to reference records
- Uploading to GenBank
8. Methods
9. Barcode of Life Database
11. Current Norm: High throughput
Large capacity PCR and sequencing reactions
ABI 3100 capillary automated sequencer
12. Cost of Reagents and Disposables
13. Producing Barcode Data, 2008: Faster, more portable
Hundreds of samples per hour
Integrated DNA microchips
Table-top microfluidic systems
14. Producing Barcode Data, 2010? Barcode data anywhere, instantly
- Data in seconds to minutes
- Pennies per sample
- Link to reference database
- A taxonomic GPS
- Usable by non-specialists
15. Methods
16. Uses of DNA Barcodes
- Applied tool for identifying regulated species
- Disease vectors, agricultural pests, invasives
- Environmental indicators, protected species
- Using minimal samples, damaged specimens, gut contents, droppings
- Research tool for improving species-level taxonomy
- Associating all life history stages, genders
- Testing species boundaries, finding new variants
- Triage tool for flagging potential new species
- Undescribed and cryptic species
18. Associating Life Stages, Processed Parts, Dimorphic Genders
19. Steatogenini until the early 90s
Hypopygus lepturus Hoedeman 1962
Steatogenys elegans
Steatogenys duidae
20. Color patterns in Hypopygus
Nijssen & Isbrücker 1972
21. Steatogenini during the 90s
Hypopygus lepturus Hoedeman 1962
Hypopygus neblinae Mago-Leccia 1994
Steatogenys
22. Steatogenini during the 90s / today
Hypopygus lepturus Hoedeman 1962
Hypopygus neblinae Mago-Leccia 1994
Stegostenopos Triques 1997
Steatogenys
23. R. Bernhard, 2004
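24. RAG1 (MP/ML/Dist)
[Tree figure; labels: Stegostenopos, Hypopygus neblinae, A, H. lepturus, C, D, Steatogenys]
25. 12S and 16S (strict consensus of ML/MP/Dist)
[Tree figure; labels: Stegostenopos, H. neblinae, A, C, H. lepturus, D, E, Steatogenys]
26. D-loop (MP/ML/Dist)
[Tree figure; labels: H. lepturus, D, E]
27. COI - BARCODE (MP)
[Tree figure; labels: H. lepturus, Eigenmannia sp.]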
29. Wider Impacts of Barcoding, 2008
- Catalyzing interoperability of databases
- Barcode data standards link sequences, specimens, species names and publications
- Improving the information infrastructure
- Digital library initiative in taxonomy
- Renewing the mission of museums
- DNA recovery from formalin-fixed specimens
- Promoting the growth of DNA banks
- Expanding analytical toolbox for taxonomy
30. What DNA Barcoding is NOT
- Barcoding is not DNA taxonomy: no single gene (or character) is adequate
- Barcoding is not Tree of Life: barcode clusters are not phylogenetic trees
- Barcoding is not just COI: standardizing on one region has benefits and limits
- Molecules in taxonomy is not new, but large scale and standardization are new
- Barcoding can help to create a 21st century research environment for taxonomy
33. What DNA Barcoding is NOT
- Barcoding is not DNA taxonomy: no single gene (or character) is adequate
- Barcoding is not Tree of Life: barcode clusters are not phylogenetic trees
- Barcoding is not just COI: standardizing on one region has benefits and limits
- Molecules in taxonomy is not new, but large scale and standardization are new
- BUT: Barcoding can help to create a 21st century research environment for taxonomy
34. Consortium for the Barcode of Life (CBOL)
- First barcoding publications in 2002
- Cold Spring Harbor planning workshops in 2003
- Sloan Foundation grant, launch in May 2004
- Secretariat opens at Smithsonian, September 2004
- First international conference, February 2005
- Now an international affiliation of:
- 130 Member Orgs, 40 countries, 6 continents
- Natural history museums, biodiversity organizations
- Users, e.g., government agencies
- Private sector: biotech companies, database providers
35. CBOL Member Organizations, June 2006: 120 Member Organizations, 40 countries
36. CBOL's Working Groups
- Database: Designing/constructing the Barcode Section of GenBank
- DNA: Protocols for formalin-fixed and old museum specimens; producing LIMS for dissemination
- Data Analysis: Beyond phenetic methods; population genetics perspective
- Plants: Identify gene region(s) for barcoding
37. Infrastructure of Taxonomy: Fragmented, Disconnected
- Collections and databases of specimens
- Compilations of taxonomic names
- Data repositories (characters, gene sequences, images, trees)
- Monographs
- Floristic and faunistic surveys/inventories
- Revisions
- The (undigitized) Taxonomic Literature
38. Barcode Records in INSDC
- Consensus results of Front Royal meeting
- GBIF, ITIS, GRIN
- NBII, Species2000, IPNI
- ICZN, ZooRecord, OBIS
- Structured link to voucher specimen
- Species name selected from authority
- Online access to metadata
- Trace files and quality scores
- Minimum sequence length
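The record requirements listed above (voucher link, authority-based species name, trace files, minimum sequence length) lend themselves to a simple completeness check. The sketch below is an assumption-laden illustration, not the INSDC or GenBank schema: the `BarcodeRecord` fields, the voucher identifier format, the file names, and the `min_length` value are all hypothetical, since the slide gives no specific values.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class BarcodeRecord:
    species_name: str   # name selected from a taxonomic authority
    voucher_id: str     # structured link to the voucher specimen
    sequence: str       # the barcode sequence itself
    trace_files: List[str] = field(default_factory=list)

def missing_elements(record: BarcodeRecord, min_length: int) -> List[str]:
    """List which required elements a record lacks (empty list means complete)."""
    problems = []
    if not record.species_name.strip():
        problems.append("species name")
    if not record.voucher_id.strip():
        problems.append("voucher specimen link")
    if not record.trace_files:
        problems.append("trace files")
    if len(record.sequence) < min_length:
        problems.append("minimum sequence length")
    return problems

# Hypothetical records; identifiers and file names are invented for illustration.
complete = BarcodeRecord(
    species_name="Hypopygus lepturus",
    voucher_id="MUSEUM:FISH:0001",
    sequence="ACGT" * 30,
    trace_files=["forward.ab1", "reverse.ab1"],
)
incomplete = BarcodeRecord(species_name="", voucher_id="", sequence="ACGT")

# min_length=100 is an arbitrary illustrative cutoff, not the standard's value.
print(missing_elements(complete, min_length=100))
print(missing_elements(incomplete, min_length=100))
```

A check of this kind would run before submission, so that incomplete records are caught while the specimen and trace data are still at hand.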
39. BARCODE records in GenBank
Voucher Specimen
Species Name
Specimen Metadata
Georeference, Habitat, Character sets, Images, Behavior, Other genes
Indices: Catalog of Life, GBIF/ECAT
Nomenclators: Zoo Record, IPNI, NameBank
Publication links: New species
Barcode Sequence
Trace files
Primers
Other Databases
Literature (link to content or citation)
Phylogenetic, Popn Genetics, Ecological
40. Digitizing Taxonomic Literature
- CBOL's catalytic efforts
- Library-Laboratory meeting in London on electronic access to taxonomic literature
- Led to formation of Biodiversity Heritage Library initiative
- Proactive steps with PubMed to add taxonomic journals to online abstracts
- Aggressive negotiation with publishers of barcoding papers
42. The Barcode Assembly Line, 2006
Freshly collected specimens
Young museum specimens
Frozen tissue
DNA Barcode Data
43. The Barcode Assembly Line, 2008: Opening the museum treasure-trove
Freshly collected specimens
Formalin-fixed specimens
Older museum specimens
Young museum specimens
Frozen tissue
DNA Barcode Data
44. CBOL Formalin Workshop
- Literature survey of DNA recovery protocols from formalin-fixed specimens
- Solicited proposal from National Research Council
- May 8-9 workshop in Washington
- Chemists, biochemists, biophysicists, biomedical researchers
- Create a new research agenda
46. Data analysis protocols in 2008: A Bigger, Better Analytical Toolkit to handle the Barcode Data Explosion
- Collaboration of statisticians, computer scientists, population geneticists
- Sampling issues
- Sample size versus confidence level
- Sample size in light of geography, gene flow
- Analytical tools and protocols
- Treatment of missing DNA site data
- Identification versus species delimitation (classification versus clustering)
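The trade-off between sample size and confidence level can be made concrete with a simple binomial calculation. This is only a sketch under a strong assumption: specimens are drawn independently and at random, which ignores exactly the geography and gene-flow issues raised above. The function names and the example frequency are hypothetical.

```python
import math

def detection_probability(freq: float, n: int) -> float:
    """Chance of seeing a variant of frequency `freq` at least once in n random samples."""
    return 1.0 - (1.0 - freq) ** n

def samples_needed(freq: float, confidence: float) -> int:
    """Smallest n whose detection probability reaches `confidence`.

    Solves 1 - (1 - freq)**n >= confidence for n.
    """
    if not (0.0 < freq < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("freq and confidence must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - freq))

# Example: a haplotype carried by 10% of individuals, sought with 95% confidence.
n = samples_needed(0.10, 0.95)
print(n, round(detection_probability(0.10, n), 4))
```

Because real populations are structured, a figure obtained this way is better read as a lower bound on sampling effort than as a target.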
47. CBOL's Working Groups
- Database: Designing/constructing the Barcode Section of GenBank
- DNA: Protocols for formalin-fixed and old museum specimens; producing LIMS for dissemination
- Data Analysis: Beyond phenetic methods; population genetics perspective
- Plants: Identify barcode gene region(s) for land plants
48. Progress toward Plant Barcode
- Kress 2005 proposal for ITS and trnH-psbA
- Kew Garden receives Sloan/Moore Foundation support
- Phase 1 screens 100 genes across 50 sibling species pairs
- Phase 2 tests of matK, rpoC1, rpoB, ndhJ, and accD
- Canadian proposal for rbcL
- CBOL protocols for approving barcode regions
49. Current and Planned CBOL Barcoding Projects
- FishBOL and All Birds Initiatives
- Demonstrator Systems by 2008
- Tephritid fruit flies (agricultural pests)
- Mosquitoes (disease vectors)
- African Scale Insect Barcoding Initiative (planned at Cape Town Regional Meeting)
- Barcoding for Conservation Committee
50. Launching CBOL Projects
- Assembling Steering Committee
- Users
- Taxonomists, collection curators
- Service providers (BoLD, analytical labs)
- Plan for scope, timetable, logistics
- Pilot tests of primers, PCR amplification
- Assemble pipeline of specimens to lab
51. ABBI and FISH-BOL
- Global initiatives to create reference library
- Enable users to adopt barcode ID systems
- All-species barcode database will
- Strengthen specimen/species data
- Improve collections, tissue/DNA resources
- Attract users to barcoding for specimen IDs
- Regional Working Groups
- Small Steering Committee and CBOL
52. Planned Outreach
- Regional meetings in:
- Cape Town, South Africa, 7-8 April 2006, SANBI
- Nairobi, Kenya, 18-19 October 2006, NMK
- São Paulo, Brazil, February 2007, INPA
- Southern/SE Asia, mid-2007
- Second International Barcode Conference
- Southeast Asia (?), September 2007 (?)
- Support from CBOL, host governments and international development agencies
53. Milestones for 2008
[Timeline chart by quarter, 2006 to 2008; milestones listed by track]
Database: 100K records, 200K records, 500K records
DNA WG: Formalin Study, Advanced Lab Protocols
Plant WG: Development of Consensus Plant Barcode Region
International Conference, Demonstrator System Launched
Data Analysis WG: Data Analysis Protocols and S/W
Database WG: BoLI Data Portal Launched, Extended DB Interoperability, Data Standards
Campaigns: Regional Groups Operational, First Data Releases (10K birds, 30K fish)