Title: E-infrastructure for organism names to facilitate data sharing
1 E-infrastructure for organism names to facilitate data sharing
- Project presentation
- Joint project of the Nordic GBIF Nodes, funded by NordForsk, 2008-2010
2 Why does regional cooperation make sense?
- Neighbours need to learn from each other because they share biodiversity
- They share similar problems and are often at the same level of science and technology
- Language and cultural barriers and the cost of meetings are low
- There is often a need for data sharing within regions
- In the Nordic region data is being shared, but data flows are moving to the European level
- There are funding sources that work at the regional level
- Several sources in the Nordic region
3 A Nordic Code Centre operated in 1981-94. It issued unique numbers for taxa and 9-letter codes (abbreviations for field work) for names. These can still be downloaded, but they are updated in several places and the updates are not publicly available.
4 Proposal funded by
5 Scientific names in observational databases pose problems
- Cannot identify a taxonomic concept unequivocally
- Anthropomorphic: they mix content with the key, and are thus prone to change
- Do not scale up to integrating hundreds of databases automatically
- Life Science Identifier (LSID) technology is seen as the solution
17 Life Science Identifiers (LSID) for every taxonomic concept
- Example: urn:lsid:gbif.fi:taxon:1234567890:1
- Identifier for a taxon concept, for example Vanessa cardui L., sensu Saarenmaa & Saarenmaa 2027.
- The six elements of an LSID are separated by colons:
- "urn" states that this is a Uniform Resource Name (URN).
- "lsid" denotes that this URN is an LSID.
- Authority identification (normally whoever issued the LSID and provides the web service to resolve it).
- Namespace identification (normally the name of a database at the authority).
- Object identification (a meaningless GUID or database key).
- Optional revision number.
- Resolving an LSID means retrieving the description from an authority.
- LSID authorities are now being set up by many species databases and regional centres.
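The element structure described on this slide can be sketched as a small parser. This is only an illustration, not the project's software: the names `LSID` and `parse_lsid` are hypothetical, and the sample identifier `urn:lsid:gbif.fi:taxon:1234567890:1` is an assumed reading of the slide's example, whose colons are missing in the transcript.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    """The variable elements of an LSID (the fixed 'urn' and 'lsid' are implied)."""
    authority: str                  # who issued the LSID and resolves it
    namespace: str                  # normally a database at the authority
    object_id: str                  # meaningless GUID or database key
    revision: Optional[str] = None  # optional sixth element

    def __str__(self) -> str:
        parts = ["urn", "lsid", self.authority, self.namespace, self.object_id]
        if self.revision is not None:
            parts.append(self.revision)
        return ":".join(parts)


def parse_lsid(text: str) -> LSID:
    """Split a colon-separated LSID string into its elements."""
    parts = text.strip().split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"expected 5 or 6 colon-separated elements: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID (must start with 'urn:lsid:'): {text!r}")
    if not all(parts[2:]):
        raise ValueError(f"empty element in LSID: {text!r}")
    revision = parts[5] if len(parts) == 6 else None
    return LSID(parts[2], parts[3], parts[4], revision)


example = parse_lsid("urn:lsid:gbif.fi:taxon:1234567890:1")
print(example.authority, example.namespace, example.object_id, example.revision)
```

Parsing only splits the identifier; resolving it (retrieving the description from the authority) is a separate network step that this sketch does not attempt.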
21 Purpose of the joint project
- Build an e-infrastructure for resolving scientific names of organisms to facilitate biodiversity data use and data sharing in the Nordic region and beyond.
- Set up a service on the Internet that will issue globally unique identifiers for scientific names (?) and the underlying taxonomic concepts (!), based on the LSID specification.
- Environmental authorities, research groups, and mobile observers out in the wild can then use these identifiers to remove ambiguities in data exchange. Among the benefits, large integrated studies that need to combine data, for instance global change studies, become more feasible.
- The project takes the participants, which are already major research infrastructure elements, into new electronic frontiers and increases the interaction among them. This is the first joint project of the Nordic GBIF nodes, and it also aims to strengthen Nordic cooperation in the global GBIF process.
22 The proposed e-infrastructure will offer the following services
- Issue an LSID for each scientific name and each circumscription, and resolve the LSIDs upon request.
- Integrate the services globally through the TDWG infrastructure project.
- Coordinate integration of name lists in the Nordic region, initially for Lepidoptera.
- Develop guidelines for incorporating LSIDs in datasets. Offer helpdesk and training. Disseminate results.
- Promote data sharing in the region.
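The first service above, issuing an LSID for each name and each circumscription and resolving it on request, can be pictured with a toy in-memory registry. This is a sketch under stated assumptions, not the planned service: the class `LsidRegistry`, the authority `example.org`, the namespaces `name` and `concept`, and the metadata fields are all hypothetical, and a real LSID authority resolves identifiers over the network rather than from a dictionary.

```python
from itertools import count
from typing import Dict


class LsidRegistry:
    """Toy in-memory authority for one namespace (not a real LSID resolver)."""

    def __init__(self, authority: str, namespace: str) -> None:
        self.authority = authority
        self.namespace = namespace
        self._next_id = count(1)            # meaningless running object key
        self._records: Dict[str, dict] = {}

    def issue(self, metadata: dict) -> str:
        """Mint a new LSID and store the description it identifies."""
        lsid = f"urn:lsid:{self.authority}:{self.namespace}:{next(self._next_id)}"
        self._records[lsid] = dict(metadata)
        return lsid

    def resolve(self, lsid: str) -> dict:
        """Return a copy of the description stored for an LSID."""
        try:
            return dict(self._records[lsid])
        except KeyError:
            raise LookupError(f"unknown LSID: {lsid}") from None


# Hypothetical authority and namespaces, for illustration only.
names = LsidRegistry("example.org", "name")
concepts = LsidRegistry("example.org", "concept")

# One scientific name, two circumscriptions (concepts) that both use it.
name_lsid = names.issue({"scientific_name": "Vanessa cardui", "author": "L."})
concept_a = concepts.issue({"name": name_lsid, "according_to": "hypothetical treatment A"})
concept_b = concepts.issue({"name": name_lsid, "according_to": "hypothetical treatment B"})

print(concept_a, concepts.resolve(concept_a))
```

Keeping names and concepts in separate namespaces lets two datasets that use the same name string still state which circumscription they mean.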
23 Participants are GBIF Nodes in the region plus NW Russia
- Natural History Museum of Denmark
- Estonian Life Sciences University
- Finnish Museum of Natural History (coordinator)
- University of Oslo, Natural History Museums
- Swedish Museum of Natural History
- Russian Academy of Sciences, Zoological Institute, Sankt Petersburg
24 Advisory Group
- Starri Heidmarsson (GBIF-Iceland)
- Liisa Tuominen-Roto (Finnish Environment Institute)
- Søren Roug (European Environment Agency)
- Kevin Richards (GBIF New Zealand)
- Per Alström (Artdatabanken)
- Ricardo Pereira (TDWG)
- Frank Bisby (Species 2000)
25 Timeline - Steps
- Month 1: Hire workforce. Hold kick-off meeting with the Steering Group.
- Month 3: Acquire and set up server computers. Install LSID software tools. Study experiences from other LSID projects. Acquire relevant name lists.
- Month 6: Issue LSIDs for all names and tentatively generate a naked LSID (without a link to a description) for each corresponding concept. Link simple synonyms together. Integrate several test databases of observations. Have entered 25 descriptions by hand from literature for homonyms and split species.
- Month 9: Open a test service for issuing LSIDs for new names and resolving them. Present a demonstration of simple data integration. Design the linking to descriptions from literature. Have entered 50 descriptions by hand from literature for homonyms and split species. Hold a meeting of the IT Advisory Group.
- Month 12: Acquire links to literature references from uBio and BHL, and link them to the concept LSIDs. Have entered 100 descriptions by hand from literature. Start linking together split species and homonyms. Set up the TAPIR/TCS name provider. Hold a meeting of the Coordinating Group, looking in particular at lessons learnt and new opportunities for Nordic GBIF cooperation.
- Month 15: Have entered 150 descriptions by hand from literature and automatic sources. Write guidelines on how to incorporate LSIDs in observation databases and data collection tools. Test these guidelines with the global fish list and the Danish, Estonian, and Norwegian all-species lists.
- Month 18: Open the service on the Internet for LSID resolution. First operational use of LSIDs in data exchange. Write papers. Present results at TDWG and other meetings. Prepare proposals to the Steering Group about continuation projects.
- Month 24: Have entered 300 descriptions by hand from literature and automatic sources. Start working on lists other than Lepidoptera to test the procedures. Hold a meeting of the Steering Group and explore, as a continuation, the feasibility of a Nordic Catalogue of Life project that would build permanent infrastructure for biodiversity informatics in the region.
- Month 30: Experiment with using LSIDs in observation databases. Write thesis. Write about lessons learned. Propose further Nordic GBIF projects at this stage at the latest.
- Month 36: Final meeting of the Coordinating Group. Training workshop to disseminate results. Transition of the project to other bodies.
26 Contributions by Partner
- Denmark (SNM-UKBH): Contribute names of all Danish taxa (preparations for a national dataset of all species are being finalised; it will be shared through the Danish GBIF node portal, www.danbif.dk, in the near future). Feed updates on the Danish inventory of Lepidoptera to the Norwegian dynamic checklist of Nordic Lepidoptera.
- Estonia (EMU): Contribute names of all Estonian taxa. Test observation databases to increase availability of data.
- Finland (FMNH): Good Lepidoptera data exists and is already being shared. Contribute names for Finnish Lepidoptera. Carry out project management and technical development.
- Norway (University of Oslo and Artsdatabanken): Some Lepidoptera data exists and is already being shared. Possibly establish a data collection portal to increase availability of data. Contribute names for all Norwegian taxa.
- Russia (ZIN): Contribute Lepidoptera names. Test observation databases and digitize sample collection data.
- Sweden (NRM): Good Lepidoptera data exists and is already being shared. Contribute names for Nordic Lepidoptera and global fish.
- All: Put LSIDs in selected observation datasets according to guidelines that will be developed by the project. Share experience between GBIF nodes in the region.
27 Project organisation
- Project leader will oversee the work of the technological development unit, communicate with the Coordinating Group, the Advisory Group, and the funding organisation, and ensure the quality of the project deliverables.
- Coordinating Group will be made up of the project leaders of all partners. This group has six members and will meet at least annually, but will work intensively by email.
- Advisory Group, consisting of major users and experts of informatics solutions, will be invited internationally. It will consist of about 5 people from international organisations such as GBIF, TDWG, Catalogue of Life, and the European Environment Agency, from national environment authorities, and from research groups that are well-known leaders in this area. This group will meet at least once in person, but will comment on plans and deliverables electronically.
- Technological development unit: Nina Markus.
28 Use of funds
- 1 million NOK (about EUR 125,000)
- Finland 71.8%
- 79% to salaries, 8% travel, 3% computers, 10% overhead
- Others 5.6% each
- One trip annually
- Deliver name lists
- Enter LSIDs in databases
- Mobilise databases
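The split on this slide can be checked with a few lines of arithmetic. This is a rough sketch that assumes the bare numbers on the slide are percentages of the 1 million NOK total and that "Others" means the five partner countries besides Finland; neither is stated explicitly in the transcript, and the variable names are illustrative.

```python
TOTAL_NOK = 1_000_000

# Partner shares in tenths of a percent (assumed reading: 71.8 % and 5.6 % each).
# Integers avoid floating-point rounding in the sums.
SHARES_TENTHS = {
    "Finland": 718,
    "Denmark": 56,
    "Estonia": 56,
    "Norway": 56,
    "Russia": 56,
    "Sweden": 56,
}

# Cost categories in whole percent (assumed reading of "79 ... 8 ... 3 ... 10").
CATEGORIES_PCT = {"salaries": 79, "travel": 8, "computers": 3, "overhead": 10}

amounts = {partner: TOTAL_NOK * share // 1000 for partner, share in SHARES_TENTHS.items()}
allocated = sum(amounts.values())
unallocated = TOTAL_NOK - allocated

for partner, nok in amounts.items():
    print(f"{partner:8s} {nok:>9,d} NOK")
print(f"allocated {allocated:,d} NOK, remainder {unallocated:,d} NOK")
print("categories sum to", sum(CATEGORIES_PCT.values()), "percent")
```

Under these assumptions the partner shares add up to 99.8 % of the total, so the small remainder is most likely a rounding effect in the slide's figures.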
29 Questions
- How are name databases handled in the participating countries?
- Do these databases make the taxonomic concepts explicit?
- Can we increase the availability of data in targeted groups?
- Any LSID experience so far?