Title: Focal Person's point of view
1Focal Person's point of view - Interaction with EURISCO
- Iva Faberová
- September 2003
2Introduction
- EURISCO structure and application
- Real contact with the data: conversion, upload or finding information - different experience
- The web application should first of all serve the user
3User category
- Primary user
- - NFP collecting, exporting and delivering data - active
- - password-protected area
- Secondary user
- - common user looking for seeds, PGR information
- - open information access
4Documentation system levels 1
- Existing or developed ex situ PGRFA National Inventories
- Country level of standardization - all crops
- passport standards
- - variable database platforms
- - centralized / distributed systems
- NI have been developed since the eighties, when wide computerization started
5Documentation system levels 2
- Crop level across Europe
- Development of CCDBs - centralized
- each has more or less its own structure based on standards
- the database manager merges data
- high requirement of manual work
- The oldest published: Avena (Seidewitz, Braunschweig, 1985), Barley (H. Knüpffer, Gatersleben, 1987)
- at present 55 CCDBs
6Documentation system levels 3
- NI and CCDBs share a lot of information; part of the data is unique
- How to merge variable data sources and different layers into one?
- EURISCO - high level of standardization
7Documentation standards 1
- Descriptor Lists - FAO/IBPGR/IPGRI standards for PGRFA collections
- Descriptor lists - tools for ordering and evaluation of collections (thirties, forties)
- Passport inventory data - universal
- characterization/evaluation - crop specific
- storage/regeneration data - storage specific
- Unique identifier - accession number
8Documentation standards 2
- FAO/IBPGR standards - computerization
- The IBPGR 1977-1982 first published descriptor lists for PGRFA collections
- passport sets of descriptors slightly different for different crops
- 1985 descriptors for main crops published, ca 25-27 descriptors in a set
- Unique identifier - accession number
9Documentation standards 3
- IPGRI/FAO MCPDL February 1997, after the meeting in Budapest (1996)
- Most of the data computerized
- 19 IPGRI descriptors, 6 FAO descriptors
- Information exchange across crops
- Unique identifier - accession number
10Documentation standards 4
- IPGRI/FAO MCPDL 2001 ver. 2 - revised (December 2001)
- 28 descriptors
- provides an explanation of content, coding scheme and suggested field names
- refinement of field content
- Unique identifier - accession number
11Documentation standards 5
- EURISCO descriptor list - 34 descriptors
- Followed the MCPDL ver. 2 - enlarged MCPDL ver. 2 by 6 descriptors; four are newly created fields for non-standard institutes (donor, breeder, collector, safety duplication institute)
- URL link to the original data source and the NI code
- Unique accession identifier
- NI code, Instcode, Genus, Accession number (EURISCO obligatory fields)
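The unique accession identifier built from the four obligatory fields can be illustrated with a minimal sketch. The field names NICODE, INSTCODE, GENUS and ACCENUMB and the sample values are assumptions made for illustration; the slide itself only names the four components.

```python
from typing import NamedTuple


class AccessionKey(NamedTuple):
    """Composite identifier made of the four obligatory fields."""
    ni_code: str           # National Inventory code
    instcode: str          # code of the holding institute
    genus: str
    accession_number: str


# Assumed field names, in the order used by AccessionKey.
OBLIGATORY_FIELDS = ("NICODE", "INSTCODE", "GENUS", "ACCENUMB")


def make_key(record: dict) -> AccessionKey:
    """Build the composite key; reject records missing an obligatory field."""
    missing = [f for f in OBLIGATORY_FIELDS if not str(record.get(f, "")).strip()]
    if missing:
        raise ValueError(f"missing obligatory fields: {missing}")
    return AccessionKey(*(str(record[f]).strip() for f in OBLIGATORY_FIELDS))


# Hypothetical record, for illustration only.
sample = {"NICODE": "CZE", "INSTCODE": "CZE122", "GENUS": "Avena", "ACCENUMB": " 01C0100001 "}
key = make_key(sample)
```

Because the key is a tuple, it can be used directly as a dictionary key when records from several inventories are merged.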
12EURISCO New Descriptors 1
- NI code - data merging on higher than national level
- URL - qualitative step in data utilization via web: direct link to the original data source, characterization/evaluation data and link to the collection holder, where plant material could be requested
13EURISCO New Descriptors 2
- Institutions
- Temporarily satisfying solution of the List of Institutes - cooperation with FAO WIEWS, new INSTCODE, continuously updated list.
- EURISCO has an update mechanism similar to that of FAO WIEWS - one NFP is responsible for all updates per country, but the system is not fully automatic; updates are made on-line via the web application
14Development of passport descriptor standards 1
15Development of passport descriptor standards 2
16Development of passport descriptor standards 3
17Development of passport descriptor standards 4
- 7 descriptors without any change (accession number, donor, genus, collecting number, collecting institute, location of collection site, altitude)
- 9 descriptors disappeared from the list (collector name, collecting country, province/state, local/vernacular name, photograph, three FAO availability codes and acquisition type)
18Data conversion 1
- Data conversion
- Clear and definite requirements for data format and structure are necessary for an automatic update procedure
- Revision and update of the data and their higher level of standardization.
- Partly manual, time-consuming work, which should be done once with advantage for the future
- Development of conversion scripts for automatic conversion procedures, which will be reused for the following EURISCO updates
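A reusable conversion script of the kind mentioned above might look like the following minimal sketch. The local column names, the mapping and the target field names are hypothetical; a real script would follow the column layout of the national database and the EURISCO descriptor list.

```python
import csv
import io

# Hypothetical mapping from local column names to standard field names.
COLUMN_MAP = {
    "acc_no": "ACCENUMB",
    "institute": "INSTCODE",
    "genus_name": "GENUS",
    "country": "ORIGCTY",
}


def convert(source, target, ni_code):
    """Read comma-separated local data and write tab-delimited standard data.

    Returns the number of converted records.
    """
    reader = csv.DictReader(source)
    fieldnames = ["NICODE"] + list(COLUMN_MAP.values())
    writer = csv.DictWriter(target, fieldnames=fieldnames,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    count = 0
    for row in reader:
        out = {"NICODE": ni_code}
        for local_name, standard_name in COLUMN_MAP.items():
            # Strip stray whitespace so values compare cleanly later on.
            out[standard_name] = (row.get(local_name) or "").strip()
        writer.writerow(out)
        count += 1
    return count


# Illustrative run on in-memory data (values are made up).
src = io.StringIO("acc_no,institute,genus_name,country\n01C01,CZE122, Avena ,CZE\n")
dst = io.StringIO()
converted = convert(src, dst, "CZE")
```

Keeping the mapping in one table means the same script can be rerun unchanged for each later update.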
19Data conversion 2
- Tools for data checking
- Taxcheck - confirmation of accepted names and synonyms using the GRIN taxonomic system
- (genus, species, taxon - detection of synonyms and correction of misspellings in taxon names)
- http://pgrdoc.ipgri.cgiar.org/taxcheck/grin/
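The kind of check described here (accepted name, synonym, or likely misspelling) can be sketched as below. This is not the Taxcheck implementation; the tiny reference tables are illustrative stand-ins for the GRIN taxonomy, and the similarity cutoff is an arbitrary assumption.

```python
import difflib

# Illustrative reference data only; a real check would use the full GRIN tables.
ACCEPTED = {"Avena sativa", "Hordeum vulgare", "Fagopyrum esculentum"}
SYNONYMS = {"Polygonum fagopyrum": "Fagopyrum esculentum"}


def check_taxon(name):
    """Classify a taxon name and return (status, suggested accepted name)."""
    cleaned = " ".join(name.split())  # collapse repeated whitespace
    if cleaned in ACCEPTED:
        return ("accepted", cleaned)
    if cleaned in SYNONYMS:
        return ("synonym", SYNONYMS[cleaned])
    # Fuzzy match against accepted names to catch probable misspellings.
    close = difflib.get_close_matches(cleaned, sorted(ACCEPTED), n=1, cutoff=0.85)
    if close:
        return ("misspelling", close[0])
    return ("unknown", None)
```

A suggestion from the fuzzy match should still be confirmed by a person before the data are corrected.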
20Data conversion 3
- Tools for data checking
- Validator.xls, developed in 2002 by Theo van Hintum - tool for data validity check: comparison of data to standards and localization of non-standard values
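Comparing data to standards and locating non-standard values can be sketched as follows. This does not reproduce Validator.xls; the field names and the allowed-value sets are illustrative subsets chosen for the example, not the full standard tables.

```python
# Illustrative subsets of standard tables (assumptions, not the complete lists).
STANDARDS = {
    "ORIGCTY": {"CZE", "DEU", "NLD"},
    "SAMPSTAT": {"100", "200", "300", "400", "500", "999"},
}


def find_nonstandard(records):
    """Return (row number, field, value) for every value outside its standard table."""
    problems = []
    for row_no, record in enumerate(records, start=1):
        for field, allowed in STANDARDS.items():
            value = record.get(field, "")
            # Empty values are treated as missing, not as non-standard.
            if value and value not in allowed:
                problems.append((row_no, field, value))
    return problems


# Made-up records for illustration.
records = [
    {"ORIGCTY": "CZE", "SAMPSTAT": "300"},
    {"ORIGCTY": "ROM", "SAMPSTAT": "3"},
]
problems = find_nonstandard(records)
```

Reporting the row number together with the field and value lets the data provider go straight to the record that needs correction.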
21Data conversion 4
- Standard tables should be regularly updated (extended country code ISO 3166 - ROU or ROM?, FAO WIEWS Instcode)
- Tab-delimited text - problems with additional columns after conversion from Excel
- Impact - full standardization and data quality improvement, data applicability for wide exchange
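The problem of additional columns in tab-delimited text exported from Excel can be detected before upload with a small check like this sketch. It assumes the first line is a header that defines the expected number of columns.

```python
import csv
import io


def find_column_mismatches(text):
    """Return (line number, column count) for rows whose width differs from the header."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    if not rows:
        return []
    expected = len(rows[0])
    return [
        (line_no, len(row))
        for line_no, row in enumerate(rows, start=1)
        if len(row) != expected
    ]


# The third line carries two trailing empty columns, as a spreadsheet export may add.
sample = "ACCENUMB\tGENUS\nA1\tAvena\nA2\tHordeum\t\t\n"
mismatches = find_column_mismatches(sample)
```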
22Data delivery 1
- EURISCO import and upload mechanism is well developed
- An automatic data import to EURISCO requires strict standardization
- Size limit 8 MB (GZIP)
- Import takes a long time; a loading meter would be needed
- Difficulties with the local network setting - limitation by firewall, proxy server
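A file can be compressed and checked against the size limit before an upload is attempted. The sketch below assumes that the 8 MB limit applies to the gzip-compressed file and is counted in binary megabytes; neither detail is stated on the slide.

```python
import gzip

# Assumption: the limit applies to the compressed file, 8 * 1024 * 1024 bytes.
SIZE_LIMIT = 8 * 1024 * 1024


def gzip_within_limit(data: bytes, limit: int = SIZE_LIMIT):
    """Compress data with gzip and report whether the result fits the limit."""
    compressed = gzip.compress(data)
    return compressed, len(compressed) <= limit


# Made-up payload for illustration.
payload = b"ACCENUMB\tGENUS\n" * 1000
compressed, fits = gzip_within_limit(payload)
```

If the result does not fit, the export could be split into several files, for example by institute or genus.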
23Data delivery 2
- Optional setting of the error level
- Visualization of wrong fields/records on upload helps to improve data quality - data integrity check
- Reports of wrong records should include their accession numbers
- EURISCO geographic coordinates checker
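A coordinate check can be sketched as below. It is not the EURISCO checker; it assumes coordinates are stored as text in the degrees-minutes-seconds-hemisphere form used by the MCPDL (for example 103020S or 0762510W, with hyphens for missing minutes or seconds).

```python
import re

# Assumed formats: DDMMSS + N/S for latitude, DDDMMSS + E/W for longitude.
LAT_RE = re.compile(r"(\d{2})(\d{2}|--)(\d{2}|--)([NS])")
LON_RE = re.compile(r"(\d{3})(\d{2}|--)(\d{2}|--)([EW])")


def _is_valid(value, pattern, max_degrees):
    match = pattern.fullmatch(value)
    if not match:
        return False
    degrees, minutes, seconds, _hemisphere = match.groups()
    if minutes == "--" and seconds != "--":
        return False  # seconds given without minutes
    mins = 0 if minutes == "--" else int(minutes)
    secs = 0 if seconds == "--" else int(seconds)
    if mins >= 60 or secs >= 60:
        return False
    # Reject positions beyond the pole or the antimeridian.
    return int(degrees) + mins / 60 + secs / 3600 <= max_degrees


def is_valid_latitude(value):
    return _is_valid(value, LAT_RE, 90)


def is_valid_longitude(value):
    return _is_valid(value, LON_RE, 180)
```

A format check of this kind cannot tell whether a point lies in the stated country of origin; that would need a separate geographic lookup.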
24Data delivery 3
- Impact
- Tool and motivation for data quality improvement
- Speeding up development of new standardized PGR documentation systems
- Easy data exchange - once converted, data could be used in many CCDBs
- Motivation of NFP to continuous update
- Web presentation of NI, passport part of CCDBs on the same level of update
25Secondary user 1
- CCDB manager
- This EURISCO part is still in development and at present is not very user-friendly. Similarity of application to SINGER.
- SINGER is adjusted to the character of CGIAR centres, but its content offers more detailed information - higher number of passport fields, higher number of records
- downloadability required
- Taxonomy not standardized
26Secondary user 2
- CCDBs manager
- Appropriate accession inventory is a prerequisite for the next step - characterization/evaluation data, which should be the main task for CCDBs managers
- Difficult update of CCDBs - once collected, data are rarely updated - improvement of quality level
- EURISCO immediately offers the latest update
27Secondary user 3
- CCDBs manager
- Passport data download from EURISCO for purposes of CCDBs - solution of the problem of data update in CCDBs: the first data collection is relatively easy, but the update is difficult - rationalization of work of managers
- Update: Avena (1984), Hordeum (1983), Brassica - minimal update in EWDB (1997)
- Interaction of NI and database requests
28Secondary user 4
- CCDBs managers of newly developed databases can download the data directly from EURISCO and can devote their work to characterization and evaluation data
- (example: Fagopyrum CCDB)
- Profit of EURISCO
29Secondary user 5
- Common user - information and plant material seeker
- Quick orientation in all European PGRFA collections and direct link to the genebank/institution holding the seeds or plant material
30Summary
- At the beginning many skeptical visions concerning merging data from variable sources: different structures, different software, different level and data quality, centralized or distributed model
- Quick orientation in all European PGRFA collections and link to the original record
- Rationalization of work with data - download from EURISCO