
Title: Brief Notes from Kew


1
Brief Notes from Kew
  • Mark Jackson, Software Applications Manager

2
Focussing on...
  • Herbarium digitisation
  • electronic Plant Information Centre

3
Kew Herbarium
  • Guesstimated
  • 7 million specimens
  • 250,000 types
  • Less than 5% of specimens databased
  • A variety of personal databases

4
Preparation for Digitisation
  • Computerise transactions
  • Agree and document policy and procedures
  • Establish core fields (HISPID pending ABCD)
  • Develop hardware and software infrastructure
    (e.g. catalogue database, mass storage)

5
Digitisation Strategy
  • Curators to barcode, database and image types for
    loan
  • Repatriation research projects
      • to use infrastructure and core fields
      • data to be imported into Catalogue (eventually)
  • Pursue digitisation projects

www.kew.org/data/repatbr
6
Specimen imaging
  • Decision to try to match Cibachrome prints in
    terms of quality (e.g. suitable for many
    diagnostic purposes)
  • 600 dpi delivers 200MB images
  • Stored as uncompressed (but bzipped) TIFFs
  • Acquisition of mass storage
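
As a rough check on the 200MB figure, the sketch below estimates the uncompressed size of a 600 dpi scan. It assumes an A3 scanning area (297 x 420 mm, in line with the A3 flatbed described for HerbScan) and 24-bit RGB colour; neither the bit depth nor the exact scan area is stated in the slides. It also multiplies the result by the 250,000 types guesstimated earlier, and checks that bzip2 returns the original bytes exactly, which is what allows a bzipped TIFF to remain an uncompressed master once unpacked.

```python
import bz2

MM_PER_INCH = 25.4

def scan_size_bytes(width_mm, height_mm, dpi, bytes_per_pixel=3):
    """Uncompressed pixel-data size of a scan (file header overhead ignored)."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px * height_px * bytes_per_pixel

# Assumption: A3 scan area at 24-bit RGB (not stated in the slides).
a3_bytes = scan_size_bytes(297, 420, 600)
print(f"{a3_bytes / 1e6:.0f} MB per master image")  # about 209 MB

# Storage needed if every one of the 250,000 types got one master image.
types_total_tb = 250_000 * a3_bytes / 1e12
print(f"{types_total_tb:.0f} TB for all types before bzip2")

# bzip2 is lossless: decompressing gives back the identical bytes.
sample = bytes(range(256)) * 1000
roundtrip_ok = bz2.decompress(bz2.compress(sample)) == sample
print("bzip2 round trip identical:", roundtrip_ok)
```

How much bzip2 actually saves depends on the image content, so the total above is an upper bound under these assumptions rather than a storage forecast.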

7
HerbScan
  • A3 flatbed scanner, inverted
  • Cradle for specimens
  • Distributed throughout Herbarium

8
Pros and cons
200 MB master images (600 dpi scans), based on
capturing the level of detail of Cibachromes.
Camera
  • 30-40,000
  • 200MB images barely achievable
  • 1 image per minute
  • Fixed
  • Versatile

HerbScan
  • 7,500
  • 200MB images easily achievable
  • 10 images per hour
  • Some mobility
  • Suited to flat items
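
The throughput figures above can be turned into a rough comparison. This sketch assumes that the first figure for each device is a unit cost (the transcript gives neither a label nor a currency), takes 35,000 as the midpoint of the camera range, and uses the 250,000 types guesstimated earlier as the workload. All names are illustrative.

```python
def hours_to_image(n_specimens, images_per_hour):
    """Machine hours needed to image n_specimens at a steady rate."""
    return n_specimens / images_per_hour

def units_for_budget(unit_cost, images_per_hour, budget):
    """Whole units a budget buys, and their combined images per hour."""
    units = budget // unit_cost
    return units, units * images_per_hour

# Assumed reading of the slide: cost (unlabelled) and rate per device.
CAMERA = {"cost": 35_000, "rate": 60}    # 1 image per minute
HERBSCAN = {"cost": 7_500, "rate": 10}   # 10 images per hour
TYPES = 250_000

camera_hours = hours_to_image(TYPES, CAMERA["rate"])
herbscan_hours = hours_to_image(TYPES, HERBSCAN["rate"])
print(f"Camera:   {camera_hours:,.0f} hours")
print(f"HerbScan: {herbscan_hours:,.0f} hours on a single unit")

# How many HerbScans would one camera's (assumed) cost buy?
units, combined_rate = units_for_budget(
    HERBSCAN["cost"], HERBSCAN["rate"], CAMERA["cost"]
)
print(f"{units} HerbScans give {combined_rate} images per hour combined")
```

On these assumptions the camera remains faster per unit of spend, so the comparison turns on the other listed properties: whether 200MB images are achievable, mobility, and the kind of item being imaged.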

9
[Diagram: Client; HerbCat enquiries to HerbCat (Metadata); image enquiries to Image Server (Images)]
10
Focussing on...
  • Herbarium digitisation
  • electronic Plant Information Centre

11
  • UK government funding for delivery of services
    electronically
  • Resource-discovery interface to multiple Kew data
    sources (not necessarily at Kew)
  • Data sources are heterogeneous
  • Simple interface overlaying other systems

[Diagram: ePIC Interface overlaying four data sources]
15
Architecture
  • Interface (Java servlet)/JSPs
  • Requests / Results
  • Multi-threaded Java server
  • Request queue
  • Handlers: one per data source, one for logging, one for
    spell-checking
  • Data sources
  • Configuration files (XML)
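
The pattern on this slide (a request queue feeding worker threads, one handler per data source plus a logging step, all driven by an XML configuration) can be sketched briefly. The sketch is in Python for brevity, not the Java used at Kew, and everything in it is an assumption made for illustration: the XML layout, the source names, the in-memory records and the function names. The spell-checking handler is left out.

```python
import queue
import threading
import xml.etree.ElementTree as ET

# Hypothetical configuration; the real XML schema is not shown in the slides.
CONFIG_XML = """<epic>
  <source name="herbcat"/>
  <source name="texts"/>
</epic>"""

# Toy in-memory stand-ins for the real data sources.
RECORDS = {
    "herbcat": ["Acacia karroo", "Quercus robur"],
    "texts": ["Flora Zambesiaca: Acacia"],
}

def make_source_handler(name):
    """Return a handler that searches one data source."""
    def handle(query):
        needle = query.lower()
        return [r for r in RECORDS.get(name, []) if needle in r.lower()]
    return handle

def load_handlers(config_xml):
    """Build one handler per <source> element in the configuration."""
    root = ET.fromstring(config_xml)
    return {s.get("name"): make_source_handler(s.get("name"))
            for s in root.findall("source")}

def worker(requests, handlers, log, log_lock):
    """Take requests off the shared queue until a None sentinel arrives."""
    while True:
        item = requests.get()
        if item is None:
            break
        query, reply = item
        results = {name: handle(query) for name, handle in handlers.items()}
        with log_lock:  # logging step: record every query
            log.append(query)
        reply.put(results)

def run_queries(queries, n_workers=2):
    handlers = load_handlers(CONFIG_XML)
    requests, log, log_lock = queue.Queue(), [], threading.Lock()
    threads = [threading.Thread(target=worker,
                                args=(requests, handlers, log, log_lock))
               for _ in range(n_workers)]
    for t in threads:
        t.start()
    replies = []
    for q in queries:
        reply = queue.Queue(maxsize=1)  # one-slot mailbox per request
        requests.put((q, reply))
        replies.append(reply)
    answers = [r.get() for r in replies]  # results, in request order
    for _ in threads:
        requests.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return answers, log

answers, log = run_queries(["acacia", "quercus"])
print(answers[0])
```

Keeping the source list in configuration means a new data source needs a new handler and a config entry, without changes to the queue or the interface.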
16
Texts
  • Web documents indexed using Lucene
  • Flora Zambesiaca digitised and marked-up with XML
  • Experimentation with options for query and output
    via Java servlet
      • using XSL to output selections
      • using Lucene to index the XML
      • importing the XML into a database
  • Other texts - jury still out, but Lucene route
    looks promising
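
To illustrate what indexing marked-up text involves, here is a toy inverted index over a small XML fragment. It is a stand-in for the idea only: it does not use Lucene (a Java library) or its API, and the element names and sample entries are invented for the example, not taken from the Flora Zambesiaca mark-up.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical mark-up; the real Flora Zambesiaca schema is not shown.
FLORA_XML = """<flora>
  <taxon id="t1">
    <name>Acacia karroo</name>
    <description>Tree with paired white spines.</description>
  </taxon>
  <taxon id="t2">
    <name>Acacia nilotica</name>
    <description>Tree with necklace-like pods.</description>
  </taxon>
</flora>"""

def tokens(text):
    """Lower-case alphabetic tokens; a real analyser would do far more."""
    return re.findall(r"[a-z]+", text.lower())

def build_index(xml_text):
    """Map each token to the ids of the <taxon> elements containing it."""
    index = defaultdict(set)
    root = ET.fromstring(xml_text)
    for taxon in root.iter("taxon"):
        text = " ".join(taxon.itertext())
        for token in tokens(text):
            index[token].add(taxon.get("id"))
    return index

def search(index, query):
    """Return ids of taxa containing every query term (AND search)."""
    terms = tokens(query)
    if not terms:
        return []
    hits = set.intersection(*(index.get(t, set()) for t in terms))
    return sorted(hits)

index = build_index(FLORA_XML)
print(search(index, "acacia"))         # ['t1', 't2']
print(search(index, "acacia spines"))  # ['t1']
print(search(index, "quercus"))        # []
```

Because each hit is an element id, a matching entry can be pulled back out of the XML and styled for display, which is roughly the combination of indexing and XSL output the slide describes experimenting with.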

17
Feedback
  • Email mechanisms
  • Web usability testing/focus groups
  • Logging
  • Quantitative success
      • levels of usage, patterns, trends
      • beware crawlers, testing, development staff,
        harvesters
      • referring URLs, Google link popularity of site
      • country, domain
  • Qualitative success
      • success of queries, esp. zero hits (spelling,
        common names, families)
      • performance, system monitoring
      • number of queries per session, return visits
      • results pages viewed
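
A few of the measures listed above can be computed from a query log along the following lines. The log record layout, the robot markers, the host names and the name list are all assumptions made for the example; the slides do not describe the actual log format. Spelling suggestions use Python's standard `difflib`, which is a stand-in and not the spell-checker used in ePIC.

```python
import difflib
from collections import Counter

# Assumed substrings that mark automated clients in a user-agent string.
ROBOT_MARKERS = ("bot", "crawler", "spider", "harvest")

def is_robot(agent):
    agent = agent.lower()
    return any(marker in agent for marker in ROBOT_MARKERS)

def summarise(log, staff_hosts=(), known_names=()):
    """Summarise real-user queries, ignoring robots and staff traffic."""
    visits = [r for r in log
              if not is_robot(r["agent"]) and r["host"] not in staff_hosts]
    sessions = Counter(r["session"] for r in visits)
    zero_hits = [r["query"] for r in visits if r["hits"] == 0]
    # For each failed query, offer the closest known name, if any.
    suggestions = {
        q: difflib.get_close_matches(q.lower(), known_names, n=1)
        for q in zero_hits
    }
    return {
        "queries": len(visits),
        "sessions": len(sessions),
        "queries_per_session": len(visits) / len(sessions) if sessions else 0.0,
        "zero_hits": zero_hits,
        "suggestions": suggestions,
    }

# Hypothetical log records.
LOG = [
    {"session": "s1", "host": "example.org", "agent": "Mozilla/5.0",
     "query": "acasia", "hits": 0},
    {"session": "s1", "host": "example.org", "agent": "Mozilla/5.0",
     "query": "acacia", "hits": 12},
    {"session": "s2", "host": "example.net", "agent": "Googlebot/2.1",
     "query": "quercus", "hits": 3},
    {"session": "s3", "host": "staff.example", "agent": "Mozilla/5.0",
     "query": "test", "hits": 0},
]

report = summarise(LOG, staff_hosts={"staff.example"},
                   known_names=["acacia", "quercus"])
print(report)
```

Filtering before counting matters here: in this sample, half of the raw records come from a crawler or from staff, and would otherwise inflate both usage and the zero-hit rate.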

18
World distribution of queries
19
Future
  • www.kew.org/epic
  • More data sources, including texts and images
  • Hierarchical browsing front-end based around
    revamped Brummitt Families and Genera with
    phylogenetic classification
  • Looking forward to
      • using the GBIF Names Service
      • links with DiGIR/BioCASE resources...