Cocoa: Components for Constructing Open Archives

1
Cocoa: Components for Constructing Open Archives
  • Joe Futrelle
  • National Center for Supercomputing Applications

2
Rationale
  • Easy on-ramp for Open Archive providers of
      • Data archives
      • Services: harvesting, mirroring, adding value
  • Turnkey or "in-a-box" packaging of OAI
    functionality
  • Integration of Open Archives data into existing
    applications for
      • Analysis
      • Data mining

3
Overview
  • Java implementation of Open Archives protocol
  • Server-side access to archives
  • Client-side harvesting, analysis
  • Plug-in architecture
  • Server-side servlet implementation with APIs for
      • Constructing records
      • Responding to OAI verbs
  • Client-side API for processing harvested records
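
The plug-in idea on the server side can be sketched as one handler per OAI verb, looked up by a small dispatcher. This is only an illustration: `VerbHandler`, `VerbDispatcher`, and the response strings are hypothetical names made up for this sketch, not Cocoa's actual API, and the error body is simplified rather than protocol-conformant.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical plug-in point: one handler per OAI verb (not Cocoa's real API). */
interface VerbHandler {
    /** Returns the XML body of the response for the given request arguments. */
    String handle(Map<String, String> args);
}

/** Minimal dispatcher sketch: routes a request to the handler for its "verb" argument. */
final class VerbDispatcher {
    private final Map<String, VerbHandler> handlers = new LinkedHashMap<>();

    /** Registers a handler plug-in under a verb name such as "Identify". */
    void register(String verb, VerbHandler handler) {
        handlers.put(verb, handler);
    }

    /** Looks up the handler for args.get("verb") and delegates to it. */
    String dispatch(Map<String, String> args) {
        String verb = args.get("verb");
        VerbHandler handler = (verb == null) ? null : handlers.get(verb);
        if (handler == null) {
            // Simplified placeholder; a real server would use the protocol's error format.
            return "<error>unsupported or missing verb</error>";
        }
        return handler.handle(args);
    }

    public static void main(String[] unused) {
        VerbDispatcher dispatcher = new VerbDispatcher();
        // Illustrative handler with a made-up repository name.
        dispatcher.register("Identify",
                a -> "<Identify><repositoryName>Demo</repositoryName></Identify>");
        System.out.println(dispatcher.dispatch(Map.of("verb", "Identify")));
        System.out.println(dispatcher.dispatch(Map.of("verb", "NoSuchVerb")));
    }
}
```

In a servlet setting, the `args` map would presumably be filled from the HTTP request parameters before calling `dispatch`.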

4
Components: Server side
  • Open Archive in a Box
  • Tool for making RDBMS accessible through Open
    Archives
  • No programming required: tool is configured using
    XML configuration file
  • Configuration wizard in the works
  • Minimal assumptions about database schema
  • Requires modification-date information
  • Requires at least one unique key column
  • Allows for JOINs
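
The two schema requirements above (a modification date and a unique key) are what a date-range harvest needs from a table. A minimal sketch of how such a query might be assembled follows. The class `TableConfig`, its fields, and the example table and column names are assumptions for illustration only and do not reflect OAIB's real configuration format; JOINs are left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical description of one table; not OAIB's actual configuration schema. */
final class TableConfig {
    final String table;
    final String keyColumn;       // unique key column, used to identify each record
    final String modifiedColumn;  // modification-date column, used for date-range harvests

    TableConfig(String table, String keyColumn, String modifiedColumn) {
        this.table = table;
        this.keyColumn = keyColumn;
        this.modifiedColumn = modifiedColumn;
    }

    /**
     * Builds a parameterized SELECT for a harvest. The flags say which date bounds
     * are present; the bound values themselves would be supplied later through
     * PreparedStatement placeholders. Table and column names are concatenated
     * directly, so they must come from trusted configuration, never from a request.
     */
    String harvestQuery(boolean hasFrom, boolean hasUntil) {
        StringBuilder sql = new StringBuilder("SELECT * FROM ").append(table);
        List<String> conditions = new ArrayList<>();
        if (hasFrom) {
            conditions.add(modifiedColumn + " >= ?");
        }
        if (hasUntil) {
            conditions.add(modifiedColumn + " <= ?");
        }
        if (!conditions.isEmpty()) {
            sql.append(" WHERE ").append(String.join(" AND ", conditions));
        }
        // Ordering by the unique key keeps the result order stable between requests.
        sql.append(" ORDER BY ").append(keyColumn);
        return sql.toString();
    }

    public static void main(String[] unused) {
        TableConfig cfg = new TableConfig("papers", "paper_id", "last_modified");
        System.out.println(cfg.harvestQuery(true, false));
    }
}
```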

5
Components: Client side
  • Basic harvester
  • Exercises harvesting API to process records from
    OAI servers
  • Experiments underway to link with Lucene
    open-source full-text indexer
  • Does not yet provide retrieval service
  • D2K module
  • NCSA's Data to Knowledge rapid application
    development (RAD) framework
  • Allows OAI records to be processed through a
    variety of machine learning algorithms
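
On the client side, a harvester has to issue requests to an OAI server before it can process any records. The sketch below shows one way the request URL for a record listing might be built. It assumes OAI-PMH-style request parameters (`verb`, `metadataPrefix`, `resumptionToken`); the class `HarvestRequest` and the example base URL are hypothetical and not part of the Cocoa harvesting API.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for building harvest request URLs (not Cocoa's real API). */
final class HarvestRequest {
    private HarvestRequest() {
    }

    /**
     * Builds a ListRecords request URL. When a resumption token from a previous
     * partial response is given, it is sent instead of the metadata prefix;
     * otherwise the request starts a new listing.
     */
    static String listRecordsUrl(String baseUrl, String metadataPrefix, String resumptionToken) {
        StringBuilder url = new StringBuilder(baseUrl).append("?verb=ListRecords");
        if (resumptionToken != null) {
            url.append("&resumptionToken=").append(encode(resumptionToken));
        } else {
            url.append("&metadataPrefix=").append(encode(metadataPrefix));
        }
        return url.toString();
    }

    /** Percent-encodes a parameter value (URLEncoder with a Charset needs Java 10+). */
    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] unused) {
        // example.org is a placeholder base URL, not a real repository.
        System.out.println(listRecordsUrl("http://example.org/oai", "oai_dc", null));
        System.out.println(listRecordsUrl("http://example.org/oai", "oai_dc", "tok/42"));
    }
}
```

A harvester loop would presumably call this repeatedly, passing each response's resumption token into the next request until none is returned.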

6
Applications: Turning an SQL DB into an OA
[Diagram; labels: DB, XML config., OAI-in-a-box (OAIB), OAI clients]

7
Prototype OAIB Configuration Wizard

8
Applications: Mirroring an OA
[Diagram; labels: OA, Basic harvester, specialized code, DB, OAIB, OAI clients]

9
Applications: D2K

10
Status
  • Protocol implementation 100% complete
  • Passes Repository Explorer automated tests
  • OAIB 90% complete
  • Ironing out minor bugs
  • Configuration wizard not complete
  • Basic harvester and harvester API mostly finished
  • Not yet linked to indexer (Lucene)
  • Mirroring application not begun
  • D2K modules complete, but could use others
  • Caveat: alpha-quality code!

11
Links
  • For information about Cocoa and pre-release
    downloads: http://emerge.ncsa.uiuc.edu/
  • For information about D2K and the Automated
    Learning Group: http://www.ncsa.uiuc.edu/TechFocus/Projects/