Fedora Repository

Slides: 11
Provided by: rul

1
Fedora Repository
2
Pre-indexing Program
The pre-indexing program populates the Search
database with sorting and path information from
the Fedora database as well as collection
information from the Objects.
Fedora Objects
3
Create Indexes
The indexing program uses the search database to
find Objects for particular collections and then
combines descriptive metadata with XML full-text
datastreams to create search objects for
indexing with Amberfish.
Search Objects
Filter
Atomic Amberfish Indexes
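As a rough illustration of the step described on this slide, the sketch below merges an object's descriptive metadata with its XML full-text datastream into a single search object. The element names (`searchObject`, `metadata`, `fulltext`), the function `build_search_object`, and the sample data are all hypothetical; the presentation does not give the actual search-object format.

```python
import xml.etree.ElementTree as ET


def build_search_object(object_id, metadata, fulltext_xml):
    """Combine descriptive metadata and full text into one XML document.

    All element and attribute names here are illustrative assumptions.
    """
    root = ET.Element("searchObject", {"id": object_id})

    # Descriptive metadata becomes one nested field per key.
    meta_el = ET.SubElement(root, "metadata")
    for name, value in metadata.items():
        ET.SubElement(meta_el, name).text = value

    # The full-text datastream is parsed and nested as-is, so its own
    # structure stays available for fielded searching.
    text_el = ET.SubElement(root, "fulltext")
    text_el.append(ET.fromstring(fulltext_xml))

    return ET.tostring(root, encoding="unicode")


doc = build_search_object(
    "example:1",
    {"title": "Sample Thesis", "creator": "A. Student"},
    "<body><p>Chapter one text.</p></body>",
)
print(doc)
```

Keeping the metadata and the full text as separate nested fields matches the semi-structured indexing described on the Amberfish slide, where both free text and nested fields are searchable.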
4
Amberfish
  • Amberfish is text retrieval software distributed
    as open source software under the terms of
    version 2 of the GNU General Public License
    (GPL).
  • Automatic searching across multiple databases
    (allowing modular indexing). We refer to this as
    atomic indexes.
  • Indexing/search of semi-structured text (i.e.
    both free text and multiply nested fields)
  • Boolean queries, right truncation, phrase
    searching, relevance ranking, support for
    multiple documents per file, incremental
    indexing.
  • Read more - http://www.etymon.com/tr.html

5
Searching
User Search Interface: Coll1, Coll2, Coll3
Ambersearch.php


Coll1 Index
Coll2 Index
Coll3 Index
Sort Results
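The flow on this slide (query each selected collection index, then sort the combined results) can be sketched as follows. This is a toy stand-in, not Amberfish or Ambersearch.php: the dictionary-based indexes, the function names, and the scores are assumptions made for illustration only.

```python
def search_index(index, term):
    """Return (doc_id, score) hits for a term from one collection index."""
    return list(index.get(term.lower(), {}).items())


def search_collections(indexes, selected, term):
    """Query each selected collection index and merge into one sorted list."""
    hits = []
    for coll in selected:
        for doc_id, score in search_index(indexes[coll], term):
            hits.append({"collection": coll, "doc": doc_id, "score": score})
    # Sort the merged results: highest score first, then by document id.
    hits.sort(key=lambda h: (-h["score"], h["doc"]))
    return hits


# Toy stand-ins for per-collection indexes (term -> {doc_id: score}).
indexes = {
    "Coll1": {"fedora": {"c1:doc1": 0.9, "c1:doc2": 0.4}},
    "Coll2": {"fedora": {"c2:doc7": 0.7}},
    "Coll3": {"amberfish": {"c3:doc3": 0.8}},
}

results = search_collections(indexes, ["Coll1", "Coll2"], "Fedora")
for hit in results:
    print(hit["collection"], hit["doc"], hit["score"])
```

Because each collection has its own index, a collection can be added to or dropped from a search simply by changing the `selected` list, which is the practical benefit of the atomic indexes described earlier.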
6
Collection Hierarchy
A MySQL database is used to create and display
parent/child collection relationships. The
database is a compact, straightforward relational
model. Functions were written to build
collection hierarchies in the search interface
and to create structure maps of a collection when
needed.
Collection Hierarchy Search Interface
A start point (collection id) and a maximum
depth are defined in a function call. The
collection tree is then built in the search
interface.
Structure Map (SMAP) Generation
When a collection object's structure/hierarchy is
changed, a structure map (XML) can be generated and
stored in the collection object in the
repository. In the event a collection's hierarchy
needs to be rebuilt, we have preserved the
collection's lineage in the repository. To create
the SMAP, only a start point (collection id) needs
to be defined. A function then probes the database
to determine the collection's maximum depth. Once
this is discovered, an appropriate SMAP is
generated and appended to the object.
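A minimal sketch of the two functions described on this slide, under several assumptions: an in-memory SQLite database stands in for MySQL, the `collections` table and its columns are invented, and the `structMap`/`div` element names are only loosely modeled on a structure map. None of these details come from the presentation.

```python
import sqlite3
import xml.etree.ElementTree as ET


def children(conn, coll_id):
    """Return (id, name) rows for the direct children of a collection."""
    rows = conn.execute(
        "SELECT id, name FROM collections WHERE parent_id = ? ORDER BY id",
        (coll_id,),
    )
    return rows.fetchall()


def build_tree(conn, start_id, max_depth):
    """Build a nested dict from start_id down to max_depth levels."""
    name = conn.execute(
        "SELECT name FROM collections WHERE id = ?", (start_id,)
    ).fetchone()[0]
    node = {"id": start_id, "name": name, "children": []}
    if max_depth > 0:
        for child_id, _ in children(conn, start_id):
            node["children"].append(build_tree(conn, child_id, max_depth - 1))
    return node


def probe_max_depth(conn, start_id):
    """Count the levels below start_id (0 if it has no children)."""
    kids = children(conn, start_id)
    if not kids:
        return 0
    return 1 + max(probe_max_depth(conn, cid) for cid, _ in kids)


def tree_to_smap(node):
    """Convert a tree node into a nested <div> element (hypothetical format)."""
    div = ET.Element("div", {"id": str(node["id"]), "label": node["name"]})
    for child in node["children"]:
        div.append(tree_to_smap(child))
    return div


# Sample data: Root -> (Maps, Photos -> Portraits).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE collections (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT)"
)
conn.executemany(
    "INSERT INTO collections VALUES (?, ?, ?)",
    [(1, None, "Root"), (2, 1, "Maps"), (3, 1, "Photos"), (4, 3, "Portraits")],
)

# SMAP generation needs only a start point; the depth is probed first.
depth = probe_max_depth(conn, 1)
tree = build_tree(conn, 1, depth)
smap = ET.Element("structMap")
smap.append(tree_to_smap(tree))
print(depth)
print(ET.tostring(smap, encoding="unicode"))
```

The search interface would call `build_tree` with an explicit depth to limit how much of the hierarchy is shown, while SMAP generation probes the full depth so that the stored map records the whole lineage.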
7
Partner Portals
  • Background
  • Allow partners, other institutions, and
    individuals to attach the repository search
    engine, with selected collections, to their
    websites.
  • Built on the existing search code used on the
    NJDH and RUcore sites.
  • An extension offered to NJDH partners and RUcore
    participants.
  • Minimal system requirements.
  • Simple setup and maintenance for partners, who
    are assumed not to be technically oriented.
  • Partners can customize their collection list by
    subscribing to collections.
  • How it works
  • Username/password and a unique key are generated
    and assigned.
  • The partner can subscribe to collections of
    their choice.
  • The partner embeds a URI, in an IFrame, on their
    website to provide access.
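
The setup steps above can be sketched as follows. Everything here is hypothetical: the function names, the `key` query parameter, the example URL, and the choice of Python's `secrets` module for generating credentials. The presentation does not describe how the key is generated or passed.

```python
import secrets
from html import escape


def create_partner(username):
    """Generate credentials and a unique key for a new partner."""
    return {
        "username": username,
        "password": secrets.token_urlsafe(12),
        "key": secrets.token_hex(16),  # 32 hex characters
        "collections": set(),
    }


def subscribe(partner, collection_id):
    """Add a collection to the partner's subscription list."""
    partner["collections"].add(collection_id)


def embed_snippet(partner, base_url):
    """Return IFrame markup; the 'key' query parameter is an assumption."""
    src = f"{base_url}?key={partner['key']}"
    return (
        f'<iframe src="{escape(src, quote=True)}" '
        'width="100%" height="600"></iframe>'
    )


partner = create_partner("example_partner")
subscribe(partner, "coll1")
subscribe(partner, "coll2")
snippet = embed_snippet(partner, "https://repository.example.edu/portals/example/")
print(snippet)
```

Since the partner only pastes one line of markup into a page, this approach fits the stated goal of simple setup for partners who are not technically oriented.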

8
Partner Portal URLs
Northwestern University: http://lefty.scc-net.rutgers.edu/portals/nwu/
Penn State University: http://lefty.scc-net.rutgers.edu/portals/psu/
Princeton University: http://lefty.scc-net.rutgers.edu/portals/princeton/
9
Electronic Theses and Dissertations
Example of a project using all the capabilities
outlined. ETD is a product developed at RUL,
based on the open source Open Journal
Systems. Theses and dissertations are uploaded, and
a minimal amount of metadata is supplied by the
student. Our graduate school uses the ETD
application to track and approve theses and
dissertations. Once a thesis or dissertation
is approved and all levels of sign-off are
completed, the metadata and document are packaged
up and delivered to the WMS for further
cataloging. This is done using the XML import
functionality built into the WMS. The paper is
then ingested into the repository and is
searchable from RUcore. The graduate school can
also create a portal on its website
to access these papers. Additional
exporting takes place from the repository to our
library catalog and to UMI ProQuest.
10
Contact
Chad: cmmills_at_rci.rutgers.edu
Jeffery: triggs_at_rutgers.edu