Title: Fedora Repository
1 Fedora Repository
2 Pre-indexing Program
The pre-indexing program populates the Search
database with sorting and path information from
the Fedora database, as well as collection
information from the Fedora objects.
[Diagram label: Fedora Objects]
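As an illustration only, the pre-indexing step might be sketched in Python as follows. The table layout, column names, sort rule, and sample records are all hypothetical (the slides do not describe the actual schema), and an in-memory SQLite database stands in for the Search database.

```python
import sqlite3

# Hypothetical records, as they might be read from the Fedora database and objects.
FEDORA_OBJECTS = [
    {"pid": "demo:1", "title": "The Zebra Papers",
     "path": "/objects/demo_1", "collections": ["coll1"]},
    {"pid": "demo:2", "title": "An Atlas of Maps",
     "path": "/objects/demo_2", "collections": ["coll1", "coll2"]},
]


def sort_key(title):
    """Build a simple sort string: lowercase, leading article dropped (assumed rule)."""
    words = title.lower().split()
    if words and words[0] in ("a", "an", "the"):
        words = words[1:]
    return " ".join(words)


def preindex(conn, objects):
    """Populate the search tables with sorting, path, and collection information."""
    conn.execute("CREATE TABLE IF NOT EXISTS search_objects "
                 "(pid TEXT PRIMARY KEY, sort_title TEXT, path TEXT)")
    conn.execute("CREATE TABLE IF NOT EXISTS object_collections "
                 "(pid TEXT, collection_id TEXT)")
    for obj in objects:
        conn.execute("INSERT OR REPLACE INTO search_objects VALUES (?, ?, ?)",
                     (obj["pid"], sort_key(obj["title"]), obj["path"]))
        # Refresh the collection memberships for this object.
        conn.execute("DELETE FROM object_collections WHERE pid = ?", (obj["pid"],))
        conn.executemany("INSERT INTO object_collections VALUES (?, ?)",
                         [(obj["pid"], c) for c in obj["collections"]])
    conn.commit()


def objects_in_collection(conn, collection_id):
    """Return the pids in one collection, ordered by their sort string."""
    rows = conn.execute(
        "SELECT s.pid FROM search_objects s "
        "JOIN object_collections c ON s.pid = c.pid "
        "WHERE c.collection_id = ? ORDER BY s.sort_title", (collection_id,))
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
preindex(conn, FEDORA_OBJECTS)
```

With the lookup tables in place, a later indexing step can ask for the objects of one collection, already in sort order, without touching the repository again.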
3 Create Indexes
The indexing program uses the Search database to
find objects for particular collections and then
combines descriptive metadata with XML full-text
datastreams to create search objects for
indexing with Amberfish.
[Diagram: Search Objects, Filter, Atomic Amberfish Indexes]
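A minimal sketch of how a search object could be assembled from the two sources, written in Python for illustration. The element names (searchObject, metadata, fullText), the function name, and the sample datastreams are assumptions, not the format actually used; the Filter step and the Amberfish indexing call are left out.

```python
import xml.etree.ElementTree as ET


def build_search_object(pid, descriptive_xml, fulltext_xml):
    """Merge descriptive metadata and full text into one XML search object."""
    root = ET.Element("searchObject", {"pid": pid})

    # Keep the descriptive metadata as nested fields.
    meta = ET.SubElement(root, "metadata")
    meta.append(ET.fromstring(descriptive_xml))

    # Flatten the full-text datastream to plain text and normalize whitespace.
    text = ET.SubElement(root, "fullText")
    pieces = ET.fromstring(fulltext_xml).itertext()
    text.text = " ".join(" ".join(pieces).split())

    return ET.tostring(root, encoding="unicode")


# Hypothetical datastreams for one object.
descriptive = "<record><title>Sample Title</title></record>"
fulltext = "<doc><p>First page.</p><p>Second   page.</p></doc>"
search_object = build_search_object("demo:1", descriptive, fulltext)
```

The result is a single XML document per object that mixes nested fields with free text, which is the kind of semi-structured input a field-aware indexer expects.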
4 Amberfish
- Amberfish is text retrieval software distributed
  as open source software under the terms of
  version 2 of the GNU General Public License
  (GPL).
- Automatic searching across multiple databases
  (allowing modular indexing). We refer to this as
  atomic indexes.
- Indexing/search of semi-structured text (i.e.
  both free text and multiply nested fields)
- Boolean queries, right truncation, phrase
  searching, relevance ranking, support for
  multiple documents per file, incremental
  indexing.
- Read more: http://www.etymon.com/tr.html
5 Searching
[Diagram: User Search Interface (Coll1, Coll2, Coll3); Ambersearch.php; Coll1 Index, Coll2 Index, Coll3 Index; Sort Results]
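The search flow on this slide (query each selected collection index, then sort the combined results) can be sketched in Python as below. This is not Ambersearch.php and does not call Amberfish: the in-memory dictionaries stand in for the per-collection indexes, and the scoring rule (a simple term count) is an assumption made only to have something to sort on.

```python
# Hypothetical in-memory stand-ins for the per-collection (atomic) indexes.
INDEXES = {
    "Coll1": {"c1-doc1": "new jersey history maps", "c1-doc2": "rutgers history"},
    "Coll2": {"c2-doc1": "history of new jersey history"},
    "Coll3": {"c3-doc1": "botany field notes"},
}


def search_index(index, term):
    """Return (doc_id, score) pairs; score is the number of occurrences of term."""
    hits = []
    for doc_id, text in index.items():
        score = text.split().count(term)
        if score:
            hits.append((doc_id, score))
    return hits


def search_collections(selected, term):
    """Query each selected index separately, then merge and sort the results."""
    merged = []
    for name in selected:
        merged.extend(search_index(INDEXES[name], term))
    # Highest score first; ties broken by document id for a stable order.
    return sorted(merged, key=lambda hit: (-hit[1], hit[0]))


results = search_collections(["Coll1", "Coll2"], "history")
```

Because each collection has its own index, adding or re-indexing one collection does not disturb the others; only the list of indexes passed to the search changes.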
6 Collection Hierarchy
A MySQL database is used to create and display
parent/child collection relationships. The
database is a compact relational model. A class
was written to build collection hierarchies in
the search interface and to create structure maps
of a collection when needed.
Collection Hierarchy Search Interface
A start point (collection id) and a maximum depth
are defined in a function call. The collection
tree is then built in the search interface.
Structure Map (SMAP) Generation
When a collection object's structure/hierarchy is
changed, a structure map (XML) can be generated
and stored in the collection object in the
repository. If a collection's hierarchy ever
needs to be rebuilt, its lineage has therefore
been preserved in the repository. To create the
SMAP, only a start point (collection id) needs to
be defined. A function then probes the database
to determine the collection's maximum depth. Once
this is known, an appropriate SMAP is generated
and appended to the object.
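The two operations described above can be sketched in Python as follows. The relation rows, function names, and the XML element names (structMap, div, collectionId) are hypothetical; the actual PHP class and SMAP format are not shown in these slides. The sketch also assumes the parent/child relations contain no cycles.

```python
import xml.etree.ElementTree as ET

# Hypothetical parent/child rows, as they might come from the MySQL table.
RELATIONS = [(1, 2), (1, 3), (2, 4), (4, 5)]  # (parent_id, child_id)


def children_of(coll_id):
    """Return the direct children of a collection."""
    return [child for parent, child in RELATIONS if parent == coll_id]


def build_tree(start_id, max_depth):
    """Build a nested dict from start_id, descending at most max_depth levels."""
    node = {"id": start_id, "children": []}
    if max_depth > 0:
        node["children"] = [build_tree(c, max_depth - 1)
                            for c in children_of(start_id)]
    return node


def find_max_depth(start_id):
    """Probe the relations to find how many levels lie below start_id."""
    kids = children_of(start_id)
    return 0 if not kids else 1 + max(find_max_depth(c) for c in kids)


def build_smap(start_id):
    """Generate a structure map (XML) covering the full depth of the collection."""
    def to_xml(node):
        div = ET.Element("div", {"collectionId": str(node["id"])})
        for child in node["children"]:
            div.append(to_xml(child))
        return div

    smap = ET.Element("structMap")
    smap.append(to_xml(build_tree(start_id, find_max_depth(start_id))))
    return ET.tostring(smap, encoding="unicode")


tree = build_tree(1, 1)   # search interface: start point plus explicit max depth
smap_xml = build_smap(1)  # SMAP: start point only, depth is discovered
```

The search interface passes an explicit depth so that only part of a large hierarchy is drawn, while SMAP generation discovers the depth itself so that the stored map always records the whole lineage.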
7 Collection Hierarchy Methods
A PHP class of methods is used by the WMS,
dlr/EDIT and search interfaces to add, update,
delete and display collection hierarchies.
List of methods
- AddChild
- AddNewCollection
- AddSearchTerm
- AreRelated
- BuildCollTree
- BuildQueryCollHierarchy
- ChangeChild
- ChangeParent
- DeleteChild
- DeleteCollection
- DeleteDeadChildren
- DeleteOrphans
- DeleteSearchTerm
- DisplayCollectionSearchTerms
- GetCollectionInfo
- GetCollectionSearchTerms
- GetCollectionStructureInfo
- GetCollidMySQL
- GetCollidMySQLByFedoraID
- GetCollidWMS
- GetCollidWMSByFedoraID
- GetDeadChildren
- GetDirectChildren
- GetDirectParents
- GetOrphans
- GetSearchTermFields
- UpdateSearchTerm
8 Partner Portals
- Background
  - Provide the capability for partners, other
    institutions and individuals to attach the
    repository search engine, with selected
    collections, to their website.
  - Built off existing search code used on the NJDH
    and RUcore sites.
  - An extension offered to NJDH partners and RUcore
    participants.
  - Minimal system requirements.
  - Simple setup and maintenance for partners, who
    are assumed not to be technically oriented.
  - Ability to customize their collection list
    (subscribe).
- How it works
  - A username/password and a unique key are
    generated and assigned.
  - The partner can subscribe to collections of
    their choice.
  - The partner embeds a URI (IFrame) on their web
    site that allows for access.
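The "How it works" steps could look roughly like the Python sketch below. Everything specific here is an assumption for illustration: the endpoint URL, the key format, the record layout, and the idea that the embedded URI carries only the key (with the subscribed collections looked up on the server) are not taken from the actual portal code.

```python
import secrets
from urllib.parse import urlencode

# Hypothetical portal endpoint; the real URI is not given in the slides.
PORTAL_BASE = "https://repository.example.org/portal/search"


def register_partner(username):
    """Create a partner record with a generated password and unique key."""
    return {
        "username": username,
        "password": secrets.token_urlsafe(12),
        "key": secrets.token_hex(16),  # 32 hexadecimal characters
        "collections": [],
    }


def subscribe(partner, collection_id):
    """Add a collection to the partner's list, ignoring duplicates."""
    if collection_id not in partner["collections"]:
        partner["collections"].append(collection_id)


def embed_snippet(partner):
    """Return IFrame markup pointing at the partner's search portal."""
    query = urlencode({"key": partner["key"]})
    return '<iframe src="{}?{}" width="100%" height="600"></iframe>'.format(
        PORTAL_BASE, query)


partner = register_partner("demo_partner")
subscribe(partner, "coll1")
subscribe(partner, "coll1")  # second call has no effect
snippet = embed_snippet(partner)
```

Keeping the collection list on the server side means a partner can change subscriptions later without editing the markup already pasted into their site.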
9 Contact
Chad Mills cmmills_at_rci.rutgers.edu
Jeffery Triggs triggs_at_rutgers.edu