Repositories, Data and the RQF

1
Repositories, Data and the RQF
(or How to tackle some of the RQF with FezFedora)
Andrew Bennett
University of Queensland Library
2
A Brief Overview
  • Some architectural decisions: a view from 6000
    metres up and making systems play together
  • The Institutional Repository: a few things to
    know about UQ eSpace
  • Enabling UQ eSpace for RQF: where DID we get all
    that data from?
  • Bundling it all up: checking, de-duplication and
    verification of publication data
  • What do you mean it's all over? What else to do
    with some of the data until next time?
www.uq.edu.au
3
Some Early Decisions
We decided that we would NOT attempt to make the
repository do everything.
  • Library: would concentrate on developing and
    enhancing the functionality of the Institutional
    Repository, and work with the Office of the DVC
    Research to identify and ingest as much data as
    possible to save manual re-entry.
  • Information Technology Services: would develop
    the institutional Evidence Portfolio System (EPS)
    and systems to facilitate feedback from the
    academic community. Would also work on
    functionality to provide the institutional
    submission to the DEST IMS.
  • Office of the DVC Research: staff from here
    would take a coordinating role and work with both
    development teams to analyse and determine the
    required functionality and to recruit, check
    and, if necessary, create content.
4
Early information workflow
5
Early information workflow
Where we started

6
Core Systems and Data Sources

Light - Weight DSS
9
UQ eSpace Repository: Publications Data
Consolidation Project
UQ eSpace was to become the authoritative UQ
source of bibliographic data for the RQF. UQ eSpace
will also eventually replace functionality
currently provided by ResearchMaster as the
primary data source for the DEST HERDC
submission.
  • Why was eSpace population so important for the
    RQF?
  • Body of Work is one very significant area where
    RQF requirements were fairly predictable
  • RQF Submission of bibliographic data was
    critically dependent on the success of this
    project
  • RQF Review Panel access to Research Outputs was
    critically dependent on the success of this
    project
  • We already had a reasonable start on data from
    our practice runs and trial assessments

10
UQ eSpace Repository: Publications Data
Consolidation Project
11
Sources of Data
Data was ingested into the repository from
multiple sources.
  • Existing publications records: the repository
    contained records created as part of our 2005
    and 2006 trial runs of the internal UQ Research
    Assessment Exercise (RAE).
  • Thomson National Citation Report data set:
    bibliographic and citation performance data
    was purchased for Australian universities up to
    October 2006.
  • Publications records from our research
    management solution: approximately 55,000
    publication records were ingested from the
    University's research management system
    (ResearchMaster).
  • Data from academic CVs, citation analyses and
    EndNote libraries: additional records could be
    imported from other sources, including
    literature searches, citation reports, EndNote
    libraries and curricula vitae. This required
    specialised tools and filters to be built, which
    are now part of the release of FezFedora.
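As a rough illustration of what an import filter for reference-manager data might involve, the sketch below parses RIS-tagged text (a format EndNote can export) into simple records. This is an assumption for illustration only: the slides do not say which export format the FezFedora filters consumed, and `parse_ris` and the sample data are hypothetical.

```python
from collections import defaultdict


def parse_ris(text):
    """Parse RIS-tagged text into a list of records (tag -> list of values)."""
    records = []
    current = defaultdict(list)
    for raw in text.splitlines():
        line = raw.rstrip()
        # RIS lines look like "TY  - JOUR": two-character tag, two spaces, hyphen.
        if len(line) < 5 or line[2:5] != "  -":
            continue
        tag, value = line[:2], line[5:].strip()
        if tag == "ER":  # "ER" marks the end of a record
            if current:
                records.append(dict(current))
            current = defaultdict(list)
        else:
            current[tag].append(value)
    return records


# Hypothetical sample data, not taken from any real library.
SAMPLE = """TY  - JOUR
AU  - Example, A.
AU  - Sample, B.
TI  - A hypothetical article title
PY  - 2005
ER  - 
"""

records = parse_ris(SAMPLE)
print(records[0]["AU"])
```

Keeping every tag as a list means repeated fields such as multiple authors survive the import, which matters later when records from different sources are compared.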
13
UQ eSpace Repository: Record De-duplication and
Checking

14
Deduplication and Checking
In many instances multiple records now existed
with overlapping and sometimes conflicting data.
  • Needed a mechanism to try to match records which
    were duplicates: algorithmic matching based on a
    combination of factors including ISI-LOC, and
    fuzzy logic on keyword/title/name/publication.
  • Developed a mechanism to present lists of
    duplicates: some records could be automatically
    merged but others required human oversight.
  • Matching the right fields from different
    records: a major factor in the merge process was
    ensuring that when a duplicate record was
    discarded, no data was lost.
  • Matching repository content models to the RQF
    Technical Specification: many of the existing
    content models needed to be updated to handle
    new fields and display methods.
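The matching and merging ideas on this slide can be sketched roughly as follows. This is not the FezFedora implementation: the field names (`isi_loc`, `title`, `year`, `journal`), the 0.9 threshold and the use of Python's `difflib` for the fuzzy comparison are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def normalise(title):
    """Lower-case and drop punctuation so trivial differences do not matter."""
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in title.lower())
    return " ".join(kept.split())


def is_duplicate(a, b, threshold=0.9):
    """Exact match on ISI-LOC when both records have one, else fuzzy title match."""
    if a.get("isi_loc") and b.get("isi_loc"):
        return a["isi_loc"] == b["isi_loc"]
    if a.get("year") != b.get("year"):
        return False
    ratio = SequenceMatcher(None, normalise(a["title"]), normalise(b["title"])).ratio()
    return ratio >= threshold


def merge(keep, discard):
    """Fill empty fields in `keep` from `discard`; report conflicts for human review."""
    merged, conflicts = dict(keep), {}
    for field, value in discard.items():
        if not merged.get(field):
            merged[field] = value        # nothing to lose: take the other value
        elif value and merged[field] != value:
            conflicts[field] = value     # both populated but different: flag it
    return merged, conflicts


# Hypothetical records describing the same publication from two sources.
a = {"title": "Repositories and Research Assessment: A Case Study",
     "year": 2006, "isi_loc": None, "journal": "Example Journal"}
b = {"title": "Repositories and research assessment - a case study.",
     "year": 2006, "isi_loc": "000000000000001", "journal": ""}

if is_duplicate(a, b):
    merged, conflicts = merge(a, b)
    print(merged["isi_loc"], sorted(conflicts))
```

Returning the conflicting values instead of silently overwriting them mirrors the point above: some merges can be automatic, while fields that disagree go to a person, and nothing from the discarded record is dropped without being seen.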
20
UQ eSpace Repository: Publications Data
Consolidation Project

22
Passing Data back to the EPS
The Evidence Portfolio System (EPS) needs to
extract and update data based on records in the
repository.
  • Simple web services were created to expose
    records to the EPS via XML. A number of services
    make records available based on arguments
    passed, including a list of authors, single
    records, collections and communities.
  • XACML-based policies protect data from
    unauthorised changes. The data checking team is
    able to edit only the fields that need to be
    changed.
  • Workflow tracking and management: the data
    checking team uses our issue tracking system,
    and messages can be exchanged between that and
    the EPS to indicate records that need updates or
    have been completed.
  • The academic community is able to view/check
    publications in the EPS. Each academic is
    presented with a single view of all their
    RQF-eligible publications in one place in the
    EPS, yet the underlying data is sourced from the
    institutional repository.
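A minimal sketch of a service that returns records as XML according to the arguments passed might look like the following. The element names, the `pid` attribute, the record layout and the `records_as_xml` function are hypothetical; the slides do not describe the actual XML schema or service interface used between UQ eSpace and the EPS.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory stand-in for repository records.
RECORDS = [
    {"pid": "demo:1", "title": "First hypothetical output", "authors": ["a.example"]},
    {"pid": "demo:2", "title": "Second hypothetical output", "authors": ["b.sample"]},
    {"pid": "demo:3", "title": "Joint hypothetical output",
     "authors": ["a.example", "b.sample"]},
]


def records_as_xml(records, author=None, pid=None):
    """Serialise the records matching the given arguments as an XML string."""
    root = ET.Element("records")
    for rec in records:
        if author is not None and author not in rec["authors"]:
            continue  # filter by author when one is requested
        if pid is not None and rec["pid"] != pid:
            continue  # filter to a single record when a pid is requested
        node = ET.SubElement(root, "record", pid=rec["pid"])
        ET.SubElement(node, "title").text = rec["title"]
        for name in rec["authors"]:
            ET.SubElement(node, "author").text = name
    return ET.tostring(root, encoding="unicode")


print(records_as_xml(RECORDS, author="a.example"))
```

With a shape like this, the EPS side only needs to parse one small XML document per request, while the repository remains the single source of the underlying data.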
23
Fitting It All Together

24
Not quite the end of the story
Users of the DEST IMS need to be able to directly
access the published outputs which are held in
the repository.
  • The DEST IMS communicates directly with the
    repository via basic authentication using a
    secret username and password. A web service in
    the repository can then present the PDF
    datastream directly in the web browser of the
    DEST assessor without displaying any local
    repository metadata or graphical presentation.
  • Publisher outputs are stored on the record in
    the repository. The same XACML policy engine
    which protects key fields can also be used to
    selectively hide the datastream containing a
    published version of the output from all BUT
    the authorised DEST user.
  • OF COURSE, HARVESTING or OBTAINING the published
    versions is a completely different kettle of
    fish...
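The two mechanisms on this slide can be sketched as below: building and checking an HTTP Basic authentication header, and deciding whether a user may see a given datastream. The dictionary-based policy is a deliberately simplified stand-in and is not XACML; the user names, datastream names and function names are hypothetical.

```python
import base64


def basic_auth_header(username, password):
    """Build the HTTP Basic Authorization header for the given credentials."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def parse_basic_auth(header_value):
    """Return (username, password) from a Basic header value, or None if invalid."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        return None
    try:
        decoded = base64.b64decode(token, validate=True).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return None
    username, sep, password = decoded.partition(":")
    return (username, password) if sep else None


def may_view_datastream(user, datastream, policy):
    """Datastreams absent from the policy are open; listed ones are restricted."""
    allowed = policy.get(datastream)
    return allowed is None or user in allowed


# Hypothetical policy: only the assessor account may see the published PDF.
POLICY = {"published.pdf": {"dest_assessor"}}

header = basic_auth_header("dest_assessor", "not-a-real-password")
user, _password = parse_basic_auth(header["Authorization"])
print(may_view_datastream(user, "published.pdf", POLICY))
print(may_view_datastream("anonymous", "published.pdf", POLICY))
```

Basic authentication sends credentials in an easily decoded form, so an arrangement like this would normally be run over HTTPS.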
25
A couple of final words on IRs
Scalability and performance become even more
critical when you are relying on your IR so
heavily.
  • Test your system and architecture to see how it
    scales. Be sure that your hardware and system
    are capable of scaling to not just hold 5000
    objects, but also serve them up in a timely
    fashion. What about 10,000? 55,000? FezFedora is
    currently being tested with an ingest of over
    200,000 records!
  • Backup, archiving and preservation remain
    critical. Issues of backup, archiving and
    preservation of records which have been
    submitted to the RQF require careful
    consideration. Can you correct/edit records for
    the public repository yet retain the archival
    version submitted? What strategies have you in
    place to ensure that the full-text versions
    remain readable and usable? AONS2 of course is a
    good answer.
26
Other benefits of all this data
  • Development of the University's Research Profile
    system: the migration of some content from
    ResearchMaster to the repository has created an
    opportunity to redevelop some of the other
    services which make use of publications
    information.
  • Critical mass of publications in your IR: using
    your IR to support the RQF will most likely end
    up filling it with an enormous amount of valuable
    publications information. Great for recruiting
    further content or for showcasing aspects of your
    institution's research. If you are also able to
    expose the metadata to harvesters such as Google,
    OAIster etc., you will see an enormous increase
    in traffic to your IR and records too.
  • Use the IR to make your research more accessible:
    by depositing publications in your IR and
    including legal versions of the outputs, you
    dramatically increase the accessibility of your
    research outputs.
27
All done... Thank you!
For more information about UQ eSpace or this
presentation:
Andrew Bennett or Belinda Weaver
The University of Queensland Library
The University of Queensland
Brisbane QLD 4072 Australia
Telephone: 07 33464342
Web: http://espace.library.uq.edu.au
Email: a.bennett_at_library.uq.edu.au