Title: Building Quality Assurance into Metadata Creation: An Analysis Based on the Learning Objects and e-Prints Communities of Practice
1 Building Quality Assurance into Metadata Creation: An Analysis Based on the Learning Objects and e-Prints Communities of Practice
- Jane Barton, Centre for Digital Library Research, University of Strathclyde, UK
- Sarah Currier, Centre for Academic Practice, University of Strathclyde, UK
- Jessie M.N. Hey, Intelligence, Agents, Multimedia Group and University Library, University of Southampton
2 Researchers from two countries discuss two communities
3 Scope of Paper
- Metadata creation for two parallel communities: learning object repositories and open e-Print archives
- The content of the metadata record, not the structure
- Human-generated metadata only
- Assuring the quality of this process
- "Metadata will only support effective discovery if it is accurate, consistent, sufficient, and thus reliable"
- (Greenberg and Robertson (2002) Semantic Web construction: an inquiry of authors' views on collaborative metadata generation. Proceedings of DC2002, 45-52.)
4 in the beginning (LO community)
- "the authoring of metadata itself will be straightforward for most course designers. Because metadata files are machine-writable, authors will simply access a form into which they enter the appropriate metadata information."
- (Downes, 2001)
5 in the beginning (e-Prints community)
- Physicists deposited academic papers in the global arXiv
- Interoperability framework created: the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH)
- Emphasis on examining and changing the culture within academia to encourage deposit of e-prints
- Wider goal of changing the unsustainable economics of scholarly communication
- Focus on participation: anything perceived as a barrier between academics and institutions tends to be played down (e.g. metadata creation issues)
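The OAI-PMH harvesting mentioned above is a simple HTTP GET protocol. A minimal sketch, assuming a hypothetical repository base URL and an illustrative sample response; the verb and `metadataPrefix` parameter are defined by the OAI-PMH specification, everything else here is invented for illustration:

```python
# Sketch of an OAI-PMH harvest. The base URL and the sample response
# below are illustrative assumptions, not a real repository.
import urllib.parse
import xml.etree.ElementTree as ET

DC_NS = "{http://purl.org/dc/elements/1.1/}"

def build_listrecords_url(base_url, metadata_prefix="oai_dc"):
    """Build the GET request URL for an OAI-PMH ListRecords harvest."""
    query = urllib.parse.urlencode(
        {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    )
    return f"{base_url}?{query}"

def extract_titles(response_xml):
    """Pull Dublin Core titles out of a ListRecords response."""
    root = ET.fromstring(response_xml)
    return [el.text for el in root.iter(f"{DC_NS}title")]

# Hand-written sample of the oai_dc response shape.
sample_response = """<?xml version="1.0"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Quality Assurance for Metadata</dc:title>
          <dc:creator>A. Author</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(build_listrecords_url("http://eprints.example.org/oai2"))
print(extract_titles(sample_response))
```

Note that the protocol moves the metadata as-is: a harvester like UPS or Arc inherits whatever quality the data provider supplied.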
6 but is it really so simple? Assumptions in e-learning and e-prints
- Internet culture: mediation by controlling authorities seen as detrimental or undesirable
- Metadata creation seen as time-consuming, costly, a barrier to uptake of technology (tedious and difficult)
- Only authors/users understand their resources
- Deus ex machina?
7 but is it really so simple? Some case studies
- Quality of author-generated metadata?
  - Higher Level Skills for Industry Project (HLSI), University of Huddersfield
  - e-Prints service providers: UPS and Arc
- Collaboration between authors and specialists?
  - Bolton Woods Local History Project
  - e-Prints data providers: TARDIS
- Specialist help needed?
  - Scottish electronic Staff Development Library (SeSDL)
  - ePrints UK and TARDIS
8 Quality Control? The HLSI Project
- 6,500 learning objects with IEEE LOM metadata records created by authors
- The same metadata records used for many or all components of a content package
- Inconsistent terminology
- Description of facets and characteristics of the educational object and not of the content
- Over-use of software default values
- Information scientists brought in in June 2003: 2,500 metadata records re-edited, taking ca. 550 hours and costing ca. £6,500 (£2.60 each)
- (Ryan and Walmsley, 2003; Ryan, B. (2003) Creating, Using and Re-using Learning Objects. HLSI Project. PowerPoint presentation. Online: http://www.cetis.ac.uk/groups/20010809144711/FR20030807121739)
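Some of the HLSI problems above (default values, empty fields, identical records reused across a package) are mechanically detectable. A minimal sketch of that kind of automated check; the field names and default strings are assumptions, not the IEEE LOM binding the project actually used:

```python
# Toy QA checks over author-generated metadata records, represented
# here as plain dicts. Field names and default values are illustrative.
SUSPECT_DEFAULTS = {"untitled", "enter description here", "unknown"}

def qa_flags(record):
    """Return a list of quality problems found in one metadata record."""
    flags = []
    for field in ("title", "description", "keywords"):
        value = record.get(field, "").strip()
        if not value:
            flags.append(f"missing {field}")
        elif value.lower() in SUSPECT_DEFAULTS:
            flags.append(f"default value in {field}")
    return flags

def duplicate_records(records):
    """Find records that are exact copies of an earlier record, e.g. the
    same metadata reused for every component of a content package."""
    seen, dupes = set(), []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key in seen:
            dupes.append(rec)
        seen.add(key)
    return dupes

records = [
    {"title": "Untitled", "description": "", "keywords": "algebra"},
    {"title": "Intro to Algebra", "description": "Worked examples",
     "keywords": "algebra"},
    {"title": "Intro to Algebra", "description": "Worked examples",
     "keywords": "algebra"},
]
print(qa_flags(records[0]))
print(len(duplicate_records(records)))
```

Checks like these can triage records for the kind of expert re-editing HLSI paid for, but they cannot judge whether a description actually matches the content.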
9 Quality Control? UPS Preprint Service
- UPS (Universal Preprint Service) Prototype
- Slightly pre-OAI: used the NCSTRL protocol to harvest ca. 200,000 records from existing archives, made available through a single user interface
- "The lack of quality of the metadata available in the UPS Prototype project has an important, baleful influence on the creation of cross-archive services as well as on the quality of services that can be created."
- (Van de Sompel, H. et al., 2000)
10 Quality Control? Arc search service
- Arc: first search service prototype using OAI
- "The effort of maintaining a quality federation service is highly dependent on the quality of the data providers. Some are meticulous in maintaining exacting metadata records that need no corrective actions. Other data providers have problems maintaining even a minimum set of metadata and the records harvested are useless."
- (Liu, X. et al., 2001)
11 Quality Control? An aside
- "Even when there's a positive benefit to creating good metadata, people steadfastly refuse to exercise care and diligence in their metadata creation. Take eBay: every seller there has a damned good reason for double-checking their listings for typos and misspellings. Try searching for 'plam' on eBay. Right now, that turns up nine typoed listings for 'Plam Pilots'. Misspelled listings don't show up in correctly spelled searches and hence garner fewer bids and lower sale-prices. You can almost always get a bargain on a 'Plam Pilot' at eBay."
- (Doctorow, 2002, Metacrap: Putting the Torch to the Seven Straw Men of the Meta-Utopia)
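Doctorow's "plam" example is exactly the kind of error fuzzy matching can flag before a record is published. A minimal sketch using the standard library's `difflib`; the vocabulary and the similarity cutoff are illustrative assumptions:

```python
# Flag words in a listing title that nearly match, but miss, a known
# vocabulary. The vocabulary below is invented for illustration.
import difflib

VOCABULARY = {"palm", "pilot", "handheld", "organiser"}

def suspect_typos(title, vocabulary=VOCABULARY, cutoff=0.7):
    """Return words that are close to, but not in, the vocabulary."""
    typos = []
    for word in title.lower().split():
        if word in vocabulary:
            continue
        if difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff):
            typos.append(word)
    return typos

print(suspect_typos("Plam Pilot handheld organiser"))  # ['plam']
```

A check like this only catches near-misses of known terms; it says nothing about whether the metadata is sufficient or accurate.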
12 Collaboration? Findings from the Bolton Woods Local History Project
- Study compared resource authors' and information scientists' metadata
- Authors did not have a good understanding of the purpose or value of metadata
- Authors understood the context of resources and focused on these elements
- Information specialists understood the purpose of metadata and included a wider range of metadata elements, but "struggled" with contextual aspects of the metadata
- Neither handled pedagogic aspects of the resources well
- (O'Beirne, 2002)
13 Collaboration? The TARDIS project: Targeting Academic Resources for Deposit and Disclosure
- Part of the UK JISC-funded FAIR Programme, a cluster of projects exploring different aspects
- Pilot departments' metadata errors suggested modifying the approach
- Exploring self-archiving and mediated deposit together
- Trialling a simpler interface to the GNU EPrints software for author-generated metadata
- Testing the value of targeted help: more logical field order, examples created by information specialists, fields required for good citation
- Mediated service for daunted authors also being trialled and evaluated
14 Aiding the deposit process in TARDIS
15 Collaboration? Support from Semantic Web-based DC research
- "the integration of expert and author generated descriptive metadata can advance and improve the quality of metadata for web content, which in turn could provide useful data for intelligent web agents, ultimately supporting the development of the Semantic Web. If such partnerships are well planned and evaluated, they could make a significant contribution to achieving the Semantic Web."
- (Greenberg and Robertson (2002) Semantic Web construction: an inquiry of authors' views on collaborative metadata generation. Proceedings of DC2002, 45-52.)
16 Specialists needed? Scottish electronic Staff Development Library
- SeSDL Taxonomy Evaluation involved 6 users subject-classifying resources
- Out of 106 classifications, only 35 had the agreement of more than one user
- E.g. a resource defining "VLE" and "MLE" was classified "Student-Centred Learning" and "Collaborative Learning" by one user
- Without adequate user support, classification is likely to be so inconsistent as to make the browse tree unusable
- "The whole exercise has given me more admiration and respect for librarians" - (user)
- (Currier, 2001)
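The SeSDL figure (35 of 106 classifications agreed by more than one user) is a simple agreement count over (resource, term) assignments. A minimal sketch; the sample assignments below are invented for illustration:

```python
# Count which (resource, term) classifications more than one user chose.
# The assignment data is illustrative, not the SeSDL evaluation data.
from collections import Counter

def agreed_classifications(assignments):
    """Given (resource, term) pairs chosen by individual users, return
    the set of pairs that more than one user agreed on."""
    counts = Counter(assignments)
    return {pair for pair, n in counts.items() if n > 1}

assignments = [
    ("vle-definition", "Student-Centred Learning"),   # user 1
    ("vle-definition", "Collaborative Learning"),     # user 1
    ("vle-definition", "Learning Environments"),      # user 2
    ("vle-definition", "Learning Environments"),      # user 3
]
print(agreed_classifications(assignments))
```

Low agreement by this measure is what makes a browse tree unusable: the same resource ends up under terms most users would never look under.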
17 Specialists needed? ePrints UK and TARDIS
- TARDIS examined the diverse subject classification practices of current e-Print archives and is experimenting with a simple standard, additional specialised subject community options, and mediated entry
- ePrints UK is experimenting with the use of an automatic subject-classification Web service offered by OCLC
18 Specialists needed? Research in commercial database abstracting and indexing services shows
- "that authors may lack knowledge of indexing and cataloguing principles and practices, and are more likely to generate insufficient and poor quality metadata that may hamper resource discovery"
- (Greenberg and Robertson (2002) Semantic Web construction: an inquiry of authors' views on collaborative metadata generation. Proceedings of DC2002, 45-52.)
19 Let's revisit those assumptions in e-learning and e-prints
- Some expert mediation may be beneficial (metadata does not control access to resources, it provides access to resources)
- Cost-benefit analysis necessary: metadata metrics
- Authors'/users' expertise can be incorporated into the process, but metadata specialists have a role to play
- Not all problems are resolvable by machine
20 Conclusion
- The metadata creation process is not trivial and needs appropriate planning and management to assure quality and thus enable sharing and reuse of resources
- Further research is needed to understand how this can best be achieved:
  - What constitutes good quality metadata?
  - Who should create metadata, and how?
  - How can metadata tools support the process?
  - How can support and training be facilitated?
21 Resources
- This paper builds on "Quality Assurance for Digital Learning Object Repositories: How Should Metadata Be Created?" (Currier and Barton, ALT-C 2003) http://metadata.cetis.ac.uk/guides/usage_survey
- For further info / discussion on LO metadata, see the CETIS Metadata SIG: http://metadata.cetis.ac.uk/
- For e-Prints metadata developments, see:
  - Focus on Access to Institutional Resources (FAIR) Programme
  - TARDIS: http://tardis.eprints.org
22 Our help required!
- "I am a great believer in working towards quality assurance. I just never get there."
- University of Washington faculty member, September 2003