Parallel Session on Metadata

About This Presentation
Slides: 28
Provided by: denn116

Transcript and Presenter's Notes

1
Parallel Session on Metadata
  • The Value of Metadata and How to Realise It
  • Date: 18th June 2002
  • Facilitator: Dennis Nicholson
  • Centre for Digital Library Research

2
Notes and Slides
3
Theme: Examine, Discuss
  • The value of using metadata as an aid to
    reliable retrieval, both within individual Web
    sites and across distributed sites
  • What the barriers to effective use of metadata
    are and how they can be overcome
  • Who should be responsible for creating and
    maintaining metadata: resource creators,
    web-masters, or librarians?

4
Theme: Examine, Discuss
  • Whether embedding and harvesting or a central
    database is the best approach
  • Plus (if time allows):
  • A step beyond: the value of Content Management
    Systems
  • Focus: General
  • My background...

5
Responsibility to...
  • Stimulate
  • Thought, discussion, debate
  • Draw out the important points
  • Impart ability to apply what we've discovered
  • Ensure participation
  • So

6
Individual needs and circumstances?
7
Effective Retrieval
  • What is it?
  • Balance of precision and recall best suited to a
    given problem
  • High precision and low recall usually preferred
    but in some cases (e.g. patents) there may be an
    advantage in lowering precision to boost recall
  • Level of precision and recall should be under the
    user's control, not a side effect of poor metadata
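
The balance between precision and recall can be made concrete with a small calculation. This is a minimal sketch, not part of the original slides: the function name `precision_recall` and the document identifiers are invented for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search.

    precision = share of retrieved items that are relevant
    recall    = share of relevant items that were retrieved
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical collection: four documents are actually relevant.
relevant = {"doc1", "doc2", "doc3", "doc4"}

# A narrow search returns little, but everything it returns is relevant.
narrow = ["doc1", "doc2"]

# A broad search (the patent-style case) finds everything relevant,
# at the price of four irrelevant hits.
broad = ["doc1", "doc2", "doc3", "doc4", "doc5", "doc6", "doc7", "doc8"]

print(precision_recall(narrow, relevant))  # (1.0, 0.5)
print(precision_recall(broad, relevant))   # (0.5, 1.0)
```

With good metadata the user chooses which of these two searches to run; with poor metadata the system makes that choice for them.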

8
Effective Retrieval
  • Why does it matter?
  • It costs the University and the public purse to
    create the material - a waste if the people it is
    aimed at can't find it
  • Strategic/PR considerations - if they can't find
    your courses, expertise registers or digital
    images for sale when you want or need them to,
    they won't use you, or talk or write about you

9
Effective Retrieval
  • When does it matter?
  • Only if it is stuff you want found
  • The bigger they come, the sooner they fail
  • The more stuff you have, and the more campuses or
    organisations in a collaboration, the harder it is
    to ensure effective retrieval
  • Especially with no or poor metadata

10
What is metadata?
  • Metadata is data about data
  • Consists of things like
  • Author, Title, Subject, Description, Level,
    Language, Viewer
  • Appropriate to function
  • The route to effective retrieval
  • Maybe...

11
What can go wrong?
  • Limited penetration (i.e. only some available
    documents covered)
  • Misleading results for users
  • Different metadata record formats
  • Can the software cope? Is there a cross-walk?
  • Incompatible core field sets
  • Cross-walk not possible
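
A cross-walk between record formats is, at its simplest, a table mapping the fields of one format onto those of another. The sketch below assumes a deliberately tiny mapping from a few MARC tags to Dublin Core element names; the mapping is a simplification, and the sample record and the name `crosswalk` are invented for illustration.

```python
# Simplified, illustrative mapping from a few MARC tags to Dublin Core
# elements. A real cross-walk covers far more fields and subfields.
MARC_TO_DC = {
    "100": "creator",      # main entry, personal name
    "245": "title",        # title statement
    "520": "description",  # summary note
    "650": "subject",      # topical subject heading
}


def crosswalk(record, mapping):
    """Convert a record field by field.

    Returns (converted, dropped): the converted record, plus the source
    fields that have no equivalent in the target format.
    """
    converted, dropped = {}, []
    for field, value in record.items():
        target = mapping.get(field)
        if target is None:
            dropped.append(field)
        else:
            converted.setdefault(target, []).append(value)
    return converted, dropped


# Hypothetical source record keyed by MARC tag.
marc_like = {
    "100": "Darwin, Charles",
    "245": "On the Origin of Species",
    "650": "Evolution",
    "300": "502 p.",  # physical description: not in the mapping above
}

dc, dropped = crosswalk(marc_like, MARC_TO_DC)
print(dc)
print(dropped)  # ['300']
```

The `dropped` list shows why incompatible core field sets matter: whatever has no counterpart in the mapping is lost to the combined service.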

12
What can go wrong?
  • Different field sub-sets used (both use DC but
    different field sets)
  • Full service limited to common fields
  • Different fields used for same data element (I
    put subject headings in subject field and free
    form keywords in the keyword field but you put
    subject headings in the keyword field)
  • Misleading results

13
What can go wrong?
  • Different or no standards applied in creating
    data element content (e.g. Darwin, C. or Charles
    Darwin)
  • Reduced retrieval, varied results
  • Different or no subject schemes and/or category
    lists (Educational levels, LCSH v. UNESCO v. made
    up)
  • Reduced retrieval, varied results
  • Insufficient granularity (If everything physical
    is physics)
  • Poor precision, high recall

14
What can go wrong?
  • Varied or no methods of central co-ordination (2
    sites or campuses)
  • Can cause some of the other problems listed above
    and below
  • Different sites index different fields (one has
    subjects and keywords in one index, another in
    separate indices)
  • Misleading for users

15
What can go wrong?
  • Missing indices (nothing on the subject in the
    index, or no subject index? (2 sites))
  • Misleading retrieval
  • Humans can cope but machines can't (a machine
    finds it harder to spot different usages of the
    same word or alternative words for the same
    thing than a human does)
  • Semantic web won't work

16
Safeguards against
  • Limited penetration
  • Policy? Training? DC-dot? Human monitor?
  • Different formats
  • Discover need, agree policy, set standards,
    ensure software can cope with formats
  • Incompatible core field sets
  • Identify formats (DC, IMS, MARC?) then agree core
    set of fields (e.g. 15 in DC base)

17
Safeguards against
  • Different field sub-sets used
  • Agree, monitor, one core set
  • Different fields used for same data element
  • Templates and examples, Central co-ordination,
    Guidelines, Training

18
Safeguards against
  • Different or no standards applied in creating
    data element content
  • Template with examples
  • Different or no subject schemes and/or category
    lists
  • Agree single schemes or lists, have drop down
    lists, upgrade centrally
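
Agreed field sets and drop-down lists can also be enforced in software when records are created. The sketch below assumes a hypothetical core set and a hypothetical list of educational levels; neither comes from the slides or from any published scheme.

```python
# Hypothetical agreed core set and controlled list, for illustration only.
CORE_FIELDS = ("title", "author", "subject", "level")
ALLOWED_LEVELS = {"Undergraduate", "Postgraduate", "Research"}


def check_record(record):
    """Return a list of problems found in one metadata record."""
    problems = []
    for field in CORE_FIELDS:
        if not record.get(field):
            problems.append(f"missing field: {field}")
    level = record.get("level")
    if level and level not in ALLOWED_LEVELS:
        problems.append(f"level not in agreed list: {level}")
    return problems


good = {"title": "Intro to Optics", "author": "Smith, J.",
        "subject": "Physics", "level": "Undergraduate"}
bad = {"title": "Intro to Optics", "author": "Smith, J.", "level": "UG"}

print(check_record(good))  # []
print(check_record(bad))
```

A check like this is the programmatic equivalent of a drop-down list: free-text values such as "UG" are caught at entry rather than discovered later as failed searches.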

19
Safeguards against
  • Insufficient granularity
  • Agree usable level, training, examples
  • Varied or no methods of central co-ordination
    (2 sites or campuses)
  • Make sure it doesn't happen!
  • Different sites index different fields
  • Agree approach, implement and monitor standards

20
Safeguards against
  • Missing indices
  • Agree not to do this, and warn users if you can't
    agree
  • Humans can cope but machines can't (semantic web)
  • Use standard schemes, ontologies in standard ways
    and map between different ones in a way that your
    software can process
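
Mapping between schemes "in a way that your software can process" can be as simple as translating each site's terms into a shared set of identifiers before comparing them. In this sketch the two term lists and the shared identifiers are invented for illustration and are not taken from LCSH, UNESCO or any real scheme.

```python
# Hypothetical mappings from each site's local subject terms to shared
# identifiers. All terms and identifiers here are made up.
SITE_A_TO_SHARED = {
    "Evolution (Biology)": "evolution",
    "Natural selection": "natural-selection",
}
SITE_B_TO_SHARED = {
    "Evolutionary theory": "evolution",
    "Selection, natural": "natural-selection",
}


def to_shared(terms, mapping):
    """Translate local terms; return (shared identifiers, unmapped terms)."""
    mapped, unmapped = set(), []
    for term in terms:
        if term in mapping:
            mapped.add(mapping[term])
        else:
            unmapped.append(term)
    return mapped, unmapped


def same_subject(terms_a, terms_b):
    """True if a site A record and a site B record share a subject."""
    shared_a, _ = to_shared(terms_a, SITE_A_TO_SHARED)
    shared_b, _ = to_shared(terms_b, SITE_B_TO_SHARED)
    return bool(shared_a & shared_b)


print(same_subject(["Evolution (Biology)"], ["Evolutionary theory"]))  # True
print(same_subject(["Natural selection"], ["Evolutionary theory"]))    # False
```

A human sees at once that the two sites mean the same thing by different words; the software only sees it because the mapping says so.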

21
Where to keep it?
  • Pros and Cons of
  • Embedding and harvesting
  • Metadata creation more likely? Harder to
    co-ordinate, easier to resource? More often out
    of date? Harder to ensure standardised metadata?
  • A central database
  • Easier to co-ordinate, more expensive to
    resource? Easier to maintain standards? How to
    ensure new stuff notified?
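
To make "embedding and harvesting" concrete: metadata embedded in a page as HTML `<meta>` tags can be collected by a small harvester and loaded into a central store. The sketch below uses Python's standard `html.parser` module and assumes the common convention of `DC.`-prefixed names in `<meta>` tags; the class name, function name and sample page are invented for illustration.

```python
from html.parser import HTMLParser


class MetaHarvester(HTMLParser):
    """Collect <meta name="DC.xxx" content="..."> values from one page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        name = attrs.get("name") or ""
        content = attrs.get("content")
        # Keep only Dublin Core style names; ignore other meta tags.
        if name.lower().startswith("dc.") and content is not None:
            element = name[3:].lower()
            self.fields.setdefault(element, []).append(content)


def harvest(html):
    """Return the embedded DC-style metadata found in an HTML string."""
    parser = MetaHarvester()
    parser.feed(html)
    parser.close()
    return parser.fields


# Hypothetical page with embedded metadata.
page = """<html><head>
<meta name="DC.title" content="Parallel Session on Metadata">
<meta name="DC.creator" content="Dennis Nicholson">
<meta name="keywords" content="not Dublin Core">
</head><body></body></html>"""

print(harvest(page))
```

The trade-off in the slide shows up directly: the harvester can only return what authors chose to embed, in whatever form they embedded it, whereas a central database would enforce a form at entry.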

22
Where to keep it?
  • Pros and Cons of
  • A mix of the two?
  • Worst of both worlds? Or best? How to ensure the
    latter? Optimise author input of embedded
    metadata but allow central upgrades by metadata
    experts? Is this feasible? Is it cost-effective?
  • Depends on other factors?
  • A question of designing to be fit for purpose?

23
Whose Responsibility?
  • Candidates: their pros and cons
  • Resource creators?
  • Au fait with the resource; labour saving
  • Web-masters?
  • Au fait with the technical landscape
  • Librarians?
  • Au fait with knowledge and metadata domains
  • Public Relations?
  • Au fait with the needs of the University
  • Anybody else?
  • All of the above? Co-ordinated by?

24
Other Related Issues
  • A CMS would ensure
  • Currency, accuracy, legality and authority of
    content retrieved by metadata
  • Not to mention
  • Uniform look and feel; control; easy total
    redesign and global changes; all content tracked;
    joint authorship across departments, units and
    different institutions; easy repurposing
  • All who have some responsibility can be involved
    in controlled way?

25
Facilities
  • It would provide
  • Content authoring; collaborative authoring;
    editing and workflow; preventing unauthorised
    editing or creation; scheduling publication;
    tracking changes; personalising; repurposing;
    metadata creation; knowledge management through
    semantic control

26
Closing Discussion
  • Who has/plans to have a CMS?
  • What does it/will it cost?
  • Are they
  • Essential? Optional? Impractical? A threat to
    academic freedom?
  • Do they help solve the metadata problem?

27
Useful URLs
  • Metadata
  • http://content.lib.washington.edu/METADATA/ (Why
    should we care?)
  • http://www.ukoln.ac.uk/metadata/dcdot/
  • http://www.ukoln.ac.uk/web-focus/metadata/seminar-materials/exercises/dc-dot/dc-dot.doc
  • http://www.ukoln.ac.uk/metadata/dcassist/
  • Content Management Systems
  • http://www.ukoln.ac.uk/nof/support/help/papers/cms.htm
    (what are they?)
  • http://www.ariadne.ac.uk/issue30/techwatch/ (Who
    needs them?)
  • http://www.cultivate-int.org/issue5/cms/ (CMSs
    available)