Title: New Approaches To Resource Discovery In The UK HE Community
1. New Approaches To Resource Discovery In The UK HE Community
- Aims of Talk
  - Review approaches taken by UK HE community
  - Overview of eLib phase 3 projects and development of the DNER
  - Discussion of architectural models, software development and funding regimes
- Brian Kelly
- UK Web Focus
- UKOLN
- University of Bath
- Bath, BA2 7AY
- Email: B.Kelly_at_ukoln.ac.uk
- URL: http://www.ukoln.ac.uk/
UKOLN is funded by Resource: The Council for
Museums, Archives and Libraries, the Joint
Information Systems Committee (JISC) of the
Higher and Further Education Funding Councils, as
well as by project funding from the JISC and the
European Union. UKOLN also receives support
from the University of Bath where it is based.
2. Contents
3. Which To Choose?
Can choose by reading reviews, web sites, etc. or
by looking at usage in community
- Glimpse
- Harvest
- ht://Dig
- ICE
- iHound (ICATT)
- Index Search (Xavatoria)
- Index Server (Microsoft)
- IndexMySite (remote)
- Infoseek - Ultraseek
- Intermediate Search
- intraSearch (remote)
- I-Search
- Isearch
- ITMS
- Isysweb
- Java Applets
- JHLSearch
- JObjects QuestAgent
- Lycos / InMagic
- Alkaline (Vestris)
- AltaVista - Search Intranet
- ASTAWare SearchKey
- atomz Search (remote)
- BooleanSearch
- BBDBot
- BRS/Search (Dataware)
- Compass Server (Netscape)
- Cybotics
- DataWare BRS/Search
- DocFather (formerly SiteSearch)
- dtSearch Web
- Excalibur RetrievalWare
- EWS (Excite)
- Excerpt (Obsolete)
- Extense
- FAST Search Server
- Findex (code library)
- Folio siteDirector
- Magnifi Enterprise Server
- Matt's SimpleSearch
- Microsoft Index Server
- Microsoft Site Server
- MiniSearch (remote)
- MondoSearch
- Muscat
- NetResults (now SearchKey Plus)
- Netscape - Compass Server
- OpenText - LiveLink
- Perl Scripts
- Perlfect Search
- Phantom (Maxum)
- PicoSearch (remote)
- Etc.
Software from <http://searchtools.com/tools/tools.html>.
Which to choose? What software may be obsolete? What does "remote" mean?
4. Findings: UK HE Web Sites
- Main findings of 3 surveys

Software     Nos. in Jul 1999   Nos. in Mar 2000   Nos. in Aug 2000
ht://Dig            25                 32                 42
eXcite              19                 17                  9
Microsoft           12                 15                 18
Harvest              8                  6                  3
Ultraseek            7                  9                 11
Other               29                 34                 31
None                60                 50                 44
Totals             160                163                163

- Article published in Ariadne issue 21: <http://www.ariadne.ac.uk/issue21/webwatch/>
- Results (including update on survey) available from <http://www.ukoln.ac.uk/web-focus/surveys/uk-he-search-engines/>
5. Popular Product: ht://Dig
- ht://Dig
  - Now used at 42 (up from 25 then 32) UK HEIs
  - Freely available
  - Own domain with well-designed web site
  - Robot to index multiple servers
See <http://www.htdig.org/>
Oxford Case Study: 131 servers, 438,500 resources. Indexes MS Office, PDF, etc. files (external parser)
Issue: Web community not interested in non-Web resources?
6. National Search Engines
- ACDC (Academic Directory)
  - (Unfunded) pilot of index of ac.uk domain based on distributed approach using Harvest
  - Set up in March 1996
  - Lack of development effort resulted in degraded service (e.g. indexer not aware of JavaScript code)
http://acdc.hensa.ac.uk/
Issues:
- Problems with volunteer effort once enthusiasm wanes
- Lack of user involvement can limit acceptance
- Lack of funding body involvement can mean lessons learnt are lost
7. Institutional Developments
- Maestro robot (Dundee)
- Indexes Scottish resources
- Individual or all sites
- Volunteer effort
- Interesting application for OS/2
8. eLib Subject Gateways
- SOSIG is an example of a subject gateway initially funded by eLib
- SOSIG provides access to manually catalogued resources in Social Sciences
- Involvement with Social Science community has helped acceptance
9. ROADS
- ROADS software used to support several gateways
- Key features
  - Open source
  - Support for Whois++
- Momentum behind software meant
  - Uptake in other communities
  - Additional developments (e.g. ROADS/Z39.50 gateway)
- But
  - Whois++ standard failed to take off
10. Approaches Taken By Hybrid Libraries Projects
- Let's look at some of the approaches taken by some of the eLib Phase 3 Hybrid Libraries projects which help users find electronic and "real world" resources
- Agora
  - Use of Z39.50 and Collection Level Descriptions
  - Working with a commercial software vendor
- Headline
  - Provision of a personalised interface
  - An open source approach
- BUILDER
  - Searching across Hybrid Library Web sites
  - Authenticated access to exam papers
  - Making use of locally available applications
11. Agora (1)
- In the Agora Hybrid Library the user can choose a Landscape
12. Agora (2)
- The landscape may be a collection of resources; individual collections can be selected
13. Agora (3)
- Collections are defined using the Collection Level Description agreed by eLib projects
14. Agora (4)
- Results from local collections are usually returned first
15. Agora (5)
- The results can be viewed directly or requested using ILL
16. Agora (6)
- The results are retrieved simultaneously
17. Agora (7)
- Results from AltaVista obtained using HTML-scraping technique
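The HTML-scraping technique mentioned above can be sketched roughly as follows. This is an illustration only, not Agora's code: the sample page, the `result` class name and the `ResultLinkParser` helper are all hypothetical, and a real scraper has to be written against whatever markup the search engine actually returns (and breaks whenever that markup changes).

```python
from html.parser import HTMLParser


class ResultLinkParser(HTMLParser):
    """Collect (url, title) pairs from <a class="result"> elements.

    The class name "result" is a hypothetical marker chosen for this sketch.
    """

    def __init__(self):
        super().__init__()
        self.results = []
        self._href = None   # href of the result link currently being read
        self._text = []     # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("class") == "result":
            self._href = attrs.get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a result link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.results.append((self._href, "".join(self._text).strip()))
            self._href = None


def scrape_results(html):
    """Return the list of (url, title) pairs found in a results page."""
    parser = ResultLinkParser()
    parser.feed(html)
    return parser.results


# Hypothetical results page standing in for a search engine response.
SAMPLE_PAGE = """
<html><body>
<a href="/help">Help</a>
<a class="result" href="http://example.org/a">First <b>hit</b></a>
<a class="result" href="http://example.org/b">Second hit</a>
</body></html>
"""

for url, title in scrape_results(SAMPLE_PAGE):
    print(url, "-", title)
```

Navigation links such as the "Help" link are ignored because they lack the marker class; only the two result links are reported.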
18. Headline (1)
http://www.headline.ac.uk/publications/pie/Pete's
Page1.html
- Headline's PIE (Personal Information Environment) provides a personalised interface to Hybrid Libraries resources.
- Here is the default information landscape of Pete (an Economics UG student)
19. Headline (2)
- Pete selects the All Resources link
- This gives a list of all the Library resources and services that Pete is entitled to use
20. Headline (3)
- Pete adds the Economic Systems Research journal to his list of resources
21. Headline (4)
- Pete now clicks on the Customise option near the top of the window
- He can now add the journal to his resources for This Week's Essay
22. Headline (5)
- Pete now carries out additional research
- He selects collections of interest and then searches for Japan and emerging markets
23. Headline (6)
- Pete expands the results for Unicorn
24. Headline (7)
- and then views a map showing the physical location
- This illustrates how Headline supports access to physical objects as well as digital resources.
25. Headline (8)
- Finally Pete expands the results from Decomate
- These are PDF documents which can be viewed directly
26. BUILDER (1)
- BUILDER (Birmingham University Integrated Library Development and Electronic Resource) provides a number of hybrid library demonstrators
The Microsoft SiteServer indexer is used to index across other Hybrid Libraries (and Clumps) projects. Notice branding of the results.
Authentication is provided using the Novell NDS which provides access to the institutional network.
27. Issues
- The different approaches to software development
- Make use of (and work with) commercial products
  - Benefit from market-tested products
  - More realistic awareness of commercial acceptance
  - Relationships may be difficult
  - May be sucked into use of proprietary solutions
- Develop open source software and use complementary open source products
  - Flexibility in adopting emerging new standards
  - Requires technical expertise to develop and maintain
  - Management resistance, esp. if fails to gain momentum
- Pragmatic approach in using existing tools
  - Makes use of existing tools and expertise
  - Can quickly develop prototypes which can help gain support for services
  - May be architecturally flawed and make use of proprietary solutions
28. Tools (1)
http://www.ukoln.ac.uk/metadata/dcdot/
- A variety of open source tools are being developed within the community.
- DC-dot, developed by UKOLN, can be used to assist the creation of Dublin Core metadata.
- The metadata can be generated in various formats such as HTML and RDF.
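As an illustration of what Dublin Core metadata embedded in HTML can look like, here is a minimal sketch. It is not DC-dot's code or its exact output: the `dublin_core_meta` helper and the sample record are hypothetical, and the `DC.element` naming with a `schema.DC` link is assumed from the common convention for encoding Dublin Core in HTML `<meta>` tags.

```python
from html import escape


def dublin_core_meta(record):
    """Render a dict of Dublin Core elements as HTML <meta> tags.

    Keys are unqualified element names (title, creator, ...); values are
    plain strings. Hypothetical helper written for this sketch.
    """
    lines = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">']
    for element, value in record.items():
        # Escape quotes and angle brackets so the attribute stays well-formed.
        lines.append(
            '<meta name="DC.{}" content="{}">'.format(
                element, escape(value, quote=True)
            )
        )
    return "\n".join(lines)


# Sample record; the values are illustrative.
record = {
    "title": "New Approaches To Resource Discovery In The UK HE Community",
    "creator": "Brian Kelly",
    "publisher": "UKOLN",
}
print(dublin_core_meta(record))
```

The resulting tags would be pasted into the `<head>` of the page being described.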
29. Tools (2)
http://www.ukoln.ac.uk/metadata/rslp/tool/
- UKOLN has also developed a tool for creating
collection level descriptions to support projects
funded by RSLP (Research Support Libraries
Programme), another HE funded programme
30. From Hybrid Libraries to the DNER
- Hybrid Libraries projects are addressing
  - Needs for users to find variety of resources
  - Need to gain experiences from projects
- The DNER
  - Distributed National Electronic Resource
  - Building on Hybrid Libraries project experiences
  - Focus on services rather than projects
  - Aims to provide seamless access to quality resources
  - Is developing a standards-based architectural framework
31. DNER Architecture
- Areas of interest include
  - Collection descriptions
  - User profiles
  - Identifiers
- Emphasis on interoperability through use of standards
- Work currently in progress
32. Currently...
[Diagram labels: Local content; National content; International content; End user]
33. Currently...
[Diagram labels: Local content; National content; International content; Collection Description (e.g. Agora); User Profile (e.g. Headline); Authentication (Athens)]
34. Future...
[Diagram labels: Content; Web (five instances); Collection description; User profile; End user; Authentication (Sparta)]
35. Future...
[Diagram labels: Content; Subject portal or institutional portal or MLE or ...; Collection description; Portal; User profile; End user; Authentication (Sparta)]
36. Sharing content
- How do portals and content servers interact?
- Technologies currently being investigated
  - HTTP
  - Z39.50 - Bath Profile
  - OAI - Open Archives Initiative
  - RSS - Rich Site Summary / RDF Site Summary
37. Open Archives Initiative
- OAI Metadata Harvesting Framework
  - Simple mechanism for sharing metadata records
  - Records shared over HTTP...
  - ... as XML
- Client can ask metadata server for
  - all records
  - all records modified in last n days
  - info about databases, formats, etc.
- See <http://www.openarchives.org/>
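The three kinds of request listed above can be sketched as plain HTTP GET URLs. This is a sketch under assumptions: the verb and parameter names (`ListRecords`, `Identify`, `ListMetadataFormats`, `metadataPrefix`, `from`) are taken from the later OAI-PMH 2.0 specification, which may differ from the version of the framework current when these slides were written, and `BASE_URL` is a hypothetical metadata server.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

# Hypothetical metadata server address, used only for illustration.
BASE_URL = "http://archive.example.org/oai"


def oai_request(verb, **params):
    """Build the URL for one harvesting request (verb plus arguments)."""
    query = {"verb": verb}
    query.update(params)
    return BASE_URL + "?" + urlencode(query)


def all_records():
    """Ask for all records, as unqualified Dublin Core."""
    return oai_request("ListRecords", metadataPrefix="oai_dc")


def records_modified_since(days, today=None):
    """Ask for all records modified in the last `days` days."""
    today = today or date.today()
    since = today - timedelta(days=days)
    # "from" is a Python keyword, so it is passed via dict unpacking.
    return oai_request(
        "ListRecords", metadataPrefix="oai_dc", **{"from": since.isoformat()}
    )


def repository_info():
    """Ask the server to describe itself."""
    return oai_request("Identify")


def metadata_formats():
    """Ask which metadata formats the server can supply."""
    return oai_request("ListMetadataFormats")


print(all_records())
print(records_modified_since(7))
print(repository_info())
print(metadata_formats())
```

Each URL would be fetched over HTTP and the XML response parsed by the harvesting client; the sketch stops at building the requests so that it runs without a live server.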
38. RSS
- RSS (Rich Site Summary)
  - XML application for syndicated news feeds
  - Pointers and simple descriptions of news items (not the items themselves)
  - Being transitioned to more generic RDF/XML application (RSS 1.0)
  - No querying - just regular gathering of RSS file
- See <http://rssxpress.ukoln.ac.uk/>
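Since there is no query step, a portal only has to gather the RSS file periodically and read the pointers out of it. A minimal sketch of that reading step follows, assuming the un-namespaced RSS 0.91 layout (`rss/channel/item`); the feed content is invented for illustration, and an RSS 1.0 (RDF) file would need namespace-aware handling that is not shown here.

```python
import xml.etree.ElementTree as ET

# Invented sample feed in the un-namespaced RSS 0.91 style.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="0.91">
  <channel>
    <title>Example news</title>
    <link>http://news.example.org/</link>
    <description>Hypothetical feed used for illustration</description>
    <item>
      <title>First story</title>
      <link>http://news.example.org/1</link>
      <description>Short summary of the first story.</description>
    </item>
    <item>
      <title>Second story</title>
      <link>http://news.example.org/2</link>
    </item>
  </channel>
</rss>
"""


def read_items(rss_text):
    """Return the news items of a feed as a list of dicts.

    Only pointers and short descriptions are carried in the feed;
    the full items live at the linked URLs.
    """
    root = ET.fromstring(rss_text)
    items = []
    for item in root.findall("./channel/item"):
        items.append(
            {
                "title": item.findtext("title", default=""),
                "link": item.findtext("link", default=""),
                "description": item.findtext("description", default=""),
            }
        )
    return items


for entry in read_items(SAMPLE_RSS):
    print(entry["title"], "->", entry["link"])
```

A portal would run something like this on a schedule against each gathered file and display the resulting titles as links.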
39. Future... Z, OAI, RSS
[Diagram labels: Content; RSS; OAI; HTTP; Z39.50; Collection description; Portal or MLE or ...; User profiles; End user; HTTP; Authentication (Sparta)]
40. Content Identification
- Need to persistently identify stuff to
  - Enable lecturers to embed it into learning resources
  - Enable students to embed it into multimedia essays
  - Enable people to cite it
- ... so let's look at a current example (from VADS)
41. Content Example
42. Content example - the URL
http://vads.ahds.ac.uk/ixbin/hixclient?_IXDB_vads
_IXSPFX_t_MREF_3392_IXSR_ea1_IXSP_0_IXSS_
25242brec2bvads2band2bseaside2band2b2528
2528Basic2bDesign2bCollection2bin2btitle_vads_
collection25292bor2b2528Halliwell2bCollection
2bin2btitle_vads_collection25292bor2b2528Imp
erial2bWar2bMuseum2bConcise2bArt2bCollection
2bin2btitle_vads_collection25292bor2b2528Lond
on2bCollege2bof2bFashion2bCollege2bArchive2b
in2btitle_vads_collection252925292bsort2btitl
e2b3d252e26_IXDB_3dvads_IXRECNUM3392_IXASE
ARCHSUBMIT-BUTTONDISPLAY
It would be nicer if the content URL was something like:
http://vads.ahds.ac.uk/id137234-849783
http://dx.doi.org/10.3456/1096493
43. Identifiers
- Could use URLs, PURLs, DOIs, ... but...
  - URLs are locators not identifiers
  - DOIs and PURLs resolved centrally
  - All resolve to same thing irrespective of who/where the user is, e.g.
    - 10.1045/october2000-granger always resolves to US version even though D-Lib mirrored in UK
    - http://purl.org/dc always resolves to US version even though DC pages mirrored in UK
  - DOI and PURL are resolved using a US resolver
44. Identifiers
- Need some way to encode
  - identifier
  - citation
- in such a way that resolution happens in the context of
  - The location of the end user
  - The access rights of the end user
- this can be achieved with OpenURL and SFX
- See <http://www.sfxit.com/> for further information
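The idea of context-sensitive resolution can be sketched as follows: the same identifier is attached to whichever resolver belongs to the end user's own institution, so each user is sent to a copy appropriate to their location and access rights. This is only a sketch. The resolver addresses are hypothetical, the `make_openurl` helper is invented, and the `id=doi:...` key is an assumption based on the early OpenURL draft syntax; only the DOI itself comes from the slides.

```python
from urllib.parse import urlencode

# Hypothetical institutional resolvers, one per user community.
UK_RESOLVER = "http://resolver.example.ac.uk/sfx"
US_RESOLVER = "http://resolver.example.edu/sfx"


def make_openurl(resolver_base, **description):
    """Attach an identifier or citation to the user's own resolver.

    `description` holds key/value pairs describing the item; the key
    names used by callers are assumptions, not checked here.
    """
    return resolver_base + "?" + urlencode(description)


# The same DOI, handed to two different resolvers.
doi = "doi:10.1045/october2000-granger"
print(make_openurl(UK_RESOLVER, id=doi))
print(make_openurl(US_RESOLVER, id=doi))
```

The description of the item is identical in both links; only the resolver differs, and it is the resolver that decides where the user ends up (for example, a UK mirror for a UK user).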
45. Development of Standards
- As well as designing an architecture to support interoperability based on open standards there is a need to be involved in standards development work
- Warwick Framework
  - A framework for metadata applications, which informed W3C's RDF work
- Dublin Core
  - eLib community has been actively involved with Dublin Core development
- Bath Profile
  - Bath Profile for Z39.50 defines core attributes for library applications
46. What's Happening Elsewhere?
- A number of EU-funded projects and joint UK/US projects are involved in related activities, including
- Renardus
  - EU project to develop an academic subject gateway service for Europe
- SCHEMAS
  - SCHEMAS provides a forum for metadata schema designers involved in EU-funded projects and national initiatives
- IMESH
  - Joint JISC/NSF funded project to develop a configurable, reusable and extensible toolkit for subject gateway providers
47. Renardus
http://www.renardus.org/
- Renardus
  - Will build a pilot European broker service, using Z39.50, offering subject-based access to collections of information to support learning, teaching and research
  - An open source approach, e.g. making use of Zebra (www.indexdata.dk)
48. SCHEMAS
http://www.schemas-forum.org/
- To support EU projects SCHEMAS will
  - Monitor metadata developments
  - Organise workshops
  - Provide a registry of schemas
- The use of RDF to store schemas in a machine-readable way is being investigated
- Will make use of commercial software (EOR from OCLC)
49. IMESH
http://www.desire.org/html/subjectgateways/community/imesh/
- A joint JISC/NSF funded project
- Will develop open source tools for use by developers of subject gateways
50. Conclusions
- This talk has provided examples of new approaches to resource discovery within the UK Higher Education community
- A number of case studies have been looked at and the following issues addressed
  - Standards
  - Approaches to software development
  - The funding regime
51. Standards
- There is
  - Awareness of the importance of standards
  - Some involvement in development of standards (e.g. Dublin Core) and community agreements (e.g. collection level descriptions)
- Key standards
  - XML, Dublin Core
  - Z39.50: political backing and backing by library community, but less enthusiasm from software developers
  - RDF: some enthusiasts, used in some projects, but also sceptics (too complex, lack of widespread support)
  - DOIs, OpenURLs, etc.: interest by early adopters
  - Authentication (digital signatures, etc.): difficult
  - User profiles: early days
52. Software Development
- There are a variety of approaches to software development
  - Development of Open Source software
  - Use of commercial software / joint projects with commercial software vendors, etc.
- The pros and cons of these approaches are well known. There is probably no single best approach applicable for all
- Interoperability through use of open standards is the key: let's be agnostic over this argument
53. Funding Regimes
- Volunteer effort by enthusiasts can be useful (cf. the Web in 1993) but this approach has limitations
- Large scale programmes, such as eLib, can result in significant developments
- The transition from projects to services is essential and may be difficult
- Building on national initiatives through international collaboration will provide fresh insights and address unforeseen interoperability issues
54. Question Time
Acknowledgements: Thanks to Andy Powell, Leona Carpenter, Rachel Heery and my other colleagues in UKOLN and members of eLib Hybrid Libraries projects for their help with this presentation.