Title: Seeding search engines with data from the Australian National Bibliographic Database Tony Boston Assistant Director-General Resource Sharing National Library of Australia 18 September 2006
1. Seeding search engines with data from the Australian National Bibliographic Database
Tony Boston
Assistant Director-General, Resource Sharing
National Library of Australia
18 September 2006
2. Outline
- Why seed search engines?
- National Library Digital Collections
- Libraries Australia and the ANBD
- Results to date
- Future directions
3. Why seed search engines?
- To provide new discovery pathways for users of
Australian libraries and increase exposure of the
Libraries Australia Search service on the Internet
4. Why people use search engines
- Self-service, satisfaction, seamlessness [1]
- 89% of US college students start research process via search engines [1]
- Principle of Least Effort [2]
- Poor design of library systems, eg complexity, lack of relevance ranking [3]
- [1] The 2003 OCLC Environmental Scan: Pattern Recognition. C. De Rosa, L. Dempsey and A. Wilson, January 2004
- [2] Improving User Access to Library Catalog and Portal Information: Final Report. M. J. Bates, 2003
- [3] Rethinking How We Provide Bibliographic Services for the University of California: Final Report. Bibliographic Services Task Force, December 2005
5. National Library Digital Collections
- 100,000 items from the Library's collection digitised since 1996
- Pictures, maps, sheet music, manuscripts, some books and serials
6. More pathways, more users
- Three major pathways to collection items
  - Via the Library's catalogue
  - Via federated discovery services, eg Libraries Australia, MusicAustralia, PictureAustralia
  - Via Internet search engines
- URL lists for search engines to harvest
- Persistent URLs resolve to a page to be indexed, eg http://nla.gov.au/nla.pic-an22948286
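The URL-list approach on this slide can be sketched as follows. This is a minimal illustration, not the Library's actual harvesting setup: the resolver base is taken from the example URL on the slide, the function names are hypothetical, and the use of the sitemaps.org XML format is an assumption (the slides do not say which list format was supplied to search engines).

```python
from xml.etree import ElementTree as ET

# Resolver host taken from the example persistent URL on the slide.
RESOLVER_BASE = "http://nla.gov.au/"
# Assumption: the sitemaps.org protocol namespace; the slides name no format.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def persistent_url(identifier: str) -> str:
    """Build a persistent URL from a persistent identifier."""
    return RESOLVER_BASE + identifier


def url_list(identifiers) -> str:
    """Plain-text URL list, one persistent URL per line."""
    return "\n".join(persistent_url(i) for i in identifiers) + "\n"


def sitemap_xml(identifiers) -> str:
    """The same list expressed as a sitemap document (assumed format)."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for identifier in identifiers:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = persistent_url(identifier)
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    ids = ["nla.pic-an22948286"]  # the only identifier given in the slides
    print(url_list(ids), end="")
    print(sitemap_xml(ids))
```

A crawler that fetches such a list follows each persistent URL to the page that is to be indexed, so the list itself only needs to carry the identifiers.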
7. Search engine indexing and access paths
9. Libraries Australia and the ANBD
- Free Libraries Australia Search service
  - Launched by Senator Helen Coonan on 27 February 2006
  - Freely available to anyone with Internet access
  - Easy to use, Google-like search
- Records exported to search engines
  - March 2006: 700,000 records matched to Google Scholar
    - Simple XML format
  - July 2006: 1.4 M records matched to Google Scholar
    - MARCXML format
  - August 2006: Records added to Google Book Search
  - End 2006: Records in main google.com and yahoo.com index
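For readers unfamiliar with MARCXML, the export format named above, the sketch below builds and reads back one minimal record. It is an illustration only: the element names and namespace follow the Library of Congress MARC21 "slim" schema, but the record values, the placeholder leader, and the function names are hypothetical, and nothing here reflects which fields the ANBD export actually carried.

```python
from xml.etree import ElementTree as ET

# Namespace of the Library of Congress MARC21 slim (MARCXML) schema.
MARC_NS = "http://www.loc.gov/MARC21/slim"


def marcxml_record(control_number: str, title: str, author: str) -> str:
    """Serialise a minimal bibliographic record as MARCXML (hypothetical data)."""
    record = ET.Element("record", xmlns=MARC_NS)
    # Placeholder 24-character leader; a real export would compute it.
    ET.SubElement(record, "leader").text = "00000nam a2200000 a 4500"
    ET.SubElement(record, "controlfield", tag="001").text = control_number
    # 100 = main entry (personal name), 245 = title statement.
    for tag, ind1, ind2, value in (("100", "1", " ", author),
                                   ("245", "1", "0", title)):
        field = ET.SubElement(record, "datafield", tag=tag, ind1=ind1, ind2=ind2)
        ET.SubElement(field, "subfield", code="a").text = value
    return ET.tostring(record, encoding="unicode")


def title_of(xml_text: str):
    """Read subfield a of field 245 back out of a MARCXML record."""
    root = ET.fromstring(xml_text)
    return root.findtext(
        "marc:datafield[@tag='245']/marc:subfield[@code='a']",
        namespaces={"marc": MARC_NS},
    )


if __name__ == "__main__":
    xml_text = marcxml_record("000000001", "An example title", "Example, Author")
    print(xml_text)
    print(title_of(xml_text))
```

Compared with a simple ad hoc XML format, MARCXML keeps the full tag, indicator and subfield structure of the source record, which gives the receiving search engine more to match on.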
10. Google's Union Catalogue Program
- Data obtained from 12 union catalogues
  - Australia, China, Czech Republic, Denmark, Ireland, Israel, Hungary, Lithuania, Netherlands, Taiwan, United Kingdom and United States (WorldCat)
- Links to union catalogues vary based on IP address of user
- Issues
  - Unique material
  - Matching algorithm
  - Trust, profit and motivation
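The idea of varying the union catalogue link by the user's IP address can be sketched as below. This is purely illustrative and says nothing about how Google implements it: the lookup table, the function name, and the behaviour when no range matches are all assumptions, and the network ranges are reserved documentation blocks rather than real allocations.

```python
import ipaddress

# Hypothetical table mapping network ranges to a union catalogue.
# The ranges are reserved documentation blocks (RFC 5737), not real allocations.
CATALOGUE_BY_NETWORK = {
    ipaddress.ip_network("192.0.2.0/24"): "Libraries Australia",
    ipaddress.ip_network("198.51.100.0/24"): "WorldCat",
}


def catalogue_for(ip: str):
    """Return the catalogue to link to for this address, or None if unknown."""
    address = ipaddress.ip_address(ip)
    for network, catalogue in CATALOGUE_BY_NETWORK.items():
        if address in network:
            return catalogue
    return None  # assumption: no union catalogue link is shown


if __name__ == "__main__":
    for ip in ("192.0.2.17", "198.51.100.5", "203.0.113.9"):
        print(ip, catalogue_for(ip))
```

A production service would resolve addresses through a geolocation database rather than a hand-written table; the table here only makes the routing step visible.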
13. Libraries and the long tail
- 80% of people want just 20% of any collection
- 80% of the collection requested rarely
- The long tail of sporadic usage
- Represents a new business model > NetFlix
- Fewer, larger resources > Union Catalogues
- Better fulfilment: home delivery, universal "Get it" button
- Project library services into Web 2.0 world
14. Future directions
- Enhancements to Libraries Australia
- Relevance Ranking
- Clustering and faceted browsing
- Annotation and links to value added services
- Improved getting
- Exposure of the ANBD in google.com, yahoo.com
- Libraries Australia search box
- Relationship with OCLC and Open WorldCat
17. Libraries Australia search box
18. Conclusions
- Seeding search engines generates
  - More discovery pathways > more users
- Union catalogues support
  - The business of libraries
  - Comprehensiveness > Exposing the long tail
  - More specialised services beyond search engines
  - Improved getting of items