Title: Understanding library users you don't see
1Understanding library users you don't see
- Techniques for tracking and analyzing library Web resources
Marshall Breeding
Director for Innovative Technologies and Research
Vanderbilt University
http://staffweb.library.vanderbilt.edu/breeding
Marshall.breeding_at_vanderbilt.edu
Saturday June 24
2Theme
- For many libraries, the number of visitors of
their Web site and electronic resources exceeds
the numbers that visit their physical premises.
It's vital for libraries to understand how these
remote visitors approach the Web site, not only
to measure use but to improve the resources
themselves. Marshall Breeding will present a
number of practical techniques that libraries can
use to better understand the use of their
Web-based resources.
- Topics will include the basics of analyzing the
server logs of the library's Web site,
transaction logs from the OPAC, the complexities
of measuring use of subscription-based electronic
resources, and techniques for enhancing
applications to better record how they are used.
3Understanding remote users
- Vital to providing relevant library services
- More patrons may use library resources remotely
through the Web than from physical library
facilities
- Must work harder to ensure that Web-based
services meet patron needs
- Move beyond hit counters and raw statistics to
more sophisticated analysis and assessment
4Analysis goals
- Improve usability
- Web site diagnostics
- Understand user needs
- Content selection decisions
- Improve quality of service
- Marketing
- Budget justification
- Strategy to increase interest and activity
5Data sources for tracking remote use
- Web server logs
- Application logs
- Remote tracking data (Google Analytics)
- Vendor provided use statistics (e-resources)
6Enterprise approach to analytics
- Multiplicity of Resources to track
- Web Servers
- OPACS
- E-Resources
- Databases
- Repositories
- Important to track the flow of use among all the
library's Web-based resources
- Beyond the library, study flow to and from
higher-level Web sites and portals (University ->
Courseware -> Library)
7Web server logs
- Web servers are routinely configured to record
detailed information about each request. Common
elements include:
- File requested
- Date / time stamp
- Status code
- Request method (GET, POST, HEAD)
- Referrer (where the user came from)
- User agent (browser and platform data)
8Example Web log
- Raw data for analysis process
- 2006-06-20 05:01:43 129.59.150.105 GET /index.pl - 80 - c-69-250-131-199.hsd1.md.comcast.net Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.1.4322) http://www.google.com/search?hl=en&lr=&safe=off&q=september+11+television+archive 200 0 0 11752
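A minimal sketch of how an entry like this one could be split into named fields before analysis. It assumes a space-delimited W3C extended log in which spaces inside a field are written as "+". The field names and their order are inferred from the example rather than taken from the server configuration, and the meaning of the final number in particular is a guess. The sample line is a shortened, cleaned-up illustration.

```python
# Assumed field order, inferred from the example entry. A real W3C
# extended log declares its own order in a "#Fields:" header line.
FIELDS = [
    "date", "time", "s-ip", "cs-method", "cs-uri-stem", "cs-uri-query",
    "s-port", "cs-username", "c-ip", "cs(User-Agent)", "cs(Referer)",
    "sc-status", "sc-substatus", "sc-win32-status",
    "time-taken",  # assumption: the last number could also be a byte count
]


def parse_log_line(line):
    """Split one space-delimited log entry into a dict keyed by field name."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


# Illustrative entry (shortened user agent and referrer).
SAMPLE = (
    "2006-06-20 05:01:43 129.59.150.105 GET /index.pl - 80 - "
    "c-69-250-131-199.hsd1.md.comcast.net "
    "Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1) "
    "http://www.google.com/search?q=september+11+television+archive "
    "200 0 0 11752"
)

entry = parse_log_line(SAMPLE)
print(entry["cs-uri-stem"], entry["sc-status"], entry["cs(Referer)"])
```

Once each entry is a dictionary, the later steps (filtering crawlers, grouping sessions, reading referrers) can work on field names instead of column positions.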
9Exploiting referral data
- The query string component of the referrer can be
parsed to reveal search terms and other
interesting information
- http://www.google.com/search?hl=en&lr=&safe=off&q=september+11+television+archive
- User typed "september 11 television archive" in
Google to find our site
- Important to study how users get to your site
- (Example: TV News Public Web queries vs. OpenWeb)
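A short sketch of the parsing step, using Python's standard `urllib.parse` module. It assumes the search phrase travels in a `q` parameter, as it does in Google referrers; other search engines may use a different parameter name. The referrer below is a reconstruction of the slide's example with its punctuation restored.

```python
from urllib.parse import urlsplit, parse_qs


def search_terms(referrer, param="q"):
    """Return the search phrase carried in a referrer URL, or None."""
    query = urlsplit(referrer).query
    values = parse_qs(query).get(param)  # parse_qs decodes "+" as a space
    return values[0] if values else None


# Reconstructed example referrer (assumption: original punctuation).
referrer = (
    "http://www.google.com/search"
    "?hl=en&lr=&safe=off&q=september+11+television+archive"
)
print(search_terms(referrer))  # september 11 television archive
```

Counting the phrases returned for every external referrer in a log gives a simple list of the searches that bring visitors to the site.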
10Analysis methodology
- Go beyond simply counting pages
- Identify Sessions
- Categorize users
- Determine use patterns
- Measure interest
- Time spent on Web site
- Bounce rate
- Page overlay analysis
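A rough sketch of how sessions, time on site, and bounce rate might be derived from parsed log entries. It assumes a visitor can be identified by client host alone (which proxies and shared addresses will blur) and that 30 minutes of inactivity ends a session. Both are common conventions rather than anything specified in this presentation, and the request data is made up for illustration.

```python
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=30)  # assumed inactivity cutoff


def sessionize(requests):
    """Group (visitor, timestamp, page) tuples into per-visitor sessions."""
    sessions = []
    last_seen = {}  # visitor -> (time of last request, that visitor's open session)
    for visitor, ts, page in sorted(requests, key=lambda r: r[1]):
        prev = last_seen.get(visitor)
        if prev is None or ts - prev[0] > SESSION_GAP:
            current = []  # first visit or long pause: start a new session
            sessions.append(current)
        else:
            current = prev[1]
        current.append((ts, page))
        last_seen[visitor] = (ts, current)
    return sessions


def bounce_rate(sessions):
    """Share of sessions that viewed exactly one page."""
    if not sessions:
        return 0.0
    return sum(len(s) == 1 for s in sessions) / len(sessions)


def time_on_site(session):
    """Elapsed time between the first and last request of a session."""
    return session[-1][0] - session[0][0]


# Hypothetical requests: hostA browses two pages, then returns two hours later.
t = datetime(2006, 6, 20, 5, 0)
requests = [
    ("hostA", t, "/index.pl"),
    ("hostA", t + timedelta(minutes=2), "/search.pl"),
    ("hostB", t + timedelta(minutes=1), "/index.pl"),
    ("hostA", t + timedelta(hours=2), "/index.pl"),
]
sessions = sessionize(requests)
print(len(sessions), round(bounce_rate(sessions), 2), time_on_site(sessions[0]))
```

Time on site measured this way undercounts, because the time spent reading the last page of a session leaves no trace in the log.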
11Move from measurement to impact
- Establish site goals
- Benchmark current use
- Implement goal-oriented improvements
- Measure impact
- Repeat as needed
- (Example: enhancement of TV News OpenWeb)
12Appropriate data filtering
- Requests from indexing bots (crawlers) can skew
statistics
- Count user requests and bot requests separately
- Performance monitors
- Link checkers
- Monitoring crawler activity is an important
component of SEO and Web site discoverability
strategies.
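One simple way to count the two kinds of traffic separately is to test the user agent string for crawler markers. The sketch below assumes each log entry has already been parsed into a dictionary with a `user_agent` key (a hypothetical name). The marker list is illustrative and far from complete, and crawlers that disguise their user agent will slip through.

```python
# Assumed, non-exhaustive substrings that commonly appear in crawler user agents.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")


def is_bot(user_agent):
    """Guess whether a request came from a crawler, based on its user agent."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def split_requests(entries):
    """Separate parsed log entries into (user requests, bot requests)."""
    users, bots = [], []
    for entry in entries:
        (bots if is_bot(entry["user_agent"]) else users).append(entry)
    return users, bots


# Illustrative entries only.
entries = [
    {"page": "/index.pl", "user_agent": "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"},
    {"page": "/index.pl", "user_agent": "Googlebot/2.1 (+http://www.google.com/bot.html)"},
    {"page": "/about.pl", "user_agent": "Mozilla/5.0 (compatible; Yahoo! Slurp)"},
]
users, bots = split_requests(entries)
print(len(users), "user requests,", len(bots), "bot requests")
```

Keeping the bot list rather than discarding it preserves the crawler activity that the discoverability work depends on.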
13Resource Discovery
- How do users get to your site?
- Track performance of the Web site relative to
major search engines
- SEO: Search engine optimization
- Few users begin with library Web sites
14Troubling statistic
- Where do you typically begin your search for
information on a particular topic?
- College students' responses:
- 89% Search engines (Google 62%)
- 2% Library Web site (total respondents: 1%)
- 2% Online database
- 1% E-mail
- 1% Online news
- 1% Online bookstores
- 0% Instant messaging / online chat
OCLC. Perceptions of Libraries and Information
Resources (2005) p. 1-17.
15Library Discovery Model
Web
Library Web Site / Catalog
Library as search Destination
16TV News OpenWeb project
- Dramatic increase in Web site activity and loan
requests through systematic and controlled
exposure of metadata to Google and other search
engines
- SEO (Search Engine Optimization) strategy
- Helped the Archive become financially
self-sufficient.
17Examples of Web reporting and analysis tools
18Selected utilities
- Analog: free, open source
- NetTracker: enterprise-level Web analysis
application
- Google utilities
- Sitemap: process for submitting Web pages for
optimized indexing by Google with some assessment
capabilities
- Analytics: sophisticated approach for measuring
Web site performance
19Analog
- Free Open Source application
- Basic Web statistics application
- Includes fairly full set of static metrics
- Command line utility generates Web report
- Windows, Unix, Linux, etc.
20NetTracker
- Unica Corporation
- Enterprise level Web analytics
- http://www.sane.com/
21NetTracker Executive Dashboard
22NetTracker Bandwidth Trends
23NetTracker Content
24NetTracker Keyword Summary
25NetTracker Referrers
26NetTracker Pages Viewed
27Google SiteMaps
- XML specification for systematically submitting
URLs that represent a Web site
- Makes indexing more efficient but does not affect
PageRank
- SiteMap interface provides utilities for
monitoring how the site has been indexed with
some analytical information on terms used to find
your Web site.
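A minimal sketch of generating such an XML file with Python's standard library. It assumes the `urlset` / `url` / `loc` / `lastmod` structure and the namespace of the sitemaps.org 0.9 protocol; the schema version Google accepted at the time of this talk may have used a different namespace. The page URLs and dates are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed namespace (sitemaps.org protocol 0.9).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML for an iterable of (url, last_modified) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:  # lastmod is optional
            ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages for illustration.
PAGES = [
    ("http://library.example.org/", "2006-06-20"),
    ("http://library.example.org/catalog", None),
]
sitemap_xml = build_sitemap(PAGES)
print(sitemap_xml)
```

For a database-driven site, the list of pages would typically come from the same records that drive the site, so every item with metadata gets a URL in the file.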
28Google SiteMaps Top Searches
29Google SiteMaps Page Analysis
30Google Analytics
- Available at no cost from Google
- Must receive invitation code
- Slanted toward e-commerce
- Conversion University: training on how to
optimize Web site for high conversion rates
- Allows Webmasters to establish site goals and
measure performance
31Google Analytics main
32Google Analytics overview
33Google Analytics Browser Versions
34Google Analytics Top Content
35Google Analytics Entrance-Bounce Rates
36Google Analytics Navigational Analysis
37Google Analytics Goal tracking
38Application-level reporting and analysis
- Content management systems and other dynamically
driven Web environments can provide additional
usage information.
- Can offer additional information beyond raw Web
logs
- More capabilities for identifying use based on
user categories
- Reporting can be built into the business logic of
the application
39Examples from the TV News Web Site
- Reports of use by user category and institution
- Statistics on resource use
- Data on search types, query terms, etc.
- Ability to track all aspects of business activity
40Other sources of Use data
- ILS OPAC Logs
- Proxy Server logs and reports
- Link resolver logs and reports
41Limitations
- Can't know the intent of the user
- User success can only be estimated
- Difficult to obtain trends by user type
- More aggressive reporting might intrude on
privacy
- Few libraries require the level of user
authentication needed to determine use by type of
patron
42Additional Information
- Breeding, Marshall. Strategies for Measuring and
Implementing E-use. ALA TechSource. May-June
2002. 79 pages.
- Breeding, Marshall. Analyzing Web server logs to
improve a site's usage. Computers in Libraries.
Information Today. Medford, NJ. October 2005.
43Handout
- Presentation will be available after the
conference at:
- http://staffweb.library.vanderbilt.edu/breeding/presentations/ala2006.ppt