Title: Evaluation Workshop: Quantitative Evaluation Methods
1 Evaluation Workshop: Quantitative Evaluation Methods
Peter Dowdell, NOF-digitise Technical Advisory Service
Email: p.dowdell_at_ukoln.ac.uk
Web: http://www.ukoln.ac.uk
UKOLN is supported by
2 Aims of this presentation
- To explain the need to establish a measurement policy using a range of performance indicators
- To explain some of the possible measurements we can record
- To discuss some of the pitfalls and problem areas in attempting to measure performance
3 Why Have Performance Indicators?
Performance indicators for Web sites can be used for several purposes:
- Use in management reports showing service growth
- For Service Level Agreements with funding agencies
- To identify gaps in service provision
- To predict and plan for future load patterns
- To monitor performance levels
- To advise on deployment of new technologies
- To inform and motivate contributors
4 Why have a measurement policy?
- To create stable viewpoints over your site.
- To make sense of the available data.
- To clarify and measure broader objectives.
- To answer questions.
5 What are we logging?
- We log each request the web server receives.
- We record information such as:
- remote IP address
- date and time
- response code
- request string
- method (GET, POST, etc.)
- execution time
- data transfer
6 How can we log this?
- Each request is appended to a log file. There are different accepted formats, e.g. W3C.
- Alternatively, requests can be logged to a database.
- Log files can become very LARGE!
- Should we start a new log daily, weekly, monthly or by file size?
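To make the rotation question concrete, here is a minimal Python sketch of how a request timestamp could be mapped to a log file under each policy. The `access-...` file names and the `log_file_name` helper are assumptions for illustration only; real servers use their own naming schemes.

```python
from datetime import datetime


def log_file_name(when: datetime, period: str = "daily") -> str:
    """Return the (hypothetical) log file that a request at `when` belongs to."""
    if period == "daily":
        return when.strftime("access-%Y-%m-%d.log")
    if period == "weekly":
        # ISO week numbering: weeks start on Monday.
        year, week, _ = when.isocalendar()
        return f"access-{year}-W{week:02d}.log"
    if period == "monthly":
        return when.strftime("access-%Y-%m.log")
    raise ValueError(f"unknown rotation period: {period}")


stamp = datetime(1999, 12, 25, 0, 0, 21)
daily_name = log_file_name(stamp, "daily")
weekly_name = log_file_name(stamp, "weekly")
monthly_name = log_file_name(stamp, "monthly")
```

Shorter periods give smaller files that are easier to archive, while longer periods mean fewer files for the analysis tool to merge.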
7 Server logs
#Software: Microsoft Internet Information Server 4.0
#Version: 1.0
#Date: 1999-12-25 00:00:21
#Fields: date time c-ip cs-username cs-method cs-uri-stem cs-uri-query sc-status sc-bytes cs(User-Agent) cs(Cookie) cs(Referer)
1999-12-25 00:00:21 194.237.174.119 - GET /issue1/jobs/Default.asp - 200 20407 AltaVista-Intranet/V2.3A(www.altavista.co.ukjan.gelin_at_av.com) - -
1999-12-25 00:03:39 194.237.174.119 - GET /statistics/ExpIntHits1.asp - 200 10519 AltaVista-Intranet/V2.3A(www.altavista.co.ukjan.gelin_at_av.com) - -
1999-12-25 00:26:54 209.67.247.158 - GET /robots.txt - 200 303 FAST-WebCrawler/2.0.9(crawler_at_fast.nohttp://www.fast.no/) - -
1999-12-25 00:32:47 194.237.174.119 - GET /issue2/default.asp - 200 5332 AltaVista-Intranet/V2.3A(www.altavista.co.ukjan.gelin_at_av.com) - -
1999-12-25 01:49:54 206.186.25.7 - GET /resources/images/main/bg.gif - 200 300 Mozilla/2.0(compatibleMSIE3.02AKWindowsNT) ASPSESSIONIDGQQGQGADIIHCBIFDIECKPAPGICDEOJIISITESERVERID22e0a17296b8c2ed1f77460cde75c27f http://www.exploit-lib.org/issue1/webtechs/
1999-12-25 01:49:54 206.186.25.7 - GET /issue1/webtechs/Default.asp - 200 24659 Mozilla/2.0(compatibleMSIE3.02AKWindowsNT) - http://www.statslab.cam.ac.uk/%7Esret1/analog/webtechs.html
1999-12-25 01:49:54 206.186.25.7 - GET /resources/images/main/global_home_h.gif - 200 487 Mozilla/2.0(compatibleMSIE3.02AKWindowsNT) ASPSESSIONIDGQQGQGADIIHCBIFDIECKPAPGICDEOJIISITESERVERID22e0a17296b8c2ed1f77460cde75c27f http://www.exploit-lib.org/issue1/webtechs/
1999-12-25 01:49:54 206.186.25.7 - GET /resources/images/main/global_search.gif - 200 534 Mozilla/2.0(compatibleMSIE3.02AKWindowsNT) ASPSESSIONIDGQQGQGADIIHCBIFDIECKPAPGICDEOJIISITESERVERID22e0a17296b8c2ed1f77460cde75c27f http://www.exploit-lib.org/issue1/webtechs/
1999-12-25 01:49:56 206.186.25.7 - GET /resources/images/main/local_home01.gif - 200 663 Mozilla/2.0(compatibleMSIE3.02AKWindowsNT) ASPSESSIONIDGQQGQGADIIHCBIFDIECKPAPGICDEOJIISITESERVERID22e0a17296b8c2ed1f77460cde75c27f http://www.exploit-lib.org/issue1/webtechs/
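A log in this style can be read by taking the column names from the Fields directive and splitting each record on whitespace. The Python sketch below shows one way this might be done; the `parse_w3c_log` helper is an assumption for illustration, and the sample record is an abbreviated form of the robots.txt request (the user agent string is shortened).

```python
def parse_w3c_log(lines):
    """Yield one dict per request, keyed by the names in the #Fields directive."""
    fields = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Directives such as #Software, #Version, #Date and #Fields.
            if line.startswith("#Fields:"):
                fields = line[len("#Fields:"):].split()
            continue
        values = line.split()
        if len(values) != len(fields):
            continue  # skip malformed records rather than guess
        yield dict(zip(fields, values))


# Abbreviated sample data for illustration only.
sample = [
    "#Software: Microsoft Internet Information Server 4.0",
    "#Version: 1.0",
    "#Date: 1999-12-25 00:00:21",
    "#Fields: date time c-ip cs-username cs-method cs-uri-stem cs-uri-query "
    "sc-status sc-bytes cs(User-Agent) cs(Cookie) cs(Referer)",
    "1999-12-25 00:26:54 209.67.247.158 - GET /robots.txt - 200 303 "
    "FAST-WebCrawler/2.0.9 - -",
]
records = list(parse_w3c_log(sample))
```

Reading the column names from the directive, rather than hard-coding positions, keeps the parser working if the server is later configured to log a different set of fields.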
8 How to analyse the log files
- Analog (http://www.analog.cx)
- Webalizer (http://www.mrunix.net/webalizer)
- WebTrends (http://www.webtrends.com)
- Bespoke? One of the scripting solutions?
9 What do we measure?
- Hits: the overall number of requests that the server is handling. Includes all files making up a web page.
- Pages: the number of files designated as base pages, determined by file extension (.htm / .html / .asp / .php / .cfm?)
- Visits: many assumptions must be made
- User Agents
- Total Data transfer
- Average processing time
- Search terms in referrer string
- Failed requests
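As a rough illustration of how some of these figures could be derived from parsed log records, consider the Python sketch below. It assumes each record is a dict keyed by W3C field names, treats status codes of 400 and above as failed requests, and counts only successful requests as pages; all three are assumptions. The 404 record is hypothetical.

```python
PAGE_EXTENSIONS = (".htm", ".html", ".asp", ".php", ".cfm")


def summarise(records):
    """Compute hits, pages, failed requests and total bytes from log records."""
    hits = len(records)
    failed = sum(1 for r in records if int(r["sc-status"]) >= 400)
    pages = sum(
        1
        for r in records
        if int(r["sc-status"]) < 400
        and r["cs-uri-stem"].lower().endswith(PAGE_EXTENSIONS)
    )
    total_bytes = sum(int(r["sc-bytes"]) for r in records)
    return {"hits": hits, "pages": pages, "failed": failed, "bytes": total_bytes}


records = [
    {"cs-uri-stem": "/issue1/webtechs/Default.asp", "sc-status": "200", "sc-bytes": "24659"},
    {"cs-uri-stem": "/resources/images/main/bg.gif", "sc-status": "200", "sc-bytes": "300"},
    {"cs-uri-stem": "/resources/images/main/global_home_h.gif", "sc-status": "200", "sc-bytes": "487"},
    # Hypothetical failed request, added only to exercise the "failed" count.
    {"cs-uri-stem": "/missing.html", "sc-status": "404", "sc-bytes": "0"},
]
summary = summarise(records)
```

Here one page view accounts for three of the four hits, which is why hits and pages need to be reported separately.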
10 What are the problems?
- Robots and other agents
- Developers and in-house access
- Caches and proxy servers can conceal site usage
- IP address lookup can mislead
- An IP address can mask multiple users (firewalls, NAT)
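One common way to reduce the effect of robots is to separate their requests before counting. The Python sketch below is a heuristic only: it assumes that any client requesting /robots.txt is a robot and that certain substrings in the user agent identify crawlers. The marker list and the shortened user agent strings are assumptions, and neither test catches robots that disguise themselves as browsers.

```python
# Assumed substrings; a real list would need regular maintenance.
ROBOT_MARKERS = ("crawler", "spider", "bot", "altavista-intranet")


def split_robots(records):
    """Return (human_records, robot_records) using two simple heuristics."""
    # Heuristic 1: well-behaved robots fetch /robots.txt.
    robot_ips = {r["c-ip"] for r in records if r["cs-uri-stem"] == "/robots.txt"}
    humans, robots = [], []
    for r in records:
        agent = r.get("cs(User-Agent)", "").lower()
        # Heuristic 2: the user agent names itself as a crawler.
        if r["c-ip"] in robot_ips or any(m in agent for m in ROBOT_MARKERS):
            robots.append(r)
        else:
            humans.append(r)
    return humans, robots


records = [
    {"c-ip": "209.67.247.158", "cs-uri-stem": "/robots.txt",
     "cs(User-Agent)": "FAST-WebCrawler/2.0.9"},
    {"c-ip": "194.237.174.119", "cs-uri-stem": "/issue2/default.asp",
     "cs(User-Agent)": "AltaVista-Intranet/V2.3A"},
    {"c-ip": "206.186.25.7", "cs-uri-stem": "/issue1/webtechs/Default.asp",
     "cs(User-Agent)": "Mozilla/2.0"},
]
humans, robots = split_robots(records)
```

The same pattern could be extended with a list of in-house IP addresses to exclude developer traffic.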
11 Fluctuations in Hits Requests
- In 2000 images are introduced across a web site (two images per text page). Result: the number of hits trebles, while the number of page requests remains constant
- In 2001 external JavaScript files are used to animate menus when they are selected. Result: the number of hits increases, while the number of page requests remains constant
- In 2002 internal style sheets are used to replace images of ... Result: the number of hits decreases, while the number of page requests remains constant
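The arithmetic behind the first example can be sketched in a few lines of Python. It assumes every embedded file is requested on every page view (no browser or proxy caching); the figure of 1000 page requests is hypothetical.

```python
def hits_for(page_requests, embedded_files_per_page):
    """Each page view costs one hit for the page plus one per embedded file."""
    return page_requests * (1 + embedded_files_per_page)


page_requests = 1000  # hypothetical, held constant throughout

text_only_hits = hits_for(page_requests, 0)    # pages with no embedded files
with_images_hits = hits_for(page_requests, 2)  # two images per text page
```

With two images per page, each page view produces three hits instead of one, so hits treble although the site is no busier in terms of pages viewed.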
12 How to access the results
- Who needs to see the processed reports?
- Do you need a private area on your website?
- Will you allow 3rd party access to reports, possibly to a reduced set of information?
- How much configuration and re-processing will you allow?
13 External Services
- Server usage can also be determined by third-party services
- www.nedstat.com
- www.sitemeter.com
- Non-commercial only - or you pay!
- No guarantee of service
- Includes client-side sniffing
14 Client-side sniffing
Some interesting statistics are not derivable from server logs:
- Browser plug-in status
- JavaScript support
- CSS support
Client sniffing code can be used to log this information. Browser cookies may allow us to detect individual user sessions.
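As a sketch of how cookies might be turned into session counts, the Python below groups requests by cookie value and starts a new session after a period of inactivity. The 30-minute timeout, the `count_sessions` helper and the sample cookie values are all assumptions for illustration.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # assumed inactivity threshold


def count_sessions(requests):
    """Count sessions in an iterable of (cookie_id, timestamp) pairs."""
    last_seen = {}
    sessions = 0
    for cookie_id, when in sorted(requests, key=lambda r: r[1]):
        previous = last_seen.get(cookie_id)
        # A new cookie, or a long gap since its last request, starts a session.
        if previous is None or when - previous > SESSION_TIMEOUT:
            sessions += 1
        last_seen[cookie_id] = when
    return sessions


# Hypothetical requests: cookie "a" returns after more than 30 minutes.
requests = [
    ("a", datetime(1999, 12, 25, 1, 49, 54)),
    ("a", datetime(1999, 12, 25, 1, 49, 56)),
    ("b", datetime(1999, 12, 25, 1, 50, 0)),
    ("a", datetime(1999, 12, 25, 3, 0, 0)),
]
session_count = count_sessions(requests)
```

Users who reject cookies are invisible to this approach, so the result is best treated as an estimate.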
15 Service Monitoring
We would also like to know that our service is available.
- Remote monitoring services
- Alerting or reporting?
- Does this overlap with your hosting SLA?
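A very simple availability check can be written with Python's standard library, as sketched below. The `check_service` helper is an assumption for illustration, and its `opener` parameter exists only so that the network call can be replaced when testing. A real monitoring service would also run from outside your own network, repeat the check on a schedule and record response times.

```python
import urllib.error
import urllib.request


def check_service(url, timeout=10, opener=urllib.request.urlopen):
    """Return True if `url` answers with a 2xx or 3xx status within `timeout`."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:
        # urllib.error.URLError and HTTPError are subclasses of OSError, so
        # this covers DNS failures, refused connections, timeouts and
        # 4xx/5xx responses.
        return False
```

Calling `check_service("http://www.ukoln.ac.uk")` from a scheduled job would give a basic up/down signal, which could then feed either an alert or a periodic report.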
16 Link monitoring
- How is the site ranked in search engines? Use url syntax in common search engines
- How many sites have linked to you? www.linkpopularity.com
17 Coverage By Search Engines
- Have you promoted your Web site?
- Can your Web site be accessed by search engines?
- Are you near the top of the search results?
- Search engines can report on their coverage of your Web site
- Coverage is an indication of potential use of your Web site
For information on how to ensure that your web site has been indexed, see <http://www.exploit-lib.org/issue4/promotion/>
18 Links To Your Site
www.linkpopularity.com
- Search engines can be used to report on the numbers of links to a Web site
- LinkPopularity.com provides an interface to 3 search engines
- Monthly reports can be obtained
- Links are an indication of potential use of your Web site
A survey of the number of links to University web sites is available at <http://www.ariadne.ac.uk/issue23/web-watch/>.
19 Links From Your Web Site
- Links from your Web site
- Usually implemented using <a href="http://foo.com/">Foo</a>
- Not normally possible to monitor numbers of users following link
- Is possible if using link of the form <a href="cgi-bin/monitor.pl?url=foo.com">Foo</a>
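The idea behind such a monitoring script is to record the requested target and then redirect the browser to it. The slide's example is a Perl CGI script, which is not reproduced here; the Python sketch below only illustrates the same logic, and the `handle_monitor_request` helper and in-memory `click_log` are assumptions.

```python
from urllib.parse import parse_qs


def handle_monitor_request(query_string, click_log):
    """Record the outbound link in `click_log` and return (status, headers)."""
    params = parse_qs(query_string)
    target = params.get("url", [""])[0]
    if not target:
        return "400 Bad Request", []
    if "://" not in target:
        # The example link passes a bare host name such as foo.com.
        target = "http://" + target
    click_log.append(target)
    # A 302 response sends the browser on to the real destination.
    return "302 Found", [("Location", target)]


click_log = []
status, headers = handle_monitor_request("url=foo.com", click_log)
```

In practice the target should be checked against a list of known destinations, since a script that redirects to any supplied address can be abused by third parties.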
20 Considerations
- What will we measure?
- How often will we produce reports?
- How will we handle our raw server logs?
- Will we be able to view the results over the web?
- Will we need different levels of reporting detail for different users? Technical / Executive / 3rd party?