Evaluating Web Server Log Analysis Tools


1
Evaluating Web Server Log Analysis Tools
  • David Strom
  • david_at_strom.com
  • SD98 2/13/98

2
Summary
  • Examine different log files
  • What you can and can't learn from your logs
  • Pros and cons of various tools

3
Different types of log files
  • Access
  • Error
  • Referral
  • Other

4
Access logs
  • Domain name
  • Date, time
  • Server command processed and result
  • URL of visitor
  • Bytes transmitted

5
Sample access log data
  • rm258.fav.usu.edu [31/May/1995:09:03:23 0600]
    "GET /NEI.html HTTP/1.0" 302 396
  • rm258.fav.usu.edu [31/May/1995:09:03:28 0600]
    "GET /xculture/nei/nei.html HTTP/1.0" 200 2114
  • rm258.fav.usu.edu [31/May/1995:09:03:30 0600]
    "GET /gifs/sedlbutton.gif HTTP/1.0" 200 1336
  • 129.71.83.161 [31/May/1995:09:20:32 0600]
    "GET /RELs.html HTTP/1.0" 304 0
  • Leslie-Francis.tenet.edu [31/May/1995:09:36:06 0600]
    "GET / HTTP/1.0" 200 1867
  • ls973.ulib.albany.edu [31/May/1995:09:40:52 0600]
    "GET /viii1.html HTTP/1.0" 404 244
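
A minimal Python sketch of how one such line could be split into the fields listed on the previous slide. It assumes the usual Common Log Format layout with a bracketed timestamp such as [31/May/1995:09:40:52 0600]; the names LINE_RE and parse_line are hypothetical, and the time zone offset is ignored.

```python
import re
from datetime import datetime

# Host, optional "ident authuser" pair, bracketed timestamp, quoted request,
# status code, and byte count ("-" when no body was sent).
LINE_RE = re.compile(
    r'^(?P<host>\S+)\s+(?:\S+\s+\S+\s+)?'
    r'\[(?P<when>[^\]]+)\]\s+'
    r'"(?P<request>[^"]*)"\s+'
    r'(?P<status>\d{3})\s+(?P<size>\d+|-)$'
)


def parse_line(line):
    """Return a dict of access-log fields, or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    method, _, rest = m["request"].partition(" ")
    path = rest.split(" ")[0] if rest else ""
    # Keep only the date and time; the zone offset after the space is dropped.
    stamp = m["when"].split()[0]
    return {
        "host": m["host"],
        # %b expects English month names, so this assumes an English/C locale.
        "time": datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S"),
        "method": method,
        "path": path,
        "status": int(m["status"]),
        "bytes": 0 if m["size"] == "-" else int(m["size"]),
    }
```

Calling parse_line on the last sample entry gives a record whose status is 404 and whose path is /viii1.html, which is the kind of entry a broken-link report would pick up.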

6
Errors reported in your logs
  • Clients that time out (or leave in frustration!)
  • Scripts that don't produce any output
  • Server bugs
  • User authentication or configuration problems

7
Sample error log data
  • [Thu May 30 07:25:32 1996] send timed out for
    bamberg.sedl.org
  • [Thu May 30 07:57:41 1996] send timed out for
    kenya.sedl.org
  • [Thu May 30 08:23:11 1996] send timed out for
    ppp092.kyoto-inet.or.jp
  • [Thu May 30 09:15:52 1996] access to
    /usr/local/www/htdocs/scimath/compass/vol03
    failed for 170.211.67.51, reason: File does not
    exist
  • [Thu May 30 09:57:56 1996] send timed out for
    dd10-048.compuserve.com
  • [Thu May 30 10:47:25 1996] read timed out for
    ncia110b.ncia.net
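
A rough Python sketch of how entries like these could be tallied by kind. It assumes each entry sits on one line and starts with a bracketed timestamp; ERROR_RE, classify, and summarize are hypothetical names, and the two message patterns are simply the ones visible in the sample.

```python
import re
from collections import Counter

# "[timestamp] message" as it appears at the start of each error-log entry.
ERROR_RE = re.compile(r'^\[(?P<when>[^\]]+)\]\s+(?P<message>.*)$')


def classify(message):
    """Bucket an error message using the phrases seen in the sample data."""
    if "timed out" in message:
        return "timeout"
    if "File does not exist" in message:
        return "missing file"
    return "other"


def summarize(lines):
    """Count error-log entries per bucket; lines that do not parse are skipped."""
    counts = Counter()
    for line in lines:
        m = ERROR_RE.match(line.strip())
        if m:
            counts[classify(m["message"])] += 1
    return counts
```

Run over the six sample entries, this would report five timeouts and one missing file.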

8
Referral logs
  • Who links to your site?
  • Who downloads your pages?

9
Sample referral log data
  • http://www.isisnet.com/ -> /change/welcome.html
  • http://www.ipl.org/ref/RR/EDU/Research-rr.html
    -> /welcome.html
  • http://www.tenet.edu/snp/main.html
    -> /policy/networks/toc.html
  • http://www.tenet.edu/new/main.html
    -> /policy/networks/toc.html
  • http://guide-p.infoseek.com/NS/Titles?qtteachertraining
    -> /resources/SCIMAST/announcement.html
  • http://www.tenet.edu/new/main.html
    -> /policy/networks/toc.html
  • http://www.tenet.edu/new/main.html
    -> /policy/networks/toc.html
  • http://www.nwrel.org/national/regional-labs.html
    -> /welcome.html
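
To answer "who links to your site?", a referral log can be reduced to a count per referring host. The Python sketch below assumes one entry per line in the form "referring URL -> local path"; parse_referral and top_referrers are hypothetical names.

```python
from collections import Counter
from urllib.parse import urlsplit


def parse_referral(line):
    """Split 'referrer -> target' into (referring host, local path), or None."""
    source, sep, target = line.strip().partition(" -> ")
    if not sep:
        return None
    return urlsplit(source).hostname, target.strip()


def top_referrers(lines):
    """Return (host, count) pairs, most frequent referring host first."""
    counts = Counter()
    for line in lines:
        parsed = parse_referral(line)
        # Skip lines that did not parse or whose referrer has no host part.
        if parsed and parsed[0]:
            counts[parsed[0]] += 1
    return counts.most_common()
```

On the sample data, www.tenet.edu would come out on top, since it accounts for four of the eight entries.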

10
Common log format
  • Output by most standard servers
  • Needed by most third-party log analyzers
  • hoohoo.ncsa.uiuc.edu/docs/setup/httpd/Overview.html

11
Extended/custom log formats
  • Log whatever you wish in whatever order you wish
  • Useful if you will read them regularly!
  • But can't work with the analyzers
  • Now in IIS v4, NSCP v3, others.

12
What you can learn from your log files
  • Hits per day
  • Domain origins
  • The path people take in and around your web
  • Problem areas

13
HITS
  • (How Idiots Track Success)
  • Nobody uses this word anymore
  • Doesn't really measure individual users, just
    access
  • Caching servers and proxies mess up these
    statistics
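
One way to see the gap between hits and visitors is to count both per day. The Python sketch below assumes access-log lines with a bracketed timestamp such as [31/May/1995:09:03:23 0600]; HOST_DAY_RE and daily_counts are hypothetical names. Unique hosts is still only an approximation of unique users, for the proxy and caching reasons given above.

```python
import re
from collections import defaultdict

# Leading host name, then the day part of the bracketed timestamp.
HOST_DAY_RE = re.compile(r'^(?P<host>\S+).*?\[(?P<day>\d{2}/\w{3}/\d{4}):')


def daily_counts(lines):
    """Map each day to a (hits, unique hosts) pair."""
    hits = defaultdict(int)
    hosts = defaultdict(set)
    for line in lines:
        m = HOST_DAY_RE.match(line)
        if m:
            hits[m["day"]] += 1
            hosts[m["day"]].add(m["host"].lower())
    return {day: (hits[day], len(hosts[day])) for day in hits}
```

A single page view that also pulls in a few images shows up as several hits from one host, so the first number grows much faster than the second.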

14
Domain origins
  • Where users are coming from -- sometimes
  • Just because they are from ibm.net doesn't mean
    they work at IBM!
  • Forgotten accounts, friends and family using the
    account
  • Hacked user names
  • Proxies don't help here either
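
A small Python sketch of a domain-origin tally, with the same caveats as the bullets above. The names origin and count_origins are hypothetical. Taking the last two labels of the host name is a deliberately naive rule: it works for names like usu.edu but reduces country-code names such as kyoto-inet.or.jp to or.jp, and hosts logged as bare IP addresses cannot be attributed at all.

```python
import ipaddress
from collections import Counter


def origin(host):
    """Return a rough domain of origin for a logged host name."""
    try:
        ipaddress.ip_address(host)
        return "(unresolved IP)"
    except ValueError:
        pass  # not an IP address, so treat it as a DNS name
    labels = host.lower().rstrip(".").split(".")
    # Naive rule: keep the last two labels (for example "usu.edu").
    return ".".join(labels[-2:]) if len(labels) >= 2 else host.lower()


def count_origins(hosts):
    """Count requests per rough domain of origin."""
    return Counter(origin(h) for h in hosts)
```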

15
The path people take in and around your web
  • Search engines help sometimes
  • Which search site was the most popular front door
  • Who links to you and why
  • Is there a pattern or a random walk?

16
Problem areas to deal with
  • Broken links (locally)
  • Broken outbound links
  • Time outs (sunspots?)
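
Locally broken links show up in the access log as 404 responses, so they can be listed with a short script. This Python sketch assumes the quoted request is followed by the status code and byte count, as in Common Log Format; REQUEST_RE and broken_paths are hypothetical names. Broken outbound links do not appear in your own logs and would need a separate link checker.

```python
import re
from collections import Counter

# Quoted request ("METHOD path protocol") followed by the status code.
REQUEST_RE = re.compile(r'"(?:\S+)\s+(?P<path>\S+)[^"]*"\s+(?P<status>\d{3})\s')


def broken_paths(lines):
    """Return (path, count) pairs for requests answered with 404, worst first."""
    missing = Counter()
    for line in lines:
        m = REQUEST_RE.search(line)
        if m and m["status"] == "404":
            missing[m["path"]] += 1
    return missing.most_common()
```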

17
What you can't learn from your logs
  • Who are these people, anyway?
  • No specific user names
  • Is it a bot or a real human?
  • How long did they view a page?
  • Most people don't spend much time on your web
  • Where did they go visit next?

18
What technologies are available?
  • Built-in analyzer tools
  • Sites that capture user info
  • Secure sites with registration
  • Build your own from perl
  • Third-party tools

19
Built-in tools
  • WebSite, website.ora.com
  • IIS with Site Server, www.microsoft.com/iis
  • Netscape servers, www.netscape.com
  • Easy to use but limited

20
WebSite Professional v2
  • Win NT, 95
  • Best web server for learning about logs, best
    docs
  • QuickStats module for instant analysis
  • single report but nice set of information
  • shows today, last two days requests and unique
    hosts
  • IP addresses of visitors, average requests/hour

21
IIS Site Server
  • NT Server v4 w/SP3 only
  • Lots of preconfigured reports
  • Two versions, Express and Full (customized
    reports)
  • backoffice.microsoft.com/products/siteserver/express/

22
Netscape v3 web servers
  • Various NT, Unix versions
  • Reports for a few variables but nothing too
    extensive
  • Best to use a third-party tool here

23
Sites that capture user info
  • WebCounter, www.digits.com -- third-party hit
    counter
  • Someone else does the programming and debugging
  • But beyond your control

24
Secure sites with registration
  • You know your users
  • But many won't register, or forget their
    passwords
  • Requires scripting, database integration, more
    maintenance

25
Build your own from perl
  • Needs some in-house support
  • Works best with Unix-based webs
  • Examples
  • refstats, members.aol.com/htmlguru/refstats.html
  • surfreport, bienlogic.com/SurfReport/

26
Third-party tools
  • WebTracker, www.CQMInc.com/webtrack
  • WebTrends, www.webtrends.com
  • net.Genesis, www.netgen.com
  • MarketWave, www.marketwave.com
  • IIS Assistant, www.go-iis.com

27
Third-party tools (cont)
  • Can make very pretty reports
  • Customizable
  • Make sure they support your particular log format
  • Not that expensive, mostly run on Windows