Guide to the Clickstream Data - PowerPoint PPT Presentation

About This Presentation
Slides: 9
Provided by: kuns6
Transcript and Presenter's Notes

1
Guide to the Clickstream Data
  • Petr Berka
  • University of Economics, Prague
  • berka_at_vse.cz

2
Web Usage Mining Domain
  • click-stream - a sequential series of page view
    requests (a page view is what the user's browser
    displays at one time),
  • server session - the click-stream of page views for
    a single user on a particular web site,
  • user session - the click-stream of page views
    for a single user across the entire web.
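
As a rough illustration of the server session definition, the sketch below groups page views by session ID and orders them by time. The tuple layout (timestamp, session ID, page) and the function name are assumptions made for this example, not part of the original data description.

```python
from collections import defaultdict


def build_server_sessions(page_views):
    """Group (timestamp, session_id, page) tuples into per-session click-streams.

    The tuple layout is a hypothetical one chosen for this sketch.
    """
    sessions = defaultdict(list)
    # Sort by timestamp so each click-stream keeps the order of the requests.
    for timestamp, session_id, page in sorted(page_views, key=lambda v: v[0]):
        sessions[session_id].append(page)
    return dict(sessions)


# Toy input: two sessions whose requests arrive interleaved and out of order.
page_views = [
    (3, "s1", "/sb/"),
    (1, "s1", "/dp/"),
    (2, "s2", "/"),
]
click_streams = build_server_sessions(page_views)
```

A user session in the sense above would need the same grouping across the logs of many sites, which a single shop's server log cannot provide.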

3
The Clickstream Data
  • Over 3 million records (24 days) from the web
    server log of a www shop
  • Contains information about time, IP address,
    session ID, page request, and referer
  • There are hundreds of thousands of sessions, most
    of them very short; sessions longer than one page
    average 16 pages
  • Each page request in this www shop has the same
    structure: page type / content ID (product ID)
  • Page types are, for example, dp (detail of
    product), sb (shopping basket), ct (contact)
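
The page type / content ID structure can be pulled apart with the standard library. This is a minimal sketch that assumes requests look like /dp/?id=124, with the content ID carried in an `id` query parameter; the function name is hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def parse_page_request(request):
    """Split a request such as '/dp/?id=124' into (page_type, content_id).

    Assumes the page type is the path and the content ID is the 'id'
    query parameter; either part is None when it is absent.
    """
    parts = urlsplit(request)
    page_type = parts.path.strip("/") or None  # the home page '/' has no type
    ids = parse_qs(parts.query).get("id")
    content_id = ids[0] if ids else None
    return page_type, content_id
```

For example, a product detail request yields the pair ("dp", "124"), while a shopping basket request such as /sb/ has a page type but no content ID.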

4
Example of the Data
unix time   IP address      session ID                        page request  referer
1074589200  193.179.144.2   1993441e8a0a4d7a4407ed9554b64ed1  /dp/?id=124   www.google.cz
1074589201  194.213.35.234  3995b2c0599f1782e2b40582823b1c94  /dp/?id=182
1074589202  194.138.39.56   2fd3213f2edaf82b27562d28a2a747aa  /             www.seznam.cz
1074589233  193.179.144.2   1993441e8a0a4d7a4407ed9554b64ed1  /dp/?id=148   /dp/?id=124
1074589245  193.179.144.2   1993441e8a0a4d7a4407ed9554b64ed1  /sb/          /dp/?id=148
1074589248  194.138.39.56   2fd3213f2edaf82b27562d28a2a747aa  /contacts/    /
1074589290  193.179.144.2   1993441e8a0a4d7a4407ed9554b64ed1  /sb/          /sb/

5
Data Description
  • table obchod (shop) - name of the internet shop
    (7 entries),
  • table kategorie (category) - info about
    category of products (64 entries),
  • table list (sheet) - info about a specific
    product of a more detailed type (157 entries),
  • table znacka (brand) - name of the producer or
    brand of a product (197 entries),
  • table tema (theme) - info about themes
    discussed in the on-line advice (36 entries)

6
Data Summary (1/3)
  • 3 617 171 page requests
  • 522 410 sessions
  • 318 523 single-page sessions
  • 203 887 sessions with length > 1
  • avg. length 16
  • median 8
  • mode 2
  • longest 15 454
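
The counts above suggest that the average is taken over sessions longer than one page: (3 617 171 - 318 523) / 203 887 is about 16.2, whereas dividing all requests by all sessions gives about 6.9. Under that assumption, a summary of this kind could be computed as sketched below; the function and key names are hypothetical.

```python
from collections import Counter
from statistics import mean, median, mode


def session_length_summary(session_ids):
    """Summarise session lengths from one session ID per page request.

    Assumes average, median, mode and longest describe only the
    sessions with more than one page request.
    """
    lengths = Counter(session_ids)  # session ID -> number of page requests
    multi = [n for n in lengths.values() if n > 1]
    summary = {
        "page_requests": sum(lengths.values()),
        "sessions": len(lengths),
        "single_page": sum(1 for n in lengths.values() if n == 1),
        "longer_than_one": len(multi),
    }
    if multi:  # the statistics functions raise an error on empty input
        summary.update(
            avg=mean(multi), median=median(multi),
            mode=mode(multi), longest=max(multi),
        )
    return summary


# Toy input: sessions a, b, c, d with 1, 2, 3 and 2 page requests.
summary = session_length_summary(["a", "b", "b", "c", "c", "c", "d", "d"])
```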

7
Data Summary (2/3)
  • time spent during a session (hh:mm:ss)
  • avg. time 00:24:46
  • median 00:03:08
  • mode 00:00:09
  • longest 433:27:53
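
One way to obtain such figures is sketched here, assuming the time spent in a session is the gap between its first and last request (the time spent on the last page is not visible in a server log) and that the durations are reported as hours, minutes, and seconds. Both function names are hypothetical.

```python
def session_durations(records):
    """Map each session ID to last-minus-first request time in seconds.

    records: iterable of (unix_time, session_id) pairs, in any order.
    """
    first, last = {}, {}
    for unix_time, session_id in records:
        first[session_id] = min(unix_time, first.get(session_id, unix_time))
        last[session_id] = max(unix_time, last.get(session_id, unix_time))
    return {sid: last[sid] - first[sid] for sid in first}


def format_duration(seconds):
    """Render a number of seconds as hh:mm:ss (hours may exceed two digits)."""
    hours, rest = divmod(int(seconds), 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


# Timestamps taken from the example data; "A" and "B" stand in for the
# two multi-request session IDs shown there.
records = [
    (1074589200, "A"), (1074589202, "B"), (1074589233, "A"),
    (1074589245, "A"), (1074589248, "B"), (1074589290, "A"),
]
durations = session_durations(records)
```

Single-page sessions get a duration of zero under this definition, so whether they are included changes the average considerably.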

8
Data Summary (3/3)
[Figure: distribution of sessions with length > 1]