Visualization of the Web Access Popularity

1
Visualization of the Popularity of the Web Access for Ping Wales
Xiaochuan Huang (George)
Supervised by Dr Markus Roggenbach
Department of Computer Science, University of Wales Swansea
Nov. 2005 at Gregynog
2
Overview
  • A Regular Website Report
  • Specification
  • Technologies Involved
  • A First Approach

3
1. A Regular Website Report
  • What the project is about
  • Our customer, Ping Media Ltd; the website, Ping
    Wales
  • What they need and the technical infrastructure

4
1. A Regular Website Report
  • What the project is about
  • Introducing similar tools
  • Log file analyzers: AWStats and Analog 6.0
  • Graphic statistics generated by AWStats and
    Analog

5
1. A Regular Website Report
6
1. A Regular Website Report
  • What the project is about
  • Our customer, Ping Media Ltd; the website, Ping
    Wales
  • What they need and the technical infrastructure
  • Introducing similar tools
  • Log file analyzers: AWStats and Analog 6.0
  • Graphic statistics generated by AWStats and
    Analog
  • Why this application is necessary
  • Customer's needs
  • The shortage of existing applications
  • Extendable project

7
2. Specification
  • Components
  • The filter/parser
  • The analyzer
  • Two databases
  • Visualization
  • Going through the processes
  • Take daily log file -> parse with DB1 -> output
    filtered result -> write result into DB2
  • Given a specified duration -> access DB2 ->
    generate the records -> output a visualized
    report

8
3. Technologies Involved
  • The Apache log files
  • Introduction

9
3. Technologies Involved
  • The Apache log files
  • Introduction
  • Format: "%h %l %u %t \"%r\" %>s %b
    \"%{Referer}i\" \"%{User-agent}i\"" combined
  • Example: 220.244.224.104 - - [12/Jan/2005:00:12:38
    +0000] "GET /hardware/toshiba-small-80gb-hdd.html
    HTTP/1.0" 200 11020
    "http://www.pingwales.co.uk/business/apple-keynote.html"
    "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3)
    Gecko/20041204 Epiphany/1.4.4"

10
The Apache log files
  • Introduction
  • Format: "%h %l %u %t \"%r\" %>s %b
    \"%{Referer}i\" \"%{User-agent}i\"" combined
  • Example: 220.244.224.104 - - [12/Jan/2005:00:12:38
    +0000] "GET /hardware/toshiba-small-80gb-hdd.html
    HTTP/1.0" 200 11020
    "http://www.pingwales.co.uk/business/apple-keynote.html"
    "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3)
    Gecko/20041204 Epiphany/1.4.4"
  • Log string analysis
  • (%h) 220.244.224.104: the IP address of the
    client
  • (%l) The RFC 1413 identity of the client
  • (%u) The userid of the requesting person
  • (%t) [12/Jan/2005:00:12:38 +0000]: the request
    time
  • (\"%r\") "GET /hardware/toshiba-small-80gb-hdd.html
    HTTP/1.0": method, requested page, client
    protocol
  • (%>s) 200: the status code
  • (%b) 11020: the size of the object returned to
    the client
  • (\"%{Referer}i\") the site that the client
    reports having been referred from
  • (\"%{User-agent}i\") identifying information of
    the client browser
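
The field breakdown above could be turned into a small parser. What follows is only a minimal sketch in Ruby, assuming the standard Apache combined log format; the names COMBINED, parse_hit and the hash keys are illustrative and do not come from the slides, and real logs may need extra care (for example, escaped quotes inside the user-agent string).

```ruby
require "time"

# Pattern for one line of the Apache "combined" log format (extended mode,
# so whitespace inside the pattern is ignored and \s matches the separators).
COMBINED = /\A
  (?<host>\S+)\s(?<ident>\S+)\s(?<user>\S+)\s
  \[(?<time>[^\]]+)\]\s
  "(?<request>[^"]*)"\s
  (?<status>\d{3})\s(?<bytes>\d+|-)\s
  "(?<referer>[^"]*)"\s"(?<agent>[^"]*)"
\z/x

# Returns a hash of fields for one hit, or nil if the line does not match.
def parse_hit(line)
  m = COMBINED.match(line.chomp)
  return nil unless m

  method, path, protocol = m[:request].split(" ", 3)
  {
    ip: m[:host],
    time: Time.strptime(m[:time], "%d/%b/%Y:%H:%M:%S %z"),
    method: method,
    path: path,
    protocol: protocol,
    status: m[:status].to_i,
    bytes: m[:bytes] == "-" ? 0 : m[:bytes].to_i,
    referer: m[:referer],
    agent: m[:agent]
  }
end

# The example hit from the slide.
SAMPLE = '220.244.224.104 - - [12/Jan/2005:00:12:38 +0000] "GET /hardware/toshiba-small-80gb-hdd.html HTTP/1.0" 200 11020 "http://www.pingwales.co.uk/business/apple-keynote.html" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20041204 Epiphany/1.4.4"'

hit = parse_hit(SAMPLE)
puts hit[:ip], hit[:path], hit[:status]
```

Returning nil for lines that do not match keeps the caller in control of how malformed hits are skipped or reported.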

11
3. Technologies Involved
  • The Apache log files
  • Programming language: Ruby
  • An interpreted scripting language for quick and
    easy object-oriented programming

% cd sample
% ruby eval.rb
ruby> a = "Hello, world!"
   "Hello, world!"
ruby> puts a
Hello, world!
   nil
ruby> ^D
% ruby
puts "Hello, world!"
^D
Hello, world!
12
3. Technologies Involved
  • The Apache log files
  • Programming language: Ruby
  • Database access
  • MySQL
  • The two databases
  • Access DB with Ruby

13
4. A First Approach
  • load the daily log file
  • Parsing/Filtering
  • while not end of file
  • read hit, line by line
  • for each hit, getIP(%h), getTime(%t),
    getReq(\"%r\"), getSt(%>s)
  • Check if even(first( getSt() )), then go
    through the articles database looking for
    getIP()
  • if it is found, write the hit to database 2,
    read next
  • go to next hit
  • Analyzing
  • Specify StartingTime and EndTime, build an
    array/stack myArray
  • Read through records from database 2, for those
    within the specified time
  • for each hit,
  • if getIP() is in myArray, then counter + 1
  • otherwise, write this hit to myArray and
    initialize its counter
  • Sort myArray according to the counter of each
    element
  • Write out the top N results to a file, for
    visualizing
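
The two steps above can be sketched in Ruby roughly as follows. This is an illustration under stated assumptions, not the project's implementation: plain in-memory arrays stand in for the two databases, all data are made up apart from the first hit, and the names ARTICLES, HITS, filter_hits and top_n are hypothetical. The slide looks up and counts getIP(); the sketch assumes the requested page is the lookup key for the articles database and the default counting key, with the key: option allowing a count per IP instead.

```ruby
# Made-up stand-in for Database 1 (the known article pages).
ARTICLES = [
  "/hardware/toshiba-small-80gb-hdd.html",
  "/business/apple-keynote.html"
].freeze

# Filtering: keep a hit when the first digit of its status code is even
# (as on the slide; this admits 2xx and also 4xx) and its page is a known article.
def filter_hits(hits, articles)
  hits.select do |hit|
    hit[:status].to_s[0].to_i.even? && articles.include?(hit[:path])
  end
end

# Analyzing: count hits per key inside the period and return the top n
# as [key, count] pairs, highest count first (ties broken by key).
def top_n(records, start_time, end_time, n, key: :path)
  counts = Hash.new(0) # plays the role of myArray plus its counters
  records.each do |hit|
    next unless hit[:time].between?(start_time, end_time)
    counts[hit[key]] += 1
  end
  counts.sort_by { |k, count| [-count, k] }.first(n)
end

# Made-up sample hits; only the first mirrors the slide's example line.
HITS = [
  { ip: "220.244.224.104", time: Time.utc(2005, 1, 12, 0, 12, 38), path: "/hardware/toshiba-small-80gb-hdd.html", status: 200 },
  { ip: "10.0.0.1", time: Time.utc(2005, 1, 12, 8, 0, 0), path: "/business/apple-keynote.html", status: 200 },
  { ip: "10.0.0.2", time: Time.utc(2005, 1, 12, 9, 0, 0), path: "/business/apple-keynote.html", status: 200 },
  { ip: "10.0.0.2", time: Time.utc(2005, 1, 12, 9, 5, 0), path: "/favicon.ico", status: 200 },
  { ip: "10.0.0.3", time: Time.utc(2005, 1, 12, 9, 10, 0), path: "/business/apple-keynote.html", status: 500 },
  { ip: "10.0.0.3", time: Time.utc(2005, 1, 20, 9, 10, 0), path: "/business/apple-keynote.html", status: 200 }
].freeze

records = filter_hits(HITS, ARTICLES) # what would be written to database 2
report = top_n(records, Time.utc(2005, 1, 12), Time.utc(2005, 1, 13), 10)
report.each { |page, count| puts "#{count}\t#{page}" }
```

A hash with a default of 0 replaces the explicit "is it already in myArray" check, which keeps the counting loop to one line per hit.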

14
  • Water flow model
  • Take daily log file -> parse with DB1 -> output
    filtered result -> write result into DB2
  • Given a specified duration -> access DB2 ->
    generate the records -> output a visualized
    report

[Diagram labels: Daily Log File, Filter, Database 1 <webpage add DB>, Database 2 <page visits records>, Period entry, Analyzer, Records, Visualization Tool, Graphic Report]
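
As a rough idea of what the last stage might produce, here is a minimal Ruby sketch that renders page counts as a plain-text bar chart. The slides do not say which visualization tool or output format is used, so the function name bar_chart, the width parameter and the sample counts are all assumptions made for illustration.

```ruby
# Render [label, count] pairs as a plain-text horizontal bar chart.
# The longest bar is `width` characters; the others are scaled to it.
def bar_chart(pairs, width: 40)
  return "" if pairs.empty?

  max = pairs.map { |_, count| count }.max
  label_width = pairs.map { |label, _| label.length }.max
  pairs.map do |label, count|
    bar = "#" * (max.zero? ? 0 : (count * width.to_f / max).round)
    format("%-#{label_width}s | %s %d", label, bar, count)
  end.join("\n")
end

# Made-up counts, for illustration only.
puts bar_chart(
  [["/business/apple-keynote.html", 4],
   ["/hardware/toshiba-small-80gb-hdd.html", 2]],
  width: 20
)
```

Text output keeps the sketch free of external libraries; a graphical report would replace only this function and leave the analyzer untouched.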
15
Summary
  • What I have done so far
  • What I am planning to do next

16
End
  • Hey, wake up, there he ends!! LOL
  • George, 21/11/2005 at Gregynog