Graphing to visualize data - PowerPoint PPT Presentation

1
Graphing to visualize data
  • Satish Raghunath
  • rsatish_at_alum.rpi.edu
  • Shiv Kalyanaraman
  • Google: "Shiv RPI"
  • shivkuma_at_ecse.rpi.edu
  • http://www.ecse.rpi.edu/Homepages/shivkuma

2
Overview
  • Issues with graphing
  • Types of graphs
  • Examples of graph usage & what you get out of
    them
  • Art: how to choose what graph to use?
  • Graphing Tools
  • Pitfalls and mistakes in graphing
  • Advanced visualization
  • In-class work: reviewing graphing use in selected
    technical papers

3
Thoughts on Presentation Styles
  • Primary purpose: illustrate to help understand

"The goal of simulation is intuition, not
numbers." - R.W. Hamming
  • Corollary: don't dump data on the reader.
  • Distill it into presentations that give insight
    instead

4
Descriptive Statistics
  • Involves
  • Collecting Data
  • Presenting Data
  • Characterizing Data
  • Understanding data: distill insights!


[Figure: bar chart of values for Q1 to Q4, vertical axis 0 to 50]
Statistics obtained from data: X̄ = 30.5, S² = 113
Insights: somewhat skewed bell shape; perhaps a
Poisson distribution would fit?
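A minimal sketch of how summary statistics such as the sample mean and sample variance can be obtained with Python's standard library. The data values here are hypothetical; the slide does not give the raw data behind its figures.

```python
import statistics

# Hypothetical sample; the slide's underlying data is not given.
sample = [10, 20, 30, 40]

sample_mean = statistics.mean(sample)          # X-bar
sample_variance = statistics.variance(sample)  # S^2, uses the n - 1 denominator
sample_median = statistics.median(sample)

print(f"mean={sample_mean} variance={sample_variance:.2f} median={sample_median}")
```

Comparing the mean with the median is a quick first check for skew before choosing a distribution to fit.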
5
To graph or not to graph
  • Use graphs when:
  • Trends in data are not obvious
  • It is hard to explain the X-Y relationship in
    words
  • Consider tables if:
  • The number of data points is small
  • The reader might find the exact values of data
    points useful

6
Summary Table Frequencies
  • 1. Lists Categories & No. of Elements in Category
  • 2. Obtained by Tallying Responses in Category
  • 3. May Show Frequencies (Counts), Percentages, or Both

Major (row is category)   Count (tally)
Accounting                130
Economics                 20
Management                50
Total                     200
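The tallying step can be sketched as follows. The list of raw responses is hypothetical, constructed only so that it reproduces the counts in the table above.

```python
from collections import Counter

# Hypothetical raw responses chosen to reproduce the counts on the slide.
responses = ["Accounting"] * 130 + ["Economics"] * 20 + ["Management"] * 50

counts = Counter(responses)      # tally responses per category
total = sum(counts.values())

print(f"{'Major':<12}{'Count':>6}{'Percent':>9}")
for major, count in counts.items():
    print(f"{major:<12}{count:>6}{100 * count / total:>8.1f}%")
print(f"{'Total':<12}{total:>6}{100.0:>8.1f}%")
```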
7
Example Tables from Networking
8
What kind of graph?
  • Pie-charts to depict fraction of a whole
  • Bar-charts when data-points are few and a table is
    not suitable
  • Line-plots when there are a lot of data-points
  • Box-plots if statistical inference is drawn:
    shows 1st, 2nd, 3rd quartile for each point
  • Scatter-plots, 3-d plots only if necessary:
    AVOID complex graphs

9
Pie Chart
  • 1. Shows Breakdown of Quantity into Categories
  • 2. Useful for Showing Relative Differences
  • 3. Angle Size = (360°) x (Percent)

[Figure: pie chart of Majors: Acct. 65%, Mgmt. 25%, Econ. 10% (36°)]
Example: (360°) (10%) = 36°
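The angle rule can be checked with a short sketch. The function name `pie_angles` is made up for illustration; the counts are the ones from the majors table earlier in the deck.

```python
def pie_angles(counts):
    """Return the slice angle in degrees for each category: 360 * share."""
    total = sum(counts.values())
    return {name: 360 * value / total for name, value in counts.items()}

majors = {"Acct.": 130, "Econ.": 20, "Mgmt.": 50}
angles = pie_angles(majors)
for name, angle in angles.items():
    print(f"{name}: {angle:.0f} degrees")
```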
10
Pie Chart: Networking Example
Source: http://www.caida.org/bhuffake/papers/skitviz/
11
Another example: VPN Classification
12
Bar Chart
  • Horizontal Bars for Categorical Variables
  • Bar Length Shows Frequency or Percent
  • Equal Bar Widths
  • 1/2 to 1 Bar Width
  • Zero Point
  • Percent Used Also

[Figure: horizontal bar chart of Major (Mgmt., Econ., Acct.) against Frequency, axis 0 to 150]
13
Networking Example: Bar Chart
14
Example Analysis with Bar Charts
  • LT-TCP is able to:
  • reduce timeouts drastically
  • keep the queue non-empty, maximizing throughput
    and capacity utilization
  • minimize use of FEC to the level needed

15
Histogram for distributions
Class          Freq.
15 but < 25    3
25 but < 35    5
35 but < 45    2

[Figure: histogram; vertical axis Count, 0 to 5 (Frequency, Relative Frequency, or Percent); horizontal axis marks the lower boundary of each class: 0, 15, 25, 35, 45, 55; bars touch]
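A minimal sketch of how raw values are grouped into classes of the form "lower but < upper". The raw data and the helper name `class_frequencies` are hypothetical; the data were chosen only so that the frequencies match the table above (3, 5, 2).

```python
def class_frequencies(values, boundaries):
    """Count values in each class [lower, upper), given sorted class boundaries."""
    counts = [0] * (len(boundaries) - 1)
    for v in values:
        for i in range(len(counts)):
            if boundaries[i] <= v < boundaries[i + 1]:
                counts[i] += 1
                break
    return counts

# Hypothetical raw data chosen to give the slide's frequencies.
data = [17, 21, 24, 26, 27, 29, 30, 32, 38, 41]
boundaries = [15, 25, 35, 45]
freq = class_frequencies(data, boundaries)
rel_freq = [f / len(data) for f in freq]

# Text rendering of the histogram, one row per class.
for lower, upper, f in zip(boundaries, boundaries[1:], freq):
    print(f"{lower} but < {upper}: {'#' * f} ({f})")
```

Dividing each count by the number of observations gives the relative frequency, which is what the alternative vertical-axis labels refer to.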
16
Recall: Real Example Histogram
  • What is the fairness between TCP goodputs when we
    use different queuing policies?
  • What is the confidence interval around your
    estimates of mean file size?
  • Note: a distribution need not be just a
    probability/frequency distribution

17
Dot Chart or Scatterplots
  • Like Horizontal Bar Chart
  • Horizontal Lines for Categorical Variables
  • Line Length Shows Frequency or Percent
  • Equal Spacing
  • Zero Point
  • Percent Used Also

[Figure: dot chart of Major (Mgmt., Econ., Acct.) against Frequency, axis 0 to 150]
18
Scatter Plots
19
Scatter plots with trends
20
WiFi Analysis: Scatter Plots
  • http://www.sigcomm.org/sigcomm2004/papers/p442-aguayo1111.pdf

21
Line Charts: Example: Comparative Performance
Note: also plots confidence intervals!
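One way the error bars on such a plot might be computed is sketched below. It uses a normal approximation for the interval; with only a handful of runs a Student-t quantile would be more appropriate, but the standard library does not provide one. The throughput values and the function name are hypothetical.

```python
import math
import statistics

def mean_confidence_interval(samples, confidence=0.95):
    """Return (mean, half_width) of a normal-approximation confidence interval."""
    mean = statistics.mean(samples)
    std_err = statistics.stdev(samples) / math.sqrt(len(samples))
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, z * std_err

# Hypothetical throughput measurements (Mb/s) from repeated simulation runs.
throughput_runs = [9.1, 9.4, 8.8, 9.0, 9.3, 9.2]
ci_mean, ci_half_width = mean_confidence_interval(throughput_runs)
print(f"{ci_mean:.2f} +/- {ci_half_width:.2f} Mb/s")
```

Each plotted point would then carry an error bar from mean - half_width to mean + half_width.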
22
Line Plots for Distributions: Example
  • Hop count and RTT distributions

Source: http://www.caida.org/bhuffake/papers/skitviz/
23
Recall: Distribution Shape
  • 1. Describes How Data Are Distributed
  • 2. Measures of Shape
  • Skew, Symmetry

[Figure: three distribution curves]
  • Left-Skewed: Mean < Median < Mode
  • Symmetric: Mean = Median = Mode
  • Right-Skewed: Mode < Median < Mean
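The ordering of mean and median gives a rough skew check, sketched below. This is only a heuristic for well-behaved unimodal data, and the function name and sample values are made up for illustration.

```python
import statistics

def describe_shape(values):
    """Classify shape by comparing mean and median (a rough heuristic)."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "right-skewed"
    if mean < median:
        return "left-skewed"
    return "symmetric"

print(describe_shape([1, 2, 2, 3, 10]))  # long right tail pulls the mean up
print(describe_shape([1, 8, 9, 9, 10]))  # long left tail pulls the mean down
print(describe_shape([1, 2, 3, 4, 5]))   # mean equals median
```

With measured (floating-point) data, exact equality is unlikely, so a tolerance around zero would be a sensible refinement.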
24
Box Plot
  • Graphical Display of Data Using 5-Number Summary

[Figure: box plot marking X_smallest, Q1, Median, Q3, and X_largest on an axis from 4 to 12]
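A minimal sketch of computing the 5-number summary that a box plot draws. The data are hypothetical, and quartile conventions differ between tools; `method="inclusive"` in `statistics.quantiles` is just one common choice.

```python
import statistics

def five_number_summary(values):
    """Return (smallest, Q1, median, Q3, largest) for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)

# Hypothetical data spanning the same range as the axis in the figure.
data = [4, 5, 6, 7, 8, 9, 10, 11, 12]
summary = five_number_summary(data)
print(summary)
```

The box spans Q1 to Q3 with a line at the median, and the whiskers extend to the smallest and largest values.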
25
3D Graphs: Example
  • Illustrates a complex parameter response surface
    ...

26
3D Plots: N/w Example: Code Red Worm Analysis
  • http://www.prism.uvsq.fr/users/qst/Tomography/Articles_jmf/renesys_bgp_instabilities2001.pdf
  • http://www.caida.org/outreach/isma/0112/talks/andyo/index.pdf
  • http://www.renesys.com/resource_library/Renesys-NANOG23.pdf

27
Contd
28
Tools: Gnuplot
  • To use with data-generating programs for
    repetitive plotting
  • E.g., generate the plot of throughput for every
    1-hour interval in the last week.
  • http://www.gnuplot.info
  • TIP: Export gnuplot plots as .fig files and edit
    them in xfig for greater flexibility
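The repetitive-plotting idea can be sketched as a small program that generates a gnuplot script, one plot per 1-hour interval. The file names, the two-column data layout (time in seconds, throughput), and the function name are assumptions made for illustration; only basic gnuplot commands (`set terminal`, `set output`, `set xrange`, `plot ... using ... with lines`) are used.

```python
def gnuplot_script(data_file, hours):
    """Build a gnuplot script that writes one PNG per 1-hour interval.

    Assumes data_file has two whitespace-separated columns:
    time in seconds (column 1) and throughput (column 2).
    """
    lines = [
        "set terminal png",
        "set xlabel 'Time (s)'",
        "set ylabel 'Throughput'",
    ]
    for h in range(hours):
        start, end = h * 3600, (h + 1) * 3600
        lines.append(f"set output 'throughput_hour_{h:03d}.png'")
        lines.append(f"set xrange [{start}:{end}]")
        lines.append(f"plot '{data_file}' using 1:2 with lines title 'hour {h}'")
    return "\n".join(lines) + "\n"

# One week of hourly plots from a hypothetical data file.
script = gnuplot_script("throughput.dat", hours=7 * 24)
print(script.splitlines()[3])
```

The generated text could be saved to a file such as plots.gp and run with `gnuplot plots.gp`.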

29
Tools: XmGrace
  • For more intricate details (e.g., creating
    error-bars, different shades for bar-charts).
    GUI-driven, very user friendly.
  • http://plasma-gate.weizmann.ac.il/Grace/
  • Exports images to EPS (good for LaTeX documents),
    PNG (good for PowerPoint), etc.
  • Can also run on Windows on top of Cygwin!

30
Tools: MATLAB
  • For complex 3-d and other statistical plots like
    box-plots and scatter-plots, and in general if
    enormous quantities of data are involved.
  • http://www.mathworks.com

31
Tools: Excel Data Presentations
  • Open up Excel to a new Worksheet.
  • Code a data set as below:
  • Blue 34
  • White 68
  • Red 25
  • Green 50
  • Explore simple data presentation possibilities

32
Graphs: things to watch out for
  • Purpose: illustrate entire time-series or
    response distribution
  • Label the x- and y-axes
  • Check what units the x- and y-axes are in (not
    goats or sheep!)
  • Check if either scale is logarithmic (changes
    meaning)
  • Check where the origin (or zero point) is for
    each axis!
  • After understanding WHAT is being plotted, close
    your eyes and ask:
  • What will different patterns on this graph imply
    (relative to what I want to understand)?
  • See if the relative performance is over- or
    under-emphasized (if two systems are being
    compared)
  • Several examples in the Jain textbook

33
Errors in Presenting Data
  • 1. Using Chart Junk
  • 2. No Relative Basis in Comparing Data Batches
  • 3. Compressing the Vertical Axis
  • 4. No Zero Point on the Vertical Axis

34
Chart Junk
[Figure: two versions of a "Minimum Wage" chart]
  • Bad Presentation: labeled 1960 $1.00, 1970 $1.60, 1980 $3.10, 1990 $3.80
  • Good Presentation: vertical axis 0 to 4 ($), horizontal axis 1960, 1970, 1980, 1990
35
No Relative Basis
[Figure: two versions of an "A's by Class" chart for FR, SO, JR, SR]
  • Good Presentation: vertical axis in percent, 0 to 30
  • Bad Presentation: vertical axis in raw frequency (Freq.), 0 to 300
36
Compressing Vertical Axis
[Figure: two versions of a "Quarterly Sales" chart for Q1 to Q4]
  • Good Presentation: vertical axis 0 to 50
  • Bad Presentation: vertical axis 0 to 200
37
No Zero Point on Vertical Axis
[Figure: two versions of a "Monthly Sales" chart for J, M, M, J, S, N]
  • Good Presentation: vertical axis 0 to 60
  • Bad Presentation: vertical axis 36 to 45 (no zero point)
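A short sketch of why a missing zero point misleads: the ratio of drawn heights depends on where the axis starts. The two sales values and the function name are hypothetical; the axis start of 36 matches the bad chart's lowest tick.

```python
def apparent_ratio(a, b, axis_start=0.0):
    """Ratio of drawn heights for values a and b when the axis starts at axis_start."""
    return (a - axis_start) / (b - axis_start)

low, high = 39, 45                              # hypothetical monthly sales values
true_ratio = apparent_ratio(high, low)          # zero-based axis
drawn_ratio = apparent_ratio(high, low, 36)     # axis starting at 36

print(f"true ratio: {true_ratio:.2f}, drawn ratio: {drawn_ratio:.2f}")
```

A difference of about 15% is drawn as a threefold difference once the axis is truncated at 36.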
38
Graphing Practices: In pictures?
39
Graphing Practices
40
Graphing Practices
41
Graphing Practices
42
Checklist: In textbook
43
More Complex Visualizations
  • Internet topology aspects
  • CAIDA skitter project

http://www.caida.org/tools/measurement/skitter/visualizations.xml
44
More
45
The End