Title: Visualizing Data and Information
1Visualizing Data and Information
- Holly Nielsen
- Jason Pasinetti
2Content: Do I have the right information to think about this issue?
Credibility: Can I trust this information?
Design: Am I seeing the data or the design?
3Content Counts Most of All
How can I help someone think about this issue?
Are coal mines safer today than in the past?
Mine Safety and Health Administration
www.msha.gov/stats/centurystats/coalstats.asp
4Be Credible
- Include units and scale.
- Financial data should be in consistent currency and deflated. Use the Higher Education Price Index.
- Put your name on your work so people can judge your credibility and ask questions if you are not available when they read your analysis.
- Use large data sets.
- Source your data.
- Provide the whole data set.
- Graphical elements that represent data should change at the same rate as the data.
- Teach, don't pitch.
Visual Display of Quantitative Information,
Tufte, 1983
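As a minimal sketch of what "deflated" means in practice, the snippet below converts nominal dollars to constant dollars of a chosen base year. The function name and every index value are hypothetical placeholders for illustration; real values would come from a published price index such as the one named above.

```python
def to_constant_dollars(nominal_by_year, index_by_year, base_year):
    """Restate each year's nominal amount in base-year dollars.

    real = nominal * (index in base year / index in that year)
    """
    base_index = index_by_year[base_year]
    return {
        year: amount * base_index / index_by_year[year]
        for year, amount in nominal_by_year.items()
    }


# Hypothetical index values and budgets, for illustration only.
price_index = {2005: 100.0, 2006: 105.0, 2007: 110.0}
budget = {2005: 1000.0, 2006: 1050.0, 2007: 1100.0}

real_budget = to_constant_dollars(budget, price_index, base_year=2007)
for year in sorted(real_budget):
    print(year, round(real_budget[year], 2))
```

In this made-up example the nominal budget grows every year, but the constant-dollar budget is flat, which is the comparison a reader actually needs.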
5Design Counts
In order to take advantage of visual
understanding, the content needs to be arranged
in space instead of time. If we are using paper
or an electronic screen to display the
information, we are now limited to two
dimensions. Every drop of ink or pixel
counts. Design is about making choices. Every
element we add to the display interacts with and
competes with every other element. The choices we
make about how we display our content directly
affect the clarity and accessibility of that
content.
6Design Counts
Am I going to make it?
From Tufte's website, www.edwardtufte.com/tufte/
Original data: Hermann Brenner, "Long-term
survival rates of cancer patients achieved by the
end of the 20th century: a period analysis," The
Lancet, 360 (October 12, 2002), 1131-1135.
7Design Counts
From Tufte's website, www.edwardtufte.com/tufte/
Original data: Hermann Brenner, "Long-term
survival rates of cancer patients achieved by the
end of the 20th century: a period analysis," The
Lancet, 360 (October 12, 2002), 1131-1135.
8Support Analytical Thinking
- Compared to what?
- Show comparisons.
- Show causality.
- Show multivariate information (more than 1 or 2 variables).
- Add detail to clarify.
[Figure: Coal Mining Fatalities by State, 1996 - 2007]
Mine Safety and Health Administration
www.msha.gov/stats/charts/coalbystate.asp
9Do I need a graph?
If the data set is small, use text or a
well-organized table. Sort tables by some meaningful
field. Performance data should be presented in a
table like the sports page or the financial page
of most newspapers.
Percent of Top 10 Fatal Coal Mining Incident Causes by Location 1995 - 2007

Cause                  Surface  Underground  Both Locations
Powered Haulage             47           28              35
Machinery                   27           16              21
Roof Fall                    0           27              17
Electrical                   7           10               9
Fall of Roof                 0            8               5
Fall of Roof or Back         0            7               4
Fall of Highwall             7            1               3
Fall of Person               6            0               2
Slip/Fall of Person          5            1               2
Rib Fall                     0            3               2
Mine Safety and Health Administration
www.msha.gov/fatals/fabc.htm
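The advice to sort tables by a meaningful field can be sketched in a few lines. The rows below are copied from the table on this slide; the helper names `sort_rows` and `format_table` are hypothetical and only illustrate one way to re-sort and print a small table.

```python
HEADER = ("Cause", "Surface", "Underground", "Both Locations")
ROWS = [
    ("Powered Haulage", 47, 28, 35),
    ("Machinery", 27, 16, 21),
    ("Roof Fall", 0, 27, 17),
    ("Electrical", 7, 10, 9),
    ("Fall of Roof", 0, 8, 5),
    ("Fall of Roof or Back", 0, 7, 4),
    ("Fall of Highwall", 7, 1, 3),
    ("Fall of Person", 6, 0, 2),
    ("Slip/Fall of Person", 5, 1, 2),
    ("Rib Fall", 0, 3, 2),
]


def sort_rows(rows, column, descending=True):
    """Return rows ordered by the given column index (ties keep their order)."""
    return sorted(rows, key=lambda row: row[column], reverse=descending)


def format_table(header, rows):
    """Left-align the first column and right-align the numeric columns."""
    name_width = max(len(header[0]), max(len(row[0]) for row in rows))
    lines = [header[0].ljust(name_width) + "".join(h.rjust(16) for h in header[1:])]
    for row in rows:
        lines.append(row[0].ljust(name_width) + "".join(str(v).rjust(16) for v in row[1:]))
    return "\n".join(lines)


# Re-sort by the "Underground" column (index 2), largest share first.
underground_first = sort_rows(ROWS, column=2)
print(format_table(HEADER, underground_first))
```

Sorting by the column that matches the reader's question (here, underground incidents) puts the comparison they care about at the top of the table.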
10Time for a Graph
Fatal Coal Mining Incidents 1995 - 2007
Many people's design solution for large
multivariate data sets is to remove data. But
what if we need the detail to think about the
problem?
Mine Safety and Health Administration
www.msha.gov/fatals/fabc.htm
11Time for a Graph
12Minimize Non-data Ink, Maximize Data Density
2007 Fatal Coal mine Accident Alert 32, Author
Unknown
13Minimize Non-data Ink, Maximize Data Density
2007 Fatal Coal mine Accident Alert 32, Author
Unknown
14Minimize Non-data Ink to Maximize Data Density
[Figure: Coal Mining Fatalities, Coal Miners in Thousands, and Coal Mining Fatality Rate per 200,000 FTEs, shown for 1900 - 2007 and 1995 - 2007]
Mine Safety and Health Administration
www.msha.gov/stats/centurystats/coalstats.asp
15Minimize Non-data Ink to Maximize Data Density
- 1 + 1 = 3. Closely spaced lines activate the space between them. This causes a visual vibration that attracts the eye.
- Avoid hatching and grids. A point is formed anywhere two lines cross.
- All guide lines should be minimized if not eliminated.
- Avoid keys and legends. Do not make people decipher the graph. Label data directly.
- Labels in English should usually read horizontally regardless of the axis.
[Figure: Coal Mining Fatalities by State, 1996 - 2007]
Mine Safety and Health Administration
www.msha.gov/stats/charts/coalbystate.asp
16Aspect Ratio Spiky vs. Lumpy Data
Generally use an aspect ratio of 1 height to 1.5
width. Adjusting the aspect ratio can help
expose cycles in time series data. Ideally, the
slopes of the line segments should average about
45 degrees. Practically, make the data look more
lumpy than spiky.
Cleveland, W.S., 1994. The Elements of Graphing
Data, revised edition. Murray Hill, NJ: AT&T Bell
Laboratories, pp. 66-79.
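One way to put the 45-degree guideline into numbers is the median-absolute-slope heuristic sketched below. This is an assumption about how to operationalize the rule, not necessarily the exact procedure in the cited book, and the function name `bank_to_45` is hypothetical. It returns the height-to-width ratio at which the median line segment is drawn at 45 degrees.

```python
from statistics import median


def bank_to_45(xs, ys):
    """Return a height/width ratio that draws the median segment at 45 degrees.

    Each segment's slope is measured after scaling x and y to the unit
    square, so the result depends only on the shape of the data.
    """
    x_range = max(xs) - min(xs)
    y_range = max(ys) - min(ys)
    if x_range == 0 or y_range == 0:
        raise ValueError("data must vary in both x and y")

    slopes = []
    for i in range(len(xs) - 1):
        dx = (xs[i + 1] - xs[i]) / x_range
        dy = (ys[i + 1] - ys[i]) / y_range
        if dx != 0 and dy != 0:  # skip flat and vertical segments
            slopes.append(abs(dy / dx))
    if not slopes:
        raise ValueError("no sloped segments to bank")

    # A segment with unit-square slope s is drawn at slope s * (height / width).
    return 1.0 / median(slopes)


# A spiky zigzag (illustrative values) calls for a short, wide plot.
zigzag_ratio = bank_to_45([0, 1, 2, 3, 4], [0, 1, 0, 1, 0])
print(zigzag_ratio)
```

A ratio well below 1 means a wide, short plot, which is what turns spiky data into lumpy data.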
17Using Color
- Colors are a very strong visual element; use them sparingly and with care.
- Be cognizant of contrast and color interaction.
- Use colors found in nature, especially light blues, yellows, grays, greens, and tans.
- Use saturation to show increase in value. The increase in saturation should match the increase in the data.
- Use a very light border to separate interacting zones of colors.
- Avoid strong colors close to one another with white space.
- Avoid rainbow encoding.
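The rule that saturation should rise in step with the data can be sketched with Python's standard `colorsys` module. The hue and brightness defaults are arbitrary choices for illustration, and the function names are hypothetical. HSV saturation is not perceptually uniform, so treat this as a first approximation.

```python
import colorsys


def saturation_for(value, max_value):
    """Map a value to a saturation in [0, 1], linear in the data."""
    if max_value <= 0:
        raise ValueError("max_value must be positive")
    return min(max(value / max_value, 0.0), 1.0)


def hex_color(value, max_value, hue=0.58, brightness=0.9):
    """Return a '#rrggbb' color whose saturation tracks the value.

    hue=0.58 is a light blue; both defaults are illustrative assumptions.
    """
    r, g, b = colorsys.hsv_to_rgb(hue, saturation_for(value, max_value), brightness)
    return "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))


# Doubling the value doubles the saturation (illustrative values).
for value in (0, 25, 50, 100):
    print(value, saturation_for(value, 100), hex_color(value, 100))
```

A single hue with varying saturation gives the eye an ordered scale, which is what rainbow encoding fails to do.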
18Layering and Separation
Annotate, but use light guidelines. Show data
and information in parallel to invite comparison.
[Figure: Coal Mining Fatality Rate per 200,000 FTEs, annotated with Year of Sago Mine Disaster, Clinton Administration, and Bush Administration]
Mine Safety and Health Administration
www.msha.gov/stats/centurystats/coalstats.asp
19Use Area and Volume Cautiously
- Do not depict more dimensions than there are variables in your data.
- Avoid shading and 3-D effects. Do not hide one data series behind another's 3-D effect.
- We have a hard time estimating area, especially radial area in circles, cones, and spheres.
- Make sure the area changes in the same proportion as the data it depicts.
- What do you do when a bounded area defines a region that relates to, but is not changed by, another variable?
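The proportional-area rule is easy to get wrong with circles, because scaling the radius by the data scales the area by its square. A minimal sketch, with a hypothetical function name and illustrative values:

```python
import math


def radius_for_value(value, area_per_unit=1.0):
    """Return the circle radius whose AREA is proportional to value."""
    if value < 0:
        raise ValueError("value must be non-negative")
    return math.sqrt(value * area_per_unit / math.pi)


small = radius_for_value(100)
large = radius_for_value(400)

# Four times the data means four times the area, but only twice the radius.
print(large / small)
print((math.pi * large ** 2) / (math.pi * small ** 2))
```

If the radius itself were made proportional to the data, the larger circle would have 16 times the area for 4 times the value, which exaggerates the difference.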
20Use Area and Volume Cautiously
Was this the American political landscape in 2004?
21Use Area and Volume Cautiously
But now we lose the political and geographic
boundaries.
Visualizing Data for the Masses: Information
Graphics at The New York Times, Matthew Ericson,
Deputy Graphics Director, The New York Times
22Use Area and Volume Cautiously
This map shows population density much
better. But what about political reality?
This map shows how that translates to our
political system.
Visualizing Data for the Masses: Information
Graphics at The New York Times, Matthew Ericson,
Deputy Graphics Director, The New York Times
23Macro/Micro Readings
24Macro/Micro Readings
25Use Small Multiples
Coal mining Fatalities for Top 6 States 1996 -
2007
- Well-designed data graphs can be shrunk way down. This allows us to show change using space instead of time.
- Do not use frames. The scales become the frames.
- Show the change in data, not the design.
Mine Safety and Health Administration
www.msha.gov/stats/charts/coalbystate.asp
26Integrate Word, Number, and Image
Do not segregate content by means of
production. Avoid referencing figures far away
from text analysis. Put the visual close to the
text. Use a simple, intense, word-sized graphic
like a sparkline.
From Tufte's website, www.edwardtufte.com/tufte/
From Bissantz's website, www.bissantz.de/
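A word-sized graphic can be approximated in plain text with Unicode block characters. The sketch below is one simple way to do it, not how any particular tool draws sparklines; the function name and sample values are hypothetical. The optional `lo` and `hi` arguments let several sparklines share one scale, which is also what small multiples need.

```python
BLOCKS = "▁▂▃▄▅▆▇█"


def sparkline(values, lo=None, hi=None):
    """Render values as a string of block characters, one per value.

    Pass the same lo and hi to several calls to put them on a common scale.
    """
    lo = min(values) if lo is None else lo
    hi = max(values) if hi is None else hi
    span = hi - lo
    if span == 0:
        return BLOCKS[0] * len(values)
    top = len(BLOCKS) - 1
    chars = []
    for v in values:
        clamped = min(max(v, lo), hi)  # keep out-of-range values on the scale
        chars.append(BLOCKS[round((clamped - lo) * top / span)])
    return "".join(chars)


# Illustrative values only; the sparkline sits inline with the sentence.
trend = [1, 2, 3, 4, 5, 6, 7, 8]
print("Steady growth " + sparkline(trend) + " over eight periods.")
```

Because the output is ordinary text, it can sit in a sentence or a table cell right next to the number it explains.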
27Mapping Relationships
When mapping networks, projects, or workflows,
think about the nature of the relationships.
Avoid shapes. Label directly and annotate,
especially the relationships.
Mark Lombardi, george w. bush, harken energy, and
jackson stevens c.1979-90, 5th version, 1999
28Mapping Relationships
29Prophets, Practitioners, and Toolmakers
- The long shadow: Tufte
- The new kids on the block: Few, Many Eyes, Bissantz, NYT Graphics Department
- Do not forfeit your design to software or templates. Excel: hammering a samurai sword out of a butter knife.
- Look for design inspiration. Talent imitates, but genius steals.

Steal: Elite Newspapers (New York Times, Wall Street Journal), Elite Science Journals (Nature, Science)
Avoid: TV, Lawyers, Marketing/PR, Politicians
30Seeing the Trees in the Forest
- A rage to conclude: trends, averages, forecasts, overreaching, and recency bias.
- Graphs are powerful but not magic.
- Aid decision making; don't shortcut it.
- The Oracle only has answers. You have to ask the right questions.
- Fatalities are people.
31Seeing the Trees in the Forest
32Questions?
33Aesthetics
When I am working on a problem I never think
about beauty. I only think about how to solve the
problem. But when I have finished, if the
solution is not beautiful, I know it is
wrong.
Buckminster Fuller
34Table Exercises
- Think about the situation. Read the background material (it is only background; it will not contain all the detail you want to think about).
- What questions need to be asked to understand the issues?
- After you decide what info you need, what would be some good ways to show it? (Show as different visualizations.)
- You have 30 minutes to discuss. At the end of the 30 minutes, one person per table needs to give a one-minute report-out to the group.