How Humans
See Data
John Rauser
@jrauser
November 2016
visualization
is
communication
how to make better visualizations
help humans solve analytical
problems quickly and accurately
with visualization
Part I: Why visualize data at all?
pre-attentive processing
A graph is an encoding
of the data.
n x y n x y
1 1.972 1.236 13 0.111 0.542
2 1.112 1.994 14 0.902 0.005
3 0.000 1.009 15 0.598 0.085
4 0.665 1.942 16 1.613 1.790
5 0.235 0.356 17 1.298 1.955
6 0.247 1.658 18 0.651 1.937
7 1.275 1.961 19 1.949 1.316
8 0.702 0.045 20 0.099 0.567
9 1.760 0.350 21 0.862 0.010
10 1.691 0.277 22 0.027 0.768
11 1.628 1.778 23 0.706 1.956
12 1.957 1.290 24 1.042 1.999
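The table hides a pattern that a plot would reveal at a glance. As a rough check that needs no plotting library, the sketch below computes each point's distance from (1, 1). The center (1, 1) and the names points and radii are assumptions made for this illustration; the slides do not state them.

```python
import math

# The 24 (x, y) pairs from the table above, in order n = 1..24.
points = [
    (1.972, 1.236), (1.112, 1.994), (0.000, 1.009), (0.665, 1.942),
    (0.235, 0.356), (0.247, 1.658), (1.275, 1.961), (0.702, 0.045),
    (1.760, 0.350), (1.691, 0.277), (1.628, 1.778), (1.957, 1.290),
    (0.111, 0.542), (0.902, 0.005), (0.598, 0.085), (1.613, 1.790),
    (1.298, 1.955), (0.651, 1.937), (1.949, 1.316), (0.099, 0.567),
    (0.862, 0.010), (0.027, 0.768), (0.706, 1.956), (1.042, 1.999),
]

# Assumed center for this check; it is not given on the slides.
cx, cy = 1.0, 1.0

# Distance of every point from the assumed center.
radii = [math.hypot(x - cx, y - cy) for x, y in points]

print(f"min radius = {min(radii):.3f}, max radius = {max(radii):.3f}")
```

Every radius comes out close to 1, so the points lie near a circle. That structure is hard to read from the digits but obvious in a scatterplot.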
Good visualizations optimize
for the human visual system.
How does the human
visual system work?
How does the human visual
system decode a graph?
Cleveland’s three visual
operations of pattern perception:
1. Detection
2. Assembly
3. Estimation
Part II: estimation
Three levels of estimation
a. discrimination X=Y X!=Y
b. ranking X>Y X<Y
c. ratioing X / Y = ?
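The three levels can be written as three small functions, each answering a more demanding question than the one before. This is only an illustrative sketch; the function names and the sample values are hypothetical.

```python
def discriminate(x, y):
    """Level a: are the two values the same or different?"""
    return "X=Y" if x == y else "X!=Y"


def rank(x, y):
    """Level b: which value is larger?"""
    if x > y:
        return "X>Y"
    if x < y:
        return "X<Y"
    return "X=Y"


def ratio(x, y):
    """Level c: how many times larger is X than Y? (y must be nonzero)"""
    return x / y


# Hypothetical values, chosen only for illustration.
x, y = 30.0, 20.0
print(discriminate(x, y), rank(x, y), ratio(x, y))
```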
At the heart of quantitative
reasoning is a single question:
Compared to what?
- Tufte, Envisioning Information
the most
important
thing
The most important measurement should exploit
the highest ranked encoding possible.
• Position along a common scale
• Position on identical but nonaligned scales
• Length
• Angle or Slope
• Area
• Volume or Density or Color saturation
• Color hue
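One way to apply the ranking above is to treat it as a lookup table and hand out encodings in order of importance. The sketch below is an assumption-laden illustration: the key names, the helper functions, and the sample measurements are all hypothetical, and encodings that share a bullet are given the same rank.

```python
# Encodings in the order listed above, best first.
# Items that share a bullet share a rank.
ENCODING_RANK = {
    "position_common_scale": 1,
    "position_nonaligned_scales": 2,
    "length": 3,
    "angle": 4,
    "slope": 4,
    "area": 5,
    "volume": 6,
    "density": 6,
    "color_saturation": 6,
    "color_hue": 7,
}


def best_encoding(available):
    """Return the highest ranked encoding among those available."""
    return min(available, key=ENCODING_RANK.__getitem__)


def assign_encodings(measurements, available):
    """Pair measurements (most important first) with encodings (best first)."""
    ordered = sorted(available, key=ENCODING_RANK.__getitem__)
    return dict(zip(measurements, ordered))


# Hypothetical measurements, listed from most to least important.
measurements = ["mpg", "weight", "cylinders"]
available = ["color_hue", "area", "position_common_scale"]
plan = assign_encodings(measurements, available)
print(plan)
```

Here the most important measurement lands on position along a common scale, and the least important one is left with color hue.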
“The first rule of color:
do not talk about color!”
- Tamara Munzner
luminance
saturation
hue
Observation: Alphabetical is
almost never the correct ordering
of a categorical variable.
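A common alternative to alphabetical order is to sort the categories by the value being plotted, so that position on the axis also carries the ranking. A minimal sketch, using invented category names and values:

```python
# Hypothetical category values, invented for illustration.
values = {"apple": 12, "banana": 30, "cherry": 7, "date": 21}

# Alphabetical order: easy to look up a name, but the ranking is hidden.
alphabetical = sorted(values)

# Ordered by value, largest first: the axis order now shows the ranking.
by_value = sorted(values, key=values.get, reverse=True)

print(alphabetical)
print(by_value)
```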
Observation: Stacked
anything is nearly always
a mistake.
Stacking makes the reader
decode lengths, not position
on a common scale.
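The point about stacking can be seen in the geometry of the segments. The sketch below computes where each stacked segment starts and ends; the data and the helper name stacked_segments are hypothetical.

```python
from itertools import accumulate

# Hypothetical data: three series measured in two groups.
groups = {"A": [10, 20, 15], "B": [25, 18, 15]}


def stacked_segments(values):
    """Return the (start, end) position of each segment when values are stacked."""
    ends = list(accumulate(values))
    starts = [0] + ends[:-1]
    return list(zip(starts, ends))


segments = {name: stacked_segments(vals) for name, vals in groups.items()}
for name, segs in segments.items():
    print(name, segs)
```

Only the first series in each stack begins at 0. The third series has the same value in both groups, but its segment runs from 30 to 45 in one stack and from 43 to 58 in the other, so the reader has to judge two lengths that start at different baselines.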
Observation: Pie charts are
ALWAYS a mistake.
Piecharts are the information visualization
equivalent of a roofing hammer to the
frontal lobe. They have no place in the world
of grownups, and occupy the same semiotic
space as short pants, a runny nose, and
chocolate smeared on one’s face. They are
as professional as a pair of assless chaps.
http://blog.codahale.com/2006/04/29/google-analytics-the-goggles-they-do-nothing/
Tables are preferable to graphics for many small
data sets. A table is nearly always better than a
dumb pie chart; the only thing worse than a pie
chart is several of them, for then the viewer is
asked to compare quantities located in spatial
disarray both within and between pies… Given
their low data-density and failure to order
numbers along a visual dimension, pie charts
should never be used.
-Edward Tufte, The Visual Display of Quantitative Information
Who do you think did a better job in tonight’s debate?
                     Clinton   Trump
Among Democrats        99%       1%
Among Republicans      53%      47%
[Figure: chart whose legend lists 20 countries alphabetically, Afghanistan through Cameroon]
All good pie charts are jokes.
Observation: Comparison is trivial
on a common scale.
the dashboard metaphor is
fundamentally flawed
Observation: Scatterplots
show relationships directly.
Observation: Growth charts
usually aren’t.
If growth (slope) is
important, plot it directly.
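Plotting growth directly means computing the change between periods first and putting that quantity on the position axis, instead of asking the reader to judge slopes. A minimal sketch with invented totals:

```python
# Hypothetical cumulative totals, one per period.
totals = [100, 150, 210, 260, 300]

# Absolute growth from each period to the next.
growth = [b - a for a, b in zip(totals, totals[1:])]

# Relative growth from each period to the next.
growth_rate = [(b - a) / a for a, b in zip(totals, totals[1:])]

print(growth)
print([round(r, 3) for r in growth_rate])
```

The totals rise in every period, so a chart of totals looks healthy. The growth series tells a different story: it peaks in the second step and declines afterward.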
Part III: assembly
Gestalt Psychology
reification
emergence
Prägnanz
Law of Closure
Law of Continuity
Observation: Good plots
leverage the law of continuity
to assist with assembly.
Law of Similarity
Law of Proximity
Observation: dodged bar
charts are a bad idea
Part IV: detection
Excel’s defaults are pretty bad
[Figure: chart with default Excel formatting; y-axis ticks from 0 to 200,000 in steps of 20,000, x-axis labels 1 to 6]
Observation: Detection isn’t
as trivial as it seems.
“Above all else, show the data.”
-Tufte
Part V: other useful results
Weber’s law: The “Just Noticeable
Difference” is proportional to the
size of the initial stimulus.
[Figure: lengths labeled 10 and 20, then 100 and 110; annotation “12 units”]
Observation: Weber’s Law is
why gridlines are useful
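Weber's law can be made concrete by expressing a difference as a fraction of the magnitude being judged. The sketch below uses the 10 vs. 20 and 100 vs. 110 pairs shown on the slides; the bar heights, the gridline position, and the helper name weber_fraction are hypothetical.

```python
def weber_fraction(a, b):
    """Difference between two magnitudes, relative to the smaller one."""
    small, large = sorted((a, b))
    return (large - small) / small


# Both pairs differ by 10 units, but not by the same fraction.
pairs = [(10, 20), (100, 110)]
fractions = [weber_fraction(a, b) for a, b in pairs]
print(fractions)

# Hypothetical bars that differ by 12 units, with a gridline at 80.
bar_a, bar_b = 88, 100
gridline = 80

# Judging the whole bars: 12 units out of 88.
whole = weber_fraction(bar_a, bar_b)

# Judging only the parts above the gridline: 12 units out of 8.
residual = weber_fraction(bar_a - gridline, bar_b - gridline)
print(round(whole, 3), residual)
```

The same difference is a small fraction of the full bars but a large fraction of the short pieces above the gridline, which is one way to read the claim that gridlines help.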
“Erase non-data ink.”
-Tufte
“Erase non-data ink,
within reason.”
-Tufte
“Erase non-data ink that interferes
with detection or doesn’t assist
assembly and estimation.”
-Rauser
You are best at detecting variation
in slope near 45 degrees.
banking to 45
Observation: Banking to 45
best shows variation in slope
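One simple way to approximate banking to 45 degrees is to pick the plot's aspect ratio so that the median absolute slope of the line segments, as drawn, equals 1. The sketch below is one such variant and not necessarily the method used for the plots in this talk; the function name bank_to_45 and the sample data are hypothetical, and it assumes strictly increasing x values and a series that is not flat.

```python
from statistics import median


def bank_to_45(xs, ys):
    """Return a height/width aspect ratio that puts the median absolute
    segment slope at 1 (45 degrees) on the rendered plot.

    Assumes xs is strictly increasing and ys is not constant.
    """
    x_range = max(xs) - min(xs)
    y_range = max(ys) - min(ys)
    points = list(zip(xs, ys))
    # Slope of each segment after scaling both axes to unit length.
    slopes = [
        abs(((y1 - y0) / y_range) / ((x1 - x0) / x_range))
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    ]
    # Drawn slope = unit slope * (height / width); solve for a median of 1.
    return 1.0 / median(slopes)


# Hypothetical zigzag series.
xs = [0, 1, 2, 3]
ys = [0, 1, 0, 1]
aspect = bank_to_45(xs, ys)
print(round(aspect, 3))
```

For this zigzag the result is a plot about three times as wide as it is tall, which flattens the steep segments to roughly 45 degrees.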
Q: Should I include 0 on my scale?
A: It depends.
Relying on the pre-attentive
perception of size or intensity?
Yes, otherwise you will mislead.
Using position? It’s up to you.
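The reason size encodings need a zero baseline can be shown with a little arithmetic: when bars do not start at 0, the ratio of their drawn lengths no longer matches the ratio of the values. A minimal sketch, with hypothetical values and a hypothetical helper named perceived_ratio:

```python
def perceived_ratio(a, b, axis_start=0.0):
    """Ratio of the drawn bar lengths when the axis begins at axis_start."""
    return (b - axis_start) / (a - axis_start)


# Hypothetical values.
a, b = 90.0, 100.0

# With a zero baseline the bar lengths carry the true ratio (about 1.11).
true_ratio = perceived_ratio(a, b)

# With the axis starting at 80 the second bar is drawn twice as long.
truncated_ratio = perceived_ratio(a, b, axis_start=80.0)

print(round(true_ratio, 3), truncated_ratio)
```

With a position encoding such as a dot plot, the reader compares locations against the axis and not lengths, so the choice of where the scale starts is less likely to mislead.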
“Above all else, show the data.”
-Tufte
“Above all else, show
the variation in the data.”
-Rauser (via Tufte)
R/ggplot2 code for every plot in this
presentation is available at http://goo.gl/xH5PLV
The rendered document is at
http://rpubs.com/jrauser/hhsd_notes
This presentation is at
http://goo.gl/VKxxya
I will tweet these links as @jrauser
coda
visualization
is
communication
art
is
communication
visualization
is
art
why does it make you
feel that way?
visualization has as much to
learn from art as from science
end
