analyzing MLB data with
ggplot
Greg Lamp
ggplot
● What is it?
● Alternatives
● How it works
● Why should I use it?
● Brief case study
● Questions
Here I am on
the Internet.
Founder/CTO @ Yhat
Hi, I’m Greg!
What is
ggplot?
DSL for graphics
scatterplot
histogram
labels
color
shape
What about
matplotlib?
a quick example
matplotlib ggplot
it's not all bad!
matplotlib
syntax, api,
default themes,
learning curve
matplotlib
maturity, ipython,
customization, community
syntax, api,
default themes,
learning curve
What about
d3.js?
d3.js
ggplot
ggplot d3.js
How it works
Format
ggplot
data frame
“aesthetics”
Aesthetics
color
shape
size
...fill, alpha, slope,
intercept, ymin,
ymax, ...
Geoms,
Stats, &
Scales
geom_point
geom_area
...there are many
stat_smooth
...there are a few
scale_color_brewer
scale_color_gradient
...there are many
Layers
ggplot()
ggplot() + geom_point()
ggplot() + geom_point() + stat_smooth()
ggplot() +
geom_point() +
stat_smooth()
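The layering idea can be sketched in plain Python. This is a toy model only, not the ggplot package's actual implementation: the names `ToyPlot`, `ToyLayer`, `toy_point`, and `toy_smooth` are hypothetical stand-ins, and the column names are placeholders. It shows how overloading `+` lets a plot accumulate layers one at a time.

```python
class ToyLayer:
    """One plot component (standing in for a geom, stat, or scale)."""

    def __init__(self, name):
        self.name = name


class ToyPlot:
    """Holds the data, the aesthetic mapping, and an ordered list of layers."""

    def __init__(self, data=None, aesthetics=None, layers=()):
        self.data = data
        self.aesthetics = dict(aesthetics or {})
        self.layers = tuple(layers)

    def __add__(self, layer):
        # Adding a layer returns a new plot; the original is left untouched.
        return ToyPlot(self.data, self.aesthetics, self.layers + (layer,))

    def describe(self):
        return " + ".join(["plot"] + [layer.name for layer in self.layers])


def toy_point():
    # Hypothetical stand-in for a point geom.
    return ToyLayer("point")


def toy_smooth():
    # Hypothetical stand-in for a smoothing stat.
    return ToyLayer("smooth")


# Placeholder aesthetic mapping: column names are illustrative only.
base = ToyPlot(data=[], aesthetics={"x": "col_a", "y": "col_b"})
p = base + toy_point() + toy_smooth()
print(p.describe())
```

Because each `+` returns a new object, a base plot can be reused and extended in different directions without side effects.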
Why is this
good?
Makes “reasonable
assumptions”
not real colors
matplotlib freaks
still not real colors
...but I can guess
what you mean
Concise yet
expressive
Looks pretty good
(and is easy to customize)
Seaborn
github.com/mwaskom/seaborn
Case Study
pitch speed
103.4 mph
Load ggplot and pandas
Read in our pitch f/x data
define the x-axis
pass in your data frame
add a histogram
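The five steps above might look roughly like the sketch below. It assumes the Python ggplot package described in this talk and pandas are both installed. The file name `pitchfx.csv` and the column name `start_speed` are assumptions for illustration and do not come from the slides. The imports sit inside the function only so the sketch can be loaded without those packages present.

```python
def plot_speed_histogram(csv_path="pitchfx.csv", speed_column="start_speed"):
    """Build a histogram of pitch speeds (file and column names are assumed)."""
    # Load ggplot and pandas
    import pandas as pd
    from ggplot import ggplot, aes, geom_histogram

    # Read in our pitch f/x data
    df = pd.read_csv(csv_path)

    # Define the x-axis, pass in the data frame, add a histogram
    return ggplot(aes(x=speed_column), data=df) + geom_histogram()
```

Calling `plot_speed_histogram()` in an interactive session and displaying the returned object should render the plot, provided the file and column exist.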
How does fatigue
impact velocity?
...not helpful
What about at the
individual level?
Justin
Verlander
ggplot lets you
fail quicker
Finding Help
/tagged/python-ggplot
http://ggplot.yhathq.com
What’s next?
Thanks!
@theglamp
greg@yhathq.com
