Google Analytics Data Mining with R
- 1. #tatvicwebinar
A GACP and GTMCP company
1/28/2015 1 #tatvicwebinar
Google Analytics Data Mining with R
(includes 3 Real Applications)
Jan 28th, 2015
FREE Webinar by
- 2. #tatvicwebinar
Our Speakers
Kushan Shah
Maintainer of the RGoogleAnalytics library & Web Analyst at Tatvic
@kushan_s
Andy Granowitz
Developer Advocate, Google Analytics (Google)
@agrano
- 3. #tatvicwebinar
Outline
An Introduction to R
Why analyze Google Analytics data with R
Getting started with R & Google Analytics
3 Real Life Applications & Use Cases
Questions & Answers
- 5. #tatvicwebinar
An Introduction to R
• Open source statistical computing language, widely used by
organizations to solve business problems
• Common uses of R: data analysis, statistical tests, data visualization, predictive models, forecasting
- 6. #tatvicwebinar
Why Use R
• Easy to integrate with various data sources
• Data Frame – analogous to an Excel spreadsheet or a MySQL table
• 6,000 pre-developed packages for various applications
• No software licensing costs
• Enables reproducible analysis
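To make the spreadsheet analogy concrete, here is a minimal base R sketch. The `sessions` data and its column names are made up for illustration and are not taken from the webinar.

```r
# Hypothetical data, laid out like a spreadsheet: one row per day and traffic source.
sessions <- data.frame(
  date     = as.Date(c("2015-01-01", "2015-01-01", "2015-01-02", "2015-01-02")),
  source   = c("google", "direct", "google", "direct"),
  sessions = c(120, 45, 135, 50),
  stringsAsFactors = FALSE
)

# Filter rows, much like a spreadsheet filter or a SQL WHERE clause.
google_only <- sessions[sessions$source == "google", ]

# Aggregate by a column, much like a pivot table or a SQL GROUP BY.
by_source <- aggregate(sessions ~ source, data = sessions, FUN = sum)
print(by_source)
```

Because every step is code, rerunning the script on fresh data repeats the same analysis, which is what makes it reproducible.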
- 8. #tatvicwebinar
Why analyze Google Analytics data with R
● Google Analytics API allows data extraction for custom reports
• Reports with up to 7 Dimensions and 10 Metrics
• API is well suited for batch data extraction
• API has techniques for handling large queries (10K - 1M records and beyond)
● RGoogleAnalytics = R Wrapper over the Google Analytics API
• Provides functions to easily interact with the Google Analytics API
• Takes care of the low level plumbing
● Google Analytics Premium users with data exported to BigQuery
• Use the bigrquery package by Hadley Wickham
- 10. #tatvicwebinar
Getting Started with R and Google Analytics
● Install R - http://www.r-project.org/
● Install RStudio - GUI for R (Optional)
● Install RGoogleAnalytics
Check out the blogpost for a step by step walkthrough - bit.ly/18oJjqA
- 11. #tatvicwebinar
Getting Started with R and Google Analytics
One Time Setup
● Create a Project in the Google Dev Console
● Activate the Google Analytics API for your project
● Get your Project’s Client ID and Client Secret
https://developers.google.com/analytics/devguides/reporting/core/v3/gdataAuthorization
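Once the Client ID and Client Secret exist, the authorization step in R can be sketched roughly as follows. `Auth()` and `ValidateToken()` are functions from the RGoogleAnalytics package. The wrapper name `authorize_ga`, the placeholder credentials and the token file name are assumptions made for illustration.

```r
# install.packages("RGoogleAnalytics")  # one-time install from CRAN

# Hypothetical helper: wraps the package's authorization calls.
authorize_ga <- function(client.id, client.secret) {
  # Auth() starts the OAuth 2.0 flow; expect a browser window asking for consent.
  token <- RGoogleAnalytics::Auth(client.id, client.secret)
  # ValidateToken() checks the token and refreshes it if it has expired.
  RGoogleAnalytics::ValidateToken(token)
  token
}

# Not run here: requires real credentials and an interactive session.
# token <- authorize_ga("YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET")
# save(token, file = "./ga_token")  # reuse later with load("./ga_token")
```

Saving the token means the consent step should only be needed once per machine.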
- 13. #tatvicwebinar
Examples
● Example 1: Forecast Product Revenue for an eCommerce
Store
● Example 2: Assess the long term value of your Marketing
Campaigns
● Example 3: Web Analytics Visualization with ggplot2
- 14. #tatvicwebinar
Example 1: Predict Product Revenue with R
● Get Product Revenue as Time Series (historical data)
● Forecast Product Revenue for the next quarter
Check out the blogpost for a complete walkthrough - http://bit.ly/1y3dmtI
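The two steps above can be sketched with base R alone. This is not necessarily the method used in the blog post: the revenue series is synthetic, and Holt-Winters exponential smoothing is simply one forecasting technique that ships with R.

```r
# Synthetic monthly product revenue (assumption): trend + yearly seasonality + noise.
set.seed(42)
months  <- 1:48
revenue <- 1000 + 20 * months + 150 * sin(2 * pi * months / 12) + rnorm(48, sd = 30)

# Step 1: represent the historical data as a time series (Jan 2011 to Dec 2014).
revenue_ts <- ts(revenue, start = c(2011, 1), frequency = 12)

# Step 2: fit a Holt-Winters model and forecast the next quarter (3 months).
fit          <- HoltWinters(revenue_ts)
next_quarter <- predict(fit, n.ahead = 3)
print(next_quarter)
```

With real data, the `revenue` vector would come from a Google Analytics query on product revenue instead of the simulated values.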
- 15. #tatvicwebinar
Example 1: Predict Product Revenue with R
Time Series Components
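As an illustration of what "components" means here, base R's `decompose()` splits a seasonal series into trend, seasonal and random parts. The series below is synthetic and stands in for the revenue data shown on the slide.

```r
# Synthetic monthly revenue (assumption): trend + yearly seasonality + noise.
set.seed(1)
months  <- 1:36
revenue <- 1000 + 25 * months + 200 * sin(2 * pi * months / 12) + rnorm(36, sd = 40)
revenue_ts <- ts(revenue, start = c(2012, 1), frequency = 12)

# Classical additive decomposition: observed = trend + seasonal + random.
parts <- decompose(revenue_ts)

# Draws four stacked panels: observed, trend, seasonal and random.
plot(parts)
```

The trend component has missing values at both ends because it is estimated with a centered moving average.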
- 16. #tatvicwebinar
Example 2: Long Term Value of Your Marketing Campaigns
Check out the blogpost for a complete walkthrough - http://bit.ly/1zpjbYA
- 18. #tatvicwebinar
Example 2: Long Term Value of Your Marketing Campaigns
Query New Customers Acquired via a Given Campaign
query.list <- Init(start.date = "2014-11-01",
                   end.date = "2014-12-20",
                   dimensions = "ga:date",
                   metrics = "ga:transactions,ga:transactionRevenue",
                   # paste0() keeps the segment on one logical line, with no embedded line breaks
                   segment = paste0("users::sequence::",
                                    "^ga:userType==New Visitor;",
                                    "dateOfSession<>2014-11-01_2014-11-07;",
                                    "ga:campaign==Campaign A;",
                                    "->>perSession::ga:transactions>0"),
                   sort = "ga:date",
                   table.id = tableId)
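A query list like this is typically turned into a data frame in two further steps. `QueryBuilder()` and `GetReportData()` are RGoogleAnalytics functions. The wrapper name `fetch_campaign_data` is hypothetical, and `token` is assumed to come from an earlier authorization step.

```r
# Hypothetical helper: builds the query object and fetches the report as a data frame.
fetch_campaign_data <- function(query.list, token) {
  # QueryBuilder() validates the parameters and builds the query object.
  ga.query <- RGoogleAnalytics::QueryBuilder(query.list)
  # GetReportData() calls the API. For large result sets it also accepts
  # paginate_query = TRUE or split_daywise = TRUE.
  RGoogleAnalytics::GetReportData(ga.query, token)
}

# Not run here: requires a valid token and a real table.id.
# campaign_a_df <- fetch_campaign_data(query.list, token)
```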
- 19. #tatvicwebinar
Example 2: Long Term Value of Your Marketing Campaigns
Segments Explained
• The segment selects users:: in order to include not only the sessions that match the
conditions, but all sessions among users who match the conditions.
• The sequence:: prefix selects the set of users who completed a specified sequence of steps
– Step #1 - Visit from a given campaign within a given time window
– Step #2 - Make a purchase
• The ^ prefix in front of
ga:userType==New Visitor;dateOfSession<>2014-11-01_2014-11-07;ga:campaign==Campaign A
ensures that the Date of Session, Campaign, and User Type conditions are true for the
first hit of the first session in the given date range.
• ->>perSession::ga:transactions>0 specifies the second step: making a purchase at
some later point.
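Since the segment is just a string assembled from these pieces, it can be built with a small helper so that the campaign name and acquisition window are easy to change. The function name `build_campaign_segment` is hypothetical; the segment syntax is copied from the slide.

```r
# Hypothetical helper: assembles the sequence segment piece by piece.
build_campaign_segment <- function(campaign, start.date, end.date) {
  paste0(
    "users::sequence::",                                   # user-level sequence segment
    "^ga:userType==New Visitor;",                          # step 1 must match the first hit
    "dateOfSession<>", start.date, "_", end.date, ";",     # acquisition window
    "ga:campaign==", campaign, ";",                        # campaign of interest
    "->>perSession::ga:transactions>0"                     # step 2: a later purchase
  )
}

segment <- build_campaign_segment("Campaign A", "2014-11-01", "2014-11-07")
print(segment)
```

The resulting string can be passed as the segment argument of `Init()`.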
- 20. #tatvicwebinar
Example 2: Long Term Value of Your Marketing Campaigns
• head(campaign_a_df)
• cumulativeTransactions <- cumsum(campaign_a_df$transactions)
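To show what these two calls do, here is a small self-contained sketch. The rows of `campaign_a_df` are invented; in the webinar the data frame comes from the Google Analytics query.

```r
# Made-up sample rows standing in for the API response (assumption).
campaign_a_df <- data.frame(
  date               = c("20141101", "20141102", "20141103", "20141104"),
  transactions       = c(2, 0, 3, 1),
  transactionRevenue = c(40, 0, 75.5, 20),
  stringsAsFactors   = FALSE
)

head(campaign_a_df)  # inspect the first rows

# Running totals: each element is the sum of all values up to that day.
cumulativeTransactions <- cumsum(campaign_a_df$transactions)
cumulativeRevenue      <- cumsum(campaign_a_df$transactionRevenue)

print(cumulativeTransactions)  # 2 2 5 6
print(cumulativeRevenue)       # 40.0 40.0 115.5 135.5
```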
- 21. #tatvicwebinar
Example 2: Long Term Value of Your Marketing Campaigns
Use the data to Generate a Cumulative Transactions Plot
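One way to draw such a plot with base R graphics is sketched below. The data is invented, and the conversion assumes the API returns dates as "YYYYMMDD" strings.

```r
# Made-up daily transactions for users acquired by the campaign (assumption).
campaign_a_df <- data.frame(
  date         = c("20141101", "20141102", "20141103", "20141104", "20141105"),
  transactions = c(2, 0, 3, 1, 4),
  stringsAsFactors = FALSE
)

# Convert the date strings to Date objects so the x-axis is a real time axis.
campaign_a_df$date <- as.Date(campaign_a_df$date, format = "%Y%m%d")
campaign_a_df$cumulativeTransactions <- cumsum(campaign_a_df$transactions)

# A step plot suits a running total that only changes once per day.
plot(campaign_a_df$date, campaign_a_df$cumulativeTransactions,
     type = "s",
     xlab = "Date", ylab = "Cumulative transactions",
     main = "Campaign A: cumulative transactions")
```

A curve that keeps climbing after the acquisition window indicates that the acquired users continue to buy.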
- 22. #tatvicwebinar
Example 3: Web Analytics visualization with ggplot2
Background
• gg – Grammar of Graphics (Wilkinson, 2005)
• R Implementation by Prof. Hadley Wickham
• Sophisticated graphs in a few* lines of R code
* After an initial learning curve
- 23. #tatvicwebinar
Example 3: Web Analytics visualization with ggplot2
ggplot(data = ga.data) + geom_line(aes(x = date, y = itemRevenue))
Check out the blogpost for a complete walkthrough - http://bit.ly/15Iaaf7
- 24. #tatvicwebinar
Example 3: Web Analytics visualization with ggplot2
Decomposing the syntax
• Create the ggplot object and populate it with data
– ggplot(ga.data)
• Add layer(s)
– geom_line(aes(x=date, y=itemRevenue))
– Other geoms: bar, point, line, histogram
• Aesthetics describe how variables are mapped to visual properties of geoms
• Additional plotting options: plot title, axis titles, axis text formatting, legends
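Putting the pieces together, a complete plot might look like the sketch below. It assumes the ggplot2 package is installed. The `ga.data` values are invented, and the title and axis labels are only examples of the additional plotting options.

```r
library(ggplot2)

# Made-up daily item revenue standing in for Google Analytics data (assumption).
ga.data <- data.frame(
  date        = as.Date("2014-12-01") + 0:6,
  itemRevenue = c(120, 135, 90, 160, 210, 180, 240)
)

p <- ggplot(data = ga.data) +                            # object populated with data
  geom_line(aes(x = date, y = itemRevenue)) +            # layer: line geom with aesthetics
  labs(title = "Daily item revenue",                     # plot title and axis titles
       x = "Date", y = "Item revenue") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))  # axis text formatting

print(p)
```

Swapping `geom_line()` for `geom_point()` or `geom_bar(stat = "identity")` changes the chart type without touching the rest of the code.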
- 25. #tatvicwebinar
Best Practices
• Become familiar with the Google Analytics API Naming
Conventions
– Dimension/Metric names are in camelCase
• Know the permissible Dimension/Metric combinations
– https://developers.google.com/analytics/devguides/reporting/core/dimsmets
• Use the Query Explorer to test queries before running them in R
– https://ga-dev-tools.appspot.com/explorer/
• Post issues at https://github.com/Tatvic/RGoogleAnalytics/issues
• Questions at Google Analytics Reporting API Forum
- 26. #tatvicwebinar
Further Resources
• http://bit.ly/r-googleanalytics-resources
• Watch the full R Google Analytics Webinar -
http://bit.ly/1KrHXtH
- 28. #tatvicwebinar
Next Webinar
Webinar: Everything You Need to Know about GTM V2
When: Feb 18th 10:00 AM PDT
Guest Speaker: Phil Pearce (Web Analyst/ PPC/ SEO Expert)
You will learn:
• How to Upgrade to GTM V2
• Mistakes to Avoid while Upgrading
• A Quick Checklist for Upgrading to GTM V2
And much more…
Register Here - http://bit.ly/gtm-v2-webinar
- 29. #tatvicwebinar
Thank You!
Kushan Shah
Twitter: @kushan_s
Andy Granowitz
Twitter: @agrano
Editor's Notes
- Very often we analyze only the direct performance of a campaign: someone visits via the campaign and makes a purchase. This is useful analysis, but it often leaves money on the table. What happens to these visitors on their subsequent visits, especially if they are first-time visitors (customer acquisitions)? Perhaps some campaigns are slower to convert, or lead to more loyal customers over time.
- What we want is a cumulative graph of the campaign’s performance over time, so we can see the total value generated by the campaign.
Let’s dig into how to do this.