DataOps
Data Science Empowerment through
DevOps, Cloud Computing and Building
your own Applications
Kelly O’Briant
Data Science Product Engineer
@kellrstats | @RLadiesDC
• R-Ladies Washington DC Chapter Founder
and Organizer
• R-Ladies Global unofficial “cloud expert”
• Publish a monthly series called .rprofile
on the rOpenSci blog
• Business Science University
course developer
My Talk Goal:
I want you to leave this conference so
excited, you go back to work and completely
ignore whatever project you’re supposed to
be working on because you’re so pumped up
about building a data product and you can’t
stop yourself from doing it.
Why I talk about Data Science Empowerment
R-Ladies events
• How do I get a job as a data scientist/analyst/anything?
• What should I study/learn/do/produce to be a data scientist?
• Am I even a data scientist? Is what I do data science?
Why are data products empowering?
• I use data products to justify/prove to myself that I belong, that my
ideas are valid and to help me communicate with people who are bad
at listening (or when I’m bad at speaking)

Traumatic Experiences!
Windows Lab Linux Lab Mac Lab
R-Ladies + International Women’s Day
Twitter Campaign
• Create a Twitter bot using R code
to tweet out a profile for every
woman in our Global speaker directory
• Project collaboration through GitHub
• Docker linked to a local volume
• Twitter Application(s)
Deploy and Use H2O Machine Learning
Models in Production
• Build and validate a model in Python
working in a Jupyter Notebook with the
H2O machine learning API
• Package the model code as a POJO or
MOJO file
• Deploy the model to STEAM to
create an ML prediction service complete
with a REST API query URL
Create and Maintain a Personal Website
• Use the blogdown package in an
RStudio project to create the
framework for a Hugo static website
• Create content for the site by
writing R Markdown files
• Compile and deploy the static site –
choose a hosting mechanism:
GitHub? Continuous Integration
with Netlify?
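
The blogdown workflow above could be sketched in R roughly as follows. This is a minimal sketch, not the speaker's actual setup: the directory name `my-site` and the helper names `start_site()` and `write_post()` are hypothetical, and the calls are wrapped in functions so that sourcing the file does not download Hugo or create files.

```r
# Hypothetical helper: scaffold a Hugo site with blogdown's default theme.
# blogdown::new_site() installs Hugo if it is missing, so call this deliberately.
start_site <- function(dir = "my-site") {
  dir.create(dir, showWarnings = FALSE)
  blogdown::new_site(dir = dir)
}

# Hypothetical helper: create a new R Markdown post.
# Run it from inside the site's project directory.
write_post <- function(title) {
  blogdown::new_post(title = title, ext = ".Rmd")
}

# Typical interactive session inside the site project (left commented out):
# blogdown::serve_site()   # live preview while editing
# blogdown::build_site()   # compile the static files for deployment
```

Which hosting route to pick (GitHub, or continuous deployment through Netlify) is a separate decision; the R side stays the same either way.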

Why are you so into R?
• It’s great for Data Science
• The community at large is awesome
• The female community is awesome
• R integrates with other tech
• It’s growing really fast in cool ways
• I can use it to build cool stuff
Worldwide organization
that promotes gender diversity
in the R community via meetups
and mentorship in a friendly and
safe environment
Back to the topic: DataOps
1. It usually takes a little DevOps to build a Data Product
2. Building more Data Products is empowering – good for your portfolio and soul
What is DevOps
And why should Data-oriented people care about it?
DevOps is…
“A combination of cultural philosophies, practices
and tools that increases an organization’s ability to
deliver applications and services at high velocity.”
- AWS DevOps Blog

Deliver applications and services at high velocity
Do This – without pulling all
your hair out?
Do This – Super Effectively
Host your analysis
• Share
• Publish
• Collaborate
• Prove a point
• Serve a purpose
• Be reproducible
• Save the day
What is DataOps?
Anywhere you can put a little DevOps magic into your data science workflow
Kelly O'Briant - DataOps in the Cloud: How To Supercharge Data Science with a Hint of DevOps

Build More Data Products
So that you and others can use them to solve real problems
Try Shiny!
The Iris Dataset
Do Machine Learning!
So Hot Right Now
What Species
is this iris??
Credit: xkcd

1. Turn your ideas into R code
• Write functions to generate the
plots you’re envisioning
• Package: ggplot2
• Train and validate a machine
learning model to use
• Package: caret
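library(ggplot2)

# Histogram of one iris measurement, one panel per species
geom_hist_basic <- function(var){
  ggplot(iris, aes_string(x = var)) +
    geom_histogram() +
    facet_wrap(~ Species)
}

# predict_matrix(), fit.knn and validation are not defined on the slide
predict_matrix(fit.knn, validation)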
Confusion Matrix and Statistics
Prediction   setosa versicolor virginica
  setosa         10          0         0
  versicolor      0          8         1
  virginica       0          2         9
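
The slide does not show where `fit.knn`, `validation` or `predict_matrix()` come from. The sketch below is one way they might be built with caret, under stated assumptions: an 80/20 stratified split (which yields the 30 validation rows seen in the matrix above), 10-fold cross-validation, an arbitrary seed, and a hypothetical `predict_matrix()` that simply wraps `caret::confusionMatrix()`. Depending on the caret version, the e1071 package may also need to be installed.

```r
library(caret)

set.seed(7)  # arbitrary seed, chosen only for reproducibility (assumption)

# Hold out 20% of iris for validation, stratified by species
in_train   <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
training   <- iris[in_train, ]
validation <- iris[-in_train, ]

# k-nearest-neighbours classifier tuned with 10-fold cross-validation
fit.knn <- train(
  Species ~ ., data = training, method = "knn",
  trControl = trainControl(method = "cv", number = 10)
)

# Hypothetical stand-in for the helper named on the slide:
# predict on new data and tabulate predictions against the true species
predict_matrix <- function(fit, newdata) {
  confusionMatrix(predict(fit, newdata), newdata$Species)
}

cm <- predict_matrix(fit.knn, validation)
print(cm$table)
```

The exact counts depend on the random split, so they will not necessarily match the numbers on the slide.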
2. Turn your R code into an R Shiny app
Client Side Code:
User Interface and
Input Elements
Server Side Code:
(Reactive) R Output Elements
shinyApp(ui = fluidPage, server = serverFunction)
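
To make the client/server split concrete, here is a minimal sketch of a Shiny app built around the histogram function from step 1. The input id `var`, the output id `hist`, the app title and the bin count are illustrative assumptions; the sketch also uses `.data[[var]]` in place of `aes_string()`, which newer ggplot2 releases deprecate.

```r
library(shiny)
library(ggplot2)

# Plot function from step 1, written with the .data pronoun
geom_hist_basic <- function(var) {
  ggplot(iris, aes(x = .data[[var]])) +
    geom_histogram(bins = 20) +
    facet_wrap(~ Species)
}

# Client side: user interface and input elements
ui <- fluidPage(
  titlePanel("Iris explorer"),
  selectInput("var", "Measurement", choices = names(iris)[1:4]),
  plotOutput("hist")
)

# Server side: reactive R output, re-rendered whenever input$var changes
serverFunction <- function(input, output) {
  output$hist <- renderPlot(geom_hist_basic(input$var))
}

app <- shinyApp(ui = ui, server = serverFunction)
# runApp(app)  # uncomment to launch the app in an interactive session
```

Assigning the result of `shinyApp()` does not start a server; the app only launches when it is printed at the console or passed to `runApp()`.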
Try Plumber!
Let’s Build a REST API with R
1. Write Functions in R
Expose Data or Model
Produce Analysis or Visualization
Data Agnostic
Perform Analysis on New Data
2. Create Plumber
API Endpoints
- Get
- Post
3. Host the Plumber
Script on a Server
- Create Plumber
router object
- Run in an R Session
4. Send Requests to
the Plumber Service
Through external (or
internal) Applications
- Jupyter Notebooks
- Web Apps
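
The four Plumber steps can be sketched end to end in a few lines of R. Everything specific here is an assumption made for illustration: the endpoint paths `/means` and `/summary`, the port 8000, the client helper `get_means()`, and writing the script to a temporary directory so the example is self-contained.

```r
library(plumber)

# Steps 1-2: plain R functions, with #* annotations declaring the endpoints
api_code <- '
#* Mean of one iris measurement per species (exposes data)
#* @param var Column name, for example Sepal.Length
#* @get /means
function(var = "Sepal.Length") {
  as.list(tapply(iris[[var]], iris$Species, mean))
}

#* Summary statistics for new values sent by the client (data agnostic)
#* @post /summary
function(values) {
  values <- as.numeric(values)
  list(n = length(values), mean = mean(values), sd = sd(values))
}
'
plumber_file <- file.path(tempdir(), "plumber.R")
writeLines(api_code, plumber_file)

# Step 3: create the router object; running it blocks the R session
router <- plumb(plumber_file)
# router$run(host = "0.0.0.0", port = 8000)

# Step 4: hypothetical client helper for another R session or notebook
get_means <- function(var, base_url = "http://localhost:8000") {
  jsonlite::fromJSON(paste0(base_url, "/means?var=", var))
}
```

Once the router is running, any HTTP client can call it, for example a Jupyter notebook or a web app sending a GET request to `/means?var=Petal.Width`.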

Docker Image
RStudio Server
R Session
Running Plumber REST API
My Local File System
- Plumber.R
- Dockerfile
Local Volume Link
Applications & Notebooks
Requests!
Demo Framework
That’s it!
Now go build some sweet data products
Resources for Learning R
R-Ladies Global Meetups
• Get involved!
• More female speakers,
leaders, teachers, builders,
friends! @RLadiesGlobal

RStudio Webinars
• All of the talks
from RStudio::conf
2018 have just
been published
• Highly recommend!
Resources for Learning Shiny Development
Resources for Learning Plumber
@TrestleJeff on Twitter!
Note to self: Remember to give
out stickers
I have R-Ladies and R-Ladies Plumber Stickers!
I’m Kelly!
@kellrstats on Twitter


  • 1. DataOps Data Science Empowerment through DevOps, Cloud Computing and Building your own Applications
  • 2. Kelly O’Briant Data Science Product Engineer @kellrstats | @RLadiesDC • R-Ladies Washington DC Chapter Founder and Organizer • R-Ladies Global unofficial “cloud expert” • Publish a monthly series called .rprofile on the rOpenSci blog • Business Science University course developer
  • 3. My Talk Goal: I want you to leave this conference so excited, you go back to work and completely ignore whatever project you’re supposed to be working on because you’re so pumped up about building a data product and you can’t stop yourself from doing it.
  • 4. Motivation Why I talk about Data Science Empowerment R-Ladies events • How do I get a job as a data scientist/analyst/anything? • What should I study/learn/do/produce to be a data scientist? • Am I even a data scientist? Is what I do data science? Why are data products empowering? • I use data products to justify/prove to myself that I belong, that my ideas are valid and to help me communicate with people who are bad at listening (or when I’m bad at speaking)
  • 6. R-Ladies + International Women’s Day Twitter Campaign • Create a Twitter bot using R code to tweet out a profile for every woman in our Global speaker directory • Project collaboration through GitHub • Docker linked to a local volume • Twitter Application(s)
  • 7. Deploy and Use H2O Machine Learning Models in Production • Build and validate a model in Python working in a Jupyter Notebook with the H2O machine learning API • Package the model code as a POJO or MOJO file • Deploy the model to STEAM to create an ML prediction service complete with a REST API query URL
  • 8. Create and Maintain a Personal Website • Use the blogdown package in an RStudio project to create the framework for a Hugo static website • Create content for the site by writing R Markdown files • Compile and deploy the static site – choose a hosting mechanism: GitHub? Continuous Integration with Netlify?
  • 9. Why are you so into R? • It’s great for Data Science • The community at large is awesome • The female community is awesome • R integrates with other tech • It’s growing really fast in cool ways • I can use it to build cool stuff
  • 10. Why are you so into R? • It’s great for Data Science • The community at large is awesome • The female community is awesome • R integrates with other tech • It’s growing really fast in cool ways • I can use it to build cool stuff #rstats
  • 11. Why are you so into R? • It’s great for Data Science • The community at large is awesome • The female community is awesome • R integrates with other tech • It’s growing really fast in cool ways • I can use it to build cool stuff Worldwide organization that promotes gender diversity in the R community via meetups and mentorship in a friendly and safe environment
  • 15. Back to the topic: DataOps 1. It usually takes a little DevOps to build a Data Product 2. Building more Data Products is empowering – good for your portfolio and soul
  • 16. What is DevOps And why should Data-oriented people care about it? DevOps is… “A combination of cultural philosophies, practices and tools that increases an organization’s ability to deliver applications and services at high velocity.” - AWS DevOps Blog
  • 17. Deliver applications and services at high velocity Do This – without pulling all your hair out?
  • 18. Deliver applications and services at high velocity Do This – Super Effectively Host your analysis • Share • Publish • Collaborate • Prove a point • Serve a purpose • Be reproducible • Save the day
  • 19. What is DataOps? Anywhere you can put a little DevOps magic into your data science workflow
  • 21. Build More Data Products So that you and others can use them to solve real problems
  • 24. Do Machine Learning! So Hot Right Now What Species is this iris?? Credit: xkcd
  • 25. 1. Turn your ideas into R code • Write functions to generate the plots you’re envisioning • Package: ggplot2 • Train and validate a machine learning model to use • Package: caret geom_hist_basic <- function(var){ ggplot(iris, aes_string(x = var)) + geom_histogram() + facet_wrap(~ Species) } predict_matrix(fit.knn, validation) Confusion Matrix and Statistics Prediction setosa versicolor virginica setosa 10 0 0 versicolor 0 8 1 virginica 0 2 9
  • 26. 2. Turn your R code into an R Shiny app Client Side Code: User Interface and Input Elements Server Side Code: (Reactive) R Output Elements shinyApp(ui = fluidPage, server = serverFunction) fluidPage Code serverFunction Code
  • 28. Let’s Build a REST API with R 1. Write Functions in R Expose Data or Model Produce Analysis or Visualization Data Agnostic Perform Analysis on New Data 2. Create Plumber API Endpoints - Get - Post 3. Host the Plumber Script on a Server - Create Plumber router object - Run in an R Session 4. Send Requests to the Plumber Service Through external (or internal) Applications - Jupyter Notebooks - Web Apps
  • 29. Docker Image RStudio Server R Session Running Plumber REST API My Local File System - Plumber.R - Dockerfile Local Volume Link Applications & Notebooks Requests! Demo Framework
  • 30. That’s it! Now go build some sweet data products
  • 32. R-Ladies Global Meetups • Get involved! • More female speakers, leaders, teachers, builders, friends! @RLadiesGlobal
  • 33. RStudio Webinars • All of the talks from RStudio::conf 2018 have just been published • Highly recommend!
  • 34. Resources for Learning Shiny Development
  • 35. Resources for Learning Plumber @TrestleJeff on Twitter!
  • 36. Note to self: Remember to give out stickers I have R-Ladies and R-Ladies Plumber Stickers! I’m Kelly! @kellrstats on Twitter