SLO DRIVEN DEVELOPMENT
Alon Nativ, Tomorrow.io
Photo by Oscar Keys on Unsplash
Name: Alon Nativ
Company: Tomorrow.io
Hobbies: Rant about Python
@anativ /anativ
Accurate ML & Data Save Lives
LIGHTNING ALERT
Image by Keli Black from Pixabay
Notify on Every Lightning In Less Than 500 ms
Image by David Schwarzenberg from Pixabay
SIMPLE
DESIGN
Why?
Requirements
Lambda?
Cold Start
Pub/Sub?
No SLA
@#$!&%^
!!!!
Notify on Every Lightning In Less Than 500 ms
Image by David Schwarzenberg from Pixabay
Notify on EVERY Lightning In Less Than 500 ms
Image by David Schwarzenberg from Pixabay
The most IMPORTANT feature of any system is its RELIABILITY
100% is the wrong RELIABILITY target for basically EVERYTHING.
Benjamin Treynor Sloss
VP of 24x7 Engineering, Google
Every number that you pick has a DIRECT IMPACT on your cost, velocity, and architecture
99.5
99.4 99.6 99.7 99.8 99.9 1
Image by Arek Socha from Pixabay
SLA
Photo by Todd Quackenbush on Unsplash
SLA - Service Level Agreement
Binding Agreement
Pay
Sales & Customers
Image by Okan Caliskan from Pixabay
DOWN OVER a DAY
Image by Markus Kammermann from Pixabay
SLA            | Refund | Time / Month
99% > X >= 95% | 25%    | 1d 12h 31m
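The "Time / Month" column can be reproduced with a little arithmetic. The sketch below is illustrative only: it assumes an average month of 365.25 / 12 days (an assumption that happens to match the slide's figure), and the helper names `allowed_downtime` and `format_duration` are hypothetical.

```python
from datetime import timedelta

# Assumption: an "average month" of 365.25 / 12 days (about 30.44 days).
AVERAGE_MONTH = timedelta(days=365.25 / 12)


def allowed_downtime(availability_percent: float,
                     period: timedelta = AVERAGE_MONTH) -> timedelta:
    """Downtime that still meets the availability target over `period`."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return period * ((100 - availability_percent) / 100)


def format_duration(delta: timedelta) -> str:
    """Render a duration as days, hours and whole minutes."""
    total_minutes = int(delta.total_seconds() // 60)
    days, remainder = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(remainder, 60)
    return f"{days}d {hours}h {minutes}m"


if __name__ == "__main__":
    # The lower bound of the slide's SLA tier: 95% availability.
    print(format_duration(allowed_downtime(95)))  # 1d 12h 31m
```

At 95% availability the service may be down for more than a day and a half every month before the next refund tier applies.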
If you are proud of your SLA you are probably doing something WRONG
SLA SLO
Photo by Todd Quackenbush on Unsplash
SLO - Service Level Objectives
User Happiness
Your Expectations
Product & SRE*
GOOD SLO
0ms 1500ms
?
SLA SLO SLI
Photo by Todd Quackenbush on Unsplash
SLI - Service Level Indicator
Key Metrics
Monitors
Developers & SRE*
SLI = (good events / valid events) x 100
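As a rough sketch of the ratio above, the snippet below counts an alert as "good" when it was delivered in under 500 ms, the target from the lightning alert requirement. The latency samples and the names `sli`, `latencies_ms` and `THRESHOLD_MS` are made up for illustration.

```python
def sli(good_events: int, valid_events: int) -> float:
    """SLI as a percentage: good events divided by valid events, times 100."""
    if valid_events <= 0:
        raise ValueError("need at least one valid event")
    return good_events / valid_events * 100


# Hypothetical notification latencies in milliseconds (not real data).
latencies_ms = [120, 340, 480, 510, 90, 700, 450, 300]
THRESHOLD_MS = 500  # "in less than 500 ms"

good = sum(1 for latency in latencies_ms if latency < THRESHOLD_MS)

if __name__ == "__main__":
    print(sli(good, len(latencies_ms)))  # 75.0
```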
recipe
for
good
SLI
GOOD SLI
Up to 4
No Internal* Metrics
2-4
Response Time
Number Of Results
Top Clicks
CPU
X
HIGH CORRELATION
Bad Good
HIGH CORRELATION
This slide can’t be reached
ERROR_NO_SLIDE_FOUND
1 - SLO = ERROR BUDGET
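A minimal sketch of the error budget arithmetic, assuming the SLO is expressed as a percentage and the budget is counted in events. The function names and the traffic numbers are hypothetical; 99.5 is just one of the example targets shown earlier.

```python
def error_budget(slo_percent: float, valid_events: int) -> float:
    """Number of bad events allowed: (100 - SLO)% of all valid events."""
    if not 0 <= slo_percent <= 100:
        raise ValueError("SLO must be between 0 and 100")
    return valid_events * (100 - slo_percent) / 100


def budget_remaining(slo_percent: float, valid_events: int,
                     bad_events: int) -> float:
    """Fraction of the budget still unspent (negative once overspent)."""
    budget = error_budget(slo_percent, valid_events)
    if budget == 0:
        # A 100% target leaves no budget at all to spend.
        raise ValueError("a 100% SLO has no error budget")
    return 1 - bad_events / budget


if __name__ == "__main__":
    # Hypothetical month: one million events against a 99.5% SLO.
    print(error_budget(99.5, 1_000_000))            # 5000.0
    print(budget_remaining(99.5, 1_000_000, 1250))  # 0.75
```

A negative result from `budget_remaining` means the budget is overspent.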
SPENDING ERROR BUDGET
[Chart: SLI and Error Budget; errors per day; Weekend]
MetaGoat
Team A: TESTS
Team B: CI/CD
W. Edwards Deming.
Data Scientist
Without DATA you are another person with an OPINION.
MTTR / MTTF
MTTR: Mean Time To Recovery
MTTF: Mean Time To Failure
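One common way to connect the two metrics to availability is the steady-state formula availability = MTTF / (MTTF + MTTR). This formula is a standard textbook relation, not something stated on the slides, and the hour values below are invented purely to compare the two improvement strategies.

```python
import math


def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: uptime divided by uptime plus repair time."""
    if mttf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTTF must be positive and MTTR non-negative")
    return mttf_hours / (mttf_hours + mttr_hours)


# Hypothetical baseline: a failure every 720 hours, 2 hours to recover.
baseline = availability(720, 2)

# Fail half as often (for example, through more tests).
longer_mttf = availability(1440, 2)

# Recover twice as fast (for example, through quick rollbacks).
shorter_mttr = availability(720, 1)

if __name__ == "__main__":
    print(f"baseline:     {baseline:.5f}")
    print(f"longer MTTF:  {longer_mttf:.5f}")
    print(f"shorter MTTR: {shorter_mttr:.5f}")
    # Under this model both improvements give the same availability.
    print(math.isclose(longer_mttf, shorter_mttr))
```

Under this simple model, doubling MTTF and halving MTTR are equivalent, so the choice between them depends on which is cheaper for the team to achieve.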
Team B: MTTR (rollback)
Team A: MTTF (tests)
If you can’t MEASURE it, you can’t IMPROVE it.
Lord Kelvin
Mathematician & engineer
TRADEOFFS
SPARE BUDGET
Features
Risky Experiments
Spot / preemptible
Scale Down
A/B Testing
Image by PublicDomainPictures from Pixabay
OUT OF BUDGET
Deployment freeze
Post Mortem
CI/CD
Monitoring
Relax SLO
Deprecate Services
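The two lists can be read as a simple policy keyed on how much error budget is left. The sketch below is one possible encoding: the action names come from the slides, while the function `suggested_actions` and the choice of zero as the cut-off are assumptions made for illustration.

```python
from typing import List

# Options from the "spare budget" slide.
SPARE_BUDGET_ACTIONS = [
    "Features",
    "Risky Experiments",
    "Spot / preemptible",
    "Scale Down",
    "A/B Testing",
]

# Options from the "out of budget" slide.
OUT_OF_BUDGET_ACTIONS = [
    "Deployment freeze",
    "Post Mortem",
    "CI/CD",
    "Monitoring",
    "Relax SLO",
    "Deprecate Services",
]


def suggested_actions(budget_remaining: float) -> List[str]:
    """Pick a list of options based on the unspent fraction of the budget.

    Assumption: any positive remainder counts as spare budget.
    """
    if budget_remaining > 0:
        return SPARE_BUDGET_ACTIONS
    return OUT_OF_BUDGET_ACTIONS


if __name__ == "__main__":
    print(suggested_actions(0.75))   # budget left: spend it
    print(suggested_actions(-0.10))  # overspent: protect reliability
```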
Photo by Allef Vinicius on Unsplash
HIGH SLO
Less Budget
Development Time
Sleeping Hours
Maintenance
x2-x10
Photo by Gina Neri on Unsplash
FIRST STEPS
Image by Pexels from Pixabay
USER CENTRIC
Photo by Mark Pan4ratte on Unsplash
In GOD we trust, all others bring DATA
W. Edwards Deming.
Data Scientist
Photo by Jackson David on Unsplash
USE YOUR BUDGET
Image by Olya Adamovich from Pixabay
Users
Data
Budget
Photo by montatip lilitsanong on Unsplash
If you can’t manage your RELIABILITY, your reliability MANAGES you
Photo by Jonathan Klok on Unsplash
