Copyright © 2018 Uptake | 06-Sep-18 | AWS Startup Day
Data Engineering the Startup Way
| Dan Collins | 2018
AGENDA
Who is Uptake?
Solving Hard Problems
Facts of Life
Continuous Evolution
Data Engineering the Startup Way
Who is Uptake?
• CEO and Co-founder Brad Keywell
• President Ganesh Bell
• ~ 4 years old
• 100+ Customers
• Two-time CNBC Disruptor 50
honoree
• World Economic Forum Technology
Pioneer
• Named one of Chicago’s best workplaces
for 2018 by Fortune
• Ranked in the top 25 of the
2017 “Forbes Cloud 100”
Solving Hard Problems
Industrial AI and IoT
• Predictive Analytics
• Anomaly Detection
• Label Correction
• Applications and AI UX
Solving Hard Problems
Industrial Data
• Telematics
• SCADA Systems
• PLC / Sensor Data
• Contextual Data
• Resource Planning
• Customer Relationships
• Content Management
Solving Hard Problems
Industrial Data is… Dirty
• Out of Order
• High Volatility
• System-wide Snapshots with no deltas
• Pre-determined Aggregation
• Duplicated, Partitioned, Compressed
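Three of the problems above (duplicates, out-of-order arrival, and snapshots with no deltas) can be illustrated with a minimal Python sketch. The `Reading` type, its field names, and both helper functions are hypothetical; they are not taken from Uptake's platform and only show the general shape of the cleanup.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Reading:
    """Hypothetical sensor reading; the field names are assumptions."""
    asset_id: str
    sequence: int  # position assigned by the source system
    value: float


def dedupe_and_order(readings: Iterable[Reading]) -> List[Reading]:
    """Drop repeated (asset_id, sequence) pairs, then sort by that key."""
    seen = set()
    unique = []
    for reading in readings:
        key = (reading.asset_id, reading.sequence)
        if key in seen:
            continue  # duplicate delivery: keep the first copy only
        seen.add(key)
        unique.append(reading)
    return sorted(unique, key=lambda r: (r.asset_id, r.sequence))


_MISSING = object()  # sentinel so that a stored None still compares correctly


def snapshot_delta(previous: dict, current: dict) -> dict:
    """Return the entries of `current` that are new or changed.

    Keys that disappeared from `current` are not reported in this sketch.
    """
    return {
        key: value
        for key, value in current.items()
        if previous.get(key, _MISSING) != value
    }


raw = [
    Reading("pump-1", 3, 7.1),
    Reading("pump-1", 1, 6.9),
    Reading("pump-1", 3, 7.1),  # duplicate of the first reading
    Reading("pump-1", 2, 7.0),
]
ordered = dedupe_and_order(raw)
delta = snapshot_delta({"rpm": 900, "temp": 71}, {"rpm": 900, "temp": 74, "psi": 30})
```

An in-memory `set` only works for bounded batches; at streaming scale the "seen" state would need a window or an external store, which is outside the scope of this sketch.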
Solving Hard Problems
Industrial Data has a past…
• Very old systems (some > 30 years old)
• Susceptible to policy changes over time (formatting, time, etc.)
• Most integrations follow a standard, but not the same one
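One way to cope with formatting and time policies that changed over a system's life is to try each known format in turn and normalize to UTC. The sketch below is illustrative only: the format list is hypothetical, and treating zone-less values as UTC is an assumption that a real integration would need to confirm per source.

```python
from datetime import datetime, timezone
from typing import Optional

# Hypothetical formats one source may have used at different points in its life.
KNOWN_FORMATS = (
    "%Y-%m-%dT%H:%M:%S%z",  # ISO 8601 style with a UTC offset
    "%m/%d/%Y %H:%M:%S",    # US style, no zone (assumed UTC in this sketch)
    "%d-%b-%y %H:%M",       # e.g. 06-Sep-18 14:30 (assumed UTC in this sketch)
)


def parse_timestamp(text: str) -> Optional[datetime]:
    """Return a UTC datetime, or None when no known format matches."""
    for fmt in KNOWN_FORMATS:
        try:
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue  # not this format, try the next one
        if parsed.tzinfo is None:
            parsed = parsed.replace(tzinfo=timezone.utc)
        return parsed.astimezone(timezone.utc)
    return None  # caller decides whether to quarantine the record
```

Returning `None` instead of raising keeps the decision about unparseable records with the caller, which fits the quarantine step described later in the deck.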
Solving Hard Problems
• ~150,000 writes/second
• Across tenants
• Across integrations
[Figure: three event streams, each numbered 1 to 10, plotted against processing time T1 to T10]
Solving Hard Problems
• How it really works
• Remember, industrial data is dirty
• We need to validate, hydrate,
quarantine, and persist updates
as they come in
• We need to be consistent or our
data science models lose their
efficacy
• At ~150,000 writes/second
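The validate, hydrate, quarantine, and persist steps can be sketched as a small Python pipeline. Everything here is hypothetical (the `ASSET_CONTEXT` lookup, the validation rules, and the in-memory lists standing in for durable storage); it shows the control flow only, not how Uptake implements it.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical reference data used to hydrate raw updates with context.
ASSET_CONTEXT: Dict[str, Dict[str, str]] = {
    "truck-7": {"tenant": "acme", "model": "X100"},
}


def validate(update: dict) -> Optional[str]:
    """Return a rejection reason, or None when the update looks usable."""
    if "asset_id" not in update:
        return "missing asset_id"
    if not isinstance(update.get("value"), (int, float)):
        return "non-numeric value"
    return None


def hydrate(update: dict) -> dict:
    """Attach contextual fields; unknown assets get no extra context."""
    return {**update, **ASSET_CONTEXT.get(update["asset_id"], {})}


def process(updates: List[dict]) -> Tuple[List[dict], List[Tuple[dict, str]]]:
    """Route each update to the persisted list or the quarantine list."""
    persisted = []    # stand-in for a durable store
    quarantined = []  # rejected updates, kept together with the reason
    for update in updates:
        reason = validate(update)
        if reason is not None:
            quarantined.append((update, reason))
            continue
        persisted.append(hydrate(update))
    return persisted, quarantined


persisted, quarantined = process([
    {"asset_id": "truck-7", "value": 1.5},
    {"value": 2},
    {"asset_id": "truck-7", "value": "n/a"},
])
```

Keeping rejected updates with their reason, rather than dropping them, means they can be replayed once the rule or the source is fixed.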
[Figure: event streams with gaps, repeats, and out-of-order events, plotted against processing time T1 to T10]
Solving Hard Problems
[Figure: many boxes labeled “Platform” and “Platform Instance”, with the captions “We did it!” and “Oh my!”]
Solving Hard Problems
[Figure labels: Shared Platform; Configured Product; Bespoke Solution; Platform; Feature set; More; caption “We did it!”]
Facts of Life
Machine Learning: The
High-Interest Credit Card of
Tech Debt
Facts of Life
What people talk about
The hard parts
Facts of Life
Changing Anything,
Changes Everything
Facts of Life
So, to recap:
• Take dirty data from old systems
• Scale it to > 150,000 writes/second
• Spin up data science models on top
and balance them really carefully
• What could go wrong?
xkcd.com/1838
Continuous Evolution
1. Proof of Concept
2. Build it
3. Learn from it
4. Repeat
Continuous Evolution – Proof of Concept
• Prototype: from “works on my machine” to “scales in the cloud”
• We create real-world working models written in R and Python
and sample data sets
• Focus on the problem, not the infrastructure, monitoring, etc.
• Use the “beefiest” boxes to find equilibrium
• AWS allows you to go all in as soon as you’re ready to start
• Quickly spin up test instances or scaffold an environment
Continuous Evolution – Build It
• Build out for scale
• Account for real-world data sets on distributed systems
• Lean on managed services and IaaS as your foundation
• AWS managed services and elastic scaling can drastically
reduce the time it takes to get up and running
• You can be production ready very quickly
Continuous Evolution – Build It
What people talk about
The hard parts
AWS kickstarts your data
engineering here
Continuous Evolution – Learn from It
• Codify patterns and encourage repeatability
• From bespoke to baked in
• Review trade-offs
• Analyze compute, I/O, parallelism
• Partition the problem space
• The scientific method, AWS’ huge array of services, and some
luck let you put hindsight to work as you build
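"Partition the problem space" can mean many things; one common reading is routing work by a stable key so that each tenant and integration is handled independently. The sketch below assumes that reading. The key layout and the `partition_for` name are hypothetical, not something the deck specifies.

```python
import hashlib


def partition_for(tenant: str, integration: str, partitions: int) -> int:
    """Map a (tenant, integration) pair to a stable partition index.

    hashlib is used instead of the built-in hash() because hash() on
    strings is randomized per process and would not be stable across workers.
    """
    if partitions <= 0:
        raise ValueError("partitions must be positive")
    key = f"{tenant}/{integration}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % partitions


index = partition_for("acme", "telematics", 16)
```

Because the mapping depends only on the key and the partition count, every writer sends a given tenant and integration to the same place, which helps keep per-key ordering when writes are spread across many workers.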
Continuous Evolution
Repeat
“A program that is used and that as an
implementation of its specification reflects some
other reality, undergoes continual
change or becomes progressively less
useful. The change or decay process continues
until it is judged more cost effective to replace
the system with a recreated version.”
- Meir Lehman’s law of software evolution
Data Engineering the Startup Way
Monolith
Microservices
Platform
• Features and efficiency are a better fit with each iteration
• Survival depends on flexibility and feedback
[Figure labels: Data Science; Applications; Data Engineering; Platform]
Data Engineering the Startup Way
1. Focus on Value
2. Choose good abstractions
3. Act like an enterprise
4. Invest
5. Be Open
Data Engineering the Startup Way – Focus on Value
• You have great ideas.
• Focus on where you have value, let others solve the less
interesting problems
• Use what’s available when it’s available, check often
• AWS and services like it can remove noise, letting you focus on
where you’re most innovative
Data Engineering the Startup Way – Choose good abstractions
• Choose abstractions that let you take advantage of managed
services
• Don’t reinvent the wheel and don’t be afraid to change the
implementation
• Docker, microservices, test-driven development, continuous
delivery, automation, etc. can all help you here
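As a rough illustration of an abstraction that leaves the implementation free to change, the Python sketch below codes against a small storage interface. `BlobStore`, `InMemoryStore`, and `archive_reading` are hypothetical names; a production implementation could wrap a managed object store behind the same two methods without touching the calling code.

```python
from typing import Dict, Optional, Protocol


class BlobStore(Protocol):
    """Hypothetical storage interface the rest of the code depends on."""

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> Optional[bytes]: ...


class InMemoryStore:
    """Test double; satisfies BlobStore structurally, no inheritance needed."""

    def __init__(self) -> None:
        self._items: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._items[key] = data

    def get(self, key: str) -> Optional[bytes]:
        return self._items.get(key)


def archive_reading(store: BlobStore, asset_id: str, payload: bytes) -> str:
    """Write a payload through the interface and return the key used."""
    key = f"readings/{asset_id}"
    store.put(key, payload)
    return key


store = InMemoryStore()
key = archive_reading(store, "pump-1", b"42")
```

The calling code never names a concrete backend, so swapping the in-memory double for a managed service is a change in one place.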
Data Engineering the Startup Way – Act like an enterprise
• When you use world-class, global services, you get the
service levels of world-class, global services.
• Use services to enable your two-person team to operate like the
army of infra/ops they’re used to working with
• An outage is an outage no matter how small…
Data Engineering the Startup Way – Invest
• Pairing really smart people with really great services gives you
the flexibility to be curious while you deliver
• Put down a foundation in your data platform and use managed
services where you can
• Craft your platform
• Investing in your data engineering gives you repeatability and
“paved roads” you can use to accelerate your delivery
Data Engineering the Startup Way – Be Open
• There are a lot of smart people working on really useful
projects
• Scala, Flink, Spark, Kafka, Postgres, Docker, Airflow,
Kubernetes, Mesos, Kudu, Hive, Impala
• Get involved, share back, and use
open source
Data Engineering the Startup Way – Oh, and Have Fun
• Don’t fight change; build systems and orgs that are flexible
• Use all the cool tech and packaged solutions to get you closer
to your vision
• And have fun!
• There’s never been a better time to be building
Recap
In Summary
• is awesome
• There are hard problems and we’re solving them
• You can solve your hard problems too if you try
• AWS makes it easier, especially for startups
• Build, Learn, Repeat
• Have fun
Copyright © 2018 by Uptake Technologies Inc. All rights reserved. No parts of this document may be
distributed, reproduced, transmitted, or stored electronically without Uptake’s prior written permission. This
document contains Uptake's confidential and proprietary information. If a pre-existing contract containing
disclosure and use restrictions exists between your company and Uptake, you and your company will use the
information in this document subject to the terms of the pre-existing contract. If no such pre-existing contract
exists, you and your Company agree to protect the information in this document and agree not to reproduce or
disclose the information in any way. Uptake makes no warranties, express or implied, in this document. Uptake
shall not be liable for damages of any kind arising out of use of this document. Any discussion of potential
features is not a promise of future functionality.
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Thanks!
