Grokking Techtalk #40: AWS’s philosophy on designing MLOps platform

© 2019, Amazon Web Services, Inc. or its Affiliates.
My Nguyen – Solutions Architect – Amazon Web Services Vietnam
Dec 2020

Agenda
• What is MLOps?
  • DevOps vs MLOps
  • DevOps practices inheritance
  • Machine learning development lifecycle
• Unique driving factors to MLOps
  • Personas
  • Unique challenges faced by ML workloads
• MLOps practices on Amazon SageMaker
  • Complete separation of steps (and their environments)
  • Versioning & tracking
  • Pipeline automation
  • Continuous improvement
• Demo
• QnA

What is MLOps?
Operationalizing machine learning workloads

DevOps vs MLOps

Note: Technology is just one piece of the overall picture.

DevOps practices inheritance
• Communication & collaboration
• Continuous integration
• Continuous delivery/deployment
• Microservices design
• Infrastructure-as-code & configuration-as-code
• Continuous monitoring & logging

Machine learning development lifecycle

Unique driving factors to MLOps

Personas
• Business stakeholder
• Data scientist
• Domain expert
• Data engineer
• Security engineer
• Machine learning/DevOps engineer
• Software engineer
All with different skillsets & priorities

Unique challenges
• Data:
• The need to utilize production data in development activities
• Dependencies on data pipelines
• Longer experiment lifecycles
• Output of model artifacts:
• Independent lifecycles between model and integrated applications/systems
• Monitoring & tracking of experiments and models
• Unique metrics for performance evaluation

MLOps practices on Amazon SageMaker

Complete separation of steps
• Data processing
• Explore & Build
• Train & Validate
• Deploy
• Monitor
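One way to picture this separation is a set of steps that share nothing but artifacts on disk. The sketch below is a toy illustration only: the function names, file names, and the "predict the mean" model are assumptions made for the example and are not taken from the talk or from SageMaker.

```python
import json
from pathlib import Path


def process_data(raw_rows, out_dir: Path) -> Path:
    """Data processing step: drop rows without a label and persist the result."""
    cleaned = [row for row in raw_rows if row.get("y") is not None]
    out_path = out_dir / "processed.json"
    out_path.write_text(json.dumps(cleaned))
    return out_path


def train(processed_path: Path, out_dir: Path) -> Path:
    """Train step: reads only the processed artifact, writes only a model artifact."""
    rows = json.loads(processed_path.read_text())
    mean_y = sum(row["y"] for row in rows) / len(rows)  # toy model: predict the mean
    model_path = out_dir / "model.json"
    model_path.write_text(json.dumps({"mean_y": mean_y}))
    return model_path


def validate(model_path: Path, processed_path: Path) -> float:
    """Validate step: mean absolute error of the toy model on the processed data."""
    model = json.loads(model_path.read_text())
    rows = json.loads(processed_path.read_text())
    return sum(abs(row["y"] - model["mean_y"]) for row in rows) / len(rows)


def run_pipeline(raw_rows, workdir: Path) -> dict:
    """Give each step its own directory so the only coupling is the artifact path."""
    processing_dir = workdir / "processing"
    training_dir = workdir / "training"
    processing_dir.mkdir(parents=True, exist_ok=True)
    training_dir.mkdir(parents=True, exist_ok=True)
    processed = process_data(raw_rows, processing_dir)
    model = train(processed, training_dir)
    return {"model_path": model, "mae": validate(model, processed)}
```

Because each step touches only its input and output paths, any one of them could later be moved into its own environment without changing the others.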

Versioning & tracking of every step
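A minimal sketch of what tracking every step could record: a content hash for the data, one for the code, and the hyperparameters, combined into a reproducible run identifier. The record layout and the function names are hypothetical; they do not reflect how SageMaker Experiments stores runs.

```python
import hashlib
import json


def fingerprint(payload: bytes) -> str:
    """Short content hash used as a version label."""
    return hashlib.sha256(payload).hexdigest()[:12]


def build_run_record(data: bytes, code: str, hyperparameters: dict, metrics: dict) -> dict:
    """Tie one experiment run to the exact data, code, and hyperparameters used."""
    # Sorting keys makes the hash independent of dictionary ordering.
    canonical_hp = json.dumps(hyperparameters, sort_keys=True)
    record = {
        "data_version": fingerprint(data),
        "code_version": fingerprint(code.encode("utf-8")),
        "hyperparameters": hyperparameters,
        "metrics": metrics,
    }
    # The run id depends only on inputs, so re-running the same setup gives the same id.
    record["run_id"] = fingerprint(
        (record["data_version"] + record["code_version"] + canonical_hp).encode("utf-8")
    )
    return record
```

Two runs with identical inputs share a run id, while any change to data, code, or hyperparameters produces a new one.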

Pipeline automation
• Metaflow
• Apache Airflow
• AWS Step Functions
• Kubeflow
• Flyte
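The tools listed here all run a pipeline as a graph of dependent steps. The sketch below shows that shared idea with Python's standard-library graphlib (Python 3.9+); it does not use the API of any of the listed tools, and the step names and toy tasks are assumptions for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_dag(dependencies, tasks):
    """Run each task after its upstream tasks, passing upstream results along.

    dependencies maps a step name to the set of steps it depends on.
    tasks maps a step name to a callable that takes a dict of upstream results.
    """
    order = list(TopologicalSorter(dependencies).static_order())
    results = {}
    for name in order:
        upstream = {dep: results[dep] for dep in dependencies.get(name, ())}
        results[name] = tasks[name](upstream)
    return order, results


# Hypothetical four-step pipeline mirroring the stages named in this deck.
PIPELINE = {
    "process_data": set(),
    "train": {"process_data"},
    "validate": {"train"},
    "deploy": {"validate"},
}

TASKS = {
    "process_data": lambda up: [1.0, 2.0, 3.0],
    "train": lambda up: sum(up["process_data"]) / len(up["process_data"]),
    "validate": lambda up: up["train"] > 0,
    "deploy": lambda up: "deployed" if up["validate"] else "skipped",
}
```

Calling run_dag(PIPELINE, TASKS) returns the execution order together with each step's result, so a failed validation can stop the deploy step from doing anything.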

SageMaker workflow
The notebook: An entry-point / studio / IDE
[Diagram: Data Scientists; Notebook (Explore and Interact); SageMaker Container Runtime; Elastic Container Registry (ECR); Simple Storage Service (S3)]

SageMaker workflow
Prepare data and script; find or build container image(s)
[Diagram: Data Scientists; Training Data; Custom Code; Training Image (Framework Code); SageMaker Container Runtime; Elastic Container Registry (ECR); Simple Storage Service (S3)]
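A sketch of what the custom code in this step might look like. It assumes the script runs inside a SageMaker framework container in script mode, where the toolkit exposes the input channel and the model directory through the SM_CHANNEL_TRAINING and SM_MODEL_DIR environment variables (for a channel named "training"). The --target-column argument, the CSV layout, and the toy mean model are hypothetical.

```python
import argparse
import csv
import json
import os
from pathlib import Path


def parse_args(argv=None):
    """Read locations from the container environment, with local fallbacks."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--train", default=os.environ.get("SM_CHANNEL_TRAINING", "./data"))
    parser.add_argument("--model-dir", default=os.environ.get("SM_MODEL_DIR", "./model"))
    parser.add_argument("--target-column", default="y")  # hypothetical hyperparameter
    return parser.parse_args(argv)


def train(train_dir, model_dir, target_column):
    """Fit a toy model (the mean of the target) and save it to the model directory."""
    values = []
    for csv_path in sorted(Path(train_dir).glob("*.csv")):
        with csv_path.open(newline="") as handle:
            for row in csv.DictReader(handle):
                values.append(float(row[target_column]))
    if not values:
        raise ValueError(f"no training rows found in {train_dir}")
    Path(model_dir).mkdir(parents=True, exist_ok=True)
    out_path = Path(model_dir) / "model.json"
    out_path.write_text(json.dumps({"mean": sum(values) / len(values)}))
    return out_path


# As a script entry point this would end with:
#   args = parse_args()
#   train(args.train, args.model_dir, args.target_column)
```

Because the paths come from arguments with environment defaults, the same file runs unchanged on a laptop and inside the training container.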

SageMaker workflow
Run a training job to create a model artifact
[Diagram: Data Scientists; Training Job (Custom, Framework, Data); model.tar.gz; Training Data; Custom Code; Training Image; Framework Code; SageMaker Container Runtime; Elastic Container Registry (ECR); Simple Storage Service (S3)]
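The training job ties together the image in ECR, the data in S3, and an S3 output location for model.tar.gz. The sketch below only builds the request dictionary for the SageMaker CreateTrainingJob API and makes no AWS call. Instance settings are arbitrary defaults, and the artifact path helper reflects the usual output layout as an assumption.

```python
def build_training_job_request(job_name, image_uri, role_arn, train_s3_uri,
                               output_s3_uri, instance_type="ml.m5.large"):
    """Assemble a CreateTrainingJob request; nothing is sent to AWS here."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,      # container image stored in ECR
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "InputDataConfig": [{
            "ChannelName": "training",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": train_s3_uri,       # training data stored in S3
                "S3DataDistributionType": "FullyReplicated",
            }},
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": instance_type,
            "InstanceCount": 1,
            "VolumeSizeInGB": 10,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }


def expected_model_artifact(request):
    """Assumed artifact location; the authoritative value is
    ModelArtifacts.S3ModelArtifacts in the DescribeTrainingJob response."""
    base = request["OutputDataConfig"]["S3OutputPath"].rstrip("/")
    return f"{base}/{request['TrainingJobName']}/output/model.tar.gz"
```

With credentials configured, the dictionary could be passed as boto3.client("sagemaker").create_training_job(**request).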

SageMaker workflow
Deploy the model to a real-time inference endpoint
[Diagram: Data Scientists; Inference Requests; Inference Endpoint (Custom, Framework, Model); Inference Image; model.tar.gz; Training Data; Custom Code; Training Image; Framework Code; SageMaker Container Runtime; Elastic Container Registry (ECR); Simple Storage Service (S3)]
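On the endpoint, the custom code is typically a small set of handler functions that the framework serving container calls for each request. The sketch below follows the model_fn / input_fn / predict_fn / output_fn convention used by several SageMaker prebuilt framework containers; exact signatures vary by framework and version, and the model.json file and "instances" payload shape are assumptions for illustration.

```python
import json
import os


def model_fn(model_dir):
    """Load the model from the directory where model.tar.gz was unpacked."""
    with open(os.path.join(model_dir, "model.json")) as handle:
        return json.load(handle)


def input_fn(request_body, request_content_type):
    """Deserialize the request payload into model input."""
    if request_content_type != "application/json":
        raise ValueError(f"unsupported content type: {request_content_type}")
    return json.loads(request_body)["instances"]


def predict_fn(input_data, model):
    """Toy prediction: return the stored mean once per input instance."""
    return [model["mean"] for _ in input_data]


def output_fn(prediction, accept):
    """Serialize the prediction into the response body."""
    if accept != "application/json":
        raise ValueError(f"unsupported accept type: {accept}")
    return json.dumps({"predictions": prediction})
```

Chaining the four functions locally is a cheap way to test inference code before building or deploying an inference image.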

SageMaker workflow
(…Or run a batch transform job)
[Diagram: Data Scientists; Input Data; Results; Transform Job (Custom, Framework, Model); Inference Image; model.tar.gz; Custom Code; Training Image; Framework Code; SageMaker Container Runtime; Elastic Container Registry (ECR); Simple Storage Service (S3)]
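A batch transform job reuses the same model but reads input from S3 and writes results back to S3 instead of serving live requests. The sketch below builds a request dictionary for the SageMaker CreateTransformJob API without calling AWS; the CSV content type, line splitting, and instance settings are assumptions chosen for the example.

```python
def build_transform_job_request(job_name, model_name, input_s3_uri, output_s3_uri,
                                instance_type="ml.m5.large"):
    """Assemble a CreateTransformJob request; nothing is sent to AWS here."""
    return {
        "TransformJobName": job_name,
        "ModelName": model_name,             # a model already registered in SageMaker
        "TransformInput": {
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": input_s3_uri,       # input data stored in S3
            }},
            "ContentType": "text/csv",
            "SplitType": "Line",             # treat each line as one record
        },
        "TransformOutput": {"S3OutputPath": output_s3_uri},  # results written to S3
        "TransformResources": {"InstanceType": instance_type, "InstanceCount": 1},
    }
```

With credentials configured, the dictionary could be passed as boto3.client("sagemaker").create_transform_job(**request).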

SageMaker workflow
[Diagram: Data Scientists; Inference Requests; Training Job (Custom, Framework, Data); Endpoint / Transformer (Custom, Framework, Model); Inference Image; model.tar.gz; Training Data; Custom Code; Training Image; Framework Code; SageMaker Container Runtime; Elastic Container Registry (ECR); Simple Storage Service (S3)]

Continuous improvement
• SageMaker Hosting Services
• SageMaker Batch Transform
• SageMaker Notebooks
• SageMaker Autopilot
• SageMaker Experiments
• SageMaker Ground Truth
• SageMaker Processing
• SageMaker Model Monitor
• Amazon Augmented AI
• SageMaker Training
• SageMaker Debugger
• SageMaker Hyperparameter Tuning
SageMaker Studio, the first fully integrated development environment for machine learning
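Continuous improvement depends on noticing when live data stops resembling the training data. The sketch below is a deliberately simple stand-in for that kind of check: it compares the mean of live values against a training baseline. It is not how SageMaker Model Monitor works internally, and the function names and the threshold of 3 standard errors are assumptions.

```python
import math
import statistics


def build_baseline(values):
    """Summarize a feature from the training data (needs at least two values)."""
    return {"mean": statistics.fmean(values), "stdev": statistics.stdev(values)}


def has_drifted(baseline, live_values, threshold=3.0):
    """Flag drift when the live mean is more than `threshold` standard errors away."""
    live_mean = statistics.fmean(live_values)
    if baseline["stdev"] == 0:
        # A constant baseline feature: any change in the mean counts as drift.
        return live_mean != baseline["mean"]
    standard_error = baseline["stdev"] / math.sqrt(len(live_values))
    return abs(live_mean - baseline["mean"]) / standard_error > threshold
```

A positive result would be the trigger to investigate, relabel, or retrain, which closes the loop back to the start of the workflow.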

Demo
Transformation from local notebook to SageMaker workflow

The bigger picture

QnA
References:
https://d1.awsstatic.com/whitepapers/architecture/wellarchitected-Machine-Learning-Lens.pdf
https://github.com/aws-samples/aws-stepfunctions-byoc-mlops-using-data-science-sdk
https://github.com/apac-ml-tfc/sagemaker-workshop-101

Thank you!
My Nguyen - https://www.linkedin.com/in/mynguyen6512/
