© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Julien Simon
Global Evangelist, AI & Machine Learning
@julsimon
Build, train and deploy Machine Learning models at scale
Our mission
Put Machine Learning in the hands of every developer and data scientist
AWS ML Stack
Application Services: API-driven services (Vision, Language & Speech Services, Chatbots)
Platform Services: Deploy machine learning models with high-performance machine learning algorithms, broad framework support, and one-click training, tuning, and inference.
Frameworks & Infrastructure: Develop sophisticated models with any framework, create managed, auto-scaling clusters of GPUs for large scale training, or run prediction on trained models.
https://ml.aws
https://medium.com/@julsimon/a-map-for-machine-learning-on-aws-a285fcd8d932
The Machine Learning Process
Business Problem
ML problem framing
Data Collection
Data Integration
Data Preparation & Cleaning
Data Visualization & Analysis
Feature Engineering
Model Training & Parameter Tuning
Model Evaluation
Are Business Goals met?
Yes: Model Deployment, Monitoring & Debugging
No: Data Augmentation, Feature Augmentation
Re-training
Predictions
Amazon SageMaker
Build
Pre-built notebooks for common problems
Built-in, high-performance algorithms
ALGORITHMS
K-Means Clustering
Principal Component Analysis
Neural Topic Modelling
Factorization Machines
Linear Learner
XGBoost
Latent Dirichlet Allocation
Image Classification
Seq2Seq
And more!
FRAMEWORKS
Apache MXNet, Chainer, TensorFlow, PyTorch, scikit-learn
Set up and manage environments for training
Train and tune model (trial and error)
Deploy model in production
Scale and manage the production environment
Amazon SageMaker
Build
Pre-built notebooks for common problems
Built-in, high-performance algorithms
Train
One-click training
Hyperparameter optimization
Deploy model in production
Scale and manage the production environment
Amazon SageMaker
Build
Pre-built notebooks for common problems
Built-in, high-performance algorithms
Train
One-click training
Hyperparameter optimization
Deploy
One-click deployment
Fully managed hosting with auto-scaling
Amazon SageMaker
Build
Pre-built notebooks for common problems
Built-in, high-performance algorithms
New built-in algorithms
scikit-learn environment
Model marketplace
Search
Git integration
Elastic inference
Train
One-click training
Hyperparameter optimization
P3DN, C5N
TensorFlow on 256 GPUs
Resume HPO tuning job
Deploy
One-click deployment
Fully managed hosting with auto-scaling
Model compilation
Elastic inference
Inference pipelines
Machine Learning Marketplace
Working with Amazon SageMaker
The Amazon SageMaker API
• Python SDK orchestrating all Amazon SageMaker activity
• High-level objects for algorithm selection, training, deploying, automatic model tuning, etc.
• Spark SDK (Python & Scala)
• AWS CLI: ‘aws sagemaker’
• AWS SDK: boto3, etc.
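As a rough illustration of the lowest layer in this list (the AWS SDK), the sketch below assembles the keyword arguments that boto3's `create_training_job` call takes. It only builds the request dictionary and sends nothing to AWS. The job name, image URI, role ARN, S3 paths and instance type are placeholders chosen for the example, not values from the talk.

```python
def training_job_request(job_name, image_uri, role_arn, train_s3, output_s3,
                         instance_type="ml.m5.xlarge", instance_count=1):
    """Build keyword arguments for the SageMaker CreateTrainingJob API.

    All argument values are supplied by the caller; the defaults here are
    illustrative assumptions, not recommendations.
    """
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,       # container holding the algorithm
            "TrainingInputMode": "File",      # copy the dataset to the instance first
        },
        "RoleArn": role_arn,                  # IAM role assumed by SageMaker
        "InputDataConfig": [{
            "ChannelName": "train",
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri": train_s3,
                    "S3DataDistributionType": "FullyReplicated",
                }
            },
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3},  # where model artifacts land
        "ResourceConfig": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
            "VolumeSizeInGB": 10,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }


# Placeholder values for illustration only.
request = training_job_request(
    job_name="demo-job",
    image_uri="<account>.dkr.ecr.<region>.amazonaws.com/<image>:latest",
    role_arn="arn:aws:iam::<account>:role/<role>",
    train_s3="s3://<bucket>/train/",
    output_s3="s3://<bucket>/output/",
)
# With boto3 installed and credentials configured, the job would be started with:
#   boto3.client("sagemaker").create_training_job(**request)
```

The Python SDK's high-level objects wrap this kind of request, so in practice most of these fields are filled in for you.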
Model Training (on EC2)
Model Hosting (on EC2)
Diagram labels: Training data, Training code, Helper code, Model artifacts, Inference code, Ground Truth, Client application, Inference request, Inference response, Inference Endpoint
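The client side of this diagram (inference request in, inference response out) could look roughly like the sketch below. It assumes the endpoint accepts CSV, which depends on the model behind it, and it uses a hypothetical `FakeRuntime` class in place of `boto3.client("sagemaker-runtime")` so that the example runs without an AWS account. The endpoint name is a placeholder.

```python
import io


def predict_csv(runtime_client, endpoint_name, rows):
    """Send rows as CSV to an inference endpoint and return the response text."""
    body = "\n".join(",".join(str(v) for v in row) for row in rows)
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=body.encode("utf-8"),
    )
    # The response body is a stream; read it fully and decode it.
    return response["Body"].read().decode("utf-8")


class FakeRuntime:
    """Hypothetical stand-in for the real runtime client (offline use only)."""

    def invoke_endpoint(self, EndpointName, ContentType, Body):
        # Pretend the model returns a score of 0.5 for every input row.
        n_rows = len(Body.decode("utf-8").splitlines())
        return {"Body": io.BytesIO(",".join(["0.5"] * n_rows).encode("utf-8"))}


result = predict_csv(FakeRuntime(), "my-endpoint", [[1, 2, 3], [4, 5, 6]])
print(result)
```

Passing the client in as an argument keeps the function easy to test; swapping in the real boto3 client requires no change to `predict_csv`.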
Model options
Training code
Built-in Algorithms: Factorization Machines, Linear Learner, Principal Component Analysis, K-Means Clustering, XGBoost, and more
Bring Your Own Container
Bring Your Own Script
Built-in algorithms
Built-in algorithms
orange: supervised, yellow: unsupervised
Linear Learner: regression, classification
Image Classification: Deep Learning (ResNet)
Factorization Machines: regression, classification, recommendation
Object Detection (SSD): Deep Learning (VGG or ResNet)
K-Nearest Neighbors: non-parametric regression and classification
Neural Topic Model: topic modeling
XGBoost: regression, classification, ranking (https://github.com/dmlc/xgboost)
Latent Dirichlet Allocation: topic modeling (mostly)
K-Means: clustering
Blazing Text: GPU-based Word2Vec, and text classification
Principal Component Analysis: dimensionality reduction
Sequence to Sequence: machine translation, speech to text and more
Random Cut Forest: anomaly detection
DeepAR: time-series forecasting (RNN)
Object2Vec: general-purpose embedding
IP Insights: usage patterns for IP addresses
Semantic Segmentation: Deep Learning
Demo:
Image classification with Caltech-256
https://gitlab.com/juliensimon/dlnotebooks/sagemaker/
Blazing Text
https://dl.acm.org/citation.cfm?id=3146354
Demo:
Text Classification with BlazingText
https://github.com/awslabs/amazon-sagemaker-examples/tree/master/introduction_to_amazon_algorithms/blazingtext_text_classification_dbpedia
XGBoost
• Open Source project
• Popular tree-based algorithm for regression, classification and ranking
• Builds a collection of trees
• Handles missing values and sparse data
• Supports distributed training
• Can work with data sets larger than RAM
https://github.com/dmlc/xgboost
https://xgboost.readthedocs.io/en/latest/
https://arxiv.org/abs/1603.02754
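To give a feel for "builds a collection of trees", here is a toy gradient-boosting sketch in plain Python. It is not XGBoost: it uses one-split trees (stumps) on a single feature with squared-error loss, and it has none of XGBoost's regularization, missing-value handling or distributed training. The data and the `fit_stump`, `boost` and `predict` names are made up for the example.

```python
def fit_stump(xs, residuals):
    """Find the threshold on a 1-D feature that best splits the residuals."""
    best = None
    # Skip the largest value so the right-hand side is never empty.
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        err = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, t, lmean, rmean)
    _, t, lmean, rmean = best
    return t, lmean, rmean


def boost(xs, ys, rounds=20, learning_rate=0.5):
    """Fit stumps one after another, each on the residuals of the ensemble so far."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lmean, rmean = fit_stump(xs, residuals)
        stumps.append((t, lmean, rmean))
        preds = [p + learning_rate * (lmean if x <= t else rmean)
                 for x, p in zip(xs, preds)]
    return base, stumps, learning_rate


def predict(model, x):
    """Sum the base value and the scaled output of every stump."""
    base, stumps, learning_rate = model
    return base + sum(learning_rate * (l if x <= t else r) for t, l, r in stumps)


def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


# Made-up data: a step function with a little noise.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]

model = boost(xs, ys)
mse_base = mse(ys, [sum(ys) / len(ys)] * len(ys))
mse_boost = mse(ys, [predict(model, x) for x in xs])
print(mse_base, mse_boost)
```

Each round fits a small tree to what the current ensemble still gets wrong, which is the core idea the XGBoost paper linked above builds on.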
Demo: XGBoost
AWS re:Invent 2018 workshop
https://gitlab.com/juliensimon/ent321
Built-in libraries
Demo:
Keras/TensorFlow CNN on CIFAR-10
https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/tensorflow_keras_cifar10/tensorflow_keras_CIFAR10.ipynb
Demo:
Sentiment analysis with Apache MXNet
https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/mxnet_sentiment_analysis_with_gluon.ipynb
Integration with AWS DeepLens
Use your own models with AWS DeepLens
• AWS DeepLens can run TensorFlow, Caffe and Apache MXNet models
• Inception
• MobileNet
• NasNet
• ResNet
• Etc.
• Train or fine-tune your model on Amazon SageMaker
• Deploy to AWS DeepLens with AWS Greengrass
Generic Architecture
Train model
Write inference code
Set up Greengrass
Deploy model and Lambda function
Run inference and local actions on device
Send insights to the Cloud
Demo:
Image classification with Caltech-256
https://gitlab.com/juliensimon/dlnotebooks/sagemaker/
Amazon SageMaker
Build
Pre-built notebooks for common problems
Built-in, high-performance algorithms
Train
One-click training
Hyperparameter optimization
Deploy
One-click deployment
Fully managed hosting with auto-scaling
Selected Amazon SageMaker customers
Getting started
http://aws.amazon.com/free
https://ml.aws
https://aws.amazon.com/sagemaker
https://github.com/aws/sagemaker-python-sdk
https://github.com/aws/sagemaker-spark
https://github.com/awslabs/amazon-sagemaker-examples
https://gitlab.com/juliensimon/ent321
https://medium.com/@julsimon
https://gitlab.com/juliensimon/dlnotebooks
Julien Simon
Global Evangelist, AI & Machine Learning
@julsimon
https://medium.com/@julsimon
Thank you!
