Machine learning is becoming one of the biggest trends in modern systems development, with the potential to bring businesses strategic insights and predictions. However, building and integrating a machine learning system is not always easy, especially for large, distributed systems, where the engineering discipline around machine learning is not yet as mature as that of traditional software.
In this talk, we will explore how Amazon Web Services (AWS) designed and built one of the most widely adopted MLOps platforms in the world: Amazon SageMaker.
- About the speaker: My Nguyễn is a Solutions Architect at AWS Vietnam, specializing in helping customers build machine learning systems.
Code versioning controls
Shared environments, IDE – Jupyter Notebook/Lab
Infrastructure as code
Self-service environment
SaaS
Most importantly: training & processing
Separation of source, environments, etc.
Security
Experiment lifecycles
Pricing
Efficiency
Reproducibility is hard
End-to-end traceability
Dashboard ->
Netflix built Metaflow
Lyft built Flyte
Kubeflow
Apache Airflow
Important factor: skill set & enforcement
Metaflow
Netflix built Metaflow
Netflix is a huge customer of AWS
In production since 2018
Made open source by Netflix & AWS in 2019
What is it?
Basic concepts of Metaflow
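To make those concepts concrete, here's a minimal sketch of a Metaflow flow (the flow name and values are illustrative, not from the talk). A flow is a Python class; each @step runs as an isolated unit of work; anything assigned to self is snapshotted as a versioned artifact:

    from metaflow import FlowSpec, step

    class TrainFlow(FlowSpec):

        @step
        def start(self):
            self.lr = 0.01          # illustrative hyperparameter, versioned automatically
            self.next(self.train)

        @step
        def train(self):
            # real training code would go here
            self.accuracy = 0.9     # artifacts persist across steps and runs
            self.next(self.end)

        @step
        def end(self):
            print(f"finished with accuracy {self.accuracy}")

    if __name__ == "__main__":
        TrainFlow()

Running python train_flow.py run executes the DAG locally; the same flow can be pushed to AWS compute (e.g. run --with batch) without changing the flow code.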
Deploying to AWS is easy
Flyte
A K8s native distributed workflow orchestrator used at Lyft for:
Data science
Pricing
Fraud detection
Locations
ETA and more
Enables highly concurrent, scalable workflows for ML and data processing
Core concepts of Flyte – tasks, DAGs, workflows, and control-flow specification.
Tasks can be written in any language – each task executes as a container.
Flyte provisions the necessary resources dynamically, executes tasks as Docker containers, and de-provisions resources when tasks complete to control costs.
Supports execution across hundreds of machines, e.g. for production model training
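For illustration, a toy flytekit workflow in Python (the modern Flyte SDK; the names and values here are mine, not Lyft's). Tasks are strongly typed functions that each run as a container; the workflow function is the DAG specification:

    from flytekit import task, workflow

    @task
    def preprocess(n: int) -> int:
        # each task executes in its own container on the cluster
        return n * 2

    @task
    def train(n: int) -> str:
        return f"trained on {n} records"

    @workflow
    def pipeline(n: int = 100) -> str:
        # wiring tasks together defines the DAG; Flyte handles scheduling
        return train(n=preprocess(n=n))

    if __name__ == "__main__":
        print(pipeline(n=100))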
Kubeflow and Airflow are fairly popular
Airflow
Amazon SageMaker integrates with Apache Airflow (1.10.1). If you use Airflow, you can orchestrate SageMaker workflows from within Apache Airflow
More details from https://sagemaker.readthedocs.io/en/stable/using_workflow.html
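As a hedged sketch of what that wiring can look like (SageMaker SDK v1-era parameter names to match Airflow 1.10.x; the image URI, role ARN, bucket, and dates are placeholders – see the link above for the authoritative API):

    from datetime import datetime

    from airflow import DAG
    from airflow.contrib.operators.sagemaker_training_operator import SageMakerTrainingOperator
    from sagemaker.estimator import Estimator
    from sagemaker.workflow.airflow import training_config

    # placeholder estimator – image, role, and instance settings are illustrative
    estimator = Estimator(
        image_name="<your-training-image-uri>",
        role="arn:aws:iam::123456789012:role/SageMakerRole",
        train_instance_count=1,
        train_instance_type="ml.m5.xlarge",
    )

    # translate the estimator into a CreateTrainingJob request Airflow can run
    train_cfg = training_config(estimator=estimator, inputs="s3://my-bucket/train")

    dag = DAG("sagemaker_training", start_date=datetime(2020, 1, 1), schedule_interval="@daily")

    train_model = SageMakerTrainingOperator(
        task_id="train_model",
        config=train_cfg,
        wait_for_completion=True,
        dag=dag,
    )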
Many customers want to use the fully managed capabilities of Amazon SageMaker for machine learning, but also want platform and infrastructure teams to continue using Kubernetes for orchestration and managing pipelines. SageMaker addresses this requirement by letting Kubernetes users train and deploy models in SageMaker using SageMaker operators and pipelines for Kubernetes. With operators and pipelines, Kubernetes users can access fully managed SageMaker ML tools and engines, natively from Kubeflow. This eliminates the need to manually manage and optimize ML infrastructure in Kubernetes while still preserving control of overall orchestration through Kubernetes. Using SageMaker operators and pipelines for Kubernetes, you can get the benefits of a fully managed service for machine learning in Kubernetes, without migrating workloads.
If you use Kubernetes, you can use SageMaker Operators for Kubernetes
You can install the SageMaker Operators for Kubernetes using the provided Helm chart
Once the operators are installed, K8s users can natively invoke SageMaker features like model training, hyperparameter tuning, and batch transform jobs
They can also set up model serving using SageMaker model hosting services
https://sagemaker.readthedocs.io/en/stable/amazon_sagemaker_operators_for_kubernetes.html#what-is-an-operator
https://eksworkshop.com/advanced/420_kubeflow/pipelines/
We see customers build serverless ML workflows using AWS Step Functions
Open source - Step Functions Data Science SDK for SageMaker
Create workflows to pre-process data, train/deploy models using SageMaker
Data pre-processing can be done using AWS Glue
SageMaker functionality like model training, HPO, and endpoint creation is accessible
Use the SDK to create and visualize the workflows
Scale workflows without having to worry about infrastructure
https://aws.amazon.com/about-aws/whats-new/2019/11/introducing-aws-step-functions-data-science-sdk-amazon-sagemaker/
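A rough sketch of that pattern (the estimator is assumed to be a pre-configured SageMaker estimator as in the examples elsewhere in this talk; the role ARN, bucket, and names are placeholders):

    from stepfunctions.steps import Chain, ModelStep, TrainingStep
    from stepfunctions.workflow import Workflow

    # 'estimator' is a pre-configured SageMaker estimator (assumed, not shown)
    train_step = TrainingStep(
        "Train Model",
        estimator=estimator,
        data={"train": "s3://my-bucket/train"},   # illustrative input channel
        job_name="example-training-job",
    )

    model_step = ModelStep(
        "Save Model",
        model=train_step.get_expected_model(),    # wires training output to the model
    )

    workflow = Workflow(
        name="ExampleTrainWorkflow",
        definition=Chain([train_step, model_step]),
        role="arn:aws:iam::123456789012:role/StepFunctionsWorkflowRole",  # placeholder
    )

    workflow.create()         # provisions the state machine
    workflow.execute()        # kicks off a run
    workflow.render_graph()   # visualize the workflow in a notebook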
Many good tools exist. You can run any of the tools we saw earlier on AWS.
Remember - Tools are meant to make your life easier
Don’t get fixated on the tools.
Work backwards from the problem you are trying to solve.
So think about your existing software engineering workflows and tools
Ask yourself, which tools will best augment what you already have
Ask yourself, which tools are your people most comfortable with
The AWS approach: use the tools that work for you
It's easy to think of SageMaker as just a notebook.
The key thing to remember is that the notebook UI we see a lot in the demos is just a part of the SageMaker platform – and an optional part at that!
The notebook is the front-end environment in which we’ll experiment with our data and code.
Keep that notebook instance a low-cost resource. That's the value of separation…
When we’re ready to try and train or deploy a model, we’ll be spinning up separate, dedicated infrastructure in the SageMaker container runtime – which means we have lots of flexibility to choose resources cost-effectively and only pay for what we need.
All managed
The orchestration that SageMaker gives us to make this happen is closely integrated to these other two services:
The images defining our containers will need to be stored in Amazon ECR (there's not currently an integration for external registries like DockerHub – but if you have a particular technology in mind, our service team would appreciate the feedback!)
…And the preferred storage platform for not just our input data but also model artifacts and other outputs generated in the workflow will be Amazon S3. Why? S3 has everything you need for a data lake: it's the most integrated service, arguably the most mature, with storage tiers, security models, and high durability.
Recapping: 4 things
…So let’s look at how that end-to-end process works.
To start with I have:
The data that I want to train on (prepared and loaded to S3) – pre-processed already in the notebook, but there's also the option to use other services like Glue or Processing Jobs to …
The training script I'd like to run (e.g. defining the neural network shape and fitting routine – on the notebook instance where I'm working) – minimum code
One of the pre-prepared SageMaker framework container images somewhere in Amazon ECR – maybe TensorFlow, PyTorch, or MXNet – repeatable, controlled, reproducible
So what’s happening when we start a training job by calling “estimator.fit()” in those examples from before?
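For reference, here's roughly what that pattern looks like with the SageMaker Python SDK (a sketch using v2-style parameter names; train.py, the role ARN, and the bucket are placeholders):

    from sagemaker.tensorflow import TensorFlow

    estimator = TensorFlow(
        entry_point="train.py",            # your training script (placeholder)
        role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder role
        instance_count=1,
        instance_type="ml.p3.2xlarge",     # dedicated training infrastructure
        framework_version="2.4.1",
        py_version="py37",
        hyperparameters={"epochs": 10},    # surfaced to the script as CLI arguments
    )

    # one call triggers everything the arrows below describe
    estimator.fit({"train": "s3://my-bucket/prefix/train"})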
We’re gonna start seeing a lot of arrows here, so the cool thing to remember is that all of the arrows are things *SageMaker is doing for you* - not things you need to do yourself!
First, assuming you provide a custom code script (or folder of code), the SageMaker SDK is going to zip that up and upload it to a new location in S3. So you can’t forget to check your working version in to git, and you won’t lose track of that version that worked well in the middle of your experiments: The results are going to be traceable to the code that created them.
Next, SageMaker is going to spin up whatever infrastructure you asked for in the fit() request, and pull down the docker image to run on it
SageMaker will also start downloading your source data from S3 into the container – no messing about with S3 API calls in your script – your code can read it from a local folder, just as if you were running locally. Env params…
As the container fires up, that framework application does a load of helpful prep but one particularly important thing: It installs any additional inline dependencies specified for your custom code, then starts it up and passes in the parameters of the training job.
Your code runs, prints status to the console, and saves the trained model to disk just like you normally would… But SageMaker takes care of zipping and uploading that final model to S3 – and also other output mechanisms like sending the logs to CloudWatch and collecting metrics. Pay only for …
So the benefit we've gained here is that our custom code can be quite simple: Load a CSV from file, make a random forest, save it to file, etc. We can even specify additional dependencies via a requirements.txt file… and SageMaker plus the framework container will orchestrate these overhead tasks to give us this nice lineage-traceable workflow with all of the cool features we talked about earlier – with no extra code complexity required on our part.
When it’s time to deploy that model to an inference endpoint, we simply reference:
Our model artifact tarball from S3
An inference container (which might be the same one as for training, or might be a different image because the dependencies could be differently optimized for run-time)
And maybe some custom code again: This time just defining some helper functions that we might want to customize from the built-in inference flow, such as how to de/serialize requests and responses, or how the model file(s) need to be loaded from disk into memory if the process is different from standard. How it’s optimized
As in training, SageMaker will handle the creation of infrastructure and loading of these components for us. If we used the ‘estimator’ pattern from the high-level SageMaker SDK, all we need to call is a single estimator.deploy(…) function to make it happen.
Again here the intent is that any custom code needed can be small: Just providing a few optional functions for serialization, model loading, etc… Rather than writing and having to maintain a model server, integrations with TorchServe or TensorFlow Serving, etc.
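Continuing the training sketch from earlier, the deploy-and-predict flow might look like this (the instance type and payload are illustrative):

    # SageMaker provisions the endpoint, pulls the inference container,
    # and loads the model artifact from S3 behind the scenes
    predictor = estimator.deploy(
        initial_instance_count=1,
        instance_type="ml.m5.large",
    )

    payload = [[0.1, 0.2, 0.3]]          # illustrative request body
    result = predictor.predict(payload)  # real-time inference call

    predictor.delete_endpoint()          # tear down when finished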
Custom input format (JSON)…
Not today, but…
In SageMaker, batch transform jobs function pretty much identically to real time inference endpoints from a user code point of view: The batch transform engine handles reading your source data from S3, feeding it through your model, storing the results back to S3, and shutting down the resources again as soon as the job is done.
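A hedged sketch of the same estimator driving a batch transform job (the paths and instance type are placeholders):

    # derive a transformer from the trained estimator
    transformer = estimator.transformer(
        instance_count=1,
        instance_type="ml.m5.xlarge",
        output_path="s3://my-bucket/batch-output",   # results written back here
    )

    transformer.transform(
        "s3://my-bucket/batch-input",   # source data SageMaker reads from S3
        content_type="text/csv",
        split_type="Line",              # one line per inference request
    )
    transformer.wait()   # resources shut down automatically when the job completes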
Pay only for…
Mechanism: what's easiest for different personas?
Skillset dependency – learning curve
…So that’s our overview picture for framework containers:
You write pretty minimal code just as you usually would for experimenting in your notebook. But instead of running that code locally, which can make things like infrastructure optimization, experiment tracking, and inference deployment tricky… SageMaker provides some nice streamlined, high-level APIs to trigger containerized training and inference jobs (or deploy endpoints) on separate infrastructure.
At the fundamental level, the system is super flexible because you can make fully custom container images and model artifact tarballs… But the framework container images together with the SageMaker SDK library (for your notebook) enable this higher-level, container-plus-custom-code workflow.
Same as the morning, just a different drawing
Solves problems around experimenting, tracking, etc.
Also lessons learned & best practices
The Repeatable stage is generally focused on applying automation as the number of machine learning workloads running in production increases. In general, at this stage many of the activities in building, training, and deploying machine learning models are automated. The introduction of automation reduces manual hand-offs between teams and reduces the operational overhead of previously manual/ad-hoc tasks. The ability to orchestrate machine learning workflows into automated pipelines also depends on having a data strategy and automated data processing tasks.
Queue Management: Ability to manage, schedule, and prioritize tasks
Resource Management: Access to horizontally scalable compute that can scale based on workflow task requirements
Workflow Operators: Error handling, retry, and conditional logic functions (see the sketch after this list)
Workflow Logs: Centralized logs and configuration parameters for execution and task level logs
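As one concrete illustration of workflow operators, the Step Functions Data Science SDK shown earlier lets you attach retry and error handling to a step (a sketch; the error names and parameters are illustrative, and train_step is the TrainingStep from the earlier workflow sketch):

    from stepfunctions.steps.states import Catch, Fail, Retry

    failed = Fail("TrainingFailed", cause="Training job failed after retries")

    # retry transient failures with exponential backoff
    train_step.add_retry(Retry(
        error_equals=["States.TaskFailed"],
        interval_seconds=30,
        max_attempts=2,
        backoff_rate=2.0,
    ))

    # route anything still failing to an explicit failure state
    train_step.add_catch(Catch(
        error_equals=["States.ALL"],
        next_step=failed,
    ))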
The Reliable stage builds on the automation from the Repeatable stage, but aims to ensure automation is balanced with practices aimed at increasing quality, enabling end-to-end traceability, increasing reliability through automatic rollbacks, increasing visibility into development and operational health, and ensuring repeatability. In general, at this stage the MLOps practices of Infrastructure-as-Code/Configuration-as-Code, Continuous Integration, Continuous Delivery/Deployment, and Continuous Monitoring are introduced.