Cloud storage is one of the primary services offered by almost all leading cloud service providers. This presentation looks into the cloud storage options in Azure, AWS, and Google Cloud Platform. Colombo Cloud User Meetup
Come learn about new and existing Amazon S3 features that can help you better protect your data, save on cost, and improve usability, security, and performance. We will cover a wide variety of Amazon S3 features and go into depth on several newer features with configuration and code snippets, so you can apply what you learn to your object storage workloads.
AWS Glue is a fully managed, serverless extract, transform, and load (ETL) service that makes it easy to move data between data stores. AWS Glue simplifies and automates the difficult and time-consuming tasks of data discovery, conversion mapping, and job scheduling so you can focus more of your time on querying and analyzing your data using Amazon Redshift Spectrum and Amazon Athena. In this session, we introduce AWS Glue, provide an overview of its components, and share how you can use AWS Glue to automate discovering your data, cataloging it, and preparing it for analysis.
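Amazon SageMaker is an integrated platform for machine learning projects. Among its features, Amazon SageMaker Studio provides an integrated development environment for machine learning, in which you can perform every step needed to prepare data and to build, train, and deploy models. Amazon EMR is a big data platform for running large-scale distributed data processing jobs, interactive SQL queries, and ML applications using open-source analytics frameworks such as Apache Spark, Apache Hive, and Presto. In this session, we use a demo to explore the integration between the two services, which lets data scientists and ML engineers easily use distributed big data frameworks in their ML workflows.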
This presentation covers AWS Organizations topics and is meant to educate and prepare those taking the AWS SAA exams.
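Watch the presentation recording: https://youtu.be/eQjkwhyOOmI Building and managing a large-scale data lake is a complex and time-consuming task. AWS Lake Formation is a fully managed service that lets you set up a secure data lake in a matter of days. In this session, we look at how to use AWS Lake Formation to easily configure analytics tools such as Amazon S3, EMR, Redshift, and Athena for data ingestion, cataloging, cleaning, transformation, and security. (Launched in the Seoul Region in November 2019.)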
Amazon Redshift is a fully managed data warehouse service that makes it fast, simple, and cost-effective to analyze data using SQL and existing business intelligence tools. The document provides an overview of Amazon Redshift and its benefits, including speed, low cost, security, scalability, and ease of use. It also provides examples of how various companies use Redshift for big data analytics, including analyzing social media firehoses, mobile usage, and real-time IoT streaming data.
In the event of a disaster, you need to be able to recover lost data quickly to ensure business continuity. For critical applications, keeping your time to recover and data loss to a minimum and optimizing your overall capital expense can be challenging. This session presents AWS features and services along with disaster recovery architectures that you can leverage when building highly available and disaster-resilient strategies.
This document provides an introduction to AWS Glue. It notes that ETL development consumes 70% of data warehouse resources on average. AWS Glue is a fully managed ETL service that automates ETL processes in a serverless Apache Spark environment. It features a data catalog, job authoring tools for Python/Spark code generation, and job execution on serverless Spark. Use cases include understanding data, querying data lakes on S3, and building event-driven ETL pipelines. The presentation demonstrates AWS Glue and reviews pricing.
We’ve partnered with hundreds of customers on their large-scale migrations to AWS. This session outlines some of the common challenges that our customers face and how they’ve overcome these challenges. The session also describes the patterns we’ve observed that make legacy migrations successful and the mechanisms we’ve created to help customers migrate faster.