LET'S DECIPHER THE DEVOPS MACEDONIA
- Suman Kumari & Wamika Singh from ThoughtWorks
Milan | November 29 - 30, 2018
INTRODUCTION
A quick word about us
SUMAN KUMARI
Application Developer
WAMIKA SINGH
Quality Analyst
at
THE TOPIC
What are we presenting
Athena
S3
BUSINESS CONTEXT & CONSTRAINTS
OUR PROBLEM STATEMENT
What our client’s business looks like
➡ A leading manufacturing company, headquartered in Italy
➡ Close to 20 plants globally
➡ Each factory with its own physical data storage
➡ Streaming large volumes of data
➡ Digital transformation
➡ Intent to get better insights on the process using data
OUR PROBLEM STATEMENT
What constraints did we have
➡ Extensibility and availability of products in 20 plants
➡ Growing number of products and data
➡ Keeping a low operational and maintenance cost
➡ Staying cloud agnostic
➡ Building a proof of concept before an org-wide rollout
[Diagram labels: Factory, HQ]
OUR PROBLEM STATEMENT
What technology challenges were we dealing with
1. APPLICATION INFRASTRUCTURE
2. DATA STREAMING
3. QUERYING SERVICE
4. CONTINUOUS DEPLOYMENT
1. APPLICATION INFRASTRUCTURE
1. APPLICATION INFRASTRUCTURE
How we made the choice
CONTAINERS
Cost
Portable
Consistent
Isolation
1. APPLICATION INFRASTRUCTURE
How we made the choice
KUBERNETES
✓ Open source
✓ Provides primitives for modern applications
✓ Auto scaling of services
✓ Automated rollouts and rollbacks
✓ Auto discovery
✓ Self healing
✓ Zero downtime
✓ Cloud agnostic deployment
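As a minimal sketch of how several of these primitives are usually expressed, the snippet below builds a Kubernetes Deployment manifest as a Python dictionary and prints it as JSON, a format kubectl accepts alongside YAML. The application name, image, port and health-check path are hypothetical placeholders and are not taken from the talk.

```python
import json


def build_deployment(name: str, image: str, replicas: int = 3) -> dict:
    """Return an apps/v1 Deployment manifest as a plain dictionary."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            # Several replicas let the cluster replace a failed pod (self healing).
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            # Rolling updates swap pods one at a time, which is how
            # automated rollouts avoid downtime.
            "strategy": {
                "type": "RollingUpdate",
                "rollingUpdate": {"maxUnavailable": 0, "maxSurge": 1},
            },
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": 8080}],
                            # Hypothetical health endpoint; the container is
                            # restarted when the probe keeps failing.
                            "livenessProbe": {
                                "httpGet": {"path": "/health", "port": 8080},
                                "initialDelaySeconds": 10,
                                "periodSeconds": 5,
                            },
                        }
                    ]
                },
            },
        },
    }


# Hypothetical application name and image, for illustration only.
manifest = build_deployment("plant-dashboard", "example.registry/plant-dashboard:1.0.0")
print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and applied with `kubectl apply -f`; a rollback would then typically be done with `kubectl rollout undo`.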
1. APPLICATION INFRASTRUCTURE
How we implemented it
To create the basic infrastructure on AWS
To provision the Kubernetes cluster
2. DATA STREAMING
2. DATA STREAMING
How we made the choice
➡ Low cost
➡ Inbuilt notification feature
➡ Does not maintain a persistent checkpoint
➡ No infra maintenance

➡ To retain data for more than 24 hours, you need to pay additional cost
➡ Slow as compared to Kafka due to its replication factor over multiple zones
➡ High cost
➡ AWS specific

➡ Open source
➡ Fast, durable, scalable and very high throughput
➡ Lower cost
➡ Durable logs that allow us to replay messages
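To illustrate what "durable logs that allow us to replay messages" means, here is a toy in-memory model, not Kafka itself: records are only ever appended, each consumer tracks its own offset, and resetting that offset replays the same records. The class names and sample readings are invented for this sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AppendOnlyLog:
    """Toy stand-in for one partition of a log: records are appended, never removed on read."""

    records: List[str] = field(default_factory=list)

    def append(self, value: str) -> int:
        self.records.append(value)
        return len(self.records) - 1  # offset of the record just written

    def read_from(self, offset: int) -> List[str]:
        return self.records[offset:]


@dataclass
class Consumer:
    """Reads from the log and remembers how far it has got."""

    log: AppendOnlyLog
    offset: int = 0

    def poll(self) -> List[str]:
        batch = self.log.read_from(self.offset)
        self.offset += len(batch)
        return batch

    def seek(self, offset: int) -> None:
        # Moving the offset back is what makes replay possible:
        # reading does not delete anything from the log.
        self.offset = offset


log = AppendOnlyLog()
for reading in ["temp=71", "temp=73", "temp=70"]:  # hypothetical sensor readings
    log.append(reading)

consumer = Consumer(log)
first_pass = consumer.poll()
consumer.seek(0)           # rewind to the beginning
replayed = consumer.poll()
print(first_pass == replayed)
```

A queue that deletes a message once it is consumed cannot offer this; a retained log can, for as long as its retention settings keep the records.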
2. DATA STREAMING
How we implemented it
We used the Confluent Platform, which builds on the features of Kafka.
➡ Used official Docker images
➡ Deployed on Kubernetes
➡ Added a persistent volume
➡ Used Spark Streaming to manipulate the data
3. QUERYING SERVICE
3. QUERYING SERVICE
The various options we evaluated
3. QUERYING SERVICE
Why we made the choice
➡ Low infrastructure cost
➡ Built on the Presto SQL query engine
➡ You only pay for the queries you run
➡ No ETL needed; it queries structured data stored as objects on S3
➡ Supports CSV, Parquet, JSON and other structured file formats
[Diagram: Parquet files → Data queried]
3. QUERYING SERVICE
How we create tables
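The slide's own statement is not recoverable from the text, so the following is only a sketch of what an Athena table definition over partitioned Parquet files commonly looks like. The database, table, data columns and S3 location are hypothetical; the partition columns follow the Hive-style `country=…/plant=…/evs_start_date=…` layout used in this deck.

```python
from typing import Dict


def build_create_table_ddl(
    database: str,
    table: str,
    columns: Dict[str, str],
    partitions: Dict[str, str],
    location: str,
) -> str:
    """Build a CREATE EXTERNAL TABLE statement for Parquet data on S3."""
    column_sql = ",\n  ".join(f"`{name}` {sql_type}" for name, sql_type in columns.items())
    partition_sql = ", ".join(f"`{name}` {sql_type}" for name, sql_type in partitions.items())
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {database}.{table} (\n"
        f"  {column_sql}\n"
        f")\n"
        f"PARTITIONED BY ({partition_sql})\n"
        f"STORED AS PARQUET\n"
        f"LOCATION '{location}'"
    )


# All names below are placeholders for illustration.
ddl = build_create_table_ddl(
    database="factory_data",
    table="machine_events",
    columns={"machine_id": "string", "reading": "double"},
    partitions={"country": "string", "plant": "string", "evs_start_date": "string"},
    location="s3://example-bucket/machine-events/",
)
print(ddl)
```

Once such a table exists, partitions stored in the Hive-style layout still have to be registered, for example with `MSCK REPAIR TABLE`, before queries can see them.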
3. QUERYING SERVICE
How Athena query works
Partition layout on S3:
country=USA | country=China | country=India | country=Europe
    plant=Location 1 | plant=Location 2 | plant=Location 3 | ... | plant=Location n
        evs_start_date=2018-10-28 06:00 | evs_start_date=2018-10-29 06:00 | ... | evs_start_date=2018-11-30 06:00
            file1.parquet | file2.parquet | file3.parquet | ... | filen.parquet
Query:
SELECT * FROM <table> WHERE country='Europe' AND plant='Location 3' AND evs_start_date='2018-10-29 06:00'
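The reason this layout helps is that equality filters on partition columns let the engine skip every directory that cannot match. The sketch below is a simplified model of that pruning, done here with plain string prefixes over a hypothetical list of object keys; Athena itself resolves partitions through its table metadata rather than by string matching.

```python
from typing import Dict, List


def partition_prefix(filters: Dict[str, str], partition_order: List[str]) -> str:
    """Build the key prefix selected by equality filters on the leading partition columns."""
    parts = []
    for column in partition_order:
        if column not in filters:
            break  # a prefix can only cover an unbroken run of leading partition levels
        parts.append(f"{column}={filters[column]}")
    return "/".join(parts) + "/" if parts else ""


def prune(keys: List[str], prefix: str) -> List[str]:
    """Keep only the objects that live under the selected partition."""
    return [key for key in keys if key.startswith(prefix)]


# Hypothetical object keys following the layout described on the slide.
keys = [
    "country=USA/plant=Location 1/evs_start_date=2018-10-28 06:00/file1.parquet",
    "country=Europe/plant=Location 2/evs_start_date=2018-10-29 06:00/file2.parquet",
    "country=Europe/plant=Location 3/evs_start_date=2018-10-28 06:00/file3.parquet",
    "country=Europe/plant=Location 3/evs_start_date=2018-10-29 06:00/file4.parquet",
]

prefix = partition_prefix(
    {"country": "Europe", "plant": "Location 3", "evs_start_date": "2018-10-29 06:00"},
    ["country", "plant", "evs_start_date"],
)
selected = prune(keys, prefix)
print(prefix)
print(selected)
```

Only one of the four files needs to be read for this query; without the partition filters, all four would be scanned.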
3. QUERYING SERVICE
How we call Athena
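The code shown on this slide did not survive in the text. As a hedged sketch of one common way to call Athena from an application, the function below follows the start, poll, fetch sequence of the boto3 Athena client (`start_query_execution`, `get_query_execution`, `get_query_results`). The client is passed in rather than created here, so the function name, parameters and polling limits are assumptions of this example, not the authors' implementation.

```python
import time
from typing import Any, List


def run_athena_query(
    client: Any,
    sql: str,
    database: str,
    output_location: str,
    poll_seconds: float = 1.0,
    max_polls: int = 60,
) -> List[List[str]]:
    """Start a query, wait for it to finish, and return the first page of rows.

    `client` is expected to behave like boto3.client("athena").
    """
    started = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # Athena runs queries asynchronously, so the caller has to poll for completion.
    for _ in range(max_polls):
        execution = client.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state == "SUCCEEDED":
            break
        if state in ("FAILED", "CANCELLED"):
            raise RuntimeError(f"Athena query {query_id} ended in state {state}")
        time.sleep(poll_seconds)
    else:
        raise TimeoutError(f"Athena query {query_id} did not finish in time")

    # Only the first page is read here; for SELECT queries the first row holds the column names.
    result = client.get_query_results(QueryExecutionId=query_id)
    return [
        [cell.get("VarCharValue", "") for cell in row["Data"]]
        for row in result["ResultSet"]["Rows"]
    ]
```

With real credentials this would be called as `run_athena_query(boto3.client("athena"), sql, database, "s3://<results-bucket>/")`; larger result sets would need pagination.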
4. CONTINUOUS DEPLOYMENT
4. CONTINUOUS DEPLOYMENT
How they made the choice
Better on-premise support
Competitive enterprise plan pricing
Better starting cost for enterprises

4. CONTINUOUS DEPLOYMENT
How we implemented it
THE FINAL PICTURE
How it ended up looking
Data flow: Multiple data sources → Kafka → S3 (Avro files) → Structured Streaming → S3 (Parquet files) → Athena
Logging: log files, Logstash, Elasticsearch, Kibana
Delivery and runtime: CircleCI, Deployment, Application, Kubernetes Cluster

OUR LEARNINGS
OUR LEARNINGS
Limitations through our journey
Current architecture limitations:
➡ By default, Athena supports 20 DDL and 20 DML queries running at the same time
➡ Athena loads the data from scratch for each request instead of caching it
➡ The small size of the Parquet files made queries very slow, since Athena scans through all the files
AWS Athena is a good tool for data analysis, but it wasn’t suitable for our use case, where we wanted to handle many more concurrent user requests.
OUR LEARNINGS
Improvements through our journey
Source → Kafka → RDS → App
➡ Developed a custom Kafka producer and consumer
➡ Kafka producer processes the data before dumping it on RDS
➡ Reduced the amount of data stored on RDS
➡ Used indexing to make queries run faster
➡ Moved away from the concurrency issues
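To make the indexing point concrete, the sketch below uses SQLite from the Python standard library as a stand-in for RDS; the talk does not say which database engine was used. The table, columns and sample rows are hypothetical. It creates an index on the columns used for filtering and asks the database for its query plan, which should report a search using that index instead of a full table scan.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for an RDS instance.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings (plant TEXT, evs_start_date TEXT, machine_id TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?, ?)",
    [
        ("Location 1", "2018-10-28 06:00", "m-1", 0.91),
        ("Location 3", "2018-10-29 06:00", "m-7", 0.87),
        ("Location 3", "2018-10-29 06:00", "m-8", 0.93),
    ],
)

# Index the columns that the application filters on.
conn.execute("CREATE INDEX idx_readings_plant_date ON readings (plant, evs_start_date)")

query = "SELECT machine_id, value FROM readings WHERE plant = ? AND evs_start_date = ?"
params = ("Location 3", "2018-10-29 06:00")

# The last column of each plan row is the human-readable description of the step.
plan = conn.execute("EXPLAIN QUERY PLAN " + query, params).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)
rows = conn.execute(query, params).fetchall()

print(plan_text)
print(rows)
```

The same idea carries over to the engines RDS offers, although the syntax for inspecting a query plan differs between them.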
OUR LEARNINGS
Testing data sanity on pipelines
Validate the incoming data & check for:
➡ any holes / missing values
➡ quality of data for bad values
➡ faster notification if streaming and refresh jobs fail
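A data sanity test of this kind can be quite small. The following sketch checks a batch of records for missing values and for values outside an expected range; the field names, sample records and thresholds are invented for illustration, and the real checks would depend on the plant data.

```python
from typing import Any, Dict, Iterable, List, Tuple


def check_records(
    records: Iterable[Dict[str, Any]],
    required_fields: List[str],
    valid_ranges: Dict[str, Tuple[float, float]],
) -> List[str]:
    """Return a human-readable list of problems found in the records."""
    problems = []
    for index, record in enumerate(records):
        # Holes: a required field is absent, None, or an empty string.
        for name in required_fields:
            if record.get(name) is None or record.get(name) == "":
                problems.append(f"record {index}: missing value for '{name}'")
        # Bad values: a numeric field falls outside its expected range.
        for name, (low, high) in valid_ranges.items():
            value = record.get(name)
            if isinstance(value, (int, float)) and not (low <= value <= high):
                problems.append(f"record {index}: '{name}'={value} outside [{low}, {high}]")
    return problems


# Hypothetical sample batch.
records = [
    {"plant": "Location 1", "temperature": 71.5},
    {"plant": "Location 2", "temperature": None},
    {"plant": "Location 3", "temperature": 900.0},
]
problems = check_records(records, ["plant", "temperature"], {"temperature": (-40.0, 150.0)})
for problem in problems:
    print(problem)
```

In a pipeline, a non-empty `problems` list would fail the stage and could feed the failure notification.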

OUR LEARNINGS
Test coverage on pipeline
Build → Deploy to Test → Run UI Tests → Run API Smoke Tests → Deploy to QA → Run Data Sanity Tests → Deploy to Prod → Run Data Sanity Tests
Unit tests: Run Front-end Unit Tests, Run Backend Unit Tests
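The point of placing tests between deployments is that a failing stage blocks everything after it. Here is a minimal sketch of that gating logic, using stage names from the slide; the pass/fail results are simulated, and a real pipeline would be defined in the CI tool's own configuration, not in Python.

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first one that fails."""
    completed = []
    for name, step in stages:
        if not step():
            break  # later stages, including deployments, never run
        completed.append(name)
    return completed


# Simulated outcomes: the data sanity tests on QA fail, so Prod is never deployed.
stages: List[Stage] = [
    ("Build", lambda: True),
    ("Deploy to Test", lambda: True),
    ("Run UI Tests", lambda: True),
    ("Run API Smoke Tests", lambda: True),
    ("Deploy to QA", lambda: True),
    ("Run Data Sanity Tests", lambda: False),
    ("Deploy to Prod", lambda: True),
]

completed = run_pipeline(stages)
print(completed)
```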
THE FUTURE
THE FUTURE
What we plan to do next
CHAOS MONKEY
Test the ecosystem for resilience and its self-healing mechanism
INFRASTRUCTURE IMPROVEMENTS
Helm to manage Kubernetes applications
Moving towards Amazon EKS
NOTIFY FAILURES
Send appropriate alerts as well as logs to the concerned group
SUMAN KUMARI
sumankum@thoughtworks.com
WAMIKA SINGH
wamikas@thoughtworks.com
THANK YOU
Feel free to send us your feedback/questions

Recommended for you

Confluent kafka meetupseattle jan2017
Confluent kafka meetupseattle jan2017Confluent kafka meetupseattle jan2017
Confluent kafka meetupseattle jan2017

This document provides an overview of the Confluent streaming platform and Apache Kafka. It discusses how streaming platforms can be used to publish, subscribe and process streams of data in real-time. It also highlights challenges with traditional architectures and how the Confluent platform addresses them by allowing data to be ingested from many sources and processed using stream processing APIs. The document also summarizes key components of the Confluent platform like Kafka Connect for streaming data between systems, the Schema Registry for ensuring compatibility, and Control Center for monitoring the platform.

kafka seattle meetup
Application Modernisation with PKS
Application Modernisation with PKSApplication Modernisation with PKS
Application Modernisation with PKS

This document discusses strategies for modernizing applications and moving workloads to Kubernetes and container platforms like Pivotal Container Service (PKS). It recommends identifying candidate applications using buckets based on factors like programming language, dependencies, and access to source code. It outlines assessing applications' business value and technical quality using Gartner's TIME methodology to prioritize efforts. The document provides an overview of PKS and how it can provide benefits like increased speed, stability, scalability and cost savings. It recommends starting projects by pushing a few applications to production on PKS to measure ROI metrics.

securityapplicationplatform
Application Modernisation with PKS
Application Modernisation with PKSApplication Modernisation with PKS
Application Modernisation with PKS

This document discusses strategies for modernizing applications and moving workloads to Kubernetes and container platforms like Pivotal Container Service (PKS). It recommends identifying candidate applications using buckets based on factors like programming language, dependencies, and access to source code. It outlines assessing applications' business value and technical quality using Gartner's TIME methodology to prioritize efforts. The document provides an overview of PKS and how it can provide benefits like increased speed, security, scalability and cost savings. It recommends starting projects by pushing a few applications to production on PKS to measure ROI metrics.

software developmentapplicationsecurity

Let's decipher the DevOps macedonia

  • 1. LET'S DECIPHER THE DEVOPS MACEDONIA -Suman Kumari & Wamika Singh from Milan | November 29 - 30, 2018
  • 2. INTRODUCTION A quick word about us SUMAN KUMARI Application Developer WAMIKA SINGH Quality Analyst at
  • 3. THE TOPIC: What are we presenting. Athena, S3
  • 5. OUR PROBLEM STATEMENT: What our client's business looks like. A leading manufacturing company; headquarters in Italy; close to 20 plants globally; each factory with its own physical data storage; streaming large data; digital transformation; intent to get better insights on the process using data
  • 6. OUR PROBLEM STATEMENT: What constraints did we have. Extensibility and availability of products in 20 plants; growing number of products and data; keeping a low operational and maintenance cost; staying cloud agnostic; building a proof of concept before an org-wide rollout
  • 7. OUR PROBLEM STATEMENT: What technology challenges were we dealing with (Factory, HQ). 1. Application infrastructure; 2. Data streaming; 3. Querying service; 4. Continuous deployment
  • 9. 1. APPLICATION INFRASTRUCTURE: How we made the choice. CONTAINERS: cost, portable, consistent, isolation
  • 10. 1. APPLICATION INFRASTRUCTURE: How we made the choice. ✓ Open source ✓ Provides primitives for modern applications ✓ Auto scaling of services ✓ Automated rollouts and rollbacks ✓ Auto discovery ✓ Self-healing ✓ Zero downtime ✓ Cloud-agnostic deployment
  • 11. 1. APPLICATION INFRASTRUCTURE: How we implemented it. To create the basic infrastructure on AWS; to provision the Kubernetes cluster
  • 12. 1. APPLICATION INFRASTRUCTURE: How we implemented it
  • 14. 2. DATA STREAMING: How we made the choice
 ➡ Low cost
 ➡ Inbuilt notification feature
 ➡ Does not maintain a persistent checkpoint
 ➡ No infra maintenance
 ➡ Retaining data for more than 24 hours costs extra
 ➡ Slower than Kafka because of its replication across multiple zones
 ➡ High cost
 ➡ AWS specific
 ➡ Open source
 ➡ Fast, durable, scalable, and very high throughput
 ➡ Lower cost
 ➡ Durable logs that allow us to replay messages
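The "durable logs that allow us to replay messages" point can be illustrated with a toy append-only log. This is a minimal sketch of the idea only, not Kafka's API: the class `ReplayableLog` and the sample readings are hypothetical. Records stay in the log after they are read, so a consumer can rewind to an earlier offset and read them again.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ReplayableLog:
    """Toy append-only log (hypothetical): reading never removes records."""
    records: List[str] = field(default_factory=list)

    def append(self, message: str) -> int:
        """Store a message and return the offset it was written at."""
        self.records.append(message)
        return len(self.records) - 1

    def read_from(self, offset: int) -> List[Tuple[int, str]]:
        """Return (offset, message) pairs starting at `offset`."""
        if offset < 0:
            raise ValueError("offset must be non-negative")
        return list(enumerate(self.records[offset:], start=offset))


log = ReplayableLog()
for reading in ["temp=71", "temp=72", "temp=75"]:  # made-up sensor readings
    log.append(reading)

first_pass = log.read_from(0)  # a consumer reads everything once
replayed = log.read_from(1)    # later it rewinds to offset 1 and replays
```

A queue that deletes messages on delivery cannot offer the second call; that difference is what the replay bullet refers to.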
  • 15. 2. DATA STREAMING How we implemented it: We used the Confluent Platform, which leverages the features of Kafka.
 ➡ Used official Docker images
 ➡ Deployed on Kubernetes
 ➡ Added a persistent volume
 ➡ Used Spark Streaming to manipulate the data
  • 16. 2. DATA STREAMING How we implemented it
  • 18. 3. QUERYING SERVICE The various options we evaluated
  • 19. 3. QUERYING SERVICE Why we made the choice ➡ Low infrastructure cost ➡ Built on the Presto SQL query engine ➡ You only pay for the queries you run ➡ No ETL needed; it queries structured data stored as objects on S3 ➡ Supports CSV, Parquet, JSON and other structured file formats (Parquet files; data queried)
  • 20. 3. QUERYING SERVICE How we create tables
  • 21. 3. QUERYING SERVICE How an Athena query works. Partition folders: country=USA, country=China, country=India, country=Europe; plant=Location 1, plant=Location 2, plant=Location 3, ... plant=Location n; evs_start_date=2018-10-28 06:00, evs_start_date=2018-10-29 06:00, ... evs_start_date=2018-11-30 06:00. Files: file1.parquet, file2.parquet, file3.parquet, ... filen.parquet. Query: SELECT * FROM <table> WHERE country='Europe' AND plant='Location 3' AND evs_start_date='2018-10-29 06:00'
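The partition layout on this slide can be illustrated with a small Python sketch. It is not the presenters' code and it does not call Athena; it only mimics the idea that a filter on partition columns narrows the set of Parquet objects that need to be read. The bucket name, base path, and function names are hypothetical, and real writers may escape some characters (such as the colon) in partition folder names.

```python
from typing import Dict, List

# Partition columns in folder order, as shown on the slide.
PARTITION_COLUMNS = ["country", "plant", "evs_start_date"]


def partition_prefix(base: str, values: Dict[str, str]) -> str:
    """Build the Hive-style key prefix (col=value/...) for one partition."""
    parts = [f"{col}={values[col]}" for col in PARTITION_COLUMNS]
    return "/".join([base.rstrip("/")] + parts) + "/"


def prune(keys: List[str], filters: Dict[str, str]) -> List[str]:
    """Keep only the object keys whose path segments match every filter."""
    wanted = {f"{col}={val}" for col, val in filters.items()}
    return [key for key in keys if wanted.issubset(key.split("/"))]


# Hypothetical base location; the slide does not name a bucket.
BASE = "s3://example-bucket/events"

EXAMPLE_KEYS = [
    partition_prefix(BASE, {"country": "USA", "plant": "Location 1",
                            "evs_start_date": "2018-10-28 06:00"}) + "file1.parquet",
    partition_prefix(BASE, {"country": "Europe", "plant": "Location 3",
                            "evs_start_date": "2018-10-29 06:00"}) + "file2.parquet",
    partition_prefix(BASE, {"country": "Europe", "plant": "Location 3",
                            "evs_start_date": "2018-11-30 06:00"}) + "file3.parquet",
]

if __name__ == "__main__":
    # Mirrors the WHERE clause on the slide: only one object remains.
    selected = prune(EXAMPLE_KEYS, {"country": "Europe",
                                    "plant": "Location 3",
                                    "evs_start_date": "2018-10-29 06:00"})
    for key in selected:
        print(key)
```

Because the filter values must match the folder names exactly, the values in the WHERE clause need the same spelling and spacing as the partition folders.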
  • 22. 3. QUERYING SERVICE How we call Athena
  • 24. 4. CONTINUOUS DEPLOYMENT How they made the choice: better on-premise support; competitive enterprise plan pricing; better starting cost for enterprises
  • 25. 4. CONTINUOUS DEPLOYMENT How we implemented it
  • 26. 4. CONTINUOUS DEPLOYMENT How we implemented it
  • 27. 4. CONTINUOUS DEPLOYMENT How we implemented it
  • 28. THE FINAL PICTURE What it ended up looking like. Data flow (steps 1 to 5): Multiple Data Sources, Kafka, S3 (Avro files), Structured Streaming, S3 (Parquet files), Athena. Logging: Logstash, Log Files, Elasticsearch, Kibana. Deployment: CircleCI. Application on a Kubernetes Cluster.
  • 30. OUR LEARNINGS Limitations through our journey. Current architecture limitations: ➡ By default, Athena supports 20 DDL and 20 DML queries at the same time ➡ Athena loads the data from scratch for each request instead of caching ➡ The small size of the Parquet files made queries very slow, since Athena scans through all the files. AWS Athena is a good tool for data analysis, but it wasn’t suitable for our use case, where we wanted to handle many more concurrent user requests.
  • 31. OUR LEARNINGS Improvements through our journey (Source, Kafka, RDS, App) ➡ Developed a custom Kafka producer and consumer ➡ The Kafka producer processes the data before writing it to RDS ➡ Reduced the amount of data stored on RDS ➡ Used indexing to make queries run faster ➡ Moved away from the concurrency issues
  • 32. OUR LEARNINGS Testing data sanity on pipelines. Validate the incoming data and check for: ➡ any holes or missing values ➡ quality of data (bad values) ➡ faster notification if streaming and refresh jobs fail
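A minimal sketch of what such a data sanity check could look like in Python is shown here. It is an illustration only, not the tests the presenters ran: the field names (`plant`, `temperature`), the valid range, and the `find_issues` helper are all hypothetical.

```python
from typing import Any, Dict, List, Tuple

Record = Dict[str, Any]


def find_issues(
    records: List[Record],
    required: List[str],
    valid_ranges: Dict[str, Tuple[float, float]],
) -> List[str]:
    """Return one message per missing or out-of-range value."""
    issues: List[str] = []
    for index, record in enumerate(records):
        # Holes: a required field is absent or explicitly None.
        for field in required:
            if record.get(field) is None:
                issues.append(f"record {index}: missing {field}")
        # Bad values: a numeric field falls outside its expected range.
        for field, (low, high) in valid_ranges.items():
            value = record.get(field)
            if value is not None and not (low <= value <= high):
                issues.append(f"record {index}: bad {field}={value}")
    return issues


if __name__ == "__main__":
    # Hypothetical sample batch: one hole and one implausible reading.
    batch = [
        {"plant": "Location 1", "temperature": 21.5},
        {"plant": "Location 2", "temperature": None},
        {"plant": "Location 3", "temperature": 900.0},
    ]
    for issue in find_issues(batch, ["plant", "temperature"],
                             {"temperature": (-40.0, 150.0)}):
        print(issue)
```

A check like this could run as a pipeline stage after each deployment, failing the stage (and so triggering a notification) when the list of issues is not empty.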
  • 33. OUR LEARNINGS Test coverage on the pipeline: Build; Deploy to Test; Run UI Tests; Run API Smoke Tests; Deploy to QA; Run Data Sanity Tests; Deploy to Prod; Run Data Sanity Tests; Run Front-end Unit Tests; Run Backend Unit Tests
  • 35. THE FUTURE What we plan to do next. CHAOS MONKEY: test the ecosystem for resilience and its self-healing mechanism. INFRASTRUCTURE IMPROVEMENTS: Helm to manage Kubernetes applications; moving towards Amazon EKS.
 NOTIFY FAILURES: send appropriate alerts as well as logs to the concerned group.