© Cloudera, Inc. All rights reserved.
Apache Hadoop YARN Containerized Services:
Fading The Lines Between On-Prem And Cloud
Billie Rinaldi
AGENDA
Emergence of Containers
Journey to a Container Cloud
Building Blocks of a Container Cloud
YARN Service APIs
YARN Service Examples
Enabling Hybrid Deployments
CONTAINERIZATION IS GAINING MOMENTUM
• Industry adoption continues
• “Number of containerized applications will rise by 80% in the next two years” [1]
• Multi-cloud and hybrid strategies
• Adoption of microservices
• Exponential ecosystem growth
• Dozens of container orchestrators
• Thousands of plugins
• Market moves
1. http://i.dell.com/sites/doccontent/business/solutions/whitepapers/en/Documents/Containers_Real_Adoption_2017_Dell_EMC_Forrester_Paper.pdf
WHY ARE CONTAINERS GAINING POPULARITY?
• Improved hardware utilization through increased density
• No virtual machine operating system overhead
• Image layer reuse limits data duplication on disk
• Strong resource isolation
• Namespaces and cgroups
• Better software packaging
• Package applications and dependencies together
• Improved reuse vs VM images
• Distribution mechanism
• Improved developer self service
• More control over the execution environment
• Promise of portability
• On-premises and across multiple clouds
CONTAINER ARCHITECTURE PATTERNS
• Mix of services
• Long lived services and ephemeral/batch jobs
• Decoupled compute and storage
• Scale independently
• Hybrid deployments
• Desire for consistency between cloud and on-premises
ON PREM VS. CLOUD: VERY DIFFERENT MODELS
Cloud
• Multiple clusters
• Decoupled compute and storage
• Infrastructure as a Service
• Improved agility and self-service
On Prem
• Large, multi-tenant clusters
• Co-located compute and storage
• Shared security and governance
• Less agile due to physical hardware
[Figure: on-prem Data Center (Compute, Storage, Security & Governance; EDW, Stream Processing, Data Science, Operations) vs. Public Cloud (Compute; multiple Data Science, Stream Processing, and EDW clusters; Security, Governance, Operations; Public Cloud Storage)]
WHAT IS NEEDED TO BRIDGE THE GAP?
Across clusters
• Consistent deployment, security, and governance
Within clusters
• Decoupled compute and storage
• Eliminate physical hardware as a barrier to agility
How does Apache Hadoop YARN help enable portability?
JOURNEY TO A CONTAINER CLOUD
• Started off with on-prem hardware
• Quickly exceeded capacity, moved to public cloud
• Costs were higher than we wanted
• Bigger concern was the rate of expense growth
• Then back to on-prem
• VM-based infrastructure
• CloudStack followed by OpenStack
• Challenges before container cloud
• Low density
• Significant overhead per test
• Many images with minimal differences, limited composition
• More and more tests and products onboarding
• The existing environment could no longer keep up with testing demands
ASSESSING THE CHALLENGES
• How is the industry addressing these same challenges?
• Can we leverage our existing investment in hardware?
• How to reduce overhead, improve density and hardware utilization?
• What about improving reuse of packaging and automation?
SOLUTION: ON-PREM CONTAINER CLOUD BUILT ON YARN
• Containers (think Docker)
• Containers eliminate the bulk of the virtualization overhead
• Containers help improve reuse of images through composition
• Container startup time is fast, no real boot sequence
• Apache Hadoop YARN
• Good technical fit
• Good strategic fit
WHY YARN?
• YARN is Apache Hadoop’s resource management framework
• At its core, YARN is responsible for orchestrating “containers” across a collection of servers
• What is a YARN container?
• Linux process
• Local resources (scripts, jars, security tokens)
• Resource constraints (CPU, memory, I/O)
• Aligns well with container technologies such as Docker
Container Model
WHY YARN?
• YARN is widely deployed
• YARN is a superior scheduler, hardened by customer feedback
• Leverage our existing expertise
• “Use what we ship and ship what we use”
• No big leap to containerization
• Existing “Hadoop native” frameworks run unchanged on the same infrastructure
Strategic Advantages
DOGFOODING: CONTAINER CLOUD FOR RELEASE TESTING
Testing HDP and HDF releases in container clusters (soon CDH)
Shared Services
• Resource Management (YARN)
• Management and Monitoring (Ambari)
• Storage (HDFS)
• Service Discovery and REST API (YARN Services)
• Security and Governance (Ranger and Atlas)
[Figure: Jenkins, Worker (Docker) containers, and HDP (Docker) clusters, connected by SubmitTest and LaunchTest steps]
BUILDING BLOCKS FOR A CONTAINER CLOUD ON YARN
• YARN Container Runtimes – Enable support for Docker containers, making it easier to onboard new applications and services on YARN.
• YARN Services Framework – Provides an ApplicationMaster (AM) implementation, a REST API, and various improvements to enable long-running services on YARN.
• YARN Service Discovery – Allows services running on YARN to discover one another.
NEW ABSTRACTION: YARN CONTAINER RUNTIMES
Choose the Container Runtime at app submission time!
• DefaultLinuxContainerRuntime: existing Linux process based execution
• DockerLinuxContainerRuntime: uses Docker to run and monitor the containers
DISTRIBUTED SHELL AND MAPREDUCE EXAMPLES
Only difference is setting environment variables!
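The screenshots for this slide did not survive extraction. As a rough sketch of what "only difference is setting environment variables" can look like, the Python below builds the command lines rather than running them. It assumes the runtime is selected through the `YARN_CONTAINER_RUNTIME_TYPE` and `YARN_CONTAINER_RUNTIME_DOCKER_IMAGE` variables; the jar path, the image name, and the helper function names are placeholders for illustration, not values from the talk.

```python
import shlex

# Environment variables read by YARN's Docker container runtime.
# The image name is a placeholder, not one taken from the slides.
DOCKER_ENV = {
    "YARN_CONTAINER_RUNTIME_TYPE": "docker",
    "YARN_CONTAINER_RUNTIME_DOCKER_IMAGE": "library/centos:7",
}


def distributed_shell_args(jar, shell_command, env=None):
    """Build a distributed shell command line; env is the only Docker-specific part."""
    args = ["yarn", "jar", jar, "-jar", jar, "-shell_command", shell_command]
    for key, value in (env or {}).items():
        args += ["-shell_env", f"{key}={value}"]
    return args


def mapreduce_env_options(env):
    """Build -D options that pass the same variables to the MR AM, map, and reduce tasks."""
    joined = ",".join(f"{key}={value}" for key, value in env.items())
    props = ["yarn.app.mapreduce.am.env", "mapreduce.map.env", "mapreduce.reduce.env"]
    return [f"-D{prop}={joined}" for prop in props]


JAR = "hadoop-yarn-applications-distributedshell.jar"  # hypothetical local path
plain = distributed_shell_args(JAR, "hostname")
docker = distributed_shell_args(JAR, "hostname", DOCKER_ENV)
print(shlex.join(plain))
print(shlex.join(docker))
print(" ".join(mapreduce_env_options(DOCKER_ENV)))
```

The Docker variant is the plain command plus two `-shell_env` arguments, which is the point the slide makes.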
DOCKER CONTAINER SUPPORT EVOLVING
• Recent Efforts
• Container Security
• ACLs for privileged containers
• Improved out-of-the-box security for untrusted images
• Entrypoint support (fixes for systemd as PID 1)
• Exec to container support
• Ongoing Efforts
• Improving image management and lifecycle (YARN-9228)
• runc/squashfs (YARN-9014)
• CSI support (YARN-8811)
YARN SERVICES FRAMEWORK OVERVIEW
• Long Running
• Simplify the deployment and management of long running apps on YARN
• Easy Onboarding
• Remove tedious process of bringing new services to YARN
• Declarative Configuration
• JSON specification describing the desired state for the service to be managed
• Standard Interfaces
• REST API that lives in the Resource Manager, CLI tools for clients
DEFINING SERVICES THROUGH THE JSON SPEC
$ curl -H "Content-Type: application/json" -X POST http://RM_HOST:8088/app/v1/services -d @sleeper.json
• This spec creates two component instances, sleeper-0 and sleeper-1
• Optional features include readiness checks, placement policies, and creating / mounting resources such as config files
$ yarn app -launch serviceName sleeper.json
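The JSON shown on this slide was an image and is not in the transcript. The sketch below reconstructs a plausible `sleeper.json` in Python, assuming the YARN service spec fields `name`, `version`, `components`, `number_of_containers`, `launch_command`, and `resource`. The concrete values (service name, version, sleep duration, CPU and memory sizes) are illustrative assumptions, and `instance_names` is a hypothetical helper.

```python
import json

# Field names follow the YARN service spec; the values are illustrative
# assumptions, since the slide's own JSON was not captured.
sleeper_spec = {
    "name": "sleeper-service",
    "version": "1.0.0",
    "components": [
        {
            "name": "sleeper",
            "number_of_containers": 2,
            "launch_command": "sleep 900000",
            "resource": {"cpus": 1, "memory": "256"},
        }
    ],
}


def instance_names(spec):
    """List the component instance names the spec should produce (name-0, name-1, ...)."""
    return [
        f"{component['name']}-{index}"
        for component in spec["components"]
        for index in range(component["number_of_containers"])
    ]


# Saving this text as sleeper.json gives the file the commands above refer to.
sleeper_json = json.dumps(sleeper_spec, indent=2)
print(sleeper_json)
print(instance_names(sleeper_spec))
```

With `number_of_containers` set to 2, the helper lists the two instances named on the slide, sleeper-0 and sleeper-1.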
SIMPLIFIED SERVICE DISCOVERY VIA DNS
Existing YARN Service Registry
• Allows apps to register themselves
• Stores entries in Apache ZooKeeper
• Provides native Java, REST, and CLI clients to enable service discovery
YARN Registry DNS Server
• Watches the YARN Service Registry (ZK) for new application and container records
• Creates user-friendly DNS records based on those records
• Supports zone transfers, zone forwarding, upstream querying, and DNSSEC
Examples:
componentInstanceName.serviceName.user.domain
sleeper-0.sleeper-service.billie.domain
ctr-e138-1518143905142-215498-01-000007.domain
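The naming patterns in the examples above can be sketched as two small helpers. This is an illustration only: the function names are hypothetical, and the mapping from a YARN container ID to a `ctr-` hostname is an assumption inferred from the example record rather than a statement of how the Registry DNS server is implemented.

```python
def instance_hostname(instance, service, user, domain):
    """Compose componentInstanceName.serviceName.user.domain."""
    return ".".join([instance, service, user, domain])


def container_hostname(container_id, domain):
    """Map a container ID to a DNS-safe name (assumed rule: 'ctr' prefix, hyphens)."""
    label = container_id.replace("container_", "ctr-", 1).replace("_", "-")
    return f"{label}.{domain}"


print(instance_hostname("sleeper-0", "sleeper-service", "billie", "domain"))
print(container_hostname("container_e138_1518143905142_215498_01_000007", "domain"))
```

A client in another container could then resolve the first name with an ordinary DNS lookup, provided its resolver points at the Registry DNS server.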
YARN SERVICE REST API
Create a service
POST URL - http://RM_HOST:8088/app/v1/services
Get service status
GET URL - http://RM_HOST:8088/app/v1/services/tensorflow
Update service
PUT URL - http://RM_HOST:8088/app/v1/services/tensorflow
• Extend lifetime
• STOP service
• START service
• Flex UP/DOWN the # of containers of one or more components
• DELETE (destroy) service
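The operations above can be sketched as request objects built with Python's standard library; nothing is sent. The request bodies (`state`, `lifetime`, `number_of_containers`), the per-component URL used for flexing, and the use of the HTTP DELETE method for destroy are assumptions based on the YARN service API as commonly documented, and the component name `worker` is hypothetical. `RM_HOST` is the same placeholder the slide uses.

```python
import json
import urllib.request

BASE = "http://RM_HOST:8088/app/v1/services"  # RM_HOST is a placeholder


def service_request(method, service=None, body=None, component=None):
    """Build (but do not send) a request against the services endpoint."""
    url = BASE
    if service:
        url += f"/{service}"
    if component:
        url += f"/components/{component}"
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data else {}
    return urllib.request.Request(url, data=data, headers=headers, method=method)


status = service_request("GET", "tensorflow")
stop = service_request("PUT", "tensorflow", {"state": "STOPPED"})
start = service_request("PUT", "tensorflow", {"state": "STARTED"})
extend = service_request("PUT", "tensorflow", {"lifetime": 3600})
flex = service_request("PUT", "tensorflow", {"number_of_containers": 3}, component="worker")
destroy = service_request("DELETE", "tensorflow")

for request in (status, stop, start, extend, flex, destroy):
    print(request.get_method(), request.full_url, request.data)
```

Passing any of these objects to `urllib.request.urlopen` would issue the call against a real ResourceManager; a secured cluster would additionally need authentication, which this sketch leaves out.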
YARN APP CLI
Usage: yarn app
-launch serviceName jsonfile
-flex serviceName -component componentName count
-save serviceName jsonfile
-start serviceName
-status serviceName
-stop serviceName
-destroy serviceName
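For scripted use, the CLI usage above can be wrapped in a thin helper that only assembles argument lists. This is a sketch under the assumption that the flags are exactly those listed on the slide; the helper names, the service name, and the component name are hypothetical.

```python
import shlex


def yarn_app(action, service, *extra):
    """Return the argv for `yarn app -<action> <service> [extra...]`."""
    return ["yarn", "app", f"-{action}", service, *map(str, extra)]


def flex(service, component, count):
    """Return the argv that changes the container count of one component."""
    return yarn_app("flex", service, "-component", component, count)


commands = [
    yarn_app("launch", "sleeper-service", "sleeper.json"),
    yarn_app("status", "sleeper-service"),
    flex("sleeper-service", "sleeper", 3),
    yarn_app("stop", "sleeper-service"),
    yarn_app("destroy", "sleeper-service"),
]
for command in commands:
    print(shlex.join(command))
```

On a host with a configured Hadoop client, each list could be passed to `subprocess.run(command, check=True)`.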
DOCKER EXAMPLE
To convert the sleeper example into a Docker example, add an artifact:
"artifact": {
  "id": "library/centos:7",
  "type": "DOCKER"
}
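As a sketch of where the artifact goes, the Python below adds it to every component of a sleeper-style spec. It assumes the artifact is set per component; the spec values other than the artifact itself are illustrative, and `with_docker_artifact` is a hypothetical helper.

```python
import copy
import json

# Illustrative sleeper-style spec; only the artifact block comes from the slide.
process_spec = {
    "name": "sleeper-service",
    "version": "1.0.0",
    "components": [
        {
            "name": "sleeper",
            "number_of_containers": 2,
            "launch_command": "sleep 900000",
            "resource": {"cpus": 1, "memory": "256"},
        }
    ],
}


def with_docker_artifact(spec, image):
    """Return a copy of the spec whose components run from the given Docker image."""
    dockerized = copy.deepcopy(spec)
    for component in dockerized["components"]:
        component["artifact"] = {"id": image, "type": "DOCKER"}
    return dockerized


docker_spec = with_docker_artifact(process_spec, "library/centos:7")
print(json.dumps(docker_spec, indent=2))
```

Everything else in the spec stays the same, which is why the conversion is a small change.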
APACHE HBASE TARBALL EXAMPLE
HBase tarball service
• TARBALL artifact type
• ENV variables
• Config files
APACHE HBASE DOCKER EXAMPLE
• Replace TARBALL artifact with DOCKER artifact
• Remove unneeded env vars and add Docker mounts
• Optionally use absolute paths for generated config files
• Remove unneeded config files that already exist in the image
• Adjust launch command based on location in image
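The conversion steps above can be sketched as a transformation on a single component definition. This is an illustration only: the HBase spec from the slide was not captured, so the image name, paths, environment variables, and file names below are hypothetical. It assumes the spec fields `artifact`, `launch_command`, and `configuration` (with `env` and `files`), and that mounts are passed through the `YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS` environment variable.

```python
import copy


def tarball_to_docker(component, image, launch_command, mounts,
                      drop_env=(), drop_files=()):
    """Return a copy of a TARBALL component rewritten to use a Docker image."""
    converted = copy.deepcopy(component)
    # 1. Replace the TARBALL artifact with a DOCKER artifact.
    converted["artifact"] = {"id": image, "type": "DOCKER"}
    config = converted.setdefault("configuration", {})
    env = config.setdefault("env", {})
    # 2. Remove env vars the image already sets, then add the Docker mounts.
    for name in drop_env:
        env.pop(name, None)
    env["YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS"] = ",".join(mounts)
    # 3. Drop generated config files that already exist in the image.
    config["files"] = [
        entry for entry in config.get("files", [])
        if entry["dest_file"] not in drop_files
    ]
    # 4. Point the launch command at the install location inside the image.
    converted["launch_command"] = launch_command
    return converted


# Hypothetical tarball-based component, for illustration only.
tarball_component = {
    "name": "hbasemaster",
    "artifact": {"id": "hbase.tar.gz", "type": "TARBALL"},
    "launch_command": "$HBASE_HOME/bin/hbase master start",
    "configuration": {
        "env": {"HBASE_LOG_PREFIX": "hbase-master", "JAVA_HOME": "/usr/lib/jvm/java"},
        "files": [
            {"type": "XML", "dest_file": "hbase-site.xml"},
            {"type": "PROPERTIES", "dest_file": "log4j.properties"},
        ],
    },
}

docker_component = tarball_to_docker(
    tarball_component,
    image="example/hbase:latest",
    launch_command="/opt/hbase/bin/hbase master start",
    mounts=["/etc/hadoop/conf:/etc/hadoop/conf:ro"],
    drop_env=["JAVA_HOME"],
    drop_files=["log4j.properties"],
)
print(docker_component["artifact"])
print(docker_component["configuration"]["env"])
```

The optional step of switching generated config files to absolute paths is not shown; it would amount to rewriting each remaining `dest_file` value.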
REALITY: MULTI-CLOUD AND ON-PREM
[Figure: deployment locations, including Canada East (GCP)]
ON PREM VS. CLOUD: BRIDGING THE GAP
Cloud
• Shared Sec/Gov Services, Multi-Cluster, Multi-Cloud
On Prem
• Shared Sec/Gov Services, Multi-Cluster, Containerized
[Figure: Public Cloud (Compute; multiple Data Science, Stream Processing, and EDW clusters; Security, Governance, Operations; Public Cloud Storage) alongside an Apache Hadoop YARN Container Cloud (multiple Data Science, Stream Processing, and EDW clusters; Security, Governance, Operations; Data Center Storage)]
CLOUDERA DATA PLATFORM
• Public, private & hybrid cloud
• Shared data experience
• Powered by open source
• Analytics from the Edge to AI
• Unified data control plane
[Figure: Cloudera Data Platform layers]
• Infrastructure: Private Cloud, Hybrid Cloud, Public Multi-Cloud, Edge
• Data management (DSX): Catalog | Schema | Migration | Security | Governance
• Analytic experiences: Data Flow & Streaming, Data Engineering, Data Warehouse, Operational Database, Machine Learning
• Unified control plane (Altus DataPlane): Identity | Orchestration | Management | Operations
THANK YOU
