Kanister: Application-Level
Data Operations on Kubernetes
© 2021 Kasten by Veeam. Confidential information. All rights reserved. All trademarks are the property of their respective owners.
Abstract
For stateful, cloud-native applications, data operations must often be performed by tools with
semantic understanding of the data. The volume-level primitives provided by orchestrators are
not sufficient to support data workflows like backup/recovery of complex, distributed databases.
To bridge this gap between operational requirements for these applications and Kubernetes, the
open source project Kanister was created. Kanister is a framework to support application-level
data management in Kubernetes. It lets developers define relationships between tools and
applications, and then makes running those tools in Kubernetes simple. Kanister is managed
through Kubernetes custom resources (API objects defined by CustomResourceDefinitions), and all interactions with
Kanister take place through Kubernetes tools and APIs. In short, Kanister allows administrators
and automation to perform data operations at the Kubernetes level regardless of the complexity
of the application.
In this live webinar, Kanister contributors present how it is used and demo protection
operations on a live MongoDB cluster. This webinar is aimed at developers and ops
teams interested in stateful applications in Kubernetes. See kanister.io.
Introductions
The Challenge: Layers of Operation
• Day 2 services have become a here-and-now problem
• Data management functions such as backup and disaster recovery, along with
application mobility
• Cloud-native applications and microservices use multiple data services (MongoDB,
Redis, Kafka, etc.) and storage technologies to store state, and are typically
deployed in multiple locations (regions, clouds, on-premises)
Physical Storage
File, Block & Object
Data Services
Stateful Application
The Challenge: Flavours of Data Management
• What level of protection is enough?
• Backup & Restore
• Application Mobility
• Disaster Recovery
• Compliance Requirements
• Freedom of choice
Storage-Centric Snapshots
• Physical Storage Layer exercised
• Crash-Consistent
• Dependent on error handling capability
Fastest option for Backup & Recovery *
*BUT… Same storage system as
production / transaction-level granularity
Storage-Centric with Data Service Hooks
Same as before but now with hooks into
your Data services
1. Freeze & Flush the Data Services layer
2. Initiate a Storage-layer snapshot
3. Unfreeze the Data Services layer
4. Record completion and status of the
Snapshot process
Fast option for Backup & Recovery *
*BUT… Same storage system as production
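The four numbered steps above can be sketched in Python. This is a minimal illustration, not code from Kanister or any storage vendor: `quiesced`, `hooked_snapshot`, and the three callables passed in are hypothetical names standing in for whatever freeze, snapshot, and unfreeze commands a given data service and storage layer provide. The point it shows is that the unfreeze step has to run even when the snapshot fails.

```python
from contextlib import contextmanager
from typing import Callable, Dict, Iterator


@contextmanager
def quiesced(freeze: Callable[[], None], unfreeze: Callable[[], None]) -> Iterator[None]:
    """Freeze the data service (step 1) and always unfreeze it afterwards (step 3)."""
    freeze()
    try:
        yield
    finally:
        unfreeze()


def hooked_snapshot(
    freeze: Callable[[], None],
    snapshot: Callable[[], str],
    unfreeze: Callable[[], None],
) -> Dict[str, str]:
    """Run the freeze/snapshot/unfreeze sequence and return a status record (step 4)."""
    record = {"state": "failed", "snapshot_id": ""}
    try:
        with quiesced(freeze, unfreeze):
            record["snapshot_id"] = snapshot()  # step 2: storage-layer snapshot
        record["state"] = "complete"
    except Exception as err:
        record["error"] = str(err)
    return record


# Usage with stand-in hooks that only log what they would do.
events = []
result = hooked_snapshot(
    freeze=lambda: events.append("freeze"),
    snapshot=lambda: events.append("snapshot") or "snap-001",
    unfreeze=lambda: events.append("unfreeze"),
)
print(events, result)
```

Keeping the unfreeze in a `finally` block means a failed snapshot cannot leave the data service frozen, which matters because the freeze blocks writes on the production system.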
Data Service Centric
• Storage-efficient: database-aware, with database-specific compression
• No dependency on underlying storage
• Recovery can be complicated
Good option for backup and consistency. Recovery is complex, but data is now
moved out of band.
Application-Centric
• Focus is on business continuity of an
application
• Backup can be easy but restore also needs
to be easy.
• Higher level of consistency and flexibility
• Freedom of choice
Data Management Summary
• Storage-centric snapshots provided by the underlying file or block
storage,
• Storage-centric snapshots with data service hooks that span the storage and
data services layers,
• Data service-centric approaches that use database-specific utilities, and
finally
• Application-centric approaches that exercise all of the above capabilities in a
coordinated manner.
Need for Application-Consistent Data Management
• Shortcomings of Crash-Consistent Storage Snapshots
• Logical Backups
• mysqldump, pg_dump, mongodump etc.
• Provider Specific API Calls
• RDS Snapshot APIs, Operator CRs
• Application Quiesce/Unquiesce
• e.g. FLUSH TABLES WITH READ LOCK
• Advanced Scenarios
• Backup MongoDB Secondaries
Data Protection Workflows are Complex
• One application includes many domains
• Cluster Admins != App developers != Database Administrators
• Difficult to separate concerns
• Many moving parts
• Targets: Object Storage, Vendor Targets
• Types of Backups: logical dumps, volume snapshots
• Application lifecycle: Scale down/up when restoring
Kanister allows domain experts to capture application-specific data
management tasks in blueprints that can be easily shared and extended.
Kanister Framework for App-level Data Management
• Kanister Controller
• Operator responsible for Kubernetes Custom Resources and state
management
• Blueprints
• Define workflows for backup, restore and delete operations
• ActionSets
• Run a Blueprint action (backup, restore or delete) against a specific resource
• Profiles
• Define target destination for backups or sources for restores
Kanister CLI Tools
• kanctl
• CLI to create Kanister Profile CRs and ActionSets
• kando
• CLI used within containers to push and pull backup data to and from an
object store location
Example Blueprint
apiVersion: cr.kanister.io/v1alpha1
kind: Blueprint
metadata:
  name: mongodb-blueprint
actions:
  backup:
    # Store backup information
    outputArtifacts:
      cloudObject:
        keyValue:
          path: '/mongodb-replicaset-backups/{{ .StatefulSet.Name }}/rs_backup.gz'
    # Use Kanister functions to perform operations
    phases:
    - func: KubeTask
      args:
        namespace: '{{ .StatefulSet.Namespace }}'
        image: kanisterio/mongodb:0.68.0
        command:
        - bash
        - -c
        - mongodump --oplog --gzip --archive --host ${host} -u root -p "${dbPassword}"
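The `{{ .StatefulSet.Name }}` placeholders in the Blueprint are template parameters that are filled in from the object an ActionSet targets. The Python sketch below only mimics that substitution for simple dotted field references, to show how the `outputArtifacts` path turns into a concrete path. It is not Kanister's rendering code, and `render`, `FIELD`, and `params` are hypothetical names chosen for the illustration.

```python
import re

# Matches simple field references such as "{{ .StatefulSet.Name }}".
FIELD = re.compile(r"\{\{\s*\.([A-Za-z0-9_.]+)\s*\}\}")


def render(template: str, params: dict) -> str:
    """Replace each dotted field reference with its value from params."""

    def lookup(match: "re.Match[str]") -> str:
        value = params
        for key in match.group(1).split("."):
            value = value[key]  # raises KeyError if the field is unknown
        return str(value)

    return FIELD.sub(lookup, template)


# Parameters as they would look for the StatefulSet named in the ActionSet example.
params = {"StatefulSet": {"Name": "mongodb-replicaset", "Namespace": "mongodb"}}

path = render(
    "/mongodb-replicaset-backups/{{ .StatefulSet.Name }}/rs_backup.gz", params
)
print(path)  # /mongodb-replicaset-backups/mongodb-replicaset/rs_backup.gz
```

The rendered value matches the artifact path that the Example ActionSet slide shows under `status`.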
Example ActionSet
apiVersion: cr.kanister.io/v1alpha1
kind: ActionSet
metadata:
  generateName: s3backup-
  namespace: kanister-controller
spec:
  # Select an action from a Blueprint
  actions:
  - name: backup
    blueprint: mongodb-blueprint
    # Select a Kubernetes resource to run the action on
    object:
      kind: StatefulSet
      name: mongodb-replicaset
      namespace: mongodb
    # Select a Profile to use as the source/destination for the action
    profile:
      kind: profile
      name: example-profile
      namespace: kanister-controller
# Status set by the Kanister controller
status:
  actions:
  - artifacts:
      cloudObject:
        keyValue:
          path: '/mongodb-replicaset-backups/mongodb-replicaset/rs_backup.gz'
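Because an ActionSet is an ordinary Kubernetes object, automation can generate one instead of hand-writing it. The sketch below builds the `spec` portion of the manifest above as a Python dictionary and prints it as JSON, which Kubernetes tooling accepts alongside YAML. The helper name `backup_actionset` and its parameters are hypothetical; the field names are taken from the slide, and `status` is omitted because the controller sets it.

```python
import json


def backup_actionset(
    blueprint: str,
    kind: str,
    name: str,
    namespace: str,
    profile: str,
    controller_namespace: str,
) -> dict:
    """Build an ActionSet manifest that requests the 'backup' action."""
    return {
        "apiVersion": "cr.kanister.io/v1alpha1",
        "kind": "ActionSet",
        "metadata": {
            "generateName": "s3backup-",
            "namespace": controller_namespace,
        },
        "spec": {
            "actions": [
                {
                    "name": "backup",
                    "blueprint": blueprint,
                    # The Kubernetes resource the action runs against.
                    "object": {"kind": kind, "name": name, "namespace": namespace},
                    # The Profile used as the backup destination.
                    "profile": {
                        "kind": "profile",
                        "name": profile,
                        "namespace": controller_namespace,
                    },
                }
            ]
        },
    }


manifest = backup_actionset(
    blueprint="mongodb-blueprint",
    kind="StatefulSet",
    name="mongodb-replicaset",
    namespace="mongodb",
    profile="example-profile",
    controller_namespace="kanister-controller",
)
print(json.dumps(manifest, indent=2))
```

Since the manifest uses `generateName` rather than `name`, it would be submitted with `kubectl create -f` rather than `kubectl apply -f`, and each submission produces a new, uniquely named ActionSet.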
Example Profile
apiVersion: cr.kanister.io/v1alpha1
kind: Profile
metadata:
  generateName: s3profile-
  namespace: kanister-controller
# Object Store location
location:
  type: s3Compliant
  bucket: kanister-backup
# Credentials for the Object Store location
credential:
  type: keyPair
  keyPair:
    idField: example_key_id
    secretField: example_secret_access_key
    secret:
      apiVersion: v1
      kind: Secret
      name: example-secret
      namespace: example-namespace
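The Profile does not hold credentials itself; `idField` and `secretField` name two keys inside the referenced Kubernetes Secret. As a rough sketch of what that Secret could look like, the Python below builds a matching manifest. Kubernetes requires values under `data` to be base64-encoded, which the helper handles. The function name `s3_secret` and the credential values are placeholders assumed for the example.

```python
import base64
import json


def s3_secret(name: str, namespace: str, key_id: str, secret_key: str) -> dict:
    """Build a Secret whose keys match the Profile's idField and secretField."""

    def encode(value: str) -> str:
        # Secret "data" values must be base64-encoded strings.
        return base64.b64encode(value.encode("utf-8")).decode("ascii")

    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        "data": {
            "example_key_id": encode(key_id),
            "example_secret_access_key": encode(secret_key),
        },
    }


# Placeholder credentials; real values would come from the object store provider.
secret = s3_secret(
    name="example-secret",
    namespace="example-namespace",
    key_id="my-access-key-id",
    secret_key="my-secret-access-key",
)
print(json.dumps(secret, indent=2))
```

Keeping credentials in a Secret means the Profile can be shared or inspected without exposing the keys, and access to the Secret is controlled separately through Kubernetes RBAC.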
Execution Walkthrough
[Diagram built up over six slides. Components, in order of appearance: Controller, Blueprint, Database Workload; then ActionSet; then Kanister Function; then Object Storage/Cloud Snapshot.]
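The walkthrough can be approximated as a small simulation: an ActionSet names an action and a Blueprint, the controller looks the action up, runs each phase through the named Kanister function, and records the resulting artifacts in the status. The Python below is a toy model under those assumptions, not the Kanister controller. `run_actionset`, `FUNCTIONS`, and `kube_task` are hypothetical, and the stand-in `kube_task` only logs the command instead of starting a pod.

```python
from typing import Callable, Dict, List

Log = List[str]


def kube_task(args: dict, log: Log) -> None:
    """Stand-in for KubeTask: records the command instead of starting a pod."""
    log.append(f"{args['image']}@{args['namespace']}: {' '.join(args['command'])}")


# Hypothetical registry mapping function names used in Blueprints to code.
FUNCTIONS: Dict[str, Callable[[dict, Log], None]] = {"KubeTask": kube_task}


def run_actionset(actionset: dict, blueprints: Dict[str, dict], log: Log) -> dict:
    """Resolve each requested action in its Blueprint, run its phases, build status."""
    status = {"actions": []}
    for action in actionset["spec"]["actions"]:
        spec = blueprints[action["blueprint"]]["actions"][action["name"]]
        for phase in spec["phases"]:
            FUNCTIONS[phase["func"]](phase["args"], log)
        status["actions"].append(
            {"name": action["name"], "artifacts": spec.get("outputArtifacts", {})}
        )
    return status


blueprints = {
    "mongodb-blueprint": {
        "actions": {
            "backup": {
                "outputArtifacts": {"cloudObject": {"keyValue": {"path": "rs_backup.gz"}}},
                "phases": [
                    {
                        "func": "KubeTask",
                        "args": {
                            "namespace": "mongodb",
                            "image": "kanisterio/mongodb:0.68.0",
                            "command": ["bash", "-c", "mongodump --oplog --gzip --archive"],
                        },
                    }
                ],
            }
        }
    }
}
actionset = {"spec": {"actions": [{"name": "backup", "blueprint": "mongodb-blueprint"}]}}

log: Log = []
status = run_actionset(actionset, blueprints, log)
print(log)
print(status)
```

Separating the Blueprint (what to run) from the ActionSet (when and against which object) is what lets a database expert write the workflow once while cluster admins or automation trigger it.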
Demo
Kanister Functions
Custom Logic
• KubeExec
• KubeExecAll
• KubeTask
Resource Lifecycle
• Scale up/down workload
• KubeTask with kubectl command
Handle PVC
• Backup/Restore/DeleteData
• PrepareData
Volume Snapshots
• Create/Restore/Delete snapshots
Amazon RDS
• Create/Restore/Delete snapshots
• ExportSnapshotToRegion
Providers Supported
Object Storage
• AWS S3
• S3 Compliant
• Azure Blob
• Google Cloud Storage
Block/File Storage (in-tree)
• AWS EBS/EFS
• Azure Disk
• Google Persistent Disk
• IBM Disk
• CSI
Roadmap / New Features
• File Storage destinations for backups
• Encryption, deduplication and compression support with kando
• Kanister functions to manage data in Data Service Operators like K8ssandra
Next Steps
Closing
Please look at the project
Feedback & Contributions
Spread the word
An extensible open-source framework for
application-level data management on
Kubernetes
