Kubernetes
#4. Volume & StatefulSet
조대협 (http://bcho.tistory.com)
Agenda
● Volume
● PersistentVolume and PersistentVolumeClaim
● StatefulSet
Volume
Volume is
● A disk resource described as part of a Pod
● It can be shared across containers in the Pod
● It must be mounted into each container that wants to access the volume
● Its life cycle is managed by the Pod
A volume is created when the Pod is started and deleted when the Pod is deleted.
(Not along with the container)
Volume type
● Temp : emptyDir, gitRepo
● Local : hostPath
● Network : GlusterFS, NFS, iSCSI, gcePersistentDisk, AWS EBS, azureDisk, Fibre Channel, Secret, vsphereVolume
Volume type - emptyDir
● A simple empty directory used for storing transient data
● It is good for ephemeral storage and for file sharing across containers
An emptyDir can use memory as a disk.
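As a minimal sketch (the pod and volume names are illustrative), an emptyDir volume is declared in the Pod spec, and setting medium: Memory backs it with tmpfs instead of node disk:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cache-pod            # illustrative name
spec:
  containers:
  - image: nginx:alpine
    name: web
    volumeMounts:
    - name: cache            # mount the shared scratch volume
      mountPath: /cache
  volumes:
  - name: cache
    emptyDir:
      medium: Memory         # use tmpfs (RAM) as the backing store
```

Because the files live in memory, they count against the container's memory usage and are lost when the Pod is deleted.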
Volume type - hostPath
● It posts to a specific file or directory on the node’s file system
● Pods running on same node and using the same path in their volume
● hostPath volume is not deleted when Pod is torn down
(If new Pod is started, the files in the hostPath volume will be remained)
(Diagram: two Pods on the same Node share /mydirectory/ through hostPath; a Pod on a different Node sees that node’s own /mydirectory/.)
If you’re thinking of using a hostPath volume as the place to store a database’s data directory, think again.
Because the volume’s contents are stored on a specific node’s filesystem, when the database pod gets
rescheduled to another node, it will no longer see the data.
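A hostPath Pod can be sketched like this (the pod name and container mount path are illustrative; /mydirectory matches the slide's diagram):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hostpath-pod         # illustrative name
spec:
  containers:
  - image: busybox
    name: main
    command: ["sleep", "3600"]
    volumeMounts:
    - name: mydir
      mountPath: /data       # path inside the container
  volumes:
  - name: mydir
    hostPath:
      path: /mydirectory     # path on the node's filesystem
```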
Volume type - gitRepo
● A gitRepo volume is basically an emptyDir volume
● It gets populated by cloning a Git repository and checking out a specific revision
when the pod is starting up (but before its containers are created)
● Useful for provisioning static (HTML) data or script sources from git
(Diagram: a Pod whose container mounts a gitRepo volume — an emptyDir populated by a git clone.)
apiVersion: v1
kind: Pod
metadata:
  name: gitrepo-volume-pod
spec:
  containers:
  - image: nginx:alpine
    name: web-server
    volumeMounts:
    - name: html
      mountPath: /usr/share/nginx/html
      readOnly: true
    ports:
    - containerPort: 80
      protocol: TCP
  volumes:
  - name: html
    gitRepo:
      repository: https://github.com/luksa/kubia-website-example.git
      revision: master
      directory: .
PersistentVolume
& PersistentVolumeClaim
PersistentVolume
● PersistentVolume
Its life cycle is managed by the k8s cluster (not the pod).
An admin can create a PersistentVolume (static provisioning), and a developer just
uses the volume through a PersistentVolumeClaim without any understanding of the
infrastructure.
(It is more common to use dynamic volume provisioning instead of volume provisioning by an admin.)
(Diagram: the admin creates a PersistentVolume on top of a physical disk; the Pod’s container mounts a Volume through a PersistentVolumeClaim, which binds to the admin-created PersistentVolume.)
PersistentVolume
● Capacity : storage size
(in the future it may include IOPS, throughput, etc.)
● VolumeMode (from 1.9) : Filesystem (default) or
raw block device
● Reclaim Policy
○ Retain – manual reclamation
○ Recycle – basic scrub (rm -rf /thevolume/*)
○ Delete – the associated storage asset, such as an
AWS EBS, GCE PD, Azure Disk, or OpenStack
Cinder volume, is deleted
● Mount Options
Additional mount options used when a
PersistentVolume is mounted on a node
Currently, only NFS and HostPath support recycling. AWS
EBS, GCE PD, Azure Disk, and Cinder volumes support
deletion.
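A statically provisioned PersistentVolume showing these fields might look like the following; the NFS server address, export path, and volume name are assumptions for illustration:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mongodb-pv                        # illustrative name
spec:
  capacity:
    storage: 1Gi                          # Capacity
  volumeMode: Filesystem                  # Filesystem (default) or Block
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain   # Reclaim Policy
  mountOptions:                           # Mount Options (driver-specific)
  - hard
  - nfsvers=4.1
  nfs:
    server: 10.0.0.2                      # assumed NFS server
    path: /exports/mongodb                # assumed export path
```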
PersistentVolume
● AccessModes
○ ReadWriteOnce (RWO) – the
volume can be mounted as
read-write by a single node
○ ReadOnlyMany (ROX) – the
volume can be mounted
read-only by many nodes
○ ReadWriteMany (RWX) – the
volume can be mounted as
read-write by many nodes
A volume can only be mounted using
one access mode at a time, even if it
supports many. For example, a
GCEPersistentDisk can be mounted as
ReadWriteOnce by a single node or
ReadOnlyMany by many nodes, but not
at the same time.
Attempting to attach an RWO volume to multiple Pods results in an error.
PersistentVolume Phase
● Available - a free resource that is not yet bound to a claim
● Bound - the volume is bound to a claim
● Released - the claim has been deleted, but the resource is not yet reclaimed
by the cluster
● Failed - the volume has failed its automatic reclamation
Lifecycle of volume and claim
Provisioning
● Static
● Dynamic : dynamically create the volume with a storage class
Binding
● Bind the PV to a PVC
Using
● Bind the PVC to a Pod and start using it
Reclaiming
PersistentVolumeClaim (PVC)
Claiming a PersistentVolume is a completely separate process from creating a pod, because you want
the same PersistentVolumeClaim to stay available even if the pod is rescheduled (remember,
rescheduling means the previous pod is deleted and a new one is created).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mongodb-pvc
spec:
  resources:
    requests:
      storage: 1Gi
  accessModes:
  - ReadWriteOnce
  storageClassName: ""
List PersistentVolumes: kubectl get pv
List PersistentVolumeClaims: kubectl get pvc
(Example from Kubernetes in Action)
Nobody else can claim the same volume until you release it.
PersistentVolumeClaim (PVC)
https://kubernetes.io/docs/concepts/storage/persistent-volumes/#persistentvolumeclaims
● accessModes : same as for volumes
● volumeMode : same as for volumes
● resources : Claims, like pods, can request specific
quantities of a resource. In this case, the request is for
storage. The same resource model applies to both volumes
and claims.
● selector : Claims can specify a label selector to further
filter the set of volumes. Only the volumes whose labels
match the selector can be bound to the claim.
All of the requirements, from both matchLabels and
matchExpressions, are ANDed together – they must all be
satisfied in order to match.
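A claim using a selector might be sketched as follows (the claim name and labels are illustrative); only PVs satisfying both conditions can be bound:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: selected-pvc         # illustrative name
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  selector:
    matchLabels:
      release: stable        # PV must carry release=stable ...
    matchExpressions:
    - key: environment       # ... AND environment in (dev)
      operator: In
      values: ["dev"]
```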
Using PVC in Pod
(Example from Kubernetes in Action)
Google Cloud Kubernetes example
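The slide shows screenshots; a minimal Pod that mounts the mongodb-pvc claim from the earlier example (following the Kubernetes in Action sample) looks roughly like this:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mongodb
spec:
  containers:
  - image: mongo
    name: mongodb
    volumeMounts:
    - name: mongodb-data
      mountPath: /data/db          # where MongoDB stores its data
    ports:
    - containerPort: 27017
      protocol: TCP
  volumes:
  - name: mongodb-data
    persistentVolumeClaim:
      claimName: mongodb-pvc       # reference the claim by name
```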
Recycling
● When a user is done with their volume, they can delete the PVC object, which allows
reclamation of the resource
● The reclaim policy of a PersistentVolume tells the cluster what to do with the volume after it
has been released of its claim
● Reclaim policy
○ Retain (remains / cannot be reused)
After the PVC is deleted, the PV still exists and the volume is marked as “released”. It cannot
be reused by another PVC. (It needs to be deleted manually.)
○ Delete
Deletes the volume as well as the associated storage asset in the external infrastructure
○ Recycle (remains / can be reused)
Performs a basic scrub (rm -rf /thevolume/*) on the volume and makes it available again
for a new claim
The Recycle reclaim policy is deprecated. Instead, the recommended approach
is to use dynamic provisioning.
Volume retaining test
1. Change the reclaim policy from Delete to Retain
2. Recreate the PVC & Pod
3. Delete the Pod and PVC
Dynamic Provisioning
Available from k8s 1.6. Without manual provisioning of PersistentVolumes, Kubernetes can dynamically
create a volume based on a PersistentVolumeClaim (and a storage class).
(Diagram: the Pod’s container mounts a Volume through a PersistentVolumeClaim; the bound PersistentVolume is dynamically created by k8s.)
Instead of the admin creating PVs manually, the admin can deploy a persistent volume provisioner and define one
or more StorageClass objects to let users choose the type of PV.
(Diagram: the PersistentVolumeClaim requests a volume with a storage class, and the PersistentVolume is dynamically created.)
Dynamic Provisioning
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
  zone: europe-west1-b

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mongodb-pvc
spec:
  storageClassName: fast
  resources:
    requests:
      storage: 100Mi
  accessModes:
  - ReadWriteOnce
Storage Class
Example
Storage Class
Get default storage class
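kubectl get storageclass marks the default class with “(default)”. A class is made the default through an annotation; a sketch with assumed names:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard             # illustrative name
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"   # marks the cluster default
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard
```

PVCs that omit storageClassName are then provisioned from this class.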
StatefulSet
Replicating stateful pods
● With a ReplicaSet
All pods will refer to the same PVC and the same PV
● Option : create one RS per Pod
StatefulSet
● Supports stateful applications (like databases)
● Supported (GA) from k8s 1.9
● Pod naming : ${StatefulSet name}-${ordinal index}
Pods get a predictable name and hostname,
but it is better to address them through a service.
● When a Pod (under a StatefulSet) is restarted (e.g. after a crash), the Pod name does not change.
Cf. in the case of an RS, the Pod gets a new name.
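The StatefulSet’s serviceName refers to a governing Service; a headless Service (clusterIP: None) gives each pod a stable DNS entry such as web-0.nginx. A sketch matching the nginx StatefulSet example:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx            # matches serviceName in the StatefulSet
spec:
  clusterIP: None        # headless: per-pod DNS records instead of a VIP
  selector:
    app: nginx
  ports:
  - port: 80
    name: web
```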
StatefulSet specification example
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: nginx # has to match .spec.template.metadata.labels
  serviceName: "nginx"
  replicas: 3 # defaults to 1
  template:
    metadata:
      labels:
        app: nginx # has to match .spec.selector.matchLabels
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: nginx
        image: k8s.gcr.io/nginx-slim:0.8
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "my-storage-class"
      resources:
        requests:
          storage: 1Gi
StatefulSet
Pod to PV mapping
● Each pod gets its own PVC and PV
● The PVCs are created from the “volume claim template” in the StatefulSet specification
POD-1 → PVC-1 → PV-1
POD-2 → PVC-2 → PV-2
POD-3 → PVC-3 → PV-3
(Pods are created from the Pod template; PVCs are created from the volume claim template.)
Scale in & out
● Scale in
POD-1 → PVC-1 → PV-1
POD-2 → PVC-2 → PV-2
        PVC-3 → PV-3
POD-3 is removed by the scale-in, but PVC-3 and PV-3 are not deleted.
● Scale out
POD-1 → PVC-1 → PV-1
POD-2 → PVC-2 → PV-2
POD-3 → PVC-3 → PV-3
POD-3 is added by the scale-out and attached to the previous PVC-3 and PV-3.
This lets the Pod connect to the same PVC and PV and retain the same stateful information on disk.
For this reason, the PVC and PV are not deleted automatically; to delete them,
the admin needs to delete the PVC and PV manually.
Pod management policies
● .spec.podManagementPolicy
○ OrderedReady (default) : start and terminate Pods sequentially
○ Parallel : start and terminate Pods in parallel
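The policy is set on the StatefulSet spec; a minimal fragment reusing the web/nginx names from the earlier example:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  podManagementPolicy: Parallel   # start/terminate all pods at once
  serviceName: "nginx"
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: k8s.gcr.io/nginx-slim:0.8
```

Note that podManagementPolicy affects only scaling operations, not rolling updates.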
End of document

Editor's Notes

1. https://kubernetes.io/blog/2017/03/dynamic-provisioning-and-storage-classes-kubernetes/ — DynamicVolume (2017.03, from K8s 1.6)
2. Events seen when a second pod tries to attach an RWO volume that is already in use:
   Warning  FailedAttachVolume  attachdetach-controller  Multi-Attach error for volume "pvc-c7218b58-63b8-11e8-b940-42010a920151": Volume is already used by pod(s) redis
   Warning  FailedMount  kubelet  Unable to mount volumes for pod "nginx_default(fa5b322b-63c5-11e8-b940-42010a920151)": timeout expired waiting for volumes to attach or mount for pod "default"/"nginx". list of unmounted volumes=[nginx-data]. list of unattached volumes=[nginx-data default-token-45kpp]