Tutorial: Snapshotting Persistent Volume Claims in Kubernetes
 
        
In this blog post I will show you how to create snapshots of persistent volumes in a Kubernetes cluster and restore them again, talking only to the API server. This is useful for backups, or when scaling stateful applications that need “startup data”.
The snapshot feature was introduced as alpha in Kubernetes v1.12. For this to work, you need to enable the VolumeSnapshotDataSource feature gate on the API server of your cluster:
--feature-gates=VolumeSnapshotDataSource=true
I will be using Rook to provision my storage, since it supports layering and provides a CSI driver.
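Where the feature gate flag goes depends on how the control plane is deployed. As a rough sketch, assuming a kubeadm-style cluster where the API server runs as a static pod (the file path and the surrounding fields are assumptions, not something this setup prescribes), the flag is appended to the kube-apiserver command:

```yaml
# Fragment of /etc/kubernetes/manifests/kube-apiserver.yaml (kubeadm layout, assumed).
# Only the last line is new; every existing flag stays as it is.
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
    - name: kube-apiserver
      command:
        - kube-apiserver
        # ... existing flags ...
        - --feature-gates=VolumeSnapshotDataSource=true
```

On a kubeadm cluster the kubelet restarts the API server when this file changes. Managed Kubernetes offerings usually expose feature gates through their own configuration instead.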
I assume you have an application up and running in your cluster. In my case, I have Jira Software running in Data Center mode with one active node provisioned with ASK.
To scale horizontally, I need a copy of Node0's home folder before I can start Node1. So, we start by defining some objects in Kubernetes.
Creating the StorageClass
When you create your StorageClass for Rook, you need to add imageFeatures and set it to layering as shown below:
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
# Change "rook-ceph" provisioner prefix to match the operator namespace if needed
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
    # clusterID is the namespace where the rook cluster is running
    clusterID: rook-ceph
    # Ceph pool into which the RBD image shall be created
    pool: replicapool
    # RBD image format. Defaults to "2".
    imageFormat: "2"
    # RBD image features. Available for imageFormat: "2". CSI RBD currently supports only `layering` feature.
    imageFeatures: layering
    # The secrets contain Ceph admin credentials.
    csi.storage.k8s.io/provisioner-secret-name: rook-ceph-csi
    csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
    csi.storage.k8s.io/node-stage-secret-name: rook-ceph-csi
    csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
    # Specify the filesystem type of the volume. If not specified, csi-provisioner
    # will set default as `ext4`.
    csi.storage.k8s.io/fstype: xfs
# Delete the rbd volume when a PVC is deleted
reclaimPolicy: Delete
When we deploy Jira with ASK, we simply reference this StorageClass, and Rook creates the storage when needed.
So, now we have one PVC for the home folder and one for the Data Center volume.
The Data Center volume is out of scope for this blog post, as it is not block storage but a shared filesystem (Read Write Many) in Rook.
Creating the VolumeSnapshotClass and your first Snapshot
Now we define a VolumeSnapshotClass to handle our snapshots:
apiVersion: snapshot.storage.k8s.io/v1alpha1
kind: VolumeSnapshotClass
metadata:
  name: csi-rbdplugin-snapclass
snapshotter: rook-ceph.rbd.csi.ceph.com
parameters:
  # Specify a string that identifies your cluster. Ceph CSI supports any
  # unique string. When Ceph CSI is deployed by Rook use the Rook namespace,
  # for example "rook-ceph".
  clusterID: rook-ceph
  csi.storage.k8s.io/snapshotter-secret-name: rook-ceph-csi
  csi.storage.k8s.io/snapshotter-secret-namespace: rook-ceph
And then we are ready to create a snapshot of the source PVC, in this case jira-persistent-storage-jira-0.
apiVersion: snapshot.storage.k8s.io/v1alpha1
kind: VolumeSnapshot
metadata:
  name: rbd-pvc-snapshot
spec:
  snapshotClassName: csi-rbdplugin-snapclass
  source:
    name: jira-persistent-storage-jira-0
    kind: PersistentVolumeClaim
This gives us a VolumeSnapshot object, as seen here:
kubectl get volumesnapshots -n jira-production
NAME               AGE
rbd-pvc-snapshot   57m
Creating a new PVC from our snapshot
Now, if we want to create a new PVC based on this VolumeSnapshot, we define it like this:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: jira-persistent-storage-jira-1
spec:
  storageClassName: rook-ceph-block
  dataSource:
    name: rbd-pvc-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi 
Now we have a second PVC called jira-persistent-storage-jira-1, based on the PVC jira-persistent-storage-jira-0 and containing all its data as of the snapshot. We can now scale our Jira StatefulSet, and the new Jira Node1 will use this PVC, which is a copy of Node0's.
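The reason the new pod picks up this PVC is the StatefulSet naming rule: each claim is named after the volume claim template plus the pod name, and a claim that already exists under that name is used instead of being created. The fragment below is only a sketch of what such a StatefulSet could look like. The template name and StatefulSet name are inferred from the PVC and pod names in this post, while the Service name, label and image are placeholders rather than what ASK actually generates.

```yaml
# Hypothetical StatefulSet fragment, not the ASK manifest.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: jira
  namespace: jira-production
spec:
  serviceName: jira                  # assumed headless Service name
  replicas: 2                        # raised from 1 so that jira-1 is started
  selector:
    matchLabels:
      app: jira                      # assumed label
  template:
    metadata:
      labels:
        app: jira
    spec:
      containers:
        - name: jira
          image: example/jira-software:placeholder   # placeholder image
          volumeMounts:
            - name: jira-persistent-storage
              mountPath: /var/atlassian/application-data/jira
  volumeClaimTemplates:
    # Claim names become <template name>-<pod name>, for example
    # jira-persistent-storage-jira-1, which matches the PVC created above.
    - metadata:
        name: jira-persistent-storage
      spec:
        storageClassName: rook-ceph-block
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 5Gi
```

Because the restored PVC is created before scaling, the StatefulSet controller binds jira-1 to it rather than provisioning an empty volume.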
$ kubectl get pvc -n jira-production
NAME                             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS          AGE
jira-datacenter-pvc              Bound    pvc-a286df18-f9a3-4c52-b0a0-377a193f04de   5Gi        RWX            atlassian-dc-cephfs   93m
jira-persistent-storage-jira-0   Bound    pvc-b73b96d6-c3f7-4448-9f12-d9956efe2989   5Gi        RWO            rook-ceph-block       93m
jira-persistent-storage-jira-1   Bound    pvc-1984c9de-d13e-435e-b59c-28731d8f30bc   5Gi        RWO            rook-ceph-block       60m
Verification
We can verify the copy by looking at the mount point inside each container once it has started up. The cluster.properties file has a different timestamp because our entrypoint script modifies it before starting Jira.
$ kubectl exec -ti jira-0 -n jira-production -- ls -l /var/atlassian/application-data/jira/
total 12
drwxrws---. 4 jira jira   46 Aug 22 13:43 caches
-rw-rw-r--. 1 jira jira  633 Aug 22 13:42 cluster.properties
-rw-rw----. 1 jira jira 1102 Aug 22 13:30 dbconfig.xml
drwxr-s---. 2 jira jira 4096 Aug 22 13:58 localq
drwxrws---. 2 jira jira  132 Aug 22 14:01 log
drwxrws---. 2 jira jira   76 Aug 22 13:32 monitor
drwxrws---. 6 jira jira  100 Aug 22 13:31 plugins
drwxrws---. 3 jira jira   26 Aug 22 13:24 tmp
$ kubectl exec -ti jira-1 -n jira-production -- ls -l /var/atlassian/application-data/jira/
total 12
drwxrws---. 4 jira jira   46 Aug 22 13:43 caches
-rw-rw-r--. 1 jira jira  633 Aug 22 13:57 cluster.properties
-rw-rw----. 1 jira jira 1102 Aug 22 13:30 dbconfig.xml
drwxr-s---. 2 jira jira 4096 Aug 22 13:58 localq
drwxrws---. 2 jira jira  100 Aug 22 13:32 log
drwxrws---. 2 jira jira   76 Aug 22 13:32 monitor
drwxrws---. 6 jira jira  100 Aug 22 13:31 plugins
drwxrws---. 3 jira jira   26 Aug 22 13:24 tmp
We can also see that we now have a VolumeSnapshotContent object in our cluster:
$ kubectl get VolumeSnapshotContent
NAME                                               AGE
snapcontent-05166c28-cdf9-4504-89c8-29c67ee23c11   73m
$ kubectl describe VolumeSnapshotContent snapcontent-05166c28-cdf9-4504-89c8-29c67ee23c11
Name:         snapcontent-05166c28-cdf9-4504-89c8-29c67ee23c11
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  snapshot.storage.k8s.io/v1alpha1
Kind:         VolumeSnapshotContent
Metadata:
  Creation Timestamp:  2019-08-22T11:56:35Z
  Finalizers:
    snapshot.storage.kubernetes.io/volumesnapshotcontent-protection
  Generation:        1
  Resource Version:  176903
  Self Link:         /apis/snapshot.storage.k8s.io/v1alpha1/volumesnapshotcontents/snapcontent-05166c28-cdf9-4504-89c8-29c67ee23c11
  UID:               0a6afd6d-032d-4bf6-841d-a37146daf799
Spec:
  Csi Volume Snapshot Source:
    Creation Time:    1566474995000000000
    Driver:           rook-ceph.rbd.csi.ceph.com
    Restore Size:     5368709120
    Snapshot Handle:  0001-0009-rook-ceph-0000000000000003-e322e4c4-c4d3-11e9-afc8-0a580a2a0033
  Deletion Policy:    Delete
  Persistent Volume Ref:
    API Version:        v1
    Kind:               PersistentVolume
    Name:               pvc-b73b96d6-c3f7-4448-9f12-d9956efe2989
    Resource Version:   171532
    UID:                b8eed866-4e73-4a6a-bf74-d8fba8c9a8f5
  Snapshot Class Name:  csi-rbdplugin-snapclass
  Volume Snapshot Ref:
    API Version:       snapshot.storage.k8s.io/v1alpha1
    Kind:              VolumeSnapshot
    Name:              rbd-pvc-snapshot
    Namespace:         jira-production
    Resource Version:  176889
    UID:               05166c28-cdf9-4504-89c8-29c67ee23c11
Events:                <none>
Get Kubernetes to do it for you
So what is all this good for? Until now, we had to help Kubernetes every time we scaled a Jira, Confluence or Bitbucket Data Center installation, because the data had to be copied around. That could be automated with scripts, but now Kubernetes can do it for us.
This is still in alpha and, at the time of writing, Rook supports it only for block storage, but the developers told us they are working on support for the shared filesystem as well.
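The manifests in this post target the alpha API. On newer clusters the snapshot API is served as snapshot.storage.k8s.io/v1, and a few fields are named differently: snapshotter became driver, deletionPolicy is required, and the VolumeSnapshot source is given as persistentVolumeClaimName. As a rough sketch, the two objects from this post would look roughly like this. The parameters and secret names are carried over unchanged from the alpha example and may differ in current Rook releases, so treat them as assumptions.

```yaml
# Sketch of the same objects under the v1 snapshot API.
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-rbdplugin-snapclass
driver: rook-ceph.rbd.csi.ceph.com   # was "snapshotter" in v1alpha1
deletionPolicy: Delete               # required in v1
parameters:
  clusterID: rook-ceph
  # Secret names copied from the alpha example; check your Rook version.
  csi.storage.k8s.io/snapshotter-secret-name: rook-ceph-csi
  csi.storage.k8s.io/snapshotter-secret-namespace: rook-ceph
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: rbd-pvc-snapshot
  namespace: jira-production
spec:
  volumeSnapshotClassName: csi-rbdplugin-snapclass   # was "snapshotClassName"
  source:
    persistentVolumeClaimName: jira-persistent-storage-jira-0
```

The PVC that restores from the snapshot keeps the same dataSource block as shown earlier.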
Also, we can now create snapshots as backups of our running applications. If we want, we can then start a backup pod that will mount this backup PVC and copy it outside the cluster to some cold backup location.
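A backup pod along those lines could be sketched as follows. This is an illustration under assumptions, not a tested backup solution: jira-backup-source is a hypothetical PVC created from a snapshot in the same way as jira-persistent-storage-jira-1 above, and backup-target is a hypothetical claim standing in for wherever the archive should end up.

```yaml
# Hypothetical one-shot backup pod; both claim names are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: jira-backup
  namespace: jira-production
spec:
  restartPolicy: Never
  containers:
    - name: backup
      image: busybox:1.36
      # Pack the restored home folder into a single tar archive.
      command:
        - sh
        - -c
        - tar -cf /backup/jira-home.tar -C /data .
      volumeMounts:
        - name: source
          mountPath: /data
          readOnly: true             # the backup never writes to the copy
        - name: target
          mountPath: /backup
  volumes:
    - name: source
      persistentVolumeClaim:
        claimName: jira-backup-source   # PVC restored from the VolumeSnapshot
    - name: target
      persistentVolumeClaim:
        claimName: backup-target        # stand-in for the cold backup location
```

Since the pod works on a PVC restored from the snapshot, the running Jira node and its own volume are left untouched while the archive is written.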
