Kubernetes Stateful Pods

Alex Egg

Kubernetes Pods are stateless by design; accordingly, the disk attached to a Pod instance is ephemeral. If you want anything to survive past a Pod restart, like a database for example, you need to mount an external disk. This guide will demonstrate how to make an example DB Pod stateful by mounting a PersistentVolume. The merits and risks of running a database inside a container are worth another discussion, which I plan to write up later.

Provision Disk

This step is independent of Kubernetes. You simply need to set up the networked disk with your 3rd-party provider. Common ones are AWS EBS or Google Compute Engine (GCE) persistent disks, which I will use.
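
For example, with GCE you can provision the disk from the gcloud CLI. A minimal sketch: the disk name matches the one referenced later, but the 10GB size and us-central1-a zone are assumptions; the zone must match the zone your cluster's nodes run in.

$ gcloud compute disks create com-eggie5-blog-production --size=10GB --zone=us-central1-a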

Kubernetes comes w/ a few adapters (volume plugins) to work with different storage providers; here are a few popular ones:

- gcePersistentDisk (Google Compute Engine persistent disks)
- awsElasticBlockStore (AWS EBS volumes)
- azureDisk (Azure data disks)
- nfs (plain NFS shares)

The Kubernetes primitive for one of the networked drives above is called a PersistentVolume. Define your PV as below:

pv.yml

kind: PersistentVolume
apiVersion: v1
metadata:
  name: db-pv
  labels:
    type: GKE
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  gcePersistentDisk:
    pdName: com-eggie5-blog-production
    fsType: ext4

You can see I’m using the GCE adapter and referencing the disk I created called com-eggie5-blog-production. The other important field is the name, db-pv, which I’ll use later to reference this PV.

Send to cluster:

$ kubectl create -f k8s/production/pv.yml
persistentvolume "db-pv" created

Check Status:

$ kubectl get pv
NAME      CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS      CLAIM     REASON    AGE
db-pv     10Gi       RWO           Retain          Available

As you can see, Kubernetes successfully registered the manually provisioned GCE disk. The status is Available, meaning the volume hasn’t been claimed or mounted yet.
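
If you want more detail than the table view, kubectl describe will show the PV’s full spec along with recent events (a standard kubectl command; output omitted here):

$ kubectl describe pv db-pv

Now I can set up the claim.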

Request Storage

The next Kubernetes primitive, called a PersistentVolumeClaim, declares that my Pod requests storage fitting the characteristics defined in the resources block. Note that while PVs are cluster-wide, PVCs are namespaced: the claim must live in the same namespace as the Pod that will use it.

pvc.yml

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: db-pvc
  labels:
    type: GKE
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
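
Nothing in this claim names db-pv explicitly; Kubernetes will bind it to any Available PV whose capacity and access modes satisfy the request. If you want to pin a claim to a specific PV, you can add a label selector under spec that matches the labels set on the PV (a sketch, reusing the type: GKE label from pv.yml):

  selector:
    matchLabels:
      type: GKE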

Send to cluster:

$ kubectl create -f k8s/production/pvc.yml --namespace production
persistentvolumeclaim "db-pvc" created

Check status:

$ kubectl get pvc --namespace production
NAME      STATUS    VOLUME    CAPACITY   ACCESSMODES   AGE
db-pvc    Bound     db-pv     10Gi       RWO           22s

Now if we check our disk’s PV again, its status should have changed:

$ kubectl get pv
NAME      CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS    CLAIM               REASON    AGE
db-pv     10Gi       RWO           Retain          Bound     production/db-pvc             6m

Use Claim

As you can see, it is now Bound and claimed by production/db-pvc. Note that the claim requested only 5Gi but bound to the 10Gi PV: a claim binds to any available PV that meets or exceeds its request. Now we can set up our Pod to use that claim:

pod.yml

apiVersion: v1
kind: Pod
metadata:
  name: test-pd
spec:
  containers:
  - image: gcr.io/google_containers/test-webserver
    name: test-container
    volumeMounts:
    - mountPath: /test-pd
      name: test-volume
  volumes:
  - name: test-volume
    # This claim (and the PV/GCE PD behind it) must already exist.
    persistentVolumeClaim:
      claimName: db-pvc

At this point we are declaring a Pod that needs persistent networked storage, and Kubernetes puts the pieces together: the Pod mounts the claim at /test-pd, the claim is bound to the 10Gi PV, and the PV points at the GCE disk.
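
To sanity-check that the disk is actually mounted, you can exec into the Pod and inspect the mount table. A quick sketch: the Pod must be created in the production namespace since that’s where the claim lives, and this assumes the container image ships a df binary.

$ kubectl create -f pod.yml --namespace production
$ kubectl exec test-pd --namespace production -- df -h /test-pd

Anything written under /test-pd will now survive Pod restarts.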

Appendix

There is a way to attach a disk to a Pod w/o using a PVC; however, I found it to be buggy: after a day or so it would get into a bad state and fail to mount the drive.

excerpt from pod config

      volumes:
        - name: mypd
          gcePersistentDisk:
            pdName: com-eggie5-blog-production
            fsType: ext4

The above config doesn’t go through a PVC (or a PV at all), but rather mounts the GCE disk directly.


