Rails Docker App: Deployment (Kubernetes)

Alex Egg,

This is the second installment of a two-part article on migrating a Rails app to Docker and Kubernetes. This page is about deployment with Kubernetes. See part 1 here: http://www.eggie5.com/81-rails-docker-app

Kubernetes

Kubernetes is orchestration software that lets you abstract a cluster of hosts into one virtual machine. You then model your app as Kubernetes' atomic units, called pods, which run your Docker containers. Kubernetes is nice because it's declarative: you simply define the pods you want and how to connect them, and Kubernetes will set it all up. Kubernetes is fairly low-level, but it takes care of most of the work needed to run your app suite in a cluster; we will just need to set up a CI/CD tool to handle our deployment scenarios.

Deployment

Deployments allow us to declaratively define a set of Kubernetes resources that we want allocated. Kubernetes then ensures those resources keep running. For example, if there is an error and the server restarts, Kubernetes ensures the pod is rescheduled and discoverable in the cluster again.
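
You can see this self-healing behavior for yourself. A rough sketch, using the eggie5-web deployment and the name=web label defined later in this article (the pod-deletion demo itself is just an illustration):

# List the deployment and its pods
kubectl get deployment eggie5-web
kubectl get pods -l name=web

# Delete one of the pods; the Deployment notices the missing replica
# and immediately schedules a replacement
kubectl delete pod [name_of_web_pod]
kubectl get pods -l name=web --watch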

Pod

If you need a guarantee that two components (containers) share the same file system (host), define them in the same pod. Otherwise there is no guarantee where the k8s scheduler will allocate the containers in the cluster.
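
To make that concrete, here is a minimal sketch (not part of this blog's deployment; the pod name and images are hypothetical) of a two-container pod sharing an emptyDir volume. Both containers are guaranteed to land on the same node and see the same files:

# Apply a throwaway two-container pod that shares a volume
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: shared-fs-example
spec:
  volumes:
  - name: shared
    emptyDir: {}
  containers:
  - name: writer
    image: busybox
    command: ["sh", "-c", "while true; do date >> /data/out.txt; sleep 5; done"]
    volumeMounts:
    - name: shared
      mountPath: /data
  - name: reader
    image: busybox
    command: ["sh", "-c", "touch /data/out.txt; tail -f /data/out.txt"]
    volumeMounts:
    - name: shared
      mountPath: /data
EOF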

Setup Cluster

Set up a GCE cluster with at least 2 CPUs and 7.5 GB of memory (usually 2 standard machines):

$ gcloud container clusters create jenkins-cd \
  --num-nodes 3 \
  --scopes "https://www.googleapis.com/auth/projecthosting,storage-rw"

These scope settings are very important, as they will allow our Jenkins install (described below) to authenticate with Google using instance metadata.

  1. Ensure the cluster is ready: gcloud container clusters list
  2. Log in: gcloud container clusters get-credentials eggie5-blog --zone us-west1-a
  3. Ensure the nodes are ready: kubectl get nodes

Figure 1: Cluster of arbitrary size

Now remember, we don’t need to think in terms of VMs anymore; Kubernetes gives us the whole cluster modeled as one VM.

Figure 2: The Kubernetes cluster abstraction and the architecture we want to build.

DB Pod

Here is our DB deployment file, which declaratively tells Kubernetes to run the Postgres docker container and to mount a persistent disk called harambi-disk (which I provisioned via GCE).

db-deployment.yml

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: eggie5-db
spec:
  replicas: 1
  selector:
    matchLabels:
      name: db
  template:
    metadata:
      labels:
        name: db
    spec:
      containers:
      - image: postgres
        name: db
        env:
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: prod-db-secret
              key: password
        - name: POSTGRES_USER
          valueFrom:
            secretKeyRef:
              name: prod-db-secret
              key: username
        - name: PGDATA
          value: /var/lib/postgresql/data/pgdata
        ports:
        - name: pg
          containerPort: 5432
          hostPort: 5432
        volumeMounts:
          # name must match the volume name below
        - name: mypd
          # mount path within the container
          mountPath: /var/lib/postgresql/data
      volumes:
        - name: mypd
          gcePersistentDisk:
            pdName: harambi-disk
            fsType: ext4
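
To apply this and check that the pod came up, something like the following works (assuming the file lives in the configs/ directory used later, and that the prod-db-secret referenced above has already been created; see the Secrets API section in the appendix):

kubectl apply -f configs/db-deployment.yml

# Verify the pod is running and that the GCE disk volume is attached
kubectl get pods -l name=db
kubectl describe pod -l name=db | grep -A3 Volumes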
      

Next we will set up a Service which exposes our DB on port 5432 internally to the cluster. It routes traffic to the DB pod because its name: db selector matches the labels in the deployment file above:

db-service.yml

apiVersion: v1
kind: Service
metadata:
  labels:
    name: db
  name: db
spec:
  ports:
    - port: 5432
      targetPort: 5432
  selector:
    name: db
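
Once the Service is up, any pod in the same namespace can reach Postgres at the hostname db via the cluster DNS. A quick sanity check (the throwaway busybox pod here is just for illustration):

# Resolve the service name from inside the cluster
kubectl run dns-test --rm -ti --image=busybox --restart=Never -- nslookup db

# Or inspect the service and its endpoints directly
kubectl get svc db
kubectl get endpoints db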

Rails Pod

Now we will tell Kubernetes how we want our app server pod set up. Notice the declarative syntax:

web-deployment.yml

kind: Deployment
apiVersion: extensions/v1beta1
metadata:
  name: eggie5-web
spec:
  replicas: 1
  selector:
    matchLabels:
      name: web
  template:
    metadata:
      labels:
        app: web
        name: web
    spec:
      containers:
      - name: web
        image: gcr.io/eggie5-blog/blog:latest
        env:
          - name: POSTGRES_PASSWORD
            valueFrom:
              secretKeyRef:
                name: prod-db-secret
                key: password
          - name: POSTGRES_USER
            valueFrom:
              secretKeyRef:
                name: prod-db-secret
                key: username
          #move this OUT
          - name: RAILS_ENV
            value: development
          - name: RACK_ENV
            value: development
          - name: PORT
            value: "3000"
        imagePullPolicy: Always
        ports:
        - containerPort: 3000
          name: http-server
        livenessProbe:
          httpGet:
            path: /
            port: 3000
          initialDelaySeconds: 30
          timeoutSeconds: 10

In the config above, we specify the docker image for the rails app and then pass in some ENV vars the rails app expects.

In our rails db config, we can now set the host to db as defined in the database service config.

database.yml

production:
  adapter: postgresql
  encoding: unicode
  database: mysite_production
  pool: 5
  host: db
  username: <%= ENV["POSTGRES_USER"] %>
  password: <%= ENV["POSTGRES_PASSWORD"] %>

Next, we need to expose our web pod via port 80, which we will then expose to the internet behind a load balancer.

web-service.yml

apiVersion: v1
kind: Service
metadata:
  name: blog-frontend
  labels:
    name: web
spec:
  type: NodePort
  ports:
    # The port that this service should serve on.
    - port: 80
      targetPort: 3000
      protocol: TCP
  # Label keys and values that must match in order to receive traffic for this service.
  selector:
    name: web
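
After applying this you can confirm the allocated node port and that the Service found the web pod (again assuming the configs/ directory layout used later):

kubectl apply -f configs/web-service.yml

# Shows the cluster IP and the node port that was allocated
kubectl get svc blog-frontend

# Should list the web pod's IP on port 3000
kubectl get endpoints blog-frontend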

Ingress

K8s has a primitive called Ingress which you can configure to act like a low-level load balancer. Ideally you would configure this yourself using nginx or something similar; however, since we are running on GKE we can use the convenient built-in one (backed by a GCE load balancer).

This is the top-level ingress point for the cluster; DNS for *.eggie5.com points here:

ingress.yml

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: eggie5-blog
  annotations:
    kubernetes.io/ingress.global-static-ip-name: "com-eggie5-ingress"
spec:
  tls:
  - secretName: tls
  backend:
    serviceName: blog-frontend
    servicePort: 80

The Ingress annotation is GKE-specific notation that references my static IP allocated in GCE. We are also setting up TLS termination with this Ingress. You can then point your DNS at that static IP. When someone hits the Ingress at that IP, TLS is terminated and the request is routed to the web pod via port 80, which was exposed by the previous Service.
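
For reference, the two things this Ingress depends on, the global static IP named in the annotation and the tls secret, can be created roughly like this (the cert/key paths below are placeholders):

# Reserve the global static IP referenced by the annotation
gcloud compute addresses create com-eggie5-ingress --global

# Upload the certificate and key as the secret the Ingress expects
kubectl create secret tls tls --cert=/path/to/tls.crt --key=/path/to/tls.key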

Cluster Command & Control

Each Kubernetes cluster exposes an API to which you can send commands; the kubectl CLI is the most common way to use it. To send the declarative configs I defined above to the cluster, I just run:

kubectl apply -f configs/

This sends all the configs in the directory to the cluster, which then starts building the declared model: it pulls the docker images from the registry and allocates pods across the nodes/VMs in the cluster.
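
You can watch the cluster converge on the declared state (the deployment names are the ones from the configs above):

# Watch pods get scheduled and become Ready
kubectl get pods --watch

# Or block until a specific deployment finishes rolling out
kubectl rollout status deployment/eggie5-web
kubectl rollout status deployment/eggie5-db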

If you want to start exploring deployment strategies, a common pattern is separate staging and production environments. One elegant way to do this in Kubernetes is with namespaces.** This will require us to modify a lot of config files, so next we'll look at a tool that automates the work for us: a build server that will deploy for us.

** See the appendix for a discussion of namespaces for staging/prod

Continuous Deployment

We will use a Jenkins CI/CD server*** that will run our build job when we push to our git repo. Here are the build/deployment steps:

  1. Pull the branch
  2. Build Docker image
  3. Run Test Suite
  4. Push Docker Image to Repo
  5. Create respective Kubernetes namespace (stage/prod/dev)
  6. Update the web deployment file to reference the new Docker Image
  7. Apply the respective Kubernetes config files to the cluster

This is all automated with the following Jenkinsfile:

node {
  def project = 'eggie5-blog'

  def appName = 'blog'
  def feSvcName = "${appName}-frontend"
  def configName = env.BRANCH_NAME

  if (env.BRANCH_NAME == "production") {
    namespace = "production"
    configName = "production"    
  } else if (env.BRANCH_NAME == "master") {
    namespace = "staging"
    configName = "staging"
  } else {
    namespace = env.BRANCH_NAME
    configName = "dev"
  }

  //def imageTag = "gcr.io/${project}/${appName}:${namespace}:${env.BRANCH_NAME}.${env.BUILD_NUMBER}"
  def imageTag = "gcr.io/${project}/${appName}:${env.BRANCH_NAME}.${env.BUILD_NUMBER}"

  checkout scm
 
  stage 'Build image'
  sh("docker build -t ${imageTag} .")

  stage 'Run tests'
  //sh("docker run ${imageTag} rake test")

  stage 'Push image to registry'
  sh("gcloud docker -- push ${imageTag}")

  stage "Deploy Application to ${namespace} namespace"

  // Create the namespace if it doesn't exist
  sh("kubectl get ns ${namespace} || kubectl create ns ${namespace}")

  //look into kubecfg to do this instead of sed
  //  kubectl set image deployment/web nginx=nginx:1.9.1
  sh("sed -i.bak 's#gcr.io/eggie5-blog/blog:latest#${imageTag}#' ./Kubernetes/${configName}/*.yml")

  switch (namespace) {
    case ["staging", "production"]:
        sh("kubectl --namespace=${namespace} apply -f Kubernetes/${namespace}/")
        sh("kubectl --namespace=${namespace} apply -f Kubernetes/${namespace}/services/")
        sh("echo http://`kubectl --namespace=${namespace} get service/${feSvcName} --output=json | jq -r '.status.loadBalancer.ingress[0].ip'` > ${feSvcName}")
        
        break

    default:
        echo "Running 1-off deploy for branch: ${env.BRANCH_NAME} on namespace/env: ${namespace}"
        // Don't use public load balancing for development branches
        sh("sed -i.bak 's#LoadBalancer#ClusterIP#' ./Kubernetes/services/frontend.yml")
        sh("kubectl --namespace=${namespace} apply -f Kubernetes/dev/")
        sh("kubectl --namespace=${namespace} apply -f Kubernetes/services/")
        echo 'To access your environment run `kubectl proxy`'
        echo "Then access your service via http://localhost:8001/api/v1/proxy/namespaces/${namespace}/services/${feSvcName}:80/"
        
  }
}

Then Kubernetes will work to build the defined architecture.

*** See the appendix for instructions on how to set up Jenkins in your cluster

Appendix

One-off commands

Run rake task

kubectl exec [name_of_pod] -- rake db:setup

Rails Console

kubectl exec -ti [name_of_pod] -- rails c
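
Both commands need a concrete pod name. One way to grab the name of the running web pod (a sketch, using the name=web label from the deployment above):

# Look up the first pod matching the web deployment's label
POD=$(kubectl get pods -l name=web -o jsonpath='{.items[0].metadata.name}')

# Then run the one-off commands against it
kubectl exec $POD -- rake db:setup
kubectl exec -ti $POD -- rails c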

Staging / Prod Environments (Namespaces)

The staging/prod deployment workflow can be handled by creating a separate cluster in GCE, or you can use the suggested method of a Kubernetes primitive called Namespaces.

Create a new namespace for staging:

staging-namespace.yml

kind: Namespace
apiVersion: v1
metadata:
  name: staging-namespace
  labels:
    name: staging-namespace

Then create it:

kubectl create -f staging-namespace.yml

View the current namespaces; K8s uses default by default:

kubectl get namespaces

Now let's see the services running in staging:

kubectl get services --namespace=staging-namespace
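
To actually stand up a copy of the app in the new namespace, you apply the same configs with the --namespace flag (this is essentially what the Jenkins pipeline above does for staging and production; the configs/ path is the one assumed earlier):

# Deploy the full set of configs into the staging namespace
kubectl --namespace=staging-namespace apply -f configs/

# Everything is scoped: the same `db` service name resolves independently per namespace
kubectl --namespace=staging-namespace get pods,services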

Setting up Jenkins

Create the Jenkins Home Volume

In order to pre-populate Jenkins with the necessary plugins and configuration for the rest of the tutorial, you will create a volume from an existing tarball of that data.

gcloud compute images create jenkins-home-image --source-uri https://storage.googleapis.com/solutions-public-assets/jenkins-cd/jenkins-home-v2.tar.gz
gcloud compute disks create jenkins-home --image jenkins-home-image --zone us-west1-a

Create the jenkins namespace:

kubectl create ns jenkins

Bootstrap Jenkins Install

Here you’ll create a Deployment running a Jenkins container with a persistent disk attached containing the Jenkins home directory.

First, set the password for the default Jenkins user. Edit jenkins/Kubernetes/options and replace CHANGE_ME with the password of your choice. To generate a random password and substitute it in the file, you can run:

$ PASSWORD=`openssl rand -base64 15`; echo "Your password is $PASSWORD"; sed -i.bak s#CHANGE_ME#$PASSWORD# jenkins/Kubernetes/options
Your password is 2UyiEo2ezG/dKnUcdPdt

Now create the secret using kubectl:

$ kubectl create secret generic jenkins --from-file=jenkins/Kubernetes/options --namespace=jenkins
secret "jenkins" created

Now, run the Jenkins K8s deployment files:

$ kubectl apply -f jenkins/Kubernetes/
deployment "jenkins" created
service "jenkins-ui" created
service "jenkins-discovery" created

Check that your master pod is in the running state

$ kubectl get pods --namespace jenkins
NAME                   READY     STATUS    RESTARTS   AGE
jenkins-master-to8xg   1/1       Running   0          30s

Now, check that the Jenkins Service was created properly:

$ kubectl get svc --namespace jenkins
NAME                CLUSTER-IP      EXTERNAL-IP   PORT(S)     AGE
jenkins-discovery   10.79.254.142   <none>        50000/TCP   10m
jenkins-ui          10.79.242.143   nodes         8080/TCP    10m

Kubernetes makes it simple to deploy an Ingress resource to act as a public load balancer and SSL terminator.

The Ingress resource is defined in jenkins/Kubernetes/lb/ingress.yaml. We use the Kubernetes secrets API to add our certs securely to our cluster so they are ready for the Ingress to use.

In order to create your own certs run:

$ openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout /tmp/tls.key -out /tmp/tls.crt -subj "/CN=jenkins/O=jenkins"

Now you can upload them to Kubernetes as secrets:

$ kubectl create secret generic tls --from-file=/tmp/tls.crt --from-file=/tmp/tls.key --namespace jenkins

Now that the secrets have been uploaded, create the ingress load balancer. Note that the secrets must be created before the ingress, otherwise the HTTPS endpoint will not be created.

$ kubectl apply -f jenkins/Kubernetes/lb

Wait for the load balancer to be ready:

$  kubectl get ingress --namespace jenkins
NAME      RULE      BACKEND            ADDRESS         AGE
jenkins   -         master:8080        130.X.X.X       4m

Create a pipeline

This is all adapted from https://github.com/GoogleCloudPlatform/continuous-deployment-on-kubernetes; look there for more details.

Phase 1: Add your service account credentials

First we will need to configure our GCP credentials in order for Jenkins to be able to access our code repository

  1. In the Jenkins UI, Click “Credentials” on the left
  2. Click either of the “(global)” links (they both route to the same URL)
  3. Click “Add Credentials” on the left
  4. From the “Kind” dropdown, select “Google Service Account from metadata”
  5. Click “OK”

You should now see 2 Global Credentials. Make a note of the name of the second credential, as you will reference it in Phase 2.

Phase 2: Create a job

This lab uses Jenkins Pipeline to define builds as groovy scripts.

Navigate to your Jenkins UI and follow these steps to configure a Pipeline job (hot tip: you can find the IP address of your Jenkins install with kubectl get ingress --namespace jenkins):

  1. Click the “Jenkins” link in the top left of the interface
  2. Click the New Item link in the left nav
  3. Name the project sample-app, choose the Multibranch Pipeline option, then click OK
  4. Click Add Source and choose git
  5. Paste the HTTPS clone URL of your sample-app repo on Cloud Source Repositories into the Project Repository field.
    It will look like: https://source.developers.google.com/p/REPLACE_WITH_YOUR_PROJECT_ID/r/default
  6. From the Credentials dropdown, select the name of the credentials created in Phase 1.
  7. Under “Build Triggers”, check “Build Periodically” and enter * * * * * into the “Schedule” field; this ensures that Jenkins checks our repository for changes every minute.
  8. Click Save, leaving all other options with their defaults

A job entitled “Branch indexing” was kicked off to identify the branches in your repository. If you refresh Jenkins you should see that the master branch now has a job created for it.

Other CD Solutions

Deis

Deis is essentially a private Heroku-style PaaS. It’s very impressive software, packaged up as a Helm Chart. I was able to install Deis on my cluster in less than 5 minutes and then deploy a simple Rails app just like I did on Heroku.

$ helm repo add deis https://charts.deis.com/workflow 
$ helm install deis/workflow --version=v2.8.0 --namespace=deis 

Then to deploy (just like heroku):

git push deis master

Deis is Kubernetes-native. It does something pretty elegant: each app you deploy to your PaaS is modeled as a Kubernetes namespace. I thought that was a clean use of this Kubernetes primitive.

Deis is a very high-level construct, and from my initial usage the app developer is never exposed to Kubernetes. If you’re looking for a lower bill, this could be a smooth migration route from Heroku.

Fabric8

Seems like a nice platform to handle all the CI/CD. I tried to get this set up for a few hours on GCE, but it doesn’t seem to install very well, and I saw a comment about how it doesn’t support PVs on GCE. I also tried a managed Fabric8 install by stackpoint.io on EC2, but that install wasn’t functional (missing runtimes) either.

OpenShift Origin

Looks interesting; I just found this today. It seems to be an opinionated layer on top of Kubernetes. I will look at this more.

Secrets API

Storing passwords to databases and services

Set the password in the cluster:
kubectl create secret generic prod-db-secret --from-literal=username=postgres --from-literal=password=

Then expose the secrets via ENV vars in the container:

        env:
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: prod-db-secret
              key: password
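
If you want to double-check what was stored, secret values are base64 encoded in the API, so you can read a key back like this:

# Decode the username key from the prod-db-secret
kubectl get secret prod-db-secret -o jsonpath='{.data.username}' | base64 --decode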

Persistent Disk

The container file system only lives as long as the container does, so if your app state needs to survive relocation, reboots, and crashes, you’ll need to configure some persistent storage. To access more durable storage outside the container file system, you need a volume. This is especially important for stateful applications, such as key-value stores and databases.

Our database pod, like every pod we create, comes with an ephemeral disk already mounted. However, we want somewhere permanent to store our database files. To fix this we need to set up a persistent disk for our database and then configure our DB pod to discover and mount it. In K8s this is called a Persistent Volume.

It’s easy to create a persistent disk using the Google Cloud Compute website; the hard part, historically, has been formatting the disk, as it comes unformatted (but see the update below). We’ll walk through the steps here:

Provision Disk:

gcloud compute disks create [DISK_NAME] --size 10 --type pd-standard --zone us-west1-a

Update: The following notes regarding Dynamic Disk Provisioning were added on 11/28/16. Previously this section contained instructions on how to manually format the disk; I was unaware that automatic formatting was added in 1.2, and most literature on the internet still shows manual provisioning.

Before March 2016 it was necessary to manually format disks. The process required starting up a GCE instance, mounting the volume, and then formatting it, as documented here: https://cloud.google.com/compute/docs/disks/add-persistent-disk#formatting The newly formatted disk could then be used in Kubernetes as a PV.

However, as of version 1.2.0 (March 2016), Kubernetes will automatically format the disk for you: https://github.com/kubernetes/kubernetes/pull/16948 This is a very nice convenience. Unfortunately, a lot of literature online still tells users to manually format disks, which adds a burden when setting up a stateful pod.

From the release notes: https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG.md#v120

Dynamic Provisioning of PersistentVolumes: Kubernetes previously required all volumes to be manually provisioned by a cluster administrator before use. With this feature, volume plugins that support it (GCE PD, AWS EBS, and Cinder) can automatically provision a PersistentVolume to bind to an unfulfilled PersistentVolumeClaim.
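
As a rough sketch of what dynamic provisioning looks like in practice, a PersistentVolumeClaim like the following causes GKE to create and format a GCE PD for you, which a pod can then reference by claim name. The claim name here is hypothetical, and the annotation shown is the pre-1.6 style; newer clusters use spec.storageClassName instead:

kubectl apply -f - <<EOF
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: db-data
  annotations:
    # older annotation-based style; on newer clusters set spec.storageClassName
    volume.beta.kubernetes.io/storage-class: "standard"
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
EOF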

