Note: I'm assuming that you're somewhat familiar with Docker.
The killer feature of Docker for us is that it lets us build layered binary images of our app. This means you can start with a minimal base image, build a Python image on top of that, then an app image on top of the Python one, and so on.
Here's the hierarchy of our docker images:
A piece of advice: if you used to run your app under supervisord, resist the temptation to do the same with Docker. Just let your container crash and let k8s handle restarting it.
There is an excellent post comparing container deployment options, which also settles on k8s.
I'll also just assume that you already did your homework and you plan to use k8s. But just to put more data out there:
The main reason: we already use Google Cloud, and it provides a ready-to-use Kubernetes cluster.
This is huge as we don't have to manage the k8s cluster and can focus on deploying our apps to production instead.
Let's begin by making a list of what we need to run our app in production:
We ran all of the above in a normal VM environment, so why would we need k8s? To understand this, let's dig a bit into what k8s offers:
There are more concepts like volumes, claims, secrets, but let's not worry about them for now.
We're using Postgres as our main storage and we are not running it using Kubernetes.
Update: we now run Postgres in k8s (1 hot standby + pghoard), so you can ignore the rest of this paragraph.
The reason is that we wanted to run Postgres on provisioned SSDs and high-memory instances. We could have created a cluster just for Postgres with these types of machines, but it seemed like overkill.
The philosophy of k8s is that you should design your cluster with the thought that its pods and nodes are just gonna die randomly. I haven't figured out how to set up Postgres with this constraint in mind, so for now we're just running it replicated with a hot standby and doing backups with WAL-E. If you want to try it with k8s there is a guide here. And make sure you tell us about it.
RabbitMQ (used as the message broker for Celery) runs on k8s, since it's easier to cluster than Postgres. I'm not gonna dive into the details: it uses a replication controller to run 3 pods, each containing a RabbitMQ instance. This guide helped: https://www.rabbitmq.com/clustering.html
As I mentioned before, we're using a replication controller to run 3 pods, each containing a duo of uWSGI and NGINX containers: gorgias/web & gorgias/nginx. Here's our replication controller web-rc.yaml config:
replicas: 3 # how many copies of the template below we need to run
- name: web
image: gcr.io/your-project/web:latest # the image that you pushed to Google Container Registry using gcloud docker push
ports: # these are the exposed ports of your Pods that are later used by the k8s Service
- containerPort: 3033
- containerPort: 9099
- name: nginx
- containerPort: 8000
- containerPort: 4430
volumeMounts: # this holds our SSL keys to be used with nginx. I haven't found a way to use the http load balancer of google with k8s.
- name: "secrets"
- name: "secrets"
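For reference, a complete manifest along these lines might look like the following. This is only a minimal sketch: the `app: web` labels, the nginx image name, the Secret name `ssl-secrets` and the mount path `/etc/nginx/ssl` are assumptions made for illustration, not values taken from our real config.

```yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: web
spec:
  replicas: 3 # how many copies of the pod template below to run
  selector:
    app: web # the controller manages pods carrying this label
  template:
    metadata:
      labels:
        app: web # must match the selector above
    spec:
      containers:
        - name: web
          image: gcr.io/your-project/web:latest
          ports:
            - containerPort: 3033
            - containerPort: 9099
        - name: nginx
          image: gcr.io/your-project/nginx:latest # assumed image name
          ports:
            - containerPort: 8000
            - containerPort: 4430
          volumeMounts:
            - name: secrets
              mountPath: /etc/nginx/ssl # assumed path for the SSL keys
              readOnly: true
      volumes:
        - name: secrets
          secret:
            secretName: ssl-secrets # assumed name of an existing Secret
```

The `volumeMounts` entry and the `volumes` entry are tied together by the shared name `secrets`, which is why that name appears twice.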
And now the web-service.yaml:
apiVersion: v1
- port: 80
- port: 443
That type: LoadBalancer at the end is super important because it tells k8s to request a public IP and route incoming traffic to the Pods matching the selector app: web.
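Put together, a service of this shape could look roughly like the sketch below. The port names and the `targetPort` values (mapping 80 and 443 onto the nginx container ports 8000 and 4430) are assumptions for illustration. Adjust them to whatever your nginx container actually listens on.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web # route to any pod carrying this label
  ports:
    - name: http # a name is required when a service exposes several ports
      port: 80
      targetPort: 8000 # assumed: nginx plain HTTP port inside the pod
    - name: https
      port: 443
      targetPort: 4430 # assumed: nginx TLS port inside the pod
  type: LoadBalancer # ask the cloud provider for a public IP
```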
If you're doing a rolling-update or just restarting your pods, you don't have to change the service. It will look for pods matching those labels.
The workers also run under a replication controller: 4 pods, each containing a single container, gorgias/worker. It doesn't need a service, as it only consumes stuff. Here's our worker-rc.yaml:
- name: worker
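A full worker manifest would follow the same pattern as the web one, minus the ports and volumes. Here is a minimal sketch, assuming an `app: worker` label and an image name modeled on the web image. Both are hypothetical.

```yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: worker
spec:
  replicas: 4 # four worker pods
  selector:
    app: worker
  template:
    metadata:
      labels:
        app: worker # must match the selector above
    spec:
      containers:
        - name: worker
          image: gcr.io/your-project/worker:latest # assumed image name
```

No `ports` section is needed here because nothing connects to the workers.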
With Kubernetes, Docker finally started to make sense to me. It provides great tools out of the box for web app deployment: Replication controllers, Services (with LoadBalancer included), Persistent Volumes, internal DNS. It should have all you need to make a resilient web app fast.
At Gorgias we're building a next-generation helpdesk that allows responding 2x faster to common customer requests, and having a fast and reliable infrastructure is crucial to achieving our goals.
If you're interested in working with this kind of stuff (especially to improve it): we're hiring!