caktus.django-k8s

An Ansible role with sane defaults to deploy a Django app to Kubernetes.

It also sets things up so that the deployment can be updated from an automated CI service without manual intervention for authentication.

License

This Ansible role is released under the BSD License. See the LICENSE file for more details.

Development sponsored by Caktus Consulting Group, LLC.

Requirements

  • Kubernetes 1.19+ (use 1.4.x of this role for prior versions of Kubernetes)

  • The openshift and kubernetes-validate Python packages are required. For kubernetes-validate, install the release that matches your Kubernetes cluster's minor version. For example, for Kubernetes 1.23:

    pip install -U openshift kubernetes-validate~=1.23.0
    

Installation

  1. Add to your requirements.yml:
---
# file: deployment/requirements.yaml

- src: https://github.com/caktus/ansible-role-django-k8s
  name: caktus.django-k8s
  2. Add the role to your playbook:
---
# file: deploy.yaml

- hosts: k8s_clusters
  vars:
    ansible_python_interpreter: "{{ ansible_playbook_python }}"
  roles:
  - role: caktus.django-k8s
  3. Create an inventory file for your clusters:
---
# file: inventory.yaml

all:
  children:
    k8s_clusters:
      vars:
        ansible_connection: local
      hosts:
        gcp-staging:
          k8s_auth_host: <https://....Cluster API endpoint URL.....>
          k8s_domain_names:
          - www.example.com

See defaults/main.yml for all the variables that can be overridden.
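
For example, a group_vars file could override a few of them; the file path and values below are purely illustrative:

# file: group_vars/k8s_clusters.yaml (illustrative path and values)
k8s_namespace: myproject-staging
k8s_container_image: registry.example.com/myproject
k8s_container_image_tag: latest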

The k8s_auth_host variable must be set. It is the API endpoint URL of the cluster to deploy to. Some examples of what it might look like:

AKS: https://ratom-staging-dns-ba5d6fd2.hcp.eastus.azmk8s.io:443

AWS: https://74406E3AD450E7845D0EF653E7C6F020.gr7.us-west-2.eks.amazonaws.com

Digital Ocean: https://fc22cd06-0dc4-4e19-a1cd-e0064d2d151e.k8s.ondigitalocean.com

GKE: https://104.196.6.244

Minikube: https://192.168.99.100:8443

If you're sure you have kubectl set up to talk to your cluster, then you can run this to print your k8s_auth_host value:

kubectl config view --minify=true -o jsonpath='{.clusters[0].cluster.server}' --raw

Alternatively, if you used aws-web-stacks to create an EKS cluster, then the ClusterEndpoint output in CloudFormation is the value to use.

Usage

This role should first be run interactively by a user who already has kubectl access to the cluster (i.e., they can run kubectl cluster-info and see the cluster information). How to set that up differs by Kubernetes hosting environment.

(If you get a bunch of SSL errors the first time you run this role, check that k8s_auth_host and your current kubectl context both point to the same cluster.)

On the first run, the role uses the user's kubectl access to gather some information that the user will need to save in Ansible variables for later use by the CI service.

This role will also create a "deploy account" in Kubernetes which has the necessary permissions to deploy.

Follow the instructions printed during that first run (putting some information into variables and files). Then run the role again; this time it should complete successfully and create the various Kubernetes objects.

After that, the role should work without kubectl access to the cluster. The user or service running it only needs the Ansible vault password, so that Ansible can decrypt the k8s_auth_api_key value.
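
For example, the encrypted value can live in a group_vars file as a vault-encrypted variable. This is a minimal sketch, assuming a hypothetical file location; the ciphertext (produced with ansible-vault encrypt_string) is elided:

# file: group_vars/k8s_clusters.yaml (hypothetical location)
k8s_auth_api_key: !vault |
  $ANSIBLE_VAULT;1.1;AES256
  ...ciphertext produced by ansible-vault encrypt_string...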

Configuration

Review all of the variables in defaults/main.yml to see which configuration options are available.

Celery

# Required to enable:
k8s_worker_enabled: true
k8s_worker_celery_app: "<app.celery.name>"
k8s_worker_beat_enabled: true  # only if beat is needed

# Optional variables (with defaults):
k8s_worker_replicas: 2
k8s_worker_image: "{{ k8s_container_image }}"
k8s_worker_image_pull_policy: "{{ k8s_container_image_pull_policy }}"
k8s_worker_image_tag: "{{ k8s_container_image_tag }}"
k8s_worker_resources: "{{ k8s_container_resources }}"
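
As a hedged illustration (the project name myproject is hypothetical; use whatever name your project passes to celery -A), enabling a worker might look like:

k8s_worker_enabled: true
k8s_worker_celery_app: "myproject"  # hypothetical value; match your project's Celery app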

RabbitMQ Support

Due to the number of related dependencies, RabbitMQ is not directly supported by this role, and using RabbitMQ is not recommended unless your application requires it. Version 1.4.0 of this role did briefly support RabbitMQ; if you need to maintain an existing cluster, this section may help.

It's possible to create a cluster in the project namespace using the RabbitMQ Cluster Operator for Kubernetes. You can install it in your cluster by setting the k8s_rabbitmq_operator_version variable to the latest release (e.g., v1.9.0) and including a playbook like this alongside your other deployment scripts:

---
# file: rabbitmq-operator.yaml

- hosts: k8s
  vars:
    ansible_python_interpreter: "{{ ansible_playbook_python }}"
  tasks:
  - name: Download cluster-operator manifest
    ansible.builtin.get_url:
      url: "https://github.com/rabbitmq/cluster-operator/releases/download/{{ k8s_rabbitmq_operator_version }}/cluster-operator.yml"
      dest: /tmp/rabbitmq-cluster-operator.yml
      mode: '0644'

  - name: Apply cluster-operator manifest to the cluster
    community.kubernetes.k8s:
      state: present
      src: /tmp/rabbitmq-cluster-operator.yml

Once the operator is installed and running, you can create and customize a RabbitMQ cluster by setting some variables:

# file: group_vars/k8s.yaml
#
# NOTE: Using RabbitMQ relies on the RabbitMQ Cluster Kubernetes Operator.
# See rabbitmq-operator.yaml in this repo. The Operator also controls the
# version of RabbitMQ that is installed (support for customizing spec.image
# could be considered for the future, if needed).
k8s_rabbitmq_enabled: true
# Using odd numbers is "highly recommended," and reducing this number ("cluster
# scale down") is not supported.
# See: https://www.rabbitmq.com/kubernetes/operator/using-operator.html#update
k8s_rabbitmq_replicas: 3
k8s_rabbitmq_cluster_name: rabbitmq
# Important: Updating the volume size after cluster creation does not appear
# to be supported by the Operator (as of v1.9.0 at least). You'll need to
# delete and recreate the cluster (by setting k8s_rabbitmq_enabled to false
# temporarily) to effect a change in the volume size.
k8s_rabbitmq_volume_size: "20Gi"
k8s_rabbitmq_service_type: ClusterIP
# If service_type is LoadBalancer, you can optionally assign a fixed IP for your
# load balancer (if supported by the provider):
# k8s_rabbitmq_load_balancer_ip: (w.x.y.z)

Creating a template:

# file: templates/rabbitmq.yaml.j2

apiVersion: rabbitmq.com/v1beta1
kind: RabbitmqCluster
metadata:
  name: "{{ k8s_rabbitmq_cluster_name }}"
  namespace: "{{ k8s_namespace }}"
spec:
  # Adapted from:
  # https://github.com/rabbitmq/cluster-operator/blob/main/docs/examples/production-ready/rabbitmq.yaml
  replicas: {{ k8s_rabbitmq_replicas }}
  rabbitmq:
    additionalConfig: |
      cluster_partition_handling = pause_minority
      vm_memory_high_watermark_paging_ratio = 0.99
      disk_free_limit.relative = 1.0
      collect_statistics_interval = 10000
  persistence:
{% if k8s_storage_class_name is defined %}
    storageClassName: "{{ k8s_storage_class_name }}"
{% endif %}
    storage: "{{ k8s_rabbitmq_volume_size }}"
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
            - key: app.kubernetes.io/name
              operator: In
              values:
              - "{{ k8s_rabbitmq_cluster_name }}"
        topologyKey: kubernetes.io/hostname
  override:
    service:
      spec:
        type: "{{ k8s_rabbitmq_service_type }}"
{% if k8s_rabbitmq_load_balancer_ip is defined %}
        loadBalancerIP: "{{ k8s_rabbitmq_load_balancer_ip }}"
{% endif %}

And creating a playbook to deploy the cluster itself:

- name: RabbitMQ
  hosts: k8s
  tags: rabbitmq
  tasks:
  - name: Deploy RabbitMQ cluster
    kubernetes.core.k8s:
      context: "{{ k8s_context|mandatory }}"
      kubeconfig: "{{ k8s_kubeconfig }}"
      definition: "{{ lookup('template', item['name']) }}"
      state: "{{ item['state'] }}"
      # Ensure we see any failures in CI
      wait: yes
      validate:
        fail_on_error: "yes"
        strict: "yes"
    with_items:
      - name: rabbitmq.yaml.j2
        state: present
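
Applications connect using credentials generated by the operator. As a hedged aside based on the Cluster Operator's documented defaults (not something this role manages), the operator typically creates a Secret named <cluster-name>-default-user, which a container could consume roughly like this:

# Hypothetical excerpt from a Deployment pod spec; the Secret name follows
# the Cluster Operator's default naming convention.
env:
- name: RABBITMQ_USERNAME
  valueFrom:
    secretKeyRef:
      name: "{{ k8s_rabbitmq_cluster_name }}-default-user"
      key: username
- name: RABBITMQ_PASSWORD
  valueFrom:
    secretKeyRef:
      name: "{{ k8s_rabbitmq_cluster_name }}-default-user"
      key: password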

Amazon S3: IAM role for service accounts

Web applications running on AWS typically use Amazon S3 for static and media resources. caktus.django-k8s optionally supports enabling a Kubernetes service account and an associated IAM role that defines access to the public and private S3 buckets. This provides functionality similar to EC2 instance profiles, scoped to a Kubernetes namespace. This AWS blog post also provides a good overview.

At a high level, the process is:

  1. Create public and private S3 buckets
  2. Enable IAM roles for cluster service accounts
    • Requirement: eksctl must be installed
  3. Create an IAM role with a trust relationship and S3 policy for a service account
  4. Annotate the service account with the ARN of the IAM role (see the sketch after this list)
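
The annotated service account ends up looking roughly like this sketch (the role creates and annotates it for you; the account name shown is illustrative):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: myproject-s3  # illustrative name
  namespace: "{{ k8s_namespace }}"
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::<ACCOUNT_NUMBER>:role/<ROLE_NAME>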

Required variables:

  • k8s_s3_cluster_name: name of EKS cluster in AWS

A separate playbook can be used to invoke this functionality:

---
# file: deploy-s3.yaml

- hosts: k8s
  vars:
    ansible_connection: local
    ansible_python_interpreter: "{{ ansible_playbook_python }}"
  tasks:
    - name: configure Amazon S3 buckets
      import_role:
        name: caktus.django-k8s
        tasks_from: aws_s3

Run with: ansible-playbook deploy-s3.yaml.

Amazon IAM: Adding a limited AWS IAM user for CI deploys

To deploy to AWS from a CI system, you need to authenticate as an IAM user that has permission to push to AWS ECR (the Docker registry) and, possibly, to read a secret from AWS Secrets Manager (the .vault_pass value). This role can create that user for you with the proper permissions. Configure it with the following variables (defaults shown):

k8s_ci_username: myproject-ci-user
k8s_ci_repository_arn: "" # format: arn:aws:ecr:<REGION>:<ACCOUNT_NUMBER>:repository/<REPO_NAME>
k8s_ci_vault_password_arn: "" # format: arn:aws:secretsmanager:<REGION>:<ACCOUNT_NUMBER>:secret:<NAME_OF_SECRET>

Only k8s_ci_repository_arn is required; the REPO_NAME portion is the name of your repository in the AWS ECR console. The k8s_ci_vault_password_arn is an optional pointer to a single secret in AWS Secrets Manager; its ARN is shown when you open the secret in the Secrets Manager console. On some projects, we store the Ansible vault password in Secrets Manager and then use an AWS CLI command to read it so other secrets in the repo can be decrypted. This permission allows the CI user to run that command.

You'll need to create a separate playbook to invoke this functionality: once the user is created, there is no need to recreate it on every deploy, and the CI user does not have permission to create itself, so this playbook should not run as part of CI deploys. Create a playbook that looks like this:

---
# file: deploy-ci.yaml

- hosts: k8s
  vars:
    ansible_connection: local
    ansible_python_interpreter: "{{ ansible_playbook_python }}"
  tasks:
    - name: configure CI IAM user
      import_role:
        name: caktus.django-k8s
        tasks_from: aws_ci

Normally we would just run this with ansible-playbook deploy-ci.yaml, but unfortunately the Ansible IAM role still uses boto (instead of boto3), and boto is not compatible with the AWS profiles or AssumeRole setup we usually use to get access to AWS subaccounts.

If you're using kubesae, make sure c.config["aws"]["profile_name"] is configured in your tasks.py, and the temporary credential generation described below will happen automatically.

Otherwise, you'll have to run the following Python script, which takes your AWS profile (saguaro-cluster in this example) and converts it into credentials that boto can use:

import boto3

# Create a session from the named AWS profile (swap in your own profile name)
# and freeze its temporary credentials so they can be exported.
session = boto3.Session(profile_name="saguaro-cluster")
credentials = session.get_credentials().get_frozen_credentials()

# Print shell export statements for the credentials; both token variable
# names are set because different tools read different ones.
print(f'export AWS_ACCESS_KEY_ID="{credentials.access_key}"')
print(f'export AWS_SECRET_ACCESS_KEY="{credentials.secret_key}"')
print(f'export AWS_SECURITY_TOKEN="{credentials.token}"')
print(f'export AWS_SESSION_TOKEN="{credentials.token}"')

The script prints export statements to your console. Copy and paste them into your shell, then run ansible-playbook deploy-ci.yaml and it should work.

After you run this role, the IAM user will be created with the proper permissions. You'll then need to use the AWS console to create an access key and secret key for that user. Take note of the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY values.

Copy those two variables (and AWS_DEFAULT_REGION) into your CI service's environment variable settings.

NOTE: Make sure that k8s_rollout_after_deploy is disabled (which is the default), because the rollout commands use your local kubectl, which likely has more permissions than the IAM service account this role depends on. See #25.
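
For reference, that corresponds to leaving the variable at its default setting (the exact literal below is an assumption; verify it in defaults/main.yml):

k8s_rollout_after_deploy: false  # assumed default; check defaults/main.yml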