
Nvidia gpu power in k8s! #2913

Closed

Conversation

AtzeDeVries
Contributor

So here is the follow-up to PR #2478. This setup uses container_engine_accelerator. Currently only Ubuntu Xenial (16.04) is supported.

To enable GPU capability, set the following vars:
all.yml

docker_storage_options: -s overlay2

k8s-cluster.yml

## Container Engine Acceleration
## Enable the container acceleration feature, for example to use GPU acceleration in containers
container_engine_acceleration_enabled: true
## Nvidia GPU driver install. The install will be done by an (init) pod running as a daemonset.
## Array with nvida_gpu_nodes; leave empty or comment out if you don't want to install drivers.
## Nodes won't get gpu labels if they are not in the array.
## Important: this should be set in all.yml 'docker_storage_options: -s overlay2'
nvida_gpu_nodes:
  - kube-gpu-001
## flavor can be tesla or gtx
nvida_gpu_flavor: gtx

This will label the nodes correctly (and also set up a taint so that only GPU jobs are scheduled on GPU nodes). A container installs the driver.
The container provided by https://github.com/GoogleCloudPlatform/container-engine-accelerators is a bit simple and does not survive reboots. I've updated it and created a PR (GoogleCloudPlatform/container-engine-accelerators#70).
For now the container to install the drivers can be found here: atzedevries/nvidia-ubuntu-driver-installer:10
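
For context, a minimal sketch of a pod that would be scheduled onto such a GPU node could look like the following. The nvidia.com/gpu resource name matches the device plugin discussed below; the taint key and effect in the toleration are assumptions here, so match them to whatever taint kubespray actually sets:

apiVersion: v1
kind: Pod
metadata:
  name: gpu-test
spec:
  containers:
  - name: cuda
    image: nvidia/cuda:9.2-base        # any CUDA-enabled image
    command: ["sleep", "3600"]
    resources:
      limits:
        nvidia.com/gpu: 1              # GPU resource exposed by the nvidia device plugin
  tolerations:
  - key: "nvidia.com/gpu"              # assumed taint key; adjust to the taint kubespray applies
    operator: "Exists"
    effect: "NoSchedule"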

I'm planning to add support for CentOS next.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jun 21, 2018
@AtzeDeVries
Contributor Author

AtzeDeVries commented Jun 22, 2018

There is currently an issue with scheduling (the device plugin service is not detecting GPUs) and I'm working on it. It happens after a reboot, so the initial install works fine.

@ant31
Contributor

ant31 commented Jun 22, 2018

cc @squat

@AtzeDeVries
Contributor Author

Google updated their install container so it survives reboots; I've set this as the default install container. Also fixed a typo, but apparently GTX cards install correctly under TESLA drivers.

@squat left a comment

Overall this looks really good, @AtzeDeVries! I have a few questions about some of the implementation. Please take a look.

hostPath:
path: /lib/modules
initContainers:
- image: atzedevries/xenial-nouveau-unloader:1

This DaemonSet manifest differs from the Ubuntu DS made public by the GCP accelerators team [0]. I think we should stay as close to that reference as possible, or even adopt it wholesale, and add in any mandatory templating. This way kubespray does not have to be in the business of maintaining NVIDIA GPU features. Any new functionality should really be contributed upstream or maintained independently of kubespray, which should just import those components.

[0] https://github.com/GoogleCloudPlatform/container-engine-accelerators/blob/master/nvidia-driver-installer/ubuntu/daemonset.yaml


Should kubespray use this image? Mirror it? Or not use it at all, since Google does not do this?

Contributor Author

There are 3 docker images in play: a pause image, an installer image, and a nouveau unloader. The nouveau unloader is not in the GCP version since nouveau is not available in GCP (and they are not going to add it). GCP systems are not the same as most of 'our' installs.

By default on Ubuntu/CentOS installs, nouveau is loaded, and unloading it before the install is required. We can of course leave this to the end user. By adding this simple extra init container (with image xenial-nouveau-unloader) we can control this step and make sure the nvidia driver is correctly loaded. The nvidia driver can only be loaded by the gcr.io/google-containers/ubuntu-nvidia-driver-installer container.

My advice would be to keep it like this, since it adds more control for a successful install and the difference is only an extra initContainer, so keeping up with the GCP source should be simple.

Another difference is:

- matchExpressions:
  - key: cloud.google.com/gke-accelerator
    operator: Exists

This is very GCP-specific, so I've changed it to

- matchExpressions:
  - key: "nvidia.com/gpu"
    operator: Exists

to make it more uniform. But this is just a style choice.

I'm not fully aware of how kubespray hosts self-developed images, but for me it makes sense if kubespray hosts this image. I can deliver the Dockerfile, or we can rewrite it into a default xenial image with a command and some args which does the trick.
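
For the record, a rough sketch of that "default xenial image with a command and some args" alternative could look like the snippet below. This is only an illustration under assumptions (volume name, privileged mode, installing kmod at runtime), not the actual installer manifest:

initContainers:
- name: nouveau-unloader
  image: ubuntu:16.04                  # stock Xenial instead of a custom image
  securityContext:
    privileged: true                   # required to remove a kernel module on the host
  command: ["/bin/sh", "-c"]
  args:
    - >-
      apt-get update -qq && apt-get install -y -qq kmod &&
      if lsmod | grep -q '^nouveau'; then rmmod nouveau; fi
  volumeMounts:
  - name: lib-modules                  # assumed volume name for the /lib/modules hostPath above
    mountPath: /lib/modules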

mountPath: /lib/modules
- name: dev
mountPath: /dev
# - image: gcr.io/google-containers/ubuntu-nvidia-driver-installer@sha256:7ffaf40fcf6bcc5bc87501b6be295a47ce74e1f7aac914a9f3e6c6fb8dd780a4

This line should not be here

Contributor Author

will be in next commit

@@ -0,0 +1,17 @@
---
container_engine_accelerator_enabled: false

I think this variable should be renamed to something like “nvidia_accelerator_enabled”. NVIDIA GPUs are not the only type of accelerator (they’re not even the only kinds of GPUs).

Contributor Author

I agree, and this will also be in the next commit.

nvidia_gpu_flavor: tesla
nvidia_url_end: "{{nvidia_driver_version}}/NVIDIA-Linux-x86_64-{{nvidia_driver_version}}.run"
## this should end up in var/ubuntu.yml
#nvidia_driver_install_container: atzedevries/nvidia-ubuntu-driver-installer:21-1

This should probably be removed as well

Contributor Author

will be in next commit.

- xenial
## Download URL of Nvidia GPU drivers.
## Use this for Tesla based cards
# nvidia_driver_download_url_default: https://us.download.nvidia.com/tesla/{{nvidia_driver_version}}/NVIDIA-Linux-x86_64-{{nvidia_driver_version}}.run

The default is not something that the user should be setting. I think the default should be maintained by kubespray and the user can override the nvidia_driver_download_url. Or is this leftover and should be deleted?

Contributor Author

Leftover and should be deleted. The download URL is now controlled by the variable nvidia_gpu_flavor.

namespace: kube-system
labels:
k8s-app: nvidia-gpu-device-plugin
addonmanager.kubernetes.io/mode: Reconcile

Does kubespray deploy the addon manager? If not then we should probably eliminate this

Contributor Author

I'll have to check this.

Contributor Author

So I don't think the addon manager is deployed by kubespray, but checking the usage of addonmanager.kubernetes.io/mode in the kubespray project, it is used in pretty much every addon. It also comes by default with the k8s-device-plugin-nvidia daemonset from GCP.

 grep -nr addonmanager * 
roles/dnsmasq/templates/dnsmasq-autoscaler.yml.j2:24:    addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/container_engine_accelerator/nvidia_gpu/templates/k8s-device-plugin-nvidia-daemonset.yml.j2:8:    addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/registry/templates/registry-pvc.yml.j2:9:    addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/registry/templates/registry-rs.yml.j2:11:    addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/registry/templates/registry-svc.yml.j2:10:    addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/ansible/templates/kubedns-autoscaler.yml.j2:24:    addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/ansible/templates/coredns-deployment.yml.j2:10:    addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/ansible/templates/coredns-clusterrole.yml.j2:7:    addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/ansible/templates/kubedns-deploy.yml.j2:10:    addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/ansible/templates/coredns-config.yml.j2:8:    addonmanager.kubernetes.io/mode: EnsureExists
roles/kubernetes-apps/ansible/templates/coredns-sa.yml.j2:9:    addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/ansible/templates/coredns-svc.yml.j2:10:    addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/ansible/templates/coredns-clusterrolebinding.yml.j2:9:    addonmanager.kubernetes.io/mode: EnsureExists
roles/kubernetes-apps/ansible/templates/kubedns-svc.yml.j2:10:    addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/efk/fluentd/templates/fluentd-config.yml.j2:9:    addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/efk/fluentd/templates/fluentd-ds.yml.j2:12:    addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/efk/elasticsearch/templates/efk-sa.yml:9:    addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/efk/elasticsearch/templates/efk-clusterrolebinding.yml:9:    addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/efk/elasticsearch/templates/elasticsearch-deployment.yml.j2:12:    addonmanager.kubernetes.io/mode: Reconcile


## Container Engine Acceleration
## Enable container accelartion feature, for example use gpu acceleration in containers
container_engine_acceleration_enabled: false

The default is already false so I think the sample should be

## Uncomment to enable GPU acceleration:
# container_engine_acceleration_enabled: true

Contributor Author

ok will be in next commit

## Nvidia GPU driver install. Install will by done by a (init) pod running as a daemonset.
## Array with nvida_gpu_nodes, leave empty or comment if you dont't want to install drivers.
## Nodes won't get labels gpu labels if they are not in the array.
## Important: this should be set in all.yml 'docker_storage_options: -s overlay2'

Are the docker storage options actually relevant to GPU installation?

Contributor Author

Yes, because the Ubuntu nvidia driver installer image uses overlay mounts. If the storage option is not set to overlay2, Ubuntu will use aufs and the driver installer will result in a kernel panic.

- { name: "{{ansible_distribution_release}}-nvidia-driver-install-daemonset", file: "{{ansible_distribution_release}}-nvidia-driver-install-daemonset.yml", type: daemonset }
- { name: k8s-device-plugin-nvidia-daemonset, file: k8s-device-plugin-nvidia-daemonset.yml, type: daemonset }
register: container_engine_accelerator_manifests
when: inventory_hostname == groups['kube-master'][0] and ansible_distribution_release in nvidia_driver_installer_supported_distrubion_release
Member

use syntax:

when:
  - this
  - that
  - anything else
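
For illustration, the quoted condition written in that list form would read:

when:
  - inventory_hostname == groups['kube-master'][0]
  - ansible_distribution_release in nvidia_driver_installer_supported_distrubion_release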

filename: "{{ kube_config_dir }}/addons/container_engine_accelerator/{{ item.item.file }}"
state: "latest"
with_items: "{{ container_engine_accelerator_manifests.results }}"
when: inventory_hostname == groups['kube-master'][0] and ansible_distribution_release in nvidia_driver_installer_supported_distrubion_release
Member

use syntax:

when:
  - this
  - that
  - anything else

resource: "{{ item.item.type }}"
filename: "{{ kube_config_dir }}/addons/container_engine_accelerator/{{ item.item.file }}"
state: "latest"
with_items: "{{ container_engine_accelerator_manifests.results }}"
Member

with_items:
  - "{{ something }}"

@flx42

flx42 commented Jul 27, 2018

In case you are interested, we (NVIDIA) have just released a new set of container images on DockerHub:
https://hub.docker.com/r/nvidia/driver/

These images already bundle the driver installer and have prebuilt objects for the latest kernel supported by the distribution. If you are running the same kernel, installing the driver should only be a matter of seconds.

We have documentation on our wiki:
https://github.com/NVIDIA/nvidia-docker/wiki/Driver-containers-(EXPERIMENTAL)

You can find technical details in this presentation:
https://docs.google.com/presentation/d/1NY4X2K6BMaByfnF9rMEcNq6hS3NtmOKGTfihZ44zfrw/edit?usp=sharing

@squat

squat commented Jul 27, 2018

@flx42 I am happy to see NVIDIA adopting these approaches! From reading the slides and wiki I see that this new driver container currently relies on the NVIDIA container runtime, correct? For simplicity and maintainability, I am not sure Kubespray should be installing new runtimes.

@flx42

flx42 commented Jul 27, 2018

@squat the driver container loads the kernel modules and also exposes the user-space driver libraries on the host through mount propagation. Technically you could use any method afterwards to enable GPU support in the runtime as long as you can point to this folder.
Also, when using CRI-O you can just register a hook, no need to swap the default runtime.

@squat

squat commented Jul 28, 2018

Ah, I see. That's in line with the model of other GPU installers 👍 The slides seemed to specifically indicate the NVIDIA runtime, but I see now that was just an implementation detail. That sounds like a good option for the OSs that NVIDIA is targeting with the builds.

@AtzeDeVries
Contributor Author

AtzeDeVries commented Jul 30, 2018

@squat So I want to reopen the discussion on the extra initContainer which unloads the nouveau module. While working on the CentOS version I ran into an issue unloading the nouveau module: it caused a reboot of the system. So to get this PR working I think it might be better to remove this unloading and add a README with a how-to on disabling nouveau. What do you think?


I've also got the CentOS driver install working, but I want to rename a few variables, so a new commit on this will be coming soon.

@AtzeDeVries
Contributor Author

I've been trying some things with the nvidia container-driver-installer. It seems really good and useful if NVIDIA supports installing drivers for Ubuntu/CentOS. I also like the automatic update feature with DKMS.

I don't (yet) see how it is useful at the moment, since the k8s nvidia device plugin daemonset expects a specific directory format which the nvidia container-driver-installer does not provide. The device plugin expects a bin/ with the nvidia binaries and a lib64/ with the nvidia libs, and mounts them into the pod at the location expected by, for example, nvidia/cuda:9.2 (https://gitlab.com/nvidia/cuda/blob/ubuntu16.04/9.2/base/Dockerfile).
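
As a rough illustration of that directory format (the host path and volume name below follow the GCP installer convention and are assumptions, not something defined by this PR):

# Layout the device plugin expects on the host:
#   /home/kubernetes/bin/nvidia/
#     bin/     -> driver binaries such as nvidia-smi
#     lib64/   -> user-space driver libraries (libcuda.so, libnvidia-ml.so, ...)
# It exposes this directory to GPU pods via a hostPath volume, e.g.:
volumes:
- name: nvidia-install-dir-host        # illustrative volume name
  hostPath:
    path: /home/kubernetes/bin/nvidia  # assumed install dir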

@flx42 and @squat, do you agree on this, or am I missing something?

@AtzeDeVries
Contributor Author

So I think this PR is pretty much ready.

I've removed the nouveau unload init container. I have a CentOS install container; I can deliver a Dockerfile for it so you can host it until GCP hosts it themselves.

I can also provide an Ansible playbook which disables nouveau.

@AtzeDeVries
Contributor Author

GCP won't build and test a CentOS driver installer:
GoogleCloudPlatform/container-engine-accelerators#91

@ant31
Contributor

ant31 commented Aug 22, 2018

Can we merge this?
Could you add a CI test on GCE?

@ant31
Contributor

ant31 commented Aug 22, 2018

/assign ant31

@Atoms
Member

Atoms commented Aug 24, 2018

please rebase

@@ -246,3 +246,16 @@ persistent_volumes_enabled: false
## See https://github.com/kubernetes-incubator/kubespray/issues/2141
## Set this variable to true to get rid of this issue
volume_cross_zone_attachment: false

## Container Engine Acceleration
## Enable container accelartion feature, for example use gpu acceleration in containers
Contributor

typo on the first acceleration

# nvidia_accelerator_enabled: true
## Nvidia GPU driver install. Install will by done by a (init) pod running as a daemonset.
## Array with nvida_gpu_nodes, leave empty or comment if you dont't want to install drivers.
## Nodes won't get labels gpu labels if they are not in the array.
Contributor

This line is a bit confusing

## Enable container accelartion feature, for example use gpu acceleration in containers
# nvidia_accelerator_enabled: true
## Nvidia GPU driver install. Install will by done by a (init) pod running as a daemonset.
## Array with nvida_gpu_nodes, leave empty or comment if you dont't want to install drivers.
Contributor

Can I have a label that I define in my inventory file that would install the gpu stuff?

Contributor Author

Yes, you can and the driver will install, but there is also a taint set on the nodes in nvidia_gpu_nodes which prevents scheduling of non-GPU pods on GPU nodes. That is why this array exists.

If you add the label nvidia.com/gpu=true via the inventory and have nvidia_accelerator_enabled set, the driver will be installed, but the taint won't be set.

inventory/sample/group_vars/k8s-cluster.yml (outdated review thread, resolved)
@@ -0,0 +1,9 @@
---
nvidia_accelerator_enabled: false
nvidia_driver_version: "384.111"
Contributor

390.87 was just released, not sure if that would work.

Contributor Author

let me test this. I think it should work

Contributor Author

works

@reverson
Contributor

reverson commented Sep 7, 2018

Is this able to be merged?

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To fully approve this pull request, please assign additional approvers.
We suggest the following additional approver: ant31

If they are not already assigned, you can assign the PR to them by writing /assign @ant31 in a comment when ready.

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@AtzeDeVries
Contributor Author

AtzeDeVries commented Sep 10, 2018

so vacation is over, time to get this fixed ;)

@ant31 On CI testing on GCE: I needed to look into that, since I had no experience with it. So I checked it out.
For that we need a GCE instance with an accelerator (check https://cloud.google.com/compute/pricing, search for GPU). Price-wise the P4 is the most interesting.

To be able to create a GCE instance with a P4 card we need to use gce_instance_template, and we need a future version of Ansible to be able to use accelerators (check ansible/ansible#22204). I'm not sure it is a good idea to push this now; maybe create an issue for it, to be picked up later when Ansible is ready.

@ant31
Contributor

ant31 commented Sep 12, 2018

ci check this

@ant31
Contributor

ant31 commented Sep 12, 2018

/lgtm

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Sep 12, 2018
@k8s-ci-robot
Contributor

New changes are detected. LGTM label has been removed.

@ant31
Contributor

ant31 commented Sep 12, 2018

@ant31
Copy link
Contributor

ant31 commented Sep 12, 2018

please fix until tests are green, and then squash the 'commits'

@ant31
Contributor

ant31 commented Sep 12, 2018

lgtm,
could you squash/fixup everything into one or two commits?

I'll create the CI in a following PR; I've requested GPUs on kubespray's GCE account and @squat will provide a test pod.

@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Sep 12, 2018
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Sep 12, 2018
@ant31
Contributor

ant31 commented Sep 12, 2018

@AtzeDeVries can you reach me on the kubernetes slack @ant31

@AtzeDeVries
Contributor Author

So this is now merged in #3304, so I'll close this.

@riverzhang mentioned this pull request Sep 21, 2018
@jayunit100

Has anyone tested this? Would love to try it out.
