
Nvidia gpu power in k8s! #2913

Closed

Conversation

AtzeDeVries
Contributor

So here is the follow-up to PR #2478. This setup uses container_engine_accelerator. Currently only Ubuntu Xenial (16.04) is supported.

To enable GPU capability, set the following vars:
all.yml

docker_storage_options: -s overlay2

k8s-cluster.yml

## Container Engine Acceleration
## Enable the container acceleration feature, for example to use GPU acceleration in containers
container_engine_acceleration_enabled: true
## Nvidia GPU driver install. The install will be done by an (init) pod running as a daemonset.
## Array with nvida_gpu_nodes; leave empty or comment out if you don't want to install drivers.
## Nodes won't get gpu labels if they are not in the array.
## Important: this should be set in all.yml 'docker_storage_options: -s overlay2'
nvida_gpu_nodes:
  - kube-gpu-001
## flavor can be tesla or gtx
nvida_gpu_flavor: gtx

This will label the nodes correctly (and also set up a taint so that only GPU jobs are scheduled on GPU nodes). A container installs the driver.
The container provided by https://github.com/GoogleCloudPlatform/container-engine-accelerators is a bit simple and does not survive reboots. I've updated it and created a PR (GoogleCloudPlatform/container-engine-accelerators#70).
For now the container to install the drivers can be found here: atzedevries/nvidia-ubuntu-driver-installer:10
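
For context, a minimal sketch of a pod that would be scheduled onto such a GPU node could look like the following. The nvidia.com/gpu resource name matches the device plugin discussed below; the taint key and effect in the toleration are assumptions here, so match them to whatever taint kubespray actually sets:

apiVersion: v1
kind: Pod
metadata:
  name: gpu-test
spec:
  containers:
  - name: cuda
    image: nvidia/cuda:9.2-base        # any CUDA-enabled image
    command: ["sleep", "3600"]
    resources:
      limits:
        nvidia.com/gpu: 1              # GPU resource exposed by the nvidia device plugin
  tolerations:
  - key: "nvidia.com/gpu"              # assumed taint key; adjust to the taint kubespray applies
    operator: "Exists"
    effect: "NoSchedule"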

I'm planning to add support for CentOS next.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jun 21, 2018
@AtzeDeVries
Contributor Author

AtzeDeVries commented Jun 22, 2018

There is currently an issue with scheduling (the device plugin service is not detecting GPUs) and I'm working on it. It happens after a reboot, so the initial install works fine.

@ant31
Contributor

ant31 commented Jun 22, 2018

cc @squat

@AtzeDeVries
Contributor Author

Google updated their install container so it survives reboots; I've set this as the default install container. Also fixed a typo, but apparently GTX cards install correctly under TESLA drivers.

@squat left a comment

Overall this looks really good, @AtzeDeVries! I have a few questions about some of the implementation. Please take a look.

hostPath:
path: /lib/modules
initContainers:
- image: atzedevries/xenial-nouveau-unloader:1

This DaemonSet manifest differs from the Ubuntu DS made public by the GCP accelerators team [0]. I think we should stay as close to that reference as possible, or even adopt it wholesale, and add in any mandatory templating. This way kubespray does not have to be in the business of maintaining NVIDIA GPU features. Any new functionality should really be contributed upstream or maintained independently of kubespray, which should just import those components.

[0] https://github.com/GoogleCloudPlatform/container-engine-accelerators/blob/master/nvidia-driver-installer/ubuntu/daemonset.yaml


Should kubespray use this image? Mirror it? Or not use it at all, since Google does not do this?

Contributor Author

There are 3 docker images in play: a pause image, an installer image, and a nouveau unloader. The nouveau unloader is not in the GCP version since nouveau is not available in GCP (and they are not going to add it). GCP systems are not the same as most of 'our' installs.

By default on Ubuntu/CentOS installs, nouveau is loaded, and unloading it before the install is required. We can of course leave this to the end user. By adding this simple extra init container (with image xenial-nouveau-unloader) we can control this step and make sure the nvidia driver is correctly loaded. The nvidia driver can only be loaded by the gcr.io/google-containers/ubuntu-nvidia-driver-installer container.

My advice would be to keep it like this, since it adds more control for a successful install and the difference is only an extra initContainer, so keeping up with the GCP source should be simple.

Another difference is:

- matchExpressions:
  - key: cloud.google.com/gke-accelerator
    operator: Exists

This is very GCP-specific, so I've changed it to

- matchExpressions:
  - key: "nvidia.com/gpu"
    operator: Exists

to make it more uniform. But this is just a style choice.

I'm not fully aware of how kubespray hosts self-developed images, but for me it makes sense if kubespray hosts this image. I can deliver the Dockerfile, or we can rewrite it into a default xenial image with a command and some args which does the trick.
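
For the record, a rough sketch of that "default xenial image with a command and some args" alternative could look like the snippet below. This is only an illustration under assumptions (volume name, privileged mode, installing kmod at runtime), not the actual installer manifest:

initContainers:
- name: nouveau-unloader
  image: ubuntu:16.04                  # stock Xenial instead of a custom image
  securityContext:
    privileged: true                   # required to remove a kernel module on the host
  command: ["/bin/sh", "-c"]
  args:
    - >-
      apt-get update -qq && apt-get install -y -qq kmod &&
      if lsmod | grep -q '^nouveau'; then rmmod nouveau; fi
  volumeMounts:
  - name: lib-modules                  # assumed volume name for the /lib/modules hostPath above
    mountPath: /lib/modules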

mountPath: /lib/modules
- name: dev
mountPath: /dev
# - image: gcr.io/google-containers/ubuntu-nvidia-driver-installer@sha256:7ffaf40fcf6bcc5bc87501b6be295a47ce74e1f7aac914a9f3e6c6fb8dd780a4

This line should not be here

Contributor Author

will be in next commit

@@ -0,0 +1,17 @@
---
container_engine_accelerator_enabled: false

I think this variable should be renamed to something like “nvidia_accelerator_enabled”. NVIDIA GPUs are not the only type of accelerator (they’re not even the only kinds of GPUs).

Contributor Author

I agree, and this will also be in the next commit.

nvidia_gpu_flavor: tesla
nvidia_url_end: "{{nvidia_driver_version}}/NVIDIA-Linux-x86_64-{{nvidia_driver_version}}.run"
## this should end up in var/ubuntu.yml
#nvidia_driver_install_container: atzedevries/nvidia-ubuntu-driver-installer:21-1

This should probably be removed as well

Contributor Author

will be in next commit.

- xenial
## Download URL of Nvidia GPU drivers.
## Use this for Tesla based cards
# nvidia_driver_download_url_default: https://us.download.nvidia.com/tesla/{{nvidia_driver_version}}/NVIDIA-Linux-x86_64-{{nvidia_driver_version}}.run

The default is not something that the user should be setting. I think the default should be maintained by kubespray and the user can override the nvidia_driver_download_url. Or is this leftover and should be deleted?

Contributor Author

Leftover and should be deleted. The download URL is now controlled by the variable nvidia_gpu_flavor.

namespace: kube-system
labels:
k8s-app: nvidia-gpu-device-plugin
addonmanager.kubernetes.io/mode: Reconcile

Does kubespray deploy the addon manager? If not then we should probably eliminate this

Contributor Author

I'll have to check this.

Contributor Author

So I don't think the addon manager is deployed by kubespray, but checking the usage of addonmanager.kubernetes.io/mode in the kubespray project, it is used in pretty much every addon. It also comes by default with the k8s-device-plugin-nvidia daemonset from GCP.

 grep -nr addonmanager * 
roles/dnsmasq/templates/dnsmasq-autoscaler.yml.j2:24:    addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/container_engine_accelerator/nvidia_gpu/templates/k8s-device-plugin-nvidia-daemonset.yml.j2:8:    addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/registry/templates/registry-pvc.yml.j2:9:    addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/registry/templates/registry-rs.yml.j2:11:    addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/registry/templates/registry-svc.yml.j2:10:    addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/ansible/templates/kubedns-autoscaler.yml.j2:24:    addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/ansible/templates/coredns-deployment.yml.j2:10:    addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/ansible/templates/coredns-clusterrole.yml.j2:7:    addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/ansible/templates/kubedns-deploy.yml.j2:10:    addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/ansible/templates/coredns-config.yml.j2:8:    addonmanager.kubernetes.io/mode: EnsureExists
roles/kubernetes-apps/ansible/templates/coredns-sa.yml.j2:9:    addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/ansible/templates/coredns-svc.yml.j2:10:    addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/ansible/templates/coredns-clusterrolebinding.yml.j2:9:    addonmanager.kubernetes.io/mode: EnsureExists
roles/kubernetes-apps/ansible/templates/kubedns-svc.yml.j2:10:    addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/efk/fluentd/templates/fluentd-config.yml.j2:9:    addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/efk/fluentd/templates/fluentd-ds.yml.j2:12:    addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/efk/elasticsearch/templates/efk-sa.yml:9:    addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/efk/elasticsearch/templates/efk-clusterrolebinding.yml:9:    addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/efk/elasticsearch/templates/elasticsearch-deployment.yml.j2:12:    addonmanager.kubernetes.io/mode: Reconcile


## Container Engine Acceleration
## Enable container accelartion feature, for example use gpu acceleration in containers
container_engine_acceleration_enabled: false

The default is already false so I think the sample should be

## Uncomment to enable GPU acceleration:
# container_engine_acceleration_enabled: true

Contributor Author

ok will be in next commit

## Nvidia GPU driver install. Install will by done by a (init) pod running as a daemonset.
## Array with nvida_gpu_nodes, leave empty or comment if you dont't want to install drivers.
## Nodes won't get labels gpu labels if they are not in the array.
## Important: this should be set in all.yml 'docker_storage_options: -s overlay2'

Are the docker storage options actually relevant to GPU installation?

Contributor Author

Yes, because the Ubuntu nvidia driver installer image uses overlay mounts. If the storage option is not set to overlay2, Ubuntu will use aufs and the driver installer will result in a kernel panic.

- { name: "{{ansible_distribution_release}}-nvidia-driver-install-daemonset", file: "{{ansible_distribution_release}}-nvidia-driver-install-daemonset.yml", type: daemonset }
- { name: k8s-device-plugin-nvidia-daemonset, file: k8s-device-plugin-nvidia-daemonset.yml, type: daemonset }
register: container_engine_accelerator_manifests
when: inventory_hostname == groups['kube-master'][0] and ansible_distribution_release in nvidia_driver_installer_supported_distrubion_release
Member

use syntax:

when:
  - this
  - that
  - anything else
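
For illustration, the quoted condition written in that list form would read:

when:
  - inventory_hostname == groups['kube-master'][0]
  - ansible_distribution_release in nvidia_driver_installer_supported_distrubion_release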

filename: "{{ kube_config_dir }}/addons/container_engine_accelerator/{{ item.item.file }}"
state: "latest"
with_items: "{{ container_engine_accelerator_manifests.results }}"
when: inventory_hostname == groups['kube-master'][0] and ansible_distribution_release in nvidia_driver_installer_supported_distrubion_release
Member

use syntax:

when:
  - this
  - that
  - anything else

resource: "{{ item.item.type }}"
filename: "{{ kube_config_dir }}/addons/container_engine_accelerator/{{ item.item.file }}"
state: "latest"
with_items: "{{ container_engine_accelerator_manifests.results }}"
Member

with_items:
  - "{{ something }}"

@flx42

flx42 commented Jul 27, 2018

In case you are interested, we (NVIDIA) have just released a new set of container images on DockerHub:
https://hub.docker.com/r/nvidia/driver/

These images already bundle the driver installer and have prebuilt objects for the latest kernel supported by the distribution. If you are running the same kernel, installing the driver should only be a matter of seconds.

We have documentation on our wiki:
https://github.com/NVIDIA/nvidia-docker/wiki/Driver-containers-(EXPERIMENTAL)

You can find technical details in this presentation:
https://docs.google.com/presentation/d/1NY4X2K6BMaByfnF9rMEcNq6hS3NtmOKGTfihZ44zfrw/edit?usp=sharing

@squat

squat commented Jul 27, 2018

@flx42 I am happy to see NVIDIA adopting these approaches! From reading the slides and wiki I see that this new driver container currently relies on the NVIDIA container runtime, correct? For simplicity and maintainability, I am not sure Kubespray should be installing new runtimes.

@flx42

flx42 commented Jul 27, 2018

@squat the driver container loads the kernel modules and also exposes the user-space driver libraries on the host through mount propagation. Technically you could use any method afterwards to enable GPU support in the runtime as long as you can point to this folder.
Also, when using CRI-O you can just register a hook, no need to swap the default runtime.

@squat

squat commented Jul 28, 2018

Ah, I see. That's in line with the model of other GPU installers 👍 The slides seemed to specifically indicate the NVIDIA runtime, but I see now that was just an implementation detail. That sounds like a good option for the OSs that NVIDIA is targeting with the builds.

@AtzeDeVries
Contributor Author

AtzeDeVries commented Jul 30, 2018

@squat So I want to reopen the discussion on the extra initContainer which unloads the nouveau module. While working on the CentOS version I ran into an issue unloading the nouveau module: it caused a reboot of the system. So to get this PR working I think it might be better to remove this unloading and add a README with a how-to on disabling nouveau. What do you think?


I've also got the CentOS driver install working, but I want to rename a few variables, so a new commit on this will be coming soon.

@AtzeDeVries
Contributor Author

I've been trying some things with the nvidia container-driver-installer. It seems really good and useful if NVIDIA supports installing drivers for Ubuntu/CentOS. I also like the automatic update feature with DKMS.

I don't (yet) see how it is useful at the moment, since the k8s nvidia device plugin daemonset expects a specific directory format which the nvidia container-driver-installer does not provide. The device plugin expects a bin/ with the nvidia binaries and a lib64/ with the nvidia libs, and mounts them into the pod at the location expected by, for example, nvidia/cuda:9.2 (https://gitlab.com/nvidia/cuda/blob/ubuntu16.04/9.2/base/Dockerfile).
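
As a rough illustration of that directory format (the host path and volume name below follow the GCP installer convention and are assumptions, not something defined by this PR):

# Layout the device plugin expects on the host:
#   /home/kubernetes/bin/nvidia/
#     bin/     -> driver binaries such as nvidia-smi
#     lib64/   -> user-space driver libraries (libcuda.so, libnvidia-ml.so, ...)
# It exposes this directory to GPU pods via a hostPath volume, e.g.:
volumes:
- name: nvidia-install-dir-host        # illustrative volume name
  hostPath:
    path: /home/kubernetes/bin/nvidia  # assumed install dir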

@flx42 and @squat, do you agree on this, or am I missing something?

@AtzeDeVries
Contributor Author

So I think this PR is pretty much ready.

I've removed the nouveau unload init container. I have a CentOS install container; I can deliver a Dockerfile for it so you can host it until GCP hosts it themselves.

I can also provide an Ansible playbook which disables nouveau.

@AtzeDeVries
Contributor Author

GCP won't build and test a CentOS driver installer:
GoogleCloudPlatform/container-engine-accelerators#91

@ant31
Contributor

ant31 commented Aug 22, 2018

Can we merge this?
Could you add a CI test on GCE?

@ant31
Contributor

ant31 commented Aug 22, 2018

/assign ant31

@Atoms
Member

Atoms commented Aug 24, 2018

please rebase

@@ -246,3 +246,16 @@ persistent_volumes_enabled: false
## See https://github.com/kubernetes-incubator/kubespray/issues/2141
## Set this variable to true to get rid of this issue
volume_cross_zone_attachment: false

## Container Engine Acceleration
## Enable container accelartion feature, for example use gpu acceleration in containers
Contributor

typo on the first acceleration

# nvidia_accelerator_enabled: true
## Nvidia GPU driver install. Install will by done by a (init) pod running as a daemonset.
## Array with nvida_gpu_nodes, leave empty or comment if you dont't want to install drivers.
## Nodes won't get labels gpu labels if they are not in the array.
Contributor

This line is a bit confusing

## Enable container accelartion feature, for example use gpu acceleration in containers
# nvidia_accelerator_enabled: true
## Nvidia GPU driver install. Install will by done by a (init) pod running as a daemonset.
## Array with nvida_gpu_nodes, leave empty or comment if you dont't want to install drivers.
Contributor

Can I have a label that I define in my inventory file that would install the gpu stuff?

Contributor Author

Yes, you can and the driver will install, but there is also a taint set on the nodes in nvidia_gpu_nodes which prevents scheduling of non-GPU pods on GPU nodes. That is why this array exists.

If you add the label nvidia.com/gpu=true via the inventory and have nvidia_accelerator_enabled set, the driver will be installed, but the taint won't be set.

inventory/sample/group_vars/k8s-cluster.yml (outdated review thread, resolved)
@@ -0,0 +1,9 @@
---
nvidia_accelerator_enabled: false
nvidia_driver_version: "384.111"
Contributor

390.87 was just released, not sure if that would work.

Contributor Author

let me test this. I think it should work

Contributor Author

works

@reverson
Contributor

reverson commented Sep 7, 2018

Is this able to be merged?

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To fully approve this pull request, please assign additional approvers.
We suggest the following additional approver: ant31

If they are not already assigned, you can assign the PR to them by writing /assign @ant31 in a comment when ready.

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@AtzeDeVries
Contributor Author

AtzeDeVries commented Sep 10, 2018

so vacation is over, time to get this fixed ;)

@ant31 On CI testing on GCE: I needed to look into that, since I had no experience with it. So I checked it out.
For that we need a GCE instance with an accelerator (check https://cloud.google.com/compute/pricing, search for GPU). Price-wise the P4 is the most interesting.

To be able to create a GCE instance with a P4 card we need to use gce_instance_template, and we need a future version of Ansible to be able to use accelerators (check ansible/ansible#22204). I'm not sure it is a good idea to push this now; maybe create an issue for it, to be picked up later when Ansible is ready.

@ant31
Contributor

ant31 commented Sep 12, 2018

ci check this

@ant31
Contributor

ant31 commented Sep 12, 2018

/lgtm

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Sep 12, 2018
@k8s-ci-robot
Contributor

New changes are detected. LGTM label has been removed.

@ant31
Contributor

ant31 commented Sep 12, 2018

@ant31
Copy link
Contributor

ant31 commented Sep 12, 2018

please fix until tests are green, and then squash the 'commits'

@ant31
Contributor

ant31 commented Sep 12, 2018

lgtm,
could you squash/fixup everything into one or two commits?

I'll create the CI in a following PR; I've requested GPUs on kubespray's GCE account and @squat will provide a test pod.

@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Sep 12, 2018
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Sep 12, 2018
@ant31
Contributor

ant31 commented Sep 12, 2018

@AtzeDeVries can you reach me on the kubernetes slack @ant31

@AtzeDeVries
Contributor Author

So this is now merged in #3304, so I'll close this.

@riverzhang mentioned this pull request Sep 21, 2018
@jayunit100

Has anyone tested this? Would love to try it out.
