Nvidia gpu power in k8s! #2913
Conversation
There is currently an issue with scheduling (the device plugin service is not detecting GPUs) and I'm working on it. This happens after a reboot, so the initial install works fine.
cc @squat
Google updated their install container so it survives reboots. I set this as the default install container. Also fixed a typo; apparently GTX cards install correctly under Tesla drivers.
Overall this looks really good, @AtzeDeVries! I have a few questions about some of the implementation. Please take a look.
hostPath:
  path: /lib/modules
initContainers:
- image: atzedevries/xenial-nouveau-unloader:1
This DaemonSet manifest differs from the Ubuntu DS made public by the GCP accelerators team [0]. I think we should be as close to that reference as possible, or even adopt it wholesale, and add in any mandatory templating. This way kubespray does not have to be in the business of maintaining NVIDIA GPU features. Any new functionality should really be contributed upstream or maintained independently of kubespray, which should just import those components.
Should kubespray use this image? Mirror it? Or not use it at all, since Google does not do this?
There are 3 Docker images in play: a pause image, an installer image and a nouveau unloader. The nouveau unloader is not in the GCP manifest since nouveau is not available in GCP (and they are not going to add it). GCP systems are not the same as most of 'our' installs.
By default on Ubuntu/CentOS installs, nouveau is loaded, and unloading it before the install is required. We can of course leave this to the end user. By adding this simple extra init container (with image xenial-nouveau-unloader) we can control this step and make sure the nvidia driver is loaded correctly. The nvidia driver can only be loaded by the gcr.io/google-containers/ubuntu-nvidia-driver-installer container.
My advice would be to keep it like this, since it adds more control for a successful install and the difference is only an extra initContainer, so keeping up with the GCP source should be simple.
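For reference, a minimal sketch of what that extra initContainer amounts to in the DaemonSet, assuming the image from the diff above and a privileged context so the module can be unloaded (the container name and volume name are illustrative):

initContainers:
- name: nouveau-unloader            # illustrative name
  image: atzedevries/xenial-nouveau-unloader:1
  securityContext:
    privileged: true                # required to unload a kernel module from a container
  volumeMounts:
  - name: modules                   # the /lib/modules hostPath volume shown above
    mountPath: /lib/modules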
Another difference is:
- matchExpressions:
- key: cloud.google.com/gke-accelerator
operator: Exists
This is very GCP-specific, so I've changed it to:
- matchExpressions:
- key: "nvidia.com/gpu"
operator: Exists
This makes it more uniform, but it is just a style choice.
I'm not fully aware of how kubespray hosts self-developed images, but to me it makes sense for kubespray to host this image. I can deliver the Dockerfile, or we can rewrite it into a default xenial image with a command and some args which does the trick.
mountPath: /lib/modules
- name: dev
  mountPath: /dev
# - image: gcr.io/google-containers/ubuntu-nvidia-driver-installer@sha256:7ffaf40fcf6bcc5bc87501b6be295a47ce74e1f7aac914a9f3e6c6fb8dd780a4
This line should not be here
Will be in the next commit.
@@ -0,0 +1,17 @@
---
container_engine_accelerator_enabled: false
I think this variable should be renamed to something like "nvidia_accelerator_enabled". NVIDIA GPUs are not the only type of accelerator (they're not even the only kind of GPU).
I agree; this will also be in the next commit.
nvidia_gpu_flavor: tesla
nvidia_url_end: "{{nvidia_driver_version}}/NVIDIA-Linux-x86_64-{{nvidia_driver_version}}.run"
## this should end up in var/ubuntu.yml
#nvidia_driver_install_container: atzedevries/nvidia-ubuntu-driver-installer:21-1
This should probably be removed as well
Will be in the next commit.
- xenial
## Download URL of Nvidia GPU drivers.
## Use this for Tesla based cards
# nvidia_driver_download_url_default: https://us.download.nvidia.com/tesla/{{nvidia_driver_version}}/NVIDIA-Linux-x86_64-{{nvidia_driver_version}}.run
The default is not something that the user should be setting. I think the default should be maintained by kubespray, and the user can override nvidia_driver_download_url. Or is this leftover and should be deleted?
Left over and should be deleted. The download URL is now controlled by the variable nvidia_gpu_flavor.
namespace: kube-system
labels:
  k8s-app: nvidia-gpu-device-plugin
  addonmanager.kubernetes.io/mode: Reconcile
Does kubespray deploy the addon manager? If not, then we should probably eliminate this.
I'll have to check this.
So I don't think the addon manager is deployed by kubespray. But checking the usage of addonmanager.kubernetes.io/mode in the kubespray project, it is used in pretty much every addon. This also comes by default with the k8s-device-plugin-nvidia-daemon from GCP.
grep -nr addonmanager *
roles/dnsmasq/templates/dnsmasq-autoscaler.yml.j2:24: addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/container_engine_accelerator/nvidia_gpu/templates/k8s-device-plugin-nvidia-daemonset.yml.j2:8: addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/registry/templates/registry-pvc.yml.j2:9: addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/registry/templates/registry-rs.yml.j2:11: addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/registry/templates/registry-svc.yml.j2:10: addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/ansible/templates/kubedns-autoscaler.yml.j2:24: addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/ansible/templates/coredns-deployment.yml.j2:10: addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/ansible/templates/coredns-clusterrole.yml.j2:7: addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/ansible/templates/kubedns-deploy.yml.j2:10: addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/ansible/templates/coredns-config.yml.j2:8: addonmanager.kubernetes.io/mode: EnsureExists
roles/kubernetes-apps/ansible/templates/coredns-sa.yml.j2:9: addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/ansible/templates/coredns-svc.yml.j2:10: addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/ansible/templates/coredns-clusterrolebinding.yml.j2:9: addonmanager.kubernetes.io/mode: EnsureExists
roles/kubernetes-apps/ansible/templates/kubedns-svc.yml.j2:10: addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/efk/fluentd/templates/fluentd-config.yml.j2:9: addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/efk/fluentd/templates/fluentd-ds.yml.j2:12: addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/efk/elasticsearch/templates/efk-sa.yml:9: addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/efk/elasticsearch/templates/efk-clusterrolebinding.yml:9: addonmanager.kubernetes.io/mode: Reconcile
roles/kubernetes-apps/efk/elasticsearch/templates/elasticsearch-deployment.yml.j2:12: addonmanager.kubernetes.io/mode: Reconcile
## Container Engine Acceleration
## Enable container accelartion feature, for example use gpu acceleration in containers
container_engine_acceleration_enabled: false
The default is already false, so I think the sample should be:
## Uncomment to enable GPU acceleration:
# container_engine_acceleration_enabled: true
OK, will be in the next commit.
## Nvidia GPU driver install. Install will by done by a (init) pod running as a daemonset.
## Array with nvida_gpu_nodes, leave empty or comment if you dont't want to install drivers.
## Nodes won't get labels gpu labels if they are not in the array.
## Important: this should be set in all.yml 'docker_storage_options: -s overlay2'
Are the docker storage options actually relevant to GPU installation?
Yes, because the ubuntu nvidia driver installer image uses overlay mounts. If the storage option is not set to overlay2, ubuntu will use aufs and the driver installer will result in a kernel panic.
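Concretely, as the sample-file comment quoted above says, this means having the following in all.yml:

docker_storage_options: -s overlay2   # force overlay2; aufs panics during the driver install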
- { name: "{{ansible_distribution_release}}-nvidia-driver-install-daemonset", file: "{{ansible_distribution_release}}-nvidia-driver-install-daemonset.yml", type: daemonset }
- { name: k8s-device-plugin-nvidia-daemonset, file: k8s-device-plugin-nvidia-daemonset.yml, type: daemonset }
register: container_engine_accelerator_manifests
when: inventory_hostname == groups['kube-master'][0] and ansible_distribution_release in nvidia_driver_installer_supported_distrubion_release
Use this syntax:
when:
- this
- that
- anything else
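Applied to the condition in this task, that would read:

when:
  - inventory_hostname == groups['kube-master'][0]
  - ansible_distribution_release in nvidia_driver_installer_supported_distrubion_release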
filename: "{{ kube_config_dir }}/addons/container_engine_accelerator/{{ item.item.file }}"
state: "latest"
with_items: "{{ container_engine_accelerator_manifests.results }}"
when: inventory_hostname == groups['kube-master'][0] and ansible_distribution_release in nvidia_driver_installer_supported_distrubion_release
Use this syntax:
when:
- this
- that
- anything else
resource: "{{ item.item.type }}"
filename: "{{ kube_config_dir }}/addons/container_engine_accelerator/{{ item.item.file }}"
state: "latest"
with_items: "{{ container_engine_accelerator_manifests.results }}"
with_items:
- "{{ something }}"
In case you are interested, we (NVIDIA) has just released a new set of container images on DockerHub: These images already bundle the driver installer and has prebuilt objects for the latest kernel supported by the distribution. If you are running the same kernel, installing the driver should only be a matter of seconds. We have documentation on our wiki: You can find technical details in this presentation: |
@flx42 I am happy to see NVIDIA adopting these approaches! From reading the slides and wiki I see that this new driver container currently relies on the NVIDIA container runtime, correct? For simplicity and maintainability, I am not sure Kubespray should be installing new runtimes. |
@squat the driver container load the kernel modules and also expose the user-space driver libraries on the host through mount propagation. Technically you could use any method afterwards to enable GPU support in the runtime as long as you can point to this folder. |
Ah I see. That’s in line with the model of other GPU installers 👍 the slides seemed to specifically indicate the NVIDIA runtime but I see now that was just implementation detail. That sounds like a good option for the OSs that NVIDIA is targeting with the builds. |
I've been trying some stuff with the NVIDIA driver container. I don't (yet) see how it is useful at the moment, since the k8s nvidia device plugin daemon requests a specific directory format which the driver container does not provide. @flx42 and @squat, do you agree with this, or am I missing something?
So I think this PR is pretty much ready. I've removed the nouveau unloading. I can also provide an Ansible playbook which disables nouveau.
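Such a playbook would presumably boil down to blacklisting the module and rebuilding the initramfs. A sketch under those assumptions (the group name and task details are illustrative, not the actual playbook offered here):

---
- hosts: nvidia_gpu_nodes               # hypothetical inventory group
  become: true
  tasks:
  - name: Blacklist the nouveau module
    copy:
      dest: /etc/modprobe.d/blacklist-nouveau.conf
      content: |
        blacklist nouveau
        options nouveau modeset=0
  - name: Rebuild the initramfs so the blacklist applies on the next boot (Ubuntu)
    command: update-initramfs -u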
GCP won't build and test a CentOS driver installer:
Can we merge this?
/assign ant31
please rebase
@@ -246,3 +246,16 @@ persistent_volumes_enabled: false
## See https://github.com/kubernetes-incubator/kubespray/issues/2141
## Set this variable to true to get rid of this issue
volume_cross_zone_attachment: false

## Container Engine Acceleration
## Enable container accelartion feature, for example use gpu acceleration in containers
typo on the first acceleration
# nvidia_accelerator_enabled: true
## Nvidia GPU driver install. Install will by done by a (init) pod running as a daemonset.
## Array with nvida_gpu_nodes, leave empty or comment if you dont't want to install drivers.
## Nodes won't get labels gpu labels if they are not in the array.
This line is a bit confusing
## Enable container accelartion feature, for example use gpu acceleration in containers
# nvidia_accelerator_enabled: true
## Nvidia GPU driver install. Install will by done by a (init) pod running as a daemonset.
## Array with nvida_gpu_nodes, leave empty or comment if you dont't want to install drivers.
Can I have a label that I define in my inventory file that would install the gpu stuff?
Yes, you can; the driver will install, but there is also a taint set on the nodes in nvidia_gpu_nodes which prevents scheduling of non-GPU pods on GPU nodes. That is why this array exists. If you were to add the label nvidia.com/gpu=true via the inventory node and have nvidia_accelerator_enabled set, the driver would still be installed, but the taint would not be set.
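For illustration, the taint on a node in nvidia_gpu_nodes would look roughly like this in the node spec (the exact key, value and effect are assumptions; the role defines the real ones):

spec:
  taints:
  - key: nvidia.com/gpu
    value: "true"
    effect: NoSchedule      # pods without a matching toleration are not scheduled here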
@@ -0,0 +1,9 @@
---
nvidia_accelerator_enabled: false
nvidia_driver_version: "384.111"
390.87 was just released, not sure if that would work.
let me test this. I think it should work
works
Is this able to be merged?
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by:
If they are not already assigned, you can assign the PR to them by writing
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
So vacation is over, time to get this fixed ;) @ant31 On CI testing on GCE: need to check that out, no experience with that. -> So I checked it. To be able to start a build on a GCE instance with a P4 card we need to use
ci check this
/lgtm
New changes are detected. LGTM label has been removed.
Please fix until tests are green, and then squash the commits.
LGTM. I'll create the CI in a following PR; I've requested GPUs on kubespray's GCE account and @squat will provide a test pod.
Force-pushed from 7a87772 to c838a8b
@AtzeDeVries can you reach me on the Kubernetes Slack? @ant31
So this is now merged in #3304, so I'll close this.
Has anyone tested this? Would love to try it out.
So here is the follow-up on PR #2478. This setup uses container_engine_accelerator. Currently only Ubuntu Xenial (16.04) is supported.
To enable GPU capability, set the following vars in all.yml and k8s-cluster.yml:
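Pieced together from this thread, the settings amount to roughly the following (the values shown are examples; the sample files shipped with the PR are authoritative):

# all.yml
docker_storage_options: -s overlay2    # required: aufs causes a kernel panic during driver install

# k8s-cluster.yml
nvidia_accelerator_enabled: true
nvidia_gpu_flavor: tesla
nvidia_gpu_nodes:
  - gpu-node-1                         # hypothetical node name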
This will label the nodes correctly (and also set up a taint so that only GPU jobs are scheduled on GPU nodes). There is a container that installs the driver.
The container provided by https://github.com/GoogleCloudPlatform/container-engine-accelerators is a bit simple and does not survive reboots. I've updated it and created a PR (GoogleCloudPlatform/container-engine-accelerators#70).
For now the container to install the drivers can be found here:
atzedevries/nvidia-ubuntu-driver-installer:10
I'm planning to add support for CentOS next.