Minimal Nvidia GPU image #5
to use NVIDIA GPU-Optimized AMI - https://aws.amazon.com/marketplace/pp/prodview-7ikjtg3um26wq
given that the Nvidia source image requires a GPU > 2024-08-09T03:45:28Z: ==> amazon-ebs.build_ebs: Error waiting for fleet request (fleet-d6b5ce15-8087-660d-0e92-0982be357801) to become ready:The instance configuration for this AWS Marketplace product is not supported. Please see the AWS Marketplace site for more information about supported instance types, regions, and operating systems.
given that 80 is too small for Nvidia source image > 2024-08-09T03:11:58Z: ==> amazon-ebs.build_ebs: Error waiting for fleet request (fleet-5415661f-0887-e407-0630-81802dcb1f95) to become ready:Your requested instance type (c7a.xlarge) is not supported in your requested Availability Zone (eu-west-2c).Your requested instance type (m7a.xlarge) is not supported in your requested Availability Zone (eu-west-2c).Volume of size 80GB is smaller than snapshot 'snap-00c0e57c77605a262', expect size>= 128GB
as it conflicts with Nvidia source image
`./bin/patch/ubuntu22-x64 releases/ubuntu22/x64`
given that the Nvidia source image apt-installs drivers at startup, so we need to wait for the dpkg locks to be released
`./bin/patch/ubuntu22-x64 releases/ubuntu22/x64`
as it conflicts with Nvidia source image
`./bin/patch/ubuntu22-x64 releases/ubuntu22/x64`
``` 2024/08/14 19:31:55 ui error: 2024-08-14T19:31:55Z: ==> amazon-ebs.build_ebs: Error modify AMI attributes: AuthFailure: AMIs with product codes can't be made public ```
`./bin/patch/ubuntu22-x64 releases/ubuntu22/x64`
because it takes so much time to build
`./bin/patch/ubuntu22-x64 releases/ubuntu22/x64`
Thanks @ruffsl! I think you forgot to push. I will experiment with this and also try @samayala22's approach, since it would be nice to be able to simply extend the base RunsOn images with the additional drivers.
I forget how/where the templates populate from, but it's already included in the source tree here: `runner-images-for-aws/releases/ubuntu22/x64/images/ubuntu/scripts/build/configure-apt-mock.sh` (lines 1 to 6 in 432c20f).
That could be a better and more customizable approach; I've also updated the OP commit to note the disk usage.
Opening for visibility and collaboration. It would be nice to include a minimal Nvidia GPU AMI for use with RunsOn.
This PR currently modifies the default templates re-used by the `gpu` `RELEASE_DIST` example to slim down the resulting AMI, and changes the source AMI to leverage the NVIDIA GPU-Optimized AMI.

Because of the source AMI's imposed constraints, building the custom GPU AMI requires an Nvidia GPU `instance_type` to kick off the packer process. Perhaps this could be sidestepped by manually installing the Nvidia drivers and Nvidia container runtime, but that is something I've not yet bothered to reverse engineer.

View the commit log for some notable subtle patches: working around apt-lock blocking caused by the Nvidia source AMI's use of bashrc to bootstrap the drivers on first boot, and disabling the AWS CLI installation, given that it conflicts with the pre-installed version that ships with the Nvidia source AMI. The Nvidia source AMI is also initialized from a larger drive (128GB), so our child AMI also (unfortunately) requires a larger minimum disk size than the current default large option of 80GB in the RunsOn disk size setting. Thus some editing of the RunsOn CloudFormation settings was also needed. This may be another motivation to manually install the Nvidia drivers rather than rely on the source AMI.
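
For readers unfamiliar with the apt-lock issue described above, here is a minimal sketch of how a provisioning script might wait for the locks to clear before running `apt-get`. It is not the patch from the commit log: the function names `locks_held` and `wait_for_apt_locks` and the `LOCK_PROBE` hook are hypothetical, the lock paths in the usage comment assume a stock Ubuntu 22.04 layout, and the probe assumes `fuser` (from psmisc) is installed and the script runs as root.

```shell
#!/usr/bin/env bash
# Sketch only: poll until no process holds the apt/dpkg lock files.

# Returns 0 if any of the given lock files is currently held open by a process.
# Assumes `fuser` is available; if it is missing, the file is treated as free.
locks_held() {
  local f
  for f in "$@"; do
    if [ -e "$f" ] && fuser "$f" >/dev/null 2>&1; then
      return 0
    fi
  done
  return 1
}

# wait_for_apt_locks <timeout_seconds> <interval_seconds> <lock_file>...
# Polls the probe until it reports the locks as free, or gives up once the
# accumulated wait reaches the timeout. LOCK_PROBE (hypothetical hook) lets
# callers swap in a different probe; it defaults to locks_held.
wait_for_apt_locks() {
  local timeout="$1" interval="$2"
  shift 2
  local waited=0
  while "${LOCK_PROBE:-locks_held}" "$@"; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out after ${waited}s waiting for apt/dpkg locks" >&2
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  return 0
}

# Example call (commented out so that sourcing this file has no side effects):
# wait_for_apt_locks 600 5 \
#   /var/lib/dpkg/lock-frontend /var/lib/dpkg/lock \
#   /var/lib/apt/lists/lock /var/cache/apt/archives/lock
```

Polling with a timeout keeps the build from hanging indefinitely if the first-boot driver install never finishes. Recent apt versions also accept `-o DPkg::Lock::Timeout=<seconds>`, which may be a simpler alternative where it is available, although it may not cover every lock file listed above.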
Context: