Error: DriverError(CUDA_ERROR_INVALID_PTX, "a PTX JIT compilation failed") when loading utanh_bf16 #850
Comments
with error
In the Docker image there is a CUDA requirement that does not go as far as 550; it is all 536, which this driver uses.
OK, so CUDA works on the nodes. Something is wrong with the CUDA usage or the build in mistralrs.
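As background (my addition, not from the thread): `CUDA_ERROR_INVALID_PTX` from the driver's JIT typically means either the PTX was emitted by a toolkit newer than the CUDA version the driver reports, or the PTX targets a virtual architecture above the device's compute capability. A minimal sketch of both rules (the helper names are my own, not part of any API):

```python
def jit_can_compile(driver_cuda, toolkit_cuda):
    # The driver's embedded PTX JIT only understands PTX emitted by
    # toolkits no newer than the CUDA version the driver reports.
    return toolkit_cuda <= driver_cuda

def device_can_run(ptx_target_sm, device_sm):
    # PTX targeting compute_XY can be JIT-compiled only for devices
    # of compute capability XY or higher.
    return device_sm >= ptx_target_sm

# Driver 550.90.07 reports CUDA 12.4; the image ships toolkit 12.4.1.
print(jit_can_compile((12, 4), (12, 4)))  # True: matches at major.minor
# A kernel built for compute_90 cannot JIT onto an NVIDIA L4 (sm_89).
print(device_can_run(90, 89))             # False
```

If either check fails at load time, the driver aborts the JIT step with exactly this kind of error, even though other CUDA workloads on the same node run fine.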
And a sample CUDA Pod works too:

$ kubectl -n ml logs vector-add
[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done

Pods:

apiVersion: v1
kind: Pod
metadata:
  name: cuda-info
  namespace: ml
spec:
  restartPolicy: OnFailure
  containers:
    - name: main
      image: cuda:12.4.1-cudnn-devel-ubuntu22.04
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1

apiVersion: v1
kind: Pod
metadata:
  name: vector-add
  namespace: ml
spec:
  restartPolicy: OnFailure
  containers:
    - name: main
      image: cuda-sample:vectoradd-cuda12.5.0-ubuntu22.04
      resources:
        limits:
          nvidia.com/gpu: 1
PyTorch also works on these nodes.

apiVersion: v1
kind: Pod
metadata:
  name: pytorch-cuda
  namespace: ml
spec:
  containers:
    - name: main
      image: pytorch/pytorch:2.4.1-cuda12.4-cudnn9-devel
      command: ["/bin/sh", "-c", "sleep 1000000"]
      resources:
        limits:
          nvidia.com/gpu: 1

$ kubectl exec -n ml --stdin --tty pytorch-cuda -- /bin/bash
root@pytorch-cuda:/workspace# python3
Python 3.11.9 | packaged by conda-forge | (main, Apr 19 2024, 18:36:13) [GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.current_device()
0
>>> torch.cuda.device_count()
1
>>> torch.cuda.get_device_name(0)
'NVIDIA L4'
>>>
root@pytorch-cuda:/workspace#

A HuggingFace PyTorch-based HTTP server Docker image also works here and uses CUDA: https://github.com/nikolaydubina/basic-openai-pytorch-server
@EricLBuehler any tips on why CUDA is not working here?
I tested the Python cookbook from the examples on Google Colab and got the same or a similar error.
Describe the bug
Llama 3.2 11B Vision cannot start after loading the model.
My system
Latest commit or version
ghcr.io/ericlbuehler/mistral.rs:cuda-90-sha-1caf83a@sha256:095518a16d1f0a9fa2e212463736ccb540eeb0f88f21c10a2123ab8cf481b83e
References
Driver 550.90.07 supports CUDA 12.4, which is the same as in the Docker image (12.4.1).
https://docs.nvidia.com/datacenter/tesla/tesla-release-notes-550-90-07/index.html
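The version pairing in these references can be checked mechanically; a minimal sketch (helper names are my own) that compares the driver-reported CUDA version with the image's toolkit version at major.minor granularity:

```python
def major_minor(version):
    # Keep only the major.minor pair, e.g. "12.4.1" -> (12, 4).
    parts = version.split(".")
    return int(parts[0]), int(parts[1])

def driver_supports_toolkit(driver_supported, image_toolkit):
    # The driver must report a CUDA version at least as new as the
    # toolkit that produced the PTX shipped in the image.
    return major_minor(image_toolkit) <= major_minor(driver_supported)

# Driver 550.90.07 reports CUDA 12.4; the image ships toolkit 12.4.1.
print(driver_supports_toolkit("12.4", "12.4.1"))  # True
```

Since this pair matches, a plain toolkit/driver mismatch does not obviously explain the failure, which is consistent with the earlier observation that CUDA itself works on these nodes.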