0.3.1 #862 new build failure, stop at mistralrs-quant #866

Open
misureaudio opened this issue Oct 19, 2024 · 13 comments
Labels
bug Something isn't working build Issues relating to building mistral.rs

Comments

@misureaudio

Minimum reproducible example

cargo build --release --features cuda

Error

error: failed to run custom build command for mistralrs-quant v0.3.1 (C:\Users\misur\Desktop\rustsrc\mistral.rs.0.3.1.0862\mistralrs-quant)

Caused by:
process didn't exit successfully: C:\Users\misur\Desktop\rustsrc\mistral.rs.0.3.1.0862\target\release\build\mistralrs-quant-a5b0a5658b3f8319\build-script-build (exit code: 101)
--- stdout
cargo:rerun-if-changed=build.rs
cargo:rerun-if-changed=kernels/gptq/q_gemm.cu
cargo:rerun-if-changed=kernels/hqq/hqq.cu
cargo:rerun-if-changed=kernels/ops/ops.cu
cargo:rerun-if-changed=kernels/marlin/marlin_kernel.cu
cargo:info=["/usr", "/usr/local/cuda", "/opt/cuda", "/usr/lib/cuda", "C:/Program Files/NVIDIA GPU Computing Toolkit", "C:/CUDA"]
cargo:rerun-if-env-changed=CUDA_COMPUTE_CAP
cargo:rustc-env=CUDA_COMPUTE_CAP=75

Other information

Please specify:
Windows 11

  • GPU or accelerator information
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2024 NVIDIA Corporation
    Built on Wed_Apr_17_19:36:51_Pacific_Daylight_Time_2024
    Cuda compilation tools, release 12.5, V12.5.40
    Build cuda_12.5.r12.5/compiler.34177558_0

Sat Oct 19 14:49:11 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.90                 Driver Version: 565.90         CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 1650 ...  WDDM  |   00000000:01:00.0 Off |                  N/A |
| N/A   46C    P8              3W /   40W |       0MiB /   4096MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

Latest commit or version

0.3.1 #862

@misureaudio misureaudio added bug Something isn't working build Issues relating to building mistral.rs labels Oct 19, 2024
@misureaudio
Author

No problem in building #820

#862 builds on one of my Win11 laptops, with all relevant libraries and software on disk C:.
#862 doesn't build on a second laptop, which has the CUDA toolkit on D:; however, all preceding #xxx built OK there.
#862 doesn't build on a third laptop, with all relevant libraries and software on C:.

@misureaudio
Author

BTW, #859 (pre metal fix) compiled with no issue.

@gfxenjoyer

kernels/marlin/marlin_kernel.cu fails to compile with --gpu-architecture=sm_75. It seems to work fine for 80, 86, 89, and 90.
I manually tested by setting $env:CUDA_COMPUTE_CAP="75".

@misureaudio
Author

Fine on a Quadro A2000, CI 86

@misureaudio
Author

OK on 4070 laptop. CI 89,
No go on GTX1650 CI 75
No go on GTX1070 CI 61

@DenisBobrovskiy

Pretty sure it is caused by the Marlin kernel support that was added in #856. Try falling back to #848. Marlin kernels are built for compute capability 8+.

@misureaudio
Author

Could it be feasible to allow backward compatibility?
Even a GTX 1080 with CI=6.1, having 8GB VRAM, could be a useful asset.
Slower execution would be OK if one can still follow future developments (essentially, support for new models).

@EricLBuehler
Owner

@misureaudio that makes sense. I'll merge a nice solution!

@EricLBuehler
Owner

@misureaudio @DenisBobrovskiy I just merged #878 which only compiles & runs the Marlin kernels if the compute cap is appropriate, can you please confirm if it works?
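
For readers following along, here is a minimal sketch of what "only compile the Marlin kernels if the compute cap is appropriate" could look like. It is not the code from #878: the function name `kernels_to_compile` and the constant `MARLIN_MIN_COMPUTE_CAP` are hypothetical, the threshold of 80 is an assumption taken from the "Compute Capability of 8+" remark earlier in this thread, and the compute cap uses the same two-digit encoding as `CUDA_COMPUTE_CAP` (75, 86, ...). The kernel paths are the ones listed in the build output above.

```rust
/// Hypothetical threshold: compute capability 8.0, encoded like CUDA_COMPUTE_CAP (80).
const MARLIN_MIN_COMPUTE_CAP: usize = 80;

const MARLIN_KERNEL: &str = "kernels/marlin/marlin_kernel.cu";

/// Returns the list of CUDA sources to compile for a given compute cap.
/// The Marlin kernel is only included when the cap meets the threshold.
fn kernels_to_compile(compute_cap: usize) -> Vec<&'static str> {
    let mut kernels = vec![
        "kernels/gptq/q_gemm.cu",
        "kernels/hqq/hqq.cu",
        "kernels/ops/ops.cu",
    ];
    if compute_cap >= MARLIN_MIN_COMPUTE_CAP {
        kernels.push(MARLIN_KERNEL);
    }
    kernels
}

fn main() {
    // Compute caps reported in this thread: 61, 75 (failing) and 86, 89 (working).
    for cap in [61, 75, 86, 89] {
        let has_marlin = kernels_to_compile(cap).contains(&MARLIN_KERNEL);
        println!("compute cap {cap}: compile Marlin = {has_marlin}");
    }
}
```

With a gate like this, older GPUs would still build the other kernels; any code path that calls into Marlin at runtime would need a matching check.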

@DenisBobrovskiy

DenisBobrovskiy commented Oct 23, 2024

@EricLBuehler fails at

 thread 'main' panicked at mistralrs-quant\build.rs:19:64:
 called `Result::unwrap()` on an `Err` value: ParseFloatError { kind: Invalid }

in this code: (output.split('\n').nth(1).unwrap().parse::<f32>().unwrap() * 100.) as usize. I think it is because the line is not trimmed. This fixed it for me: (output.split('\n').nth(1).unwrap().trim().parse::<f32>().unwrap() * 100.) as usize
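
A small standalone sketch of why the missing `trim()` matters. The sample strings are assumptions: they imitate a two-line output with a header and a value, once with Windows-style `\r\n` line endings and once with `\n`. The helper names `parse_cap_untrimmed` and `parse_cap_trimmed` are hypothetical and return `Option` instead of unwrapping so the difference can be printed rather than panicking.

```rust
/// Mirrors the original expression: split on '\n', take the second line, parse as f32.
/// Returns None where the original would panic.
fn parse_cap_untrimmed(output: &str) -> Option<usize> {
    let line = output.split('\n').nth(1)?;
    line.parse::<f32>().ok().map(|cap| (cap * 100.) as usize)
}

/// Same as above, but trims whitespace (including a trailing '\r') before parsing.
fn parse_cap_trimmed(output: &str) -> Option<usize> {
    let line = output.split('\n').nth(1)?;
    line.trim().parse::<f32>().ok().map(|cap| (cap * 100.) as usize)
}

fn main() {
    // Assumed sample outputs: header line, then the compute capability.
    let windows_style = "compute_cap\r\n7.5\r\n";
    let unix_style = "compute_cap\n7.5\n";

    // Splitting on '\n' leaves "7.5\r" on Windows, which f32::from_str rejects.
    println!("untrimmed, CRLF: {:?}", parse_cap_untrimmed(windows_style));
    println!("trimmed,   CRLF: {:?}", parse_cap_trimmed(windows_style));
    println!("untrimmed, LF:   {:?}", parse_cap_untrimmed(unix_style));
    println!("trimmed,   LF:   {:?}", parse_cap_trimmed(unix_style));
}
```

Rust's `str::parse::<f32>()` does not skip surrounding whitespace, so a stray `\r` is enough to produce `ParseFloatError { kind: Invalid }`.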

@misureaudio
Author

@EricLBuehler @DenisBobrovskiy, both mods are needed, and then everything works OK: build, install, and mistralrs-server.exe running on a GTX1080, CI 6.1:

2024-10-24T08:36:26.399093Z INFO mistralrs_core::utils::normal: Detected minimum CUDA compute capability 6.1
2024-10-24T08:36:26.399315Z INFO mistralrs_core::utils::normal: Skipping BF16 because CC < 8.0
2024-10-24T08:36:26.510733Z INFO mistralrs_core::utils::normal: DType selected is F16.
100%|████████████████████████████████████████████████████████████████████████████████████| 85/85 [02:56<00:00, 0.54it/s]
100%|████████████████████████████████████████████████████████████████████████████████| 507/507 [03:24<00:00, 190.28it/s]
2024-10-24T08:39:56.301282Z INFO mistralrs_core::pipeline::isq: Applying in-situ quantization into Some(Q4K) to 129 tensors.
2024-10-24T08:39:56.302138Z INFO mistralrs_core::pipeline::isq: Applying ISQ on 12 threads.
[00:00:15] [###################################>----] 113/129 (2s)

Confirmed!

Thank you very much!

@DenisBobrovskiy

@EricLBuehler #880 should fix the issue I mentioned.

@grpathak22

grpathak22 commented Nov 8, 2024

I am also facing the same issue when I try to build the Dockerfile:

71.79 error: failed to run custom build command for mistralrs-quant v0.3.2 (/mistralrs/mistralrs-quant)
71.79
71.79 Caused by:
71.79   process didn't exit successfully: /mistralrs/target/release/build/mistralrs-quant-cd5c37cc638c94f7/build-script-build (exit status: 101)
71.79   --- stdout
71.79   cargo:rerun-if-changed=build.rs
71.79
71.79   --- stderr
71.79   thread 'main' panicked at mistralrs-quant/build.rs:15:22:
71.79   Failed to get compute cap: Os { code: 2, kind: NotFound, message: "No such file or directory" }
71.79   note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
71.79 warning: build failed, waiting for other jobs to finish...

Dockerfile Reference:

Dockerfile

FROM nvidia/cuda:12.5.1-cudnn-devel-ubuntu22.04 AS builder

RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
    curl \
    git \
    libssl-dev \
    pkg-config \
    && rm -rf /var/lib/apt/lists/*

RUN curl https://sh.rustup.rs -sSf | bash -s -- -y
ENV PATH="/root/.cargo/bin:${PATH}"
RUN rustup update nightly
RUN rustup default nightly

WORKDIR /mistralrs
RUN git clone https://github.com/EricLBuehler/mistral.rs.git .
COPY . .

#ENV CUDA_ARCH="7.5"
ARG CUDA_COMPUTE_CAP=80
ENV CUDA_COMPUTE_CAP=${CUDA_COMPUTE_CAP}
ARG FEATURES="cuda"
ENV RAYON_NUM_THREADS=4
RUN RUSTFLAGS="-Z threads=4" cargo build --release --workspace --exclude mistralrs-pyo3 --features "${FEATURES}"

FROM nvidia/cuda:12.5.1-cudnn-runtime-ubuntu22.04 AS base

ENV HUGGINGFACE_HUB_CACHE=/data \
    PORT=80 \
    RAYON_NUM_THREADS=8 \
    LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH

# Run the script to create symlinks in /usr/local/cuda/lib64
RUN set -eux; \
    for lib in $(ls /usr/local/cuda/lib64); do \
        base=$(echo $lib | sed -r 's/(.+)\.so\..+/\1.so/'); \
        if [ "$lib" != "$base" ]; then \
            ln -sf "/usr/local/cuda/lib64/$lib" "/usr/local/cuda/lib64/$base"; \
        fi; \
    done

RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
    libomp-dev \
    ca-certificates \
    libssl-dev \
    curl \
    pkg-config \
    && rm -rf /var/lib/apt/lists/*

FROM base

COPY --from=builder /mistralrs/target/release/mistralrs-bench /usr/local/bin/mistralrs-bench
RUN chmod +x /usr/local/bin/mistralrs-bench
COPY --from=builder /mistralrs/target/release/mistralrs-server /usr/local/bin/mistralrs-server
RUN chmod +x /usr/local/bin/mistralrs-server
ENTRYPOINT ["mistralrs-server", "--port", "1234", "--token-source", "env:HUGGING_FACE_HUB_TOKEN"]
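
The `NotFound` panic in the Docker log above is consistent with a build script trying to spawn an executable that does not exist inside the build container (an assumption: GPU tools such as `nvidia-smi` are usually absent during `docker build`). The sketch below shows one way a build script could prefer the `CUDA_COMPUTE_CAP` environment variable, which the Dockerfile already sets, and only fall back to querying `nvidia-smi`. It is a hypothetical illustration, not mistral.rs's actual build.rs: all function names are invented, and the two-digit encoding (80 for 8.0) follows the `CUDA_COMPUTE_CAP` convention.

```rust
use std::env;
use std::process::Command;

/// Parses a CUDA_COMPUTE_CAP-style value such as "80" or " 75\n".
fn cap_from_env_value(value: Option<&str>) -> Option<usize> {
    value?.trim().parse::<usize>().ok()
}

/// Parses text shaped like "compute_cap\n7.5\n" into 75.
fn cap_from_smi_text(text: &str) -> Option<usize> {
    let line = text.split('\n').nth(1)?;
    let cap = line.trim().parse::<f32>().ok()?;
    Some((cap * 10.0).round() as usize)
}

/// Queries nvidia-smi; returns an error string instead of panicking when it is missing.
fn cap_from_nvidia_smi() -> Result<usize, String> {
    let output = Command::new("nvidia-smi")
        .args(["--query-gpu=compute_cap", "--format=csv"])
        .output()
        .map_err(|e| format!("could not run nvidia-smi: {e}"))?;
    let text = String::from_utf8_lossy(&output.stdout);
    cap_from_smi_text(&text).ok_or_else(|| "unexpected nvidia-smi output".to_string())
}

/// Environment variable first, nvidia-smi second.
fn resolve_compute_cap() -> Result<usize, String> {
    let from_env = env::var("CUDA_COMPUTE_CAP").ok();
    match cap_from_env_value(from_env.as_deref()) {
        Some(cap) => Ok(cap),
        None => cap_from_nvidia_smi(),
    }
}

fn main() {
    match resolve_compute_cap() {
        Ok(cap) => println!("compute cap: {cap}"),
        Err(err) => eprintln!("no compute cap available: {err}"),
    }
}
```

With this ordering, a container build that sets `CUDA_COMPUTE_CAP=80` would never need to spawn `nvidia-smi` at all.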
