Re-Delete previous Makefile #1039

Merged
merged 6 commits into from
Feb 5, 2024
Conversation

younesbelkada (Collaborator):

As done in #949 - we should redirect users to use CMake now to compile bnb from source

cc @Titus-von-Koeller @rickardp @akx @wkpark

```diff
@@ -23,12 +23,11 @@ pip install bitsandbytes
 git clone https://github.com/TimDettmers/bitsandbytes.git && cd bitsandbytes/
-CUDA_VERSION=XXX make cuda12x
-python setup.py install
+cmake -B build -DBUILD_CUDA=ON -S .
```
younesbelkada (Collaborator, Author):
Is this the correct way to install bnb from source now using CMake? When trying this out I am getting:

Failed to import transformers.integrations.bitsandbytes because of the following error (look up to see its traceback):

        CUDA Setup failed despite GPU being available. Please run the following command to get more information:

        python -m bitsandbytes

        Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
        to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
        and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues

Contributor:
You should show the traceback ("look up to see its traceback" in that message there) – hard to say otherwise.

akx (Contributor) commented Feb 5, 2024:

The build workflow does `cmake -DCOMPUTE_BACKEND=cuda -DNO_CUBLASLT=OFF .`

On my machine, I needed an additional `-DCMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc`.
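Putting the flags from the build workflow together with the clone step from the docs, a from-source CUDA build would plausibly look like the sketch below. The explicit `cmake --build` step and the `python setup.py install` step are assumptions, taken from standard CMake usage and the previous Makefile-based instructions; the nvcc path is illustrative and machine-specific.

```shell
# Sketch of a from-source CUDA build, assuming the flags quoted above:
git clone https://github.com/TimDettmers/bitsandbytes.git && cd bitsandbytes
cmake -B build -DCOMPUTE_BACKEND=cuda -DNO_CUBLASLT=OFF \
      -DCMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc -S .
cmake --build build          # compiles the libbitsandbytes_cuda*.so library
python setup.py install      # same install step as the old Makefile flow
```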

younesbelkada (Collaborator, Author):

Ok, perfect, let me try that out.

younesbelkada (Collaborator, Author):

This works indeed, thanks!

younesbelkada (Collaborator, Author) commented Feb 6, 2024:

EDIT: it worked on my VM, but on our hosted runner + custom Docker image I constantly get:

../../../opt/conda/envs/peft/lib/python3.8/site-packages/_pytest/config/__init__.py:1394
  /opt/conda/envs/peft/lib/python3.8/site-packages/_pytest/config/__init__.py:1394: PytestConfigWarning: Unknown config option: doctest_glob
  
    self._warn_or_fail_if_strict(f"Unknown config option: {key}\n")

../../../bitsandbytes/bitsandbytes/cuda_setup/main.py:184
  /bitsandbytes/bitsandbytes/cuda_setup/main.py:184: UserWarning: Welcome to bitsandbytes. For bug reports, please run
  
  python -m bitsandbytes
  
  
    warn(msg)

../../../bitsandbytes/bitsandbytes/cuda_setup/main.py:184
  /bitsandbytes/bitsandbytes/cuda_setup/main.py:184: UserWarning: /opt/conda/envs/peft did not contain ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0', 'libcudart.so.12.1', 'libcudart.so.12.2'] as expected! Searching further paths...
    warn(msg)

../../../bitsandbytes/bitsandbytes/cuda_setup/main.py:184
  /bitsandbytes/bitsandbytes/cuda_setup/main.py:184: UserWarning: /usr/local/nvidia/lib:/usr/local/nvidia/lib64 did not contain ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0', 'libcudart.so.12.1', 'libcudart.so.12.2'] as expected! Searching further paths...
    warn(msg)

With the same final error. Debugging further, I found that libcudart.so.12 was in /usr/local/cuda-12/lib64 or /usr/local/cuda-12.2/lib64 in that Docker image. I also pulled the image and ran python -m bitsandbytes, and I can confirm I get the same error. I also tried to re-compile with the correct LD_LIBRARY_PATH by doing export LD_LIBRARY_PATH=/usr/local/cuda-12/lib64:$LD_LIBRARY_PATH, but I am still getting the same error.

Do you have an idea why the bnb build from source fails in this case? You can pull the Docker image from https://hub.docker.com/r/huggingface/peft-gpu-bnb-source and use the peft environment by calling source activate peft.

And the corresponding PR: huggingface/peft#1437

I have to disable our testing workflow for now until I figure out how to fix this 🙏 Any insight appreciated! @akx @wkpark @rickardp
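The reproduction steps described above can be condensed into the following sketch. The `docker run` flags are assumptions (adjust for your GPU setup); the image name and the `source activate peft` step come from the comment itself.

```shell
# Hypothetical reproduction of the failure in the hosted-runner image:
docker pull huggingface/peft-gpu-bnb-source
docker run --gpus all -it huggingface/peft-gpu-bnb-source bash

# inside the container:
source activate peft
python -m bitsandbytes        # shows the "CUDA Setup failed" error

# libcudart.so.12 actually lives here in this image, not on the default
# search path, so extend LD_LIBRARY_PATH before retrying:
export LD_LIBRARY_PATH=/usr/local/cuda-12/lib64:$LD_LIBRARY_PATH
```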

younesbelkada (Collaborator, Author):

Full traceback below:

==================================== ERRORS ====================================
_____________ ERROR collecting tests/quantization/bnb/test_4bit.py _____________
transformers-clone/tests/quantization/bnb/test_4bit.py:82: in <module>
    import bitsandbytes as bnb
/bitsandbytes/bitsandbytes/__init__.py:6: in <module>
    from . import cuda_setup, research, utils
/bitsandbytes/bitsandbytes/research/__init__.py:2: in <module>
    from .autograd._functions import (
/bitsandbytes/bitsandbytes/research/autograd/_functions.py:8: in <module>
    from bitsandbytes.autograd._functions import GlobalOutlierPooler, MatmulLtState
/bitsandbytes/bitsandbytes/autograd/__init__.py:1: in <module>
    from ._functions import get_inverse_transform_indices, undo_layout
/bitsandbytes/bitsandbytes/autograd/_functions.py:10: in <module>
    import bitsandbytes.functional as F
/bitsandbytes/bitsandbytes/functional.py:17: in <module>
    from .cextension import COMPILED_WITH_CUDA, lib
/bitsandbytes/bitsandbytes/cextension.py:17: in <module>
    raise RuntimeError('''
E   RuntimeError: 
E           CUDA Setup failed despite GPU being available. Please run the following command to get more information:
E   
E           python -m bitsandbytes
E   
E           Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
E           to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
E           and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues
------------------------------- Captured stdout --------------------------------
False

===================================BUG REPORT===================================
================================================================================
The following directories listed in your path were found to be non-existent: {PosixPath('/usr/local/nvidia/lib64'), PosixPath('/usr/local/nvidia/lib')}
The following directories listed in your path were found to be non-existent: {PosixPath('https'), PosixPath('//api.github.com/graphql')}
The following directories listed in your path were found to be non-existent: {PosixPath('multi_gpu_huggingface/peft-gpu-bnb-latest'), PosixPath('latest')}
The following directories listed in your path were found to be non-existent: {PosixPath('https'), PosixPath('//api.github.com')}
The following directories listed in your path were found to be non-existent: {PosixPath('huggingface/peft/.github/workflows/nightly-bnb.yml@refs/heads/main')}
The following directories listed in your path were found to be non-existent: {PosixPath('huggingface/peft')}
The following directories listed in your path were found to be non-existent: {PosixPath('refs/heads/main')}
The following directories listed in your path were found to be non-existent: {PosixPath('https'), PosixPath('//github.com')}
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching in backup paths...
DEBUG: Possible options found for libcudart.so: {PosixPath('/usr/local/cuda/lib64/libcudart.so')}
CUDA SETUP: PyTorch settings found: CUDA_VERSION=121, Highest Compute Capability: 7.5.
CUDA SETUP: To manually override the PyTorch CUDA version please see:https://github.com/TimDettmers/bitsandbytes/blob/main/how_to_use_nonpytorch_cuda.md
CUDA SETUP: Required library version not found: libbitsandbytes_cuda121.so. Maybe you need to compile it from source?
CUDA SETUP: Defaulting to libbitsandbytes_cpu.so...

================================================ERROR=====================================
CUDA SETUP: CUDA detection failed! Possible reasons:
1. You need to manually override the PyTorch CUDA version. Please see: "https://github.com/TimDettmers/bitsandbytes/blob/main/how_to_use_nonpytorch_cuda.md
2. CUDA driver not installed
3. CUDA not installed
4. You have multiple conflicting CUDA libraries
5. Required library not pre-compiled for this bitsandbytes release!
CUDA SETUP: If you compiled from source, try again with `make CUDA_VERSION=DETECTED_CUDA_VERSION` for example, `make CUDA_VERSION=113`.
CUDA SETUP: The CUDA version for the compile might depend on your conda install. Inspect CUDA version via `conda list | grep cuda`.
================================================================================

CUDA SETUP: Something unexpected happened. Please compile from source:
git clone https://github.com/TimDettmers/bitsandbytes.git
cd bitsandbytes
CUDA_VERSION=121 make cuda12x
python setup.py install
CUDA SETUP: Setup Failed!
=============================== warnings summary ===============================
../../../opt/conda/envs/peft/lib/python3.8/site-packages/_pytest/config/__init__.py:1394
  /opt/conda/envs/peft/lib/python3.8/site-packages/_pytest/config/__init__.py:1394: PytestConfigWarning: Unknown config option: doctest_glob
  
    self._warn_or_fail_if_strict(f"Unknown config option: {key}\n")

../../../bitsandbytes/bitsandbytes/cuda_setup/main.py:184
  /bitsandbytes/bitsandbytes/cuda_setup/main.py:184: UserWarning: Welcome to bitsandbytes. For bug reports, please run
  
  python -m bitsandbytes
  
  
    warn(msg)

../../../bitsandbytes/bitsandbytes/cuda_setup/main.py:184
  /bitsandbytes/bitsandbytes/cuda_setup/main.py:184: UserWarning: /opt/conda/envs/peft did not contain ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0', 'libcudart.so.12.1', 'libcudart.so.12.2'] as expected! Searching further paths...
    warn(msg)

../../../bitsandbytes/bitsandbytes/cuda_setup/main.py:184
  /bitsandbytes/bitsandbytes/cuda_setup/main.py:184: UserWarning: /usr/local/nvidia/lib:/usr/local/nvidia/lib64 did not contain ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0', 'libcudart.so.12.1', 'libcudart.so.12.2'] as expected! Searching further paths...
    warn(msg)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
-------------- generated report log file: transformers_tests.log ---------------
=========================== short test summary info ============================
ERROR transformers-clone/tests/quantization/bnb/test_4bit.py - RuntimeError: 
        CUDA Setup failed despite GPU being available. Please run the following command to get more information:

        python -m bitsandbytes

        Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
        to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
        and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues
!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!
========================= 4 warnings, 1 error in 0.88s =========================
make: *** [Makefile:52: transformers_tests] Error 2

rickardp (Contributor) commented Feb 6, 2024:

On my machine, I needed an additional -DCMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc

@akx According to the CMake docs, it should now find CUDA automatically without resorting to magic like this. It would be interesting to know why you need to set this.
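For context, CMake's CUDA language support locates nvcc automatically when it is on PATH or named by the standard variables; a sketch of the options (all paths are illustrative):

```shell
# Option 1: put nvcc on PATH so CMake's compiler detection finds it
export PATH=/usr/local/cuda/bin:$PATH

# Option 2: set the CUDACXX environment variable, which CMake honors
# when enabling the CUDA language
export CUDACXX=/usr/local/cuda/bin/nvcc

# Option 3: override per-invocation, as done in the comments above
cmake -B build -DCOMPUTE_BACKEND=cuda \
      -DCMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc -S .
```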

akx (Contributor):

I probably don't have CUDA_HOME set at all ✌️ 😂

akx (Contributor) left a review comment:

LGTM for the makefile – maybe #1037 shouldn't have been self-merged without review in the first place though? 😅

github-actions bot commented Feb 5, 2024:

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

younesbelkada (Collaborator, Author):

Thanks for the review! Yes sorry, I thought the change was too breaking so I had to revert and merge quickly 😢

wkpark (Contributor) commented Feb 5, 2024:

CMake supports simple syntax like the -DBUILD_CUDA=ON|OFF option, but currently COMPUTE_BACKEND needs to be defined as -DCOMPUTE_BACKEND=cpu, and it also breaks the documentation.

younesbelkada (Collaborator, Author):

Thanks @akx @wkpark for your insights!

@younesbelkada younesbelkada merged commit bb5f6b9 into main Feb 5, 2024
10 checks passed
@younesbelkada younesbelkada deleted the younesbelkada-patch-4 branch February 5, 2024 23:56