Dockerfile and CUDA and CUDNN issues with GPU detected #47

Open
samhodge-aiml opened this issue Mar 14, 2024 · 5 comments

@samhodge-aiml

Building the Docker image in a CI/CD pipeline fails with the error below. The full log is attached as builderror.txt.zip.

The important parts are:

-- Found CUDNN: /usr/lib/x86_64-linux-gnu/libcudnn.so  
-- Found cuDNN: v8.9.6  (include: /usr/include, library: /usr/lib/x86_64-linux-gnu/libcudnn.so)
CMake Warning at External/libtorch/share/cmake/Caffe2/public/cuda.cmake:214 (message):
  Failed to compute shorthash for libnvrtc.so
Call Stack (most recent call first):
  External/libtorch/share/cmake/Caffe2/Caffe2Config.cmake:92 (include)
  External/libtorch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
  CMakeLists.txt:78 (find_package)


-- Automatic GPU detection failed. Building for common architectures.
-- Autodetected CUDA architecture(s): 3.5;5.0;5.2;6.0;6.1;7.0;7.5;8.0;8.6;8.6+PTX
-- Added CUDA NVCC flags for: -gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_86,code=compute_86
-- Found Torch: /app/External/libtorch/lib/libtorch.so  
-- Package torch                      Yes, at /app/External/libtorch/include;/app/External/libtorch/include/torch/csrc/api/include
CMAKE_EXE_LINKER_FLAGS before: -Wl,--no-as-needed
TORCH_LIBRARIES: torch;torch_library;/app/External/libtorch/lib/libc10.so;/app/External/libtorch/lib/libkineto.a;/usr/local/cuda/lib64/stubs/libcuda.so;/usr/local/cuda/lib64/libnvrtc.so;/usr/local/cuda/lib64/libnvToolsExt.so;/usr/local/cuda/lib64/libcudart.so;/app/External/libtorch/lib/libc10_cuda.so
TORCH_CXX_FLAGS: -D_GLIBCXX_USE_CXX11_ABI=1
CMAKE_EXE_LINKER_FLAGS after: -Wl,--as-needed
CUDNN_LIBRARY_PATH: /usr/lib/x86_64-linux-gnu/libcudnn.so; CUDNN_INCLUDE_PATH: /usr/include
-- Obtained CUDA architectures automatically from installed GPUs
-- Automatic GPU detection failed. Building for Turing and Ampere as a best guess.
-- Targeting CUDA architectures: 75;86
-- SAIGA_CUDA_VERSION 
-- Found CUDAToolkit: /usr/local/cuda/targets/x86_64-linux/include (found suitable version "11.8.89", minimum required is "10.2") 
-- Enabled CUDA. Version: 11.8.89
-- Package CUDA::cudart               Yes, at /usr/local/cuda/targets/x86_64-linux/include
-- Package CUDA::nppif                Yes, at /usr/local/cuda/targets/x86_64-linux/include
-- Package CUDA::nppig                Yes, at /usr/local/cuda/targets/x86_64-linux/include
-- SAIGA_CUDA_FLAGS: -Xcompiler=-fopenmp;-Xcompiler=-march=native;-use_fast_math;--expt-relaxed-constexpr;-Xcudafe=--diag_suppress=esa_on_defaulted_function_ignored;-Xcudafe=--diag_suppress=field_without_dll_interface;-Xcudafe=--diag_suppress=base_class_has_different_dll_interface;-Xcudafe=--diag_suppress=dll_interface_conflict_none_assumed;-Xcudafe=--diag_suppress=dll_interface_conflict_dllexport_assumed
-- Using automatic CUDA Arch detection...
-- Automatic GPU detection failed. Building for common architectures.
-- Autodetected CUDA architecture(s): 3.5;5.0;5.2;6.0;6.1;7.0;7.5;8.0;8.6;8.6+PTX
-- SAIGA_CUDA_ARCH: 
-- SAIGA_CUDA_ARCH_FLAGS: -gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_86,code=compute_86
-- 
Compiler Flags:
-- SAIGA_CXX_FLAGS: -Wall;-Werror=return-type;-Wno-strict-aliasing;-Wno-sign-compare;-march=native;-fopenmp
-- SAIGA_PRIVATE_CXX_FLAGS: -fvisibility=hidden
-- SAIGA_LD_FLAGS: -fopenmp
-- CMAKE_CXX_FLAGS: 
-- CMAKE_CXX_FLAGS_DEBUG: -g
-- CMAKE_CXX_FLAGS_RELWITHDEBINFO: -O2 -g -DNDEBUG
-- CMAKE_CXX_FLAGS_RELEASE: -O3 -DNDEBUG
-- 
CUDA Compiler Flags:
-- CMAKE_CUDA_FLAGS: 
-- CMAKE_CUDA_FLAGS_DEBUG: -g
-- CMAKE_CUDA_FLAGS_RELWITHDEBINFO: -O2 -g -DNDEBUG
-- CMAKE_CUDA_FLAGS_RELEASE: -O3 -DNDEBUG
[ 17%] Built target signalhandler_unittest
[ 17%] Building CUDA object External/tiny-cuda-nn/CMakeFiles/tiny-cuda-nn.dir/src/common.cu.o
nvcc warning : The 'compute_35', 'compute_37', 'sm_35', and 'sm_37' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
nvcc fatal   : A single input file is required for a non-link phase when an outputfile is specified
make[2]: *** [External/tiny-cuda-nn/CMakeFiles/tiny-cuda-nn.dir/build.make:77: External/tiny-cuda-nn/CMakeFiles/tiny-cuda-nn.dir/src/common.cu.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:2579: External/tiny-cuda-nn/CMakeFiles/tiny-cuda-nn.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
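
To make the fallback in the log easier to read, here is a minimal Python sketch of how an architecture list such as the "Autodetected CUDA architecture(s)" line expands into the "-gencode" flags on the line after it. The function name `gencode_flags` is hypothetical and this is an illustrative re-implementation, not the CMake module that produced the log. It assumes a plain entry produces an `sm_XX` target and a `+PTX` entry additionally produces a `compute_XX` target, with duplicates dropped.

```python
def gencode_flags(arch_list):
    """Expand a list such as "7.5;8.6+PTX" into nvcc -gencode flags.

    Illustrative only: a re-implementation for reading the log,
    not the CMake code that generated it.
    """
    sm_archs, ptx_archs = [], []
    for entry in arch_list.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        wants_ptx = entry.endswith("+PTX")
        if wants_ptx:
            entry = entry[: -len("+PTX")]
        arch = entry.replace(".", "")  # "8.6" -> "86"
        # Keep first occurrence only, preserving order.
        if arch not in sm_archs:
            sm_archs.append(arch)
        if wants_ptx and arch not in ptx_archs:
            ptx_archs.append(arch)

    flags = []
    for arch in sm_archs:
        flags += ["-gencode", f"arch=compute_{arch},code=sm_{arch}"]
    for arch in ptx_archs:
        flags += ["-gencode", f"arch=compute_{arch},code=compute_{arch}"]
    return flags


# The fallback list reported when GPU detection failed.
fallback = "3.5;5.0;5.2;6.0;6.1;7.0;7.5;8.0;8.6;8.6+PTX"
print(";".join(gencode_flags(fallback)))
```

With the fallback list from the log this prints the same flag string as the "Added CUDA NVCC flags for" line, including `compute_35`, which is the target the nvcc deprecation warning refers to.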
@samhodge-aiml
Author

@michaldigimansai

Do you have anything you can suggest?

@abecadel
Contributor

It looks like there is no GPU available.
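
A quick way to check this from inside a build step is a small script like the one below. It is only a sketch: the helper name `gpu_visible` is hypothetical, and it assumes `nvidia-smi` is on the PATH whenever the NVIDIA runtime is active for that step. If the binary is missing, the check reports no GPU.

```python
import shutil
import subprocess


def gpu_visible(timeout=10):
    """Return True if `nvidia-smi -L` lists at least one GPU, else False."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # No driver utilities in this environment.
        return False
    try:
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    # `nvidia-smi -L` prints one line per device, starting with "GPU <index>:".
    return result.returncode == 0 and any(
        line.startswith("GPU ") for line in result.stdout.splitlines()
    )


if __name__ == "__main__":
    print("GPU visible:", gpu_visible())
```

Running it once in the build step and once in a running container would show whether the GPU is missing only at build time.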

@samhodge-aiml
Author

I was able to build the nerfstudio Docker image in the same CI/CD environment, and that build used the GPU for its nvcc calls.

@samhodge-aiml
Author

This seems to be related to:

https://stackoverflow.com/questions/59691207/docker-build-with-nvidia-runtime/61737404#61737404

The same thing happened on my machine. I will look into a kaniko-based solution in Kubernetes, since that is the runtime environment.
