[Build] Problems when use onnxruntime #21264

paulocoutinhox · 2024-07-05T17:36:45Z

Describe the issue

When I try to use onnxruntime with CUDA inside a Docker container, I get the errors below.

Urgency

No response

Target platform

Linux x64

Build script

Error / output

2024-07-05 17:20:42.719797292 [E:onnxruntime:Default, provider_bridge_ort.cc:1731 TryGetProviderInfo_TensorRT] /onnxruntime_src/onnxruntime/core/session/provider_bridge_ort.cc:1426 onnxruntime::Provider& onnxruntime::ProviderLibrary::Get() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_tensorrt.so with error: libcublas.so.11: cannot open shared object file: No such file or directory

*************** EP Error ***************
EP Error /onnxruntime_src/onnxruntime/python/onnxruntime_pybind_state.cc:456 void onnxruntime::python::RegisterTensorRTPluginsAsCustomOps(onnxruntime::python::PySessionOptions&, const ProviderOptions&) Please install TensorRT libraries as mentioned in the GPU requirements page, make sure they're in the PATH or LD_LIBRARY_PATH, and that your GPU is supported.
when using ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'AzureExecutionProvider', 'CPUExecutionProvider']
Falling back to ['CUDAExecutionProvider', 'CPUExecutionProvider'] and retrying.

2024-07-05 17:20:42.999570148 [E:onnxruntime:Default, provider_bridge_ort.cc:1745 TryGetProviderInfo_CUDA] /onnxruntime_src/onnxruntime/core/session/provider_bridge_ort.cc:1426 onnxruntime::Provider& onnxruntime::ProviderLibrary::Get() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_cuda.so with error: libcublasLt.so.11: cannot open shared object file: No such file or directory

Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ./checkpoints/models/buffalo_l/1k3d68.onnx landmark_3d_68 ['None', 3, 192, 192] 0.0 1.0
2024-07-05 17:20:43.679676761 [E:onnxruntime:Default, provider_bridge_ort.cc:1731 TryGetProviderInfo_TensorRT] /onnxruntime_src/onnxruntime/core/session/provider_bridge_ort.cc:1426 onnxruntime::Provider& onnxruntime::ProviderLibrary::Get() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_tensorrt.so with error: libcublas.so.11: cannot open shared object file: No such file or directory

*************** EP Error ***************
EP Error /onnxruntime_src/onnxruntime/python/onnxruntime_pybind_state.cc:456 void onnxruntime::python::RegisterTensorRTPluginsAsCustomOps(onnxruntime::python::PySessionOptions&, const ProviderOptions&) Please install TensorRT libraries as mentioned in the GPU requirements page, make sure they're in the PATH or LD_LIBRARY_PATH, and that your GPU is supported.
when using ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'AzureExecutionProvider', 'CPUExecutionProvider']
Falling back to ['CUDAExecutionProvider', 'CPUExecutionProvider'] and retrying.

2024-07-05 17:20:43.695770115 [E:onnxruntime:Default, provider_bridge_ort.cc:1745 TryGetProviderInfo_CUDA] /onnxruntime_src/onnxruntime/core/session/provider_bridge_ort.cc:1426 onnxruntime::Provider& onnxruntime::ProviderLibrary::Get() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_cuda.so with error: libcublasLt.so.11: cannot open shared object file: No such file or directory
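For anyone triaging a similar log, the repeated failures above reduce to two distinct missing shared libraries. Below is a minimal sketch, not part of onnxruntime, that extracts them from the log text. The helper name `missing_dependencies` and the regular expression are assumptions made for illustration, based only on the message format shown in this issue.

```python
import re

# Pattern for the loader failure message that appears in the log above.
# This is an assumption derived from the lines in this issue, not an
# official onnxruntime log format specification.
_LOAD_FAILURE = re.compile(
    r"Failed to load library (?P<provider>\S+) with error: "
    r"(?P<missing>\S+): cannot open shared object file"
)


def missing_dependencies(log_text):
    """Return sorted, de-duplicated (provider library, missing dependency) pairs."""
    return sorted(
        {(m["provider"], m["missing"]) for m in _LOAD_FAILURE.finditer(log_text)}
    )


# Trimmed excerpts of the two distinct failures reported in this issue.
SAMPLE_LOG = """\
[ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_tensorrt.so with error: libcublas.so.11: cannot open shared object file: No such file or directory
[ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_cuda.so with error: libcublasLt.so.11: cannot open shared object file: No such file or directory
"""

if __name__ == "__main__":
    for provider, missing in missing_dependencies(SAMPLE_LOG):
        print(f"{provider} needs {missing}")
```

Both missing names end in `.so.11`, which suggests the provider libraries expect CUDA 11 runtime libraries that the container does not provide.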

Visual Studio Version

No response

GCC / Compiler Version

No response

snnn · 2024-07-08T20:23:57Z

Could you please be more specific: which Docker image are you using? We no longer publish Docker images.

dianyo · 2024-07-12T10:32:36Z

Hi @snnn,
I encountered the same issue when using this Docker image: https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/rel-23-12.html

I installed onnxruntime-gpu using pip install onnxruntime-gpu --extra-index-url https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-cuda-12/pypi/simple/

I think this is related to the library versions the wheel is linked against.

Here is the ldd output:

ldd libonnxruntime_providers_cuda.so
        linux-vdso.so.1 (0x00007ffe92257000)
        libcublasLt.so.11 => not found
        libcublas.so.11 => not found
        libcudnn.so.8 => /lib/x86_64-linux-gnu/libcudnn.so.8 (0x00007f3d45200000)
        libcurand.so.10 => /usr/local/cuda/targets/x86_64-linux/lib/libcurand.so.10 (0x00007f3d3ec00000)
        libcufft.so.10 => not found
        libcudart.so.11.0 => not found
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f3d4556d000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f3d45568000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f3d45563000)
        libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f3d3e9d4000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f3d4547a000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f3d4545a000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f3d3e7ac000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f3d68769000)
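The unresolved entries in the ldd output above can also be listed programmatically. This is a minimal sketch that assumes the usual `name => target` layout of ldd output as shown here. The function name `unresolved_libs` is hypothetical, and the sample text is a shortened copy of the output above.

```python
def unresolved_libs(ldd_output):
    """Return the library names that ldd reports as 'not found', in order."""
    missing = []
    for line in ldd_output.splitlines():
        # Lines look like "libfoo.so.1 => /path/libfoo.so.1 (0x...)"
        # or "libfoo.so.1 => not found".
        name, sep, target = line.strip().partition(" => ")
        if sep and target.strip() == "not found":
            missing.append(name)
    return missing


# Shortened copy of the ldd output posted in this comment.
SAMPLE_LDD = """\
        linux-vdso.so.1 (0x00007ffe92257000)
        libcublasLt.so.11 => not found
        libcublas.so.11 => not found
        libcudnn.so.8 => /lib/x86_64-linux-gnu/libcudnn.so.8 (0x00007f3d45200000)
        libcufft.so.10 => not found
        libcudart.so.11.0 => not found
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f3d3e7ac000)
"""

if __name__ == "__main__":
    print(unresolved_libs(SAMPLE_LDD))
```

To run it against a real file, the text could come from `subprocess.run(["ldd", path], capture_output=True, text=True).stdout`, where `path` is wherever the provider library is installed on your system.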

And here is what ldconfig reports for two of the missing libraries:

ldconfig -p | grep libcublasLt
       libcublasLt.so.12 (libc6,x86-64) => /usr/local/cuda/targets/x86_64-linux/lib/libcublasLt.so.12
       libcublasLt.so (libc6,x86-64) => /usr/local/cuda/targets/x86_64-linux/lib/libcublasLt.so

ldconfig -p | grep libcufft.so
        libcufft.so.11 (libc6,x86-64) => /usr/local/cuda/targets/x86_64-linux/lib/libcufft.so.11
        libcufft.so (libc6,x86-64) => /usr/local/cuda/targets/x86_64-linux/lib/libcufft.so
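Putting the ldd and ldconfig outputs side by side shows the mismatch: the provider library asks for major versions 11 and 10, while the container only has 12 and 11. The sketch below makes that comparison explicit. All function names are hypothetical, and the soname parsing is a simplifying assumption that only handles names of the form `libNAME.so.MAJOR[...]`.

```python
import re

# Simplified soname pattern: "libNAME.so.MAJOR", ignoring any minor suffix.
_SONAME = re.compile(r"^(?P<base>lib[\w+-]+)\.so\.(?P<major>\d+)")


def split_soname(soname):
    """Return (base name, major version) or None for unversioned names."""
    m = _SONAME.match(soname)
    return (m["base"], int(m["major"])) if m else None


def available_majors(ldconfig_output):
    """Map each library base name to the set of major versions ldconfig lists."""
    found = {}
    for line in ldconfig_output.splitlines():
        parsed = split_soname(line.strip().split(" ", 1)[0])
        if parsed:
            found.setdefault(parsed[0], set()).add(parsed[1])
    return found


def version_mismatches(required, ldconfig_output):
    """For each required soname that is absent, list the majors that are present."""
    have = available_majors(ldconfig_output)
    result = {}
    for soname in required:
        parsed = split_soname(soname)
        if parsed and parsed[1] not in have.get(parsed[0], set()):
            result[soname] = sorted(have.get(parsed[0], set()))
    return result


# Copied from the ldconfig output in this comment.
SAMPLE_LDCONFIG = """\
       libcublasLt.so.12 (libc6,x86-64) => /usr/local/cuda/targets/x86_64-linux/lib/libcublasLt.so.12
       libcublasLt.so (libc6,x86-64) => /usr/local/cuda/targets/x86_64-linux/lib/libcublasLt.so
        libcufft.so.11 (libc6,x86-64) => /usr/local/cuda/targets/x86_64-linux/lib/libcufft.so.11
        libcufft.so (libc6,x86-64) => /usr/local/cuda/targets/x86_64-linux/lib/libcufft.so
"""

if __name__ == "__main__":
    print(version_mismatches(["libcublasLt.so.11", "libcufft.so.10"], SAMPLE_LDCONFIG))
```

A non-empty result with a newer major version listed is consistent with a wheel built against CUDA 11 being loaded in a CUDA 12 environment.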

snnn · 2024-07-12T20:54:36Z

Duplicate of #20944.

paulocoutinhox added the "build" label (build issues; typically submitted using template) on Jul 5, 2024

github-actions bot added the "ep:CUDA" (issues related to the CUDA execution provider) and "ep:TensorRT" (issues related to TensorRT execution provider) labels on Jul 5, 2024

snnn added the "more info needed" label (issues that cannot be triaged until more information is submitted by the original user) and removed the "build" label on Jul 8, 2024

snnn closed this as completed on Jul 12, 2024
