Can't run abaqus in ubuntu Ubuntu 22.04.1 LTS with GPU(CUDA) accelerator. Initializing the CUDA Driver NO_DEVICE
This post describes how to fix the error "Error initializing the CUDA Driver NO_DEVICE" that appears when running Abaqus on Linux with CUDA acceleration.
English version: for the English version, refer to my answer at the link "Can’t run abaqus in ubuntu Ubuntu 22.04.1 LTS with GPU(CUDA) accelerator. Initializing the CUDA Driver NO_DEVICE".
1. Problem description
Running abaqus job=jobname cpus=4 gpus=1 int on Linux with CUDA acceleration produces the following error:
    USING ACCELERATOR PLATFORM_CUDA
My system environment:
    Ubuntu 22.04.1 LTS
Environment variable settings:
    $ export | grep ABA
deviceQuery test:
    deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 12.0, CUDA Runtime Version = 12.0, NumDevs = 1, Device0 = NVIDIA A30
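2. Brief explanation of the cause:
Abaqus 2022 has a built-in defect: it ships its own copies of low-level libraries such as libcuda (the bundled file is libcuda.so.418.39, while the system driver above reports CUDA 12.0), and these prevent the system's CUDA driver library from being loaded.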
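One way to see the conflict is to list the libcuda copies in the Abaqus directory next to the ones registered with the system loader. The following is only a minimal sketch: `list_libcuda` and the `ABAQUS_BIN` variable are hypothetical names introduced here, and the default path is assumed from the steps in this post, so adjust it to your installation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every libcuda.so* entry found directly in a directory.
list_libcuda() {
    local dir="$1"
    # -maxdepth 1 keeps the search out of subdirectories such as keepcuda
    find "$dir" -maxdepth 1 -name 'libcuda.so*' 2>/dev/null | sort
}

# Assumed default install path (taken from this post); override via ABAQUS_BIN.
ABAQUS_BIN="${ABAQUS_BIN:-/usr/SIMULIA/EstProducts/2022/EstPrd/linux_a64/code/bin}"

echo "libcuda copies bundled with Abaqus:"
list_libcuda "$ABAQUS_BIN"

echo "libcuda copies known to the system loader:"
ldconfig -p 2>/dev/null | grep libcuda || echo "(none found)"
```

If the first list is non-empty, the bundled copies are still in place and may be picked up instead of the system driver library.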
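3. Solution:
Move the libcuda files bundled with Abaqus out of the way, or delete them, and the problem is solved.
Detailed steps (assuming the default installation path):
Move the files libcuda.so, libcuda.so.1, and libcuda.so.418.39 into a newly created keepcuda subdirectory so that they no longer interfere with the driver installed on the system.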
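- Go to the library directory bundled with Abaqus:
    $ cd /usr/SIMULIA/EstProducts/2022/EstPrd/linux_a64/code/bin
- Create the holding directory:
    $ sudo mkdir keepcuda
- Move the bundled CUDA library out of the way (sudo is needed here as well if the directory is owned by root):
    $ sudo mv libcuda.so ./keepcuda/libcuda.so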
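The listing shows the move for one file, while the text names three. A minimal sketch that moves every libcuda.so* entry in one pass is given below; `sideline_libcuda` is a hypothetical helper name, and the sketch assumes the directory is writable by whoever runs it (run it as root if the directory is root-owned).

```shell
#!/usr/bin/env bash
# Hypothetical helper: move the bundled libcuda.so* files into a keepcuda
# subdirectory so they are no longer found in the Abaqus bin directory.
sideline_libcuda() {
    local dir="$1"
    local keep="$dir/keepcuda"
    mkdir -p "$keep" || return 1
    local f
    for f in "$dir"/libcuda.so*; do
        # Skip the unexpanded pattern when nothing matches
        [ -e "$f" ] || [ -L "$f" ] || continue
        mv -- "$f" "$keep/" || return 1
    done
}

# Example call (path assumed from this post, adjust to your installation):
# sideline_libcuda /usr/SIMULIA/EstProducts/2022/EstPrd/linux_a64/code/bin
```

Because all three entries end up in the same directory, relative symlinks between them (if the files are linked that way) keep working, and moving them back later restores the original state.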
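After that, running with gpus=1 should no longer produce the "Error initializing the CUDA Driver NO_DEVICE" error.
*Note: this solution assumes that CUDA is installed correctly, passes the deviceQuery test, and has no permission problems. If that is not the case, first make sure the system environment is set up properly.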
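The permission part of that note can be checked roughly as follows. This is a sketch under assumptions, not a complete diagnosis: it assumes permission problems show up as NVIDIA device nodes under /dev that the current user cannot read and write, and `check_nvidia_nodes` is a hypothetical helper name. It does not replace the deviceQuery test.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report NVIDIA device nodes in a directory that the
# current user cannot both read and write. Prints nothing when all are fine.
check_nvidia_nodes() {
    local devdir="${1:-/dev}"
    local node
    for node in "$devdir"/nvidia*; do
        [ -e "$node" ] || continue
        [ -d "$node" ] && continue   # skip subdirectories
        if [ ! -r "$node" ] || [ ! -w "$node" ]; then
            echo "no read/write access: $node"
        fi
    done
}

check_nvidia_nodes /dev

# nvidia-smi ships with the NVIDIA driver; -L lists the GPUs it can see.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi -L || echo "nvidia-smi failed; check the driver installation"
else
    echo "nvidia-smi not found; the driver may not be installed"
fi
```

If the helper prints nothing and nvidia-smi lists the GPU, the remaining suspect is the bundled libcuda described above.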
4. Workaround:
Create a subdirectory named keepcuda in the installation_dir/2022/EstPrd/linux_a64/code/bin directory.
Move the following files: libcuda.so, libcuda.so.1, and libcuda.so.418.39 to the newly created keepcuda subdirectory so that the files do not interfere with the driver installed on the system (in /usr/lib64). Note that the subdirectory name is not important.
https://www.chenyu-k.com/2023/02/03/2023-02-03-Abaqus-cuda-nodevice/