CUDA driver version is insufficient for CUDA runtime version: how to fix it

I got the message: "cutilCheckMsg() CUTIL CUDA error : kernel launch failure : CUDA driver version is insufficient for CUDA runtime version." While trying to run an example source code. A...

I got the message:

"cutilCheckMsg() CUTIL CUDA error : kernel launch failure : CUDA driver version is insufficient for CUDA runtime version."

while trying to run one of the SDK example sources. The same error also occurs for the function cutilSafeCall.

I am using:

  • Windows 7 64-bit
  • Visual Studio 2008
  • CUDA developer driver, toolkit, and SDK 3.1
  • Emulation mode

asked Jul 15, 2010 at 7:13 by superscalar

You need to ensure that your driver version matches or exceeds your CUDA Toolkit version.

For 2.3 you need a 190.x driver, for 3.0 you need 195.x and for 3.1 you need 256.x (actually anything up to the next multiple of five is ok, e.g. 258.x for 3.1).

You can check your driver version either by running the deviceQueryDrv SDK sample or by opening the NVIDIA Control Panel and choosing System Information.

Download an updated driver from www.nvidia.com/drivers.
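
To check the mismatch programmatically rather than through the control panel, a minimal sketch like the one below can help (my addition, not part of the original answer; it assumes Python 3 on Linux with libcuda.so.1 and libcudart.so resolvable by the loader; on Windows the driver library is nvcuda.dll and the runtime DLL ships with the toolkit):

import ctypes

driver = ctypes.CDLL("libcuda.so.1")    # installed by the NVIDIA driver package
runtime = ctypes.CDLL("libcudart.so")   # installed by the CUDA toolkit

drv, rt = ctypes.c_int(0), ctypes.c_int(0)
driver.cuDriverGetVersion(ctypes.byref(drv))      # returns 0 (CUDA_SUCCESS) on success
runtime.cudaRuntimeGetVersion(ctypes.byref(rt))   # returns 0 (cudaSuccess) on success

# Versions are encoded as 1000*major + 10*minor, e.g. 9020 means CUDA 9.2.
print("driver supports CUDA", drv.value, "/ runtime is CUDA", rt.value)
if drv.value < rt.value:
    print("driver is older than the runtime: the 'insufficient driver' error is expected")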

answered Jul 20, 2010 at 9:26 by Tom

I saw the same at runtime with the latest driver on Mac OS 10.6.

int device = 0;
cudaError_t error = cudaGetDevice(&device);
printf("%s\n", cudaGetErrorString(error));  // prints a human-readable description of the CUDA error

I went back to the developer site, downloaded the driver again and now it runs.
http://developer.nvidia.com/object/cuda_3_1_downloads.html#MacOS

answered Sep 2, 2010 at 20:57 by Frank

You can either download the latest driver OR use an older toolkit version to compile your code.

answered Dec 13, 2011 at 7:50 by Meghana

Counterintuitively, this error also happens if libcuda.so is not found, even when the versions reported by nvidia-smi match perfectly. This library is part of the nvidia-drivers package (on CentOS: nvidia-driver-latest-cuda-libs, on Gentoo: x11-drivers/nvidia-drivers). It is possible to have the CUDA Toolkit with nvcc and libcudart installed and building your app fine, but the driver part not installed, causing this error.

To diagnose whether this is the reason, use strace:

strace -f -e trace=file ./your_cuda_app

and check for open calls to libcuda.so*; at least one of them should return with a success code, like so:

4928  open("/lib64/libcuda.so.1", O_RDONLY|O_CLOEXEC) = 3 
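
As a quicker check along the same lines (a sketch of mine, Python 3 on Linux; the library name libcuda.so.1 is the usual driver soname and is an assumption here), you can try to load the driver library directly and see whether it is present at all:

import ctypes

try:
    libcuda = ctypes.CDLL("libcuda.so.1")
except OSError as exc:
    print("libcuda.so.1 could not be loaded -> driver libraries missing or not on the loader path")
    print(exc)
else:
    # cuInit(0) returns 0 (CUDA_SUCCESS) when the driver stack is actually usable.
    print("libcuda.so.1 loaded, cuInit returned", libcuda.cuInit(0))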

answered Apr 19, 2021 at 15:50 by alexei

My two cents:

On Linux/Unix this error may be related to the selected GPU mode (Performance/Power Saving). If you select the integrated Intel GPU (with the nvidia-settings utility) and then execute the deviceQuery sample, you get this error:

-> CUDA driver version is insufficient for CUDA runtime version

But the error is misleading: switching back to the NVIDIA GPU (Performance mode) with nvidia-settings makes the problem disappear. It is not a version problem.

Regards

P.S.: "Power Saving Mode" tells Optimus to activate the CPU-integrated Intel GPU.

answered Mar 12, 2018 at 9:28 by Fabiano Tarlao

"CUDA driver version is insufficient for CUDA runtime version" means your GPU driver cannot support the CUDA runtime API your program uses, so you need to update your driver.

answered Oct 10, 2014 at 17:02 by Dongwei Wang

In my case, I had to run my docker container with nvidia-docker run ... instead of docker run ...

answered Sep 3, 2021 at 11:35 by Kees Schollaart

I also had a similar problem. I updated my graphics driver, but the problem remained. I finally removed CUDA 9.2 and installed CUDA 8, which solved my issue.

answered Sep 13, 2018 at 23:47 by user3112759

This problem can also be caused by an incorrect environment setup, e.g. in a Docker image, even though the driver itself is correct and sufficient for your program. If your LD_LIBRARY_PATH points to the wrong driver library, it can throw this error. In my case I got the error when using /usr/local/nvidia/lib/libcuda.so; with /usr/local/nvidia/lib64/libcuda.so everything works.
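
To see which libcuda.so a process actually resolves when LD_LIBRARY_PATH is involved, a small sketch like this can help (my addition, Linux only; it relies on /proc/self/maps listing every mapped file):

import ctypes, os

print("LD_LIBRARY_PATH =", os.environ.get("LD_LIBRARY_PATH", "<unset>"))
ctypes.CDLL("libcuda.so.1")             # force the dynamic loader to resolve the driver library
with open("/proc/self/maps") as maps:   # mapped files appear here with their full paths
    paths = sorted({line.split()[-1] for line in maps if "libcuda.so" in line})
for p in paths:
    print("mapped:", p)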

answered Sep 23, 2022 at 12:26 by lenin

Maybe it is related to the TBB lib:
Error OpenCV with CUDA using TBB for multiple GPUs

Try rebuilding it, making sure you pass the following parameters to CMake (assuming you have already installed the "tbb" and "tbb-devel" packages):

-D WITH_TBB=YES -D TBB_INCLUDE_DIRS=/usr/include/tbb

answered Oct 13, 2014 at 21:22 by herrera

Status: CUDA driver version is insufficient for CUDA runtime version #21832

Issue opened by mforde84 on Aug 23, 2018 (closed, 28 comments; assigned to @azaks2).

@mforde84

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04):
    Kernel: 2.6.32-573.12.1.el6.x86_64
    Host: RHEL 6.7
    Container: Ubuntu 16.04.5 LTS

  • TensorFlow installed from (source or binary):
    Singularity

  • TensorFlow version (use command below):
    Tensorflow:1.10.0-devel-gpu-py3

  • Python version:
    Python 3.5.2

  • GCC/Compiler version (if compiling from source):
    GCC 5.4.0

  • CUDA/cuDNN version:
    9

  • GPU model and memory:
    Tesla K80, 11519 MiB (nvidia-smi reports NVIDIA-SMI 352.39, Driver Version: 352.39)

  • Exact command to reproduce:
    $ # install nvidia driver v352.39
    $ sudo singularity build --sandbox /path/to/sandbox docker://tensorflow/tensorflow:1.10.0-devel-gpu-py3
    $ singularity shell --nv /path/to/sandbox
    Singularity tensorflow:1.10.0-devel-gpu-py3:~> nvidia-smi
    Thu Aug 23 00:24:41 2018
    NVIDIA-SMI 352.39, Driver Version: 352.39
    GPU 0: Tesla K80, 0000:84:00.0, 22MiB / 11519MiB, 0% utilization, E. Process; no running processes found
Singularity tensorflow:1.10.0-devel-gpu-py3:~> python3
Python 3.5.2 (default, Nov 23 2017, 16:37:01)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.

from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
2018-08-23 00:26:35.424225: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-08-23 00:26:38.208490: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1405] Found device 0 with properties:
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 0000:84:00.0
totalMemory: 11.25GiB freeMemory: 11.16GiB
2018-08-23 00:26:38.208576: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1484] Adding visible gpu devices: 0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/device_lib.py", line 41, in list_local_devices
    for s in pywrap_tensorflow.list_devices(session_config=session_config)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 1679, in list_devices
    return ListDevices(status)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/errors_impl.py", line 519, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InternalError: cudaGetDevice() failed. Status: CUDA driver version is insufficient for CUDA runtime version

Describe the problem

I built a TensorFlow container with Singularity. I think there might be a mismatch between some of the card drivers and CUDA libraries on the host and in the container. I have the container built as a sandbox, so I'm able to make modifications quite easily. I was curious whether there's a way I can install the appropriate CUDA driver and runtime into the container and have the container run off those, instead of pulling libraries from the host that are incompatible with the container. Is this the right way to do it, or should I be updating the CUDA drivers/libraries on the host to match the container?

@tensorflowbutler

Thank you for your post. We noticed you have not filled out the following field in the issue template. Could you update them if they are relevant in your case, or leave them as N/A? Thanks.
Have I written custom code
Bazel version
Mobile device

@mforde84

Have I written custom code
N/A
Bazel version
N/A
Mobile device
N/A

@mforde84

Would https://github.com/NIH-HPC/gpu4singularity be viable for Singularity 2.6.0 with --nv flags, or would I need to make additional modifications to library paths?

@ppwwyyxx

This is not a tensorflow issue: according to https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html your nvidia driver is not new enough for cuda 9.0.

@mforde84

Sure. But the question is more on how to integrate compatible drivers into a tensorflow container. The adage about containerization is: build once, run anywhere; and not: build once, run anywhere with Nvidia drivers v485 and above plus a kernel supporting experimental filesystem overlays. Even experimental / unofficial documentation on this scenario would be extremely helpful for most HPC environments that are still running epel6. ¯_(ツ)_/¯


@ppwwyyxx

The world is not perfect. I'm afraid "build once, run anywhere with nvidia drivers >= 384.81" is the way to go. At least that's what nvidia says: https://github.com/NVIDIA/nvidia-docker/wiki/CUDA#requirements

Running a CUDA container requires a machine with at least one CUDA-capable GPU and a driver compatible with the CUDA toolkit version you are using.

@nicolefinnie

I hit exactly this problem, and someone else with the same combination (tensorflow 1.11 + CUDA runtime 9.0 + cudnn 7.3 + nvidia driver 390) hit it too, though nvidia driver 390 is new enough for CUDA runtime 9.0. This person opened an issue in the Nvidia DevTalk.

And I downgraded the tensorflow version from 1.11 (the latest conda version) to 1.7 and the problem got solved. My question is whether the newer tensorflow, say 1.10+, has a dependency on specific nvidia driver / cuda versions?

@mforde84

We upgraded to a recent version of drivers 396 and the issue resolved.

@nicolefinnie

@mforde84 Thanks for the confirmation. That’s what I was thinking too, but I had trouble upgrading to 396.54 due to a broken dependency, however, after having read your confirmation, I managed to install 396.54 and now it works with tensorflow 1.11.0, Yoho! Thanks! Upgraded the ticket in the Nvidia DevTalk.

@azaks2

tensorflow 1.11 + CUDA runtime 9.0 + cudnn 7.3 + nvidia driver 390
the combo should have worked. Note with 396.54 there will be one more upgrade once TF switches to CUDA 10.

@hello-wangjj

@nicolefinnie, thanks, I downgraded the tensorflow version to 1.7 and this problem got solved.

@saskra

I tested the recommendations in this thread, but I was not able to install any other driver than 390 on Ubuntu 18.04 and downgrading tensorflow to 1.7 resulted in a new error message:

2018-10-17 09:12:21.434933: E tensorflow/stream_executor/cuda/cuda_dnn.cc:343] Loaded runtime CuDNN library: 7.1.2 but source was compiled with: 7.2.1.  CuDNN library major and minor version needs to match or have higher minor version in case of CuDNN 7.0 or later version. If using a binary install, upgrade your CuDNN library.  If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
Segmentation fault (core dumped)

Which is strange, as I had installed version 7.3.1 on my system, but it seems that Anaconda installs its own cuDNN in the environment.
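
One way to see which cuDNN an environment actually picks up is to ask the loaded library itself; a minimal sketch (my addition; the soname libcudnn.so.7 matches the cuDNN 7.x line discussed here and is otherwise an assumption):

import ctypes

cudnn = ctypes.CDLL("libcudnn.so.7")
cudnn.cudnnGetVersion.restype = ctypes.c_size_t
v = cudnn.cudnnGetVersion()     # e.g. 7102 for cuDNN 7.1.2, 7301 for 7.3.1
print("loaded cuDNN %d.%d.%d" % (v // 1000, (v % 1000) // 100, v % 100))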

@hello-wangjj

I tested the recommendations in this thread, but I was not able to install any other driver than 390 on Ubuntu 18.04 and downgrading tensorflow to 1.7 resulted in a new error message:

2018-10-17 09:12:21.434933: E tensorflow/stream_executor/cuda/cuda_dnn.cc:343] Loaded runtime CuDNN library: 7.1.2 but source was compiled with: 7.2.1.  CuDNN library major and minor version needs to match or have higher minor version in case of CuDNN 7.0 or later version. If using a binary install, upgrade your CuDNN library.  If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
Segmentation fault (core dumped)

Which is strange, as I had installed version 7.3.1…

@saskra, I was using deepin 15.8, nvidia-driver==390.67, cuda==9.0, cudnn==7.0, and miniconda-installed tensorflow-gpu==1.7, and the problem got solved.

@mforde84

Saskra are you running in a container?

@saskra

No. But I now found the solution: Anaconda creates an environment with its own incompatible cudnn version which has to be overwritten manually. :-)

@PhilipMay

No. But I now found the solution: Anaconda creates an environment with its own incompatible cudnn version which has to be overwritten manually. :-)

I have the same problem. :-(
Which version of which exact conda module did you have to use to overwrite?

@saskra

I have Ubuntu 18.04, which needs Nvidia driver 390. Anaconda brings cuDNN 7.2.1, which seems to be too old for this driver version: https://anaconda.org/anaconda/cudnn Now I am using the newest cuDNN version (7.3.1), as suggested by the official download site: https://developer.nvidia.com/rdp/cudnn-download By the way, Anaconda's cuDNN version depends on its TensorFlow version; I have the newest one here as well (1.11).

PS: I suggested updating the version: ContinuumIO/anaconda-issues#10224

@Yongyao

@mforde84 Would you mind sharing how you upgraded it?


@Huixxi

Check whether your nvidia-driver supports your cuda version, from here: https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html

@mmattklaus

@mforde84 Would you mind sharing how you upgraded it?

As for me, upgrading my driver worked out. I run a Windows 10 PC and use TF 1.13.
(Note: just as an aside, I needed to activate my virtual environment and start Jupyter notebook in that env before I was able to use TF in the notebook.)

Here is how I upgraded my driver:

  1. Open Device Manager
  2. Expand the display adapters
  3. Locate your NVIDIA Graphics adapter
  4. Right-click and click Update driver

Alternative

  • I found this software ( GeForce Experience ) on the NVIDIA website for my graphics family which can also be downloaded, installed and used to update the driver(s). This should work as well, though I didn’t go that way.

@Huixxi

@ghost

Same issue here and I can’t find an appropriate tensorflow version. I currently have ubuntu version 16.04.6, driver version 410.78, cuda version 10, conda version 4.7.11 and none of the above-mentioned tensorflow versions works for me. I tried 1.13.1, 1.7 and 1.14.
Anaconda installs cudnn with version 7.6.0. Edit: I forced conda to use the version 10.0 for cudatoolkit and not cuda10.1_0 as it was before (according to @saskra’s suggestion), but nothing changed unfortunately.

Updating anaconda also didn't help. In fact, conda update --all and conda update conda output many new errors like:
InvalidArchiveError('Error with archive ... You probably need to delete and re-download or re-create this file. Message from libarchive was:...

Creating a conda environment with my current specs or simply running my python script also produces various InvalidArchiveError messages like above:

channels:
  - conda-forge
  - defaults
dependencies:
  - keras=2.2.4
  - nltk=3.3.0
  - numpy=1.15.4
  - pandas=0.23.4
  - python=3.6.6
  - scikit-learn=0.20.0
  - scipy=1.1.0
  - tensorflow=1.7
  - tensorflow-gpu=1.7
  - cython=0.29
  - pip:
    - fasttext==0.8.3
    - fuzzywuzzy==0.17.0
    - python-levenshtein==0.12.0
    - subsample==0.0.6
    - talos
    - tabulate==0.8.3

@agostini01

I had a similar issue using driver 384.130. It turns out that the cudatoolkit version inside the anaconda environment and the cuda version supported by my driver did not match.

These two links helped me identify my driver and cuda version and, later, install the correct version of tensorflow_gpu that matched the cuda on my machine.

To select the appropriate version based on your cuda installation:
https://www.tensorflow.org/install/source#tested_build_configurations

Version                Python version  Compiler  Build tools   cuDNN  CUDA
tensorflow_gpu-1.14.0  2.7, 3.3-3.7    GCC 4.8   Bazel 0.24.1  7.4    10.0
tensorflow_gpu-1.13.1  2.7, 3.3-3.7    GCC 4.8   Bazel 0.19.2  7.4    10.0
tensorflow_gpu-1.12.0  2.7, 3.3-3.6    GCC 4.8   Bazel 0.15.0  7      9
tensorflow_gpu-1.11.0  2.7, 3.3-3.6    GCC 4.8   Bazel 0.15.0  7      9
tensorflow_gpu-1.10.0  2.7, 3.3-3.6    GCC 4.8   Bazel 0.15.0  7      9
tensorflow_gpu-1.9.0   2.7, 3.3-3.6    GCC 4.8   Bazel 0.11.0  7      9

The cuda versions may have minor versions (9.0, 9.2), so you should double-check what exactly you are installing with conda.
To check what you have inside your conda environment and how to install a different version, see:
https://stackoverflow.com/a/55351774/2971299

So, I identified my cuda version

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176

And installed the correct anaconda environment:

conda create -n gpu tensorflow-gpu==1.9.0 jupyter
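
After creating the environment, a short sanity check inside it confirms whether the GPU is actually visible before training anything. This snippet is mine, not part of the original comment, and assumes the TF 1.x API:

import tensorflow as tf
from tensorflow.python.client import device_lib

print("built with CUDA:", tf.test.is_built_with_cuda())
print("GPU available:", tf.test.is_gpu_available())        # False often means a driver/runtime mismatch
print([d.name for d in device_lib.list_local_devices()])   # should list /device:GPU:0 next to the CPU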

@ghost

Thank you very much @agostini01. I actually have all versions aligned correctly. The only thing that actually worked was the second answer here: https://stackoverflow.com/questions/41402409/tensorflow-doesnt-seem-to-see-my-gpu
I uninstalled tensorflow and reinstalled tensorflow-gpu. Apparently they don't go well together?
Now Python sees my GPUs and when I do watch-smi I can see my job using them.

@agostini01

@KonstantinaLazaridou no problem. I believe your suggested link is for when you are installing cuda system-wide.

This line: conda create -n gpu tensorflow-gpu==1.9.0 jupyter cudatoolkit==XX should work as long as you match the anaconda tensorflow-gpu version with the correct anaconda cudatoolkit (XX) and the "system-wide installed" cuda driver. Unfortunately I don't remember what to use for the XX value anymore.

Apparently they don't go well together?

Indeed! Nice catch. The advantage of using conda is that you can have tensorflow in one environment and tensorflow-gpu in another.

@MagaretJi

@mforde84 I had a similar issue using driver 384.81, but Nvidia recommends driver 384.183 for the Tesla K80. So is upgrading to a recent driver version such as 396 a good choice?
GPU: Tesla K80
tensorflow-gpu 1.10.0
cuDNN 7.0.5
CUDA 9.0

2019-12-17 09:55:46.558571: E tensorflow/stream_executor/cuda/cuda_dnn.cc:455] could not create cudnn handle: CUDNN_STATUS_NOT_INITIALIZED
2019-12-17 09:55:46.558747: E tensorflow/stream_executor/cuda/cuda_dnn.cc:463] possibly insufficient driver version: 384.81.0
2019-12-17 09:55:46.558864: F tensorflow/core/kernels/conv_ops.cc:713] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo(), &algorithms)

@turinglife

### nvidia drivers mismatch

my nvidia driver is 384.90.

before (the same error as in the title of this thread):
tensorflow-gpu 1.15.0 with cudatoolkit 10.0.130 + cudnn 7.6.5

after (worked):
tensorflow-gpu 1.12.0 with cudatoolkit 9.0

solution:
conda uninstall cudatoolkit        # removes 10.0.130
conda install tensorflow-gpu=1.12 cudatoolkit=9.0


@shivam1702

This error also occurs if you create a symbolic link from a CUDA shared object file of one version to a shared object of a different (higher) version.

For example, for me this error was occurring because I had a symbolic link from /usr/local/cuda-10.0/lib64/libcudart.so pointing towards: /usr/local/cuda/lib64/libcudart.so.10.1, among other symlinks.

When I removed just this symlink, the error vanished. However, I noticed that there was no significant difference between the GPU and CPU training times, even though the GPU process showed up in nvidia-smi while the other obviously didn't; they were exactly the same. Weird issue.
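
A sketch for auditing this kind of mismatch: resolve every libcudart symlink under the usual install prefixes and print its real target (my addition; the search paths are assumptions, adjust them for your system):

import glob, os

patterns = ("/usr/local/cuda*/lib64/libcudart.so*",
            "/usr/lib/x86_64-linux-gnu/libcudart.so*")
for pattern in patterns:
    for link in sorted(glob.glob(pattern)):
        # realpath follows the whole symlink chain, exposing e.g. a 10.0 name that ends at a 10.1 library
        print(link, "->", os.path.realpath(link))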


[0.15.0.dev11] Error: Insufficient CUDA driver: 9010. Not bug. You must upgrade CUDA to 9.2 #1138

Ethminer from archive ethminer-0.15.0.dev11-Linux.tar.gz

On start I have the following:

nvidia-smi

nvcc --version

With ethminer versions 0.14.0 and 0.15.0.dev10 I have no problems.


try 390.59 from Nvidia? dev11 is compiled with cuda 9.2 and on Win10 works fine with latest nvidia drivers.

Oh, yes. Now I see "Travis CI: Build with g++-7, upgrade CUDA to 9.2" comments in commits.
Issue can be closed.

The explanation here seems inconsistent. I have just built v0.14.0 with CUDA 9.2 and I get the same error.

Not sure about v0.14. I know 0.15 has the CUDA 9.2 code. You also must have the Linux 390.59 drivers from Nvidia, or the latest drivers for Windows, which also include CUDA 9.2.

The drivers must match the code that ethminer uses.

Well, something does not match. I just checked out 0.15.0dev11, built it and got the same error.
I'm on Debian 9, I have the proper CUDA, the proper driver and all that jazz.

I tried the debug version but I cannot get more from the log. Any suggestions?

EDIT:
I just downloaded the released version of 0.15.0dev11 and I have the same behavior,
but it works with the released version of 0.14.0!

The problem is in the nvidia driver, not the CUDA SDK.

@celavek How did you install cuda 9.2? I downloaded the cuda toolkit deb network package (https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1604&target_type=debnetwork) but after the installation I was unable to log in. I had to run sudo apt-get purge nvidia-* to fix this problem, reinstall the nvidia-390 driver ( sudo apt-get install nvidia-390 ) and also the cuda toolkit ( sudo apt install nvidia-cuda-toolkit ).
Now if I run nvcc --version I have the 7.5.17 version.

But if I run this command I have the 9.2.88 version

Same issue here,

m 03:45:06|ethminer| ethminer 0.15.0.dev11-79+commit.83b75508
m 03:45:06|ethminer| Build: linux/release
Error: Insufficient CUDA driver: 9010

terminate called without an active exception
./miner.sh: line 40: 3437 Aborted

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Wed_Apr_11_23:16:29_CDT_2018
Cuda compilation tools, release 9.2, V9.2.88
(to get this result I had to manually add the paths:
PATH=$PATH:/usr/local/cuda-9.2/bin
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-9.2/bin/../lib:/usr/local/cuda-9.2/bin/../lib64
)

Driver Version: 390.59

Any help welcomed

How exactly do you set the path manually?

@invidtiv you may have to reinstall nvidia drivers.

@AndreaLanfranchi Thanks. Actually the issue was that apt-get did not purge with the *; I had to manually purge every nvidia package, I don't know why.
Got it working; installing with sudo apt-get install nvidia-390 simply did not work for me.
I downloaded the cuda from nvidia, cuda_9.2.88.1_linux.run, plus NVIDIA-Linux-x86_64-396.24.run,
and installed the driver from the cuda install setup.
Again, I don't know why I had these issues; they are not ethminer issues, and sudo apt-get purge nvidia* always worked for me previously.

@Scorpion2185
just write
PATH=$PATH:/usr/local/cuda-9.2/bin
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-9.2/bin/../lib:/usr/local/cuda-9.2/bin/../lib64
It will add the path. You are setting PATH=$PATH:(new path to include).

@invidtiv I wrote in the terminal:

I used sudo apt-get install cuda and again I wasn't able to log in, and I had to do all the things that I said before to fix it.


So I have a very similar question to:
What can I do against ‘CUDA driver version is insufficient for CUDA runtime version’?

When I make and run deviceQuery, I get the exact same error:

./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 35
-> CUDA driver version is insufficient for CUDA runtime version
Result = FAIL

Here’s my system:

andycui97@andycui97-Z10PE-D8-WS:~$ nvidia-settings -q NvidiaDriverVersion

  Attribute 'NvidiaDriverVersion' (andycui97-Z10PE-D8-WS:0.0): 367.35
  Attribute 'NvidiaDriverVersion' (andycui97-Z10PE-D8-WS:0[gpu:0]): 367.35

andycui97@andycui97-Z10PE-D8-WS:~$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module  367.35  Mon Jul 11 23:14:21 PDT 2016
GCC version:  gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.1)

andycui97@andycui97-Z10PE-D8-WS:~$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Wed_May__4_21:01:56_CDT_2016
Cuda compilation tools, release 8.0, V8.0.26

andycui97@andycui97-Z10PE-D8-WS:~$ nvidia-smi
Sat Jul 16 17:48:19 2016       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.35                 Driver Version: 367.35                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070    Off  | 0000:03:00.0      On |                  N/A |
| 27%   39C    P5    12W / 151W |    545MiB /  8106MiB |     31%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0       900    G   /usr/lib/xorg/Xorg                             241MiB |
|    0      1556    G   compiz                                         140MiB |
|    0      7455    G   ...s-passed-by-fd --v8-snapshot-passed-by-fd   136MiB |
|    0      9861    G   /home/andycui97/.steam/ubuntu12_32/steam        25MiB |
+-----------------------------------------------------------------------------+
​

So I have a GTX 1070, and I installed CUDA 8 RC from the runfile for Ubuntu 16.04.

If I’m not mistaken, my driver version is the absolute latest, literally released a day ago according to
http://www.nvidia.com/download/driverResults.aspx/105343/en-us, so I am confused as to how my CUDA driver version is insufficient.

Any help would be appreciated!


When running the CUDA example /usr/local/cuda/samples/1_Utilities/deviceQuery with the sudo ./deviceQuery command, the output was:

 ./deviceQuery Starting...
 CUDA Device Query (Runtime API) version (CUDART static linking)
cudaGetDeviceCount returned 35
-> CUDA driver version is insufficient for CUDA runtime version
Result = FAIL

On using the lspci -v | grep -i command I get:

NVIDIA Corporation GF117M [GeForce 610M/710M/820M / GT 620M/625M/630M/720M] (rev a1)

The lshw -c video command gives:

PCI (sysfs)  


  *-display               
       description: VGA compatible controller
       product: Haswell-ULT Integrated Graphics Controller
       vendor: Intel Corporation
       physical id: 2
       bus info: pci@0000:00:02.0
       version: 0b
       width: 64 bits
       clock: 33MHz
       capabilities: vga_controller bus_master cap_list rom
       configuration: driver=i915 latency=0
       resources: irq:63 memory:b5000000-b53fffff memory:c0000000-cfffffff     ioport:6000(size=64)
  *-display
       description: 3D controller
       product: GF117M [GeForce 610M/710M/820M / GT 620M/625M/630M/720M]
       vendor: NVIDIA Corporation
       physical id: 0
       bus info: pci@0000:09:00.0
       version: a1
       width: 64 bits
       clock: 33MHz
       capabilities: bus_master cap_list
       configuration: driver=nouveau latency=0
       resources: irq:62 memory:b3000000-b3ffffff memory:a0000000-afffffff memory:b0000000-b1ffffff ioport:3000(size=128)

So might it be that CUDA doesn't work because the i915 driver is in play instead of the nvidia one?
If so, how do I get this working?

The last guide I followed to install the nvidia drivers really messed up my system and it needed a reinstall, please suggest a guide that works well for Ubuntu 14.04.
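
One quick way to confirm the suspicion about i915/nouveau before reinstalling anything is to check whether the NVIDIA kernel module is loaded at all; a small sketch of mine (Linux only, relies on /proc):

import os

if os.path.exists("/proc/driver/nvidia/version"):
    with open("/proc/driver/nvidia/version") as f:
        print(f.read().strip())             # NVIDIA kernel module is loaded; this prints its version
else:
    with open("/proc/modules") as f:
        loaded = {line.split()[0] for line in f}
    print("NVIDIA kernel module not loaded; nouveau loaded:", "nouveau" in loaded)

If the NVIDIA module is not loaded and nouveau is, CUDA cannot work regardless of which toolkit version is installed.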

I have the following configuration:

  • SUSE Linux Enterprise Server 12 SP3 (x86_64)
  • CUDA Toolkit: CUDA 9.2 (9.2.148 Update 1)
  • CUDA Driver Version: 396.37

According to NVIDIA, this is just right (https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#major-components).

I set up a new environment with Anaconda and installed tensorflow-gpu in it:

conda create -n keras python=3.6.8 anaconda
conda install -c anaconda tensorflow-gpu

But if I then want to check the installation via python console:

import tensorflow as tf
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))

I get the following error:

2019-04-17 15:23:45.753926: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-04-17 15:23:45.793109: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2600180000 Hz
2019-04-17 15:23:45.798218: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x561f42601240 executing computations on platform Host. Devices:
2019-04-17 15:23:45.798258: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): ,
2019-04-17 15:23:45.981727: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x561f426ad9b0 executing computations on platform CUDA. Devices:
2019-04-17 15:23:45.981777: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): Tesla K40c, Compute Capability 3.5
2019-04-17 15:23:45.982175: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties:
name: Tesla K40c major: 3 minor: 5 memoryClockRate(GHz): 0.745
pciBusID: 0000:06:00.0 totalMemory: 11.17GiB freeMemory: 11.09GiB
2019-04-17 15:23:45.982206: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/fuchs/.conda/envs/keras/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1551, in __init__
    super(Session, self).__init__(target, graph, config=config)
  File "/home/fuchs/.conda/envs/keras/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 676, in __init__
    self._session = tf_session.TF_NewSessionRef(self._graph._c_graph, opts)
tensorflow.python.framework.errors_impl.InternalError: cudaGetDevice() failed. Status: CUDA driver version is insufficient for CUDA runtime version

I've been looking for solutions from others with this problem, but for most of them it was because the CUDA Toolkit and driver versions didn't match, which is not the case for me.

I’d really appreciate the help.
