RuntimeError: CUDA error: no kernel image is available for execution on the device


🐛 Bug

Hi, torch.cuda.is_available() returns True, however I cannot use CUDA tensors. I tried uninstalling and reinstalling Anaconda, the NVIDIA drivers, and cudatoolkit. The error is still the same.

To Reproduce

I did nothing special, just installed PyTorch with Anaconda and executed the following commands:

Python 3.7.5 (default, Oct 31 2019, 15:18:51) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_available()
True
>>> torch.tensor([1.0, 2.0])
tensor([1., 2.])
>>> torch.tensor([1.0, 2.0]).cuda()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:UsersbalciAnaconda3envspytorchlibsite-packagestorchtensor.py", line 130, in __repr__
    return torch._tensor_str._str(self)
  File "C:UsersbalciAnaconda3envspytorchlibsite-packagestorch_tensor_str.py", line 311, in _str
    tensor_str = _tensor_str(self, indent)
  File "C:UsersbalciAnaconda3envspytorchlibsite-packagestorch_tensor_str.py", line 209, in _tensor_str
    formatter = _Formatter(get_summarized_data(self) if summarize else self)
  File "C:UsersbalciAnaconda3envspytorchlibsite-packagestorch_tensor_str.py", line 87, in __init__
    nonzero_finite_vals = torch.masked_select(tensor_view, torch.isfinite(tensor_view) & tensor_view.ne(0))
  File "C:UsersbalciAnaconda3envspytorchlibsite-packagestorchfunctional.py", line 227, in isfinite
    return (tensor == tensor) & (tensor.abs() != inf)
RuntimeError: CUDA error: no kernel image is available for execution on the device
>>>   

Expected behavior

Run PyTorch on the GPU.

Environment

Collecting environment information...
PyTorch version: 1.3.1
Is debug build: No
CUDA used to build PyTorch: 10.1

OS: Microsoft Windows 10 Pro
GCC version: Could not collect
CMake version: Could not collect

Python version: 3.7
Is CUDA available: Yes
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: GeForce GTX 780
Nvidia driver version: 441.22
cuDNN version: Could not collect

Versions of relevant libraries:
[pip] numpy==1.17.4
[pip] torch==1.3.1
[pip] torchvision==0.4.2
[conda] blas                      1.0                         mkl  
[conda] mkl                       2019.4                      245  
[conda] mkl-service               2.3.0            py37hb782905_0  
[conda] mkl_fft                   1.0.15           py37h14836fe_0  
[conda] mkl_random                1.1.0            py37h675688f_0  
[conda] pytorch                   1.3.1           py3.7_cuda101_cudnn7_0    pytorch
[conda] torchvision               0.4.2                py37_cu101    pytorch

Process finished with exit code 0
  • PyTorch Version (e.g., 1.0): 1.3
  • OS (e.g., Linux): Windows 10 x64
  • How you installed PyTorch (conda, pip, source): conda
  • Build command you used (if compiling from source): —
  • Python version: 3.7.5 (conda)
  • CUDA/cuDNN version: cudatoolkit=10.1 (conda); I also tried the standalone CUDA Toolkit 10.1 and 10.2 installers (https://developer.nvidia.com/cuda-toolkit-archive)
  • GPU models and configuration: GTX 780
  • Any other relevant information: —

Additional context

Contents

  1. Proper CUDA Error Checking
  2. Introduction
  3. CUDA Error Types
  4. Synchronous Error vs. Asynchronous Error
  5. Sticky vs. Non-Sticky Error
  6. CUDA Error Checking Best Practice
  7. RuntimeError: CUDA error: unknown error #3790
  8. «RuntimeError: CUDA error: invalid configuration argument» when operating on some GPU tensors #48573
  9. RuntimeError: CUDA error: no kernel image is available for execution on the device #31981
Proper CUDA Error Checking

Introduction

Proper CUDA error checking is critical for smooth and successful CUDA program development. Missing or misidentifying CUDA errors can cause problems in production or waste a lot of time in debugging.

In this blog post, I would like to quickly discuss proper CUDA error checking.

CUDA Error Types

CUDA errors can be classified in two ways: as synchronous or asynchronous errors, and as sticky or non-sticky errors.

Synchronous Error vs. Asynchronous Error

CUDA kernel launches are asynchronous: when the host thread reaches the kernel launch code, say kernel<<<grid, block>>>(...), it issues a request to execute the kernel on the GPU and then continues without waiting for the kernel to complete. The kernel might not begin executing right away either.
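
This asynchrony is easy to observe from PyTorch, whose CUDA ops queue kernels the same way. A minimal sketch, assuming a CUDA-capable build and device:

import time
import torch

x = torch.randn(8192, 8192, device="cuda")

t0 = time.perf_counter()
y = x @ x                    # kernel is queued; the call returns almost immediately
t1 = time.perf_counter()
torch.cuda.synchronize()     # block the host until all queued kernels finish
t2 = time.perf_counter()

print(f"launch returned after {t1 - t0:.6f}s; kernel finished after {t2 - t0:.6f}s")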

A CUDA kernel launch can therefore produce two types of error: synchronous and asynchronous.

A synchronous error happens when the host thread knows immediately that the kernel launch is illegal or invalid. For example, when the thread block size or grid size is too large, a synchronous error results immediately after the kernel launch call, and this error can be captured by a CUDA runtime error-capturing API call, such as cudaGetLastError, placed right after the kernel launch call.

An asynchronous error happens during kernel execution or asynchronous CUDA runtime API execution on the GPU. It might take a while for the error to be encountered and sent back to the host thread. For example, the kernel might access an invalid memory address late in its execution, or an asynchronous CUDA runtime API call such as cudaMemcpyAsync might fail on the device; execution is aborted and the error is sent back to the host. Even if there is a CUDA runtime error-capturing API call such as cudaGetLastError right after the kernel launch, by the time the error reaches the host that call has already executed and found no error. It is possible to capture the asynchronous error by explicitly synchronizing with the kernel launch using CUDA runtime API calls such as cudaDeviceSynchronize, cudaStreamSynchronize, or cudaEventSynchronize, and then checking the error returned by those calls, or capturing it with CUDA runtime error-capturing API calls such as cudaGetLastError. However, explicit synchronization usually hurts performance and is therefore not recommended in production unless absolutely necessary.
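
The same behavior is visible from PyTorch. In the sketch below, an out-of-range index triggers a device-side assert during kernel execution, and the error only surfaces when the host synchronizes. Don't run this in a session you care about: the failure is sticky and corrupts the CUDA context.

import torch

emb = torch.randn(10, 4, device="cuda")
bad = torch.tensor([123456], device="cuda")   # out-of-range index

try:
    out = emb[bad]              # the launch itself usually succeeds
    torch.cuda.synchronize()    # the device-side assert is reported here
except RuntimeError as e:
    print("caught asynchronous CUDA error:", e)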

Sticky vs. Non-Sticky Error

A CUDA runtime API call returns a non-sticky error if one occurs, whereas an error during CUDA kernel execution is sticky.

A non-sticky error is recoverable, meaning subsequent CUDA runtime API calls can behave normally, because the CUDA context is not corrupted. For example, when we allocate memory using cudaMalloc, a non-sticky error is returned if GPU memory is insufficient.

A sticky error is not recoverable: subsequent CUDA runtime API calls will keep returning the same error, because the CUDA context is corrupted, and the only remedy is to terminate the host process. For example, when a kernel accesses an invalid memory address during execution, the result is a sticky error that will be captured and returned by all subsequent CUDA runtime API calls.
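
PyTorch surfaces the same distinction: a failed allocation raises a Python exception and the process can carry on, whereas an illegal memory access poisons every subsequent CUDA call. A sketch of the recoverable, non-sticky case:

import torch

try:
    huge = torch.empty(1 << 40, device="cuda")   # ~4 TiB of float32; allocation fails
except RuntimeError as e:
    print("non-sticky, recoverable:", e)

# The CUDA context is intact, so subsequent calls behave normally.
print(torch.ones(3, device="cuda") * 2)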

CUDA Error Checking Best Practice

In a CUDA program, in both development and production code, always check the return value of every CUDA runtime API call, synchronous or asynchronous, and always call a CUDA runtime error-capturing API such as cudaGetLastError after each kernel launch, to catch synchronous errors. Check for asynchronous errors in development by synchronizing and error-checking after kernel launches, and disable that synchronization in production.
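
In PyTorch, the development-time equivalent of synchronizing after every launch is the CUDA_LAUNCH_BLOCKING environment variable, which makes kernel launches synchronous so errors are reported at the offending call rather than at some later API call. A sketch; the variable must be set before CUDA is initialized:

import os
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"   # must be set before the first CUDA call

import torch
x = torch.randn(4, device="cuda")
# With blocking launches, a failing kernel raises at the line that launched it,
# at a significant performance cost, so keep this for debugging only.
y = x * 2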

There is a question on the NVIDIA developer forum; let's use it as a quiz. Basically, the user has the following code, with all calculations done on the default stream in a single thread. cudaDeviceSynchronize returns cudaSuccess, but the cudaGetLastError call returns an invalid device function error. How could this happen?

cudaGetLastError returns the last error produced by any runtime call in the same host thread and resets the error state to cudaSuccess. cudaDeviceSynchronize is a CUDA runtime API call, and it returned no error, which means the kernel launch produced no asynchronous error. However, a CUDA runtime API call made before the kernel launch, or the kernel launch itself, could have produced a synchronous error that was never checked. That last error is not reset until the cudaGetLastError call, even though other CUDA runtime API calls in between returned cudaSuccess.

Fundamentally, this happened because the program's error checking did not follow the best practice described above.


RuntimeError: CUDA error: unknown error CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1. #3790

I encounter the error below when I try to fine-tune the learner in my local Jupyter notebook setup. The details of the setup are specified below in the software details.
RuntimeError: CUDA error: unknown error CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

Please see this model example of how to fill out an issue correctly. Please try to emulate that example as appropriate when opening an issue.

Please confirm you have the latest versions of fastai, fastcore, and nbdev prior to reporting a bug (delete one): YES
=== Software ===
python : 3.9.12
fastai : 2.7.9
fastcore : 1.5.26
fastprogress : 1.0.3
torch : 1.12.1+cu102
nvidia driver : 511.65
torch cuda : 10.2 / is available
torch cudnn : 7605 / is enabled

=== Hardware ===
nvidia gpus : 1
torch devices : 1

  • gpu0 : NVIDIA GeForce GTX 1650

=== Environment ===
platform : Linux-5.10.16.3-microsoft-standard-WSL2-x86_64-with-glibc2.31
distro : #1 SMP Fri Apr 2 22:23:49 UTC 2021
conda env : /home/plaban/anaconda3
python : /home/plaban/anaconda3/bin/python
sys.path : /mnt/c/Users/nayak/Documents/boat_classification
/home/plaban/anaconda3/lib/python39.zip
/home/plaban/anaconda3/lib/python3.9
/home/plaban/anaconda3/lib/python3.9/lib-dynload

Describe the bug
RuntimeError: CUDA error

To Reproduce
Steps to reproduce the behavior:

Expected behavior
It should fine-tune the learner, but instead it throws an error.

Error with full stack trace

Place between these lines with triple backticks:

Additional context
I am running it on an Ubuntu system (WSL2 on Windows). Attachment: fastai2_error.docx



«RuntimeError: CUDA error: invalid configuration argument» when operating on some GPU tensors #48573

When calling some functions like torch::mean() on this GPU tensor, a CUDA runtime error occurs:

Here is the complete output of gdb backtrace (running with CUDA_LAUNCH_BLOCKING=1):

The code at /home/admin/fanyi/Softwares/NN_train/src_O0/struct_DP.h:232 is :

This first happened in my C++ code using PyTorch's C++ API. I saved this tensor xyz_hat using torch::save() (the attachment gpu_tensor_cpp.tar.gz), then loaded it in Python using torch.jit.load. The same error occurred when calling torch.mean():

To Reproduce

Steps to reproduce the behavior:

Expected behavior

Operations on tensor a, like torch.sum(a), should return the same result as on tensor a2, where a2 is just a clone of a, as described above.

Environment

Please copy and paste the output from our
environment collection script
(or fill out the checklist below manually).

PyTorch version: 1.7.0+cu110
Is debug build: True
CUDA used to build PyTorch: 11.0
ROCM used to build PyTorch: N/A

OS: CentOS Linux 8 (Core) (x86_64)
GCC version: (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5)
Clang version: Could not collect
CMake version: version 3.11.4

Python version: 3.6 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration:
GPU 0: A100-SXM4-40GB
GPU 1: A100-SXM4-40GB
GPU 2: A100-SXM4-40GB
GPU 3: A100-SXM4-40GB
GPU 4: A100-SXM4-40GB
GPU 5: A100-SXM4-40GB
GPU 6: A100-SXM4-40GB
GPU 7: A100-SXM4-40GB

Nvidia driver version: 450.80.02
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.19.4
[pip3] torch==1.7.0+cu110
[conda] Could not collect

Additional context

This error can be overcome by calling .to("cpu") first or simply making a clone() of the problematic tensor. But I would still like to understand what actually triggered it. Maybe the data stored in the problematic tensor is broken?
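
A minimal sketch of that workaround, using a stand-in tensor a in place of the problematic one loaded from the archive:

import torch

a = torch.randn(100, 3, device="cuda")   # stand-in for the problematic tensor

a2 = a.clone()                # fresh, contiguous copy on the same device
print(torch.mean(a2))         # reported to work

b = a.to("cpu").to("cuda")    # alternative: round-trip through host memory
print(torch.mean(b))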

I tried running the same C++ code on another system with PyTorch 1.4.0 + CUDA 10.1 installed using pip, and found everything works fine. Here is the environment for that system:
PyTorch version: 1.4.0
Is debug build: False
CUDA used to build PyTorch: 10.1
ROCM used to build PyTorch: N/A

OS: CentOS Linux release 7.7.1908 (Core) (x86_64)
GCC version: (GCC) 7.5.0
Clang version: Could not collect
CMake version: version 2.8.12.2

Python version: 3.7 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: 10.2.89
GPU models and configuration:
GPU 0: Tesla V100-PCIE-32GB
GPU 1: Tesla V100-PCIE-32GB

Nvidia driver version: 450.51.06
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.19.4
[pip3] numpydoc==0.9.1
[pip3] torch==1.4.0
[conda] _pytorch_select 0.2 gpu_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
[conda] blas 1.0 mkl
[conda] cudatoolkit 10.0.130 0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
[conda] mkl 2019.4 243
[conda] mkl-service 2.3.0 py37he904b0f_0
[conda] mkl_fft 1.0.14 py37ha843d7b_0
[conda] mkl_random 1.1.0 py37hd6b4f25_0
[conda] numpy 1.17.2 py37haad9e8e_0
[conda] numpy-base 1.17.2 py37hde5b4d6_0
[conda] numpydoc 0.9.1 py_0
[conda] pytorch 1.3.1 cuda100py37h53c1284_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main



RuntimeError: CUDA error: no kernel image is available for execution on the device #31981

❓ CUDA error with pytorch

Installed pytorch with conda and CUDA support

The error occurs when I am trying to run a tensor on the cuda gpu

>>> torch.rand(10).to(torch.device('cuda'))

RuntimeError                              Traceback (most recent call last)

/anaconda3/envs/deep/lib/python3.7/site-packages/IPython/core/formatters.py in __call__(self, obj)
    700                 type_pprinters=self.type_printers,
    701                 deferred_pprinters=self.deferred_printers)
--> 702             printer.pretty(obj)
    703             printer.flush()
    704             return stream.getvalue()

/anaconda3/envs/deep/lib/python3.7/site-packages/IPython/lib/pretty.py in pretty(self, obj)
    397             if cls is not object \
    398                     and callable(cls.__dict__.get('__repr__')):
--> 399                 return _repr_pprint(obj, self, cycle)
    400
    401         return _default_pprint(obj, self, cycle)

/anaconda3/envs/deep/lib/python3.7/site-packages/IPython/lib/pretty.py in _repr_pprint(obj, p, cycle)
    687     """A pprint that just redirects to the normal repr function."""
    688     # Find newlines and replace them with p.break_()
--> 689     output = repr(obj)
    690     for idx, output_line in enumerate(output.splitlines()):
    691         if idx:

/anaconda3/envs/deep/lib/python3.7/site-packages/torch/tensor.py in __repr__(self)
    128         # characters to replace unicode characters with.
    129         if sys.version_info > (3,):
--> 130             return torch._tensor_str._str(self)
    131         else:
    132             if hasattr(sys.stdout, 'encoding'):

/anaconda3/envs/deep/lib/python3.7/site-packages/torch/_tensor_str.py in _str(self)
    309         tensor_str = _tensor_str(self.to_dense(), indent)
    310     else:
--> 311         tensor_str = _tensor_str(self, indent)
    312
    313     if self.layout != torch.strided:

/anaconda3/envs/deep/lib/python3.7/site-packages/torch/_tensor_str.py in _tensor_str(self, indent)
    207     if self.dtype is torch.float16 or self.dtype is torch.bfloat16:
    208         self = self.float()
--> 209     formatter = _Formatter(get_summarized_data(self) if summarize else self)
    210     return _tensor_str_with_formatter(self, indent, formatter, summarize)
    211

/anaconda3/envs/deep/lib/python3.7/site-packages/torch/_tensor_str.py in __init__(self, tensor)
     85
     86         else:
---> 87             nonzero_finite_vals = torch.masked_select(tensor_view, torch.isfinite(tensor_view) & tensor_view.ne(0))
     88
     89         if nonzero_finite_vals.numel() == 0:

/anaconda3/envs/deep/lib/python3.7/site-packages/torch/functional.py in isfinite(tensor)
    225     if not tensor.is_floating_point():
    226         return torch.ones_like(tensor, dtype=torch.bool)
--> 227     return (tensor == tensor) & (tensor.abs() != inf)
    228
    229

RuntimeError: CUDA error: no kernel image is available for execution on the device

This is the collect_env.py output:

Collecting environment information...
PyTorch version: 1.3.1
Is debug build: No
CUDA used to build PyTorch: 10.0.130

OS: Ubuntu 16.04.6 LTS
GCC version: (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
CMake version: version 3.5.1

Python version: 3.7
Is CUDA available: Yes
CUDA runtime version: 10.0.130
GPU models and configuration: GPU 0: GeForce 920M
Nvidia driver version: 415.27
cuDNN version: Could not collect

Versions of relevant libraries:
[pip] numpy==1.17.4
[pip] numpy-stl==2.10.1
[pip] torch==1.3.1
[pip] torchvision==0.4.2
[conda] blas 1.0 mkl
[conda] mkl 2019.4 243
[conda] mkl-service 2.3.0 py37he904b0f_0
[conda] mkl_fft 1.0.15 py37ha843d7b_0
[conda] mkl_random 1.1.0 py37hd6b4f25_0
[conda] pytorch 1.3.1 py3.7_cuda10.0.130_cudnn7.6.3_0 pytorch
[conda] torchvision 0.4.2 py37_cu100 pytorch

Some have stated that TORCH_CUDA_ARCH_LIST needs to be manually set to '3.5', but I can't figure out how.
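
For reference, TORCH_CUDA_ARCH_LIST is an environment variable read by PyTorch's from-source build, not a Python setting. A hypothetical sketch of driving the build with it set, to be run from a PyTorch source checkout:

import os
import subprocess

# Restrict the compute capabilities nvcc targets to sm_35 (the GPU's arch).
env = dict(os.environ, TORCH_CUDA_ARCH_LIST="3.5")
subprocess.run(["python", "setup.py", "install"], env=env, check=True)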


#31981 (comment)
Hi — I am having some similar issues: RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

My GPU is a GT 730 and I have CUDA 11.4 installed. I have built PyTorch from source, but it is still not working.

@rajeshroy402 There are two variants for GT 730. As listed in https://developer.nvidia.com/cuda-gpus, one has arch 3.5 while the other has arch 2.1. The latter one is apparently unsupported.
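
A quick way to check which variant is installed, assuming torch imports successfully:

import torch

# (3, 5) is the Kepler variant, supported by older builds;
# (2, 1) is the Fermi variant, which no PyTorch binary supports.
print(torch.cuda.get_device_name(0))
print(torch.cuda.get_device_capability(0))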

I will share the specs.
I have NVIDIA driver 470.x.x with CUDA 11.4.x installed on my Ubuntu 20.04 LTS system.
I didn't find conda install -c pytorch magma-cuda114, so I went with conda install -c pytorch magma-cuda112.

The rest of the installation was smooth.
Still, if I try to run something, I get the below-quoted errors:

How did you do the build?

I followed this document — https://github.com/pytorch/pytorch/#from-source — and built it.
I installed CUDA 11.4 and cuDNN for CUDA 11.x or higher, then installed the required packages for Ubuntu through Anaconda. Later I cloned the repo and followed the commands up to setup.py.

@rajeshroy402 You mean that you built against the master branch of PyTorch? I don’t think it is supported now.

Yes, I used the master branch.
Do you suggest following this: git checkout v1.3.1?

@rajeshroy402 Maybe you could try this one first. But v1.3.1 only supports CUDA 10.1. BTW, I don't know the last version that supports arch 3.5.

If it only supports CUDA 10.1, then with my CUDA 11.4 it should throw errors? What type of errors would those be?
As per Wikipedia, the supported CUDA versions for the Kepler architecture are shown below.

I tried the commands but I’m getting errors —


1. Computer Configuration

GPU: RTX 3080 (compute capability 8.6)
CUDA: 11.1
cuDNN: 8.2.0
conda: 4.9.2
Python: 3.8.5

2. Description of the problem

First, on the PyTorch website, I used the pip command matching my computer configuration:

pip3 install torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html

This installs torch, torchaudio, and torchvision with the following versions:

torch==1.8.1+cu111
torchaudio==0.8.1
torchvision==0.9.1+cu111

I neglected to test whether CUDA was available at that point! After installing, you must first test whether CUDA is available in the current environment before using it in your own scripts; otherwise you won't know where the problem came from later.
——————
I ran the first example code given by mmdetection directly, with the following error:

libtorch_cuda_cu.so: cannot open shared object file: No such file or directory

Online search results said the PyTorch version might be too high, but it was foolish not to check whether CUDA was available at the time!
I uninstalled torch directly; since it was installed with pip, the uninstall command is as follows:

pip uninstall torch
# Prompts to uninstall torch 1.8.1+cu111
# I confirmed; it looks like this command uninstalled the cudatoolkit as well

So I reinstalled torch 1.8.0 and cudatoolkit 11.1:

pip install torch==1.8.0 cudatoolkit==11.1

After successful installation, test the availability of cuda with the following code:

import torch
torch.cuda.is_available()
# Return True and test further with the following code
torch.zeros(1).cuda()

It errors on startup as follows:

GeForce RTX 3080 with CUDA capability sm_86 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
If you want to use the GeForce RTX 3080 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/
# The following errors were also reported
RuntimeError: CUDA error: no kernel image is available for execution on the device

This roughly means that the GPU's compute capability does not match the compute capabilities supported by the current PyTorch build's CUDA kernels (the 3080's compute capability is 8.6, whereas this PyTorch build only supports 3.7, 5.0, 6.0, and 7.0).
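
Recent PyTorch versions let you compare the GPU's compute capability against the kernel images compiled into the installed build. A rough sketch (PTX forward compatibility can still cover some GPUs this check rejects):

import torch

major, minor = torch.cuda.get_device_capability(0)
arch = f"sm_{major}{minor}"                      # RTX 3080 -> "sm_86"
built_for = torch.cuda.get_arch_list()           # e.g. ['sm_37', 'sm_50', 'sm_60', 'sm_70']

print(arch, "vs", built_for)
if arch not in built_for:
    print("this build has no kernel image for this GPU")
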
——————
Reference: GPUs with a lower compute capability can run under builds targeting relatively high CUDA versions. For example, a GPU with compute capability 8.0 can run under a CUDA build that supports up to 8.6, but not vice versa: a card with compute capability 8.x cannot run on a CUDA build whose highest supported compute capability is 7.x.
So you should try to install a higher version of CUDA. But my question was: I had installed almost the latest version, and it still wasn't supported?
——————
So I went online to collect information on which torch and CUDA versions match.
——————
Next step: a full uninstall and reinstall.

conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge

After installation, the versions are as follows:

pytorch==1.8.0       py3.8_cuda11.1_cudnn8.0.5_0
torchaudio==0.8.0 
torchvision==0.9.0

Run the test again to see whether CUDA is available.

The test was successful! CUDA is ready!

Summary: only CUDA 11.1 supports a compute capability 8.6 GPU, so don't make this mistake! Of course, if you have a lower version of PyTorch, you should also use a correspondingly lower version of cudatoolkit; I haven't tried the specifics. Try creating a new environment sometime! Also, you can look up your GPU's compute capability on NVIDIA's site (https://developer.nvidia.com/cuda-gpus).
