RuntimeError: CUDA error: an illegal memory access was encountered

Hi everyone!
I ran into a strange illegal memory access error. It happens at random, with no regular pattern.
The code is really simple: it is PointNet for point cloud segmentation. I don't think there is anything wrong with the code.

import torch
import torch.nn as nn
import torch.nn.functional as F
import os
class InstanceSeg(nn.Module):
    def __init__(self, num_points=1024):
        super(InstanceSeg, self).__init__()

        self.num_points = num_points

        self.conv1 = nn.Conv1d(9, 64, 1)
        self.conv2 = nn.Conv1d(64, 64, 1)
        self.conv3 = nn.Conv1d(64, 64, 1)
        self.conv4 = nn.Conv1d(64, 128, 1)
        self.conv5 = nn.Conv1d(128, 1024, 1)
        self.conv6 = nn.Conv1d(1088, 512, 1)
        self.conv7 = nn.Conv1d(512, 256, 1)
        self.conv8 = nn.Conv1d(256, 128, 1)
        self.conv9 = nn.Conv1d(128, 128, 1)
        self.conv10 = nn.Conv1d(128, 2, 1)
        self.max_pool = nn.MaxPool1d(num_points)

    def forward(self, x):
        batch_size = x.size()[0] # (x has shape (batch_size, 9, num_points))

        out = F.relu(self.conv1(x)) # (shape: (batch_size, 64, num_points))
        out = F.relu(self.conv2(out)) # (shape: (batch_size, 64, num_points))
        point_features = out

        out = F.relu(self.conv3(out)) # (shape: (batch_size, 64, num_points))
        out = F.relu(self.conv4(out)) # (shape: (batch_size, 128, num_points))
        out = F.relu(self.conv5(out)) # (shape: (batch_size, 1024, num_points))
        global_feature = self.max_pool(out) # (shape: (batch_size, 1024, 1))

        global_feature_repeated = global_feature.repeat(1, 1, self.num_points) # (shape: (batch_size, 1024, num_points))
        out = torch.cat([global_feature_repeated, point_features], 1) # (shape: (batch_size, 1024+64=1088, num_points))

        out = F.relu(self.conv6(out)) # (shape: (batch_size, 512, num_points))
        out = F.relu(self.conv7(out)) # (shape: (batch_size, 256, num_points))
        out = F.relu(self.conv8(out)) # (shape: (batch_size, 128, num_points))
        out = F.relu(self.conv9(out)) # (shape: (batch_size, 128, num_points))

        out = self.conv10(out) # (shape: (batch_size, 2, num_points))

        out = out.transpose(2,1).contiguous() # (shape: (batch_size, num_points, 2))
        out = F.log_softmax(out.view(-1, 2), dim=1) # (shape: (batch_size*num_points, 2))
        out = out.view(batch_size, self.num_points, 2) # (shape: (batch_size, num_points, 2))

        return out

Num = 0
network = InstanceSeg()
network.cuda()
while(1):

    input0 = torch.randn(32, 3, 1024).cuda()
    input1 = torch.randn(32, 3, 1024).cuda()
    input2 = torch.randn(32, 3, 1024).cuda()
    input = torch.cat((input0, input1, input2), 1)

    out = network(input)
    Num = Num+1
    print(Num)

After a random number of steps, the error is raised. The error report is:

Traceback (most recent call last):
  File "/home/wangye/Frustum-PointNet_Test/frustum_pointnet.py", line 58, in <module>
    input0 = torch.randn(32, 3, 1024).cuda()
RuntimeError: CUDA error: an illegal memory access was encountered

When I added os.environ['CUDA_LAUNCH_BLOCKING'] = '1' at the top of this script, the error report changed to this:

Traceback (most recent call last):
  File "/home/wangye/Frustum-PointNet_Test/frustum_pointnet.py", line 64, in <module>
    out = network(input)
  File "/home/wangye/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/wangye/Frustum-PointNet_Test/frustum_pointnet.py", line 35, in forward
    out = F.relu(self.conv5(out)) # (shape: (batch_size, 1024, num_points))
  File "/home/wangye/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/wangye/anaconda3/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 187, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED
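
For reference, the variable is only reliably picked up if it is set before CUDA is initialized, so the safest place is before torch is imported. A minimal sketch of that ordering (the helper name cuda_launch_blocking_enabled is hypothetical and only there for illustration):

```python
import os

# Set this before the first CUDA call; the safest place is before
# "import torch", so the CUDA runtime sees it when it initializes.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"


def cuda_launch_blocking_enabled() -> bool:
    """Return True if synchronous kernel launches were requested."""
    return os.environ.get("CUDA_LAUNCH_BLOCKING") == "1"


print("CUDA_LAUNCH_BLOCKING enabled:", cuda_launch_blocking_enabled())

# import torch  # import torch only after the variable has been set
```

With synchronous launches the traceback should point at the kernel that actually failed, which is why the second report names conv5 instead of torch.randn.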

I know that incorrect indexing operations and incorrect use of loss functions can lead to an illegal memory access error, but this script contains no such operations.
I am quite sure this error is not caused by running out of memory, since only about 2 GB of GPU memory is used and I have 12 GB in total.

This is my environment information:

OS: Ubuntu 16.04 LTS 64-bit
Command: conda install pytorch torchvision cudatoolkit=9.0 -c pytorch
GPU: Titan XP
Driver Version: 410.93
Python Version: 3.6
cuda Version: cuda_9.0.176_384.81_linux
cudnn Version: cudnn-9.0-linux-x64-v7.4.2.24
pytorch Version: pytorch-1.0.1-py3.6_cuda9.0.176_cudnn7.4.2_2
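
When comparing setups like the one listed above, it can help to print the versions that PyTorch itself reports rather than the ones that were installed by hand. A small sketch, assuming nothing beyond the standard torch attributes; the function name collect_env is made up for this example:

```python
import importlib
import platform


def collect_env():
    """Gather the version information PyTorch reports about itself."""
    info = {"python": platform.python_version()}
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        info["torch"] = None  # PyTorch is not installed in this interpreter
        return info
    info["torch"] = torch.__version__
    info["cuda_build"] = torch.version.cuda  # None for CPU-only builds
    info["cudnn"] = torch.backends.cudnn.version()  # None if cuDNN is unavailable
    info["cuda_available"] = torch.cuda.is_available()
    if info["cuda_available"]:
        info["gpu"] = torch.cuda.get_device_name(0)
    return info


info = collect_env()
for key, value in info.items():
    print(f"{key}: {value}")
```

PyTorch also ships a more complete report that can be run with "python -m torch.utils.collect_env".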

I have been stuck here for a long time.
In fact, this is not the only project that hits this error; many other projects hit similar errors on my computer.
I don't think there is anything wrong with the code, since it runs correctly for some steps. Maybe the error is caused by the environment. I am not sure.
Does anyone have any idea about this situation? If more detailed information is needed, please let me know.
Thanks for any suggestions.

I am relatively new to using CUDA. I keep getting the following error after a seemingly random period of time:
RuntimeError: CUDA error: an illegal memory access was encountered

I have seen people suggest things such as using cuda.set_device() rather than cuda.device(), or setting torch.backends.cudnn.benchmark = False, but I can't seem to get the error to go away. Here are some pieces of my code:
torch.cuda.set_device(torch.device('cuda:0'))
torch.backends.cudnn.benchmark = False

class LSTM(nn.Module):
    def __init__(self, input_dim, hidden_dim, num_layers, output_dim):
        super(LSTM, self).__init__()
        self.hidden_dim = hidden_dim
        self.num_layers = num_layers

        self.lstm = nn.LSTM(input_dim, hidden_dim, num_layers, batch_first=True, dropout=0.2)

        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_dim).requires_grad_().cuda()
        c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_dim).requires_grad_().cuda()

        out, (hn, cn) = self.lstm(x, (h0.detach(), c0.detach()))

        out = self.fc(out[:, -1, :]) 
        
        return out

    def pred(self, x):
        return self(x) > 0

def train(model, loss_fn, optimizer, num_epochs, x_train, y_train, x_val, y_val, loss_stop=60):
    cur_best_loss = 999
    loss_recur_count = 0
    best_model = None
    for t in range(num_epochs):
        model.train()

        y_train_pred = model(x_train)

        train_loss = loss_fn(y_train_pred, y_train)

        tr_l = train_loss.item()
        
        optimizer.zero_grad()

        train_loss.backward()
        optimizer.step()

        model.eval()
        with torch.no_grad():  
            y_val_pred = model(x_val)

            val_loss = loss_fn(y_val_pred, y_val)

            va_l = val_loss.item()
            
            if va_l < cur_best_loss:
                cur_best_loss = va_l
                best_model = model
                loss_recur_count = 0
            else:
                loss_recur_count += 1

        if loss_recur_count == loss_stop:
            break
    if best_model is None:
        print("model is None.")
    return best_model
def lstm_test(cols, df, test_percent, test_bal, initial_shares_test, max_price, last_sell_day):
    wdw = 20
    x_train, y_train, x_test, y_test, x_val, y_val = load_data(df, wdw, test_percent, cols)

    x_train = torch.from_numpy(x_train).type(torch.Tensor).cuda()
    x_test = torch.from_numpy(x_test).type(torch.Tensor).cuda()
    x_val = torch.from_numpy(x_val).type(torch.Tensor).cuda()
    y_train = torch.from_numpy(y_train).type(torch.Tensor).cuda()
    y_test = torch.from_numpy(y_test).type(torch.Tensor).cuda()
    y_val = torch.from_numpy(y_val).type(torch.Tensor).cuda()

    input_dim = x_train.shape[-1]
    hidden_dim = 32
    num_layers = 2
    output_dim = 1
    y_preds_dict = {}
    for i in range(11):
        model = LSTM(input_dim=input_dim, hidden_dim=hidden_dim, output_dim=output_dim, num_layers=num_layers).cuda()

        r = (y_train.cpu().shape[0] - np.count_nonzero(y_train.cpu()))/np.count_nonzero(y_train.cpu())/2
        pos_w = torch.tensor([r]).cuda()

        loss_fn = torch.nn.BCEWithLogitsLoss(pos_weight=pos_w).cuda()

        optimizer = torch.optim.AdamW(model.parameters(), lr=0.01)

        best_model = train(model, loss_fn, optimizer, 300, x_train, y_train, x_val, y_val)
        
        y_test_pred = get_predictions(best_model, x_test)
        y_preds_dict[i] = y_test_pred.cpu().detach().numpy().flatten()

And here is the error message:

<ipython-input-5-c52edc2c0508> in train(model, loss_fn, optimizer, num_epochs, x_train, y_train, x_val, y_val, loss_stop)
     19         model.eval()
     20         with torch.no_grad():
---> 21             y_val_pred = model(x_val)
     22 
     23             val_loss = loss_fn(y_val_pred, y_val)

~\anaconda3\lib\site-packages\torch\nn\modules\module.py in _call_impl(self, *input, **kwargs)
   1049         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051             return forward_call(*input, **kwargs)
   1052         # Do not call functions when jit is used
   1053         full_backward_hooks, non_full_backward_hooks = [], []

<ipython-input-4-9da8c811c037> in forward(self, x)
     10 
     11     def forward(self, x):
---> 12         h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_dim).requires_grad_().cuda()
     13         c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_dim).requires_grad_().cuda()
     14 

RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

Python runtime error: RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR

Solution:

Add:

torch.cuda.set_device(0)

Fixing a NaN loss when training an RNN

(1) If the cause is exploding gradients, it can be solved by gradient clipping

GRAD_CLIP = 5
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), GRAD_CLIP)
optimizer.step()

(2) Run testing and evaluation inside a no_grad block

with torch.no_grad():

(3) Lower the learning rate

RuntimeError: Expected object of device type cuda but got device type cpu for argument #1 'self' in call to _th_addmm

There are three places in the code where something must be moved to CUDA:

Whether the model is placed on CUDA: model = model.to(device)

Whether the input data is placed on CUDA: data = data.to(device)

Whether new tensors created inside the model are placed on CUDA: p = torch.Tensor([1]).to(device)

For the first point, model = model.to(device) only moves layers that were instantiated in the model's __init__(). A layer that is instantiated inside forward and used directly is not moved to CUDA.
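
The three checks can be sketched together as follows. This is only an illustration: TinyNet and its bias tensor are hypothetical, and the script falls back to the CPU when no GPU is available so that it runs anywhere.

```python
import torch
import torch.nn as nn

# Fall back to the CPU so the sketch also runs on machines without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


class TinyNet(nn.Module):  # hypothetical model, for illustration only
    def __init__(self):
        super().__init__()
        # Registered in __init__, so model.to(device) moves its weights.
        self.linear = nn.Linear(10, 2)

    def forward(self, x):
        # A tensor created inside forward is placed on the input's device
        # explicitly; it would otherwise default to the CPU.
        bias = torch.ones(2, device=x.device)
        return self.linear(x) + bias


model = TinyNet().to(device)         # 1) the model
data = torch.rand(1, 10).to(device)  # 2) the input data
out = model(data)                    # 3) new tensors follow x.device
print(out.device)
```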

Here is an example of code that triggers the error:

import torch
import torch.nn as nn


data = torch.rand(1, 10).cuda()


class TestMoule(nn.Module):
    def __init__(self):
        super(TestMoule, self).__init__()
        # self.linear = torch.nn.Linear(10, 2)

    def forward(self, x):
        # return self.linear(x)
        return torch.nn.Linear(10, 2)(x)


model = TestMoule()
model = model.cuda()

print(model(data))

RuntimeError: CUDA error: an illegal memory access was encountered

One cause of the error above is passing GPU data to a function from the nn module that is still on the CPU, as in the following code:

import torch

data = torch.randn(1, 10).cuda()

layernorm = torch.nn.LayerNorm(10)
# layernorm = torch.nn.LayerNorm(10).cuda()

re_data = layernorm(data)
print(re_data)

RuntimeError: CUDA error: device-side assert triggered

The classification targets do not correspond one-to-one with the class indices of the model's softmax output.

The targets take values 1-3, but the softmax output covers indices 0-2, so the error above is raised.

df = pd.read_csv('data/reviews.csv')

def to_sentiment(score):
    score = int(score)
    if score <= 2:
        return 0
    elif score == 3:
        return 1
    else:
        return 2

df['sentiment'] = df.score.apply(to_sentiment)
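
A quick range check before the loss is computed can catch this on the CPU, where the message is much clearer than a device-side assert. A minimal sketch; both helper names are hypothetical, and the inputs are assumed to be any iterable of integer labels (a Python list here):

```python
def check_class_targets(targets, num_classes):
    """Raise ValueError if any label falls outside 0 .. num_classes - 1."""
    bad = sorted({int(t) for t in targets if not 0 <= int(t) < num_classes})
    if bad:
        raise ValueError(f"targets outside [0, {num_classes - 1}]: {bad}")


def shift_one_based(targets):
    """Map labels 1..K to the 0..K-1 range that the loss expects."""
    return [int(t) - 1 for t in targets]


raw = [1, 2, 3, 3, 1]            # labels as stored, values 1-3
labels = shift_one_based(raw)    # values 0-2
check_class_targets(labels, num_classes=3)
print(labels)  # [0, 1, 2, 2, 0]
```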

  • Mark

    • Kaiten10

      I had the same issue, I fixed it by installing Python 3.9 instead of 3.10. It seems PyTorch doesn’t work with 3.10 yet

  • Paul Koen

    download 404
    you are no longer hosting the project download; common sense workarounds such as removing the filename from the download url and attempting to visit the index page for the hoster – tried and failed.

    thanks.

  • Kanumann

    hi there, thank you for your effort.
    my problem with your approach is:
    - use_cpu = False results in the harsh realization that my GPU is not good enough to run even a 512×512 image (Nvidia 1070 Ti),
    but setting use_cpu = True results in an error that I cannot fix with my limited knowledge of torch.

    Before we go into detail, just a general question: is the CPU mode running for you? Or is this a problem on my side? (Ubuntu machine)

    Very

    • I need the error it gives you, otherwise it’s hard to fix. I don’t run on CPU, it takes a VERY long time, but if editing the use_cpu=False line (226) in the code to use_cpu=True doesn’t fix it, you will need to find line 1032 and edit the whole line to “fp16 = False” and that should fix everything.

    • With cpu only i can render, but the gpu i have no idea what torch is and how to make it work, i get this error only now, any help is appreciated.
      AssertionError: Torch not compiled with CUDA enabled

        • Hi There

          I am getting this Error “AssertionError: Torch not compiled with CUDA enabled”

          I have followed the instruction on Start Locally Pytorch page. The ironic thing is I did a reinstall of windows 10 today as I need to adjust how I was using my system so a clean install was the way to go. but yesterday on the “old” install everything was working perfectly.

          System is a i9, RTX 3090.
          running CUDA 11.6
          Python 3.9

          I have uninstalled PyTorch, purged the cache and done reinstalls. but to no avail, though really its likely I am doing something dumb.

          • Thanks so far for your help, I am still getting this.
            AssertionError: Torch not compiled with CUDA enabled

            Tried this to force reinstall (WSC didn’t like your version)
            pip install --force-reinstall torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113

            Did an entirely clean build of the PC (it's the weekend and I had nothing better to do 🙂 ). In the order I installed things:

            Installed Windows 10 Home Edition, then did all the updates to 21H2
            Installed Visual Studio
            Installed Nvidia Cuda SDK 11.3
            Installed Python 3.9

            Installed Visual Studio Code:
            Opened as Admin
            Loaded main.py
            python3 -m venv venv
            had a permission issue even though this account is an admin; checked with “Get-ExecutionPolicy” and changed it with “Set-ExecutionPolicy RemoteSigned”

            from cmd:
            python -m pip install --upgrade pip

            python3 -m pip install lpips
            python3 -m pip install ftfy
            python3 -m pip install regex
            python3 -m pip install matplotlib
            python3 -m pip install ipywidgets

            Python3 -m pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113

            Ran Python3 main.py errored out “AssertionError: Torch not compiled with CUDA enabled”

            Used the force reinstall
            pip install --force-reinstall torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113

            Ran again the Python3 main.py and it still errored out “AssertionError: Torch not compiled with CUDA enabled”

            I know your setup really works 🙂 just going a bit mental that it's doing this.

          • Other than that, I actually don’t know; hopefully someone else can help. I’d also suggest asking the PyTorch subreddit.

          • A possible solution

            AssertionError: Torch not compiled with CUDA enabled

            Installed CUDA 11.3 and still got the issue, but also had another error: “WARNING: Ignoring invalid distribution -orch”

            The solution to that one seems to have had a knock-on effect on the CUDA issue:

            Check in “venv/Lib/site-packages” for any folder whose name starts with “~” (like “~Whatever”) and delete it.

            Re-run parts 8, 9 and 10 and you should, in theory, be up and running.

            Thanks Eliso 🙂 Your help, patience and excellent install are just brilliant 🙂
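
The folder check described in this comment can be scripted. A rough sketch, assuming the leftovers are entries whose names begin with “~” (which is what typically produces the “Ignoring invalid distribution -orch” warning); find_leftover_dirs is a made-up helper name, and it only lists entries so they can be inspected before anything is deleted:

```python
import sysconfig
from pathlib import Path


def find_leftover_dirs(site_packages=None):
    """Return entries in site-packages whose names start with '~'.

    If no path is given, the site-packages directory of the interpreter
    running this script is used (run it inside the venv to check the venv).
    """
    root = Path(site_packages) if site_packages else Path(sysconfig.get_paths()["purelib"])
    return sorted(p for p in root.iterdir() if p.name.startswith("~"))


# Example usage (lists only, deletes nothing):
# for path in find_leftover_dirs():
#     print("leftover:", path)
```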

  • Hi,
    I'm new to Python and VS and all, and I'm getting an error where line 164 gives me: ModuleNotFoundError: No module named ‘timm’
    Where do I have to place the main folder to begin with (which directory)? I just placed all the contents of the main folder inside the C:\Users\MyName folder and installed everything somehow.
    I’m using Windows 7 with Python 3.7.7.
    “python3 main.py” does not work for me, so I tried “python main.py” and things worked.
    I would be grateful if you answer my questions,
    thank you!

    • Type “python3 -m pip install timm” in the terminal.

      • Thanks for the reply eliso,
        The term ‘python3’ is not recognized as the name of a cmdlet and so on….
        That is why I have been typing “python” only. Win7 won't see python3, and I did have the variable inserted into the Path section of the Windows variables: “;C:\Program Files\Python37;C:\Program Files\Python37\Scripts”

  • Debugger said: line 191- Exception has occurred: ModuleNotFoundError
    No module named ‘timm’
    File “C:\Users\Andi\main.py”, line 191, in <module>
    import timm

  • I think i got it, i need to install
    pip install lpips
    pip install ipywidgets
    pip install pips
    pip install matplotlib
    and god knows what else..i thought they were all installed and ready to use but nope
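
To see in one go which of the modules mentioned in this thread are still missing, something like the following may help. It is only a sketch: the PIP_NAMES table is assembled from the package names that appear in these comments and is not the project's official requirements list.

```python
import importlib.util

# Import name -> name to pass to "pip install". The two differ for some
# packages (cv2 comes from opencv-python, PIL comes from Pillow).
PIP_NAMES = {
    "timm": "timm",
    "lpips": "lpips",
    "ftfy": "ftfy",
    "regex": "regex",
    "matplotlib": "matplotlib",
    "ipywidgets": "ipywidgets",
    "cv2": "opencv-python",
    "PIL": "Pillow",
}


def missing_modules(names):
    """Return the import names that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


for name in missing_modules(PIP_NAMES):
    print(f"python -m pip install {PIP_NAMES[name]}")
```

Run it with the same interpreter (and the same venv) that runs main.py, otherwise the result describes a different installation.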

  • I got everything but this to fix
    AssertionError: Torch not compiled with CUDA enabled
    Any ideas?

  • i need visual studio before that right? My cuda now is 9.0.176 and not listed on this site. Am i f***d

    • CUDA 9 is pretty old, you should try updating.

  • Kaiten10

    Hi! I’d like some pointers on how to fix the CUDA out of memory issue. It managed to run until the moment it generates the first iteration, but it returns this error:

    RuntimeError: CUDA out of memory. Tried to allocate 1.76 GiB (GPU 0; 8.00 GiB total capacity; 5.12 GiB already allocated; 1.05 GiB free; 5.37 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

    • For an 8GB GPU, you really can’t run much, you will need to go to line 1131 and change it to “width_height = [448, 448]”

      • Kaiten10

        Thanks, I played around with the settings and managed to get up to 1280×640 by only loading one CLIP model (ViTB32). Could still push some awesome art with this limited hardware. Really appreciate the guide!

        • minhtran

          Hi Kaiten10,
          Can you please show your setting for your notebook, especially to cuda memory management. I only have quadroP1000 with 4GB VRAM but still want to play around with disco. Thanks in advance!

  • Martin

    I can’t go past step 5. I have installed python3 (correct bit, os, all that) but when I type “python3 -m venv venv” in VS code I get this error:

    Python was not found; run without arguments to install from the Microsoft Store, or disable this
    shortcut from Settings > Manage App Execution Aliases.

    What can be done?

    • When installing, you need to select “add to PATH” (or similar text), if you did that, you may need to restart your computer, if that also fails, you need to install python from the microsoft store instead.

      • Martin

        I appreciate the answers eliso & andon. I added to PATH and typed “python” (removing the 3). This worked. However, I immediately hit an error on step 6. Using Windows, I type: source .venv/Scripts/activate.ps1

        Get this error:

        source : The term ‘source’ is not recognized as the name of a cmdlet, function, script file, or operable program. Check the
        spelling of the name, or if a path was included, verify that the path is correct and try again.
        At line:1 char:1
        + source .venv/Scripts/activate.ps1
        + ~~~~~~
        + CategoryInfo : ObjectNotFound: (source:String) [], CommandNotFoundException
        + FullyQualifiedErrorId : CommandNotFoundException

        • Rachel

          Did you ever find a solution to this?

          • Martin

            No, sorry. I never got past this issue. I moved on to colab DDv5 (and 5.1) and JAX instead.

        • JacksonTB

          • Chris

            Run powershell or commandline as admin and type in this line
            Set-ExecutionPolicy RemoteSigned
            answer with Y for yes

            after that you can run
            venv\Scripts\activate

          • Another option is to use “venv\Scripts\Activate.ps1” instead of “venv\Scripts\activate”

    • i used “python” without the 3, and it worked (using CPU)

  • Sey

    Hi Eliso,
    Thanks for the great job here!

    I’m stuck at the step 11, when I type python3 main.py in the terminal.
    The terminal says python3.exe can’t open file ‘C:\Users\myname\main.py’: [Errno 2] No such file or directory

    Is there a way to specify directly the folder where main.py is stored? Or is it something else I messed up?

    Thanks a lot for your time!

    • Type “cd directoryname” replacing directoryname with the folder the main python file is in.

  • Amir

    Hi Eliso,

    Could you please help me. I’m new to coding.

    I’m stuck at ‘2.1 Install and import dependencies’

    I got an error:
    ModuleNotFoundError Traceback (most recent call last)
    s:\Program Files\DD\main\main.py in ()
    189 import io
    190 import math
    ---> 191 import timm
    192 #from IPython import display
    193 import lpips

    ModuleNotFoundError: No module named ‘timm’

    • Amir

      I have dealt with it.

      But there is another problem which I can’t find the solution for on internet.

      NameError Traceback (most recent call last)
      s:\Program Files\DD\main\main.py in ()
      1074 model_default = model_config['image_size']
      1078 if secondary_model_ver == 2:
      --> 1079 secondary_model = SecondaryDiffusionImageNet2()
      1080 secondary_model.load_state_dict(torch.load(f'{model_path}/secondary_model_imagenet_2.pth', map_location='cpu'))
      1081 secondary_model.eval().requires_grad_(False).to(device)

      NameError: name 'SecondaryDiffusionImageNet2' is not defined

      • Amir

        I executed ‘python main.py’ and got the error:

        File "S:\Program Files\DD\main\main.py", line 1080, in <module>
        secondary_model.load_state_dict(torch.load(f'{model_path}/secondary_model_imagenet_2.pth', map_location='cpu'))
        TypeError: load_state_dict() missing 1 required positional argument: 'state_dict'

        Could you help please.

  • Seb

    I’m trying to run this on an RTX 2070 (8GB VRAM; not a lot), and I scaled down `width_height` to just `[64, 64]` but I’m still getting the following error:

    RuntimeError: CUDA out of memory. Tried to allocate 24.00 MiB (GPU 0; 7.79 GiB total capacity; 4.93 GiB already allocated; 46.00 MiB free; 5.21 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

    Is there anything I can do about this, or do I just need a better GPU?

    • You can’t do much at all with 8GB VRAM
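
The error text itself points at max_split_size_mb. Below is a sketch of how that setting is usually passed through the PYTORCH_CUDA_ALLOC_CONF environment variable. The value 128 is an arbitrary assumption, configure_allocator is a hypothetical helper, and the setting only targets fragmentation, so it will not help if the model simply does not fit in VRAM.

```python
import os


def configure_allocator(max_split_size_mb=128):
    """Set PYTORCH_CUDA_ALLOC_CONF and return the value that was written."""
    value = f"max_split_size_mb:{max_split_size_mb}"
    os.environ["PYTORCH_CUDA_ALLOC_CONF"] = value
    return value


# Call this before the first CUDA allocation; the safest place is
# before "import torch".
print(configure_allocator(128))

# import torch  # import torch only after the variable has been set
```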

  • Ken

    I started using DD on my PC using my 3090 and was working well for a while. Now I loaded it up for another go and I get this error when trying to run it:

    OSError: [WinError 193] %1 is not a valid Win32 application. Error loading “C:\Users\steyr\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\torch\lib\cudnn_cnn_infer64_8.dll” or one of its dependencies.

    Any ideas?

  • Line3

    I have ModuleNotFoundError: No module named ‘PIL’
    and I have installed Pillow==9.0.1
    pip==22.0.4

    • Try installing it both globally and in the venv
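
Before installing a package twice, it may be worth checking which interpreter is actually running the script, since a package installed globally is not visible from inside a venv and vice versa. A small sketch using only the standard library (in_virtualenv is a made-up helper name):

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the venv while sys.base_prefix
    # still points at the Python installation it was created from.
    return sys.prefix != sys.base_prefix


print("interpreter:", sys.executable)
print("inside a venv:", in_virtualenv())
```

Running "python -m pip install Pillow" with that same interpreter installs the package where the script can see it.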

  • elliott

    hey getting this error on step 6

    PS C:\DiscoDiffusion\main> ./venv/Scripts/activate.ps1
    ./venv/Scripts/activate.ps1 : File C:\DiscoDiffusion\main\venv\Scripts\Activate.ps1 cannot be loaded because running scripts is disabled on this system.
    For more information, see about_Execution_Policies at https:/go.microsoft.com/fwlink/?LinkID=135170.
    At line:1 char:1
    + ./venv/Scripts/activate.ps1
    + ~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo : SecurityError: (:) [], PSSecurityException
    + FullyQualifiedErrorId : UnauthorizedAccess

    • Use the bat script in that folder instead then, it’s called activate.bat

  • gao

    Traceback (most recent call last):
    File "c:\Windows\System32\main\main.py", line 1084, in <module>
    if ViTB32 is True: clip_models.append(clip.load('ViT-B/32', jit=False)[0].eval().requires_grad_(False).to(device))
    AttributeError: module 'clip' has no attribute 'load'

    • Try “pip3 uninstall clip” then I can help with the next step.

      • esteban

        hello, I am having the same problem now…uninstalling clip did not solve that…

        • Other than that, I don’t actually know

  • Rasoul

    Thanks again for the documentations.
    Do I have to install Visual Studio and CUDA toolkit for this?

    When I’m running main.py I get this!

    RuntimeError: CUDA error: unknown error
    CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
    For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

    • That is actually an out-of-memory error (the message is very confusing). If it isn’t an out-of-memory error, it could also be caused by a bad overclock.

      • Rasoul

        Thanks for the reply,
        I have RTX3080 and 64G Ram
        And I don’t do overclocking. Do you know if there is anyway to terminate any running cuda before using this!? or something to free CUDA with!
        I’m not running any other app!

        • I mean VRAM (GPU memory), not RAM. To free all VRAM, type “nvidia-smi” and then, for each process ID (PID) it lists, type “kill number”, replacing number with that PID.

  • 沉木

    i met mistakes on step3
    here’s the code
    【Checking 512 Diffusion File
    512 Model SHA matches
    Checking Secondary Diffusion File
    Secondary Model SHA matches
    ---------------------------------------------------------------------------
    NameError Traceback (most recent call last)
    in ()
    136
    137 if secondary_model_ver == 2:
    --> 138 secondary_model = SecondaryDiffusionImageNet2()
    139 secondary_model.load_state_dict(torch.load(f'{model_path}/secondary_model_imagenet_2.pth', map_location='cpu'))
    140 secondary_model.eval().requires_grad_(False).to(device)

    NameError: name 'SecondaryDiffusionImageNet2' is not defined】

    • I haven’t actually seen this error before, hopefully someone else can help you, maybe try posting in the disco diffusion reddit.

  • Berzerk

    hello, i have that error on step 2 on DD with colab pro:

    SSLCertVerificationError Traceback (most recent call last)

    /usr/lib/python3.7/urllib/request.py in do_open(self, http_class, req, **http_conn_args)
    1349 h.request(req.get_method(), req.selector, req.data, headers,
    -> 1350 encode_chunked=req.has_header('Transfer-encoding'))
    1351 except OSError as err: # timeout error

    21 frames

    SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1091)

    During handling of the above exception, another exception occurred:

    URLError Traceback (most recent call last)

    /usr/lib/python3.7/urllib/request.py in do_open(self, http_class, req, **http_conn_args)
    1350 encode_chunked=req.has_header('Transfer-encoding'))
    1351 except OSError as err: # timeout error
    -> 1352 raise URLError(err)
    1353 r = h.getresponse()
    1354 except:

    URLError:

    • We just need to wait for pytorch to fix it.

  • Wolfgang

    As of yesterday, on DD 5.0 I can't get past the cell that defines the necessary functions. It seems some things are missing now:

    ---------------------------------------------------------------------------
    ModuleNotFoundError Traceback (most recent call last)
    in ()
    4
    5 import pytorch3d.transforms as p3dT
    ----> 6 import disco_xform_utils as dxf
    7
    8 def interp(t):

    /content/disco_xform_utils.py in ()
    1 import torch, torchvision
    ----> 2 import py3d_tools as p3d
    3 import midas_utils
    4 from PIL import Image
    5 import numpy as np

    ModuleNotFoundError: No module named 'py3d_tools'

    ---------------------------------------------------------------------------
    NOTE: If your import is failing due to a missing package, you can
    manually install dependencies using either !pip or !apt.

    To view examples of installing some common dependencies, click the
    “Open Examples” button below.
    ---------------------------------------------------------------------------

    • That’s a new change, you should ask in the subreddit or discord.

      • Wolfgang

        Thank you ill ask there

  • mir

    I’m trying to run it locally but there’s something missing, and I have no clue what I should do:

    Cloning into ‘pytorch3d-lite’…
    fatal: destination path ‘ResizeRight’ already exists and is not an empty directory.
    Cloning into ‘MiDaS’…
    Cloning into ‘disco-diffusion’…
    ‘wget’ is not recognized as an internal or external command,
    operable program or batch file.
    ---------------------------------------------------------------------------
    ModuleNotFoundError Traceback (most recent call last)
    c:\Users\c\Downloads\main\main\Copia_di_{WSL}_Disco_Diffusion_v5_[w_3D_animation].ipynb Cell 11′ in
    80 from dataclasses import dataclass
    81 from functools import partial
    ---> 82 import cv2
    83 import pandas as pd
    84 import gc

    Thanks

    • You need to install wget for windows, then restart your computer, then try.

      • mir

        Thanks Eliso,

        I installed wget and checked if it works, but I still get some errors:
        ---------------------------------------------------------------------------
        ModuleNotFoundError Traceback (most recent call last)
        c:\Users\cedomir\Downloads\main\main\Copia_di_Disco_Diffusion_v5_1_[w_Turbo].ipynb Cell 13′ in
        59 from dataclasses import dataclass
        60 from functools import partial
        ---> 61 import cv2
        62 import pandas as pd
        63 import gc

        ModuleNotFoundError: No module named 'cv2'

        Please help
        Thanks

        • Type “python3 -m pip install opencv-python”

          • Definitely installed opencv. Did anaconda first, then tried pip as well. Install successful. At a loss!

          • Using anaconda is not recommended, it will mess with all dependencies and make them impossible to uninstall.

  • Slayzar

    I’m getting a “ModuleNotFoundError: No module named ‘resize_right’” error.
    I’ve solved the other Module not found errors and don’t understand why this one is different as it’s in the sub folder

    • I’m actually not sure how to fix that error. Hopefully someone else can help.

  • eden

    Hi, I got this error while doing step 4
    model.load_state_dict(torch.load(f'{model_path}/{diffusion_model}.pt', map_location='cpu'))

    RuntimeError Traceback (most recent call last)
    in ()
    160 print(‘Prepping model…’)
    161 model, diffusion = create_model_and_diffusion(**model_config)
    –> 162 model.load_state_dict(torch.load(f'{model_path}/{diffusion_model}.pt’, map_location=’cpu’))
    163 model.requires_grad_(False).eval().to(device)
    164 for name, param in model.named_parameters():
    ———————————— 1 frames—————————————
    /usr/local/lib/python3.7/dist-packages/torch/serialization.py in __init__(self, name_or_buffer)
    240 class _open_zipfile_reader(_opener):
    241 def __init__(self, name_or_buffer) -> None:
    –> 242 super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
    243
    244

    RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

    thanks~

    • Never seen this error before, hopefully someone else can help you solve it.

    • Markie

      I also ran into the same problem. Maybe it’s a timeout, because I used a VPN to connect to Colab and the connection is weak.

      RuntimeError Traceback (most recent call last)

      in ()
      160 print(‘Prepping model…’)
      161 model, diffusion = create_model_and_diffusion(**model_config)
      –> 162 model.load_state_dict(torch.load(f'{model_path}/{diffusion_model}.pt’, map_location=’cpu’))
      163 model.requires_grad_(False).eval().to(device)
      164 for name, param in model.named_parameters():

      1 frames

      /usr/local/lib/python3.7/dist-packages/torch/serialization.py in __init__(self, name_or_buffer)
      241 class _open_zipfile_reader(_opener):
      242 def __init__(self, name_or_buffer) -> None:
      –> 243 super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
      244
      245

      RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

    • coco

      I ran into the same problem. Have you solved it? Much appreciated :)
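
A note on the "failed finding central directory" error above: the message comes from PyTorch's zip reader, and a zip file keeps its central directory at the end of the file, so one plausible cause (in line with the weak-connection guess) is a download that was cut off. The sketch below is only a quick sanity check using the standard library; the function name is made up for illustration, and a checkpoint saved in PyTorch's older non-zip format would also return False here without being broken.

```python
import zipfile
from pathlib import Path


def checkpoint_looks_complete(path):
    """Return True if `path` exists, is non-empty, and is a readable zip archive.

    Hypothetical helper: it does not load the model, it only checks the container.
    """
    p = Path(path)
    if not p.is_file() or p.stat().st_size == 0:
        return False
    # A truncated download usually loses the zip central directory, which is
    # stored at the end of the file; is_zipfile() looks for its end record.
    return zipfile.is_zipfile(str(p))
```

If this returns False for the `.pt` file in your models folder, deleting the file and downloading it again is probably the first thing to try.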

  • sm

    I’m on Windows, and when I run the pip install on requirements.txt it throws this:

    ERROR: Could not install packages due to an OSError: [WinError 5] Access is denied: ‘C:\Users\name\AppData\Local\Temp\pip-uninstall-8prdf_s8\pip3.exe’
    Check the permissions.

    • Add “--user” to the end of the command.

      • Any other solutions? This didn’t do anything.

        • If that didn’t work, I have no idea.

  • Warren B

    OK, after getting nowhere with this “AssertionError: Torch not compiled with CUDA enabled” error, I took a few days to play with other, unrelated things.

    Then I used the install-command tool on pytorch.org, but added “--upgrade” to the pip command:

    pip3 install --upgrade torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113

    and finally I am back up and running again 🙂 just thought I would share 🙂
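
For anyone else seeing "Torch not compiled with CUDA enabled": that assertion generally means the installed `torch` package is a CPU-only build, which fits with the reinstall above fixing it. A minimal sketch for checking what is actually installed follows; `describe_torch_build` is a name invented here, and the function deliberately copes with torch being absent.

```python
def describe_torch_build():
    """Summarise the installed PyTorch build, or report that it is missing."""
    try:
        import torch
    except ImportError:
        return {"installed": False}
    return {
        "installed": True,
        "torch_version": torch.__version__,
        "built_with_cuda": torch.version.cuda,  # None for a CPU-only build
        "cuda_available": torch.cuda.is_available(),
    }


print(describe_torch_build())
```

If `built_with_cuda` is None, no driver change will help; the package itself needs to be replaced with a CUDA build.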

  • Hello, I got this error while running ‘main.py’ in step 11.
    —–
    (venv) PS D:discodisco-diffusion-main> python3 main.py
    D:discodisco-diffusion-mainmain.py:1200: SyntaxWarning: “is not” with a literal. Did you mean “!=”?
    if steps_per_checkpoint is not 0 and intermediates_in_subfolder is True:
    filepath ./content/init_images exists.
    filepath ./content/images_out exists.
    filepath ./content/models exists.
    Traceback (most recent call last):
    File “D:discodisco-diffusion-mainmain.py”, line 193, in
    import lpips
    ModuleNotFoundError: No module named ‘lpips’

    • Type “pip3 install lpips”

      • Joshua

        Getting the same:
        ModuleNotFoundError: No module named ‘lpips’

        Even after “pip3 install lpips” – any suggestions?

        • try “python3 -m pip install lpips”

  • So grateful for this resource — thanks!!

    I’m getting an error on Step 8. The full error message is posted below. Can you help?

    (venv) PS C:UsersdominDropboxMy PC (DESKTOP-OMDF9L9)Downloadsmainmain> pip3 install -r requirements.txt
    Traceback (most recent call last):
    File “C:Program FilesWindowsAppsPythonSoftwareFoundation.Python.3.9_3.9.3312.0_x64__qbz5n2kfra8p0librunpy.py”, line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
    File “C:Program FilesWindowsAppsPythonSoftwareFoundation.Python.3.9_3.9.3312.0_x64__qbz5n2kfra8p0librunpy.py”, line 87, in _run_code
    exec(code, run_globals)
    File “C:UsersdominDropboxMy PC (DESKTOP-OMDF9L9)DownloadsmainmainvenvScriptspip3.exe__main__.py”, line 4, in
    ModuleNotFoundError: No module named ‘pip._internal’

    • That’s very strange, maybe try “python -m pip install --upgrade pip”

      • Dominic Klyve

        It works! I was doing something dumb. In particular, I neglected Step 10. Now art is being created on my computer — you rock!

  • Vash

    Thanks for this great tutorial. When installing from requirements.txt I got this:

    ERROR: Could not find a version that satisfies the requirement pywin32==303 (from versions: none)
    ERROR: No matching distribution found for pywin32==303

    • If you aren’t on windows, that can’t be installed. Not sure what it’s required for though.

  • Stefano

      • Stefano

        I still get 404, weird

  • Stefano

    It seems Firefox forces a redirect to https, which causes the 404. I tried with Brave and it works. Thanks again for your reply

  • Benjamin Hughes

    Thanks for the amazing guide 👌

    I set use_cpu to True and get this error:
    RuntimeError: attn = softmax(attn, dim=-1)
    File “C:UsersbenjaminAppDataLocalPackagesPythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0LocalCachelocal-packagesPython39site-packagestorchnnfunctional.py”, line 1680, in softmax
    ret = input.softmax(dim)
    RuntimeError: “softmax_lastdim_kernel_impl” not implemented for ‘Half’

    Do you have any idea how to fix this? I’m on Windows.

    • That likely means that fp16 is still set to true, try to find something that’s named fp16 and make it false.

      • Benjamin Hughes

        Did not work, but thanks for the help anyways 🙂
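
For readers hitting the same 'Half' error: it says a half-precision (fp16) tensor reached a CPU kernel that does not support it, which is why the reply points at an fp16 setting. The sketch below only illustrates that idea. The key name `use_fp16` and the example values are assumptions, not taken from this thread, so search your copy of the script for "fp16" to see what it really uses.

```python
def disable_fp16_for_cpu(model_config, device):
    """Return a copy of model_config with half precision turned off on CPU."""
    config = dict(model_config)
    if str(device) == "cpu":
        # Assumed key name; confirm it by searching the script for "fp16".
        config["use_fp16"] = False
    return config


# Hypothetical starting values, for illustration only.
model_config = {"use_fp16": True, "image_size": 512}
print(disable_fp16_for_cpu(model_config, "cpu"))
```

If changing one flag does not help, as reported above, it may be worth checking whether fp16 is set in more than one place in the script.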

  • Matt

    Thanks for the great guide!
    Actually, I followed each step and everything seems fine.
    Unfortunately, it stops when batching starts 🙁

    “Prepping model…
    Batches: 0%| | 0/1 [00:00<?, ?it/s]
    [1] 7995 killed python3 main.py”

    I am using CPU… I guess this is a memory issue…
    If you have any suggestions it would be great!
    Thanks again!

    • Reduce memory by reducing size or turning off some models.

  • markie

    In the Diffuse step it said RuntimeError. It failed at line 162, “model.load_state_dict(torch.load(f'{model_path}/{diffusion_model}.pt', map_location='cpu'))”. Why is that happening?

    • Not actually sure, are you running on CPU?

  • markie

    I met the same problem on colab as eden. Is that because my connection was weak? I used vpn.
    RuntimeError Traceback (most recent call last)

    in ()
    160 print(‘Prepping model…’)
    161 model, diffusion = create_model_and_diffusion(**model_config)
    –> 162 model.load_state_dict(torch.load(f'{model_path}/{diffusion_model}.pt’, map_location=’cpu’))
    163 model.requires_grad_(False).eval().to(device)
    164 for name, param in model.named_parameters():

    1 frames

    /usr/local/lib/python3.7/dist-packages/torch/serialization.py in __init__(self, name_or_buffer)
    241 class _open_zipfile_reader(_opener):
    242 def __init__(self, name_or_buffer) -> None:
    –> 243 super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
    244
    245

    RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

  • Terry

    Hi, I’m running v5.2 on Google Colab with default settings. And I got this error:
    —————————————————————————
    FileNotFoundError Traceback (most recent call last)
    in ()
    133 if use_secondary_model:
    134 secondary_model = SecondaryDiffusionImageNet2()
    –> 135 secondary_model.load_state_dict(torch.load(f'{model_path}/secondary_model_imagenet_2.pth’, map_location=’cpu’))
    136 secondary_model.eval().requires_grad_(False).to(device)
    137

    2 frames
    /usr/local/lib/python3.7/dist-packages/torch/serialization.py in __init__(self, name, mode)
    210 class _open_file(_opener):
    211 def __init__(self, name, mode):
    –> 212 super(_open_file, self).__init__(open(name, mode))
    213
    214 def __exit__(self, *args):

    FileNotFoundError: [Errno 2] No such file or directory: ‘/content/drive/MyDrive/AI/Disco_Diffusion/models/secondary_model_imagenet_2.pth’

    It feels like the download link of the .pth file does not work. Can you kindly help me with this? Thank you.

    • Did you download the model manually? It’s one of the steps. If you did, make sure it’s in the models folder.

      • Terry

  • Jaime Jasso

    I’m stuck on execution 17
    RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

    • That’s an unusual error, is there any more context?

  • Jigaru

    Hey i’m getting RuntimeError: CUDA out of memory. Tried to allocate 1.88 GiB (GPU 0; 11.00 GiB total capacity; 7.70 GiB already allocated; 0 bytes free; 9.00 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

    • It’s running out of VRAM; try a smaller image or fewer models. You may get a different error later, but in this setup most CUDA errors other than the one that mentions launch blocking usually mean there is not enough VRAM.
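
The error message quoted above also mentions `max_split_size_mb` and `PYTORCH_CUDA_ALLOC_CONF`. A minimal sketch of how that variable can be set from Python is below. The value 128 is an arbitrary example rather than a recommendation, and this setting only targets fragmentation, so it may make no difference when the image and models simply do not fit in VRAM.

```python
import os

# Set this before PyTorch makes its first CUDA allocation; doing it before
# `import torch` is the safe choice.
# 128 is an arbitrary example value, not a recommendation from this thread.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "max_split_size_mb:128")
print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```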

  • Zhang Zz

    I ran into a problem when I run
    `secondary_model.load_state_dict(torch.load(f'{model_path}/secondary_model_imagenet_2.pth', map_location='cpu'))`
    and
    `lpips_model = lpips.LPIPS(net='vgg').to(device)`

    I’m using an RTX 3090, and when I execute the code above, the script just keeps running indefinitely (seemingly unable to finish loading?).

    I’m confused by this, as the other scripts run successfully.

  • Lynn

    I got an error while doing step 2. It shows this:
    512 Model already downloaded, check check_model_SHA if the file is corrupt
    Secondary Model already downloaded, check check_model_SHA if the file is corrupt
    —————————————————————————
    RuntimeError Traceback (most recent call last)
    in ()
    171 secondary_model = SecondaryDiffusionImageNet2()
    172 secondary_model.load_state_dict(torch.load(f'{model_path}/secondary_model_imagenet_2.pth’, map_location=’cpu’))
    –> 173 secondary_model.eval().requires_grad_(False).to(device)
    174
    175 clip_models = []

    3 frames
    /usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in to(self, *args, **kwargs)
    905 return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
    906
    –> 907 return self._apply(convert)
    908
    909 def register_backward_hook(

    /usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
    576 def _apply(self, fn):
    577 for module in self.children():
    –> 578 module._apply(fn)
    579
    580 def compute_should_use_set_data(tensor, tensor_applied):

    /usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
    599 # `with torch.no_grad():`
    600 with torch.no_grad():
    –> 601 param_applied = fn(param)
    602 should_use_set_data = compute_should_use_set_data(param, param_applied)
    603 if should_use_set_data:

    /usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in convert(t)
    903 return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None,
    904 non_blocking, memory_format=convert_to_format)
    –> 905 return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
    906
    907 return self._apply(convert)

    RuntimeError: CUDA error: an illegal memory access was encountered
    CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
    For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

    • That usually means you are out of VRAM or you have another major process running on your GPU.

  • Andrew

    ‘model_path’ not defined showing up as of 5/14 3pm PST. Failing on Step 2.

    • Something is wrong in the code then, try to find where model_path should be defined.

  • G

    Getting this –

    name ‘lpips’ is not defined
    File “C:UsersyesimDocumentsDISCO DIFFUSIONmainmain.py”, line 1121, in
    lpips_model = lpips.LPIPS(net=’vgg’).to(device)

    any tips ?

    • That’s an unusual error, do you have it installed? (pip)

  • Hi there!
    I’m getting an error message every time I try to set up DD. It’s in step 2, diffusion and CLIP models.

    FileNotFoundError: [Errno 2] No such file or directory: ‘/content/drive/MyDrive/AI/Disco_Diffusion/models/secondary_model_imagenet_2.pth’

    It seems this file can’t be installed. I tried everything, including check_model_SHA and changing the diffusion model to 256.

    Any ideas?

    • Set the path manually after downloading the model.

      • Mind elaborating on this?

        • You will need to edit that line in the code.

  • cristiano

    I have this problem: when DD finishes its work, the image immediately goes away and the program starts again… How can I stop it at the end? This way the images are lost before I can even look at them!

    • The image is saved to a file, and you can set batches to 1 for only 1 image.

      • cristiano

        thank you! it worked! Thanks again.

  • cristiano

    DD worked perfectly yesterday. Today it hardly connects, and it can’t be disconnected because it freezes. “Check GPU status” fails because it cannot communicate with the NVIDIA drivers (but I have no NVIDIA card, and didn’t have one yesterday when the program was running), and it also tells me that CUDA GPUs are not available. What has changed since yesterday? I had been using the program all day with no problems. The only way to make it work now is to use the CPU, but that takes 20 minutes to complete 1% of the job. Thanks for your attention and any help.

    • With only a CPU, it will just take a long time.

      • Cristiano

        But why was it working yesterday and not today? I didn’t change anything, and the PC is the same as yesterday… What happened? And is there anything I can do to change this situation?
        Yes, with the CPU only it takes 1 hour to do 3%… impossible!

  • Behrooz

    I have it up and running, but the output images are just plain black.

    • That’s very unusual, never heard of that before

    • That either means you aren’t connected to the internet, or you have a wrong, unsupported version of ubuntu.

      • Hi there,

        I am definitely connected to the internet (all of the rest of the files downloaded and I am also typing this) 🙂

        I am also on Windows 10. So I am not sure why it is downloading those?

        Thanks again!

        • Oh, that would be the problem, what script are you running that does that?

  • “1.3 Install and import dependencies”

    • That’s not a “script,” but whatever script you’re using is designed for ubuntu. I assume this isn’t the base one that I have in my guide, right?

  • Sorry for the wrong terminology. It has been a LONG time since I have really done anything resembling coding. I am a designer/animator these days and just love this “Disco Diffusion v4.1 [w/ Video Inits, Recovery & DDIM Sharpen].ipynb”

    Maybe I am not even asking for help in the right place. I am sorry. I am just stepping though the steps on Colab.

    • Oh, you’re running this on colab?

  • LEE

    Hi, I’m getting an error:
    e:workDDDisco_Diffusion.ipynb Cell 14′ in ()
    135 try:
    –> 136 from infer import InferenceHelper
    137 except:

    ImportError: cannot import name ‘InferenceHelper’ from ‘infer’ (e:workDDmyvenvlibsite-packagesinfer__init__.py)

    I have tried reinstalling infer, but it still doesn’t work after several attempts. I’ve been stuck for two days and I haven’t found a solution.

    My version is Disco Diffusion V5.2.

    • LEE

      I finally solved it, but I have a new problem:
      AssertionError: Torch not compiled with CUDA enabled.

      My CUDA Version is:
      NVIDIA-SMI 512.95 Driver Version: 512.95 CUDA Version: 11.6

      And I installed PyTorch with:

      pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113

      But it doesn’t work. I’m still stuck on the new problem.

      • V5 and above are relatively difficult to get to work on windows, hopefully someone else knows how to solve this issue.

        • LEE

          After I got back from work, it worked! I have drawn the default drawing

      • Flo

        Hi. how did you manage to solve your issue with infer?
        I’m getting the same error you’ve described:
        “cannot import name ‘InferenceHelper’ from ‘infer’”

        thanks!

      • Dr X

        How did you solve installing the InferenceHelper?

  • Nerone

    Hi! I’m getting this error. I don’t code, and I solved it yesterday without touching the code but can’t remember how.

    Could you help? Thank you!!

    FileNotFoundError Traceback (most recent call last)
    in ()
    160 print(‘Prepping model…’)
    161 model, diffusion = create_model_and_diffusion(**model_config)
    –> 162 model.load_state_dict(torch.load(f'{model_path}/{diffusion_model}.pt’, map_location=’cpu’))
    163 model.requires_grad_(False).eval().to(device)
    164 for name, param in model.named_parameters():

    2 frames
    /usr/local/lib/python3.7/dist-packages/torch/serialization.py in __init__(self, name, mode)
    210 class _open_file(_opener):
    211 def __init__(self, name, mode):
    –> 212 super(_open_file, self).__init__(open(name, mode))
    213
    214 def __exit__(self, *args):

    FileNotFoundError: [Errno 2] No such file or directory: ‘/content/drive/MyDrive/AI/Disco_Diffusion/models/512x512_diffusion_uncond_finetune_008100.pt’

    • There is some error with the file paths. With the file open in VS Code, press Ctrl+F and type “/content/drive/MyDrive/AI/Disco_Diffusion” to jump to that line in the code. Replace “/content/drive/MyDrive/AI/Disco_Diffusion” with “./” and make sure that you downloaded the models into the models folder.
      If you join our Discord (https://discord.gg/sNd4TNmhxE) we can likely help you better with this.
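
Before editing the path, it can help to confirm which model files are actually present. The sketch below is a hedged illustration: the two file names are copied from the error messages in this thread, while the default folder `./models` is an assumption based on the "./" replacement suggested above, so pass a different folder if your script uses another one.

```python
from pathlib import Path

# File names taken from the error messages in this thread.
EXPECTED = [
    "512x512_diffusion_uncond_finetune_008100.pt",
    "secondary_model_imagenet_2.pth",
]


def missing_models(model_dir="./models", names=EXPECTED):
    """Return the expected model files that are absent from model_dir.

    The default folder is an assumption; adjust it to match your script.
    """
    return [n for n in names if not (Path(model_dir) / n).is_file()]


print("Missing:", missing_models() or "none")
```

Anything listed as missing needs to be downloaded into that folder before the `torch.load` line can succeed.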

  • vaportrail

    Stuck on step 8. Error is below.

    PS C:UsersOwnerDownloadsdisco-diffusion-main> pip3 install -r requirements.txt
    ERROR: Could not open requirements file: [Errno 2] No such file or directory: ‘requirements.txt’

    Also, for step 7, can you clarify what you mean by the “content folder”

    Thanks

    • There should be a folder called content, and you need to find the directory with the requirements.txt file, and cd into it. Then the command will work.

      • vaportrail

        Can you be more specific? I don’t see a folder called content. Is this part of the Disco-diffusion-main folder that I downloaded or is it part of something else?

        • Are you in the folder called main? Sometimes there will be another folder called main inside the first one, go into that. Then you will find a folder called content.
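
If it is unclear which folder holds requirements.txt, a small search can find it. This is a sketch using only the standard library; `find_requirements` is a name made up for this example, and it assumes you start it from (or point it at) the folder you extracted the download into.

```python
from pathlib import Path


def find_requirements(root="."):
    """Return every folder under root that contains a requirements.txt."""
    return sorted(p.parent for p in Path(root).rglob("requirements.txt"))


if __name__ == "__main__":
    # Print each candidate folder; cd into the right one before running pip.
    for folder in find_requirements("."):
        print(folder)
```

Once a folder is printed, `cd` into it and the `pip3 install -r requirements.txt` command from step 8 should be able to open the file.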

  • vaportrail

    DISCO-DIFFUSION-MAIN>disco-diffusion-main>docker>main>Dockerfile

    Is the only instance of a folder called “main” I can see.

    • How did you get a dockerfile? are you using my guide?

  • Can you give me screenshots of the procedure for passing CUDA_LAUNCH_BLOCKING=1? I am not that techy with all this, so is there a certain platform where I have to run this? In the Disco Diffusion AI bot generator, it says: “RuntimeError: CUDA error: an illegal memory access was encountered
    CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
    For debugging consider passing CUDA_LAUNCH_BLOCKING=1.”

    • This is often a variation of the out-of-VRAM error; try smaller images and fewer models.
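
On the question of how to pass CUDA_LAUNCH_BLOCKING=1: it is an environment variable, not a function, so no special platform is needed. A minimal sketch for a notebook or script is below. Note that it does not fix the error; it only makes CUDA calls run synchronously so the stack trace points at the operation that actually failed.

```python
import os

# Put this in the very first cell (or at the top of main.py), before torch is
# imported, so the setting is in place when CUDA initialises.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# import torch  # import torch only after the variable is set
print("CUDA_LAUNCH_BLOCKING =", os.environ["CUDA_LAUNCH_BLOCKING"])
```

On Linux the same thing can be done from the shell by running `CUDA_LAUNCH_BLOCKING=1 python3 main.py`. Expect the run to be slower while it is enabled.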

  • makvanser

    Hi! I’m getting this error:
    NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
    I tried installing CUDA 11.3, but that still doesn’t fix the problem.
    I can’t understand what the problem is. Can you help me?

    • You need to install the nvidia driver.

      • makvanser

        I did it, but it’s not working. Maybe it’s a Colab error, because sometimes the code runs a couple of times, but then this error starts to appear. Or do I need some special driver besides the standard NVIDIA driver and CUDA?

  • Ahmed Behiry

    Hello
    In step 11 after typing python3 main.py , I’m getting this error:
    C:UsersDesktopmainmain.py:1200: SyntaxWarning: “is not” with a literal. Did you mean “!=”?
    if steps_per_checkpoint is not 0 and intermediates_in_subfolder is True:
    filepath ./content/init_images exists.
    filepath ./content/images_out exists.
    filepath ./content/models exists.
    Traceback (most recent call last):
    File “C:UsersDesktopmainmain.py”, line 191, in
    import timm
    ModuleNotFoundError: No module named ‘timm’

    I tried this: pip install timm, and it worked fine:
    Successfully installed timm-0.6.7

    but I’m still getting this error..

    • You are probably using a different python to run and install, try “python -m pip install timm”
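
The "different python" explanation can be checked directly. The sketch below, run with the same command you use for main.py, prints which interpreter is running and whether it can see each module; `module_status` is a helper name invented for this example, and the module list is just the ones reported missing in this thread.

```python
import importlib.util
import sys


def module_status(name):
    """Report which interpreter is running and whether it can see `name`."""
    spec = importlib.util.find_spec(name)
    return {
        "interpreter": sys.executable,
        "module": name,
        "found": spec is not None,
        "location": getattr(spec, "origin", None),
    }


for mod in ("timm", "lpips", "cv2"):
    print(module_status(mod))
```

If `found` is False even though pip reported success, pip most likely installed into another Python; using the printed interpreter path with `-m pip install` targets the right one.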

  • Dr X

    I’m attempting to install 5.6 and after solving some errors, I’m stumped on InferenceHelper in disco_xform_utils.py

    —————————————————————————
    ImportError Traceback (most recent call last)
    File h:Projects#AI Image GenDiscoDiffusionmaindisco_xform_utils.py:12, in
    11 try:
    —> 12 from infer import InferenceHelper
    13 except:

    ImportError: cannot import name ‘InferenceHelper’ from ‘infer’ (h:Projects#AI Image GenDiscoDiffusionmainvenvlibsite-packagesinfer__init__.py)

    ———-

    Would it be easier to just use google colab and have it run on my local gpu? (RTX 3090)
    Seems like it has a lot of trouble with folders, file names and python versions installed directly to my drive.

    • I haven’t seen this before, I don’t know what inferenceHelper is, hopefully someone else can solve this.

      • Dr X

        I keep having issues where the installer simply can not find things that are right there in the local folder they’re supposed to be in.. Same goes for the AI training script.

        I’m not a Python programmer but I’m thinking if I knew how to point everything to an exact absolute path(including drive letters) it might solve some problems, I have a feeling relative path may be broken for me. There also might be something wrong with my Python installation, because originally I just tried to download it from the Python site, but then VS Code didn’t detect it and installed it again from the Horrible microsoft store.

  • linshenqi

    I use a Quadro M4000, and it works fine with this setting. Thank you~

  • Atom

    Thank you so much for sharing and helping; I really appreciate what you are doing. I deleted my folders from Google Drive and now I am trying to start fresh. The “Diffusion and CLIP model settings” cell will not load the secondary model. It ran all night, and the secondary model is 85MB and the SHA does not match. I’ve tried restarting everything and deleted the model again, but nothing seems to work. I feel like there is a cache somewhere that needs to be reset since deleting the folders, which I now regret doing.

    • The cache is a hidden folder (starts with “.”) somewhere either in the main folder or in your user directory.
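
To see for yourself whether a downloaded model matches, you can hash the file and compare it with the value in the notebook. The sketch below assumes the notebook's check uses SHA-256, which is an assumption worth confirming in the cell that defines the expected hashes; no expected value is given here because none appears in this thread.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large checkpoints do not need to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Call it with the path to the downloaded `.pth` file and compare the result with the notebook's value. If they differ, the file on disk is not the one the notebook expects, and a fresh download is the usual next step.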
