Unable to enum CUDA GPUs: unknown error - fixing errors and finding the best solutions

Folks, I've run into this problem: my RX 580 Pulse 8GB started throwing this error. It used to mine in Claymore with no problems at all, then I remembered about it and decided to try PhoenixMiner. Mining starts and it shows 30-31.2 MH/s (the BIOS is flashed), but the temperature in Afterburner starts climbing like crazy. It reaches 80 degrees within a minute, although on Claymore it never went above 65; even in the summer heat it hit 66 once and that was it. It says the OpenCL driver is unknown. (Even though I installed the latest drivers and everything was detected without errors.) Then I removed them with DDU and reinstalled, with no result. Help, anyone who knows how to fix this.

What are the core voltages? And what, the drivers won't install after the overheating? Or what? You could at least attach a screenshot)

Because you're being dumb and not reading the readme.txt. You need to install a specific driver version, not the latest one. It's all written there. That's why it says the OpenCL driver is unknown and can't control the cards.

First you should figure out how to bring the temperature down, if the chip hasn't already been cooked by 80 on the core.

Sorry, but I didn't actually find that in the readme, and on an English-language forum they wrote that drivers of some earlier version are needed, but I couldn't find them available anywhere.

(Autocorrect mangled my previous post.) If you know where to download them, I'd be grateful. I didn't find it in the README next to the batch file either. I've been glued to the internet for 3 days.

I couldn't find drivers of those versions online. It doesn't work with the old 2017 blockchain drivers either((

I'll try installing 20.4.2 and report back with the result.

20.5.1

Definitely, up to 20.5.1, absolutely! Thank you very much.

And PhoenixMiner no older than 5.5c.

I use PhoenixMiner 5.6b beta and driver 20.11.2.

Doesn't work, same error. PhoenixMiner 5.5c, driver 20.5.1.

Well, what can I say, alas. It doesn't work and gives the very same error. What else can I try? Maybe flash the BIOS back to stock?

Try adding -clKernel 0 to the batch file.

I already did. It doesn't help; I even tried 1 and 2 as well, same thing.

GPU Errors When Mining

The most complete collection of mining errors on Windows, HiveOS, and RaveOS, with quick and painless fixes

Can’t find nonce with device CUDA_ERROR_LAUNCH_FAILED

The error says the miner cannot find a nonce, and the miner itself immediately suggests the fix: reduce the overclock. Beginner miners in particular try to squeeze the maximum out of a card and overclock the core or memory too far. With such an overclock the card may even start, but it then throws errors like this one. Remember: stable share submission to the pool beats chasing numbers in the miner.

PhoenixMiner Connection to API server failed: what to do?

This error occurs with PhoenixMiner on HiveOS. It means the mining rig cannot connect to the statistics server. To fix it:

Run the command net-test and note the server with the lowest ping. Then switch to it in the Hive web interface (on the worker) and reboot your rig.
If that does not help, run the command dnscrypt -i && sreboot

Phoenixminer CUDA error in CudaProgram.cu:474 : the launch timed out and was terminated (702)

This error, like the first one, indicates that the card is overclocked too far. Return the card to stock settings and raise the overclock gradually, backing off to the last stable values as soon as the error reappears.

UNABLE TO ENUM CUDA GPUS: INVALID DEVICE ORDINAL

Check the GPU drivers and the card itself (how it is listed in Device Manager, whether there are any exclamation marks).
If everything is fine, check the riser. The riser is often the cause of this error.

UNABLE TO ENUM CUDA GPUS: INSUFFICIENT CUDA DRIVER: 5000

As with the previous error, check the GPU drivers and the card itself (how it is listed in Device Manager, whether there are any exclamation marks).

NBMINER MINING PROGRAM UNEXPECTED EXIT.CODE: -1073740791, REASON: PROCESS CRASHED

Exit code -1073740791 in NBMiner occurs when your rig is a mix of Nvidia and AMD cards. In that case split the mining into two .bat files (or two flight sheets if you are on HiveOS): one for the AMD cards, the other for the Nvidia cards.

NBMINER CUDA ERROR: OUT OF MEMORY (ERR_NO=2): how to fix it?

One of the most common errors on Windows is running out of memory, here in NBMiner, though it also occurs in the NiceHash miner. To fix it, increase the pagefile. The pagefile should equal the combined VRAM (in GB) of all cards in the rig plus a 10% margin.
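As a quick illustration of the sizing rule above, here is a minimal Python sketch. The function names and the example rig are made up for illustration; only the "sum of VRAM plus 10%" rule comes from the text.

```python
import math


def recommended_pagefile_gb(vram_gb_per_card, margin=0.10):
    """Combined VRAM of all cards plus a safety margin (10% by default)."""
    return sum(vram_gb_per_card) * (1 + margin)


def recommended_pagefile_mb(vram_gb_per_card, margin=0.10):
    """Same value in megabytes, rounded up, as Windows expects pagefile sizes in MB."""
    return math.ceil(recommended_pagefile_gb(vram_gb_per_card, margin) * 1024)


if __name__ == "__main__":
    # Hypothetical rig: two 8 GB cards and two 6 GB cards.
    rig = [8, 8, 6, 6]
    print(f"Pagefile: about {recommended_pagefile_gb(rig):.1f} GB "
          f"({recommended_pagefile_mb(rig)} MB)")
```

Enter the MB value as both the initial and the maximum size if you want a fixed pagefile.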

GMINER ERROR ON GPU: OUT OF MEMORY STOPPED MINING ON GPU0

In this case the culprit is most likely not the pagefile but an overly aggressive overclock on the card numbered 0. Reduce the overclock and the error should go away.

Socket error. the remote host closed the connection, in NBMiner

It may also be reported as "ERROR - Failed to establish connection to mining pool: Socket operation timed out".
This is a network problem: check the rig's internet connection and restart the router.
It is also possible that your ISP is closing the connection to the pool. Change the pool, try a VPN, or switch to an external DNS provider, for example Cloudflare 1.1.1.1, 1.0.0.1

Server not responded on share, in GMiner

This error means something is wrong with your internet connection, which GMiner is sensitive to. Try restarting the router and disabling the watchdog in the miner.

DAG has been damaged check overclocking settings, in GMiner

This error may also read Device not responding, check overclocking settings.
It indicates an overclock that is too aggressive, so start by reducing it.
If that does not help, change the miner: GMiner has never been known for working well with AMD cards. We recommend switching to TeamRedMiner, or to lolMiner if you need a single miner that supports both Nvidia and AMD cards.
If changing the miner does not help, reinstall the video driver.
If that does not help either, test this card on its own in the x16 slot.

ERROR: Can’t start T-Rex, failed to initialize device map: can’t get busid, code -6

Memory configuration errors with code -6 usually point to a driver problem.

If you are on Windows, use DDU (Display Driver Uninstaller) to remove all Nvidia drivers completely.
Reboot the system.
Install a fresh driver directly from the Nvidia website.
Reboot the system again.
If you are on HiveOS/RaveOS, write a clean system image, just to be sure.

TREX: Can’t unlock GPU

Full error text:
TREX: Can’t unlock GPU [ID=1, GPU #1], error code 15
WARN: Miner is going to shutdown…
WARN: NVML: can’t get fan speed for GPU #1, error code 15
WARN: NVML: can’t get power for GPU #1, error code 15
WARN: NVML: can’t get mem/core clock for GPU #1, error code 17

Solution:

Check all cable connections of the card and the riser, especially the power cables.
If the first step checks out, swap the riser for one that is known to work.
If the error remains, plug the card directly into the x16 slot on the motherboard.

CAN'T START MINER, FAILED TO INITIALIZE DEVICE MAP, CAN'T GET BUSID, CODE -6

In this particular case the problem was the power supply: it could not sustain 3 cards. After the PSU was replaced, the error disappeared.
If you are sure your PSU has enough capacity, try a different miner.

ERROR: 511 DEGREES ON THE GPU

A reading of 511 degrees points to a faulty riser or a problem with the card's power. Check all connections. To find the fault, start the system with a single card, test it, and then add the other cards one at a time.

GPU driver error, no temps in HiveOS: what to do?

You most likely got this error while mining on HiveOS. It can have several causes, both software and hardware (for example, the riser).
You can try the least painful fix first and enter this command in HiveOS:
hive-replace -y --stable
The system will reinstall the stable version of HiveOS.

If the error does not go away, check the riser.

GPU are lost, rebooting

This is not an error but the consequence of one. To find out which error causes the cards to be rebooted, do the following:

Enable log saving (it is off by default) with the command

logs-on

Then reboot the rig.
Once the error occurs again, you can download the logs with the commands below.
You can use the following command to download the miner logs straight from the dashboard:

message file "miner.log" -f=/var/log/miner/minername/minername.log

For example, if I need the TeamRedMiner logs:
message file "teamredminer.log" -f=/var/log/miner/teamredminer/teamredminer.log

The command you sent is highlighted in blue. The uploaded file is shown in white; click it to download.
This command downloads the system log:

message file "syslog" -f=/var/log/syslog

exitcode=3 in HiveOS

If the error does not go away, check the riser.

exitcode=1 in HiveOS

This error occurs when the date in the motherboard BIOS is wrong (the clock has been reset) and/or there is a problem with the internet connection.
If the clock is wrong, you will not be able to connect remotely.
Even so, updating the Nvidia drivers should go through with the command:

nvidia-driver-update --list

gpu fault detected 146

You are most likely trying to mine with PhoenixMiner. There are two solutions:

Roll back to an older version, for example 5.4c
(Recommended) Use T-Rex for Nvidia cards and TeamRedMiner for AMD.

Waiting interface to come up: VPN does not work on HiveOS

Start with the logs to understand which error is actually causing the problem.
Commands for getting the logs:
systemctl status openvpn@client
journalctl -u openvpn@client -e --no-pager -n 100

How to find the IP address of a HiveOS worker

The simplest way is to open the worker and scroll down the page below the GPUs. The Remote IP shown there is the external IP.
Alternatively, you can check your external IP address from the Hive Shell console.
Run one of these commands:
curl 2ip.ru
wget -qO- eth0.me
wget -qO- ipinfo.io/ip
wget -qO- ipecho.net/plain
wget -qO- icanhazip.com
wget -qO- ipecho.net
wget -qO- ident.me

Repository update failed in HiveOS

This occasionally occurs on HiveOS. Full error text:

Some index files failed to download. They have been ignored, or old ones used instead.
Repository update failed
------------------------------------------------------
> Restarting autofan and watchdog
> Starting miners
Miner screen is already running
Run miner or screen -r to resume screen
Upgrade failed

Solution:

Run the command apt update && selfupgrade -f
If that does not work either, the HiveOS developers are almost certainly already aware of the problem and working on it. Try the update again a little later.

RaveOS does not start. Boot aborted RaveOS

Recheck all PC and motherboard BIOS settings:
- Set the boot device to HDD/SSD/M2/USB, depending on the media that holds the OS.
- Enable 4G decoding.
- Set PCIe support to Auto.
- Enable integrated graphics.
- Set the preferred boot mode to Legacy mode.
- Disable virtualization.

If some cards are still not detected after these settings, make the following changes in the BIOS (a full reboot is required after each step):

- Disable 4G decoding
- Reboot
- Disable CSM
- Reboot
- Enable 4G decoding and set PCI-E Gen2/3; if Gen2/3 is not available, Gen1 can be selected

Failed to allocate memory Raveos

The same error may also appear as:
failed to allocate initramfs memory bailing out, failed to load idlinux c.32
or
failed to allocate memory for kernel boot parameter block
or
failed to allocate initramfs memory raveos bailing

The fix is the same in every case: configure the motherboard BIOS correctly.

gpu_driver_fault, GPU #0 fault in RaveOS

In most cases this problem is solved by reducing the overclock (especially on memory) on the specific card (card number 0 in this example).
If reducing the overclock does not help, try updating the drivers.
If updating the drivers does not solve it, swap this card's riser for one that is known to work.
If that does not help either, recheck all cable connections and the PSU capacity, and make sure it is sufficient for your configuration.

Gpu driver fault. All tasks have been stopped. Worker will be rebooted after 5 minutes in RaveOS

What causes this error? You have probably overclocked the card too far (memory is often pushed too hard), so reduce the overclock. In the original screenshot the problem came from GPU number 1; start with whichever card your message names.
The second common cause is a PSU that cannot supply enough power for the system and its cards. Note that the system itself draws at least 100 W, and budget another 50 W for each riser. The PSU should cover the total with a 20% margin.
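The power budget described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and the example card wattages are hypothetical, and the 20% margin is read here as "PSU rating of at least 1.2 times the estimated load".

```python
SYSTEM_BASE_W = 100   # the system itself, per the guideline above
RISER_W = 50          # budget per riser, per the guideline above
PSU_MARGIN = 0.20     # 20% headroom


def required_psu_watts(gpu_watts, risers=None):
    """Estimate the minimum PSU rating for a rig.

    gpu_watts: power draw of each card in watts (your own measurements).
    risers: number of risers; defaults to one per card (an assumption).
    """
    if risers is None:
        risers = len(gpu_watts)
    load = SYSTEM_BASE_W + sum(gpu_watts) + RISER_W * risers
    return load * (1 + PSU_MARGIN)


if __name__ == "__main__":
    # Hypothetical rig: six cards drawing 120 W each, one riser per card.
    cards = [120] * 6
    print(f"Minimum PSU rating: about {required_psu_watts(cards):.0f} W")
```

If the result exceeds your PSU rating, lower the power limits, remove a card, or add a second PSU before blaming the driver.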

Miner restarted after error in RaveOS

Look at the miner logs; they will show the specific error that leads to miner restarted. Then find that error on this page and fix it. The problem will go away.

Miner restart limit reached. Worker rebooting by flag auto in RaveOS

As with the previous item, look at the miner logs; they will show the specific error that causes the worker to restart. Fix that error and this problem goes away too.

Miner cannot be started, RaveOS

Another error is usually printed right before this one, and that error is the actual cause. If there is nothing, then:

Pause the miner, reboot the rig, and run the commands clear-miners, clear-logs, and fix-fs in the console. Start mining.
If the error persists, rewrite the RaveOS image.

Overclock can't be applied in RaveOS

This error means the overclock values conflict with each other or fall outside the allowed range. Recheck them. Reset the overclock to stock and try again.
In rare cases the riser can also cause this error.

Error installing hive miners

You can try the least painful fix first and enter this command in HiveOS:
hive-replace -y --stable
The system will reinstall the stable version of HiveOS.

If the error does not go away, physically rewrite the image. If you are using a flash drive, it has most likely died. Buy an SSD.

Warning: Nvidia settings applied with errors

The overclock is too aggressive. Lower the core and memory clocks, then reboot the rig.

Nvtool error or Danger: nvtool error

Most likely a problem with the nvtool module occurred while the driver was being installed.
Try reinstalling the Nvidia driver with this Hive shell command:
nvidia-driver-update driver_version --force
Or try updating the whole system with this Hive shell command:
hive-replace -y --stable

GPU fan no longer shown in HiveOS

Fan speed shows 0%.
This can happen for several reasons:

the fan really is not spinning
the RPM sensor is disconnected or broken
the card is running too aggressively (high overclock)
the riser or one of its parts is faulty

ERROR: parsing JSON failed

Run the following command locally on the rig (with a keyboard and monitor attached):
net-test

This command shows the current state of your connection to the various HiveOS API server mirrors.
See which API mirror has the lowest latency (ping), and once the worker reappears in the dashboard, change the default mirror to the one closest to you.
After changing the mirror, be sure to reboot your worker.
You can change the API server with the command nano /hive-config/rig.conf
After making the change, press Ctrl+O and Enter to save the file.
Then exit to the console with Ctrl+X, F10 and run the command hello

NVML: can’t get fan speed for GPU #5, error code 999 hive os

A problem with the fan speed on GPU 5
Fan speed shows 0%, or fan errors in general
This can happen for several reasons:
- the fan really is not spinning
- the RPM sensor is disconnected or broken
- the card is running too aggressively (high overclock)
Start with a visual check of the card and its fan.

Can’t get power for GPU #2

This error usually appears alongside others:
Attribute 'GPUGraphicsClockOffset' was already set to 0
Attribute 'GPUMemoryTransferRateOffset' was already set to 2200
Attribute 'GPUFanControlState' (hive1660s_ETH:0[gpu:2]) assigned value 0.

20211029 12:40:50 WARN: NVML: can’t get fan speed for GPU #2, error code 999
20211029 12:40:50 WARN: NVML: can’t get power for GPU #2, error code 999
20211029 12:40:50 WARN: NVML: can’t get mem/core clock for GPU #2, error code 999

Solution:

Check that the driver for the card is installed correctly.
If the driver is fine, try different overclock settings, for example a lower memory overclock.

GPU1 search error: unspecified launch failure

Reduce the overclock and check the riser contacts.

Warning: Autofan: unable to set fan speed, rebooting

Find the miner logs and see which errors the miner writes there. For example:

kernel: [12112.410046][ T7358] NVRM: GPU at PCI:0000:0c:00: GPU-236e3bef-2e03-6cdb-0518-7ac01eb8736d
kernel: [12112.410049][ T7358] NVRM: Xid (PCI:0000:0c:00): 62, pid=7317, 0000(0000) 00000000 00000000
kernel: [12112.433831][ T7358] NVRM: Xid (PCI:0000:0c:00): 45, pid=7317, Ch 00000010
CRON[21094]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)

From the logs we can see a problem with the card at PCIe address 0c:00 (the PCIe slot address is shown in place of the GPU number), with errors 45 and 62.
Error codes (including others that may appear there) and what to do about them:

• 13, 43, 45: memory errors, lower MEM
• 8, 31, 32, 61, 62: lower CORE, possibly MEM as well
• 79: lower CORE, check the riser
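To apply the code list above to a real log without reading it line by line, a small Python sketch like the following can help. The regular expression is an assumption based only on the sample kernel lines shown above, and the function and variable names are made up for illustration; the code-to-advice mapping is taken from the list above.

```python
import re

# Advice per Xid code, copied from the list above.
XID_ADVICE = {}
for codes, advice in [
    ((13, 43, 45), "memory error: lower MEM"),
    ((8, 31, 32, 61, 62), "lower CORE, possibly MEM as well"),
    ((79,), "lower CORE, check the riser"),
]:
    for code in codes:
        XID_ADVICE[code] = advice

# Matches lines such as: "NVRM: Xid (PCI:0000:0c:00): 62, pid=7317, ..."
XID_RE = re.compile(r"NVRM: Xid \(PCI:([0-9a-fA-F:.]+)\): (\d+),")


def diagnose(log_text):
    """Return (pci_address, xid_code, advice) for every Xid line found."""
    findings = []
    for match in XID_RE.finditer(log_text):
        pci, code = match.group(1), int(match.group(2))
        findings.append((pci, code, XID_ADVICE.get(code, "not covered by this list")))
    return findings


if __name__ == "__main__":
    sample = (
        "kernel: [12112.410049][ T7358] NVRM: Xid (PCI:0000:0c:00): 62, pid=7317, 0000(0000) 00000000 00000000\n"
        "kernel: [12112.433831][ T7358] NVRM: Xid (PCI:0000:0c:00): 45, pid=7317, Ch 00000010\n"
    )
    for pci, code, advice in diagnose(sample):
        print(f"{pci}: Xid {code} -> {advice}")
```

Feed it the contents of the downloaded syslog; codes that are not in the list are reported as not covered rather than guessed.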

Kernel-Power error, code 41

Check all cables (from the PSU to the cards, from the PSU to the risers); something may be melting somewhere. If a visual inspection shows everything is fine, the error is software-related and you need to reinstall Windows.

Danger: hive-replace -y --stable (failed, exitcode=137)

A very rare error that showed up during a remote update of the HiveOS image. You will not find it in mining groups or on mining sites. You won't believe what happened.
A family of pigeons had settled on the balcony where the rig stood. They literally covered the rig in droppings, and because of that it kept going offline. After the motherboard and cards were thoroughly blown clean, the problem went away.

MALFUNCTION HIVEOS

Malfunction means a fault. There can be several causes and fixes:

Reinstall the video driver.
If the driver did not help, disconnect all GPUs and plug them back in one at a time, watching whether any particular card triggers the error. If one does, the riser may be to blame.
The media that Hive OS is written to is faulty; write the image again.

TTMiner is refusing to run KawPOW on my 1070Tis today for some reason. Log file is below:

TT-Miner — Version: 5.0.2 (May 15 2020 07:52:47)

No cuda shared libraries found. Cannot continue.

What is the Nvidia driver version number?

I am using the same. Possibly TT-Miner needs some extra DLL. I'll check this now.

It’s doing it on KawPOW on my 1660Tis too now.

Phoenix is saying this for my 1660Tis:

Unable to enum CUDA GPUs: unknown error
No avaiable GPUs for mining. Please check your drivers and/or hardware.

I found this in gpu-count.txt

00:02.0 VGA compatible controller: Intel Corporation HD Graphics 510 (rev 06)
01:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070 Ti] (rev a1)
06:00.0 VGA compatible controller: NVIDIA Corporation TU116 [GeForce GTX 1660 Ti] (rev a1)
0a:00.0 VGA compatible controller: NVIDIA Corporation TU116 [GeForce GTX 1660 Ti] (rev a1)
0b:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070 Ti] (rev a1)
0c:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070 Ti] (rev a1)
0d:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070 Ti] (rev a1)
0e:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070 Ti] (rev a1)

This is showing 5 1070Tis and Intel graphics. I have Intel graphics but don’t have any display connected. I also only have 3 1070Tis installed currently. Could this be causing me issues, and if so, how can I fix it?

Actually, this sounds like a driver problem. Possibly the drivers got corrupted. First, I would try to re-install the Nvidia drivers (remember to set Force P2 State to Off afterwards).

The gpu-count.txt should not have any impact. It is set when you run Install.bat

I’ve taken GPUs out since running Install.bat. Do I need to reconfigure RBM?


No. This is only necessary, if RainbowMiner runs on a Linux machine and the rig hosts AMD GPUs.

Is this problem still persisting? If yes, could you perhaps upload a Debug file?

I think it’s an issue with my Asus B250 Mining Motherboard not liking my 1660Tis. I’ve ordered an Asrock H110 BTC+ which should hopefully resolve the issue. You might as well close this one as I’m 90% sure the issue is something to do with my motherboard.

Again, the problem is the --opencl-plateform parameter. Try KawPOW standalone without it and it works. Removing the nvidia/amd platform parameter and letting the miner auto-detect would be the solution!
I have the same error with a Vega 56 (so, AMD).
Remember, RainbowMiner, I think it was the same issue on NBminer.

This is an issue with my motherboard.

I’ve resolved the problems with my rig, but TTminer is still acting funny. I’ve got my GPUs split into different device groups, but TTminer is not working when running on more than one card.

Is it, that two TTminers running in parallel fail? Or is it one TTminer failing to run on more than one GPU?


One instance runs fine, then the 3 others fail.


CUDA error in CudaProgram.cu:373 : out of memory (2) #1857

Phoenix Miner 4.7c Windows/msvc — Release build

CUDA version: 10.0, CUDA runtime: 8.0
Available GPUs for mining:
GPU1: GeForce GTX 1050 Ti (pcie 2), CUDA cap. 6.1, 4 GB VRAM, 6 CUs
Nvidia driver version: 441.12
Eth: the pool list contains 1 pool (1 from command-line)
Eth: primary pool: daggerhashimoto.br.nicehash.com:3353
Starting GPU mining
Eth: Connecting to ethash pool daggerhashimoto.br.nicehash.com:3353 (proto: Nicehash)
GPU1: 30C 30% 36W
GPUs power: 35.6 W
Eth: Connected to ethash pool daggerhashimoto.br.nicehash.com:3353 (172.65.195.159)
Listening for CDM remote manager at port 4000 in read-only mode
Eth: Subscribed to ethash pool
Eth: Worker 3KbfgWhzLi4QrM6METY3aAzAA1NBhWj7j9$0-FfPFLHLcjVWIwstHKh8jHw authorized
Eth: New job #79ff35a2 from daggerhashimoto.br.nicehash.com:3353; diff: 8590MH
GPU1: Starting up. (0)
GPU1: Generating ethash light cache for epoch #296
Eth: New job #b158f306 from daggerhashimoto.br.nicehash.com:3353; diff: 8590MH
Eth: New job #7f4f2f7d from daggerhashimoto.br.nicehash.com:3353; diff: 8590MH
Light cache generated in 2.6 s (20.7 MB/s)
GPU1: Allocating DAG (3.33) GB; good for epoch up to #298
CUDA error in CudaProgram.cu:373 : out of memory (2)
GPU1: CUDA memory: 4.00 GB total, 3.30 GB free
GPU1 initMiner error: out of memory
Eth speed: 0.000 MH/s, shares: 0/0/0, time: 0:00
Eth speed: 0.000 MH/s, shares: 0/0/0, time: 0:00
Eth: New job #831b4fb4 from daggerhashimoto.br.nicehash.com:3353; diff: 8590MH
Eth: New job #e68b5bc2 from daggerhashimoto.br.nicehash.com:3353; diff: 8590MH
Eth speed: 0.000 MH/s, shares: 0/0/0, time: 0:00
Eth speed: 0.000 MH/s, shares: 0/0/0, time: 0:00
Eth: New job #f7f9bd07 from daggerhashimoto.br.nicehash.com:3353; diff: 8590MH
Eth: New job #be9671bf from daggerhashimoto.br.nicehash.com:3353; diff: 8590MH
Eth speed: 0.000 MH/s, shares: 0/0/0, time: 0:00
GPU1: 30C 30% 36W
GPUs power: 35.6 W
Eth speed: 0.000 MH/s, shares: 0/0/0, time: 0:00
Eth speed: 0.000 MH/s, shares: 0/0/0, time: 0:00
Eth: New job #40399f1d from daggerhashimoto.br.nicehash.com:3353; diff: 8590MH
Eth: New job #c947cfff from daggerhashimoto.br.nicehash.com:3353; diff: 8590MH
Eth: New job #9a1e3f7f from daggerhashimoto.br.nicehash.com:3353; diff: 8590MH
Eth speed: 0.000 MH/s, shares: 0/0/0, time: 0:00

*** 0:00 *** 11/7 17:13 **************************************
Eth: Mining ETH on daggerhashimoto.br.nicehash.com:3353 for 0:00
Eth: Accepted shares 0 (0 stales), rejected shares 0 (0 stales)
Eth: Incorrect shares 0 (0.00%), est. stales percentage 0.00%
Eth: Average speed (5 min): 0.000 MH/s

Eth speed: 0.000 MH/s, shares: 0/0/0, time: 0:00

Hi @WillianSalceda
Thank you for reaching out.
Your GPU is unable to mine daggerhashimoto because it doesn't have enough memory.
It has 3.30 GB of free memory, but the current DAG size is larger than that. So if you still want to mine this algorithm, install Windows 7, since it doesn't take as much memory as Windows 10. Or just start mining another algorithm.

I hope my answer was helpful. If you still have any questions, please reopen the issue.
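The comparison made in this reply (DAG size versus free GPU memory) can be pulled straight out of a PhoenixMiner log. The following is a rough Python sketch; the regular expressions are assumptions derived only from the two log lines quoted in this issue, and the function name is hypothetical.

```python
import re

# Patterns modeled on these two lines from the log above:
#   "GPU1: Allocating DAG (3.33) GB; good for epoch up to #298"
#   "GPU1: CUDA memory: 4.00 GB total, 3.30 GB free"
DAG_RE = re.compile(r"Allocating DAG \(([\d.]+)\) GB")
FREE_RE = re.compile(r"CUDA memory: [\d.]+ GB total, ([\d.]+) GB free")


def dag_fits(log_text):
    """Return True/False if both figures are present in the log, else None."""
    dag = DAG_RE.search(log_text)
    free = FREE_RE.search(log_text)
    if dag is None or free is None:
        return None
    return float(dag.group(1)) <= float(free.group(1))


if __name__ == "__main__":
    log = (
        "GPU1: Allocating DAG (3.33) GB; good for epoch up to #298\n"
        "CUDA error in CudaProgram.cu:373 : out of memory (2)\n"
        "GPU1: CUDA memory: 4.00 GB total, 3.30 GB free\n"
    )
    print("DAG fits in free memory:", dag_fits(log))
```

For the log in this issue the check reports that the DAG does not fit, which matches the explanation above.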

Источник

Ошибки Видеокарты При Майнинге

Зарабатывай на чужих сделках на бирже BingX. Подробнее — тут.

UNABLE TO ENUM CUDA GPUS: INVALID DEVICE ORDINAL

UNABLE TO ENUM CUDA GPUS: INSUFFICIENT CUDA DRIVER: 5000

NBMINER MINING PROGRAM UNEXPECTED EXIT.CODE: -1073740791, REASON: PROCESS CRASHED

NBMINER CUDA ERROR: OUT OF MEMORY (ERR_NO=2) — как исправить?

GMINER ERROR ON GPU: OUT OF MEMORY STOPPED MINING ON GPU0

Socket error. the remote host closed the connection, в майнере Nbminer

DAG has been damaged check overclocking settings, в майнере Gminer

ERROR: Can’t start T-Rex, failed to initialize device map: can’t get busid, code -6

Ошибки настройки памяти с кодом -6 обычно указывают на проблему с драйвером.

TREX: Can’t unlock GPU

Полный текст ошибки:
TREX: Can’t unlock GPU [ID=1, GPU #1], error code 15
WARN: Miner is going to shutdown.
WARN: NVML: can’t get fan speed for GPU #1, error code 15
WARN: NVML: can’t get power for GPU #1, error code 15
WARN: NVML: can’t get mem/core clock for GPU #1, error code 17

Решение:

Проверьте все кабельные соединения видеокарты и райзера, особенно кабеля питания.
Если с первый пунктом все ок, попробуйте поменять райзер на точно рабочий.
Если ошибка остается, вставьте видеокарту в разъем х16 напрямую в материнскую плату.

CAN’T START MINER, FAILED TO INITIALIZE DEVIS MAP, CAN’T GET BUSID, CODE -6

Зарабатывай на чужих сделках на бирже BingX. Подробнее — тут.

ОШИБКА 511 ГРАДУСОВ НА ВИДЕОКАРТА

GPU driver error, no temps в HiveOS — что делать?

Если ошибка не уйдет — проверьте райзер.

GPU are lost, rebooting

This is not the error itself but its consequence. To find out which error causes the GPUs to reboot, do the following:

Enable log saving (it is off by default) with the command

message file "miner.log" -f=/var/log/miner/minername/minername.log

For example, if I need the TeamRedMiner logs:
message file "teamredminer.log" -f=/var/log/miner/teamredminer/teamredminer.log

message file "syslog" -f=/var/log/syslog
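Once a log file is available, the last few error lines before the reboot are usually the interesting part. The following is a minimal sketch of that search; the function name, the keyword list, and the default limit are assumptions made for illustration, and the path is whichever log file was saved above.

```python
from collections import deque


def last_error_lines(path, keywords=("error", "fail"), limit=20):
    """Return up to `limit` most recent lines that contain any keyword."""
    matches = deque(maxlen=limit)  # older matches fall off the front
    with open(path, "r", errors="replace") as handle:
        for line in handle:
            lowered = line.lower()
            if any(word in lowered for word in keywords):
                matches.append(line.rstrip("\n"))
    return list(matches)
```

For example, `last_error_lines("/var/log/miner/teamredminer/teamredminer.log")` would list the most recent matching lines from the TeamRedMiner log mentioned above.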

exitcode=3 in HiveOS

If the error persists, check the riser.

exitcode=1 in HiveOS

gpu fault detected 146

Waiting interface to come up: VPN not working on HiveOS

How to find a worker's IP address in HiveOS

Repository update failed in HiveOS

RaveOS does not start. Boot aborted RaveOS

Failed to allocate memory RaveOS

But it has only one solution: you must configure the motherboard BIOS correctly.

gpu_driver_fault, GPU #0 fault in RaveOS


Gpu driver fault. All tasks have been stopped. Worker will be rebooted after 5 minutes in RaveOS

Miner restarted after error RaveOS

Miner restart limit reached. Worker rebooting by flag auto in RaveOS

Miner cannot be started, RaveOS

Pause the miner, reboot the rig, and run the commands clear-miners, clear-logs, and fix-fs in the console. Then start mining.
If the error persists, rewrite the RaveOS image.

Overclock can’t be applied in RaveOS

Error installing hive miners

If the error persists, physically rewrite the image. If you are using a USB flash drive, it has most likely died. Buy an SSD. 🙂

Warning: Nvidia settings applied with errors

GPU fan no longer displayed in HiveOS

ERROR: parsing JSON failed

Run the following command locally on the rig (with a keyboard and monitor attached):
net-test

NVML: can’t get fan speed for GPU #5, error code 999 hive os

Can’t get power for GPU #2

Solution:

GPU1 search error: unspecified launch failure

Warning: Autofan: unable to set fan speed, rebooting

Find the miner logs and look at which errors the miner writes there. For example:

• 13, 43, 45: memory errors, lower MEM
• 8, 31, 32, 61, 62: lower CORE, possibly MEM as well
• 79: lower CORE, check the riser
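The list above amounts to a small lookup table, which can be written down as follows. This is a sketch under assumptions: the codes look like NVIDIA Xid codes, but the source does not say so, and the table and function names are hypothetical.

```python
# Advice copied from the list above; the grouping of codes is the source's own.
ADVICE_BY_CODES = {
    (13, 43, 45): "memory errors: lower MEM",
    (8, 31, 32, 61, 62): "lower CORE, possibly MEM as well",
    (79,): "lower CORE, check the riser",
}


def advice_for(code: int):
    """Return the suggested action for an error code, or None if it is not listed."""
    for codes, advice in ADVICE_BY_CODES.items():
        if code in codes:
            return advice
    return None


for code in (43, 61, 79, 1):
    print(code, advice_for(code))
```

A code that returns None is simply not covered by the list and needs to be looked up separately.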

Source

Worker is offline on minerstat but appears to be mining on the pool

When you have minerstat software properly installed and your worker is mining but showing offline in minerstat dashboard and online on the pool, something broke the communication between minerstat software, minerstat dashboard, and your machine. This … Read more →

Why is chart sometimes dropping to zero?

First of all, there’s no need to panic. This may be a symptom of a problem that needs further inspection and fixing, or can be something absolutely normal and nothing to worry about. Read more →

Why does Windows rig keep restarting?

The first thing to note is: do not panic. There are many reasons for a rig to be restarting, and it may take some time to diagnose, especially if a problem is only occurring periodically. In that case, Triggers functionality can be a great aid, bu… Read more →

What does undetected hardware mean?

GPU rigs
If your hardware is undetected, it means that your mining rig didn’t manage to send data about your hardware to the minerstat server. This could be due to a node crash or because you haven’t started mining yet. Read more →

Coin is displayed as ‘Unknown’ or ‘NO’

Coin can be displayed as ‘NO’ or ‘Unknown’ in two cases.
A) We didn’t add support for it yet
In case your worker shows «NO» under the coin’s name, but it is mining normally and you can see the pool on which the … Read more →

GPU(s) displayed as Unknown

In case your GPUs are not displayed with their model name (for example, GTX 1080Ti, RX 580, etc.) do note that this is only a display interface issue and all other functionalities should work as expected. Read more →

Why does the GPU sometimes show a temperature of 511°C?

If you see drivers reporting a temperature of 511 °C, this isn’t the actual temperature of your GPU. Drivers use this value to signal a driver error. Read more →

msOS doesn’t detect all GPUs

If msOS doesn’t detect all of your GPUs in the system, check the following.
BIOS settings
Set UEFI boot in BIOS
Set all GPUs to PCIe GEN 2. Read more →

What to do if auto-fans aren’t working?

In case auto-fans aren’t working or you have set up the fan triggers and you see that the trigger was sent but wasn’t applied, your GPU drivers aren’t accepting fan control commands. Usually, this is connected with too intense overclocking a… Read more →

Invalid user provided

If you are getting errors like Invalid user provided or Invalid wallet provided you are not mining to the correct wallet address.
You will need to recheck your worker’s config. Read more →

Missing pool port

The pool address to which you connect for mining consists of two elements:
Domain, subdomain, or IP
Port
These two elements are joined with a colon, like this:
eu.sandbox. Read more →
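The host-and-port structure described above is easy to validate before a config is saved. The sketch below is a hypothetical helper, not minerstat code; the address `pool.example.com:3333` is a placeholder, and IPv6 literals are deliberately not handled.

```python
def split_pool_address(address: str):
    """Split 'host:port' (optionally prefixed by a scheme) into (host, port).

    Raises ValueError when the port is missing or not a valid number.
    """
    if "://" in address:
        # Drop a scheme such as stratum+tcp:// before looking for the port.
        address = address.split("://", 1)[1]
    host, separator, port = address.rpartition(":")
    if not separator or not host:
        raise ValueError("missing pool port")
    if not port.isdigit() or not 1 <= int(port) <= 65535:
        raise ValueError("invalid pool port")
    return host, int(port)


print(split_pool_address("pool.example.com:3333"))
```

An address without the colon and port raises `ValueError("missing pool port")`, which mirrors the error named in the heading above.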

Benchmark: Failed (Missing hashrate)

When you are benchmarking your cards, it can happen that some mining clients won’t report the speed. In this case, the result of the benchmark for this mining client and algorithm combination will be Failed (Missing hashrate). Read more →

Error (Error: certificate has expired) Waiting for connection

If you get the following certificate has expired error on your msOS rig
Error (Error: certificate has expired)
Waiting for connection
this means that the date and time in your BIOS aren’t properly set (either they date into the past or in t… Read more →
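Why a wrong clock produces a "certificate has expired" error can be shown with a short check: a certificate is only accepted while the local time lies inside its validity window. The sketch below is an illustration only; the function name and the dates are hypothetical.

```python
from datetime import datetime, timezone


def clock_within_validity(not_before, not_after, now=None):
    """True if `now` (default: the system clock, in UTC) lies inside the window."""
    if now is None:
        now = datetime.now(timezone.utc)
    return not_before <= now <= not_after


# Hypothetical validity window of a server certificate.
valid_from = datetime(2023, 1, 1, tzinfo=timezone.utc)
valid_until = datetime(2024, 1, 1, tzinfo=timezone.utc)

# A clock that was reset to a date before the window fails the check,
# even though the certificate itself is fine.
reset_clock = datetime(2015, 1, 1, tzinfo=timezone.utc)
print(clock_within_validity(valid_from, valid_until, now=reset_clock))
```

Correcting the date and time in the BIOS moves the clock back inside the window, after which the same certificate is accepted.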

Error: You need to load the kernel first

If you see the following error on your screen:
error: /boot/vmlinuz-5.0. Read more →

Error: EHOSTUNREACH

If you see an error that says EHOSTUNREACH, it could be that you are having internet issues. You can quickly check if this is the case by calling netcheck command. Read more →

Nvidia 460.79 drivers not working

If you are mining on Windows and unlucky enough to install recent Nvidia drivers, then you want to use our simple NVML patcher. Usage is simple: Just close down every application/driver control panel that can use NVML, then run the . Read more →

Connect error: Connection refused

An error Connect error: «Connection refused» is usually connected with the following reasons: No internet access; DNS issues; Incompatible mining client; IP banned by the pool; Pool is unreachable (having temporary issues); Hardware related, … Read more →

How to solve API bind error?

Sometimes mining client will show you an output such as:
API bind error
TCP API bind address already in use
Bind failed with error
Port is busy
Port already in use
Failed to bind to port
These errors occur when the previous mining… Read more →
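Whether an API port is still held by another process can be checked by trying to bind to it. This is a minimal sketch using only the Python standard library; the function name is hypothetical, and the port to test is whatever API port your mining client is configured to use.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Try to bind a TCP socket to host:port; False means something holds the port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
    return True
```

If the function returns False right after a miner was stopped, the earlier instance is probably still running and has to be closed before the new one can bind.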

Nvidia-settings: Couldn’t connect to accessibility bus

If you run into an error saying that Nvidia-settings couldn’t connect to the accessibility bus, it means that the driver had failed to locate the GPU. In case that you have only one GPU in the system, make sure you have plugged it to the first x16… Read more →

Unable to query number of CUDA devices

If you see an error «Unable to query number of CUDA devices» your drivers didn’t detect any GPU in the system and because of that the mining won’t work. There can be several different reasons for it. Read more →

Socket connection closed remotely by pool

There could be different reasons why your socket connection was closed by pool. We will list the most common ones. Read more →

Kernel panic — not syncing: Out of memory and no killable process

A kernel panic caused by an out-of-memory state with no killable process often happens with AMD GPUs that are used for mining CryptoNight variant algorithms.
To fix this error, you need to add more virtual memory to your system. Read more →

libEGL warning: DRI2: failed to authenticate

If you have Nvidia GPUs and you get libEGL warning: DRI2: failed to authenticate error on your msOS, please type mreconf in the console and the system will restore. Read more →

The semaphore timeout period has expired

If you are getting the error The semaphore timeout period has expired as a response to mining to a pool, your router is probably blocking the connection to the pool. Asus routers have an option for «vulnerability protection». Read more →

GPU1: clSetKernelArg (-48) Fatal error detected. Restarting.

Sometimes PhoenixMiner will exit with the following error:
GPU1: clSetKernelArg (-48)
Fatal error detected. Restarting. Read more →

Unable to enum CUDA GPUs: invalid device ordinal

If you are getting Unable to enum CUDA GPUs: invalid device ordinal error, then you can try the following:
Update your BIOS to the latest version.
Call pci-realloc command. Read more →

T-Rex instance wasn’t validated

If you run into an error saying that T-Rex instance wasn’t validated it means that the miner has been unable to connect to sub-domains of trex-miner.com, port 443. Read more →

Console: Pool error

When you see «Pool error» in your 24h logs or worker’s latest activity, this can describe different issues connected to the pool, your network, or your connection with the pool.
The most common issues connected to this console error a… Read more →

Console: Authorization error

If you see the «Authorization error» on your 24h logs or worker’s latest activity, this means that the pool has rejected your login info.
Possible reasons for rejection:
Pool rejected your login due to invalid wallet address or … Read more →

Console: Config error

The «Config error» can be found in 24h logs and worker’s latest activity when the configuration of the mining client is incorrect. This can happen for a lot of different reasons as every mining client uses their own parameter structure a… Read more →

Console: Mining client error

The «Mining client error» that is shown in your 24h logs or worker’s latest activity indicates that the error is connected to the mining client. The most common situation when this error appears is when the mining client is using an API… Read more →

Console: GPU error

When you see a «GPU error» in your 24h logs or worker’s latest activity, there is trouble detecting information connected to your GPU; in some cases, you will also be able to see which GPUs are the problematic ones.
We suggest … Read more →

Console: Driver error

If you see «Driver error» in your 24h logs or worker’s latest activity, the error means that there are issues with the drivers on the rig.
To fix this error, check the following:
Check if you have added your worker in correct ty… Read more →

Console: System message

«System message» in your 24h logs or worker’s latest activity denotes the rig’s last response to manual or automatic reboot, shut down, stop and start mining, or power cycle commands. It is just a notification for you to know that th… Read more →

Diagnostic: Summary — Status

The status audit is part of a real-time summary section and available under the summary tab in the diagnostic. It shows the number of workers with specific status: online, offline, and idle. Read more →

Diagnostic: Summary — Temperature

The temperature audit is part of a real-time summary section and available under the summary tab in the diagnostic. It shows the number of workers with a specific issue: workers with at least one GPU or board over the hot limit and workers with a… Read more →

Diagnostic: Summary — Missing data

A missing data audit is part of a real-time summary section and available under the summary tab in the diagnostic. It shows the number of workers with a specific issue: missing temperature data and missing fan data. Read more →

Diagnostic: Summary — Mining issues

A mining issue audit is part of a real-time summary section and available under the summary tab in the diagnostic. It shows the number of workers with a specific issue: missing hashrate (or hashrate speed 0 H/s) and low efficiency (efficiency sma… Read more →

Diagnostic: Profitability trend

The profitability trend is available under the statistics tab in the diagnostic and shows average daily earnings on your account with a profitability range (minimum and maximum peaks of estimated daily earnings; not taking offline periods in… Read more →

Diagnostic: Uptime

The uptime is available under the statistics tab in the diagnostic and is calculated from your global statistics data. This means that it is calculated as the ratio between data points when your rig was detected as online and the total sum o… Read more →

Diagnostic: Volatility

Same as uptime, volatility is available under the statistics tab in the diagnostic and is calculated from your global statistics data. It takes into consideration all data points with online status and is calculated as the ratio between stan… Read more →

Diagnostic: GPU temperature alerts

The GPU temperature alerts audit is available under the activity tab in the diagnostic and shows the current grade from the data obtained yesterday and a small historical chart for the performance of the last week. The grade is determined de… Read more →

Diagnostic: Offline alerts

Offline alerts audit is available under the activity tab in the diagnostic and shows the current grade from the data obtained yesterday and a small historical chart for the performance of the last week. The grade is determined depending on h… Read more →

Diagnostic: ASIC alerts

The ASIC alerts audit is available under the activity tab in the diagnostic and shows the current grade from the data obtained yesterday and a small historical chart for the performance of the last week. The grade is determined depending on … Read more →

Diagnostic: Hashrate drop alerts

The hashrate drop alerts audit is available under the activity tab in the diagnostic and shows the current grade from the data obtained yesterday and a small historical chart for the performance of the last week. The grade is determined depe… Read more →

Diagnostic: Efficiency drop alerts

The efficiency drop alerts audit is available under the activity tab in the diagnostic and shows the current grade from the data obtained yesterday and a small historical chart for the performance of the last week. The grade is determined de… Read more →

Diagnostic: Temperature triggers

The temperature triggers audit is available under the activity tab in the diagnostic and shows the current grade from the data obtained yesterday and a small historical chart for the performance of the last week. The grade is determined depe… Read more →

Diagnostic: Unresponsive triggers

Unresponsive triggers audit is available under the activity tab in the diagnostic and shows the current grade from the data obtained yesterday and a small historical chart for the performance of the last week. The grade is determined dependi… Read more →

Diagnostic: Idle triggers

The idle triggers audit is available under the activity tab in the diagnostic and shows the current grade from the data obtained yesterday and a small historical chart for the performance of the last week. The grade is determined depending o… Read more →

Diagnostic: Inactive triggers

The inactive triggers audit is available under the activity tab in the diagnostic and shows the current grade from the data obtained yesterday and a small historical chart for the performance of the last week. The grade is determined dependi… Read more →

Diagnostic: Hashrate drop triggers

The hashrate drop triggers audit is available under the activity tab in the diagnostic and shows the current grade from the data obtained yesterday and a small historical chart for the performance of the last week. The grade is determined de… Read more →

Diagnostic: Efficiency drop triggers

The efficiency drop triggers audit is available under the activity tab in the diagnostic and shows the current grade from the data obtained yesterday and a small historical chart for the performance of the last week. The grade is determined … Read more →

Diagnostic: Earnings drop triggers

The earnings drop triggers audit is available under the activity tab in the diagnostic and shows the current grade from the data obtained yesterday and a small historical chart for the performance of the last week. The grade is determined de… Read more →

Diagnostic: Console errors — Driver

The console errors — driver audit is available under the activity tab in the diagnostic and is currently shown only if you have at least one msOS rig running. The audit shows the current grade from the data obtained yesterday and a small his… Read more →

Diagnostic: Console errors — Pool

The console errors — pool audit is available under the activity tab in the diagnostic and is currently shown only if you have at least one msOS rig running. The audit shows the current grade from the data obtained yesterday and a small histo… Read more →

Diagnostic: Console errors — Config

The console errors — config audit is available under the activity tab in diagnostic and is currently shown only if you have at least one msOS rig running. The audit shows the current grade from the data obtained yesterday and a small histori… Read more →

Diagnostic: Console errors — Authorization

The console errors — authorization audit is available under the activity tab in the diagnostic and is currently shown only if you have at least one msOS rig running. The audit shows the current grade from the data obtained yesterday and a sm… Read more →

Diagnostic: Console errors — Mining client

The console errors audit — mining client is available under the activity tab in the diagnostic and is currently shown only if you have at least one msOS rig running. The audit shows the current grade from the data obtained yesterday and a small h… Read more →

Diagnostic: Console errors — GPU

The console errors audit — GPU is available under the activity tab in the diagnostic and is currently shown only if you have at least one msOS rig running. The audit shows the current grade from the data obtained yesterday and a small historical chart… Read more →

Nvidia throttling — Slowdown

Thermal throttling is a state that happens when your GPU is taking too much load and starts overheating. For Nvidia 3000 series such events are quite common, so it is extremely important to have a good cooling system and sensible placement of your GPUs. Read more →

Nvidia throttling — Thermal slowdown

Nvidia throttling — Power brake slowdown

Nvidia throttling — Software thermal slowdown

Источник


no CUDA-capable device is detected (using ubuntu 12.04.4 server) [closed]


I recently installed the cuda toolkit 5.5 with driver 331.67 (I have a GeForce GTX 680). For some reason, I cannot run any of the test scripts:

I followed the steps on the «getting started guide» here

and made a script to create the character device files at startup (as I am running the server edition of Ubuntu such graphics files aren’t created by default):

Here is some info on the nvidia module

EDIT #1 I tried downgrading to driver 319.76:

1 Answer

So it turns out the main error I was encountering was due to the fact that there was a version mismatch between the nvidia kernel module and the driver component. Here are the steps I took which helped me find a resolution.

2) Having installed the kernel modules from the repos, I just picked the corresponding driver component with correct version. If you don’t know the version of your installed kernel module you can use modprobe and modinfo. For example, on my system

The module nvidia_304_updates was installed from the repos (package nvidia-updates-current). Its exact version is found with modinfo
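The version comparison described in this answer can be automated once the modinfo output is captured as text. The sketch below is a hypothetical helper that only parses text; the sample output and the version number in it are made up for illustration and are not taken from the answer.

```python
import re


def module_version(modinfo_output: str):
    """Extract the value of the 'version:' field from modinfo output, if present."""
    # The ^ anchor with MULTILINE keeps 'srcversion:' lines from matching.
    match = re.search(r"^version:\s*(\S+)", modinfo_output, re.MULTILINE)
    return match.group(1) if match else None


def versions_match(modinfo_output: str, driver_component_version: str) -> bool:
    """True when the kernel module and the driver component report the same version."""
    return module_version(modinfo_output) == driver_component_version


# Hypothetical modinfo output; real output contains more fields.
sample_modinfo = """filename:       /lib/modules/example/nvidia_304_updates.ko
version:        304.116
srcversion:     0123456789ABCDEF"""
print(module_version(sample_modinfo))
```

A False result from `versions_match` corresponds to the mismatch that this answer identifies as the cause of the error.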

After downloading and installing the corresponding driver component from the archive on the nvidia website, I was able to run the command

And the original script I was trying to execute

Source

RuntimeError: cuda runtime error (100) : no CUDA-capable device is detected #8


VictorZuanazzi commented Nov 15, 2019

great work with the library!

I am trying to install it, but I am getting a cuda error. I have been using pytorch on the gpus without problems until now.

The full line reads: RuntimeError: cuda runtime error (100) : no CUDA-capable device is detected at /opt/conda/conda-bld/pytorch_1570910687650/work/aten/src/THC/THCGeneral.cpp:50

I am using python 3.7.3 and pytorch 1.3.


Source

No CUDA-capable device is detected although requirements are installed

Here is some information about my system:

I also verified the kernel headers are installed:

Installation of CUDA

So my system meets all the prerequisites. I then followed the instructions for the installation via apt-get (I installed cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb ).

PATH and LD_LIBRARY_PATH are set to point to the required locations:

The NVIDIA drivers also seem to be up-to-date:

Information about the cuda compiler driver:

The instructions mention that this could be a problem with file permission:

If a CUDA-capable device and the CUDA Driver are installed but deviceQuery reports that no CUDA-capable devices are present, this likely means that the /dev/nvidia* files are missing or have the wrong permissions.
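The check suggested by the quoted instructions (are the /dev/nvidia* files present, and what permissions do they have) can be scripted. This is a sketch under assumptions: the function name is hypothetical, and it treats read and write access for all users as the usual setup for these device nodes, which may differ on your system.

```python
import glob
import os
import stat


def device_node_report(pattern: str = "/dev/nvidia*"):
    """List (path, mode string, world read/write?) for every matching file."""
    report = []
    for path in sorted(glob.glob(pattern)):
        mode = os.stat(path).st_mode
        world_rw = bool(mode & stat.S_IROTH) and bool(mode & stat.S_IWOTH)
        report.append((path, stat.filemode(mode), world_rw))
    return report


for entry in device_node_report():
    print(entry)
```

An empty report means the device files are missing altogether, which is the first case the instructions mention.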

Those files didn’t have the execution flag which I then added:

However after running deviceQuery (which still fails) some of the permissions are reset:

Maybe related

Samples build fails

When I try to build the cuda samples via make it fails for one of them with the message

Which indeed seems to be missing:

Although the corresponding header file is there:

Problem with static linking

The error which is raised from deviceQuery suggests a problem with static linking:

AFAIK LD_LIBRARY_PATH is only responsible for dynamic linking. I found this question where a suggestion is to include /usr/lib/nvidia-current to the linker path. However this directory doesn’t exist within my installation:

Source

Tensorflow complains that no CUDA-capable device is detected

I’m trying to run some Tensorflow code, and I get what seems to be a common problem:

The key pieces of that error message seem to be:

How can I install compatible versions? Where is that libcuda version coming from?

Background

A few months ago, I tried installing Tensorflow with GPU support, but the versions either broke my display or wouldn’t work with Tensorflow. Finally, I got it working by following a tutorial on how to install multiple versions of the CUDA libraries on the same machine. That worked at the time, but when I came back to the project after a few months, it has stopped working. I assume that some driver got upgraded during that time.

Investigation

The first thing I tried was to see what versions I have of the nvidia drivers and libcuda package.

Looks like it’s 390.30. Why does the error message say that libcuda reported 390.77?

Again, everything looks like it’s 390.30. There were some packages that had version 390.77, but they were in the rc status. I guess I installed that version and later removed it, so the configuration files were left behind. I purged the configuration files with commands like this:
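Packages left in the rc state (removed, configuration files remaining) can be picked out of a package listing programmatically. The sketch below parses text in the column layout printed by `dpkg -l`; the function name is hypothetical, and the package names and versions in the sample are invented for illustration.

```python
def residual_config_packages(dpkg_listing: str):
    """Return (name, version) for every line whose status column is 'rc'."""
    leftovers = []
    for line in dpkg_listing.splitlines():
        fields = line.split()
        # Columns: status, package name, version, architecture, description.
        if len(fields) >= 3 and fields[0] == "rc":
            leftovers.append((fields[1], fields[2]))
    return leftovers


# Hypothetical listing; package names and versions are placeholders.
sample_listing = """ii  example-driver-390  390.30-0example1  amd64  installed driver
rc  example-libcuda-390  390.77-0example1  amd64  removed, config files remain"""
print(residual_config_packages(sample_listing))
```

Anything this returns is a candidate for purging, which is the cleanup step described above.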

Now, there are no packages at all with version 390.77.

I tried reinstalling CUDA, to see if it had been compiled with the wrong version.

That didn’t make any difference.

Finally, I tried running nvidia-smi.

All of this is running on Ubuntu 18.04 with Python 3.6.7, and my graphics card is NVIDIA Corporation GM107M [GeForce GTX 960M] (rev a2).

Source