System fatal error during previous boot core mid level error

DELL PowerEdge: System fatal error during previous boot

My dedicated server, a DELL R710 running CentOS 6.4, reboots by itself and comes up with the following error.

Does this mean the box failed to boot, or that the kernel panicked during the Linux boot and the server somehow knows about it?

Can anyone advise on diagnostics, or say whether this is a hardware problem that should be passed on to the data center I rent the box from? It ran fine for several months, and in the last two days it has rebooted at random.

Update: the box keeps rebooting. Everything looks normal for about a minute, and then the next line shows the kernel booting, with no shutdown or other error message.

Update 2

I have been running the stress utility on the server for the past 4 days, and the server has not rebooted once. It is maxing out all cores at 100% CPU. I still need to check whether stress exercises memory or disk writes, but as far as the CPUs are concerned, they seem fine.

2 Answers

As the R710 dates from 2009/2010, component failure is always a possibility.

Dell documentation (although for the R410) says:

Since I see a message about fan speed, I think you should carefully examine and log the temperature and how it changes.

It also wouldn’t hurt to open the server, clean it, and check all the contacts.

You can try the tools described in the article «How to troubleshoot hardware problems in Linux» and report the results here.

This message comes from the BIOS, asking whether to continue. It means the motherboard saw something it didn’t like at the hardware level. The OS would not have done this, and it would have logged something to the messages file if it had been given the chance. I would ask for a full diagnostic to be run on the server. The F1/F2 prompt is usually a BIOS alert about a configuration or hardware failure.

R720 PCIe Limit?

superhappychris

New Member

Hi everybody, long time viewer, first time poster.

I’m running into a weird problem with the Ableconn PEXM2-130 Dual PCIe NVMe M.2 SSDs Carrier (ASMedia ASM2824 Switch) I just bought where it only recognizes one of the NVMe drives unless I unplug the disk shelf attached to the LSI 9207-8e card I also have installed. If I attach the NVMe drives using single drive PCIe cards in separate slots then both drives are recognized even if the disk shelf is attached. Has anyone run into this before or know how to go about troubleshooting?

I noticed that when I plug in the Ableconn card the following error shows when booting: «Alert! System fatal error during previous boot PCI Express Error»

Info on the server in question — Dell R720 running Ubuntu Server 20.04.2 equipped with two E5-2630 v2 CPUs, 1100W power supplies and the following items in the PCIe slots

  • Slot 1: LSI 9207-8e (connected to a 15 drive EMC Jbod disk shelf)
  • Slot 2: Ableconn PEXM2-130 Dual PCIe NVMe M.2 SSDs Carrier (ASMedia ASM2824 Switch)
  • Slot 4: HP H240 internal SAS3 card running in HBA mode (the 8 SFF drive bay is connected to this instead of the onboard controller)
  • Slot 5: Generic M.2 NVME to PCIe 3.0 x4 Adapter
  • Slot 6: GTX 1070 GPU

I’ve tried swapping the expansion cards into different slots to no effect. Not sure how to proceed from here.
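
One way to narrow this down is to compare what the OS enumerates with and without the disk shelf attached. The sketch below is only an illustration: it assumes you have saved the text output of `lspci` in each configuration, and it relies on `lspci` labelling NVMe devices as "Non-Volatile memory controller". The sample lines and vendor names are hypothetical, not output from the server in this post.

```python
import re
from typing import List


def nvme_controllers(lspci_text: str) -> List[str]:
    """Return the PCI addresses of NVMe controllers found in `lspci` output."""
    addresses = []
    for line in lspci_text.splitlines():
        # Typical line: "04:00.0 Non-Volatile memory controller: <vendor> <model>"
        match = re.match(r"^(\S+)\s+Non-Volatile memory controller", line)
        if match:
            addresses.append(match.group(1))
    return addresses


# Hypothetical sample output, for illustration only.
sample = """\
03:00.0 PCI bridge: Example Vendor PCIe switch (hypothetical)
04:00.0 Non-Volatile memory controller: Example Vendor NVMe SSD (hypothetical)
05:00.0 Serial Attached SCSI controller: Example Vendor SAS HBA (hypothetical)
"""

print(nvme_controllers(sample))
```

Running it on the two saved outputs shows whether the second drive disappears at the PCIe enumeration level or only later, at the driver level.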

Log excerpt from the DELL R710 (CentOS 6.4) question above. The last entries before the reboot are ordinary firewall messages; at 16:42:05 the kernel simply starts booting, with no shutdown message in between:

Jan 10 16:29:12 squirtle kernel: Firewall: *TCP_IN Blocked* IN=em1 OUT= MAC=84:2b:2b:54:84:58:00:04:96:82:74:3e:08:00 SRC=93.174.93.67 DST=13.129.118.21 LEN=40 TOS=0x00 PREC=0x00 TTL=245 ID=54321 PROTO=TCP SPT=35003 DPT=21320 WINDOW=65535 RES=0x00 SYN URGP=0
Jan 10 16:35:50 squirtle kernel: Firewall: *UDP_IN Blocked* IN=em1 OUT= MAC=84:2b:2b:54:84:58:00:04:96:82:74:3e:08:00 SRC=179.107.38.35 DST=13.129.118.21 LEN=443 TOS=0x00 PREC=0x00 TTL=53 ID=0 DF PROTO=UDP SPT=5067 DPT=5060 LEN=423
Jan 10 16:42:05 squirtle kernel: imklog 5.8.10, log source = /proc/kmsg started.
Jan 10 16:42:05 squirtle rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" x-pid="1203" x-info="http://www.rsyslog.com"] start
Jan 10 16:42:05 squirtle kernel: Initializing cgroup subsys cpuset
Jan 10 16:42:05 squirtle kernel: Initializing cgroup subsys cpu
Jan 10 16:42:05 squirtle kernel: Linux version 2.6.32-431.3.1.el6.i686 (mockbuild@c6b10.bsys.dev.centos.org) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC) ) #1 SMP Fri Jan 3 18:53:30 UTC 2014
Jan 10 16:42:05 squirtle kernel: KERNEL supported cpus:
Jan 10 16:42:05 squirtle kernel:  Intel GenuineIntel
Jan 10 16:42:05 squirtle kernel:  AMD AuthenticAMD
Jan 10 16:42:05 squirtle kernel:  NSC Geode by NSC
Jan 10 16:42:05 squirtle kernel:  Cyrix CyrixInstead
Jan 10 16:42:05 squirtle kernel:  Centaur CentaurHauls
Jan 10 16:42:05 squirtle kernel:  Transmeta GenuineTMx86
Jan 10 16:42:05 squirtle kernel:  Transmeta TransmetaCPU
Jan 10 16:42:05 squirtle kernel:  UMC UMC UMC UMC
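
The pattern in this excerpt (normal entries, then a kernel start with no shutdown message) can be searched for across a whole log file. The following is a minimal sketch, not a definitive tool: it treats the rsyslog `imklog ... started.` line seen above as the boot marker, and the shutdown hints it looks for are assumed wording that you would need to adapt to your own logs. The function name and the lookback window are hypothetical choices.

```python
from collections import deque

BOOT_MARKER = "imklog"  # rsyslog kernel-log module start line, as in the excerpt
SHUTDOWN_HINTS = ("exiting on signal", "shutdown")  # assumed wording, adjust as needed


def unclean_boots(lines, lookback=5):
    """Return (last_line_before_boot, boot_line) for boots with no shutdown hint."""
    recent = deque(maxlen=lookback)  # the few lines seen just before a boot marker
    results = []
    for raw in lines:
        line = raw.rstrip("\n")
        if BOOT_MARKER in line and line.endswith("started."):
            clean = any(hint in prev.lower() for prev in recent for hint in SHUTDOWN_HINTS)
            if not clean:
                results.append((recent[-1] if recent else None, line))
            recent.clear()
        recent.append(line)
    return results


# Two lines shortened from the excerpt above.
sample_log = [
    "Jan 10 16:35:50 squirtle kernel: Firewall: *UDP_IN Blocked* IN=em1 OUT=",
    "Jan 10 16:42:05 squirtle kernel: imklog 5.8.10, log source = /proc/kmsg started.",
]

for before, boot in unclean_boots(sample_log):
    print("last entry before boot:", before)
    print("boot entry:            ", boot)
```

An abrupt gap like this points away from an orderly, OS-initiated reboot, which fits the BIOS-level explanation given in the answers.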

Aneesh801

Hi All,

I have installed ESXi 5.1 on a Dell PowerEdge R720 server, and ten VMs reside on it. The issue I am facing is that the server has gone down three times, and on the KVM it shows «Core-Mid Level Error». I have attached the image. Is that an OS issue? I have checked the logs but found nothing.

Guys, ideas please.

5 Replies

spravtek

Broadcom has this in its firmware release text:

Problem:    CQ63085, CQ63119 - PCI Express Error Core-Mid Level Error when running continuous reboot test.
Change:     Disallow PCIe state change in the middle of SMBUS transaction.
Introduced: 7.2.53 and 7.4.0.
Relevance:  578xx.

So it might help to update the firmware…

TomHowarth

This error occurs well before the machine becomes aware of whatever operating system it is hosting. The CPLD is part of the iDRAC; consider a firmware update or a call to Dell Support.

Tom Howarth VCP / VCAP / vExpert
VMware Communities User Moderator
Blog: http://www.planetvm.net
Contributing author on VMware vSphere and Virtual Infrastructure Security: Securing ESX and the Virtual Environment
Contributing author on VCP VMware Certified Professional on VSphere 4 Study Guide: Exam VCP-410

Aneesh801

Hi Guys,

Thanks for your help.

The server was showing the error «CPU1 Internal Error» on the screen after these reboots. I logged a case with Dell and they physically swapped the CPUs. Since then, the reboots have stopped. I also found that the server BIOS and NIC firmware are not the latest versions. I will update them soon.

Thanks & Regards,

Aneesh

Guide: Flashing H310/H710/H810 Mini & full size to IT Mode

fohdeesha

  • #221

I have double checked, SR-IOV Global Enable and I/OAT DMA Engine are disabled.

Here is the output of the command. I simply booted into Debian and ran the command. I then ran the script and ran the command again, if that helps.

When booting into the Linux image, several errors came across the screen but quickly vanished once it had booted. They are probably unrelated, but here is a picture just in case.

Also, at the bottom of that first picture, a couple of errors dump themselves into the console, although if I press Enter I get a shell back. I’m not really sure what those are, but because one of them mentioned mpt3sas I included it in the image.

Hope this helps.

That’s really, really strange. The card is getting crossflashed to the correct PCI vendor and sub-vendor IDs, so no problem there. At this point I’m not sure what else to try other than the following.

Ensure your BIOS is at v2.9.0 and your iDRAC is at v2.65.65.65; this is very important. You can easily update both through the iDRAC web UI under iDRAC settings > update/rollback. There’s an upload form: just upload these two exe files and it will recognize them as firmware updates, and the iDRAC will install them without needing to boot any OS (they are for the R720XD, as you mentioned in your other post):

idrac: https://dl.dell.com/FOLDER06110107M…roller_Firmware_0GHF4_WN64_2.65.65.65_A00.EXE
bios: https://dl.dell.com/FOLDER05981274M/1/BIOS_8P8WX_WN64_2.9.0.EXE
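
As a small aid for the version check above, here is a sketch that compares installed versions against the ones named in this post. It assumes that newer versions are also acceptable, which the post does not state, and the function names and the sample "installed" values are hypothetical. Comparing parsed integer tuples avoids the string-comparison trap where "2.10.0" would sort before "2.9.0".

```python
# Versions named in the post; treating them as minimums is an assumption.
REQUIRED = {"bios": "2.9.0", "idrac": "2.65.65.65"}


def parse_version(text):
    """Turn a dotted version such as 'v2.65.65.65' into a tuple of integers."""
    return tuple(int(part) for part in text.strip().lstrip("vV").split("."))


def outdated(installed):
    """Return the component names whose installed version is below the required one."""
    return [
        name
        for name, minimum in REQUIRED.items()
        if parse_version(installed[name]) < parse_version(minimum)
    ]


# Hypothetical installed versions, typed in by hand from the iDRAC web UI.
print(outdated({"bios": "2.5.4", "idrac": "2.65.65.65"}))
```

An empty list means both components meet the versions named above.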

After that, try again once both are fully updated. If there is still no luck, you need to open the server and find the NVRAM_CLR jumper. Move the jumper to pins 2-3 instead of 1-2, where it sits by default, leave it for ten seconds, then move it back and power the server back on. This resets the BIOS/UEFI settings more thoroughly than a simple «reset to defaults» in the BIOS menu. At least one other member in this thread had this solve the same problem.

BlueScope819

  • #222

I’ve updated the BIOS and iDRAC with the files you linked, and cleared NVRAM. Oddly, when I went to revert the card back to stock Dell firmware to try again, it was detected as a D1 revision instead of a B0 revision. I cancelled the rollback because it didn’t match what I saw earlier. Going through the whole process again (without reverting), I encountered the exact same behavior. Perhaps updating the BIOS changed how the card is detected? If I revert to the D1 firmware and then follow the D1 guide, would that brick the card if it is in fact a B0 revision and the software isn’t detecting it correctly?

Thank you for your help.

fohdeesha


  • #223

Well, that’s a first! At this point, what you need to do to be safe is take the PERC mini card out of the server and take a picture of the label on the back. It will have a Dell part number, and with that I can tell you exactly what it is. If you run the wrong flash or revert script with the wrong SBR, it will indeed brick the card.

BlueScope819

  • #224

Here you go! (Photo of the controller label attached.)

fohdeesha


  • #225

Ok, that’s definitely an H710P Mini B0 — where are you seeing it identified as a D1? Can you post a screenshot?

BlueScope819

  • #226

(Screenshot attached.)

The info command is to show that it’s the LSI firmware before reverting. While the first line detects it as a B0, the warning says D1.

fohdeesha


  • #227

Whoops! That’s my bad. Those script commands don’t actually detect what card you have; it’s just static text for each command. The B0 command you ran had a typo showing D1 instead of B0. I actually fixed it several months ago but forgot to build a new FreeDOS ISO with the change: fix PB0 revert typo · Fohdeesha/lab-docu@cc94d33

Go ahead and run that revert script (you definitely have a B0) to get it back to stock, then start the flash guide from scratch again now that everything is cleared and up to date.

BlueScope819

  • #228

Okay, so once I reverted it to stock, the info command in FreeDOS worked just fine. Going through the entire process again, I get the same error: it doesn’t detect an LSI SAS card in the system with either the info or the setsas command in Linux. I reset it to defaults, then cleared NVRAM and tried again. Same issue. I’ve updated the BIOS and iDRAC with the files you linked.

Thankfully I can just revert it to defaults, so no harm done due to the various issues. What do you think I can try next?

BlueSandbox

  • #229

Hello. I’m trying to flash my H710 mini B0 (in an r620) over ssh. I’ve gotten to the point where you set the SAS address, but when I enter the command from the guide with my own address it says «No LSI SAS adapters found». Why does it say there are no adapters found when I was able to do all of the previous steps without problem?

Version 20.00.00.00 (2014.09.18)
Copyright (c) 2008-2014 LSI Corporation. All rights reserved

No LSI SAS adapters found! Limited Command Set Available!
ERROR: Command Not allowed without an adapter!
ERROR: Couldn’t Create Command -c
Exiting Program.
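
When running the flash steps over SSH, it can help to capture the tool output and flag known failure messages automatically. The sketch below is only an illustration: it matches the message shown above plus the «Programming SAS Address Failed» message reported later in this thread, and the labels and function name are hypothetical. It does not call the flashing tool itself; it only inspects text you have already captured.

```python
# Failure messages quoted in this thread, mapped to short hypothetical labels.
KNOWN_FAILURES = {
    "No LSI SAS adapters found": "adapter not detected",
    "Programming SAS Address Failed": "SAS address not written",
}


def diagnose(output):
    """Return labels for every known failure message present in captured output."""
    return [label for needle, label in KNOWN_FAILURES.items() if needle in output]


# Output copied from the post above.
captured = """\
No LSI SAS adapters found! Limited Command Set Available!
ERROR: Command Not allowed without an adapter!
Exiting Program.
"""

print(diagnose(captured))
```

An empty result only means none of these particular messages appeared, not that the flash succeeded.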

fohdeesha


  • #230

Well shit, I’ve had enough reports of this since updating the Linux ISO that I’m thinking Debian 11 might not work with lsiutil. Can you (and anyone else with this problem, including @BlueScope819) try starting from scratch with this previous set of ISOs? https://fohdeesha.com/data/other/perc/perc-crossflash-v1.8.zip

BlueSandbox

  • #231

I’ll do that and come back.

BlueSandbox

  • #232

I just booted into FreeDOS to run info, but it said nothing about the RAID controller. (I’m pretty sure it’s the same message, just without the RAID controller part.)

BlueScope819

  • #233

You need to revert it back to Dell firmware to start over. Scroll down to the bottom of the guide; the command is there.

As for retrying with the previous set of ISOs, I don’t think that will work for me: I was encountering these issues before you posted the new set of ISOs, and moving to the new ISOs was one of the first things I tried.

BlueSandbox

  • #234

Thanks, I’ll do that.

BlueSandbox

  • #235

I just ran B0CROSS (after finishing B0REVERT and rebooting) and got «Error in erasing Flash chip. Error code = 524288» at 70% of erasing the flash chip.

BlueScope819

  • #236

I’ve had a similar issue before; mine failed at 30%, and I think it was a similar error code. Just run the command again. It didn’t cause any problems for me, as it does two cycles of erasing.

BlueSandbox

  • #237

Now I’m getting this error message when setting the SAS address:

Advanced Mode Set

Adapter Selected is a LSI SAS: SAS2308_2(B0)

Executing Operation: Program SAS Address

ERROR: Programming SAS Address Failed!

Due to error remaining commands will not be executed.
Unable to Process Commands.
Exiting SAS2Flash.

Also, when I ran the B0-H710 command, this happened, but it completed.

user@debian:~$ sudo su -
root@debian:~# B0-H710
rmmod: ERROR: Module megaraid_sas is not currently loaded
rmmod: ERROR: Module mptctl is not currently loaded
rmmod: ERROR: Module mptbase is not currently loaded
Errors above are normal!
Trying unlock in MPT mode…
Device in MPT mode
Device in MPT mode
Resetting adapter in HCB mode…
Trying unlock in MPT mode…
Device in MPT mode
IOC is RESET
Device in MPT mode
Resetting adapter in HCB mode…
Trying unlock in MPT mode…
Device in MPT mode
IOC is RESET
Setting up HCB…
HCDW virtual: 0x7f43d1200000
HCDW physical: 0xc1aa00000
Loading firmware…
Loaded 809340 bytes
Booting IOC…
IOC is READY
IOC Host Boot successful.
Device in MPT mode
Removing PCI device…
Rescanning PCI bus…
PCI bus rescan complete.
Pausing for 20 seconds to allow the card to boot
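
To confirm that a run like the one above reached every stage, the captured output can be checked for the milestone lines in order. This is a minimal sketch: the list of stages is taken from the log in this post and is an assumption about what a complete run looks like, not something the guide specifies, and the function name is hypothetical.

```python
# Milestones as they appear, in order, in the log above (assumed to mark a full run).
EXPECTED_STAGES = [
    "Loading firmware",
    "IOC is READY",
    "IOC Host Boot successful",
    "PCI bus rescan complete",
]


def missing_stages(log_text):
    """Return the expected stages that do not appear, in order, in the log text."""
    position = 0
    missing = []
    for stage in EXPECTED_STAGES:
        found = log_text.find(stage, position)
        if found == -1:
            missing.append(stage)
        else:
            position = found + len(stage)  # later stages must come after this one
    return missing


# A shortened copy of the log above.
full_log = """\
Setting up HCB
Loading firmware
Loaded 809340 bytes
Booting IOC
IOC is READY
IOC Host Boot successful.
Rescanning PCI bus
PCI bus rescan complete.
"""

print(missing_stages(full_log))
```

For the log in this post the result is an empty list, which matches the poster’s remark that the command completed.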

Edit: I’m not sure what happened, but the server seems to have rebooted itself when I wasn’t looking. During boot it said «system fatal error during previous boot. core-instruction fetch unit error, core mid level error». I pressed F1 to continue and nothing seems abnormal.

Last edited: Jul 23, 2021

  • #238

No idea about the SAS address part; that’s a different error from the one I was having. According to the guide, the server seems to kernel panic after the first reboot for whatever reason.

As for me, I’m still stuck at the point where it won’t let me flash the SAS address, with the first error you encountered. I’m really not an expert, just sharing my own troubleshooting experience so far. I’ll leave it to @fohdeesha to answer any complicated questions.

  • #239

That was part of the goal. I also made a couple of tweaks to the flash scripts under Linux to hopefully help. The new ISO package is now live on the site, and I encourage people with non-Dell hardware who had issues in the past to retry.

Awesome! I will give it a shot and report back.

  • #240

@fohdeesha thanks for the response. I read through the whole thread again last night and realized my issue is probably because I’m using a custom home server that has this Dell card in it. As far as I know I don’t have those BIOS settings, but my problem is probably that the card isn’t in an actual Dell server. Was a workaround ever found for that? I don’t have access to a Dell server. Thanks.

@fohdeesha I tried with the latest images and get the same error. I am flashing from a custom home-built PC with an Asus motherboard from about 2006. It says “no LSI SAS adapters found, limited command set available.” This happens when I try to use the setsas command after all of the flashing has completed successfully. Any ideas?

My dedicated DELL R710 server (CentOS 6.4) reboots on its own and comes up with the following error.

Does this mean the box can’t boot, or did the kernel panic while Linux was booting and the server somehow knows about it?

Can anyone advise on diagnostics, or is this a hardware problem that should be handed over to the data center I rent the box from? It ran fine for months, and for the last two days it has been rebooting at random.

Update: the box keeps rebooting. One minute it is up and running, then the next line in the log shows the kernel booting, with no shutdown or other error messages.

Jan 10 16:29:12 squirtle kernel: Firewall: *TCP_IN Blocked* IN=em1 OUT= MAC=84:2b:2b:54:84:58:00:04:96:82:74:3e:08:00 SRC=93.174.93.67 DST=13.129.118.21 LEN=40 TOS=0x00 PREC=0x00 TTL=245 ID=54321 PROTO=TCP SPT=35003 DPT=21320 WINDOW=65535 RES=0x00 SYN URGP=0
Jan 10 16:35:50 squirtle kernel: Firewall: *UDP_IN Blocked* IN=em1 OUT= MAC=84:2b:2b:54:84:58:00:04:96:82:74:3e:08:00 SRC=179.107.38.35 DST=13.129.118.21 LEN=443 TOS=0x00 PREC=0x00 TTL=53 ID=0 DF PROTO=UDP SPT=5067 DPT=5060 LEN=423
Jan 10 16:42:05 squirtle kernel: imklog 5.8.10, log source = /proc/kmsg started.
Jan 10 16:42:05 squirtle rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" x-pid="1203" x-info="http://www.rsyslog.com"] start
Jan 10 16:42:05 squirtle kernel: Initializing cgroup subsys cpuset
Jan 10 16:42:05 squirtle kernel: Initializing cgroup subsys cpu
Jan 10 16:42:05 squirtle kernel: Linux version 2.6.32-431.3.1.el6.i686 (mockbuild@c6b10.bsys.dev.centos.org) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC) ) #1 SMP Fri Jan 3 18:53:30 UTC 2014
Jan 10 16:42:05 squirtle kernel: KERNEL supported cpus:
Jan 10 16:42:05 squirtle kernel:  Intel GenuineIntel
Jan 10 16:42:05 squirtle kernel:  AMD AuthenticAMD
Jan 10 16:42:05 squirtle kernel:  NSC Geode by NSC
Jan 10 16:42:05 squirtle kernel:  Cyrix CyrixInstead
Jan 10 16:42:05 squirtle kernel:  Centaur CentaurHauls
Jan 10 16:42:05 squirtle kernel:  Transmeta GenuineTMx86
Jan 10 16:42:05 squirtle kernel:  Transmeta TransmetaCPU
Jan 10 16:42:05 squirtle kernel:  UMC UMC UMC UMC
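
The pattern described in the update (ordinary log lines, then a kernel boot with no shutdown in between) can be checked for mechanically. The following is only a rough sketch, not a definitive diagnostic. It assumes the `imklog ... started` line from the excerpt above marks the start of a boot, and that an orderly shutdown leaves an rsyslogd `exiting on signal 15` line shortly before it. The function name `find_unclean_boots` and the `lookback` window are hypothetical choices for illustration.

```python
import re

# Boot marker taken from the log excerpt above: rsyslog's kernel log module starting.
BOOT_MARKER = re.compile(r"kernel: imklog .* started\.")
# Assumption: on an orderly shutdown rsyslogd logs "exiting on signal 15".
CLEAN_SHUTDOWN = re.compile(r"rsyslogd: .*exiting on signal 15")


def find_unclean_boots(lines, lookback=20):
    """Return indexes of boot markers that have no clean-shutdown line
    among the `lookback` lines immediately before them."""
    unclean = []
    for i, line in enumerate(lines):
        if not BOOT_MARKER.search(line):
            continue
        window = lines[max(0, i - lookback):i]
        # A boot marker with no history (start of a rotated file) tells us nothing.
        if window and not any(CLEAN_SHUTDOWN.search(prev) for prev in window):
            unclean.append(i)
    return unclean


# Shortened versions of two lines from the excerpt, plus the rsyslogd start line.
sample = [
    'Jan 10 16:35:50 squirtle kernel: Firewall: *UDP_IN Blocked* IN=em1 OUT=',
    'Jan 10 16:42:05 squirtle kernel: imklog 5.8.10, log source = /proc/kmsg started.',
    'Jan 10 16:42:05 squirtle rsyslogd: [origin software="rsyslogd"] start',
]
print(find_unclean_boots(sample))  # [1]
```

A flagged index only says the operating system did not get a chance to shut down; it does not distinguish a kernel panic from a hardware reset or a power loss.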

Update 2

I have been running the stress utility on the server for the last 4 days and it has not rebooted once. It maxes out all cores at 100% CPU. I still need to check whether stress also exercises memory or disk writes, but as far as the CPUs are concerned, they seem fine.
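
On the open question of memory and disk: `stress` run with only `--cpu` spawns workers that spin on `sqrt()`, so it does not exercise RAM or disk. Memory and disk workers are requested separately with `--vm` and `--hdd`. Below is a minimal sketch of assembling such a command line. The helper name `build_stress_command` and the worker counts are hypothetical, and the original post does not say which options were actually used.

```python
def build_stress_command(cpu=0, vm=0, vm_bytes="256M", hdd=0, timeout="60s"):
    """Build an argv list for `stress` with optional CPU, memory and disk workers."""
    argv = ["stress"]
    if cpu:
        argv += ["--cpu", str(cpu)]                      # workers spinning on sqrt()
    if vm:
        argv += ["--vm", str(vm), "--vm-bytes", vm_bytes]  # workers allocating and freeing memory
    if hdd:
        argv += ["--hdd", str(hdd)]                      # workers writing and unlinking files
    argv += ["--timeout", timeout]
    return argv


# Hypothetical mix: 8 CPU workers, 2 memory workers of 1G each, 1 disk worker, 10 minutes.
cmd = build_stress_command(cpu=8, vm=2, vm_bytes="1G", hdd=1, timeout="600s")
print(" ".join(cmd))  # stress --cpu 8 --vm 2 --vm-bytes 1G --hdd 1 --timeout 600s
```

The resulting list can be passed to `subprocess.run(cmd)` on a machine where `stress` is installed. Worker counts and sizes should be scaled to the cores and RAM actually present.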
