Unrecoverable fatal error found system will not boot until the error is resolved

Intel MFSYS25V2 Manual Online: Processor Initialization Error Summary

Intel® Compute Module MFS2600KI TPS
Functional Architecture
Revision 1.0, Intel order number: G51989-002

Processors which have different Intel® Quickpath (QPI) Link Frequencies may operate together if they are otherwise compatible and if a common link frequency can be selected. The common link frequency would be the highest link frequency that all installed processors can achieve.

Processor stepping within a common processor family can be mixed as long as it is listed in the processor specification updates published by Intel Corporation.

3.1.2 Processor Initialization Error Summary

The following table describes mixed processor conditions and recommended actions for the MFS2600KI designed around the Intel® Xeon® processor E5-2600 product family and Intel® C602-J chipset product family architecture. The errors fall into one of the following categories:

Fatal: If the system can boot, it pauses at a blank screen with the text «Unrecoverable fatal error found. System will not boot until the error is resolved» and «Press <F2> to enter setup», regardless of whether the «POST Error Pause» setup option is enabled or disabled.

When the operator presses the <F2> key on the keyboard, the error message is displayed on the Error Manager screen, and an error is logged to the System Event Log (SEL) with the POST Error Code.

The system cannot boot unless the error is resolved. The user needs to replace the faulty part and restart the system.

For Fatal Errors during processor initialization, the System Status LED will be set to a steady Amber color, indicating an unrecoverable system failure condition.

Major: If the «POST Error Pause» setup option is enabled, the system goes directly to the Error Manager to display the error, and logs the POST Error Code to SEL. Operator intervention is required to continue booting the system.

Otherwise, if «POST Error Pause» is disabled, the system continues to boot and no prompt is given for the error, although the POST Error Code is logged to the Error Manager and in a SEL message.

Minor: The message is displayed on the screen or on the Error Manager screen, and the POST Error Code is logged to the SEL. The system continues booting in a degraded state. The user may want to replace the erroneous unit. The POST Error Pause option setting in the BIOS setup does not have any effect on this error.
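The three severity categories can be summarized as a small decision function. The following is a minimal Python sketch, not Intel firmware code: the names `PostOutcome` and `post_error_outcome` are hypothetical, and the model only captures whether POST stops and waits for the operator and whether booting can proceed at all.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PostOutcome:
    """Hypothetical summary of how POST reacts to one error."""
    waits_for_operator: bool  # POST stops until the operator acts
    can_boot: bool            # booting can proceed (possibly after intervention)


def post_error_outcome(severity: str, post_error_pause: bool) -> PostOutcome:
    """Map a severity and the POST Error Pause setting to the described behavior."""
    sev = severity.strip().lower()
    if sev == "fatal":
        # Pauses regardless of the setup option; no boot until the fault is fixed.
        return PostOutcome(waits_for_operator=True, can_boot=False)
    if sev == "major":
        # Pauses only when POST Error Pause is enabled; boot is still possible.
        return PostOutcome(waits_for_operator=post_error_pause, can_boot=True)
    if sev == "minor":
        # The setup option has no effect; the system boots in a degraded state.
        return PostOutcome(waits_for_operator=False, can_boot=True)
    raise ValueError(f"unknown severity: {severity!r}")
```

In this sketch, a Major error with POST Error Pause disabled and any Minor error give the same outcome; the difference between them lies in how the error is displayed, which the model leaves out.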

| Error Code | Severity | Error Message | Possible Needed Response |
|---|---|---|---|
| 9687 | Fatal | DXE core component encountered a illegal software state error. | |
| 8565 | Major | DIMM_C2 Component encountered a Serial Presence Detection (SPD) fail error. | |
| 8567 | Major | DIMM_D2 Component encountered a Serial Presence Detection (SPD) fail error. | |
| 85A2 | Major | DIMM_B1 Uncorrectable ECC error encountered. | |
| 85AB | Major | DIMM_F2 Uncorrectable ECC error encountered. | |
| 9000 | Major | Unspecified processor component has encountered a non-specific error. | |
| 0xB6A3 | Major | DXE boot services driver Unrecognized. | |
| 8604 | Minor | Chipset Reclaim of non critical variables complete. | |
| 9223 | Minor | Keyboard component was not detected. | |
| 9266 | Minor | Local Console component encountered a controller error. | |
| 9286 | Minor | Remote Console component encountered a controller error. | |
| 94C6 | Minor | LPC component encountered a controller error. | |
| 95A7 | Minor | PCI component encountered a read error. | |
| 9609 | Minor | Unspecified software component encountered a start error. | |
| 0xA028 | Minor | Processor component encountered a high voltage error. | |
| 0xA501 | Minor | ATA/ATPI ATA SMART is disabled. | |
| 0192 | Fatal | Processor 0x cache size mismatch detected. | The user needs to replace the faulty part and restart the system. |
| 0194 | Fatal | Processor 0x family mismatch detected. | The user needs to replace the faulty part and restart the system. |
| 0196 | Fatal | Processor 0x model mismatch. | |
| 0197 | Fatal | Processor 0x speeds mismatched. | |
| 0198 | Fatal | Processor 0x family is not supported. | |
| 9667 | Fatal | PEI module component encountered an illegal software state error. | |
| 96A7 | Fatal | DXE boot services driver component encountered an illegal software state error. | |
| 96E7 | Fatal | SMM driver component encountered an illegal software state error. | |
| 0xA421 | Fatal | PCI component encountered a SERR error. | |
| 0xA5A1 | Fatal | PCI Express component encountered a SERR error. | |
| 0012 | Major | CMOS date / time not set | Set the Time and Date. |
| 0048 | Major | Password check failed | |
| 0113 | Major | Fixed Media SAS RAID firmware cannot run properly. | Attempt to reflash the firmware. |
| 0140 | Major | PCI component encountered a PERR error. | |
| 0141 | Major | PCI resource conflict. | |
| 0146 | Major | PCI out of resources error. | |
| 0195 | Major | Processor 0x Intel(R) QPI speed mismatch. | |
| 019F | Major | Processor and chipset stepping configuration is unsupported. | |
| 5220 | Major | CMOS/NVRAM Configuration Cleared | |
| 5221 | Major | Passwords cleared by jumper. | |
| 5224 | Major | Password clear Jumper is Set. | |
| 8160 | Major | Processor 01 unable to apply microcode update. | |
| 8161 | Major | Processor 02 unable to apply microcode update. | |
| 8190 | Major | Watchdog timer failed on last boot. | |
| 8198 | Major | OS boot watchdog timer failure. | |
| 8300 | Major | Baseboard management controller failed self-test. | |
| 84F2 | Major | Baseboard management controller failed to respond. | |
| 84F3 | Major | Baseboard management controller in update mode. | |
| 84F4 | Major | Sensor data record empty. | |
| 8500 | Major | Memory component could not be configured in the selected RAS mode. | |
| 8501 | Major | DIMM Population Error. | |
| 8502 | Major | CLTT Configuration Failure Error. | |
| 8520 | Major | DIMM_A1 failed Self Test (BIST). | |
| 8521 | Major | DIMM_A2 failed Self Test (BIST). | |
| 8522 | Major | DIMM_B1 failed Self Test (BIST). | |
| 8523 | Major | DIMM_B2 failed Self Test (BIST). | |
| 8524 | Major | DIMM_C1 failed Self Test (BIST). | |
| 8525 | Major | DIMM_C2 failed Self Test (BIST). | |
| 8526 | Major | DIMM_D1 failed Self Test (BIST). | |
| 8527 | Major | DIMM_D2 failed Self Test (BIST). | |
| 8528 | Major | DIMM_E1 failed Self Test (BIST). | |
| 8562 | Major | DIMM_B1 Component encountered a Serial Presence Detection (SPD) fail error. | |
| 8563 | Major | DIMM_B2 Component encountered a Serial Presence Detection (SPD) fail error. | |
| 8564 | Major | DIMM_C1 Component encountered a Serial Presence Detection (SPD) fail error. | |
| 8566 | Major | DIMM_D1 Component encountered a Serial Presence Detection (SPD) fail error. | |
| 8568 | Major | DIMM_E1 Component encountered a Serial Presence Detection (SPD) fail error. | |
| 8569 | Major | DIMM_E2 Component encountered a Serial Presence Detection (SPD) fail error. | |
| 856A | Major | DIMM_F1 Component encountered a Serial Presence Detection (SPD) fail error. | |
| 856B | Major | DIMM_F2 Component encountered a Serial Presence Detection (SPD) fail error. | |
| 85A0 | Major | DIMM_A1 Uncorrectable ECC error encountered. | |
| 85A1 | Major | DIMM_A2 Uncorrectable ECC error encountered. | |
| 85A3 | Major | DIMM_B2 Uncorrectable ECC error encountered. | |
| 85A4 | Major | DIMM_C1 Uncorrectable ECC error encountered. | |
| 85A5 | Major | DIMM_C2 Uncorrectable ECC error encountered. | |
| 85A6 | Major | DIMM_D1 Uncorrectable ECC error encountered. | |
| 85A7 | Major | DIMM_D2 Uncorrectable ECC error encountered. | |
| 85A8 | Major | DIMM_E1 Uncorrectable ECC error encountered. | |
| 85A9 | Major | DIMM_E2 Uncorrectable ECC error encountered. | |
| 85AA | Major | DIMM_F1 Uncorrectable ECC error encountered. | |
| 92A3 | Major | Serial port component was not detected. | |
| 92A9 | Major | Serial port component encountered a resource conflict error. | |
| 94C9 | Major | LPC component encountered a resource conflict error. | |
| 0xA022 | Major | Processor component encountered a mismatch error. | |
| 0xA5A4 | Major | PCI Express IBIST error. | |
| 0108 | Minor | Keyboard component encountered a locked error. | Unlock the keyboard. |
| 0109 | Minor | Keyboard component encountered a stuck key error. | |
| 0193 | Minor | Processor 0x stepping mismatch. | |
| 8180 | Minor | Processor 0x microcode update not found. | |
| 84FF | Minor | System event log full. | |
| 9226 | Minor | Keyboard component encountered a controller error. | |
| 9243 | Minor | Mouse component was not detected. | |
| 9246 | Minor | Mouse component encountered a controller error. | |
| 9268 | Minor | Local Console component encountered an output error. | |
| 9269 | Minor | Local Console component encountered a resource conflict error. | |
| 9287 | Minor | Remote Console component encountered an input error. | |
| 9288 | Minor | Remote Console component encountered an output error. | |
| 92C6 | Minor | Serial Port controller error. | |
| 92C7 | Minor | Serial Port component encountered an input error. | |
| 92C8 | Minor | Serial Port component encountered an output error. | |
| 9506 | Minor | ATA/ATPI component encountered a controller error. | |
| 95A6 | Minor | PCI component encountered a controller error. | |
| 95A8 | Minor | PCI component encountered a write error. | |
| 9641 | Minor | PEI Core component encountered a load error. | |
| 96AB | Minor | DXE boot services driver component encountered invalid configuration. | |
| 0xA000 | Minor | TPM device not detected. | |
| 0xA001 | Minor | TPM device missing or not responding. | |
| 0xA002 | Minor | TPM device failure. | |
| 0xA003 | Minor | TPM device failed self test. | |
| 0xA027 | Minor | Processor component encountered a low voltage error. | |
| 0xA500 | Minor | ATA/ATPI ATA bus SMART not supported. | |
| 0xA5A0 | Minor | PCI Express component encountered a PERR error. | |
| 0xA6A0 | Minor | DXE boot services driver Not enough memory available to shadow a legacy option ROM. | |

  • #1

Good afternoon! We have an HP DL560 Gen9 server running Windows Server 2016. It hosts Microsoft SQL Server 2016 and a 1C:Enterprise server. Over the weekend, error emails arrived from iLO, the server stopped responding to ping, and it apparently hung. The errors were:

— PCI Bus 01/26/2021 15:15 01/26/2021 15:15 1 PCI Bus Error (Slot 0, Bus 0, Device 0, Function 0)
— System Error 01/26/2021 15:15 01/26/2021 15:15 1 Unrecoverable System Error (NMI) has occurred. System Firmware will log additional details in a separate IML entry if possible
— PCI Bus 01/26/2021 15:15 01/26/2021 15:15 1 Uncorrectable PCI Express Error (Embedded device, Bus 0, Device 0, Function 0, Error status 0x00000000)
— PCI Bus 01/26/2021 15:15 [NOT SET] 1 Uncorrectable PCI Express Error (Embedded device, Bus 0, Device 0, Function 0, Error status 0x00000000)

I rebooted the server remotely through iLO, and it seems to have come back to life. But for how long? Please help me understand what this error means and what has failed. :confused:

  • #2

HP ProLiant Servers — How to Decode Uncorrectable PCI Express Error

Information

This document helps the user decode the Uncorrectable PCI Express Error.
Example: Uncorrectable PCI Express Error (Embedded device, Bus 0, Device 8, Function 0, Error status 0x00000000)

Details

This particular PCI Express Error could be decoded by using the logs mentioned below.

  1. Advanced Survey Report.
  2. lspci Output from a Linux Machine or ESX Machine.

Advanced Survey Report:

c03525471.jpg

NOTE: Use the Vendor ID and the Device ID to determine the hardware device.

LSPCI Output:

If the server is running Linux or ESX, collect the OS logs from the server.
Check the lspci.txt file in the OS logs. It should contain the information shown in the screenshot below:

c03525472.jpg

In this Example, check the numbers listed before the word Bridge.
000:000:08.0 Bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 8.
000 —> Represents PCI Domain (Every PCI Domain could have 256 PCI Buses).
000 —> Bus
08 —> Device
0 —> Function
By using either of these logs, the PCI Express Error could be narrowed down to the hardware device causing the error.
NOTE: The values in the IML logs and in the Advanced Survey Report are decimal, whereas the values in the lspci output are hexadecimal. Convert the decimal values to hexadecimal before comparing them with the lspci output.
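The decimal-to-hexadecimal step from the note can be sketched in Python. This is a minimal illustration, not an HP tool: the function name `iml_to_lspci_address` is hypothetical, and the output layout assumes the usual `domain:bus:device.function` form printed by `lspci -D` (four, two, two, and one hexadecimal digits), which may differ from the layout of a particular collected lspci.txt.

```python
def iml_to_lspci_address(bus: int, device: int, function: int, domain: int = 0) -> str:
    """Format decimal IML bus/device/function values as a hexadecimal lspci address."""
    # PCI allows 256 buses per domain, 32 devices per bus, 8 functions per device.
    if not (0 <= bus <= 255 and 0 <= device <= 31 and 0 <= function <= 7):
        raise ValueError("bus, device, or function is out of the PCI range")
    if not 0 <= domain <= 0xFFFF:
        raise ValueError("domain is out of range")
    return f"{domain:04x}:{bus:02x}:{device:02x}.{function:x}"


# IML entry "Bus 0, Device 8, Function 0" from the example above
print(iml_to_lspci_address(0, 8, 0))
# A device number above 9 shows why the conversion matters: decimal 26 is hex 1a
print(iml_to_lspci_address(0, 26, 0))
```

For values below 10 the decimal and hexadecimal forms look identical, so a mismatch only becomes visible for larger bus or device numbers.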

  • #3

If the firmware is not the latest version, you can try to fix this with the service pack SPP Gen9 Production Version *: 2021.10.0
Alerts of this kind are usually caused by outdated firmware versions of components such as the System ROM and iLO.

  • #4

Quote from #3: "If the firmware is not the latest version, you can try to fix this with the service pack SPP Gen9 Production Version *: 2021.10.0
Alerts of this kind are usually caused by outdated firmware versions of components such as the System ROM and iLO."

I'll try it, but I don't think it will help. I'll report back later. Right now I can't reboot the server.

Surf_rider


  • #5

Quote from #3: "If the firmware is not the latest version, you can try to fix this with the service pack SPP Gen9 Production Version *: 2021.10.0
Alerts of this kind are usually caused by outdated firmware versions of components such as the System ROM and iLO."

Whenever you contact HP support, or in any other unclear situation, they give the standard answer: update the firmware :cautious:

  • #6

I updated all the firmware. So far everything is running normally. I think it helped.

UPDATE: This issue is now resolved in the 2019-05-21 cumulative update (KB4497934)

In the last couple of weeks, we have been hearing reports from customers who are encountering problems after migrating virtual machines directly from Windows Server 2012 R2 to Windows Server 2019. People are seeing error messages like the following:

Critical 03/01/2019 16:13:49 Hyper-V-Worker 18604 None
‘Test VM 1’ has encountered a fatal error but a memory dump could not be generated. Error 0x2. If the problem persists, contact Product Support for the guest operating system. (Virtual machine ID 90B45891-E0EB-4842-8070-F30FF25C663A)

Critical 03/01/2019 16:13:49 Hyper-V-Worker 18560 None
‘Test VM 1’ was reset because an unrecoverable error occurred on a virtual processor that caused a triple fault. If the problem persists, contact Product Support. (Virtual machine ID 90B45891-E0EB-4842-8070-F30FF25C663A)

We have been digging into these issues and have identified the root cause. In Windows Server 2019 we made several changes to the virtual machine firmware (a topic that I plan to blog about another day). In the process we unfortunately exposed a bug. The effect of the bug is that a version 5.0, Generation 2 virtual machine whose firmware state comes from Windows Server 2012 R2 cannot boot on Windows Server 2019.

Specifically, the bug is exposed by the IPv6 boot data that is stored in the firmware of a Generation 2 virtual machine. Note that this does not affect Generation 1 virtual machines.

We are actively working on a fix for this issue right now.

Workaround

In the meantime, it is possible to work around this. To get the virtual machine to boot you need to get Hyper-V to create new firmware entries for the IPv6 boot data. The easiest way to do this is to change the MAC addresses on any network adapters connected to the affected virtual machine. This process is different for virtual machines with dynamic and static MAC addresses.

Static MAC addresses

For virtual machines with network adapters that are set to use static MAC addresses, all you need to do is open the virtual machine settings and change the MAC address to a new value.

I have also put together the following PowerShell snippet for people who like to automate things. This script will go through all network adapters on a virtual machine, find the ones with static MAC addresses, and increment them by 100.

# The name of the virtual machine that needs to be fixed
$VMname = "Broken VM"

# Iterate over all the network adapters in the virtual machine
Get-VMNetworkAdapter -VMName $VMname | ForEach-Object {
    # Skip any network adapters that are using dynamic MAC addresses
    if (-not $_.DynamicMacAddressEnabled) {
        # Parse the current MAC address as a hex number, add 100, and
        # format the result back to 12 hex digits
        $newMac = ([int64]"0x$($_.MacAddress)" + 100).ToString("X").PadLeft(12, "0")
        Set-VMNetworkAdapter -VMNetworkAdapter $_ -StaticMacAddress $newMac
    }
}

The reason I chose to increment by 100 is in case people have consecutive MAC addresses: a smaller step could turn one adapter's address into the address already used by the next one.
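The arithmetic behind the increment can be checked outside Hyper-V. The following is a minimal Python sketch of the same parse, add, and pad steps; the function name `increment_mac` and the sample address are illustrative assumptions, and the sketch only computes the new string without changing any adapter.

```python
def increment_mac(mac: str, step: int = 100) -> str:
    """Return the MAC address `step` higher, as 12 uppercase hex digits."""
    digits = mac.replace("-", "").replace(":", "")
    if len(digits) != 12:
        raise ValueError("a MAC address must have 12 hexadecimal digits")
    value = int(digits, 16) + step  # int() raises ValueError on non-hex input
    if value > 0xFFFFFFFFFFFF:
        raise ValueError("incremented address no longer fits in 48 bits")
    return f"{value:012X}"


# Decimal 100 is hex 64, so the last byte 03 becomes 67
print(increment_mac("00155D010203"))
```

Because the step is decimal 100 (hex 64), two adapters with consecutive addresses still end up with consecutive, non-overlapping addresses after the change.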

Dynamic MAC addresses

If your virtual machine is using dynamic MAC addresses – it is possible that you will not hit this problem at all. There are a number of cases where Hyper-V will regenerate the MAC address automatically.

You can also force Hyper-V to regenerate dynamic MAC addresses by changing the dynamic MAC address pool range used by Hyper-V. Dimitris Tonias has written a great article on how to configure this that you should review.

Live Migration

One interesting note to make here: when you live migrate a virtual machine, it does not boot through the firmware. This means that if you live migrate a virtual machine from Windows Server 2012 R2 to Windows Server 2019, it will continue to run. However, it will not boot if you shut it down and try to start it again after the migration. In this situation the above workaround will also address the problem.

My apologies to anyone who has been affected by this issue. Hopefully we will have a fix out for this soon!

Cheers,
Ben

Topic: dpkg. unrecoverable fatal error, aborting

perat

When I try to install any package, this error appears:

(Reading database ... 75%dpkg: unrecoverable fatal error, aborting:
 files list file for package `libgtk2.0-cil' contains empty filename
E: Sub-process /usr/bin/dpkg returned an error code (2)

What should I do? :(


ArcFi

sudo mv /var/lib/dpkg/status{,.bak} ; sudo cp /var/lib/dpkg/status{-old,}


perat

Quote: sudo mv /var/lib/dpkg/status{,.bak} ; sudo cp /var/lib/dpkg/status{-old,}

That didn't help:

taras@taras-desktop:~$ sudo mv /var/lib/dpkg/status{,.bak} ; sudo cp /var/lib/dpkg/status{-old,}
taras@taras-desktop:~$ sudo apt-get upgrade
E: dpkg was interrupted, you must manually run 'sudo dpkg --configure -a' to correct the problem.
taras@taras-desktop:~$ sudo dpkg --configure -a
Processing triggers for python-central ...
taras@taras-desktop:~$ sudo apt-get upgrade
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following packages will be upgraded:
  opera
1 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
Need to get 0B/15.8MB of archives.
After this operation, 4,155kB of additional disk space will be used.
Do you want to continue [Y/n]? y
Preconfiguring packages ...
(Reading database ... 75%dpkg: unrecoverable fatal error, aborting:
 files list file for package `libgtk2.0-cil' contains empty filename
E: Sub-process /usr/bin/dpkg returned an error code (2)
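
The message says that the file list dpkg keeps for `libgtk2.0-cil` contains an empty entry. A minimal Python sketch for locating such entries follows. It is not part of dpkg: the function name `blank_filename_lines` is hypothetical, and the path in the usage comment assumes the usual location of per-package file lists under /var/lib/dpkg/info.

```python
from typing import List


def blank_filename_lines(list_text: str) -> List[int]:
    """Return the 1-based numbers of empty or whitespace-only lines in a file list."""
    return [
        number
        for number, line in enumerate(list_text.splitlines(), start=1)
        if line.strip() == ""
    ]


# Usage (assumed path, read-only):
#   from pathlib import Path
#   text = Path("/var/lib/dpkg/info/libgtk2.0-cil.list").read_text()
#   print(blank_filename_lines(text))
```

The sketch only reports line numbers and changes nothing on disk, so it is safe to run before deciding how to repair the list.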

