Problem
System x3630 M4/x3530 M4 servers may report an uncorrectable bus error on the central processing unit (CPU) and an uncorrectable memory error while running memory-intensive applications or memory diagnostics.
The Integrated Management Module (IMM) chassis event log records the following errors, and the system restarts.
CHASSIS:(12/23/2014 02:23:11) An Uncorrectable Bus Error has occurred on bus CPUs.
CHASSIS:(12/23/2014 02:23:01) Uncorrectable error detected for One of the DIMMs on Subsystem System Memory.
The symptom occurs only in configurations with one DIMM per channel: three (3) DIMMs for one (1) CPU, or six (6) DIMMs for two (2) CPUs.
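As a rough illustration, the following Python sketch scans exported log lines for the two messages shown above and reports whether both appear close together in time. The line layout is inferred only from the two sample entries; the month/day/year date order, the 60-second window, and the function names are assumptions made for this example, not part of any IMM tool.

```python
import re
from datetime import datetime, timedelta

# Layout inferred from the two sample entries: CHASSIS:(MM/DD/YYYY HH:MM:SS) message
LINE_RE = re.compile(r"^CHASSIS:\((\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\)\s*(.+)$")


def parse_chassis_line(line):
    """Return (timestamp, message) for a chassis log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    stamp = datetime.strptime(match.group(1), "%m/%d/%Y %H:%M:%S")
    return stamp, match.group(2)


def has_symptom_pair(lines, window_seconds=60):
    """True if a bus error and a memory error are logged within the window (assumed 60 s)."""
    bus_times, memory_times = [], []
    for line in lines:
        parsed = parse_chassis_line(line)
        if parsed is None:
            continue
        stamp, message = parsed
        if "Uncorrectable Bus Error" in message:
            bus_times.append(stamp)
        elif "Uncorrectable error detected" in message:
            memory_times.append(stamp)
    window = timedelta(seconds=window_seconds)
    return any(abs(b - m) <= window for b in bus_times for m in memory_times)
```

A match only suggests that the log resembles the symptom described here; the DIMM configuration and UEFI level still need to be checked.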
Resolving The Problem
Source
RETAIN tip: H213778
Affected configurations
The system may be any of the following IBM servers:
- System x3530 M4, type 7160 E5-xxxxV2, any model
- System x3630 M4, type 7158 E5-xxxxV2, any model
This tip is not software specific.
This tip is not option specific.
The following system UEFI levels are affected: versions 1.60,
1.70, and 1.71.
Solution
This issue is fixed in UEFI firmware version 2.12 and later.
The file is or will be available by selecting the appropriate
Product Group, type of System, Product name, Product machine type,
and operating system on IBM Support’s Fix Central web page, at the
following URL:
http://www.ibm.com/support/fixcentral/
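The version and configuration conditions in this tip can be summarized in a short Python sketch. It assumes UEFI levels are written as "major.minor" strings such as "1.70"; the function names are hypothetical. Levels that the tip does not mention are reported as "unknown" rather than guessed.

```python
# Levels named in this tip; anything else below the fix level is left undetermined.
AFFECTED_LEVELS = {(1, 60), (1, 70), (1, 71)}
FIXED_FROM = (2, 12)


def parse_version(text):
    """Split a 'major.minor' string (assumed format) into a tuple of integers."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def classify_uefi(version):
    """Return 'fixed', 'affected', or 'unknown' for a UEFI level string."""
    level = parse_version(version)
    if level >= FIXED_FROM:
        return "fixed"
    if level in AFFECTED_LEVELS:
        return "affected"
    return "unknown"


def one_dimm_per_channel(cpus, dimms):
    """True for the configurations the tip names: 3 DIMMs with 1 CPU, 6 DIMMs with 2 CPUs."""
    return cpus in (1, 2) and dimms == 3 * cpus
```

For example, `classify_uefi("1.71")` together with `one_dimm_per_channel(2, 6)` describes a system that matches both conditions in this tip.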
Workaround
To work around this issue, change the UEFI setting to lower the
memory speed from the default 1600 MT/s to 1333 MT/s.
In the UEFI setup menu, select ‘System Settings’ -> ‘Operating
Modes’, set ‘Choose Operating Mode’ to ‘Custom Mode’, and then
change ‘Memory Speed’ to ‘Balanced’.
Additional information
The system mistakenly triggers the failure symptom. The UEFI
firmware will address this issue in the next release.
Document Location
Worldwide
Operating System
System x:Operating system independent / None
Posted by dianesalas 2017-06-14T09:22:32Z
Hello.
I replaced the motherboard and IO shuttle after a severe power failure. I am now getting:
0x8007020c2587ffff: Sensor "MemSpareErr7" has transitioned to critical from a less severe state
0x806f08131801ffff: A Uncorrectable Bus Error has occurred on system "Host"
"MemSpareErr7" and "Host" are not found in any documentation, and all the info is vague. Can someone point me in a good direction?
3 Replies
-
Not sure if this will help:
https://www.manualslib.com/manual/1217508/Ibm-X3950-X5.html?page=750
806f0813-1801ffff A Uncorrectable Bus Error has occurred on system [ComputerSystemElementName].
Explanation: IMM has detected a bus uncorrectable error.
May also be shown as 806f08131801ffff or 0x806f08131801ffff
Severity: Error
Alert Category: Critical
Serviceable: No
CIM Information: Prefix: PLAT and ID: 0240
SNMP Trap ID: 50
Automatically notify Support: No
User response: This is a UEFI detected event. The UEFI diagnostic code for this event can be found in the logged IMM message text. Please refer to the UEFI diagnostic code in the "UEFI diagnostic code" section of the Info Center for the appropriate user response.
And these:
http://systemx.lenovofiles.com/help/index.jsp?topic=%2Fcom.lenovo.conv.5462.doc%2Fr_imm_error_messag…
http://systemx.lenovofiles.com/help/index.jsp?topic=%2Fcom.lenovo.conv.5462.doc%2F8007020c2502ffff.h…
8007020c-2502ffff Sensor [SensorElementName] has transitioned to critical from a less severe state. (nvDIMM 02 Status)
Explanation: This message is for the use case when an implementation has detected a Sensor transitioned to critical from less severe.
May also be shown as 8007020c2502ffff or 0x8007020c2502ffff
Severity: Error
Alert Category: Critical - Other
Serviceable: Yes
CIM Information: Prefix: PLAT and ID: 0522
SNMP Trap ID: 50
Automatically notify Support: No
User response: None
-
Did you check the firmware levels?
Is everything reseated properly and grounded?
Check all memory and CPU and cards
Can you access IMM or does it not even get that far?
-
If you get into the BIOS/UEFI check to see that it recognizes the drives
Play with those settings
-
#1
Good afternoon! We have an HP DL 560 Gen9 server running Windows Server 2016. It hosts Microsoft SQL Server 2016 and a 1C:Enterprise server. Over the weekend, error emails arrived from iLO, the server stopped responding to ping, and it apparently hung. The errors were:
- PCI Bus 01/26/2021 15:15 01/26/2021 15:15 1 PCI Bus Error (Slot 0, Bus 0, Device 0, Function 0)
- System Error 01/26/2021 15:15 01/26/2021 15:15 1 Unrecoverable System Error (NMI) has occurred. System Firmware will log additional details in a separate IML entry if possible
- PCI Bus 01/26/2021 15:15 01/26/2021 15:15 1 Uncorrectable PCI Express Error (Embedded device, Bus 0, Device 0, Function 0, Error status 0x00000000)
- PCI Bus 01/26/2021 15:15 [NOT SET] 1 Uncorrectable PCI Express Error (Embedded device, Bus 0, Device 0, Function 0, Error status 0x00000000)
I rebooted the server remotely through iLO, and it seems to have come back to life. But for how long? Please help me understand what this error is and what has failed.
-
#2
HP ProLiant Servers — How to Decode Uncorrectable PCI Express Error
Information
This document helps users decode the Uncorrectable PCI Express Error.
Ex: Uncorrectable PCI Express Error (Embedded device, Bus 0, Device 8, Function 0, Error status 0x00000000)
Details
This particular PCI Express Error could be decoded by using the logs mentioned below.
- Advanced Survey Report.
- lspci Output from a Linux Machine or ESX Machine.
Advanced Survey Report:
NOTE: Use the Vendor ID and the Device ID to determine the hardware device.
LSPCI Output:
If the server is running Linux or ESX, collect the OS logs from the server.
Check lspci.txt in the OS logs for an entry like the one in the example below.
In this example, check the numbers listed before the word Bridge.
000:000:08.0 Bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 8.
000 -> Represents the PCI domain (every PCI domain can have 256 PCI buses).
000 -> Bus
08 -> Device
0 -> Function
By using either of these logs, the PCI Express Error could be narrowed down to the hardware device causing the error.
NOTE: The values in the IML logs and in the Advanced Survey Report are decimal, whereas the values in the lspci command output are hexadecimal. Convert the IML or survey values to hexadecimal before comparing them with the lspci output.
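The decimal-to-hexadecimal conversion described in the note can be sketched in Python. This is only an illustration: the regular expression is inferred from the IML messages quoted in this thread, the function name is hypothetical, and the output uses the domain:bus:device.function layout that Linux `lspci -D` prints (four hex digits for the domain, two for the bus). The padding in the example above differs, so adjust the format string if your lspci.txt looks like that.

```python
import re

# Matches the decimal fields in messages such as
# "Uncorrectable PCI Express Error (Embedded device, Bus 0, Device 8, Function 0, ...)"
IML_RE = re.compile(r"Bus (\d+), Device (\d+), Function (\d+)")


def iml_to_lspci(message, domain=0):
    """Convert the decimal bus/device/function in an IML message to a hex lspci-style address."""
    match = IML_RE.search(message)
    if match is None:
        raise ValueError("no Bus/Device/Function fields found")
    bus, device, function = (int(group) for group in match.groups())
    # PCI limits: 256 buses per domain, 32 devices per bus, 8 functions per device.
    if not (0 <= bus <= 255 and 0 <= device <= 31 and 0 <= function <= 7):
        raise ValueError("bus, device, or function out of PCI range")
    return f"{domain:04x}:{bus:02x}:{device:02x}.{function:x}"
```

For instance, an IML entry reporting Bus 10, Device 26, Function 1 corresponds to `0000:0a:1a.1`, which is the string to search for in the lspci output.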
-
#3
If the firmware is not the latest version, you can try to fix this with the service pack SPP Gen9 Production Version *: 2021.10.0
Alerts of this kind are often caused by outdated firmware on components such as the System ROM and iLO.
-
#4
I'll try it, but I don't think it will help. I'll write back later. Right now I can't reboot the server.
-
#5
Whenever you contact HP support, or in any other unclear situation, they give the standard answer: update the firmware.
-
#6
I updated all the firmware. Smooth sailing so far. I think it helped.
806f0813-2581ffff • 806f0813-2584ffff
806f0813-2581ffff A Uncorrectable Bus Error has occurred on system [ComputerSystemElementName].
Explanation: IMM has detected a bus uncorrectable error.
May also be shown as 806f08132581ffff or 0x806f08132581ffff
Severity: Error
Alert Category: Critical
Serviceable: Yes
CIM Information: Prefix: PLAT and ID: 0240
SNMP Trap ID: 50
Automatically notify Support: Yes
User response: This is a UEFI detected event. The UEFI diagnostic code for this event can be found in the logged
IMM message text. Please refer to the UEFI diagnostic code in the "UEFI diagnostic code" section of the Info Center
for the appropriate user response.
806f0813-2582ffff A Uncorrectable Bus Error has occurred on system [ComputerSystemElementName].
Explanation: IMM has detected a Bus Uncorrectable Error.
May also be shown as 806f08132582ffff or 0x806f08132582ffff
Severity: Error
Alert Category: Critical
Serviceable: Yes
CIM Information: Prefix: PLAT and ID: 0240
SNMP Trap ID: 50
Automatically notify Support: Yes
User response: This is a UEFI detected event. The UEFI diagnostic code for this event can be found in the logged
IMM message text. Please refer to the UEFI diagnostic code in the "UEFI diagnostic code" section of the Info Center
for the appropriate user response.
806f0813-2584ffff A Uncorrectable Bus Error has occurred on system [ComputerSystemElementName].
Explanation: IMM has detected a bus uncorrectable error.
May also be shown as 806f08132584ffff or 0x806f08132584ffff
Severity: Error
Alert Category: Critical
Serviceable: No
CIM Information: Prefix: PLAT and ID: 0240
SNMP Trap ID: 50
Automatically notify Support: No
User response: This is a UEFI detected event. The UEFI diagnostic code for this event can be found in the logged
IMM message text. Please refer to the UEFI diagnostic code in the "UEFI diagnostic code" section of the Info Center
for the appropriate user response.
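Because each entry notes that the same event "may also be shown as" a plain or 0x-prefixed 16-digit value, a small normalizer can help when searching logs and documentation for one event. The Python sketch below is an illustration only; the function name is hypothetical and it assumes every event ID is exactly 16 hexadecimal digits, as in the entries above.

```python
import re

# Accepts 806f0813-2581ffff, 806f08132581ffff, or 0x806f08132581ffff
EVENT_RE = re.compile(r"^(?:0x)?([0-9a-f]{8})-?([0-9a-f]{8})$", re.IGNORECASE)


def normalize_event_id(text):
    """Return the hyphenated lower-case form used as the entry heading in the manual."""
    match = EVENT_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognized event ID: {text!r}")
    return f"{match.group(1).lower()}-{match.group(2).lower()}"
```

With this, the ID `0x806f08131801ffff` from an event log and the heading `806f0813-1801ffff` from the manual compare as equal after normalization.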
Appendix B. Integrated management module error messages