Error Code
Message Information
MEM8000
Message
Correctable memory error logging disabled for a memory device at location <location>.
Details
Errors are being corrected but no longer logged.
Action
Review system logs for memory exceptions. Reinstall memory at location <location>.
PCI1302
Message
A bus time-out was detected on a component at bus <bus> device <device> function <func>.
Details
System performance may be degraded. The device has failed to respond to a transaction.
Action
Cycle input power and update the component drivers. If the device is removable, reinstall the device.
PCI1304
Message
An I/O channel check error was detected.
Action
Cycle input power and update the component drivers. If the device is removable, reinstall the device.
PCI1308
Message
A PCI parity error was detected on a component at bus <bus> device <device> function <func>.
Details
System performance may be degraded, PCI device may fail to operate, or system may fail to operate.
Action
Cycle input power and update the component drivers. If the device is removable, reinstall the device.
PCI1320
Message
A bus fatal error was detected on a component at bus <bus> device <device> function <func>.
Details
System performance may be degraded, or system may fail to operate.
Action
Cycle input power and update the component drivers. If the device is removable, reinstall the device.
PCI1342
Message
A bus time-out was detected on a component at slot <number>.
Details
System performance may be degraded, or system may fail to operate.
Action
Cycle input power and update the component drivers. If the device is removable, reinstall the device.
PCI1348
Message
A PCI parity error was detected on a component at slot <number>.
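The message templates above carry placeholders such as <bus>, <device>, <func>, <number>, and <location> that are filled in when the event is logged. As a rough illustration only, the substitution could be sketched in Python as follows. The `TEMPLATES` dictionary and the `render_message` helper are hypothetical names made up for this sketch and are not part of any vendor tool; the template strings are copied from the entries above.

```python
import re

# Templates copied from the entries above. The dictionary itself is a
# hypothetical structure used only for this illustration.
TEMPLATES = {
    "PCI1302": "A bus time-out was detected on a component at bus <bus> "
               "device <device> function <func>.",
    "PCI1342": "A bus time-out was detected on a component at slot <number>.",
    "MEM8000": "Correctable memory error logging disabled for a memory "
               "device at location <location>.",
}


def render_message(code, **args):
    """Replace each <name> placeholder with the matching keyword argument."""
    template = TEMPLATES[code]

    def substitute(match):
        name = match.group(1)
        if name not in args:
            raise KeyError(f"missing argument {name!r} for {code}")
        return str(args[name])

    return re.sub(r"<(\w+)>", substitute, template)


print(render_message("PCI1302", bus=3, device=0, func=1))
```

Raising on a missing argument, rather than leaving the placeholder in the text, makes an incomplete event record visible immediately.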
Updated 14.12.2016
Hi everyone. Today an IBM Blade HS22 threw the error Correctable ECC memory error logging limit reached, and I'll explain how to fix it. The problem shows up in the AMM logs. For anyone not familiar with it, AMM is the web interface for managing a chassis of IBM blade servers.
Here is how the error looks in the AMM.
[Screenshot: Correctable ECC memory error logging limit reached error on IBM HS22-1]
The Correctable ECC memory error logging limit reached error points to a problem with RAM. IBM itself advises first updating all firmware to the latest level and, if that doesn't help, pulling the blade and reseating the DDR memory.
The error is also present in the logs, with code 0x806f050c.
[Screenshot: Correctable ECC memory error logging limit reached error on IBM HS22-2]
I took the first route and decided to update everything. I described earlier how to update all firmware on the IBM Blade HS22.
After the update, the logs show the error in the recovery state.
[Screenshot: Correctable ECC memory error logging limit reached error on IBM HS22-11]
Once the server is rebooted after the update, you'll see that the error is gone and everything is green.
[Screenshot: How to update all firmware on IBM Blade HS22-10]
That's how simply the Correctable ECC memory error logging limit reached error on the IBM HS22 is resolved.
Material from the site pyatilistnik.org
Dec 14, 2016 10:49
Suggest you check the BIOS logs on the servers that have had the problem.
We get that every so often on a Dell 2950 with quad-core Intels. I looked in the BIOS log the other day and found a note saying that DIMM 8 had been ‘disabled’ because it failed an ECC check. We rebooted the server after ESX failed and it runs for ages, then the BIOS locks out DIMM 8 and things go hinky again.
It doesn’t seem to matter which physical DIMM is in socket 8. Even when the memory is swapped out we get the same issue: an uncorrectable memory error in ESX and a ‘DIMM8 disabled due to ECC failure’ in the BIOS.
I ended up leaving the last pair of DIMMs out, and the error hasn’t occurred for about 3 months now.
My suspicion comes from something I saw with HP DL servers once. Each DIMM can be seen as single rank per side or dual rank per side, and some servers can only take so many ranks. The DL380 G4 had a limitation (I seem to recall) of 8 ranks total and no more than dual-rank (2 ranks) per DIMM, so a dual-sided, dual-rank DIMM counted as four ranks (and thus wouldn’t work, or wouldn’t function as expected).
With the Dell, I believe the 4GB DIMMs I had in it (which counted as dual-sided AND dual-rank) might somehow exceed whatever the rank limitation is. I haven’t found a published limit; it’s just a theory based on experience.
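The rank arithmetic in this post can be written out as a small check. This is only a sketch of the reasoning as the poster recalls it: the limit of 8 ranks in total, the limit of 2 ranks per DIMM, and the "sides times ranks per side" counting rule are the poster's recollection and not a published specification, and `total_ranks` and `fits` are hypothetical helper names.

```python
def total_ranks(dimms):
    """Sum ranks over (sides, ranks_per_side) pairs, counting as the post does."""
    return sum(sides * ranks_per_side for sides, ranks_per_side in dimms)


def fits(dimms, max_total_ranks=8, max_ranks_per_dimm=2):
    """Check both limits recalled in the post (assumed values, not vendor data)."""
    per_dimm_ok = all(s * r <= max_ranks_per_dimm for s, r in dimms)
    return per_dimm_ok and total_ranks(dimms) <= max_total_ranks


dual_sided_dual_rank = (2, 2)    # counts as 4 ranks in the poster's scheme
single_sided_dual_rank = (1, 2)  # counts as 2 ranks

print(total_ranks([dual_sided_dual_rank]))  # 4
print(fits([dual_sided_dual_rank] * 2))     # False: 4 ranks per DIMM exceeds 2
print(fits([single_sided_dual_rank] * 4))   # True: 8 ranks total, 2 per DIMM
```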
Detailed Description
The memory may not be operational. This is an early indicator of a possible future uncorrectable error.
Recommended Response Action
Re-install the memory component. If the problem continues, contact support.
Category
System Health
SubCategory
MEM = Memory
Severity
Severity 1 (Critical)
Trap/EventID
FALSE
LCD Message
Correctable memory error rate exceeded for <location>. Reseat memory.
Filter Visibility
IPMI Alert, SNMP Alert, Alert, LC Log, LCD, Power Off, Power Cycle, Reset
Initial Default
FALSE for every column
MEM8000
Message
Correctable memory error logging disabled for a memory device at location arg1.
Arguments
• arg1 = location
Detailed Description
Errors are being corrected but no longer logged.
Recommended Response Action
Review system logs for memory exceptions. Re-install memory at location <location>
Category
System Health
SubCategory
MEM = Memory
Severity
Severity 1 (Critical)
Trap/EventID
FALSE
LCD Message
SBE log disabled on <location>. Reseat memory
Filter Visibility
IPMI Alert, SNMP Alert, Alert, LC Log, LCD, Power Off, Power Cycle, Reset
Initial Default
FALSE for every column
MEM8001
Message
Persistent correctable memory error logging enabled for a memory device at location arg1.
Arguments
• arg1 = location
Problem
The Error Light Emitting Diode (LED) is illuminated on the chassis and on the BladeCenter HS22 blade server front information panel. The Advanced Management Module (AMM) system status indicates a "correctable ECC memory error logging limit reached" error. The AMM logs the following errors:
19 E Blade_05 12/08/09, 11:29:06 (octans012)
Correctable memory error logging limit reached
20 E Blade_05 12/08/09, 11:29:05 (octans012)
Correctable memory error logging limit reached on DIMM 5
The memory errors occur in the following BladeCenter HS22 configuration:
- CPU C-States [Enable]
- Thermal Mode [Normal] double refresh rate
- 4 Gigabyte (GB) Samsung VLP DIMMs installed, Option part number 44T1488, replacement part number (FRU) 44T1498.
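When many blades report this error, it can help to pull the affected DIMM numbers out of the AMM log text. The following Python sketch parses entries shaped like the two quoted above. It assumes every entry is a header line (index, severity letter, source, date, time, name in parentheses) followed by one message line, which is inferred from this excerpt only and may not hold for a full AMM export. `parse_amm_log` and the field names are hypothetical.

```python
import re

# Header layout inferred from the two sample entries in this tip (an assumption).
HEADER = re.compile(
    r"^(?P<index>\d+)\s+(?P<severity>\w)\s+(?P<source>\S+)\s+"
    r"(?P<date>\d{2}/\d{2}/\d{2}),\s+(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"\((?P<name>[^)]*)\)$"
)
DIMM = re.compile(r"\bDIMM\s+(\d+)")


def parse_amm_log(text):
    """Return one dict per header/message pair found in the log text."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    entries = []
    i = 0
    while i < len(lines):
        header = HEADER.match(lines[i])
        has_message = i + 1 < len(lines) and not HEADER.match(lines[i + 1])
        if header and has_message:
            entry = header.groupdict()
            entry["message"] = lines[i + 1]
            dimm = DIMM.search(lines[i + 1])
            entry["dimm"] = int(dimm.group(1)) if dimm else None
            entries.append(entry)
            i += 2
        else:
            i += 1
    return entries


SAMPLE = """\
19 E Blade_05 12/08/09, 11:29:06 (octans012)
Correctable memory error logging limit reached
20 E Blade_05 12/08/09, 11:29:05 (octans012)
Correctable memory error logging limit reached on DIMM 5
"""

for entry in parse_amm_log(SAMPLE):
    print(entry["index"], entry["source"], entry["dimm"])
```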
Resolving The Problem
Source
RETAIN tip: H196525
Affected configurations
The system may be any of the following IBM servers:
- BladeCenter HS22, Type 1936, any model
- BladeCenter HS22, Type 7870, any model
This tip is not software specific.
This tip is not option specific.
The system has the symptom described above.
Solution
Choose one of the following two (2) methods to resolve the errors:
Method 1: Change Thermal Mode setting (preferred method)
- Boot the blade into the F1 "System Configuration and Boot Management" screen. Highlight "System Settings." Press Enter and select Memory. Select Thermal Mode and change the setting to "Performance."
- Press the Esc key twice to get to "System Configuration and Boot Management" and then select Save Settings and Exit Setup.
- Follow the instructions on the next screen to exit the "Setup Utility."
- Power the blade off for the changes to take effect and restart.
Changing "Normal" mode to "Performance" mode affects the way that the Dual In-Line Memory Modules (DIMMs) are refreshed. This results in a DIMM temperature warning message occurring at a 10 degree lower temperature. This causes no impact in most industry standard data centers.
Method 2: Disable CPU C-State
- Boot the blade into the F1 "System Configuration and Boot Management" screen. Highlight System Settings, press Enter, and select Processors. Select CPU C-States, and then change the setting to "Disable."
- Press the Esc key twice to get to "System Configuration and Boot Management" and then select Save Settings and Exit Setup.
- Follow the instructions on the next screen to exit the "Setup Utility."
- Power the blade off for the changes to take effect and restart.
If the LED stays on after the changes have been made, do one of the following to turn it off:
- Use the IPMItool application (a third-party application available for Windows and Linux):
  - ipmitool sel list (to verify the log contains messages)
  - ipmitool sel clear
  - ipmitool sel list (to verify the log is now empty)
- Restart the IMM. This can be done via the AMM GUI interface (select Blade Tasks, Power/Restart, and Restart Blade System Mgmt Processor for the appropriate blade) or with the ASU command line tool (asu rebootimm).
- Fully power the blade off, then power it back on (do not restart the blade). This can be done with the AMM or locally at the blade.
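The three ipmitool steps listed above can be scripted so that the log contents before and after the clear are captured together. The Python sketch below is one possible way to do that, not an IBM-provided procedure. It assumes ipmitool is installed and can reach the management controller from the host it runs on; `run_ipmitool`, `clear_sel`, and `fake_run` are hypothetical names. The demonstration at the end uses a stand-in runner that only prints the commands, so nothing is sent to hardware.

```python
import subprocess


def run_ipmitool(args):
    """Run ipmitool with the given arguments and return its standard output."""
    result = subprocess.run(
        ["ipmitool", *args], capture_output=True, text=True, check=True
    )
    return result.stdout


def clear_sel(run=run_ipmitool):
    """List the SEL, clear it, then list it again, mirroring the steps above.

    Returns the listing text from before and after the clear so the caller
    can inspect both.
    """
    before = run(["sel", "list"])
    run(["sel", "clear"])
    after = run(["sel", "list"])
    return before, after


def fake_run(args):
    """Stand-in runner for a dry run: print the command instead of executing it."""
    print("would run: ipmitool", " ".join(args))
    return ""


clear_sel(run=fake_run)
```

On a real system, calling `clear_sel()` with no argument runs the commands through `run_ipmitool`. Passing the runner as a parameter keeps the sequence testable without a management controller.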
Additional information
This error message usually indicates a failing DIMM. However, a very rare condition has been identified with Samsung DIMMs that can cause a false error. Implementing either of the recommended workarounds above should prevent the false "correctable ECC memory error logging limit reached" error from occurring.
Note: The false "correctable ECC memory error logging limit reached" error does not indicate defective DIMMs.