WHEA-Logger 18 cache hierarchy error

Table of Contents

  • Applies to:
  • Details
  • Explanation
  • User Action
  • Related Information

Applies to:

Windows Server 2008, Windows Vista, Windows Server 2008 R2, Windows 7

Details

Product:

Windows Operating System

Event ID:

18

Source:

Microsoft-Windows-WHEA-Logger          

Version:

6.1

Symbolic Name:

Boot Performance Monitoring

Message:

A fatal hardware error has occurred.

Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Unknown Error
Processor ID: 1

The details view of this entry contains further information.
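
To pull these entries out of the System log without clicking through Event Viewer, the log can be filtered on the source and event ID listed above. The following is a minimal sketch, not an official procedure: it only builds a command line for the built-in wevtutil tool, and the helper names `build_whea_query` and `run_query` and the default count of 10 are illustrative assumptions. The command itself only works on Windows.

```python
import subprocess

# Provider name and event ID as listed in the Details section.
PROVIDER = "Microsoft-Windows-WHEA-Logger"


def build_whea_query(event_id=18, count=10):
    """Return a wevtutil argument list for the newest matching events."""
    xpath = (
        f"*[System[Provider[@Name='{PROVIDER}'] "
        f"and (EventID={event_id})]]"
    )
    return [
        "wevtutil", "qe", "System",  # query-events on the System log
        f"/q:{xpath}",               # XPath filter
        f"/c:{count}",               # maximum number of events
        "/rd:true",                  # newest first
        "/f:xml",                    # XML output, one <Event> per record
    ]


def run_query(args):
    """Run the query (Windows only) and return the raw XML text."""
    done = subprocess.run(args, capture_output=True, text=True, check=True)
    return done.stdout
```

Calling `run_query(build_whea_query())` on the affected machine should return the same XML that the details view of each entry shows.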

Explanation

This error indicates a hardware problem. A machine check exception is raised when the computer’s central processing unit detects a hardware error.

Note: WHEA stands for Windows Hardware Error Architecture.

The main hardware problems that cause machine check exceptions include:

  • System bus errors (errors in communication between the processor and the motherboard)
  • Memory errors, which may include parity and error correction code (ECC) problems. Error checking verifies that data is stored correctly in RAM; if the data is corrupted, random errors occur.
  • Cache errors in the processor. The cache stores important data and code; if it is corrupted, errors often occur.
  • Poor voltage regulation (e.g. power supply problem, voltage regulator malfunction, capacitor degradation)
  • Damage due to power spikes
  • Static damage to the motherboard 
  • Incorrect processor voltage setting in the BIOS (too low or too high)
  • Overclocking
  • Permanent motherboard or power supply damage caused by prior overclocking
  • Excessive temperature caused by insufficient airflow (possibly caused by fan failure or blockage of air inlet/outlet)
  • Improper BIOS initialization (the BIOS configuring the motherboard or CPU incorrectly)
  • Installation of a processor that the motherboard cannot support (excessive power requirement, incompatibility)
  • Defective hardware that may be drawing excessive power or otherwise disrupting proper voltage regulation

User Action

  • Update the BIOS and the drivers for the motherboard chipset.   
  • Update all the hardware drivers, if updates are available from your manufacturer. 
  • Check the temperature inside the computer to make sure your processor and related peripherals are not overheating.
  • Check the fan on your CPU to make sure it is properly attached to the CPU.
  • If you have overclocked your CPU, reset it to the default settings.
  • Make sure your power supply fan is working correctly.

Related Information


WHEA Design Guide

http://msdn.microsoft.com/en-us/library/ff559288(v=vs.85).aspx


WHEA —  Windows Hardware Error Architecture Overview

http://msdn.microsoft.com/en-us/windows/hardware/gg463286

  • #1

Hey guys,

For weeks I’ve had problems with my PC: it randomly shuts down (90% of the time with a BSOD).

CPU: AMD Ryzen 7 5800X 3.80 GHz
Motherboard: MSI MPG X570 Gaming Edge Wifi
GPU: AMD Radeon RX 5700 XT Red Devil 8GB
Memory: Corsair DIMM 32GB DDR4-3600 (2x 18GB)

Samsung SSD 860 EVO 1TB
NVMe Samsung SSD970 SCSI Disk Device
KINGSTON SA400S37480G

First time today:
Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor APIC ID: 0

Second time today:
Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor APIC ID: 14

Anyone that could help?
Updated my BIOS last week to current version.
CPU & GPU drivers are up to date.

Chrispy_


  • #2

Run your DDR4 at default speeds (i.e., disable XMP) for a week and see if that makes a difference.

I’ve built several hundred Ryzen 3000 and 5000 computers and have only seen the cache hierarchy error twice. Both times I identified a faulty CPU by swapping the CPU, and only the CPU, with another PC and watching the problem move with it.

AMD replaced both under warranty.

P4-630


  • #3

Corsair DIMM 32GB DDR4-3600 (2x 18GB)

That’s some rare memory sticks you got there…

  • #4

32GB (2x16GB) Corsair Vengeance RGB PRO DDR4-3600 RAM CL18 (18-22-22-42) Kit

https://www.cyberport.at/marken/corsair.html

Apparently XMP was already disabled.
I just enabled it to see if that makes a difference.

I’ll need to check whether I know someone with a rig that fits my CPU, to see if the problem also occurs there.

Attachment: WhatsApp Image 2022-07-24 at 11.29.28 PM.jpeg

Last edited: Jul 24, 2022

Chrispy_


  • #5

Enabling it won’t improve stability.
Sounds like a dodgy CPU but it can’t hurt to update your motherboard BIOS to the latest version in case there is an issue with the version you’re running.

Honestly though, the cache hierarchy error really is something you only tend to see when the CPU is faulty. It’s not a symptom of something else like bad RAM or GPU, it’s literally a CPU hardware error.

If it’s only rebooting twice a day then it’s marginal, and you may find you can correct for it by throwing a bit more voltage at the CPU. But the fact you’re getting errors at stock speeds means you should probably RMA it while you still have a warranty, provided you can prove it’s the CPU at fault.

Last edited: Jul 24, 2022

GerKNG


  • #6

Any OC, manual Voltage, Curve Optimizer or other tweaks?

If not, run OCCT’s Memtest with AVX for at least 30-40 minutes.

Zach_01


  • #7

WHEA Event ID 18

Error Type: Cache Hierarchy Error

Indicates some issue with the memory subsystem (DRAM <> MemoryController=UMC <> DataFabric=IF=InfinityFabric).

Usually this type is caused by overclocking any of the related parts (DRAM <> UMC <> IF), and sometimes by an XMP/DOCP profile as well. I never saw it at the default 2133MHz though, or at least I don’t remember it…
Sometimes a particular combination of DRAM, board, and CPU can also cause this, even with no OC beyond XMP.
I’m not excluding a faulty CPU, as the UMC and IF are parts of the CPU package.

Can you post ZenTimings screenshots (latest version) for both situations (XMP on/off)?

MCLK is DRAM speed
FCLK is IF speed
UCLK is UMC speed
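
As a quick sanity check on what ZenTimings should show, the three clocks can be derived from the DDR4 rating. This is a minimal sketch that assumes the usual coupled 1:1:1 mode, where FCLK and UCLK both follow MCLK; the function name `expected_clocks` is made up for illustration, and an uncoupled configuration would show different FCLK/UCLK values.

```python
def expected_clocks(ddr_rating_mts):
    """Return expected MCLK/FCLK/UCLK in MHz for a DDR4 rating in MT/s.

    Assumption: coupled 1:1:1 mode, so FCLK and UCLK equal MCLK.
    """
    # DDR memory performs two transfers per clock, so the real
    # memory clock is half of the advertised transfer rate.
    mclk = ddr_rating_mts / 2
    return {"MCLK": mclk, "FCLK": mclk, "UCLK": mclk}


for rating in (2133, 3600):
    print(rating, expected_clocks(rating))
```

For a DDR4-3600 kit with XMP enabled this gives 1800 MHz for all three clocks, and 1066.5 MHz at the 2133 default.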

Attachment: 1658734226216.png

Last edited: Jul 25, 2022

  • #8

I’ve never overclocked any of the parts, since I don’t think it’s necessary with this hardware.
After the first couple of BSODs I read somewhere that underclocking the CPU could help, but after trying several different settings the PC would no longer boot at all.

I still have a Ryzen 5 3600 from my old rig.
I could test whether that one works. I only need to buy some new cooling paste.

Here are the two screenshots.

No OC. Only some underclocking after the first couple of BSODs, but then it wouldn’t start anymore.
The only tweaks I made were fan related, for cooling purposes.

I’ll try the Memtest later today.

Attachment: ZenTimings_Screenshot_XMP_off.png
Attachment: ZenTimings_Screenshot_XMP_on.png

Zach_01


  • #9

What happened with XMP enabled? Same?

BTW all looks normal in both your ZenTimings shots.
Try a different CPU with this DRAM/board combo, and if you can borrow different DRAM sticks, try those with the same 5800X/board as well. DRAM between 3200 and 3600MHz.

Some Corsair kits are designed for Ryzen, but most of them may cause issues like this, again with certain CPU/board combos.

Chrispy_


  • #10

I have used Corsair Vengeance (not LPX) in several builds and not had the grief that LPX so often causes. That is still a small sample size of maybe a dozen machines, as I prefer to steer clear of Corsair RAM altogether given my ridiculous RMA rate with LPX.

I don’t typically get cache hierarchy errors with bad RAM, but I guess it wouldn’t hurt to run an OCCT memtest or even boot to Memtest86 to confirm that the RAM is actually stable. Plenty of the faulty Corsair LPX I’ve seen failed at JEDEC default speeds and voltage, so that part isn’t too surprising to me.

DeathtoGnomes


  • #11

might wanna consider re-seating RAM and maybe even changing slots.

puma99dk|


  • #12

@strikemaker could you fill out your System Specs?

Also what PSU are you powering your system with?

Some RAM, yes, but I’ve been running my mixed Geil Dragon DDR4 kits with Hynix and Samsung B-die chips at 3000MHz (the Hynix kit’s speed), and both kits ran stable with my AMD Ryzen 9 3900X on an Asus ROG Strix B550-A Gaming. Now I’m on an Intel Core i7-11700K and a Gigabyte Z590 Vision G, which BSODs with either XMP or a manual 3000MHz tune on both kits. I’ve been talking with Gigabyte support, but they won’t suggest anything and say it’s a Geil problem if I can’t use XMP or manually tune my RAM.

Maybe I was just lucky with my Geil Dragon DDR4 since I only ran it at 3000MHz on an AMD Ryzen 3000 series CPU, or maybe Asus just did a good job with their BIOS and B550 board.

  • #13

Nothing special happened with the memtest.

Temperatures also stayed way lower than when I’m gaming. CPU could reach 70-75 °C.
I’ve ordered some MX4 cooling paste, so I can try the whole setup with a different Ryzen CPU.

Attachment: OCCT-Screenshot-20220725-134752.png

GerKNG


  • #14

If you swap the CPUs, take a look at the pins and the socket. Normally a CPU does not die or malfunction that easily.
I know the good ol’ idle crashes and similar (which can be countered by changing the PSU Idle Control to Typical instead of Auto/Low),
but since you get BSODs I don’t think that’s the problem.
As a next step I’d reinstall Windows.

Last edited: Jul 25, 2022

  • #15

I checked the CPU 2 weeks ago. Pins are okay. Socket is okay. I put new cooling paste on it, and I thought that helped for a short while.

The PSU is 800 watts, more overkill than not enough.

Windows was also reinstalled 2 weeks ago.

  • #16

Try different RAM (not Corsair) and get back to us.

GerKNG


  • #17

I own the whole Zen 3 lineup, and some CPUs twice.

I had a launch 5800X that crashed randomly and I never fixed it.
You could try changing the PSU idle control.

In the BIOS there should be an AMD CBS option under CPU config.
There you’ll find the Power Supply Idle Control mode, which should be on Auto.
Change that to Typical and see what happens. (I have two CPUs that need it on Typical to function normally; one was RMA’d and replaced.)

  • #18

After switching this to Typical I got this an hour later:

Attachment: 1658780379887.png

GerKNG


  • #19

TPM error? or is that from the crash?

  • #20

It’s all from the crash. And now it happened again.

Both times while playing Dota 2.

Attachment: 1658780769280.png

Assimilator


  • #21

Corsair RAM has a long history of not liking Ryzen CPUs. As others have said, get some other RAM to test with. If the same issue occurs with different RAM then the CPU is the issue.

tabascosauz


  • #22

Cache Hierarchy type WHEA is strictly CPU core related. If it was Fabric/memory controller, it would be Bus/Interconnect type (WHEA 19 I think).

Either:

  • the board you have doesn’t like your CPU and isn’t giving it enough idle or load Vcore,
  • the BIOS you’re currently on doesn’t like your CPU and isn’t giving it enough Vcore,
  • or your CPU is bad and needs to be RMA’d

Have not yet heard of bent pins causing Cache Hierarchy, but you never know. CPU being bad is rare, but WHEA 18 used to happen quite a bit for earlier production Ryzen 5000 (late 2020 or very early 2021).

In the meantime, you can try some different BIOSes, or up the Vcore a bit by using a slight positive Vcore offset (don’t exactly remember how the MSI BIOS setting is laid out), or using positive Curve Optimizer offset on the affected APIC ID cores (looks like it’s all over the place so maybe just apply to all cores).

Otherwise, you can try setting Power Supply Idle Control to Typical as mentioned. A more drastic measure would be to disable C-states entirely (the Global C-states setting, not DF C-states).

If all fails, it’s time to hit up AMD RMA. They will try to make you jump through troubleshooting hoops, just document what you’ve already done and be persistent that troubleshooting is useless.
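
To judge whether the errors cluster on one core or are "all over the place", the APIC IDs from the logged events can be tallied per physical core. This is a rough sketch under a stated assumption: on an 8-core, 16-thread part such as the 5800X, the two SMT threads of a core are taken to have consecutive APIC IDs, so the core index is the APIC ID divided by two. The function name `errors_per_core` is hypothetical.

```python
from collections import Counter


def errors_per_core(apic_ids, threads_per_core=2):
    """Count machine check events per physical core.

    Assumption: SMT siblings have consecutive APIC IDs, so
    core index = APIC ID // threads_per_core.
    """
    return Counter(apic_id // threads_per_core for apic_id in apic_ids)


# The two events reported in the first post: APIC ID 0 and APIC ID 14.
print(errors_per_core([0, 14]))
```

With the two events from the first post, the tally lands on two different cores (0 and 7), which fits the suggestion to apply any offset to all cores rather than a single one.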

  • #23

Just switched my RAM back to my old sticks:
Crucial Ballistix Sport LT 16GB (2x 8GB) DDR4

Curious what will happen now.
That kit always worked with my Ryzen 5 2600.

Attachment: ZenTimings_Screenshot.png

Zach_01


  • #24

Is that with XMP disabled (2400MHz), or is that their XMP speed?

  • #25

XMP is not enabled

Last edited: Jul 25, 2022

Good day, everyone! The problem is as follows: on a new Lenovo v580c 20160 laptop, the following event is being logged:

Event 18, WHEA-Logger
A fatal hardware error has occurred.
Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor ID: 0
The details view of this entry contains further information.

Log Name: System
Source: Microsoft-Windows-WHEA-Logger
Date: 12.01.2014 12:30:40
Event ID: 18
Task Category: None
Level: Error
Keywords:
User: LOCAL SERVICE
Computer: lt-metelkov
Description:
A fatal hardware error has occurred.

Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor ID: 0

The details view of this entry contains further information.
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="Microsoft-Windows-WHEA-Logger" Guid="{C26C4F3C-3F66-4E99-8F8A-39405CFED220}" />
<EventID>18</EventID>
<Version>0</Version>
<Level>2</Level>
<Task>0</Task>
<Opcode>0</Opcode>
<Keywords>0x8000000000000000</Keywords>
<TimeCreated SystemTime="2014-01-12T08:30:40.690060600Z" />
<EventRecordID>4087</EventRecordID>
<Correlation ActivityID="{67043670-53BB-471A-A7D6-7AD5AB937AEE}" />
<Execution ProcessID="1844" ThreadID="2400" />
<Channel>System</Channel>
<Computer>laptop</Computer>
<Security UserID="S-1-5-19" />
</System>
<EventData>
<Data Name="ErrorSource">3</Data>
<Data Name="ApicId">0</Data>
<Data Name="MCABank">6</Data>
<Data Name="MciStat">0xae2000000003110a</Data>
<Data Name="MciAddr">0xffb05000</Data>
<Data Name="MciMisc">0x38a0000086</Data>
<Data Name="ErrorType">9</Data>
<Data Name="TransactionType">2</Data>
<Data Name="Participation">256</Data>
<Data Name="RequestType">0</Data>
<Data Name="MemorIO">256</Data>
<Data Name="MemHierarchyLvl">2</Data>
<Data Name="Timeout">256</Data>
<Data Name="OperationType">256</Data>
<Data Name="Channel">256</Data>
<Data Name="Length">928</Data>
<Data Name=»RawData»>435045521002FFFFFFFF03000100000002000000A0030000091E08000C010E140000000000000000000000000000000000000000000000000000000000000000BDC407CF89B7184EB3C41F732CB57131FE6FF5E89C91C54CBA8865ABE14913BBE4DEE47E700FCF0102000000000000000000000000000000000000000000000058010000C00000000102000001000000ADCC7698B447DB4BB65E16F193C4F3DB0000000000000000000000000000000001000000000000000000000000000000000000000000000018020000800000000102000000000000B0A03EDC44A19747B95B53FA242B6E1D0000000000000000000000000000000001000000000000000000000000000000000000000000000098020000080100000102000000000000011D1E8AF94257459C33565E5CC3F7E8000000000000000000000000000000000100000000000000000000000000000000000000000000007F010000000000000002010000020000A90603000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000007000000000000000000000000000000A906030000081000BFE3BA7DFFFBEBBF0000000000000000000000000000000000000000000000000000000000000000F50157A5EFE3DE43AC72249B573FAD2C03000000000000009F008206000000000050B0FF000000000000000000000000000000000000000000000000000000000100000001000000625B0881700FCF010000000000000000000000000000000000000000060000000A110300000020AE0050B0FF00000000860000A0380000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000</Data>
</EventData>
</Event>
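
The named fields in the event XML can be read programmatically, and the MciStat value can be split into its status bits. The sketch below is illustrative only: it uses a trimmed copy of the event above as sample input, the function names `event_data` and `decode_mci_status` are made up, and the bit layout (VAL, OVER, UC, EN, MISCV, ADDRV, PCC in bits 63-57, and the cache hierarchy code format 000F 0001 RRRR TTLL in the low 16 bits) is the architectural one for Intel processors such as the one in this laptop. Other vendors define some of these fields differently.

```python
import xml.etree.ElementTree as ET

NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Trimmed copy of the event above, used only as sample input.
SAMPLE = """<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System><EventID>18</EventID></System>
<EventData>
<Data Name="ApicId">0</Data>
<Data Name="MCABank">6</Data>
<Data Name="MciStat">0xae2000000003110a</Data>
</EventData>
</Event>"""

FLAGS = {63: "VAL", 62: "OVER", 61: "UC", 60: "EN",
         59: "MISCV", 58: "ADDRV", 57: "PCC"}
REQUEST = ["ERR", "RD", "WR", "DRD", "DWR", "IRD",
           "PREFETCH", "EVICT", "SNOOP"]
TRANSACTION = ["instruction", "data", "generic"]
LEVEL = ["L0", "L1", "L2", "generic"]


def event_data(xml_text):
    """Return the <EventData> fields of one event as a dict of strings."""
    root = ET.fromstring(xml_text)
    return {d.get("Name"): d.text
            for d in root.findall("e:EventData/e:Data", NS)}


def decode_mci_status(value):
    """Split a 64-bit MCi_STATUS value into flags and error-code fields."""
    code = value & 0xFFFF
    result = {
        "flags": [name for bit, name in FLAGS.items() if (value >> bit) & 1],
        "mca_code": hex(code),
        "model_specific_code": hex((value >> 16) & 0xFFFF),
    }
    # Cache hierarchy compound code: 000F 0001 RRRR TTLL (F is ignored).
    if (code & 0xEF00) == 0x0100:
        rrrr, tt, ll = (code >> 4) & 0xF, (code >> 2) & 0x3, code & 0x3
        result["request"] = REQUEST[rrrr] if rrrr < len(REQUEST) else "reserved"
        result["transaction"] = TRANSACTION[tt] if tt < 3 else "reserved"
        result["level"] = LEVEL[ll]
    return result


fields = event_data(SAMPLE)
print(decode_mci_status(int(fields["MciStat"], 16)))
```

For this event the low 16 bits (0x110a) decode to a generic transaction at cache level L2, which agrees with the TransactionType and MemHierarchyLvl values of 2 that Windows reports in the same record.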

The computer configuration is as follows: Intel Core i5 3230, 6 GB RAM, GeForce 740M 1 GB, Windows 7 Professional.
PS: the error does not affect the computer’s operation in any way. There are no BSODs or anything else suspicious. The only thing is that the OS itself seems to take slightly longer to boot, but that may just be my paranoia. Google turned up nothing concrete, so I’m hoping for your help. Thanks in advance!
