CPU 1 has an internal error (IERR)

Explains the IERR error and how to troubleshoot it on Intel® Server Boards.

How to Recover from an Internal Error (IERR) for Intel® Server Boards

Content Type: Troubleshooting

Article ID: 000006043

Last Reviewed: 01/10/2023

What am I seeing?

An IERR is a catastrophic error reported by the processor but generally caused by devices outside of the processor core (e.g., memory, PCIe).

  • Processor execution has stalled, typically because of an event outside the processor.
  • This issue is often accompanied by a CATERR event that can be cross-referenced for additional information.

How to fix it.

Follow these steps in order:

  1. Review the System Event Log (SEL) for Error correction code (ECC) events. Defective memory can trigger an IERR.
  2. Review the SEL for any PCIe events. Malfunctioning PCIe devices can trigger an IERR.
  3. Ensure that Operating System (OS) drivers are up to date for the server as well as for any recently added hardware devices. Out-of-date OS drivers can trigger an IERR.
  4. Check the OS logs for any Machine Check Architecture (MCA) entries that may indicate a hardware fault that could have triggered the IERR. 
  5. Confirm that you have the latest BIOS for the server system.
  6. If the logs point to one or more specific memory modules as the likely cause, reseat those modules and monitor the server for 24 hours.
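
The first log-review steps above can be sketched as a small script that groups exported SEL text lines into memory, PCIe, and processor buckets. This is only a rough first pass under stated assumptions: the function name `triage_sel` and the keyword patterns are hypothetical, and the actual SEL wording differs between platforms and firmware versions, so the patterns would need adjusting to match your own logs.

```python
import re
from collections import defaultdict

# Hypothetical keyword patterns (an assumption, not taken from any vendor
# documentation); real SEL wording varies by platform and firmware.
PATTERNS = {
    "memory": re.compile(r"\bECC\b|memory", re.IGNORECASE),
    "pcie": re.compile(r"\bPCIe?\b", re.IGNORECASE),
    "processor": re.compile(r"\bIERR\b|\bCATERR\b|Machine Ch", re.IGNORECASE),
}


def triage_sel(lines):
    """Group SEL text lines by every category whose pattern they match."""
    hits = defaultdict(list)
    for line in lines:
        for category, pattern in PATTERNS.items():
            if pattern.search(line):
                hits[category].append(line.strip())
    return dict(hits)
```

Passing the lines of a saved SEL export to `triage_sel` returns a dictionary keyed by category, which gives a quick view of whether memory or PCIe events were logged around the time of the IERR.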

Related Products

This article applies to 205 products.

Intel® Server Board S2600STK
Intel® Server Board S2600STS
Intel® Compute Module D50TNP1U Family
Intel® Compute Module D50TNP2U Family
Intel® Server System D50TNP1MHCRAC Compute Module
Intel® Server System D50TNP1MHCRLC Compute Module
Intel® Server System D50TNP1MHEVAC Compute Module
Intel® Server System D50TNP2MFALAC Acceleration Module
Intel® Server System D50TNP2MHSTAC Storage Module
Intel® Server System D50TNP2MHSVAC Management Module
Intel® Compute Module HNS2600BPB
Intel® Compute Module HNS2600BPB24
Intel® Compute Module HNS2600BPB24R
Intel® Compute Module HNS2600BPBLC
Intel® Compute Module HNS2600BPBLC24
Intel® Compute Module HNS2600BPBLC24R
Intel® Compute Module HNS2600BPBLCR
Intel® Compute Module HNS2600BPQ
Intel® Compute Module HNS2600BPQ24
Intel® Compute Module HNS2600BPQ24R
Intel® Compute Module HNS2600BPQR
Intel® Compute Module HNS2600BPS
Intel® Compute Module HNS2600BPS24
Intel® Compute Module HNS2600BPS24R
Intel® Compute Module HNS2600BPSR

Oh man, didn’t get any notification of your quick reply so thought no-one had any idea of this issue.

Anyway, lately it has started to happen more and more often, now several times a day. I’d say the error currently occurs about 25% of the time; the server does a hard reset and then starts up fine without any intervention (which is the only positive thing here).

Since the iRMC API lacks this power control, I’m just making two authenticated HTTP POST calls, the same ones made when you go to the iRMC web page and do “Press Power Button”: the Apply POST followed by the Confirm button POST.

Server is physically unavailable so cannot comment on if and how often this occurs using the button.

The server gets suspended (not a real suspend; it basically just turns off the monitor or enters some other strange power state, as Fujitsu doesn’t support the S3 or S4 power states) after not being in use for 30 minutes. This is done by the Light-Out add-on for Windows Server Essentials. These CPU errors only occur when waking the server, never when suspending it.

Article ID: 000026566

Content Type: Error Messages

Last Reviewed: 08/16/2022

My Server Crashes and Shows this Error: Processor CPU Machine Chk

Environment: Server systems/boards

Operating System: OS Independent

Summary

How to recover from an IERR (“Processor CPU Machine Chk,” “CPU Internal Error,” “CPU IErr,” or “CPU Machine Check error”)

Description

The server crashes with a Processor CPU Machine Chk error.

Usually, the System Event Log (SEL) Viewer shows the following:

2 | 12/18/2017 | 03:17:30 | Unknown #0x2e | | Asserted
3 | 12/18/2017 | 03:17:30 | Processor CPU Machine Chk | Transition to Non-recoverable | Asserted
4 | 12/18/2017 | 03:17:30 | Unknown MSR Info Log | | Asserted
5 | 12/18/2017 | 03:17:30 | Unknown MSR Info Log | | Asserted
etc.
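
As a rough illustration, the pipe-delimited SEL lines above can be parsed into records so that the non-recoverable event and its sensor are easy to pick out. The field names (`record_id`, `sensor`, `event`, `state`) and the helper names are assumptions made for this sketch; the SEL Viewer does not label its columns in the excerpt shown.

```python
from dataclasses import dataclass

# SEL lines copied from the excerpt above.
SAMPLE = """\
2 | 12/18/2017 | 03:17:30 | Unknown #0x2e | | Asserted
3 | 12/18/2017 | 03:17:30 | Processor CPU Machine Chk | Transition to Non-recoverable | Asserted
4 | 12/18/2017 | 03:17:30 | Unknown MSR Info Log | | Asserted
5 | 12/18/2017 | 03:17:30 | Unknown MSR Info Log | | Asserted
"""


@dataclass
class SelEntry:
    # Column names are assumed; the viewer output has no header row.
    record_id: int
    date: str
    time: str
    sensor: str
    event: str
    state: str


def parse_sel_line(line):
    """Parse one pipe-delimited SEL line; return None if it does not fit."""
    fields = [field.strip() for field in line.split("|")]
    if len(fields) != 6 or not fields[0].isdigit():
        return None
    return SelEntry(int(fields[0]), *fields[1:])


def non_recoverable(entries):
    """Keep only entries whose event text mentions a non-recoverable state."""
    return [e for e in entries if "non-recoverable" in e.event.lower()]


parsed = [parse_sel_line(line) for line in SAMPLE.splitlines()]
entries = [entry for entry in parsed if entry is not None]
for entry in non_recoverable(entries):
    print(entry.record_id, entry.sensor, entry.event)
```

Lines that do not match the six-field layout are skipped rather than raising an error, since real exports often contain headers or truncated rows.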

Resolution

Although this error (also known as “CPU Internal Error,” “CPU IErr,” or “CPU Machine Check”) can signal an unrecoverable processor condition, it usually indicates that the Central Processing Unit (CPU) has detected an error elsewhere in the system or has received an erroneous instruction from a system component. In either case, the following troubleshooting steps are valid:

  1. Restart the system.
  2. Check the System Event Log to find out which processor is generating the error. What to look for depends on the error logged. If you have any questions, contact Support with a copy of these logs, plus the logs from the System Information Retrieval Utility.
  3. Clear the System Event Log.
  4. Ensure the BIOS/firmware is the latest version.
  5. Test with one compatible processor installed at a time.
  6. Test with another compatible processor, if possible. If the board is an Intel® Server Board, refer to the Product Specifications page for processor-board compatibility information.
  7. Remove and reinstall the memory.

Additional information

This error can have several causes, and the CPU is not necessarily one of them. Even a system bus interruption or a memory interruption can trigger it.

Hi, 

We have a Dell PowerEdge R710 running as a Hyper-V host which crashed last night. During some troubleshooting this morning, I could not get the server to stay up. Once, I got it to stay up long enough to look through some logs in OpenManage, and found two relevant entries:

CPU2 Machine Check Error

CPU2 Has an internal error (IERR)

After seeing these entries, I removed the second processor and the server has stayed up. 

My questions are:

At this point I am assuming CPU 2 needs to be replaced, but I am seeing a lot online about this error being resolved by BIOS updates. Is it possible I could see this result too? I want to update the BIOS as it is very out of date, but I ran into some other problems while attempting to do that, so I’ll need some time to figure those out. 

How likely is it that this could be a socket / motherboard issue vs the processor itself? In the meantime, I will be ordering a processor.

What are the implications of running on just the one processor? Not necessarily from a load perspective, but in general, is it alright to run the system like that?

Thanks for the help!!

  • Dear All,

    I have bought a new Dell PowerEdge R330. I can’t install the new OS on my new server. It prints “CPU 0000 CPU1 Internal error (IERR) Contact Support”. Please let me know how I can resolve this issue.

    Thank you in advance

  • If the server is new you should contact support.

    There’s no point in troubleshooting the issue yourself; you want your support company to know about these issues so they can fix them.

  • The error listed appears to be a semi-generic one; most often it is a case of bad memory in the server.

  • Agree with @DustinB3403. However, here’s a KB article from Dell that discusses potential causes and how to resolve them.

  • Check that one of your memory sticks did not come loose in shipment; that is, reseat them all. If that does not fix it, call Dell Support. I had that error on a new server and it turned out we had a loose module.

  • Tagged and modified the title for SEO purposes.

  • If it is new, you could have already contacted Dell support and had a solution in the time it took you to write this post. It’s new, use your warranty, contact Dell, and get it fixed. That’s how you do that. We can’t send you replacement components even if you knew exactly what you needed.

  • The error says to contact support, so I’d start there. While waiting for them, I’d do routine testing… reseat everything, pull some memory out, etc. If it is not time sensitive, I’d just wait for Dell to get back to you.

  • Also, you mentioned installing an OS on the server; make sure you don’t do that. Install only a hypervisor on the server, and the OS on the hypervisor.

  • @scottalanmiller said in Dell Power Edge R330 Issue: CPU 0000 CPU1 Internal error (IERR):

    The error says to contact support, so I’d start there. While waiting for them, I’d do routine testing… reseat everything, pull some memory out, etc. If it is not time sensitive, I’d just wait for Dell to get back to you.

    I’ve never had to wait more than 5 minutes for Dell chat support.

Question
I got the error message “IERR Asserted” when I boot up my motherboard X9DRW-IF. I already updated my BIOS and IPMI firmware rev. What is the root cause?
Answer
IERR is a Processor Internal Error, a signal that indicates an unrecoverable processor error. Even a non-CPU event, such as a system bus interruption or a memory interruption, can raise this signal.
On an Intel® Server Board, a Processor IERR can be confirmed or ruled out by running a CPU Retest from the BIOS (Basic Input Output System) Setup Utility.
In some cases a system restart can also clear an IERR. If the problem persists, try booting the system with one processor at a time, test another processor if possible, and remove and reinstall the memory.
The IERR Filtering Algorithm helps to determine whether the IERR signal came from a false CPU internal error or from another hardware source. This prevents unnecessary processor replacements and at the same time helps to isolate IERR events.

FAQ ID: 14686
Date Posted: 08/13/12
