Bus fatal error was detected on a component at bus

Issue

  • The following errors are reported on the server console.
             0.000033 Thu Feb 13 2014 20:07:05 CPU 2 has an internal error (IERR).
    Normal   0.000032 Thu Feb 13 2014 09:04:50 An OEM diagnostic event has occurred.
    Critical 0.000031 Thu Feb 13 2014 09:04:50 A bus fatal error was detected on a component at bus 0 device 5 function 0.
    Critical 0.000030 Thu Feb 13 2014 09:04:50 A bus fatal error was detected on a component at bus 9 device 0 function 0.

Environment

  • Red Hat Enterprise Linux Server 5
  • Red Hat Enterprise Linux Server 6

  • Question

  • Need help: this Dell R510 server keeps getting a fatal hardware error and rebooting

    A fatal hardware error has occurred.

    Component: PCI Express Root Port
    Error Source: Generic

    Bus:Device:Function: 0x0:0xA:0x0
    Vendor ID:Device ID: 0x8086:0x3411
    Class Code: 0x60400

    The details view of this entry contains further information.

    [ Name] Microsoft-Windows-WHEA-Logger
    [ Guid] {C26C4F3C-3F66-4E99-8F8A-39405CFED220}
    Keywords 0x8000000000000000
    [ SystemTime] 2016-05-21T01:26:57.796273100Z
    [ ActivityID] {3843F13F-98CC-4457-BCE3-AC8C702907D5}
    [ ProcessID] 1788
    [ ThreadID] 1776
    FRUId {00000000-0000-0000-0000-000000000000}
    UncorrectableErrorStatus 0x4000
    CorrectableErrorStatus 0x0
    HeaderLog 00000000000000000000000000000000
    RawData 435045521002FFFFFFFF01000100000002000000980100002A150100150510140000000000000000000000000000000000000000000000000000000000000000BDC407CF89B7184EB3C41F732CB5713167A4623E40AB9A40A698F362D464B38F2A68110688ADD101000000004552000000000000000000000000000000000000C8000000D0000000010200000100000054E995D9C1BB0F43AD91B44DCB3C6F3500000000000000000000000000000000010000000000000000000000000000000000000000000000EF000000000000000400000000010000470510400000000086801134000406000A0000000500000000000000000000000060070010E04201218000002C010400423C3B0A40004130800C0800C00348010E000100000000000000000000000000000000000000000000000000000000000100011500400000008031003070060000000000C13100000E00000000000000000000000000000000000000000000005C0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

    Data from iDRAC

    Sat May 21 2016 02:21:46 A runtime critical stop occured. 
    Sat May 21 2016 02:21:39 An OEM diagnostic event has occurred. 
    Sat May 21 2016 02:21:39 A bus fatal error was detected on a component at bus 0 device 10 function 0. 
    Sat May 21 2016 02:21:39 An OEM diagnostic event has occurred. 
    Sat May 21 2016 02:21:39 A bus fatal error was detected on a component at slot 1. 
    Sat May 14 2016 03:26:34 This is an OEM record.
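
The WHEA entry above reports the location in hex (0x0:0xA:0x0) and the iDRAC log reports it in decimal (bus 0 device 10 function 0). The following is a minimal sketch for checking that both refer to the same device and for naming the bits set in UncorrectableErrorStatus. It assumes the field follows the standard PCIe AER Uncorrectable Error Status register layout; the function names and the bit table are illustrative and are not part of any Windows or Dell tool.

```python
# Illustrative helper, not part of any vendor tool.
# Assumption: UncorrectableErrorStatus uses the PCIe AER
# Uncorrectable Error Status register bit layout.
AER_UNCORRECTABLE_BITS = {
    4: "Data Link Protocol Error",
    5: "Surprise Down Error",
    12: "Poisoned TLP",
    13: "Flow Control Protocol Error",
    14: "Completion Timeout",
    15: "Completer Abort",
    16: "Unexpected Completion",
    17: "Receiver Overflow",
    18: "Malformed TLP",
    19: "ECRC Error",
    20: "Unsupported Request",
}


def decode_uncorrectable_status(status):
    """Return the names of all bits set in a 32-bit status value."""
    names = []
    for bit in range(32):
        if status & (1 << bit):
            names.append(
                AER_UNCORRECTABLE_BITS.get(bit, "bit %d (not in this table)" % bit)
            )
    return names


def parse_whea_bdf(text):
    """Convert a WHEA 'Bus:Device:Function' string such as '0x0:0xA:0x0'
    into a (bus, device, function) tuple of integers."""
    bus, device, function = (int(part, 16) for part in text.split(":"))
    return bus, device, function


if __name__ == "__main__":
    print(decode_uncorrectable_status(0x4000))  # ['Completion Timeout']
    print(parse_whea_bdf("0x0:0xA:0x0"))        # (0, 10, 0)
```

Under that assumption, 0x4000 is bit 14 alone, and the hex location matches the iDRAC entry for bus 0 device 10 function 0.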

Hi,

We are running some R720s for VMware View and they are all having the same issue: they hit a purple screen of death (PSOD), and the logs show the following errors:

A bus fatal error was detected on a component at bus 0 device 3 function 0.
A bus fatal error was detected on a component at slot 6.

I tried to use lspci and I get output for the device, but all I see is:

0000:00:03.0 Bridge: Intel Corporation Xeon E7 v2/Xeon E5 v2/Core i7 PCI Express Root Port 3a [PCIe RP[0000:00:03.0]]

I can’t get any further than this; the -s switch some people have suggested is not accepted.

Can anyone help me determine which device is actually causing the issue? What physical slot does 3a refer to?

I would be very grateful for some help here.

Error Code

Message Information

Action

Cycle input power, update component drivers, if device is removable, reinstall the device.

PCI1360

Message

A bus fatal error was detected on a component at slot <number>.

LCD Message

Bus fatal error on slot <number>. Re-seat PCI card.

Details

System performance may be degraded, or system may fail to operate.

Action

Cycle input power, update component drivers, if device is removable, reinstall the device.

PCI1362

Message

Bus performance degraded for a component at slot <number>.

Details

System performance may be degraded. The bus is not operating at maximum speed or width.

Action

Cycle input power, update component drivers, remove and reinstall the device at the next scheduled service time.

PCI2000

Message

A fatal IO error detected on a component at bus <bus> device <device> function <func>.

LCD Message

Fatal IO error on bus <bus> device <device> function <func>.

Details

System performance may be degraded, or system may fail to operate.

Action

Cycle input power, update component drivers, remove and reinstall the device.

PCI2002

Message

A fatal IO error detected on a component at slot <number>.

LCD Message

Fatal IO error on slot <number>.

Details

System performance may be degraded, or system may fail to operate.

Action

Cycle input power, update component drivers, remove and reinstall the device.

PCI3000

Message

Device option ROM on embedded NIC failed to support Link Tuning or FlexAddress.

Details

Either the BIOS, BMC/iDRAC, or LOM firmware is out of date and does not support FlexAddress.

Action

Update BIOS, BMC/iDRAC, and LOM firmware. If the issue persists, see Getting Help.

PCI3002

Message

Failed to program virtual MAC address on a component at bus <bus> device <device> function <func>.

I had a Dell R815 host crash yesterday, with the following PSOD error message;

The system has found a problem on your machine and cannot continue.
LINT1 motherboard interrupt. This is a hardware problem; please contact your hardware vendor.

[Screenshot: PSOD-LINT1]

When I checked the system logs on the iDRAC, I could see a bus fatal error logged;

System Event Logs

Severity Time Description
Critical 18:24:36 The watchdog timer expired.
Normal 18:16:37 An OEM diagnostic event has occurred.
Critical 18:16:36 A bus fatal error was detected on a component at bus 4 device 4 function 0.

I ran the integrated hardware diagnostics using the system services on boot (F10) which confirmed these errors, but only because it read the system logs. I find this really annoying because if I had cleared the event logs prior to running the hardware diagnostics no errors would have been reported, and now I’m not sure if the hardware is faulty or not. Here are the reported errors;

[Screenshots: Watchdog-Sensor, PCIE-Fatal-Error]

Either way I can’t put it back into production without further analysis and need to find out what hardware component is located at bus 4 device 4 function 0 so that I can log a support call to Dell. It turns out this is really easy, using the lspci command which returns detailed info on all PCI devices.

lspci prints each device address in the [domain]:[bus]:[device].[function] format, so it’s easy to grep for the specific component without listing all the other PCI devices. Here is what mine returned:

lspci

~ # lspci | grep '000:004:04.0'
000:004:04.0 Bridge: PLX Technology, Inc. PEX 8624 24-lane, 6-Port PCI Express Gen 2 (5.0 GT/s) Switch [ExpressLane]

~ # lspci --help
lspci   -p --pciinfo   Prints detailed info on all PCI devices
        -n --nolookup  Don't look up PCI device names and info
        -d --dump      Print hex dump of the full config space
        -v --verbose   Verbose information

So now I know there was a problem with the PCI bridge and can log this to Dell in the hope that they simply replace the component under warranty.
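
The lookup above can also be scripted when several hosts or log entries need checking. What follows is a rough sketch only: it pulls the bus, device and function numbers out of an iDRAC message and compares them with the address at the start of each lspci line. The function names are hypothetical. The sketch assumes the iDRAC numbers are decimal (as the hex/decimal pair in the WHEA example earlier suggests) and that lspci prints hexadecimal fields, which is true of the Linux pciutils lspci. The number base of other builds, such as the one on ESXi, is not confirmed here, so it is left as a parameter.

```python
import re

# Matches the "bus N device N function N" wording used in the iDRAC log.
SEL_PATTERN = re.compile(r"bus (\d+) device (\d+) function (\d+)")


def sel_to_bdf(message):
    """Extract (bus, device, function) from an iDRAC log message.
    Returns None for messages that name a slot instead of a bus address."""
    match = SEL_PATTERN.search(message)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())


def lspci_line_to_bdf(line, base=16):
    """Parse the leading [domain:]bus:device.function field of an lspci line.
    base=16 is an assumption that holds for Linux pciutils output."""
    address = line.split()[0]
    parts = address.split(":")
    bus_text = parts[-2]
    device_text, function_text = parts[-1].split(".")
    return int(bus_text, base), int(device_text, base), int(function_text, base)


def find_matching_lines(message, lspci_output, base=16):
    """Return the lspci lines whose address matches the iDRAC message."""
    target = sel_to_bdf(message)
    if target is None:
        return []
    return [
        line
        for line in lspci_output.splitlines()
        if line.strip() and lspci_line_to_bdf(line, base) == target
    ]


if __name__ == "__main__":
    message = "A bus fatal error was detected on a component at bus 4 device 4 function 0."
    output = "000:004:04.0 Bridge: PLX Technology, Inc. PEX 8624 24-lane, 6-Port PCI Express Gen 2 (5.0 GT/s) Switch [ExpressLane]"
    for line in find_matching_lines(message, output):
        print(line)
```

Messages that name only a slot (for example "at slot 6") carry no bus address, so the sketch returns nothing for them rather than guessing.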

An independent IT contractor with a strong focus on VMware virtualisation and infrastructure operations. I am inspired by technology, not afraid to question the status quo and balance my professional commitments with entertaining my three awesome kids (Ashton, Oliver and Lara).
View all posts by Jon Munday

Error Code

Message Information

Action

Cycle input power, update component drivers, if device is removable, 
reinstall the device.

PCI1320

Message

A bus fatal error was detected on a component at bus <bus> device <device> function <func>.

LCD Message

Bus fatal error on bus <bus> device <device> function <func>. Power cycle system.

Details

System performance may be degraded, or system may fail to operate.

Action

Cycle input power, update component drivers, if device is removable, 
reinstall the device.

PCI1342

Message

A bus time-out was detected on a component at slot <number>.

Details

System performance may be degraded, or system may fail to operate.

Action

Cycle input power, update component drivers, if device is removable, 
reinstall the device.

PCI1348

Message

A PCI parity error was detected on a component at slot <number>.

LCD Message

PCI parity error on slot <number>. Re-seat PCI card.

Details

System performance may be degraded, or system may fail to operate.

Action

Cycle input power, update component drivers, if device is removable, 
reinstall the device.

PCI1360

Message

A bus fatal error was detected on a component at slot <number>.

LCD Message

Bus fatal error on slot <number>. Re-seat PCI card.

Details

System performance may be degraded, or system may fail to operate.

Action

Cycle input power, update component drivers, if device is removable, 
reinstall the device.

PDR0001

Message

Fault detected on drive <number>.

LCD Message

Fault detected on drive <number>. Check drive.

Details

The controller detected a failure on the disk and has taken the disk offline.

Action

PDR1016

Message

Drive <number> is removed from disk drive bay <bay>.
