Read FPDMA QUEUED error - Fixing errors and finding optimal solutions to problems

Hello:

I have an ASUS M5A97 EVO R2.0 motherboard and 3 hard disks. One of them is a Western Digital Caviar Blue 1TB SATA3 (model WD10EZEX-08M2NA0), which I use exclusively to store the /home directory. Sometimes, when I boot my computer, the kernel logs several messages indicating errors with this drive. Sometimes this doesn’t prevent me from logging in, but other times the login seems to get stuck. Then I have to hard-reset the computer and run «fsck» to repair whatever has failed. I believe the error messages only appear during boot: I’ve checked the «dmesg» output after hours of work and nothing else related to this failure had been logged.

Code:

[    1.792425] ata2: SATA max UDMA/133 abar m1024@0xfe50b000 port 0xfe50b180 irq 19
[    2.281038] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[    2.281598] ata2.00: ATA-9: WDC WD10EZEX-08M2NA0, 01.01A01, max UDMA/100
[    2.281601] ata2.00: 1953525168 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
[    2.282311] ata2.00: configured for UDMA/100
[    9.259321] ata2.00: exception Emask 0x0 SAct 0x8 SErr 0x0 action 0x6
[    9.259351] ata2.00: irq_stat 0x40000008
[    9.259362] ata2.00: failed command: READ FPDMA QUEUED
[    9.259376] ata2.00: cmd 60/08:18:08:00:00/00:00:00:00:00/40 tag 3 ncq 4096 in
[    9.259398] ata2.00: status: { DRDY ERR }
[    9.259407] ata2.00: error: { ICRC ABRT }
[    9.259419] ata2: hard resetting link
[    9.748016] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[    9.749363] ata2.00: configured for UDMA/100
[    9.749373] ata2: EH complete
[   10.341975] ata2.00: exception Emask 0x2 SAct 0x100 SErr 0x200401 action 0x6
[   10.341989] ata2.00: irq_stat 0x44000008
[   10.341999] ata2: SError: { RecovData Proto BadCRC }
[   10.342009] ata2.00: failed command: READ FPDMA QUEUED
[   10.342021] ata2.00: cmd 60/20:40:00:08:00/00:00:00:00:00/40 tag 8 ncq 16384 in
[   10.342043] ata2.00: status: { DRDY ERR }
[   10.342051] ata2.00: error: { ICRC ABRT }
[   10.342062] ata2: hard resetting link
[   10.831301] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[   10.832752] ata2.00: configured for UDMA/100
[   10.832784] ata2: EH complete
[   11.123146] ata2.00: exception Emask 0x10 SAct 0x1000 SErr 0x200100 action 0x6 frozen
[   11.123165] ata2.00: irq_stat 0x08000000, interface fatal error
[   11.123178] ata2: SError: { UnrecovData BadCRC }
[   11.123189] ata2.00: failed command: READ FPDMA QUEUED
[   11.123204] ata2.00: cmd 60/f8:60:10:09:80/00:00:04:00:00/40 tag 12 ncq 126976 in
[   11.123227] ata2.00: status: { DRDY }
[   11.123238] ata2: hard resetting link
[   11.622694] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[   11.623851] ata2.00: configured for UDMA/100
[   11.623912] ata2: EH complete
[   11.631038] ata2: limiting SATA link speed to 3.0 Gbps
[   11.631043] ata2.00: exception Emask 0x10 SAct 0x2000 SErr 0x200100 action 0x6 frozen
[   11.631077] ata2.00: irq_stat 0x08000000, interface fatal error
[   11.631100] ata2: SError: { UnrecovData BadCRC }
[   11.631123] ata2.00: failed command: READ FPDMA QUEUED
[   11.631147] ata2.00: cmd 60/f0:68:18:09:80/00:00:04:00:00/40 tag 13 ncq 122880 in
[   11.631207] ata2.00: status: { DRDY }
[   11.631228] ata2: hard resetting link
[   12.126528] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[   12.127952] ata2.00: configured for UDMA/100
[   12.127975] ata2: EH complete
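
The SStatus and SControl values in the «SATA link up» lines can be decoded by hand. The sketch below assumes the usual SATA register layout (bits 0-3 DET, bits 4-7 SPD, bits 8-11 IPM) and that the kernel prints both values in hexadecimal. The function names are made up for illustration.

```python
# Speed codes used in the SPD field of SStatus (negotiated) and SControl (limit).
SPEEDS = {0: "none", 1: "1.5 Gbps", 2: "3.0 Gbps", 3: "6.0 Gbps"}


def decode_sata_reg(hex_value):
    """Split a SStatus/SControl value (hex string from dmesg) into its fields."""
    value = int(hex_value, 16)
    return {
        "det": value & 0xF,          # device detection
        "spd": (value >> 4) & 0xF,   # negotiated speed / speed limit
        "ipm": (value >> 8) & 0xF,   # interface power management
    }


def describe_link(sstatus_hex, scontrol_hex):
    """Summarise a 'SATA link up ... (SStatus X SControl Y)' pair."""
    status = decode_sata_reg(sstatus_hex)
    control = decode_sata_reg(scontrol_hex)
    return {
        # DET == 3 means a device is present and communication is established.
        "device_present": status["det"] == 3,
        "negotiated_speed": SPEEDS.get(status["spd"], "unknown"),
        # In SControl, "none" means no speed limit is imposed by the host.
        "speed_limit": SPEEDS.get(control["spd"], "unknown"),
    }


if __name__ == "__main__":
    print(describe_link("133", "300"))  # values from the first link-up lines
    print(describe_link("123", "320"))  # values after the kernel limited the speed
```

Read this way, the last link-up line shows the host itself capping the port at 3.0 Gbps after repeated CRC failures, which matches the «limiting SATA link speed to 3.0 Gbps» message just before it.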

I’ve checked the output of «smartctl» and haven’t seen anything wrong. However, I admit that I might not be reading those figures correctly:

Code:

smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.13.0-48-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar Blue (SATA 6Gb/s)
Device Model:     WDC WD10EZEX-08M2NA0
Serial Number:    WD-WCC3F3180973
LU WWN Device Id: 5 0014ee 2b49d97fe
Firmware Version: 01.01A01
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Tue Apr  7 09:36:00 2015 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84) Offline data collection activity
                                        was suspended by an interrupting command from host.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever 
                                        been run.
Total time to complete Offline 
data collection:                (11400) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 118) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x3035) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   172   167   021    Pre-fail  Always       -       2383
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       503
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       2957
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       503
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       26
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       476
194 Temperature_Celsius     0x0022   116   105   000    Old_age   Always       -       27
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   198   000    Old_age   Always       -       27
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Aborted by host               80%      2941         -
# 2  Conveyance offline  Completed without error       00%      2919         -
# 3  Short offline       Completed without error       00%      2876         -
# 4  Short offline       Completed without error       00%      2793         -
# 5  Extended offline    Completed without error       00%      2769         -
# 6  Extended offline    Interrupted (host reset)      70%      2767         -
# 7  Extended offline    Completed without error       00%      2675         -
# 8  Extended offline    Completed without error       00%      2642         -
# 9  Short offline       Completed without error       00%      2623         -
#10  Short offline       Completed without error       00%      1948         -
#11  Short offline       Completed without error       00%      1579         -
#12  Short offline       Completed without error       00%      1061         -
#13  Short offline       Completed without error       00%       119         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
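
For anyone unsure how to read the attribute table above, here is a rough sketch that pulls out the raw values and flags the non-zero ones worth a second look. The grouping into «media» and «link» attributes is a common rule of thumb, not something taken from Western Digital documentation, and the helper names are invented for this example. The sample rows are copied from the report above.

```python
import re

# Rule-of-thumb grouping (an assumption, not a vendor specification).
MEDIA_IDS = {5, 196, 197, 198}  # reallocated / pending / uncorrectable sectors
LINK_IDS = {199}                # CRC errors on the link between drive and controller

# ID, name, flag, value, worst, thresh, type, updated, when_failed, raw value
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+\d+\s+\d+\s+\d+\s+\S+\s+\S+\s+\S+\s+(\d+)"
)


def parse_attributes(text):
    """Map attribute ID -> (name, raw value) from 'smartctl -A' style rows."""
    attrs = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            attrs[int(match.group(1))] = (match.group(2), int(match.group(3)))
    return attrs


def suspicious(attrs):
    """Return (name, raw, kind) for watched attributes with a non-zero raw value."""
    findings = []
    for attr_id, (name, raw) in sorted(attrs.items()):
        if raw == 0:
            continue
        if attr_id in MEDIA_IDS:
            findings.append((name, raw, "media"))
        elif attr_id in LINK_IDS:
            findings.append((name, raw, "link"))
    return findings


SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   198   000    Old_age   Always       -       27
"""

if __name__ == "__main__":
    for finding in suspicious(parse_attributes(SAMPLE)):
        print(finding)
```

In this report only UDMA_CRC_Error_Count is non-zero, which fits the ICRC and BadCRC flags in the kernel log and suggests the data is being damaged in transit rather than on the platters.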

I’ve read that faulty SATA or power cables could be the cause. I swapped in the cables from my DVD drive, but the result was the same. I’ve also tried disabling NCQ on this drive, but it didn’t help. I did this because I understood from a bug report that my motherboard’s chipset (an AMD 970/SB950) might have some kind of problem under Ubuntu.

On other computers, when the kernel warned me about errors on a SATA drive, «smartctl» showed obvious signs that the drive was faulty. In this case, however, I’m completely lost. Can you help me?

Thank you!

Re: «failed command: READ FPDMA QUEUED» for my WD10EZEX-08M2NA0 drive.

Yesterday I purchased a new unit of the same hard drive model (manufactured in December 2014 instead of February 2014). I copied my data from the old disk to the new one and tested it. It worked fine. This morning I turned my computer on, and the same error appeared!!! :S

I’m really surprised. Days ago I replaced this disk’s cables with those from another of my hard drives, one that works OK (a Samsung SpinPoint F2). It failed too, so I ruled out a faulty cable. Could it be an incompatibility between my motherboard model and this disk model?

PS: on both units the SMART UDMA_CRC_Error_Count counter has increased.

Last edited by negora; April 11th, 2015 at 08:09 AM.
Re: «failed command: READ FPDMA QUEUED» for my WD10EZEX-08M2NA0 drive.

Finally I returned the new Western Digital Caviar Blue 1TB SATA3 and purchased a Seagate Barracuda 7200.14 1TB SATA (ST1000DM003). After copying the data from one disk to the other one, I rebooted my computer several times and it never failed. I kept my computer running for several hours, then rebooted it again, but it didn’t fail either. I thought that the error had been solved, until I turned on my computer the next morning and it showed the same failure messages!!! >_<

So it was clear that I hadn’t been thorough enough when I checked the cables days earlier. Because the error comes up randomly, I guess I didn’t give it enough time to reappear during some of my checks. After trying more combinations, plugging the hard drive in here and there, it turned out that the 2nd SATA connector on my motherboard has some kind of problem: I’ve been using the same hard drive for 2 days, plugged into a different connector, and it has been working fine. I hope my conclusion isn’t premature and that it doesn’t fail again. Honestly, I doubt it will, because over the previous days it had already failed once or twice in the same span of time.
Re: «failed command: READ FPDMA QUEUED» for my WD10EZEX-08M2NA0 drive.

Last update. It failed again :S .

However, I found the real cause thanks to a blog article called «Power supply failures can be pretty annoying to find». It turned out to be the PSU: the voltages weren’t correct. It was a good PSU from the glory days of Enermax, but even a good PSU may fail after ~10 years of service, he he he. So I replaced it with a Seasonic G-550 and all my problems vanished. Almost 2 months have passed without errors.

PS: I opened the Enermax PSU and several capacitors were damaged. I’ll try to replace them, just for fun.
Re: «failed command: READ FPDMA QUEUED» for my WD10EZEX-08M2NA0 drive.

negora,
That’s an interesting find you linked to. I’ll have to remember that. I also have a WD10EZEX, which is why this topic got my attention. Mine went bad, but it was the HD itself.
Re: «failed command: READ FPDMA QUEUED» for my WD10EZEX-08M2NA0 drive.

For me it has been a great discovery, because the SMART information of my hard drive didn’t show any error. Indeed, some time ago I discarded another hard drive for the same reason: the SMART info was OK but the drive was causing errors while the OS booted. I hope that article is helpful for you and for more people too.

Source

I recently bought a new Samsung 870 EVO SSD for my computer (AMD CPU, nvidia GPU) which was previously running on a Samsung 860 EVO SSD. I installed Ubuntu 20.04 on the new disk. No issues during installation.
I encountered no problems with the old SSD (same Ubuntu version).
System works fine once booted (no errors whatsoever in dmesg).
But now I am randomly getting these errors at boot (~ 1 out of 10 boots):

ata1.00: status: { DRDY }
ata1.00: failed command: READ FPDMA QUEUED
ata1.00: cmd 60/08:70:58:a6:46/00:00:10:00:00/40 tag 14 ncq 4096 in

When this happens, the computer usually won’t boot, or it takes a long time to boot and the OS is unusable afterwards. A simple restart fixes the issue, and then everything works as expected.

What I tried:

Sent my first 870 back and asked for a replacement -> Same errors with the new one.
Changed the SATA cable and switched SATA ports on the motherboard -> Same.
smartctl -t long finds no errors.

There are lots of posts about the same issue, and the only solution offered seems to be disabling NCQ. From what I understand, disabling it will significantly lower system performance, which I would like to avoid.
What could be wrong with this new SSD, given that the previous model always worked just fine?

asked May 31, 2021 at 16:54

Note: Download Samsung Magician and check your SSD firmware. https://www.samsung.com/semiconductor/minisite/ssd/download/tools/

Native Command Queuing (NCQ) is an extension of the Serial ATA protocol allowing hard disk drives to internally optimize the order in which received read and write commands are executed.

Run sudo -H gedit /etc/default/grub and change the following line to include the extra libata.force=noncq parameter. Then run sudo update-grub to write the changes to disk, and reboot. Watch for hangs, and check grep -i FPDMA /var/log/syslog* or dmesg for further error messages.

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash libata.force=noncq"

Update #1:

User set libata.force=noncqtrim, which is supposed to impact performance less than libata.force=noncq. Will continue to monitor.
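
After rebooting, it can be worth confirming that the parameter actually reached the kernel. A minimal sketch follows. It assumes the running command line can be read from /proc/cmdline and that libata.force takes a comma-separated list whose entries may carry a port prefix such as 1.00:. The function names are made up for this example.

```python
def libata_force_options(cmdline):
    """Return the values passed to libata.force= in a kernel command line."""
    options = []
    for token in cmdline.split():
        if token.startswith("libata.force="):
            values = token.split("=", 1)[1]
            options.extend(value for value in values.split(",") if value)
    return options


def ncq_disabled(cmdline):
    """True if some entry is exactly 'noncq' once any 'PORT:' prefix is stripped."""
    return any(
        option.split(":")[-1] == "noncq"
        for option in libata_force_options(cmdline)
    )


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as handle:
            running = handle.read()
    except OSError:
        running = ""  # not on Linux, or /proc is unavailable
    print("libata.force options:", libata_force_options(running))
    print("NCQ disabled for at least one port:", ncq_disabled(running))
```

Note that noncqtrim is deliberately not counted as disabling NCQ here, since it is a separate option from noncq.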

answered Jun 2, 2021 at 11:47

heynnema


Source

Hey there,

I just upgraded my NAS with larger HDDs (WD Red 4 TB to Seagate Ironwolf 12 TB). After cloning one of them with Clonezilla, I got this error after boot:

[  154.812778] ata4.00: invalid checksum 0x2 on log page 10h
[  154.812790] ata4: log page 10h reported inactive tag 0
[  154.813097] ata4.00: exception Emask 0x1 SAct 0x80000006 SErr 0x0 action 0x0
[  154.813412] ata4.00: irq_stat 0x40000008
[  154.813604] ata4.00: failed command: READ FPDMA QUEUED
[  154.813861] ata4.00: cmd 60/00:08:20:26:73/01:00:09:00:00/40 tag 1 ncq dma 131072 in
                        res 40/00:10:00:69:00/00:00:e3:01:00/40 Emask 0x1 (device error)
[  154.814521] ata4.00: status: { DRDY }
[  154.814692] ata4.00: failed command: WRITE FPDMA QUEUED
[  154.814943] ata4.00: cmd 61/40:10:00:69:00/05:00:e3:01:00/40 tag 2 ncq dma 688128 out
                        res 40/00:10:00:69:00/00:00:e3:01:00/40 Emask 0x1 (device error)
[  154.815731] ata4.00: status: { DRDY }
[  154.815935] ata4.00: failed command: READ FPDMA QUEUED
[  154.816187] ata4.00: cmd 60/00:f8:20:25:73/01:00:09:00:00/40 tag 31 ncq dma 131072 in
                        res 40/00:10:00:69:00/00:00:e3:01:00/40 Emask 0x1 (device error)
[  154.816824] ata4.00: status: { DRDY }
[  154.925135] ata4.00: configured for UDMA/133
[  154.925221] sd 3:0:0:0: [sdc] tag#1 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  154.925238] sd 3:0:0:0: [sdc] tag#1 Sense Key : Illegal Request [current] 
[  154.925253] sd 3:0:0:0: [sdc] tag#1 Add. Sense: Unaligned write command
[  154.925269] sd 3:0:0:0: [sdc] tag#1 CDB: Read(16) 88 00 00 00 00 00 09 73 26 20 00 00 01 00 00 00
[  154.925280] print_req_error: I/O error, dev sdc, sector 158541344
[  154.925697] sd 3:0:0:0: [sdc] tag#31 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  154.925713] sd 3:0:0:0: [sdc] tag#31 Sense Key : Illegal Request [current] 
[  154.925727] sd 3:0:0:0: [sdc] tag#31 Add. Sense: Unaligned write command
[  154.925742] sd 3:0:0:0: [sdc] tag#31 CDB: Read(16) 88 00 00 00 00 00 09 73 25 20 00 00 01 00 00 00
[  154.925751] print_req_error: I/O error, dev sdc, sector 158541088
[  154.926069] ata4: EH complete
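
The «cmd» lines in this log can be decoded to see which sectors each failed command touched. The sketch below assumes the byte order libata prints (command / feature, count, LBA low, mid, high / the five «previous» high-order bytes / device). It also assumes that for NCQ commands the sector count sits in the feature bytes and the tag in the upper bits of the count byte. The function name is invented for illustration.

```python
import re

# Five colon-separated hex bytes, e.g. "00:f8:20:25:73".
FIVE = r"((?:[0-9a-f]{2}:){4}[0-9a-f]{2})"
CMD_RE = re.compile(r"cmd ([0-9a-f]{2})/" + FIVE + "/" + FIVE + r"/([0-9a-f]{2})")

NCQ_COMMANDS = {0x60: "READ FPDMA QUEUED", 0x61: "WRITE FPDMA QUEUED"}


def decode_ncq_cmd(line):
    """Decode tag, sector count and start LBA from a libata 'cmd' line."""
    match = CMD_RE.search(line)
    if match is None:
        raise ValueError("no cmd field found")
    command = int(match.group(1), 16)
    if command not in NCQ_COMMANDS:
        raise ValueError("not an NCQ read/write command")
    feature, nsect, lbal, lbam, lbah = (int(b, 16) for b in match.group(2).split(":"))
    hob_feature, _hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(b, 16) for b in match.group(3).split(":")
    )
    sectors = (hob_feature << 8) | feature
    if sectors == 0:
        sectors = 65536  # a zero count encodes the maximum transfer size
    lba = (
        (hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24)
        | (lbah << 16) | (lbam << 8) | lbal
    )
    return {
        "name": NCQ_COMMANDS[command],
        "tag": nsect >> 3,   # the tag occupies bits 7:3 of the count byte
        "sectors": sectors,
        "lba": lba,
    }


if __name__ == "__main__":
    line = "ata4.00: cmd 60/00:f8:20:25:73/01:00:09:00:00/40 tag 31 ncq dma 131072 in"
    print(decode_ncq_cmd(line))
```

For the tag 31 read, this gives 256 sectors starting at LBA 158541088, the same sector the kernel reports in its I/O error line. The LBA recorded in the SMART error log further down (158541280) falls inside that 256-sector range.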

SMART data:

# smartctl -a /dev/sdc
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-18-amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate IronWolf
Device Model:     ST12000VN0008-2PH103
Serial Number:    XXX
LU WWN Device Id: 5 000c50 0c9110c5d
Firmware Version: SC61
User Capacity:    12.000.138.625.024 bytes [12,0 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-4 (minor revision not indicated)
SATA Version is:  SATA 3.3, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sun Dec 26 21:35:45 2021 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever 
                                        been run.
Total time to complete Offline 
data collection:                (  567) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        (1104) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x50bd) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   063   061   044    Pre-fail  Always       -       59411605
  3 Spin_Up_Time            0x0003   095   095   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       4
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       128
  7 Seek_Error_Rate         0x000f   063   060   045    Pre-fail  Always       -       1990119
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       18
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       4
 18 Head_Health             0x000b   100   100   050    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   099   099   000    Old_age   Always       -       1
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   058   049   040    Old_age   Always       -       42 (Min/Max 25/42)
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       1
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       31
194 Temperature_Celsius     0x0022   042   049   000    Old_age   Always       -       42 (0 25 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       8
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Pressure_Limit          0x0023   100   100   001    Pre-fail  Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       10h+11m+52.737s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       7845334924
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       32204736

SMART Error Log Version: 1
ATA Error Count: 1
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 1 occurred at disk power-on lifetime: 18 hours (0 days + 18 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 53 00 e0 25 73 09  Error: WP at LBA = 0x097325e0 = 158541280

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 00 40 ff ff ff 4f 00      00:15:00.212  WRITE FPDMA QUEUED
  60 00 00 20 26 73 49 00      00:14:53.918  READ FPDMA QUEUED
  60 00 00 20 25 73 49 00      00:14:53.913  READ FPDMA QUEUED
  61 00 40 ff ff ff 4f 00      00:14:51.855  WRITE FPDMA QUEUED
  61 00 40 ff ff ff 4f 00      00:14:51.855  WRITE FPDMA QUEUED

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%        18         -
# 2  Extended offline    Aborted by host               90%        18         -
# 3  Short offline       Completed without error       00%        18         -
# 4  Short offline       Completed without error       00%         0         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

Reallocated_Sector_Ct doesn’t look good to me. Furthermore, the drive is noisier than before. Is my brand new hard drive already failing?
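
One rough way to judge the trend is to compare raw values between two reports taken some time apart. The sketch below hard-codes a handful of raw values from the report above (21:35) and from the one in the EDIT that follows (22:15). The choice of attributes is a judgement call and the helper name is invented for this example.

```python
# Raw values copied from the two smartctl reports in this post.
FIRST_REPORT = {
    "Reallocated_Sector_Ct": 128,
    "Reported_Uncorrect": 1,
    "Current_Pending_Sector": 8,
    "Offline_Uncorrectable": 8,
    "UDMA_CRC_Error_Count": 0,
}
SECOND_REPORT = {
    "Reallocated_Sector_Ct": 408,
    "Reported_Uncorrect": 1,
    "Current_Pending_Sector": 8,
    "Offline_Uncorrectable": 8,
    "UDMA_CRC_Error_Count": 0,
}


def changed(before, after):
    """Return {attribute: increase} for attributes whose raw value differs."""
    return {
        name: after[name] - before[name]
        for name in before
        if name in after and after[name] != before[name]
    }


if __name__ == "__main__":
    for name, delta in changed(FIRST_REPORT, SECOND_REPORT).items():
        print(f"{name}: {delta:+d}")
```

Only Reallocated_Sector_Ct moves between the two reports, while UDMA_CRC_Error_Count stays at 0. That pattern points more towards the drive itself than towards the cable, though two samples are not much to go on.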

EDIT: Figures are getting worse after switching the SATA cable:

# smartctl -a /dev/sdb
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-18-amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate IronWolf
Device Model:     ST12000VN0008-2PH103
Serial Number:    XXX
LU WWN Device Id: 5 000c50 0c9110c5d
Firmware Version: SC61
User Capacity:    12.000.138.625.024 bytes [12,0 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-4 (minor revision not indicated)
SATA Version is:  SATA 3.3, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sun Dec 26 22:15:09 2021 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever 
                                        been run.
Total time to complete Offline 
data collection:                (  567) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        (1104) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x50bd) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   063   061   044    Pre-fail  Always       -       66405997
  3 Spin_Up_Time            0x0003   093   093   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       6
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       408
  7 Seek_Error_Rate         0x000f   063   060   045    Pre-fail  Always       -       2007495
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       19
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       5
 18 Head_Health             0x000b   100   100   050    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   099   099   000    Old_age   Always       -       1
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   066   049   040    Old_age   Always       -       34 (Min/Max 32/34)
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       1
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       33
194 Temperature_Celsius     0x0022   034   049   000    Old_age   Always       -       34 (0 25 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       8
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Pressure_Limit          0x0023   100   100   001    Pre-fail  Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       10h+27m+34.865s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       7852313820
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       32220232

SMART Error Log Version: 1
ATA Error Count: 1
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 1 occurred at disk power-on lifetime: 18 hours (0 days + 18 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 53 00 e0 25 73 09  Error: WP at LBA = 0x097325e0 = 158541280

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 00 40 ff ff ff 4f 00      00:15:00.212  WRITE FPDMA QUEUED
  60 00 00 20 26 73 49 00      00:14:53.918  READ FPDMA QUEUED
  60 00 00 20 25 73 49 00      00:14:53.913  READ FPDMA QUEUED
  61 00 40 ff ff ff 4f 00      00:14:51.855  WRITE FPDMA QUEUED
  61 00 40 ff ff ff 4f 00      00:14:51.855  WRITE FPDMA QUEUED

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%        19         -
# 2  Short offline       Completed without error       00%        18         -
# 3  Extended offline    Aborted by host               90%        18         -
# 4  Short offline       Completed without error       00%        18         -
# 5  Short offline       Completed without error       00%         0         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

Last edited by Silenzium (2021-12-27 09:09:52)

Source

I purchased a 1TB Samsung SSD 860 EVO before Christmas and just kind of left it on the shelf till today.

Well, I installed it in my server to replace an old but reliable 500GB 860 EVO, and that’s where the trouble started.

While copying data to the drive I started getting warnings about CRC errors, with the corresponding SMART stats going up, and variations of this continue to show up in the logs:

Feb  1 11:22:43 VOID kernel: ata6.00: irq_stat 0x08000000, interface fatal error
Feb  1 11:22:43 VOID kernel: ata6.00: failed command: WRITE FPDMA QUEUED
Feb  1 11:22:43 VOID kernel: ata6.00: cmd 61/e8:40:f8:ce:f0/09:00:02:00:00/40 tag 8 ncq dma 1298432 ou
Feb  1 11:22:43 VOID kernel:         res 40/00:40:f8:ce:f0/00:00:02:00:00/40 Emask 0x10 (ATA bus error)
Feb  1 11:22:43 VOID kernel: ata6.00: status: { DRDY }
Feb  1 11:22:43 VOID kernel: ata6.00: failed command: WRITE FPDMA QUEUED
Feb  1 11:22:43 VOID kernel: ata6.00: cmd 61/50:48:e0:d8:f0/09:00:02:00:00/40 tag 9 ncq dma 1220608 ou
Feb  1 11:22:43 VOID kernel:         res 40/00:40:f8:ce:f0/00:00:02:00:00/40 Emask 0x10 (ATA bus error)
Feb  1 11:22:43 VOID kernel: ata6.00: status: { DRDY }
Feb  1 11:22:43 VOID kernel: ata6.00: failed command: WRITE FPDMA QUEUED
Feb  1 11:22:43 VOID kernel: ata6.00: cmd 61/d8:70:30:e2:f0/09:00:02:00:00/40 tag 14 ncq dma 1290240 ou
Feb  1 11:22:43 VOID kernel:         res 40/00:40:f8:ce:f0/00:00:02:00:00/40 Emask 0x10 (ATA bus error)
Feb  1 11:22:43 VOID kernel: ata6.00: status: { DRDY }
Feb  1 11:22:43 VOID kernel: ata6.00: failed command: WRITE FPDMA QUEUED
Feb  1 11:22:43 VOID kernel: ata6.00: cmd 61/90:78:08:ec:f0/09:00:02:00:00/40 tag 15 ncq dma 1253376 ou
Feb  1 11:22:43 VOID kernel:         res 40/00:40:f8:ce:f0/00:00:02:00:00/40 Emask 0x10 (ATA bus error)
Feb  1 11:22:43 VOID kernel: ata6.00: status: { DRDY }
Feb  1 11:22:43 VOID kernel: ata6.00: failed command: WRITE FPDMA QUEUED
Feb  1 11:22:43 VOID kernel: ata6.00: cmd 61/f8:80:98:f5:f0/09:00:02:00:00/40 tag 16 ncq dma 1306624 ou
Feb  1 11:22:43 VOID kernel:         res 40/00:40:f8:ce:f0/00:00:02:00:00/40 Emask 0x10 (ATA bus error)
Feb  1 11:22:43 VOID kernel: ata6.00: status: { DRDY }
Feb  1 11:22:43 VOID kernel: ata6.00: failed command: WRITE FPDMA QUEUED
Feb  1 11:22:43 VOID kernel: ata6.00: cmd 61/e0:a8:90:ff:f0/09:00:02:00:00/40 tag 21 ncq dma 1294336 ou
Feb  1 11:22:43 VOID kernel:         res 40/00:40:f8:ce:f0/00:00:02:00:00/40 Emask 0x10 (ATA bus error)
Feb  1 11:22:43 VOID kernel: ata6.00: status: { DRDY }

So I shut it down and swapped out:

3 different SATA cables, 2 Molex-to-SATA power adapters, and the port on the motherboard for another known-working port.

When I changed the port on the motherboard, the error changed somewhat and appeared when I first booted, before copying any data:

Feb 1 18:13:05 VOID kernel: ata1.00: READ LOG DMA EXT failed, trying PIO
Feb 1 18:13:05 VOID kernel: ata1.00: exception Emask 0x0 SAct 0xffffffff SErr 0x0 action 0x6
Feb 1 18:13:05 VOID kernel: ata1.00: irq_stat 0x40000008
Feb 1 18:13:05 VOID kernel: ata1.00: failed command: READ FPDMA QUEUED
Feb 1 18:13:05 VOID kernel: ata1.00: cmd 60/20:20:28:1a:25/00:00:0e:00:00/40 tag 4 ncq dma 16384 in
Feb 1 18:13:05 VOID kernel: res 41/84:20:28:1a:25/00:00:0e:00:00/00 Emask 0x410 (ATA bus error) <F>
Feb 1 18:13:05 VOID kernel: ata1.00: status: { DRDY ERR }
Feb 1 18:13:05 VOID kernel: ata1.00: error: { ICRC ABRT }
Feb 1 18:13:05 VOID kernel: ata1: hard resetting link
Feb 1 18:13:05 VOID kernel: ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Feb 1 18:13:05 VOID kernel: ata1.00: supports DRM functions and may not be fully accessible
Feb 1 18:13:05 VOID kernel: ata1.00: supports DRM functions and may not be fully accessible
Feb 1 18:13:05 VOID kernel: ata1.00: configured for UDMA/133
Feb 1 18:13:05 VOID kernel: sd 1:0:0:0: [sdb] tag#4 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 cmd_age=0s
Feb 1 18:13:05 VOID kernel: sd 1:0:0:0: [sdb] tag#4 Sense Key : 0xb [current]
Feb 1 18:13:05 VOID kernel: sd 1:0:0:0: [sdb] tag#4 ASC=0x47 ASCQ=0x0
Feb 1 18:13:05 VOID kernel: sd 1:0:0:0: [sdb] tag#4 CDB: opcode=0x28 28 00 0e 25 1a 28 00 00 20 00
Feb 1 18:13:05 VOID kernel: blk_update_request: I/O error, dev sdb, sector 237312552 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
Feb 1 18:13:05 VOID kernel: ata1: EH complete
Feb 1 18:13:05 VOID kernel: ata1.00: Enabling discard_zeroes_data

It hasn’t reappeared yet; I’m copying more data as a test to see if it happens again. But so far the issue continues to follow the drive, so I think I might have gotten a defective one. Can anyone confirm? Any other ideas I could try?

On a related note, does anyone have experience with getting this type of issue covered under warranty by Samsung? Like an idiot, I missed my return window with Amazon.

void-diagnostics-20210201-1818.zip

syslog-20210201-114006.txt

syslog-20210201-161732.txt

syslog-20210201-175504.txt

EDIT: Still doing it, same errors as before:

Feb 1 19:08:20 VOID kernel: ata1.00: exception Emask 0x10 SAct 0x80800000 SErr 0x0 action 0x6 frozen
Feb 1 19:08:20 VOID kernel: ata1.00: irq_stat 0x08000000, interface fatal error
Feb 1 19:08:20 VOID kernel: ata1.00: failed command: WRITE FPDMA QUEUED
Feb 1 19:08:20 VOID kernel: ata1.00: cmd 61/60:b8:00:7e:db/00:00:05:00:00/40 tag 23 ncq dma 49152 out
Feb 1 19:08:20 VOID kernel: res 40/00:b8:00:7e:db/00:00:05:00:00/40 Emask 0x10 (ATA bus error)
Feb 1 19:08:20 VOID kernel: ata1.00: status: { DRDY }
Feb 1 19:08:20 VOID kernel: ata1.00: failed command: WRITE FPDMA QUEUED
Feb 1 19:08:20 VOID kernel: ata1.00: cmd 61/60:f8:80:7e:db/00:00:05:00:00/40 tag 31 ncq dma 49152 out
Feb 1 19:08:20 VOID kernel: res 40/00:b8:00:7e:db/00:00:05:00:00/40 Emask 0x10 (ATA bus error)
Feb 1 19:08:20 VOID kernel: ata1.00: status: { DRDY }
Feb 1 19:08:20 VOID kernel: ata1: hard resetting link
Feb 1 19:08:21 VOID kernel: ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Feb 1 19:08:21 VOID kernel: ata1.00: supports DRM functions and may not be fully accessible
Feb 1 19:08:21 VOID kernel: ata1.00: supports DRM functions and may not be fully accessible
Feb 1 19:08:21 VOID kernel: ata1.00: configured for UDMA/133
Feb 1 19:08:21 VOID kernel: ata1: EH complete
Feb 1 19:08:21 VOID kernel: ata1.00: Enabling discard_zeroes_data

Edited February 12, 2021 by weirdcrap

unsolved, errors are back

Source

Here is the latest error that I got:

Mar 31 12:16:17 hristo-ws kernel: ata5.00: exception Emask 0x1 SAct 0x70000 SErr 0x0 action 0x6 frozen
Mar 31 12:16:17 hristo-ws kernel: ata5.00: irq_stat 0x40000008
Mar 31 12:16:17 hristo-ws kernel: ata5.00: failed command: READ FPDMA QUEUED
Mar 31 12:16:17 hristo-ws kernel: ata5.00: cmd 60/00:80:90:4d:10/01:00:06:00:00/40 tag 16 ncq dma 131072 in
                                           res 40/00:84:90:4d:10/00:00:06:00:00/40 Emask 0x1 (device error)
Mar 31 12:16:17 hristo-ws kernel: ata5.00: status: { DRDY }
Mar 31 12:16:17 hristo-ws kernel: ata5.00: failed command: READ FPDMA QUEUED
Mar 31 12:16:17 hristo-ws kernel: ata5.00: cmd 60/08:88:b8:9c:b1/00:00:07:00:00/40 tag 17 ncq dma 4096 in
                                           res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x5 (timeout)
Mar 31 12:16:17 hristo-ws kernel: ata5.00: status: { DRDY }
Mar 31 12:16:17 hristo-ws kernel: ata5.00: failed command: READ FPDMA QUEUED
Mar 31 12:16:17 hristo-ws kernel: ata5.00: cmd 60/08:90:c8:20:25/00:00:06:00:00/40 tag 18 ncq dma 4096 in
                                           res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x5 (timeout)
Mar 31 12:16:17 hristo-ws kernel: ata5.00: status: { DRDY }
Mar 31 12:16:17 hristo-ws kernel: ata5: hard resetting link
Mar 31 12:16:17 hristo-ws kernel: ata5: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Mar 31 12:16:17 hristo-ws kernel: ata5.00: configured for UDMA/133
Mar 31 12:16:17 hristo-ws kernel: ata5.00: device reported invalid CHS sector 0
Mar 31 12:16:17 hristo-ws kernel: ata5: EH complete
Mar 31 12:16:17 hristo-ws kernel: ata5.00: exception Emask 0x0 SAct 0x14 SErr 0x0 action 0x0
Mar 31 12:16:17 hristo-ws kernel: ata5.00: irq_stat 0x40000008
Mar 31 12:16:17 hristo-ws kernel: ata5.00: failed command: READ FPDMA QUEUED
Mar 31 12:16:17 hristo-ws kernel: ata5.00: cmd 60/00:10:90:4d:10/01:00:06:00:00/40 tag 2 ncq dma 131072 in
                                           res 41/40:00:90:4d:10/00:01:06:00:00/40 Emask 0x409 (media error) <F>
Mar 31 12:16:17 hristo-ws kernel: ata5.00: status: { DRDY ERR }
Mar 31 12:16:17 hristo-ws kernel: ata5.00: error: { UNC }
Mar 31 12:16:17 hristo-ws kernel: ata5.00: configured for UDMA/133
Mar 31 12:16:17 hristo-ws kernel: sd 4:0:0:0: [sdc] tag#2 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Mar 31 12:16:17 hristo-ws kernel: sd 4:0:0:0: [sdc] tag#2 Sense Key : Medium Error [current] 
Mar 31 12:16:17 hristo-ws kernel: sd 4:0:0:0: [sdc] tag#2 Add. Sense: Unrecovered read error - auto reallocate failed
Mar 31 12:16:17 hristo-ws kernel: sd 4:0:0:0: [sdc] tag#2 CDB: Read(10) 28 00 06 10 4d 90 00 01 00 00
Mar 31 12:16:17 hristo-ws kernel: print_req_error: I/O error, dev sdc, sector 101731728
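
As a cross-check on the log above, the failing sector can be recovered from the `res` line itself. The sketch below assumes the usual libata layout of the twelve hex bytes (status/error:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device); the function name `taskfile_lba` is made up for illustration.

```python
import re

# Twelve hex bytes printed after "cmd" or "res" (assumed libata layout):
# command-or-status / feature-or-error : nsect : lbal : lbam : lbah
# / hob_feature : hob_nsect : hob_lbal : hob_lbam : hob_lbah / device
TASKFILE = re.compile(
    r"\b(?:cmd|res)\s+"
    r"([0-9a-f]{2})/([0-9a-f]{2}):([0-9a-f]{2}):([0-9a-f]{2}):([0-9a-f]{2}):([0-9a-f]{2})"
    r"/([0-9a-f]{2}):([0-9a-f]{2}):([0-9a-f]{2}):([0-9a-f]{2}):([0-9a-f]{2})"
    r"/([0-9a-f]{2})"
)


def taskfile_lba(line):
    """Return the 48-bit LBA encoded in a libata cmd/res line, or None."""
    m = TASKFILE.search(line)
    if m is None:
        return None
    b = [int(x, 16) for x in m.groups()]
    lbal, lbam, lbah = b[3], b[4], b[5]
    hob_lbal, hob_lbam, hob_lbah = b[8], b[9], b[10]
    # The "hob" (high order byte) fields carry LBA bits 24..47.
    return (hob_lbah << 40 | hob_lbam << 32 | hob_lbal << 24
            | lbah << 16 | lbam << 8 | lbal)


line = "res 41/40:00:90:4d:10/00:01:06:00:00/40 Emask 0x409 (media error) <F>"
print(taskfile_lba(line))  # 101731728, the sector named in the I/O error line
```

The decoded value agrees with the `sector 101731728` reported by `print_req_error`, which suggests the error is tied to one specific location on the disk rather than to arbitrary transfers.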

What could this mean? And how to fix it?

asked Mar 31, 2019 at 9:26

Your computer tried to communicate with the disk on ata5, and the read commands failed.

Check the disk’s SMART data to see whether the disk has gone bad,
and check the connections and cables to your drive.
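
A minimal sketch of the SMART check, assuming the attribute table comes from `smartctl -A /dev/sdX` in the same column layout as the tables quoted in this thread. The watch list and the function name `flag_smart_attributes` are assumptions for illustration, not a vendor rule.

```python
# Attributes treated as warning signs when their raw value is non-zero.
# This watch list is an assumption for illustration only.
WATCHED = {
    "Reallocated_Sector_Ct",
    "Reported_Uncorrect",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
}


def flag_smart_attributes(smartctl_output):
    """Return {attribute_name: raw_value} for watched attributes with raw > 0."""
    flagged = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW...
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name, raw = fields[1], fields[9]
        if name in WATCHED and raw.isdigit() and int(raw) > 0:
            flagged[name] = int(raw)
    return flagged


sample = """\
187 Reported_Uncorrect      0x0032   099   099   000    Old_age   Always       -       1
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
"""
print(flag_smart_attributes(sample))
```

As a rough rule of thumb, non-zero pending or uncorrectable sector counts tend to point at the disk itself, while a rising UDMA_CRC_Error_Count tends to point at the cable or connection.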

answered Jun 1, 2019 at 9:26

Source

I have a new Ubuntu Server (11.04) that keeps crashing, especially during heavy disk I/O (like making a backup). Its drives are configured as a RAID 10 of four 1TB Western Digital Caviar Black hard drives.

The message I’m seeing via /proc/kmsg when it crashes is, «failed command: READ FPDMA QUEUED».

It seems that something is wrong with either the drives or the software RAID?

The machine was doing fine until this afternoon, when it crashed during a file transfer. Ever since then it’s been crashing when I try to run a backup, but not always at the same file or place.

How do I know if it’s a software or hardware failure? How do I know if it’s the SATA controller or one of the disks?
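
One rough way to approach the controller-versus-disk question is to tally the failed commands per ATA device and per error class that libata prints in parentheses after `Emask`. The sketch below only counts what the log says; the function name `summarize` is hypothetical, and reading "ATA bus error" as a link problem and "media error" as a disk problem is a common heuristic, not a guarantee.

```python
import re
from collections import Counter

FAILED = re.compile(r"(ata\d+\.\d+): failed command: (.+?)\s*$")
RESULT = re.compile(r"\bres\s+\S+\s+Emask\s+0x[0-9a-f]+\s+\(([^)]+)\)")


def summarize(log_text):
    """Count (device, error class) pairs from libata error-handler output."""
    counts = Counter()
    device = None
    for line in log_text.splitlines():
        m = FAILED.search(line)
        if m:
            # Remember which device the next "res" line belongs to.
            device = m.group(1)
            continue
        m = RESULT.search(line)
        if m and device is not None:
            counts[(device, m.group(1))] += 1
            device = None
    return counts


sample = """\
ata4.00: failed command: READ FPDMA QUEUED
ata4.00: cmd 60/00:88:00:b5:d9/04:00:39:00:00/40 tag 17 ncq 524288 in
         res 40/00:5c:00:19:da/00:00:39:00:00/40 Emask 0x10 (ATA bus error)
ata4.00: failed command: READ FPDMA QUEUED
ata4.00: cmd 60/00:90:00:b9:d9/04:00:39:00:00/40 tag 18 ncq 524288 in
         res 40/00:5c:00:19:da/00:00:39:00:00/40 Emask 0x10 (ATA bus error)
"""
for (device, error_class), n in summarize(sample).items():
    print(device, error_class, n)
```

If every failure lands on one device with the same class, swapping that drive to another port and cable and seeing whether the errors follow it is a cheap next test.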

Also, all 4 drives in the array «completed without error» when I ran an extended off-line test on them.

This is the full output of /proc/kmsg from the time I rebooted until it crashed again:

[  356.076292] type=1400 audit(1311983491.536:14): apparmor="DENIED" operation="open" parent=1397 profile="/usr/lib/libvirt/virt-aa-helper" name="/dev/dm-9" pid=2222 comm="virt-aa-helper" requested_mask="r" denied_mask="r" fsuid=0 ouid=105
[  356.304840] type=1400 audit(1311983491.766:15): apparmor="STATUS" operation="profile_load" name="libvirt-c67f4a48-2cad-6deb-d7e7-13f9c7620ad9" pid=2223 comm="apparmor_parser"
[  357.002246] device vnet0 entered promiscuous mode
[  357.003702] br0: port 2(vnet0) entering learning state
[  357.003704] br0: port 2(vnet0) entering learning state
[  366.020017] br0: port 2(vnet0) entering forwarding state
[  367.050024] vnet0: no IPv6 routers present
 l)idx08f5
1[20.298 aefas x0000000(
4[20.294 i:38 om d_ad0Nttitd263-0sre 4-bnu 15054]Cl rc: 15055]  15055]  15055]  15055]  15055]  15056]   1bfpo_re05/x0[ad0
4[20.292 [ffff81de>  epo_re05/x0 15057]   ycrqetwie03b040[ad0
4[20.290 [ffffa017>  ad0+xe/x5 ri1] 15058]   eal_pnlc_lg+x/x0 15058]  15059]  15059]  15059]  15059]  15059]   enltra_epr0001
4[20.200 ialn okdbgigdet enltit_ x0000 H D hne
3[55.640 t4 Err  eoCm ess HRyh 08 
3[55.646 t40:fie omn:RA PM UUD 84088]aa.0 m 00:00:1d/40:90:04 a  c 228i
3[55.644     rs4/05:01:a0:03:00/0Eak01 AAbserr
3[55.643 t40:sau:{DD 
3[55.647 t40:fie omn:RA PM UUD 84080]aa.0 m 00:80:dd/40:90:04 a  c 228i
3[55.654     rs4/05:01:a0:03:00/0Eak01 AAbserr
3[55.653 t40:sau:{DD 
3[55.656 t40:fie omn:RA PM UUD 84082]aa.0 m 00:00:5d/40:90:04 a  c 228i
3[55.653     rs4/05:01:a0:03:00/0Eak01 AAbserr
3[55.652 t40:sau:{DD 
3[55.656 t40:fie omn:RA PM UUD 84084]aa.0 m 00:80:9d/40:90:04 a  c 228i
3[55.652     rs4/05:01:a0:03:00/0Eak01 AAbserr
3[55.651 t40:sau:{DD 
3[55.655 t40:fie omn:RA PM UUD 84086]aa.0 m 00:00:dd/40:90:04 a  c 228i
3[55.651     rs4/05:01:a0:03:00/0Eak01 AAbserr
3[55.650 t40:sau:{DD 
3[55.654 t40:fie omn:RA PM UUD 84088]aa.0 m 00:80:1d/40:90:04 a  c 228i
3[55.650     rs4/05:01:a0:03:00/0Eak01 AAbserr
3[55.650 t40:sau:{DD 
3[55.654 t40:fie omn:RA PM UUD 84089]aa.0 m 00:00:5d/40:90:04 a  c 228i
3[55.660     rs4/05:01:a0:03:00/0Eak01 AAbserr
3[55.668 t40:sau:{DD 
3[55.662 t40:fie omn:RA PM UUD 84081]aa.0 m 00:80:9d/40:90:04 a  c 228i
3[55.668     rs4/05:01:a0:03:00/0Eak01 AAbserr
3[55.668 t40:sau:{DD 
3[55.724 t40:fie omn:RA PM UUD:03:00/0tg8nq548 n 84008]     e 00:c0:9d/00:90:04 ms x0(T u ro) 84010]aa.0 tts  RY} 84029]aa.0 aldcmad EDFDAQEE
3[55.706 t40:cd6/04:01:a0:03:00/0tg9nq548 n 84034]     e 00:c0:9d/00:90:04 ms x0(T u ro) 84049]aa.0 tts  RY} 84048]aa.0 aldcmad EDFDAQEE
3[55.739 t40:cd6/05:01:a0:03:00/0tg1 c 228i
3[55.739     rs4/05:01:a0:03:00/0Eak01 AAbserr
3[55.742 t40:sau:{DD 
3[55.703 t40:fie omn:RA PM UUD 84074]aa.0 m 00:80:9d/40:90:04 a 1nq548 n 84074]     e 00:c0:9d/00:90:04 ms x0(T u ro) 84083]aa.0 tts  RY} 84098]aa.0 aldcmad EDFDAQEE
3[55.797 t40:cd6/06:01:a0:03:00/0tg1 c 228i
3[55.798     rs4/05:01:a0:03:00/0Eak01 AAbserr
3[55.811 t40:sau:{DD 
0 aldcmad EDFDAQEE
3[55.839 t40:cd6/06:0a:90:03:00/0tg1 c 228i
3[55.830     rs4/05:01:a0:03:00/0Eak01 AAbserr
3[55.851 t40:sau:{DD 
3[55.818 t40:fie omn:RA PM UUD 84048]aa.0 m 00:00:9d/40:90:04 a 4nq548 n 84048]     e 00:c0:9d/00:90:04 ms x0(T u ro) 84058]aa.0 tts  RY} 84064]aa.0 aldcmad EDFDAQEE
3[55.809 t40:cd6/07:0a:90:03:00/0tg1 c 228i
3[55.800     rs4/05:01:a0:03:00/0Eak01 AAbserr
3[55.826 t40:sau:{DD 
3[55.871 t40:fie omn:RA PM UUD 84098]aa.0 m 00:00:1d/40:90:04 a 6nq548 n 84098]     e 00:c0:9d/00:90:04 ms x0(T u ro) 84007]aa.0 tts  RY}
[ 5854.091012] ata4.00: failed command: READ FPDMA QUEUED
[ 5854.091573] ata4.00: cmd 60/00:88:00:b5:d9/04:00:39:00:00/40 tag 17 ncq 524288 in
[ 5854.091574]          res 40/00:5c:00:19:da/00:00:39:00:00/40 Emask 0x10 (ATA bus error)
[ 5854.092690] ata4.00: status: { DRDY }
[ 5854.093287] ata4.00: failed command: READ FPDMA QUEUED
[ 5854.093817] ata4.00: cmd 60/00:90:00:b9:d9/04:00:39:00:00/40 tag 18 ncq 524288 in
[ 5854.093817]          res 40/00:5c:00:19:da/00:00:39:00:00/40 Emask 0x10 (ATA bus error)
[ 5854.094964] ata4.00: status: { DRDY }
[ 5854.095510] ata4.00: failed command: READ FPDMA QUEUED
[ 5854.096105] ata4.00: cmd 60/00:98:00:bd:d9/04:00:39:00:00/40 tag 19 ncq 524288 in
[ 5854.096105]          res 40/00:5c:00:19:da/00:00:39:00:00/40 Emask 0x10 (ATA bus error)
[ 5854.097246] ata4.00: status: { DRDY }
[ 5854.097773] ata4.00: failed command: READ FPDMA QUEUED
[ 5854.098338] ata4.00: cmd 60/00:a0:00:c1:d9/04:00:39:00:00/40 tag 20 ncq 524288 in
[ 5854.098339]          res 40/00:5c:00:19:da/00:00:39:00:00/40 Emask 0x10 (ATA bus error)
[ 5854.099491] ata4.00: status: { DRDY }
[ 5854.100084] ata4.00: failed command: READ FPDMA QUEUED
[ 5854.100658] ata4.00: cmd 60/00:a8:00:c5:d9/04:00:39:00:00/40 tag 21 ncq 524288 in
[ 5854.100659]          res 40/00:5c:00:19:da/00:00:39:00:00/40 Emask 0x10 (ATA bus error)
[ 5854.101775] ata4.00: status: { DRDY }
[ 5854.102392] ata4.00: failed command: READ FPDMA QUEUED
[ 5854.102924] ata4.00: cmd 60/00:b0:00:c9:d9/04:00:39:00:00/40 tag 22 ncq 524288 in
[ 5854.102925]          res 40/00:5c:00:19:da/00:00:39:00:00/40 Emask 0x10 (ATA bus error)
[ 5854.104080] ata4.00: status: { DRDY }
[ 5854.104639] ata4.00: failed command: READ FPDMA QUEUED
[ 5854.105206] ata4.00: cmd 60/00:b8:00:cd:d9/04:00:39:00:00/40 tag 23 ncq 524288 in
[ 5854.105207]          res 40/00:5c:00:19:da/00:00:39:00:00/40 Emask 0x10 (ATA bus error)
[ 5854.106377] ata4.00: status: { DRDY }
[ 5854.106921] ata4.00: failed command: READ FPDMA QUEUED
[ 5854.107493] ata4.00: cmd 60/00:c0:00:d1:d9/04:00:39:00:00/40 tag 24 ncq 524288 in
[ 5854.107493]          res 40/00:5c:00:19:da/00:00:39:00:00/40 Emask 0x10 (ATA bus error)
[ 5854.108651] ata4.00: status: { DRDY }
[ 5854.109201] ata4.00: failed command: READ FPDMA QUEUED
[ 5854.109784] ata4.00: cmd 60/00:c8:00:d5:d9/04:00:39:00:00/40 tag 25 ncq 524288 in
[ 5854.109785]          res 40/00:5c:00:19:da/00:00:39:00:00/40 Emask 0x10 (ATA bus error)
[ 5854.110981] ata4.00: status: { DRDY }
[ 5854.111555] ata4.00: failed command: READ FPDMA QUEUED
[ 5854.112111] ata4.00: cmd 60/00:d0:00:d9:d9/04:00:39:00:00/40 tag 26 ncq 524288 in
[ 5854.112111]          res 40/00:5c:00:19:da/00:00:39:00:00/40 Emask 0x10 (ATA bus error)
[ 5854.113248] ata4.00: status: { DRDY }
[ 5854.113839] ata4.00: failed command: READ FPDMA QUEUED
[ 5854.114408] ata4.00: cmd 60/00:d8:00:dd:d9/04:00:39:00:00/40 tag 27 ncq 524288 in
[ 5854.114409]          res 40/00:5c:00:19:da/00:00:39:00:00/40 Emask 0x10 (ATA bus error)
[ 5854.115586] ata4.00: status: { DRDY }
[ 5854.116127] ata4.00: failed command: READ FPDMA QUEUED
[ 5854.116691] ata4.00: cmd 60/00:e0:00:e1:d9/04:00:39:00:00/40 tag 28 ncq 524288 in
[ 5854.116691]          res 40/00:5c:00:19:da/00:00:39:00:00/40 Emask 0x10 (ATA bus error)
[ 5854.117856] ata4.00: status: { DRDY }
[ 5854.118430] ata4.00: failed command: READ FPDMA QUEUED
[ 5854.119004] ata4.00: cmd 60/00:e8:00:e5:d9/04:00:39:00:00/40 tag 29 ncq 524288 in
[ 5854.119005]          res 40/00:5c:00:19:da/00:00:39:00:00/40 Emask 0x10 (ATA bus error)
[ 5854.120213] ata4.00: status: { DRDY }
[ 5855.100038] ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 5855.104300] ata4.00: configured for UDMA/133
[ 5855.104351] ata4: EH complete
[10013.907683] general protection fault: 0000 [#1] SMP 
[10013.907997] last sysfs file: /sys/devices/system/cpu/cpu5/cache/index2/shared_cpu_map
[10013.908574] CPU 0 
[10013.908577] Modules linked in: ip6table_filter ip6_tables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp iptable_filter ip_tables x_tables kvm_amd kvm eeepc_wmi sparse_keymap bridge stp nouveau sp5100_tco ttm i2c_piix4 edac_core edac_mce_amd drm_kms_helper k10temp drm i2c_algo_bit video lp parport raid10 raid456 async_pq async_xor xor async_memcpy async_raid6_recov usb_storage uas r8169 xhci_hcd ahci libahci raid6_pq async_tx raid1 raid0 multipath linear
[10013.911080] 
[10013.911734] Pid: 349, comm: md0_resync Tainted: G    B       2.6.38-10-server #46-Ubuntu To be filled by O.E.M. To be filled by O.E.M./SABERTOOTH 990FX
[10013.912418] RIP: 0010:[]  [] kmem_cache_alloc+0x58/0x110
[10013.913102] RSP: 0018:ffff88041b6079c0  EFLAGS: 00010006
[10013.913783] RAX: 0000000000000000 RBX: ffff88043f802600 RCX: ffffffff813df60a
[10013.914472] RDX: 0000000000000000 RSI: 0000000000000020 RDI: ffff88043f802600
[10013.915159] RBP: ffff88041b607a00 R08: ffff8800bd416a80 R09: ffff880414db0500
[10013.915839] R10: 00000000684eb800 R11: 0000000000000001 R12: 0200000000000000
[10013.916523] R13: 0000000000000086 R14: 0000000000000020 R15: ffff88041b42b400
[10013.917204] FS:  00007f741ce31700(0000) GS:ffff8800bd400000(0000) knlGS:0000000000000000
[10013.917529] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[10013.917529] CR2: 00007f4402d5dd00 CR3: 00000004108a5000 CR4: 00000000000006f0
[10013.917529] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[10013.917529] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[10013.917529] Process md0_resync (pid: 349, threadinfo ffff88041b606000, task ffff88041b945b80)
[10013.917529] Stack:
[10013.917529]  ffff88041b6079e0 000000201b6079e0 0000000000000082 ffffffff81a77fa0
[10013.917529]  ffff880414db0500 0000000000000020 0000000000000020 ffff88041b42b400
[10013.917529]  ffff88041b607a30 ffffffff813df60a ffff880400000000 ffff88041be74020
[10013.917529] Call Trace:
[10013.917529]  [] scsi_pool_alloc_command+0x4a/0x80
[10013.917529]  [] scsi_host_alloc_command.clone.7+0x33/0xa0
[10013.159 [ffff83ff> _cigtcmad02/x0103972]  cistpf_md08/x0103972] 103972] 103972] 103972]  _eei_nlgdvc+x704
4[01.159 [ffff822c> eei_nlgdvc+x005
4[01.159 [ffff82c9> l_nlg02/x0103972]  ad0upu+xd03 ri1]103972]  dupu+x204
4[01.159 [ffff8421> dd_yc08d0c0103972] [10013.917529]  [] ? _raw_spin_lock+0xe/0x20
[10013.917529]  [] ? recalc_sigpending+0x1/x0103972] 103972] 103972] 103972] 103972]   enltra_epr0001
0[01.159 oe 66 06 04 9c a6 69 66 04 b0 54 30 5a c0 04 b2 d8 40 4a 00 04 34 8103972]RP 103972] S ff80167c>103972]-- n rc 5355716c]-

This is the output from syslog:

Jul 29 21:23:09 Othelo kernel: [ 5854.068449] ata4.00: exception Emask 0x10 SAct 0x7fffffff SErr 0x90202 action 0xe frozen
Jul 29 21:23:09 Othelo kernel: [ 5854.068464] ata4.00: irqstat00400,PYRYcagd<> 84087]aa:Sro:{RcvomPritPYdCg1BB}<> 84087]aa.0 aldcmad EDFDAQEE
Jul 29 21:23:09 Othelo kernel: 3[55.643 t40:cd6/00:0f:90:03:00/0tg0nq548 n<> 84088]     e 00:c0:9d/00:90:04 ms x0(T u ro)<> 84089]aa.0 tts  RY}<> 84089]aa.0 aldcmad EDFDAQEE
Jul 29 21:23:09 Othelo kernel: 3[55.653 t40:cd6/00:0e:90:03:00/0tg1nq548 n<> 84080]     e 00:c0:9d/00:90:04 ms x0(T u ro)<> 84081]aa.0 tts  RY}<> 84081]aa.0 aldcmad EDFDAQEE
Jul 29 21:23:09 Othelo kernel: 3[55.652 t40:cd6/01:0f:90:03:00/0tg2nq548 n<> 84082]     e 00:c0:9d/00:90:04 ms x0(T u ro)<> 84083]aa.0 tts  RY}<> 84083]aa.0 aldcmad EDFDAQEE
Jul 29 21:23:09 Othelo kernel: 3[55.652 t40:cd6/01:0f:90:03:00/0tg3nq548 n<> 84084]     e 00:c0:9d/00:90:04 ms x0(T u ro)<> 84085]aa.0 tts  RY}<> 84085]aa.0 aldcmad EDFDAQEE
Jul 29 21:23:09 Othelo kernel: 3[55.651 t40:cd6/02:0f:90:03:00/0tg4nq548 n<> 84086]     e 00:c0:9d/00:90:04 ms x0(T u ro)<> 84087]aa.0 tts  RY}<> 84087]aa.0 aldcmad EDFDAQEE
Jul 29 21:23:09 Othelo kernel: 3[55.650 t40:cd6/02:00:a0:03:00/0tg5nq548 n<> 84088]     e 00:c0:9d/00:90:04 ms x0(T u ro)<> 84089]aa.0 tts  RY}<> 84089]aa.0 aldcmad EDFDAQEE
Jul 29 21:23:09 Othelo kernel: 3[55.659 t40:cd6/03:00:a0:03:00/0tg6nq548 n<> 84080]     e 00:c0:9d/00:90:04 ms x0(T u ro)<> 84080]aa.0 tts  RY}<> 84081]aa.0 aldcmad EDFDAQEE
Jul 29 21:23:09 Othelo kernel: 3[55.667 t40:cd6/03:00:a0:03:00/0tg7nq548 n<> 84081]     e 00:c0:9d/00:90:04 ms x0(T u ro)<> 84099]aa.0 tts  RY}<> 84007]aa.0 aldcmad EDFDAQEE
Jul 29 21:23:09 Othelo kernel: [ 5854.070785] ata4.00: cmd 60/00:40:00:0d:da/040:90:04 a  c 228i
Jul 29 21:23:09 Othelo kernel: 3[55.776     rs4/05:01:a0:03:00/0Eak01 AAbserr
Jul 29 21:23:09 Othelo kernel: 3[55.797 t40:sau:{DD 
Jul 29 21:23:09 Othelo kernel: 3[55.743 t40:fie omn:RA PM UUD<> 84034]aa.0 m 00:80:1d/40:90:04 a  c 228i
Jul 29 21:23:09 Othelo kernel: 3[55.707     rs4/05:01:a0:03:00/0Eak01 AAbserr
Jul 29 21:23:09 Othelo kernel: 3[55.710 t40:sau:{DD 
Jul 29 21:23:09 Othelo kernel: 3[55.777 t40:fie omn:RA PM UUD<> 84052]aa.0 m 00:00:5d/40:90:04 a 0nq548 n<> 84052]     e 00:c0:9d/00:90:04 ms x0(T u ro)<> 84069]aa.0 tts  RY}<> 84075]aa.0 aldcmad EDFDAQEE
Jul 29 21:23:09 Othelo kernel: 3[55.763 t40:cd6/05:01:a0:03:00/0tg1 c 228i
Jul 29 21:23:09 Othelo kernel: 3[55.763     rs4/05:01:a0:03:00/0Eak01 AAbserr
Jul 29 21:23:09 Othelo kernel: 3[55.784 t40:sau:{DD 
Jul 29 21:23:09 Othelo kernel: 3[55.737 t40:fie omn:RA PM UUD<> 84096]aa.0 m 00:00:dd/40:90:04 a 2nq548 n<> 84096]     e 00:c0:9d/00:90:04 ms x0(T u ro)<> 84018]aa.0 tts  RY}<3>[ 5854.081762] ata4.0:fie omn:RA PM UUD<> 84021]aa.0 m 00:80:5d/40:90:04 a 3nq548 n<> 84022]     e 00:c0:9d/00:90:04 ms x0(T u ro)<> 84032]aa.0 tts  RY}<> 84045]aa.0 aldcmad EDFDAQEE
Jul 29 21:23:09 Othelo kernel: 3[55.867 t40:cd6/07:0a:90:03:00/0tg1 c 228i
Jul 29 21:23:09 Othelo kernel: 3[55.868     rs4/05:01:a0:03:00/0Eak01 AAbserr
Jul 29 21:23:09 Othelo kernel: 3[55.881 t40:sau:{DD 
Jul 29 21:23:09 Othelo kernel: 3[55.849 t40:fie omn:RA PM UUD<> 84073]aa.0 m 00:80:dd/40:90:04 a 5nq548 n<> 84074]     e 00:c0:9d/00:90:04 ms x0(T u ro)<> 84082]aa.0 tts  RY}<> 84083]aa.0 aldcmad EDFDAQEE
Jul 29 21:23:09 Othelo kernel: 3[55.822 t40:cd6/08:0b:90:03:00/0tg1 c 228i
Jul 29 21:23:09 Othelo kernel: 3[55.823     rs4/05:01:a0:03:00/0Eak01 AAbserr
Jul 29 21:23:09 Othelo kernel: 3[55.943 t40:sau:{DD 
Jul 29 21:23:09 Othelo kernel: [ 5854.120747] ata4.00: failed command: READ FPDMA QUEUED
Jul 29 21:23:09 Othelo kernel: [ 5854.121321] ata4.00: cmd 60/00:f0:00:e9:d9/04:00:39:00:00/40 tag 30 ncq 524288 in
Jul 29 21:23:09 Othelo kernel: [ 5854.121322]          res 40/00:5c:00:19:da/00:00:39:00:00/40 Emask 0x10 (ATA bus error)
Jul 29 21:23:09 Othelo kernel: [ 5854.122483] ata4.00: status: { DRDY }
Jul 29 21:23:09 Othelo kernel: [ 5854.123050] ata4: hard resetting link
Jul 29 21:42:45 Othelo mdadm[1563]: Rebuild62 event detected on md device /dev/md/0
Jul 29 22:16:06 Othelo mdadm[1563]: Rebuild81 event detected on md device /dev/md/0
Jul 29 22:17:01 Othelo CRON[13235]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Jul 29 22:32:29 Othelo kernel: 4972] <ffff1d8e]_ss_e_omn+xe0c
Jul 29 22:32:29 Othelo kernel: 4[01.159 [ffff83fd> cigtcmad04/x0<>103972] <ffff1efd]ss_eu_scn+xd0e
Jul 29 22:32:29 Othelo kernel: 4[01.159 [ffff843a> dpe_n0a/xa
Jul 29 22:32:29 Othelo kernel: 4[01.159 [ffff821b>  ediemv_eus+x80a
Jul 29 22:32:29 Othelo kernel: 4[01.159 [ffff8222> l_ekrqet0c/x1
Jul 29 22:32:29 Othelo kernel: 4[01.159 [ffff837c> cirqetf+x6040<>103972] <ffff1c77]_gnrcupu_eie03/x0<>103972] <ffff1ca0]gnrcupu_eie03/x0<>103972] <ffff1b75]bkupu+x506
Jul 29 22:32:29 Othelo kernel: 4[01.159 [ffffa0f8> nlgsae+x00c ri1]<>103972] <ffff0ead]ri1_nlg01/x0[ad0
Jul 29 22:32:29 Othelo kernel: 4[01.159 [ffff82c9> l_nlg02/x0<>103972] <ffff19f2]m_nlg02/x0<>103972] <ffff198d]m_osn+x4/x9
Jul 29 22:32:29 Othelo kernel: 4[01.159 [ffff807c>  uoeb05
Jul 29 22:32:29 Othelo kernel: 4[01.159 [ffff84ca> dtra+x1/x5
Jul 29 22:32:29 Othelo kernel: 4[01.159 [ffff84c9>  dtra+x/x5
Jul 29 22:32:29 Othelo kernel: 4[01.159 [ffff8077> tra+x60a
Jul 29 22:32:29 Othelo kernel: 4[01.159 [ffff80ce> enltra_epr0401
Jul 29 22:32:29 Othelo kernel: 4[01.159 [ffff807e>  tra+x/x0<>103972] <ffff10d0]?kre_hedhle+x/x0<>103972]Cd:6 69 69 98 5f 66 06 69 c8 76 c0 42 8d 00 d8 04 5e f8 00 00 86 71 4>8 40 98 04 9e 79 66 06 04 5e 5
Jul 29 22:32:29 Othelo kernel: 1[01.159 I [ffff815e> mmccealc05/x1
Jul 29 22:32:29 Othelo kernel: 4[01.159 RP<ff84b090

UPDATE:
I checked all the hard drives by running a full scan on each and writing zeros to all of them. No problems found. I’m going to reinstall everything and try again. :/

Source

Re: «failed command: READ FPDMA QUEUED» for my WD10EZEX-08M2NA0 drive.
