Error IDNF at LBA - Fixing errors and finding optimal solutions to problems

Contents

Is this a TrueNAS error or a Hardware issue?
Cam status: SCSI Status Error
Degraded pool: Please help me fix it.
cam status ATA status error

Is this a TrueNAS error or a Hardware issue?

NASbox

I have had errors showing up since shortly after I upgraded from FreeNAS 11.3-U5 to TrueNAS-12.0-U5. (I plan on upgrading to TrueNAS-12.0-U6 next weekend, assuming that the release goes off smoothly.) I don’t know if there is a connection or if it is just coincidence.

I tried searching, but I couldn’t find a readable explanation of what causes/what is happening when an IDNF error is reported. (Is this a non-existent LBA or is it something else?)

I got a message about an unrecoverable error on my main pool, so I had a look at the SMART parameters, and it appears that the issue had to do with the error message: Error: IDNF at LBA = .... The first time it happened, I was able to clear the error and I had about 100 KB resilvered. I ran badblocks (read-only) on the disk as well as a full SMART surface test and a zpool scrub. Other than the IDNF errors, there is no other indication of a problem.

I had a recurrence of the pool error condition, and that was easily cleared, and the pool passed a scrub without incident.

Any insight as to whether this is a TrueNAS error (ZFS issue), a hardware issue (drive da3, the HBA, a memory fault in non-ECC RAM), or just corruption (the system experienced a bad shutdown due to a UPS failure)?

I have included a grep of the relevant messages from /var/log/messages as well as the output from smartctl below. Any insight/assistance would be much appreciated. Thanks in advance.

Log Entries Related to First incident

Log Entries Related to Second incident

Latest Smart Output:

Happy FreeNAS User since 2012

NASbox

Can anyone tell me what an IDNF error is? I did a camcontrol identify on the drive. IIUC these IDNF errors seem to indicate that the command requested an LBA larger than the size of the drive. Do I have that correct? If so, then I assume that means a faulty HBA card/cable, memory corruption, or a TrueNAS software error. If someone can offer any clues it would be much appreciated.

Samuel Tai

Never underestimate your own stupidity

NASbox

IIUC it seems like a corrupt/damaged spot on the platter could also be responsible — or am I misinterpreting this quote from the article?

IDNF- Sector ID Not Found . If the sector that holds this information is corrupt there is no way for the hard drive to locate this sector and it will return the result IDNF.

It appears that the error is referring to an LBA48 address, if so, then the addresses should both be valid, otherwise they are both invalid.
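
One way to test the out-of-range idea is to compare the two IDNF addresses against the drive's addressable range. The sketch below assumes the figures reported by smartctl for this drive (User Capacity 6,001,175,126,016 bytes, 512-byte logical sectors) and the two LBAs from its SMART error log; the function names are illustrative only.

```python
# Values taken from the smartctl output for this drive.
USER_CAPACITY_BYTES = 6_001_175_126_016
LOGICAL_SECTOR_BYTES = 512

# LBAs reported as "Error: IDNF at LBA = ..." in the SMART error log.
IDNF_LBAS = [0x4F206A80, 0x214E59DC0]

# Number of sectors addressable with the older 28-bit LBA scheme.
LBA28_LIMIT = 1 << 28


def total_sectors(capacity_bytes: int, sector_bytes: int) -> int:
    """Number of addressable logical sectors; valid LBAs are 0 .. n-1."""
    return capacity_bytes // sector_bytes


def lba_in_range(lba: int, sectors: int) -> bool:
    """True if the LBA addresses a sector that exists on the drive."""
    return 0 <= lba < sectors


if __name__ == "__main__":
    sectors = total_sectors(USER_CAPACITY_BYTES, LOGICAL_SECTOR_BYTES)
    print(f"drive has {sectors} logical sectors")
    for lba in IDNF_LBAS:
        print(
            f"LBA {lba:#x} = {lba}: "
            f"in range = {lba_in_range(lba, sectors)}, "
            f"needs LBA48 = {lba >= LBA28_LIMIT}"
        )
```

Under these assumptions both addresses fall inside the drive's 11,721,045,168 sectors, so neither request was simply past the end of the disk. That alone does not say whether the drive, the cabling, or the HBA is at fault.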

Any thoughts/comments? (The pool is RAID-Z2, and I have a burned in spare on the shelf if necessary, but I don’t want to condemn the drive if it’s something else.)

Samuel Tai

Never underestimate your own stupidity

NASbox

@Samuel Tai — This one is kind of confusing because:
5 Reallocated_Sector_Ct, 196 Reallocated_Event_Count, 197 Current_Pending_Sector, and 198 Offline_Uncorrectable are all 0!
AND the drive passed a long test. I’m really surprised that the long self test didn’t flag it.
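
The check described above (attributes 5, 196, 197 and 198 all reading zero) can be scripted so it is easy to repeat after each scrub. This is a minimal sketch that assumes the eight-column attribute layout shown in this thread's smartctl output; other smartctl output formats have more columns and would need a different raw-value index. The sample lines are copied from that output, and the function names are made up for illustration.

```python
# Attribute lines as printed by smartctl for this drive.
SMARTCTL_LINES = """\
  5 Reallocated_Sector_Ct   PO--CK   200   200   140    -    0
196 Reallocated_Event_Count -O--CK   200   200   000    -    0
197 Current_Pending_Sector  -O--CK   200   200   000    -    0
198 Offline_Uncorrectable   ----CK   100   253   000    -    0
"""

# Attribute IDs that usually move when the surface is going bad.
WATCHED_IDS = {5, 196, 197, 198}


def parse_raw_values(text: str) -> dict:
    """Map attribute ID -> (name, raw value) for each parsable line.

    Assumes columns: ID, NAME, FLAGS, VALUE, WORST, THRESH, FAIL, RAW_VALUE.
    """
    values = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 8 or not fields[0].isdigit():
            continue  # header, legend, or blank line
        if fields[7].isdigit():
            values[int(fields[0])] = (fields[1], int(fields[7]))
    return values


def nonzero_watched(values: dict) -> dict:
    """Return only the watched attributes whose raw value is not zero."""
    return {i: v for i, v in values.items() if i in WATCHED_IDS and v[1] != 0}


if __name__ == "__main__":
    flagged = nonzero_watched(parse_raw_values(SMARTCTL_LINES))
    print(flagged if flagged else "all watched attributes are 0")
```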

I’ll keep watching closely, but I’m wondering if damage caused by a bad shutdown is a likely cause. I have also had a number of crashes on my workstation, which has a R/W NFS share on the pool. There might have been some activity during that crash; I don’t know.

I also had 43 errors picked up by a scrub after a UPS failure that offlined the disk. I ran badblocks and a long SMART test, found nothing, and resilvered.

If these areas were not in use at the time, or they contained deleted data, they wouldn’t get picked up by a scrub. I don’t know how the SMART test determines a bad surface: whether it’s by reading ECC, or whether the drive actively reads/writes/compares. The pool is a library, so there are a lot of files that have been sitting untouched for years. Maybe it’s a bit of fade/bit rot: refresh the surface by resilvering, and we are good to go. The drive is definitely getting up in age, but usually surface errors show as surface errors, and if there is a real problem there are a few of them. I had a drive go several months ago, and it was pretty obvious it was going. I’ve never had a drive go in such a "sneaky" way.

Is my thinking on the subject sound? Thoughts?

Source

Cam status: SCSI Status Error

Dabbler

I built my TrueNAS a couple of months ago. Space is (temporarily) about 80% full, but the NAS is only used lightly with SMB.

I get this error about once a week:

smartctl -a /dev/da5:

zpool status tank:

dmesg | grep mps:

It is always disk5.
Read write cksum is either 0 0 0 or 0 1 0 — never more.
After a scrub it is fine — for a few days.

PSU and SAS controller (Dell H310) are new.

What I’ve tried:
- mounted a fan on the SAS controller
- checked and swapped power and SATA connectors (still the same cables) for disk5

I will change power and SATA cables next.

Could the SAS controller be faulty and lead to such an error?
Should I replace disk5? SMART has 2 stored errors for that disk.

Dabbler

After replacing the SAS/SATA cable, the machine ran smoothly for 12 days.
But now the error is back!

SMART results are good. zpool status has 1 Write error, resilvered 84K.

I just realized the following:
The last few times the error / command was very similar:
(da5:mps0:0:5:0): WRITE(16). CDB: 8a 00 00 00 00 01 01 c8 e8 08 00 00 00 08 00 00
(da5:mps0:0:5:0): WRITE(16). CDB: 8a 00 00 00 00 02 7b 77 df 08 00 00 00 08 00 00
(da5:mps0:0:5:0): WRITE(16). CDB: 8a 00 00 00 00 02 68 66 7d 98 00 00 00 10 00 00
(da5:mps0:0:5:0): WRITE(16). CDB: 8a 00 00 00 00 02 68 61 9e b0 00 00 00 08 00 00
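
To compare these commands more precisely, the CDB bytes can be decoded. In a WRITE(16) CDB, byte 0 is the opcode (0x8a), bytes 2 to 9 hold the big-endian LBA, and bytes 10 to 13 hold the transfer length in blocks. The sketch below applies that layout to the four lines above; the function name is illustrative.

```python
def decode_write16(cdb_hex: str) -> tuple:
    """Return (lba, transfer_length_in_blocks) from a WRITE(16) CDB hex string."""
    cdb = bytes.fromhex(cdb_hex)  # whitespace between bytes is ignored
    if len(cdb) != 16 or cdb[0] != 0x8A:
        raise ValueError("not a 16-byte WRITE(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")
    blocks = int.from_bytes(cdb[10:14], "big")
    return lba, blocks


# CDBs copied from the log lines above.
FAILED_WRITES = [
    "8a 00 00 00 00 01 01 c8 e8 08 00 00 00 08 00 00",
    "8a 00 00 00 00 02 7b 77 df 08 00 00 00 08 00 00",
    "8a 00 00 00 00 02 68 66 7d 98 00 00 00 10 00 00",
    "8a 00 00 00 00 02 68 61 9e b0 00 00 00 08 00 00",
]

if __name__ == "__main__":
    for cdb in FAILED_WRITES:
        lba, blocks = decode_write16(cdb)
        print(f"LBA {lba:#x} ({lba}), {blocks} blocks")
```

Decoded this way, the four writes target four different LBAs with lengths of 8 or 16 blocks, so they are similar in shape but do not all hit the same address.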

Could it be a defective sector on the drive?
Can I leave the disk installed as long as it’s just those single write errors? Or do you recommend replacing that drive?

Source

Degraded pool: Please help me fix it.

Mike G

Dabbler

Hello, and thank you all in advance for any help you can provide for me.

My FreeNAS system is composed of the following hardware, built in January 2016:

Six (6) WD Red 4TB WD40EFRX NAS hard drives that are NOT connected to the Marvell SATA ports. Instead, they are on the four blue and two white SATA ports, all adjacent to each other.

Four (4) of MICRON MT18KSF1G72AZ-1G6E1ZE 8GB (1X8GB)1600MHZ PC3-12800 CL11 ECC REGISTERED DUAL RANK DDR3 SDRAM 240 PIN DIMM

Power Supply: Antec Earthwatts Green 380W EA-380E HT

I don’t know if the SATA controller hardware on my ASRock C2750D4I is bad, or if I have a bad WD Red drive, or something else.

Starting on February 21st my daily system report emails showed long strings that repeat this:

(ada3:ahcich13:0:0:0): READ_DMA48. ACB: 25 00 68 38 7c 40 84 01 00 00 d0 00
> (ada3:ahcich13:0:0:0): CAM status: ATA Status Error
> (ada3:ahcich13:0:0:0): ATA status: 51 (DRDY SERV ERR), error: 40 (UNC )
> (ada3:ahcich13:0:0:0): RES: 51 40 af 38 7c 40 84 01 00 7f 00
> (ada3:ahcich13:0:0:0): Retrying command

or
> (ada3:ahcich13:0:0:0): Retrying command
> (ada3:ahcich13:0:0:0): READ_FPDMA_QUEUED. ACB: 60 e8 48 40 3f 40 83 01 00 00 00 00
> (ada3:ahcich13:0:0:0): CAM status: ATA Status Error
> (ada3:ahcich13:0:0:0): ATA status: 41 (DRDY ERR), error: 40 (UNC )
> (ada3:ahcich13:0:0:0): RES: 41 40 9f 40 3f 40 83 01 00 00 00
> (ada3:ahcich13:0:0:0): Error 5, Retries exhausted
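
The `ATA status` and `error` values in these lines are bit fields, and decoding them shows which condition the drive reported. The sketch below uses the bit names as FreeBSD prints them, to the best of my understanding; only DRDY, SERV, ERR, UNC and IDNF are confirmed by logs quoted on this page, so treat the other names as assumptions.

```python
# (mask, name) pairs, most significant bit first.
STATUS_BITS = [
    (0x80, "BSY"), (0x40, "DRDY"), (0x20, "DF"), (0x10, "SERV"),
    (0x08, "DRQ"), (0x04, "CORR"), (0x02, "IDX"), (0x01, "ERR"),
]
ERROR_BITS = [
    (0x80, "ICRC"), (0x40, "UNC"), (0x20, "MC"), (0x10, "IDNF"),
    (0x08, "MCR"), (0x04, "ABRT"), (0x02, "NM"), (0x01, "ILI"),
]


def decode_bits(value: int, table: list) -> list:
    """Return the names of all bits that are set in value."""
    return [name for mask, name in table if value & mask]


if __name__ == "__main__":
    # Values from the ada3 log lines above.
    print("status 0x51:", decode_bits(0x51, STATUS_BITS))
    print("status 0x41:", decode_bits(0x41, STATUS_BITS))
    print("error  0x40:", decode_bits(0x40, ERROR_BITS))
    # Error register value 0x10, as shown for the IDNF entries
    # in the smartctl error log elsewhere on this page.
    print("error  0x10:", decode_bits(0x10, ERROR_BITS))
```

UNC (0x40) means the drive could not read the data back, while IDNF (0x10) means it could not find the requested sector, so the two threads on this page are reporting different conditions.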

So I assumed my ada3 device has an issue, and after looking at forum threads I tried to do some short and long SMART tests, although I don’t really know how to read the long test results. I saved most of the output of whatever test I ran and I didn’t judge it a bad result, but I am not well educated enough on this to troubleshoot properly. I checked my SATA and power cable connections on each end and rebooted, but when the ATA status errors persisted, I swapped out the SATA cable. The errors remained. I have not yet tried to move any SATA connections on the motherboard; I am just scared to try anything without expert guidance, and I read that I shouldn’t use the Marvell ports.

On March 26th I got an email report with the following information, so I turned the NAS off and unplugged it:

NAME          SIZE   ALLOC  FREE   EXPANDSZ  FRAG  CAP  DEDUP  HEALTH    ALTROOT
freenas-boot  29.8G  562M   29.2G  -         -     1%   1.00x  ONLINE    -
pool1         21.8T  13.4T  8.39T  -         10%   61%  1.00x  DEGRADED  /mnt

  pool: pool1
 state: DEGRADED
status: One or more devices has been removed by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: scrub in progress since Sun Mar 25 00:00:04 2018
        7.02T scanned out of 13.3T at 75.7M/s, 24h19m to go
        0 repaired, 52.61% done
config:

        NAME                                            STATE     READ WRITE CKSUM
        pool1                                           DEGRADED     0     0     0
          raidz1-0                                      DEGRADED     0     0     0
            gptid/fd6d1689-88f5-11e5-ae91-d05099c00684  ONLINE       0     0     0
            gptid/fe521d35-88f5-11e5-ae91-d05099c00684  ONLINE       0     0     0
            gptid/ff38e329-88f5-11e5-ae91-d05099c00684  ONLINE       0     0     0
            14083642073395126713                        REMOVED      0     0     0  was /dev/gptid/0021127e-88f6-11e5-ae91-d05099c00684
            gptid/010698d4-88f6-11e5-ae91-d05099c00684  ONLINE       0     0     0
            gptid/01ebee95-88f6-11e5-ae91-d05099c00684  ONLINE       0     0     0

Just today I booted this NAS back up and looked in the BIOS, because I knew the system time had been wrong for a while and wanted to fix it. I then realized that the BIOS did not show SMART as enabled, so I enabled it. I had been trying to set the SMART schedules to run for quite some time, but I couldn’t tell if they were working or how to make an output file to review. Anyway, once it booted up, I saw this:

This is where I ask for help and wait to be told that my SMART isn’t turned on/get yelled at for not doing things right and not reading the manual. I tried to read the manual, and I feel like I made a good attempt at selecting hardware based on the advice on this board at the time. If I need to buy a new drive, let me know. If you have any instructions for me, please give me as much detail in the steps as possible, since I have to do internet searches to find the proper commands.
Thanks.

Source

cam status ATA status error

Aschtra

Dabbler

I’ve been experiencing some problems with my FreeNAS server. When I run dmesg from the shell, I see a lot of "CAM status: ATA Status Error" messages, and I don’t know what they mean. Here is a screenshot from when it occurred again. It does not always occur:

I have the following hardware:

HP Microserver Gen8
16GB ECC RAM
4x4TB WD Red NAS drives (which I bought new 2.5 weeks ago to replace my previous 4x2TB drives)
FreeNAS 11.1-U4

The error message says something about ada1, but all the hard drives are new, so I don’t expect anything to be wrong with it.
I also got this error sometimes before I replaced my 4 drives. I searched for this error on Google, but everyone’s situation is different, so that’s why I made a new topic for this.
I read about replacing SATA cables, but the Gen8 server does not use normal SATA cables. It uses this one:

Zpool status -v output:

Does anyone have an idea what I can do to fix this?

kdragon75

Wizard

Aschtra

Dabbler

I will reseat them first. I did not burn my drives in. Don’t know how.
I just did a short smartctl test and I think these are the results?:

I am afraid I can’t really reseat the sata cables. They look like this (yes I put the screws back in again)

kdragon75

Wizard

Aschtra

Dabbler

Done that. I’ll wait.
My server is running the following:

— Plex
— Radarr
— Sonarr
— Sabnzbd

4x4TB in a raidz1.

Anything else I can do for now to provide more information maybe?

It happened again, right after my girlfriend tried to watch a show on Plex. I will try something with that file, delete it or something, then play another show and see if that one works.

kdragon75

Wizard

Aschtra

Dabbler

Yes always on ada1. I’ve replaced the episode she was watching with a different one and now it doesn’t occur anymore when she watches it. Weird issue, it crashed my whole freenas.. :p

Will keep an eye on it though.

kdragon75

Wizard

Aschtra

Dabbler

The drives are brand new so I don’t suspect the HDD right now. Will switch them up though just to see what happens.

I will also do more scrubs. I had them scheduled every 35 days but will run them every 7 days now.

Jailer

Not strong, but bad

kdragon75

Wizard

Exactly. A brand spanking new drive is more likely to fail than a drive that’s been running for 6 months.

As @Jailer mentioned, smart tests long and short need to be scheduled to help monitor and test your drives.

Johnnie Black

It may be new but this is not a good sign:

kdragon75

Wizard

Yeah, it’s normal to have some raw read errors and to have the count go up over the life of the drive, but if it’s shooting up, better get a new drive.

Do some searching on the forum about drive testing and burn-in. It’s important for all drives but especially new drives.

FreeNAS Generalissimo

Chris Moore

Hall of Famer

I have a server at work that has 60 hard drives in it and in the first 6 months four of them failed. One of them catastrophically.
Just because a drive is new, that doesn’t mean it can’t fail.


Source

I tried searching, but I couldn’t find a readable explanation of what causes/what is happening when an IDNF error is reported. (Is this a non-existent LBA or is it something else?)

I got a message about an unrecoverable error on my main pool, so I had a look at the SMART parameters, and it appears that the issue had to do with the error message: Error: IDNF at LBA = .... The first time it happened, I was able to clear the error and I had about 100 KB resilvered. I ran badblocks (read-only) on the disk as well as a full SMART surface test and a zpool scrub. Other than the IDNF errors, there is no other indication of a problem.

I had a recurrence of the pool error condition, and that was easily cleared, and the pool passed a scrub without incident.

I have included a grep of the relevant messages from /var/log/messages as well as the output from smartctl below. Any insight/assistance would be much appreciated. Thanks in advance.

Log Entries Related to First incident

Code:

Sep 24 05:31:36 freenas (da3:mps0:0:5:0): CAM status: SCSI Status Error
Sep 24 05:31:36 freenas (da3:mps0:0:5:0): SCSI status: Check Condition
Sep 24 05:31:36 freenas (da3:mps0:0:5:0): SCSI sense: ILLEGAL REQUEST asc:21,0 (Logical block address out of range)
Sep 24 05:31:36 freenas (da3:mps0:0:5:0): Info: 0x214e59dc0
Sep 24 05:31:36 freenas (da3:mps0:0:5:0): Error 22, Unretryable error

Log Entries Related to Second incident

Code:

Oct  2 14:02:00 freenas mps0: Controller reported scsi ioc terminated tgt 5 SMID 1646 loginfo 31080000
Oct  2 14:02:00 freenas mps0: (da3:mps0:0:5:0): WRITE(10). CDB: 2a 00 4f 20 6a b0 00 00 30 00
Oct  2 14:02:00 freenas Controller reported scsi ioc terminated tgt 5 SMID 909 loginfo 31080000
Oct  2 14:02:00 freenas mps0: Controller reported scsi ioc terminated tgt 5 SMID 1816 loginfo 31080000
Oct  2 14:02:00 freenas mps0: Controller reported scsi ioc terminated tgt 5 SMID 1934 loginfo 31080000
Oct  2 14:02:00 freenas mps0: (da3:mps0:0:5:0): CAM status: CCB request completed with an error
Oct  2 14:02:00 freenas (da3:mps0:0:5:0): Retrying command, 3 more tries remain
Oct  2 14:02:00 freenas Controller reported scsi ioc terminated tgt 5 SMID 147 loginfo 31080000
Oct  2 14:02:00 freenas mps0: Controller reported scsi ioc terminated tgt 5 SMID 812 loginfo 31080000
Oct  2 14:02:00 freenas mps0: Controller reported scsi ioc terminated tgt 5 SMID 521 loginfo 31080000
Oct  2 14:02:00 freenas mps0: Controller reported scsi ioc terminated tgt 5 SMID 2072 loginfo 31080000
Oct  2 14:02:00 freenas mps0: Controller reported scsi ioc terminated tgt 5 SMID 1786 loginfo 31080000
Oct  2 14:02:00 freenas (da3:mps0:0:5:0): WRITE(10). CDB: 2a 00 4f 20 6a 58 00 00 28 00
Oct  2 14:02:00 freenas (da3:mps0:0:5:0): CAM status: CCB request completed with an error
Oct  2 14:02:00 freenas (da3:mps0:0:5:0): Retrying command, 3 more tries remain
Oct  2 14:02:00 freenas (da3:mps0:0:5:0): WRITE(10). CDB: 2a 00 4f 20 6b 08 00 00 30 00
Oct  2 14:02:00 freenas (da3:mps0:0:5:0): CAM status: CCB request completed with an error
Oct  2 14:02:00 freenas (da3:mps0:0:5:0): Retrying command, 3 more tries remain
Oct  2 14:02:00 freenas (da3:mps0:0:5:0): WRITE(10). CDB: 2a 00 4f 20 6a e0 00 00 28 00
Oct  2 14:02:00 freenas (da3:mps0:0:5:0): CAM status: CCB request completed with an error
Oct  2 14:02:00 freenas (da3:mps0:0:5:0): Retrying command, 3 more tries remain
Oct  2 14:02:00 freenas (da3:mps0:0:5:0): WRITE(10). CDB: 2a 00 4f 20 6b 38 00 00 28 00
Oct  2 14:02:00 freenas (da3:mps0:0:5:0): CAM status: CCB request completed with an error
Oct  2 14:02:00 freenas (da3:mps0:0:5:0): Retrying command, 3 more tries remain
Oct  2 14:02:00 freenas (da3:mps0:0:5:0): WRITE(10). CDB: 2a 00 4f 20 6b 90 00 00 30 00
Oct  2 14:02:00 freenas (da3:mps0:0:5:0): CAM status: CCB request completed with an error
Oct  2 14:02:00 freenas (da3:mps0:0:5:0): Retrying command, 3 more tries remain
Oct  2 14:02:00 freenas (da3:mps0:0:5:0): WRITE(10). CDB: 2a 00 4f 20 6b c0 00 00 28 00
Oct  2 14:02:00 freenas (da3:mps0:0:5:0): CAM status: CCB request completed with an error
Oct  2 14:02:00 freenas (da3:mps0:0:5:0): Retrying command, 3 more tries remain
Oct  2 14:02:00 freenas (da3:mps0:0:5:0): WRITE(10). CDB: 2a 00 4f 20 6b 68 00 00 28 00
Oct  2 14:02:00 freenas (da3:mps0:0:5:0): CAM status: CCB request completed with an error
Oct  2 14:02:00 freenas (da3:mps0:0:5:0): Retrying command, 3 more tries remain
Oct  2 14:02:00 freenas (da3:mps0:0:5:0): WRITE(10). CDB: 2a 00 4f 20 6b e8 00 00 30 00
Oct  2 14:02:00 freenas (da3:mps0:0:5:0): CAM status: CCB request completed with an error
Oct  2 14:02:01 freenas (da3:mps0:0:5:0): Retrying command, 3 more tries remain
Oct  2 14:02:01 freenas (da3:mps0:0:5:0): WRITE(10). CDB: 2a 00 4f 20 6a 80 00 00 30 00
Oct  2 14:02:01 freenas (da3:mps0:0:5:0): CAM status: SCSI Status Error
Oct  2 14:02:01 freenas (da3:mps0:0:5:0): SCSI status: Check Condition
Oct  2 14:02:01 freenas (da3:mps0:0:5:0): SCSI sense: ILLEGAL REQUEST asc:21,0 (Logical block address out of range)
Oct  2 14:02:01 freenas (da3:mps0:0:5:0): Info: 0x4f206a80
Oct  2 14:02:01 freenas (da3:mps0:0:5:0): Error 22, Unretryable error
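
The failing command in this second incident can be tied to the sense data by decoding its CDB. In a WRITE(10) CDB, byte 0 is the opcode (0x2a), bytes 2 to 5 hold the big-endian LBA, and bytes 7 to 8 hold the transfer length in blocks. The sketch below decodes the last WRITE(10) in the log and compares it with the `Info:` value that follows it; the function name is illustrative.

```python
def decode_write10(cdb_hex: str) -> tuple:
    """Return (lba, transfer_length_in_blocks) from a WRITE(10) CDB hex string."""
    cdb = bytes.fromhex(cdb_hex)  # whitespace between bytes is ignored
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a 10-byte WRITE(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")
    blocks = int.from_bytes(cdb[7:9], "big")
    return lba, blocks


# Last WRITE(10) before the "Unretryable error" line, and the sense Info value.
FAILED_CDB = "2a 00 4f 20 6a 80 00 00 30 00"
SENSE_INFO = 0x4F206A80

if __name__ == "__main__":
    lba, blocks = decode_write10(FAILED_CDB)
    print(f"LBA {lba:#x}, {blocks} blocks, matches sense Info: {lba == SENSE_INFO}")
```

The decoded LBA (0x4f206a80) and length (0x30 = 48 blocks) match both the sense `Info:` value and the WRITE FPDMA QUEUED entry recorded as Error 6 in the drive's own SMART error log, which suggests the "Logical block address out of range" sense reported on the SCSI side and the drive's IDNF refer to the same write.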

Latest Smart Output:

Code:

smartctl 7.2 2020-12-30 r5155 [FreeBSD 12.2-RELEASE-p9 amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD60EFRX-68MYMN1
Serial Number:    WD-REDACTED
LU WWN Device Id: 5 0014ee 20b559a84
Firmware Version: 82.00A82
User Capacity:    6,001,175,126,016 bytes [6.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5700 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Mon Oct  4 16:21:41 2021 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM feature is:   Unavailable
Rd look-ahead is: Enabled
Write cache is:   Enabled
DSN feature is:   Unavailable
ATA Security is:  Disabled, NOT FROZEN [SEC1]
Wt Cache Reorder: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)    Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)    The previous self-test routine completed
                    without error or no self-test has ever
                    been run.
Total time to complete Offline
data collection:         ( 4784) seconds.
Offline data collection
capabilities:              (0x7b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003)    Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01)    Error logging supported.
                    General Purpose Logging supported.
Short self-test routine
recommended polling time:      (   2) minutes.
Extended self-test routine
recommended polling time:      ( 702) minutes.
Conveyance self-test routine
recommended polling time:      (   5) minutes.
SCT capabilities:            (0x303d)    SCT Status supported.
                    SCT Error Recovery Control supported.
                    SCT Feature Control supported.
                    SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    0
  3 Spin_Up_Time            POS--K   221   190   021    -    7941
  4 Start_Stop_Count        -O--CK   100   100   000    -    472
  5 Reallocated_Sector_Ct   PO--CK   200   200   140    -    0
  7 Seek_Error_Rate         -OSR-K   200   200   000    -    0
  9 Power_On_Hours          -O--CK   031   031   000    -    50928
 10 Spin_Retry_Count        -O--CK   100   100   000    -    0
 11 Calibration_Retry_Count -O--CK   100   100   000    -    0
 12 Power_Cycle_Count       -O--CK   100   100   000    -    113
192 Power-Off_Retract_Count -O--CK   200   200   000    -    77
193 Load_Cycle_Count        -O--CK   187   187   000    -    40408
194 Temperature_Celsius     -O---K   108   095   000    -    44
196 Reallocated_Event_Count -O--CK   200   200   000    -    0
197 Current_Pending_Sector  -O--CK   200   200   000    -    0
198 Offline_Uncorrectable   ----CK   100   253   000    -    0
199 UDMA_CRC_Error_Count    -O--CK   200   200   000    -    0
200 Multi_Zone_Error_Rate   ---R--   200   200   000    -    0
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01           SL  R/O      1  Summary SMART error log
0x02           SL  R/O      5  Comprehensive SMART error log
0x03       GPL     R/O      6  Ext. Comprehensive SMART error log
0x06           SL  R/O      1  SMART self-test log
0x07       GPL     R/O      1  Extended self-test log
0x09           SL  R/W      1  Selective self-test log
0x10       GPL     R/O      1  NCQ Command Error log
0x11       GPL     R/O      1  SATA Phy Event Counters log
0x21       GPL     R/O      1  Write stream error log
0x22       GPL     R/O      1  Read stream error log
0x30       GPL,SL  R/O      9  IDENTIFY DEVICE data log
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log
0xa0-0xa7  GPL,SL  VS      16  Device vendor specific log
0xa8-0xb6  GPL,SL  VS       1  Device vendor specific log
0xb7       GPL,SL  VS      40  Device vendor specific log
0xbd       GPL,SL  VS       1  Device vendor specific log
0xc0       GPL,SL  VS       1  Device vendor specific log
0xc1       GPL     VS      93  Device vendor specific log
0xe0       GPL,SL  R/W      1  SCT Command/Status
0xe1       GPL,SL  R/W      1  SCT Data Transfer

SMART Extended Comprehensive Error Log Version: 1 (6 sectors)
Device Error Count: 6
    CR     = Command Register
    FEATR  = Features Register
    COUNT  = Count (was: Sector Count) Register
    LBA_48 = Upper bytes of LBA High/Mid/Low Registers ]  ATA-8
    LH     = LBA High (was: Cylinder High) Register    ]   LBA
    LM     = LBA Mid (was: Cylinder Low) Register      ] Register
    LL     = LBA Low (was: Sector Number) Register     ]
    DV     = Device (was: Device/Head) Register
    DC     = Device Control Register
    ER     = Error register
    ST     = Status register
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 6 [5] occurred at disk power-on lifetime: 50878 hours (2119 days + 22 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  10 -- 51 00 00 00 00 4f 20 6a 80 40 00  Error: IDNF at LBA = 0x4f206a80 = 1327524480

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  61 00 30 00 00 00 00 4f 20 6a 80 40 00  7d+03:19:07.767  WRITE FPDMA QUEUED
  ea 00 00 00 00 00 00 00 00 00 00 40 00  7d+03:19:06.998  FLUSH CACHE EXT
  61 00 08 00 10 00 02 ba a0 f4 00 40 00  7d+03:19:06.997  WRITE FPDMA QUEUED
  61 00 08 00 00 00 02 ba a0 f2 00 40 00  7d+03:19:06.997  WRITE FPDMA QUEUED
  61 00 08 00 08 00 00 00 40 04 00 40 00  7d+03:19:06.997  WRITE FPDMA QUEUED

Error 5 [4] occurred at disk power-on lifetime: 50677 hours (2111 days + 13 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  10 -- 51 00 00 00 02 14 e5 9d c0 40 00  Error: IDNF at LBA = 0x214e59dc0 = 8940527040

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  61 00 08 00 00 00 02 14 e5 9d c0 40 00 41d+02:52:12.844  WRITE FPDMA QUEUED
  ea 00 00 00 00 00 00 00 00 00 00 40 00 41d+02:51:52.870  FLUSH CACHE EXT
  61 00 08 00 00 00 02 ba a0 f4 48 40 00 41d+02:51:52.869  WRITE FPDMA QUEUED
  61 00 08 00 10 00 02 ba a0 f2 48 40 00 41d+02:51:52.869  WRITE FPDMA QUEUED
  61 00 08 00 08 00 00 00 40 04 48 40 00 41d+02:51:52.869  WRITE FPDMA QUEUED

Error 4 [3] occurred at disk power-on lifetime: 45881 hours (1911 days + 17 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 00 00 00 02 16 47 8b f0 40 00  Error: UNC at LBA = 0x216478bf0 = 8963722224

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  60 00 30 00 00 00 02 16 47 8b f0 40 00  3d+05:45:24.873  READ FPDMA QUEUED
  60 00 28 00 00 00 02 16 47 8b 98 40 00  3d+05:45:24.832  READ FPDMA QUEUED
  60 00 30 00 00 00 02 16 47 8b 68 40 00  3d+05:45:24.832  READ FPDMA QUEUED
  60 00 28 00 00 00 02 16 47 8b 40 40 00  3d+05:45:24.832  READ FPDMA QUEUED
  60 00 28 00 00 00 02 16 47 8b 10 40 00  3d+05:45:24.831  READ FPDMA QUEUED

Error 3 [2] occurred at disk power-on lifetime: 44438 hours (1851 days + 14 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 00 00 00 02 5a 4c 71 00 40 00  Error: UNC at LBA = 0x25a4c7100 = 10104893696

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  60 00 30 00 00 00 02 5a 4c 71 00 40 00  7d+06:54:25.014  READ FPDMA QUEUED
  60 00 28 00 00 00 02 5a 4c 70 a8 40 00  7d+06:54:25.014  READ FPDMA QUEUED
  60 00 30 00 00 00 02 5a 4c 70 78 40 00  7d+06:54:25.013  READ FPDMA QUEUED
  60 00 28 00 08 00 02 5a 4c 70 50 40 00  7d+06:54:25.013  READ FPDMA QUEUED
  60 00 28 00 00 00 02 5a 4c 70 20 40 00  7d+06:54:25.013  READ FPDMA QUEUED

Error 2 [1] occurred at disk power-on lifetime: 43391 hours (1807 days + 23 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 00 00 00 01 70 9d 21 c0 40 00  Error: UNC at LBA = 0x1709d21c0 = 6184313280

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  60 00 30 00 00 00 01 70 9d 21 c0 40 00  8d+04:08:24.792  READ FPDMA QUEUED
  60 00 28 00 00 00 01 70 9d 21 98 40 00  8d+04:08:24.777  READ FPDMA QUEUED
  60 00 28 00 00 00 01 70 9d 21 68 40 00  8d+04:08:24.777  READ FPDMA QUEUED
  60 00 28 00 00 00 01 70 9d 1f 20 40 00  8d+04:08:24.776  READ FPDMA QUEUED
  60 00 28 00 00 00 01 70 9d 21 10 40 00  8d+04:08:24.776  READ FPDMA QUEUED

Error 1 [0] occurred at disk power-on lifetime: 40486 hours (1686 days + 22 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 00 00 00 02 b7 0b 30 d8 40 00  Error: UNC at LBA = 0x2b70b30d8 = 11660898520

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  60 00 58 00 08 00 02 b7 0b 89 f8 40 00 33d+00:38:34.307  READ FPDMA QUEUED
  60 00 58 00 00 00 02 b7 0b 30 d0 40 00 33d+00:38:34.296  READ FPDMA QUEUED
  60 00 58 00 08 00 02 b7 0a d7 c0 40 00 33d+00:38:34.284  READ FPDMA QUEUED
  60 00 60 00 00 00 02 b7 0a 7e 88 40 00 33d+00:38:34.284  READ FPDMA QUEUED
  60 00 58 00 00 00 02 b7 0a 25 48 40 00 33d+00:38:34.270  READ FPDMA QUEUED

SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     49665         -
# 2  Extended offline    Completed without error       00%     49598         -
# 3  Extended offline    Completed without error       00%     46102         -
# 4  Extended offline    Completed without error       00%     18012         -
# 5  Extended offline    Completed without error       00%       177         -
# 6  Extended offline    Completed without error       00%       127         -
# 7  Extended offline    Completed without error       00%        90         -
# 8  Extended offline    Completed without error       00%        55         -
# 9  Extended offline    Completed without error       00%        12         -
#10  Short offline       Completed without error       00%         0         -
#11  Conveyance offline  Completed without error       00%         0         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Status Version:                  3
SCT Version (vendor specific):       258 (0x0102)
Device State:                        Active (0)
Current Temperature:                    44 Celsius
Power Cycle Min/Max Temperature:     38/47 Celsius
Lifetime    Min/Max Temperature:      2/57 Celsius
Under/Over Temperature Limit Count:   0/0
Vendor specific:
00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

SCT Temperature History Version:     2
Temperature Sampling Period:         1 minute
Temperature Logging Interval:        1 minute
Min/Max recommended Temperature:      0/60 Celsius
Min/Max Temperature Limit:           -41/85 Celsius
Temperature History Size (Index):    478 (218)

Index    Estimated Time   Temperature Celsius
 219    2021-10-04 08:24    44  *************************
 ...    ..(476 skipped).    ..  *************************
 218    2021-10-04 16:21    44  *************************

SCT Error Recovery Control:
           Read:     70 (7.0 seconds)
          Write:     70 (7.0 seconds)

Device Statistics (GP/SMART Log 0x04) not supported

Pending Defects log (GP Log 0x0c) not supported

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0001  2            0  Command failed due to ICRC error
0x0002  2            0  R_ERR response for data FIS
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            0  R_ERR response for host-to-device data FIS
0x0005  2            0  R_ERR response for non-data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS
0x0008  2            0  Device-to-host non-data FIS retries
0x0009  2            1  Transition from drive PhyRdy to drive PhyNRdy
0x000a  2            4  Device-to-host register FISes sent due to a COMRESET
0x000b  2            0  CRC errors within host-to-device FIS
0x000f  2            0  R_ERR response for host-to-device data FIS, CRC
0x0012  2            0  R_ERR response for host-to-device non-data FIS, CRC
0x8000  4       797757  Vendor specific

Source

smartctl --xall /dev/sdj
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-8-amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD40EFAX-68JH4N0
Serial Number:    WD-...
LU WWN Device Id: ...
Firmware Version: 82.00A82
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Form Factor:      3.5 inches
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Fri Apr 17 12:44:20 2020 AEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM feature is:   Unavailable
Rd look-ahead is: Enabled
Write cache is:   Enabled
DSN feature is:   Unavailable
ATA Security is:  Disabled, NOT FROZEN [SEC1]
Wt Cache Reorder: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever 
                                        been run.
Total time to complete Offline 
data collection:                (21404) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 525) minutes.
Conveyance self-test routine
recommended polling time:        (   3) minutes.
SCT capabilities:              (0x3039) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    0
  3 Spin_Up_Time            POS--K   100   253   021    -    0
  4 Start_Stop_Count        -O--CK   100   100   000    -    1
  5 Reallocated_Sector_Ct   PO--CK   200   200   140    -    0
  7 Seek_Error_Rate         -OSR-K   200   200   000    -    0
  9 Power_On_Hours          -O--CK   100   100   000    -    540
 10 Spin_Retry_Count        -O--CK   100   253   000    -    0
 11 Calibration_Retry_Count -O--CK   100   253   000    -    0
 12 Power_Cycle_Count       -O--CK   100   100   000    -    1
192 Power-Off_Retract_Count -O--CK   200   200   000    -    0
193 Load_Cycle_Count        -O--CK   200   200   000    -    7
194 Temperature_Celsius     -O---K   126   113   000    -    21
196 Reallocated_Event_Count -O--CK   200   200   000    -    0
197 Current_Pending_Sector  -O--CK   200   200   000    -    0
198 Offline_Uncorrectable   ----CK   100   253   000    -    0
199 UDMA_CRC_Error_Count    -O--CK   200   200   000    -    0
200 Multi_Zone_Error_Rate   ---R--   200   200   000    -    0
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01           SL  R/O      1  Summary SMART error log
0x02           SL  R/O      5  Comprehensive SMART error log
0x03       GPL     R/O      6  Ext. Comprehensive SMART error log
0x04       GPL     R/O    256  Device Statistics log
0x04       SL      R/O      8  Device Statistics log
0x06           SL  R/O      1  SMART self-test log
0x07       GPL     R/O      1  Extended self-test log
0x09           SL  R/W      1  Selective self-test log
0x0c       GPL     R/O   2048  Pending Defects log
0x10       GPL     R/O      1  NCQ Command Error log
0x11       GPL     R/O      1  SATA Phy Event Counters log
0x24       GPL     R/O    294  Current Device Internal Status Data log
0x30       GPL,SL  R/O      9  IDENTIFY DEVICE data log
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log
0xa0-0xa7  GPL,SL  VS      16  Device vendor specific log
0xa8-0xb6  GPL,SL  VS       1  Device vendor specific log
0xb7       GPL,SL  VS      78  Device vendor specific log
0xb9       GPL,SL  VS       4  Device vendor specific log
0xbd       GPL,SL  VS       1  Device vendor specific log
0xc0       GPL,SL  VS       1  Device vendor specific log
0xc1       GPL     VS      93  Device vendor specific log
0xe0       GPL,SL  R/W      1  SCT Command/Status
0xe1       GPL,SL  R/W      1  SCT Data Transfer

SMART Extended Comprehensive Error Log Version: 1 (6 sectors)
Device Error Count: 7487 (device log contains only the most recent 24 errors)
        CR     = Command Register
        FEATR  = Features Register
        COUNT  = Count (was: Sector Count) Register
        LBA_48 = Upper bytes of LBA High/Mid/Low Registers ]  ATA-8
        LH     = LBA High (was: Cylinder High) Register    ]   LBA
        LM     = LBA Mid (was: Cylinder Low) Register      ] Register
        LL     = LBA Low (was: Sector Number) Register     ]
        DV     = Device (was: Device/Head) Register
        DC     = Device Control Register
        ER     = Error register
        ST     = Status register
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 7487 [22] occurred at disk power-on lifetime: 540 hours (22 days + 12 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  10 -- 51 00 00 00 01 d1 c0 bc 10 40 00  Error: IDNF at LBA = 0x1d1c0bc10 = 7814036496

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  61 00 38 00 20 00 01 a8 8f ca 78 40 00 22d+12:23:01.073  WRITE FPDMA QUEUED
  61 00 18 00 18 00 01 a8 8f ca b8 40 00 22d+12:23:01.073  WRITE FPDMA QUEUED
  61 00 10 00 10 00 00 02 32 ba 10 40 00 22d+12:23:01.073  WRITE FPDMA QUEUED
  61 00 10 00 08 00 01 d1 c0 bc 10 40 00 22d+12:23:01.073  WRITE FPDMA QUEUED
  61 00 10 00 00 00 01 d1 c0 ba 10 40 00 22d+12:23:01.073  WRITE FPDMA QUEUED

Error 7486 [21] occurred at disk power-on lifetime: 540 hours (22 days + 12 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  10 -- 51 00 00 00 01 a8 8f c9 78 40 00  Error: IDNF at LBA = 0x1a88fc978 = 7122962808

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  61 00 58 00 00 00 01 a8 8f c9 98 40 00 22d+12:22:53.109  WRITE FPDMA QUEUED
  61 00 18 00 20 00 01 a5 64 0d e8 40 00 22d+12:22:52.710  WRITE FPDMA QUEUED
  61 00 10 00 18 00 01 a5 64 0d d8 40 00 22d+12:22:52.383  WRITE FPDMA QUEUED
  61 00 10 00 10 00 01 a5 64 0d c8 40 00 22d+12:22:52.368  WRITE FPDMA QUEUED
  61 00 18 00 08 00 01 a8 8f c9 78 40 00 22d+12:22:52.368  WRITE FPDMA QUEUED

Error 7485 [20] occurred at disk power-on lifetime: 540 hours (22 days + 12 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  10 -- 51 00 00 00 01 a8 8e f9 78 40 00  Error: IDNF at LBA = 0x1a88ef978 = 7122909560

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  61 00 38 00 00 00 01 a8 8e f9 98 40 00 22d+12:22:15.406  WRITE FPDMA QUEUED
  61 00 40 00 38 00 01 a5 64 07 e8 40 00 22d+12:22:15.152  WRITE FPDMA QUEUED
  61 00 40 00 30 00 01 a5 64 07 a8 40 00 22d+12:22:14.895  WRITE FPDMA QUEUED
  61 00 20 00 28 00 01 a8 fc 7f 68 40 00 22d+12:22:14.895  WRITE FPDMA QUEUED
  61 00 28 00 20 00 01 a8 fc 7f 40 40 00 22d+12:22:14.895  WRITE FPDMA QUEUED

Error 7484 [19] occurred at disk power-on lifetime: 540 hours (22 days + 12 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  10 -- 51 00 00 00 01 80 96 22 e8 40 00  Error: IDNF at LBA = 0x1809622e8 = 6452290280

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  61 00 10 00 18 00 01 d1 c0 bc 10 40 00 22d+12:21:34.814  WRITE FPDMA QUEUED
  61 00 10 00 10 00 01 d1 c0 ba 10 40 00 22d+12:21:34.603  WRITE FPDMA QUEUED
  61 00 10 00 08 00 00 02 32 ba 10 40 00 22d+12:21:34.406  WRITE FPDMA QUEUED
  61 00 08 00 00 00 01 80 96 27 10 40 00 22d+12:21:34.362  WRITE FPDMA QUEUED
  61 00 08 00 20 00 01 80 96 22 e8 40 00 22d+12:21:33.737  WRITE FPDMA QUEUED

Error 7483 [18] occurred at disk power-on lifetime: 540 hours (22 days + 12 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  10 -- 51 00 00 00 01 a9 39 f2 e0 40 00  Error: IDNF at LBA = 0x1a939f2e0 = 7134114528

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  61 00 08 00 00 00 01 a9 3d 99 60 40 00 22d+12:21:26.189  WRITE FPDMA QUEUED
  61 00 08 00 08 00 01 a9 39 f2 e0 40 00 22d+12:21:26.189  WRITE FPDMA QUEUED
  61 00 08 00 00 00 01 a9 38 d8 10 40 00 22d+12:21:26.189  WRITE FPDMA QUEUED
  61 00 08 00 08 00 01 a9 38 81 68 40 00 22d+12:21:26.189  WRITE FPDMA QUEUED
  61 00 08 00 00 00 01 a8 d2 c5 30 40 00 22d+12:21:26.188  WRITE FPDMA QUEUED

Error 7482 [17] occurred at disk power-on lifetime: 540 hours (22 days + 12 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  10 -- 51 00 00 00 01 a8 8d 5f d8 40 00  Error: IDNF at LBA = 0x1a88d5fd8 = 7122804696

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  61 00 98 00 20 00 01 a8 8d 5f d8 40 00 22d+12:20:49.161  WRITE FPDMA QUEUED
  60 00 10 00 18 00 01 d1 c0 bc 10 40 00 22d+12:20:49.159  READ FPDMA QUEUED
  60 00 10 00 10 00 01 d1 c0 ba 10 40 00 22d+12:20:49.159  READ FPDMA QUEUED
  60 00 10 00 08 00 00 02 32 ba 10 40 00 22d+12:20:49.159  READ FPDMA QUEUED
  61 00 08 00 00 00 01 a8 8c 90 e8 40 00 22d+12:20:49.159  WRITE FPDMA QUEUED

Error 7481 [16] occurred at disk power-on lifetime: 540 hours (22 days + 12 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  10 -- 51 00 00 00 01 a8 8d 5f 98 40 00  Error: IDNF at LBA = 0x1a88d5f98 = 7122804632

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  61 00 10 00 18 00 01 a5 64 05 00 40 00 22d+12:20:46.550  WRITE FPDMA QUEUED
  61 00 18 00 10 00 01 a8 fa 57 60 40 00 22d+12:20:46.534  WRITE FPDMA QUEUED
  61 00 18 00 08 00 01 a8 8d 5f b8 40 00 22d+12:20:42.224  WRITE FPDMA QUEUED
  61 00 18 00 00 00 01 a8 8d 5f 98 40 00 22d+12:20:42.223  WRITE FPDMA QUEUED
  61 00 18 00 08 00 01 a8 8d 5f 78 40 00 22d+12:20:42.222  WRITE FPDMA QUEUED

Error 7480 [15] occurred at disk power-on lifetime: 540 hours (22 days + 12 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  10 -- 51 00 00 00 01 d1 c0 bc 10 40 00  Error: IDNF at LBA = 0x1d1c0bc10 = 7814036496

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  61 00 18 00 20 00 01 a5 64 03 58 40 00 22d+12:19:13.755  WRITE FPDMA QUEUED
  61 00 28 00 18 00 01 a8 f8 79 58 40 00 22d+12:19:13.745  WRITE FPDMA QUEUED
  61 00 20 00 00 00 01 a8 f8 7a d0 40 00 22d+12:19:13.743  WRITE FPDMA QUEUED
  61 00 10 00 10 00 00 02 32 ba 10 40 00 22d+12:19:08.985  WRITE FPDMA QUEUED
  61 00 10 00 08 00 01 d1 c0 bc 10 40 00 22d+12:19:08.985  WRITE FPDMA QUEUED

SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%         0         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Status Version:                  3
SCT Version (vendor specific):       258 (0x0102)
SCT Support Level:                   1
Device State:                        Active (0)
Current Temperature:                    21 Celsius
Power Cycle Min/Max Temperature:     16/31 Celsius
Lifetime    Min/Max Temperature:     16/31 Celsius
Under/Over Temperature Limit Count:   0/0
Vendor specific:
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

SCT Temperature History Version:     2
Temperature Sampling Period:         1 minute
Temperature Logging Interval:        1 minute
Min/Max recommended Temperature:      0/65 Celsius
Min/Max Temperature Limit:           -41/85 Celsius
Temperature History Size (Index):    478 (46)

Index    Estimated Time   Temperature Celsius
  47    2020-04-17 04:47    26  *******
  48    2020-04-17 04:48    26  *******
  49    2020-04-17 04:49    26  *******
  50    2020-04-17 04:50    27  ********
  51    2020-04-17 04:51    26  *******
 ...    ..(  6 skipped).    ..  *******
  58    2020-04-17 04:58    26  *******
  59    2020-04-17 04:59    25  ******
 ...    ..(  9 skipped).    ..  ******
  69    2020-04-17 05:09    25  ******
  70    2020-04-17 05:10    24  *****
 ...    ..(115 skipped).    ..  *****
 186    2020-04-17 07:06    24  *****
 187    2020-04-17 07:07    25  ******
 ...    ..( 12 skipped).    ..  ******
 200    2020-04-17 07:20    25  ******
 201    2020-04-17 07:21    26  *******
 ...    ..(  7 skipped).    ..  *******
 209    2020-04-17 07:29    26  *******
 210    2020-04-17 07:30    27  ********
 ...    ..( 77 skipped).    ..  ********
 288    2020-04-17 08:48    27  ********
 289    2020-04-17 08:49    26  *******
 ...    ..( 11 skipped).    ..  *******
 301    2020-04-17 09:01    26  *******
 302    2020-04-17 09:02    25  ******
 ...    ..( 11 skipped).    ..  ******
 314    2020-04-17 09:14    25  ******
 315    2020-04-17 09:15    24  *****
 ...    ..( 19 skipped).    ..  *****
 335    2020-04-17 09:35    24  *****
 336    2020-04-17 09:36    23  ****
 ...    ..( 19 skipped).    ..  ****
 356    2020-04-17 09:56    23  ****
 357    2020-04-17 09:57    24  *****
 ...    ..( 11 skipped).    ..  *****
 369    2020-04-17 10:09    24  *****
 370    2020-04-17 10:10    25  ******
 371    2020-04-17 10:11    24  *****
 ...    ..(  4 skipped).    ..  *****
 376    2020-04-17 10:16    24  *****
 377    2020-04-17 10:17    23  ****
 ...    ..(  9 skipped).    ..  ****
 387    2020-04-17 10:27    23  ****
 388    2020-04-17 10:28    24  *****
 ...    ..( 15 skipped).    ..  *****
 404    2020-04-17 10:44    24  *****
 405    2020-04-17 10:45    25  ******
 406    2020-04-17 10:46    24  *****
 ...    ..( 13 skipped).    ..  *****
 420    2020-04-17 11:00    24  *****
 421    2020-04-17 11:01    25  ******
 422    2020-04-17 11:02    24  *****
 ...    ..( 16 skipped).    ..  *****
 439    2020-04-17 11:19    24  *****
 440    2020-04-17 11:20    23  ****
 ...    ..(  5 skipped).    ..  ****
 446    2020-04-17 11:26    23  ****
 447    2020-04-17 11:27    22  ***
 448    2020-04-17 11:28    23  ****
 449    2020-04-17 11:29    23  ****
 450    2020-04-17 11:30    23  ****
 451    2020-04-17 11:31    22  ***
 452    2020-04-17 11:32    23  ****
 453    2020-04-17 11:33    23  ****
 454    2020-04-17 11:34    23  ****
 455    2020-04-17 11:35    24  *****
 456    2020-04-17 11:36    23  ****
 457    2020-04-17 11:37    23  ****
 458    2020-04-17 11:38    24  *****
 459    2020-04-17 11:39    24  *****
 460    2020-04-17 11:40    24  *****
 461    2020-04-17 11:41    23  ****
 462    2020-04-17 11:42    24  *****
 ...    ..( 17 skipped).    ..  *****
   2    2020-04-17 12:00    24  *****
   3    2020-04-17 12:01    23  ****
 ...    ..( 12 skipped).    ..  ****
  16    2020-04-17 12:14    23  ****
  17    2020-04-17 12:15    22  ***
  18    2020-04-17 12:16    23  ****
 ...    ..(  5 skipped).    ..  ****
  24    2020-04-17 12:22    23  ****
  25    2020-04-17 12:23    24  *****
  26    2020-04-17 12:24    24  *****
  27    2020-04-17 12:25    23  ****
  28    2020-04-17 12:26    19  -
  29    2020-04-17 12:27    19  -
  30    2020-04-17 12:28    21  **
  31    2020-04-17 12:29    21  **
  32    2020-04-17 12:30    22  ***
 ...    ..(  9 skipped).    ..  ***
  42    2020-04-17 12:40    22  ***
  43    2020-04-17 12:41    23  ****
  44    2020-04-17 12:42    23  ****
  45    2020-04-17 12:43    23  ****
  46    2020-04-17 12:44    21  **

SCT Error Recovery Control:
           Read:     70 (7.0 seconds)
          Write:     70 (7.0 seconds)

Device Statistics (GP/SMART Log 0x04) not supported

Pending Defects log (GP Log 0x0c) supported [please try: '-l defects']

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0001  2            0  Command failed due to ICRC error
0x0002  2            0  R_ERR response for data FIS
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            0  R_ERR response for host-to-device data FIS
0x0005  2            0  R_ERR response for non-data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS
0x0008  2            0  Device-to-host non-data FIS retries
0x0009  2           82  Transition from drive PhyRdy to drive PhyNRdy
0x000a  2           83  Device-to-host register FISes sent due to a COMRESET
0x000b  2            0  CRC errors within host-to-device FIS
0x000d  2            0  Non-CRC errors within host-to-device FIS
0x000f  2            0  R_ERR response for host-to-device data FIS, CRC
0x0012  2            0  R_ERR response for host-to-device non-data FIS, CRC
0x8000  4      1945390  Vendor specific
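
The register rows in the extended error log above can be decoded by hand. The following Python sketch is illustrative only. The names `ERROR_BITS` and `decode_ext_registers` are hypothetical and not part of smartmontools, and the code assumes the 13-column `ER -- ST COUNT LBA_48 LH LM LL DV DC` layout printed in the log. It joins the six LBA bytes into one 48-bit address, names the bits set in the error register, and compares the address with the sector count implied by the reported capacity and 512-byte logical sectors.

```python
# Names of ATA error-register bits that appear in the logs in this thread.
ERROR_BITS = {0x80: "ICRC", 0x40: "UNC", 0x10: "IDNF", 0x04: "ABRT"}


def decode_ext_registers(row):
    """Decode one 'After command completion' row of the extended error log.

    Assumed column order: ER -- ST COUNT(2) LBA_48(3) LH LM LL DV DC.
    Anything after the 13th column (e.g. 'Error: IDNF at ...') is ignored.
    """
    fields = row.split()[:13]
    if len(fields) != 13:
        raise ValueError("expected 13 register columns")
    error = int(fields[0], 16)
    status = int(fields[2], 16)
    # LBA_48 (3 bytes) followed by LH, LM, LL gives the 48-bit LBA, high byte first.
    lba = int("".join(fields[5:11]), 16)
    flags = [name for bit, name in ERROR_BITS.items() if error & bit]
    return {"error": error, "status": status, "flags": flags, "lba": lba}


row = "10 -- 51 00 00 00 01 d1 c0 bc 10 40 00  Error: IDNF at LBA = 0x1d1c0bc10"
decoded = decode_ext_registers(row)
total_sectors = 4_000_787_030_016 // 512  # User Capacity / logical sector size
print(decoded["flags"], decoded["lba"], decoded["lba"] < total_sectors)
# prints: ['IDNF'] 7814036496 True
```

Under these assumptions the failing LBA lies inside the drive's addressable range, so these IDNF entries are not simply requests past the end of the disk.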

Source

From: Bruce Allen <ba…@gr…> — 2004-05-10 22:05:22

Tobias,

I'm referring to:
http://www.hitachigst.com/tech/techlib.nsf/techdocs/53989D390D44D88F86256D1F0058368D
which are the OEM specs for your disk.

> SMART Error Log Version: 1

> Error 98 occurred at disk power-on lifetime: 1415 hours (58 days + 23
> hours)
>   When the command that caused the error occurred, the device was
> active or idle.
> 
>   After command completion occurred, registers were:
>   ER ST SC SN CL CH DH
>   -- -- -- -- -- -- --
>   10 59 01 e0 00 d6 e6  Error: IDNF at LBA = 0x06d600e0 = 114688224

This means that the specified sector, in this case Logical Block Address
114688224, was not found.   According to the specs for your disk, the
total number of sectors  is 117,210,240 (see page 25 of URL above).  So
the failure to find block 114,688,224 can mean one of only two things, I
think:

(1) drive is faulty, or
(2) drive has been set (using a software command) with a protected area
which you can't read.  This is done using the Read Native Max and Set Max
Address commands. These are documented in the manual above. There is
probably some standard utility for doing the 'Set Max Address' command.  
A few seconds of Google searching turned up this:
http://www.win.tue.nl/~aeb/linux/setmax.c

> The strange thing is that those IDNF at LBA errors appear to be in a
> space which isn't user-addressable.

According to the manual, it could/should be user addressable.

> My hard drive comes with a hidden Predesktop Area "partition" installed
> by the manufacturer - this area contains Win XP recovery data as well
> as some system tools and its default is to be invisible to operating
> systems (you can define this in the machine's BIOS).

That's the 'protected area'.  You're running into trouble, I think,
because of this.

> Here are the available hard drive sectors (hdparm's output):
> 
> LBA    user addressable sectors:  110833010 (Predesktop Area enabled)
> LBA    user addressable sectors:  117210244 (Predesktop Area disabled)
> 
> The IDNF at LBA errors are at 114688224

Exactly.  This may be because your disk partitioning utilities are writing
a partition table using ALL 117210244 sectors not the 110833010 that are
available.

> Are those errors a concern?

Yes.  There is some utility in your toolset (fdisk, cfdisk, mkfs?) that
appears to be trying to use more than 110833010 sectors.

> What about the invalid checksum of the Self-Test Log Structure?

It's the SELECTIVE Self-Test Log.  I wouldn't worry about it -- the disk
is ATA-6 not ATA-7, so the support for selective self-tests is not obeying
the ATA specs, which only introduced selective self-tests with ATA-7.

> Is there a way to correct those errors? I'm scared..

I think it's OK.

> The hard drive was in a perfect condition before I tried to configure
> my partitions with the SUSE installer. :/ There hasn't been any error
> before.

Use fdisk -lu to understand if the partition table extends beyond
110833010 sectors.  If so, fix it.  If not, try to understand what utility
is ignoring this limit and trying to read/write beyond it.

> Could you please CC me as I'm currently not subscribed to this list?

Likewise please continue to cc the list so that there is a record of this
in the mail archive.

Bruce
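
The arithmetic behind the reply above can be sketched in a few lines of Python. This is only an illustration: the function names `decode_lba28` and `classify_lba` are hypothetical, and the code assumes the legacy `ER ST SC SN CL CH DH` register layout, where in LBA mode the low four bits of DH carry LBA bits 27:24. It rebuilds the LBA from the quoted register bytes and checks where it falls relative to the two sector counts reported by hdparm.

```python
def decode_lba28(sn, cl, ch, dh):
    """Rebuild a 28-bit LBA from the legacy SN/CL/CH/DH register bytes."""
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


def classify_lba(lba, visible_sectors, native_sectors):
    """Say where an LBA falls relative to the visible and native sector counts."""
    if lba < visible_sectors:
        return "user-addressable"
    if lba < native_sectors:
        return "inside protected area"
    return "beyond native capacity"


# Register bytes from the quoted error: 10 59 01 e0 00 d6 e6
lba = decode_lba28(sn=0xE0, cl=0x00, ch=0xD6, dh=0xE6)
print(lba)  # prints: 114688224
# Sector counts as reported with the Predesktop Area enabled / disabled.
print(classify_lba(lba, 110833010, 117210244))  # prints: inside protected area
```

The failing block sits between the two limits, which is consistent with a tool addressing sectors that are hidden while the protected area is enabled.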

Source

The drive firmware runs the tests.
The details of the tests are described in, e.g., www.t13.org/Documents/UploadedDocuments/technical/e01137r0.pdf, which summarises the elements of the short and long tests thus:
1. an electrical segment wherein the drive tests its own electronics. The particular tests in this segment
  are vendor specific, but as examples: this segment might include such tests as a buffer RAM test, a
  read/write circuitry test, and/or a test of the read/write head elements.
2. a seek/servo segment wherein the drive tests its capability to find and servo on data tracks. The
  particular methodology used in this test is also vendor specific.
3. a read/verify scan segment wherein the drive performs read scanning of some portion of the disk
  surface. The amount and location of the surface scanned are dependent on the completion time
  constraint and are vendor specific.
4. The criteria for the extended self-test are the same as the short self-test with two exceptions: segment
  (3) of the extended self-test shall be a read/verify scan of all of the user data area, and there is no
  maximum time limit for the drive to perform the test.
It is safe to perform non-destructive testing while the OS is running, though some performance impact is likely. As the smartctl man page says for both -t short and -t long,

This command can be given in normal system operation (unless run in captive mode)

If you invoke captive mode with -C, smartctl assumes the drive can be busied-out to unavailability. This should not be done on a drive the OS is using.

As the man page also suggests, the offline testing (which simply means periodic background testing) is not reliable, and never officially became part of the ATA specifications. I run mine from cron, instead; that way I know when they should happen, and I can stop it if I need to.
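
As a rough sketch of what a scheduled job might run, the Python below builds and launches a non-captive self-test. The helper names `selftest_command` and `start_selftest` are hypothetical. It assumes smartctl is on the PATH and that the script runs with enough privileges to talk to the device. Only the `-t` and `-C` options mentioned above are used.

```python
import shutil
import subprocess


def selftest_command(device, kind="long", captive=False):
    """Build the smartctl argument list for a self-test on `device`."""
    if kind not in ("short", "long", "conveyance"):
        raise ValueError(f"unsupported test type: {kind}")
    cmd = ["smartctl", "-t", kind]
    if captive:
        # Captive mode busies out the drive; avoid it on a disk the OS is using.
        cmd.append("-C")
    cmd.append(device)
    return cmd


def start_selftest(device, kind="long"):
    """Start a background (non-captive) self-test and return smartctl's output."""
    if shutil.which("smartctl") is None:
        raise RuntimeError("smartctl not found on PATH")
    result = subprocess.run(
        selftest_command(device, kind),
        capture_output=True, text=True, check=False,
    )
    return result.stdout
```

A cron entry could then call a small script that invokes `start_selftest("/dev/sdb")`, with the device path adjusted to the system in question.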

The results can be seen in the smartctl output. Here’s one with a test running:

[root@risby images]# smartctl -a /dev/sdb
smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.1.6-201.fc22.x86_64] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org
[...]
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)   LBA_of_first_error
# 1  Extended offline    Completed without error       00%     20567         -
# 2  Extended offline    Completed without error       00%       486         -

SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
   1        0        0  Self_test_in_progress [90% left] (0-65535)
   2        0        0  Not_testing
   3        0        0  Not_testing
   4        0        0  Not_testing
   5        0        0  Not_testing

Note the two previously completed tests (at 20567 and 486 power-on hours, respectively) and the currently running one (10% complete).
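
The "10% complete" figure comes from the `[90% left]` field in the status column. A small sketch of how a script could extract it (the function name is made up for this example, and the status format is assumed to match the output shown above):

```python
import re

def selftest_progress(status):
    """Return percent complete for a 'Self_test_in_progress [NN% left]' status.

    Returns None when the status does not describe a running test.
    """
    match = re.search(r"\[(\d+)% left\]", status)
    if match is None:
        return None
    return 100 - int(match.group(1))

print(selftest_progress("Self_test_in_progress [90% left] (0-65535)"))  # 10
print(selftest_progress("Not_testing"))  # None
```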

Source

Is it good practice to ignore IDNF errors in SMART logs if no other
evidence of problems (including a SMART self-test and a badblocks run) has been
found? Does anyone have experience with this? The details follow.

While checking whether a 2-year-old drive (for my notebook) is still fine, I noticed the following errors in the SMART log (excerpt from smartctl -a /dev/sdd):

SMART Error Log Version: 1
ATA Error Count: 33 (device log contains only the most recent five errors)
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 33 occurred at disk power-on lifetime: 6 hours (0 days + 6 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  10 51 00 00 00 00 00  Error:

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  a1 00 00 00 00 00 a0 00      00:05:34.511  IDENTIFY PACKET DEVICE
  25 00 00 00 00 00 e0 ff      00:05:34.500  READ DMA EXT
  25 00 01 00 00 00 e0 00      00:05:30.790  READ DMA EXT
  25 00 01 00 00 00 e0 00      00:05:29.550  READ DMA EXT
  25 00 01 00 00 00 e0 00      00:05:29.549  READ DMA EXT

Error 32 occurred at disk power-on lifetime: 6 hours (0 days + 6 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  10 51 32 9c fd ff 0f  Error: IDNF 50 sectors at LBA = 0x0ffffd9c = 268434844

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 32 9c fd ff e0 00      00:00:51.163  READ DMA EXT
  25 00 32 9c fd ff 0f 04      00:00:51.156  READ DMA EXT
  25 00 32 9c fd ff e0 00      00:00:51.074  READ DMA EXT
  25 00 32 9c fd ff 0f 04      00:00:51.068  READ DMA EXT
  25 00 32 9c fd ff e0 00      00:00:50.985  READ DMA EXT

The remaining 3 errors stored in the SMART log are the same as the last one (IDNF error at 0x0ffffd9c).

As I understand it, the error means sector ID not found, so it does not directly mean that the drive is bad, but it’s still fishy.
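
To make the register dump above easier to read, here is a hedged sketch that decodes the ER and ST bytes and rebuilds the 28-bit LBA from the SN, CL, CH and DH registers. The bit names follow the classic ATA task-file layout as I understand it; later ATA revisions rename or obsolete some of them, so treat the tables as an assumption rather than a reference. The function names are invented for this example.

```python
# Error register (ER) bits, classic ATA naming (assumed; some are obsolete in newer specs).
ERROR_BITS = {
    0x80: "ICRC", 0x40: "UNC", 0x20: "MC", 0x10: "IDNF",
    0x08: "MCR", 0x04: "ABRT", 0x02: "NM", 0x01: "AMNF",
}

# Status register (ST) bits, classic ATA naming (assumed).
STATUS_BITS = {
    0x80: "BSY", 0x40: "DRDY", 0x20: "DF", 0x10: "DSC",
    0x08: "DRQ", 0x04: "CORR", 0x02: "IDX", 0x01: "ERR",
}

def decode_bits(value, names):
    """List the names of all bits set in a register value."""
    return [name for mask, name in names.items() if value & mask]

def lba28(sn, cl, ch, dh):
    """Combine the task-file registers into a 28-bit LBA.

    Only the low nibble of DH carries address bits.
    """
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn

# Registers of Error 32 above: ER=10 ST=51 SN=9c CL=fd CH=ff DH=0f
print(decode_bits(0x10, ERROR_BITS))    # ['IDNF']
print(decode_bits(0x51, STATUS_BITS))   # ['DRDY', 'DSC', 'ERR']
print(lba28(0x9C, 0xFD, 0xFF, 0x0F))    # 268434844
```

The computed LBA matches the value smartctl printed (0x0ffffd9c = 268434844), which sits 611 sectors below the 28-bit maximum of 0x0fffffff. Whether that proximity matters here is not something the log alone can answer.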

No SMART attributes show problems (e.g. no reallocated sectors):

# smartctl --attributes /dev/sdd
smartctl 6.2 2013-07-26 r3841 [i686-linux-3.13.10-200.fc20.i686] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   062    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   100   100   040    Pre-fail  Offline      -       0
  3 Spin_Up_Time            0x0007   217   217   033    Pre-fail  Always       -       1
  4 Start_Stop_Count        0x0012   099   099   000    Old_age   Always       -       1659
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   100   100   040    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0012   092   092   000    Old_age   Always       -       3856
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       1659
191 G-Sense_Error_Rate      0x000a   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       29
193 Load_Cycle_Count        0x0012   056   056   000    Old_age   Always       -       441546
194 Temperature_Celsius     0x0002   206   206   000    Old_age   Always       -       29 (Min/Max 14/46)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0
223 Load_Retry_Count        0x000a   100   100   000    Old_age   Always       -       0
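
Checking such a table by eye gets tedious, so here is a minimal sketch of how the attribute lines could be parsed and a handful of commonly watched counters tested for non-zero raw values. The watch list (IDs 5, 196, 197, 198, 199) is my own choice for illustration, not something smartctl defines, and the parser assumes the ten-column layout shown above.

```python
# Attribute IDs to watch: an illustrative selection, not an official list.
WATCHED_IDS = {5, 196, 197, 198, 199}

def parse_attribute(line):
    """Split one attribute row into its fields; RAW_VALUE may contain spaces."""
    fields = line.strip().split(None, 9)
    return {
        "id": int(fields[0]),
        "name": fields[1],
        "value": int(fields[3]),
        "worst": int(fields[4]),
        "thresh": int(fields[5]),
        "raw": fields[9],
    }

def nonzero_watched(lines):
    """Return names of watched attributes whose raw value does not start with 0."""
    flagged = []
    for line in lines:
        attr = parse_attribute(line)
        if attr["id"] in WATCHED_IDS and attr["raw"].split()[0] != "0":
            flagged.append(attr["name"])
    return flagged

SAMPLE = [
    "  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0",
    "194 Temperature_Celsius     0x0002   206   206   000    Old_age   Always       -       29 (Min/Max 14/46)",
    "197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0",
]
print(nonzero_watched(SAMPLE))  # []
```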

The SMART self-test also reports no problems:

# smartctl -t long /dev/sdd
... after a while ...
# smartctl -l selftest /dev/sdd
smartctl 6.2 2013-07-26 r3841 [i686-linux-3.13.10-200.fc20.i686] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      3853         -
# 2  Extended offline    Completed without error       00%      3851         -
# 3  Short offline       Completed without error       00%      3847         -

And just to be sure, I ran badblocks, overwriting the whole disk with random data, to check for bad sectors. It found no problems either:

# badblocks -s -w -v -t random /dev/sdd
Checking for bad blocks in read-write mode
From block 0 to 312571223
Testing with random pattern: done
Reading and comparing: done
Pass completed, 0 bad blocks found. (0/0/0 errors)
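
As a sanity check that the run covered the whole disk, the block range can be converted to bytes. This sketch assumes the badblocks default block size of 1024 bytes, since no -b option appears in the command above; the function name is hypothetical.

```python
def scanned_bytes(last_block, block_size=1024):
    """Bytes covered by a scan from block 0 through last_block (inclusive)."""
    return (last_block + 1) * block_size

total = scanned_bytes(312571223)
print(total)                   # 320072933376
print(round(total / 1e9, 2))   # 320.07
```

Under that assumption the scan covered about 320 GB, which is consistent with a 320 GB notebook drive.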

Source

Update: I have added output from smartctl short tests for all drives. One seems to be generating errors, but overall tests have passed.

A few days ago I set up a fresh software RAID10 (mdadm) using four 4 TB hard drives.
According to the output from mdadm -D and cat /proc/mdstat, all RAID devices are functioning properly. However, I noticed straight away that operations such as creating a directory take a few seconds about half of the time.

Resyncing is also taking an incredibly long time (up and running since 25th of October, but only 10% synced).

And lastly, since today, I have had issues accessing the mounted drive array. When using cd the shell hangs once I use tab to autocomplete the path past the drive’s root directory. When using df -h on a remote host that mounts the RAID array as a Samba share, I get the message ‘Host is down’ or ‘Resource temporarily unavailable’ after a minute or two.

$ sudo mdadm -D /dev/md1
/dev/md1:
           Version : 1.2
     Creation Time : Mon Oct 25 20:18:15 2021
        Raid Level : raid10
        Array Size : 7813769216 (7451.79 GiB 8001.30 GB)
     Used Dev Size : 3906884608 (3725.90 GiB 4000.65 GB)
      Raid Devices : 4
     Total Devices : 4
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Fri Oct 29 05:24:29 2021
             State : active, resyncing
    Active Devices : 4
   Working Devices : 4
    Failed Devices : 0
     Spare Devices : 0

            Layout : near=2
        Chunk Size : 512K

Consistency Policy : bitmap

     Resync Status : 10% complete

              Name : i7-harvester:1  (local to host i7-harvester)
              UUID : eb089c26:4b9b1dc7:7d9481f9:69e5a3df
            Events : 42562

    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync set-A   /dev/sdb1
       1       8       33        1      active sync set-B   /dev/sdc1
       2       8       49        2      active sync set-A   /dev/sdd1
       3       8       65        3      active sync set-B   /dev/sde1
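
The reported sizes are internally consistent: with a near=2 layout every block is stored twice, so the usable size should be the number of devices times the per-device size, divided by the number of copies. A quick check of that arithmetic (the function name is made up, and sizes are taken to be in KiB as mdadm prints them):

```python
def raid10_array_size(raid_devices, used_dev_size_kib, copies=2):
    """Usable RAID10 size in KiB for a given number of data copies."""
    return raid_devices * used_dev_size_kib // copies

size_kib = raid10_array_size(4, 3906884608)
print(size_kib)                         # 7813769216
print(round(size_kib / 1024 ** 2, 2))   # 7451.79
```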

(Note: the speed reported by mdstat used to be around 3000K/sec, but as of today it has dropped to 3K/sec.)

$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : active raid10 sde1[3] sdd1[2] sdc1[1] sdb1[0]
      7813769216 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]
      [==>..................]  resync = 10.7% (837141440/7813769216) finish=33547280.3min speed=3K/sec
      bitmap: 53/59 pages [212KB], 65536KB chunk
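
To put the two speeds in perspective, here is a sketch that parses the resync line and estimates the remaining time at a given rate. The regular expression is written against the exact line shown above and may need adjusting for other kernels; the function names are assumptions for this example.

```python
import re

RESYNC_RE = re.compile(
    r"resync\s*=\s*([\d.]+)%\s*\((\d+)/(\d+)\)"
    r"\s*finish=([\d.]+)min\s*speed=(\d+)K/sec"
)

def parse_resync(line):
    """Extract progress figures from an mdstat resync line, or None if absent."""
    m = RESYNC_RE.search(line)
    if m is None:
        return None
    return {
        "percent": float(m.group(1)),
        "done_kib": int(m.group(2)),
        "total_kib": int(m.group(3)),
        "finish_min": float(m.group(4)),
        "speed_kps": int(m.group(5)),
    }

def eta_days(done_kib, total_kib, speed_kps):
    """Days needed to sync the remaining blocks at a constant speed in K/sec."""
    return (total_kib - done_kib) / speed_kps / 86400

line = ("[==>..................]  resync = 10.7% (837141440/7813769216) "
        "finish=33547280.3min speed=3K/sec")
info = parse_resync(line)
print(round(eta_days(info["done_kib"], info["total_kib"], 3000), 1))  # about 27 days
print(round(eta_days(info["done_kib"], info["total_kib"], 3) / 365))  # decades
```

Even at the earlier rate of roughly 3000K/sec the remaining 90% would take several weeks, and at 3K/sec it would not finish in any useful time frame. Both figures point to a rate far below what drives of this class normally sustain.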

df on remote system

$ df -h
df: /mnt/data: Resource temporarily unavailable
Filesystem      Size  Used Avail Use% Mounted on
/dev/root        59G   13G   44G  22% /
devtmpfs        1.8G     0  1.8G   0% /dev
tmpfs           1.9G  8.0K  1.9G   1% /dev/shm
tmpfs           1.9G  202M  1.7G  11% /run
tmpfs           5.0M  4.0K  5.0M   1% /run/lock
tmpfs           1.9G     0  1.9G   0% /sys/fs/cgroup
/dev/mmcblk0p1  253M   49M  204M  20% /boot
/dev/sda1       1.8T  131G  1.6T   8% /mnt/nas

$ sudo smartctl -a /dev/sdb
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.11.0-38-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Hitachi/HGST Ultrastar 7K4000
Device Model:     HGST HUS724040ALA640
LU WWN Device Id: 5 000cca 22bf6d532
Add. Product Id:  DELL(tm)
Firmware Version: MFAOAB50
User Capacity:    4.000.787.030.016 bytes [4,00 TB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.0, 3.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sat Oct 30 09:11:05 2021 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (   24) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 533) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0004   138   138   000    Old_age   Offline      -       75
  3 Spin_Up_Time            0x0007   127   127   024    Pre-fail  Always       -       609 (Average 607)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       30
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000a   100   100   000    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0004   140   140   000    Old_age   Offline      -       26
  9 Power_On_Hours          0x0012   092   092   000    Old_age   Always       -       62287
 10 Spin_Retry_Count        0x0012   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       30
192 Power-Off_Retract_Count 0x0032   099   099   000    Old_age   Always       -       1473
193 Load_Cycle_Count        0x0012   099   099   000    Old_age   Always       -       1473
194 Temperature_Celsius     0x0002   157   157   000    Old_age   Always       -       38 (Min/Max 18/47)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0
223 Load_Retry_Count        0x000a   100   100   000    Old_age   Always       -       0
241 Total_LBAs_Written      0x0012   100   100   000    Old_age   Always       -       265905350843
242 Total_LBAs_Read         0x0012   100   100   000    Old_age   Always       -       1599707339787

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     62275         -
# 2  Short offline       Completed without error       00%     61911         -
# 3  Short offline       Completed without error       00%        12         -
# 4  Extended offline    Completed without error       00%        12         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
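
One field worth comparing across the four reports is the "SATA Version is" line, which lists both the maximum and the currently negotiated link speed. For this drive they agree (3.0 Gb/s), whereas the /dev/sde report further down shows a 6.0 Gb/s drive currently running at 1.5 Gb/s. A minimal sketch for spotting that automatically (function names are hypothetical, and the regular expression assumes the line format shown in these reports):

```python
import re

def sata_link_speeds(line):
    """Return (max_gbps, current_gbps) from a 'SATA Version is' line, or None."""
    m = re.search(r"([\d.]+) Gb/s \(current: ([\d.]+) Gb/s\)", line)
    if m is None:
        return None
    return float(m.group(1)), float(m.group(2))

def link_downgraded(line):
    """True when the negotiated speed is below the drive's maximum."""
    speeds = sata_link_speeds(line)
    return speeds is not None and speeds[1] < speeds[0]

print(link_downgraded("SATA Version is:  SATA 3.0, 3.0 Gb/s (current: 3.0 Gb/s)"))  # False
print(link_downgraded("SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 1.5 Gb/s)"))  # True
```

A downgraded link is not proof of a fault by itself, since the controller port may simply be slower, but together with a large UDMA_CRC_Error_Count it is often a reason to check cabling and connectors.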

$ sudo smartctl -a /dev/sdc
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.11.0-38-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Hitachi/HGST Ultrastar 7K4000
Device Model:     HGST HUS724040ALA640
LU WWN Device Id: 5 000cca 23dc4c637
Add. Product Id:  DELL(tm)
Firmware Version: MFAOAB50
User Capacity:    4.000.787.030.016 bytes [4,00 TB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.0, 3.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sat Oct 30 09:13:13 2021 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (   24) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 553) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0004   139   139   000    Old_age   Offline      -       73
  3 Spin_Up_Time            0x0007   125   125   024    Pre-fail  Always       -       624 (Average 611)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       27
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000a   100   100   000    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0004   138   138   000    Old_age   Offline      -       27
  9 Power_On_Hours          0x0012   093   093   000    Old_age   Always       -       50333
 10 Spin_Retry_Count        0x0012   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       27
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       380
193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always       -       380
194 Temperature_Celsius     0x0002   162   162   000    Old_age   Always       -       37 (Min/Max 20/46)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0
223 Load_Retry_Count        0x000a   100   100   000    Old_age   Always       -       0
241 Total_LBAs_Written      0x0012   100   100   000    Old_age   Always       -       186666664018
242 Total_LBAs_Read         0x0012   100   100   000    Old_age   Always       -       1482419943771

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     50321         -
# 2  Short offline       Completed without error       00%     49927         -
# 3  Short offline       Completed without error       00%        11         -
# 4  Extended offline    Completed without error       00%        11         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

$ sudo smartctl -a /dev/sdd
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.11.0-38-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68WT0N0
LU WWN Device Id: 5 0014ee 2b6822d7f
Firmware Version: 82.00A82
User Capacity:    4.000.787.030.016 bytes [4,00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sat Oct 30 09:14:05 2021 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (51720) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 517) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x703d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   209   177   021    Pre-fail  Always       -       6508
  4 Start_Stop_Count        0x0032   098   098   000    Old_age   Always       -       2532
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   060   060   000    Old_age   Always       -       29367
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       21
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       15
193 Load_Cycle_Count        0x0032   196   196   000    Old_age   Always       -       14465
194 Temperature_Celsius     0x0022   115   097   000    Old_age   Always       -       37
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
ATA Error Count: 2
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 2 occurred at disk power-on lifetime: 28742 hours (1197 days + 14 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  10 51 10 e0 32 1e e0  Error: IDNF at LBA = 0x001e32e0 = 1979104

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ca 00 10 e0 32 1e e0 08   3d+00:35:32.256  WRITE DMA
  ca 00 08 00 29 0c e0 08   3d+00:35:32.245  WRITE DMA
  ca 00 08 c0 0f 19 e0 08   3d+00:35:32.234  WRITE DMA
  ca 00 40 a8 44 1f e0 08   3d+00:35:32.175  WRITE DMA

Error 1 occurred at disk power-on lifetime: 28670 hours (1194 days + 14 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  10 51 08 80 06 4c e0  Error: IDNF at LBA = 0x004c0680 = 4982400

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ca 00 08 80 06 4c e0 08  13d+13:29:55.786  WRITE DMA
  b0 d0 01 00 4f c2 00 08  13d+13:29:47.191  SMART READ DATA
  ec 00 00 00 00 00 00 08  13d+13:29:47.191  IDENTIFY DEVICE

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     29354         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
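
The two IDNF entries in this report carry power-on-hour stamps, so it is possible to work out how long ago they were logged relative to the current Power_On_Hours value of 29367. A short sketch of that arithmetic (the function name is invented for this example):

```python
def hours_as_days(hours):
    """Split a power-on hour count into (days, hours), as smartctl prints it."""
    return divmod(hours, 24)

CURRENT_POWER_ON_HOURS = 29367  # attribute 9 in the report above

for error_hours in (28742, 28670):
    age = CURRENT_POWER_ON_HOURS - error_hours
    print(hours_as_days(error_hours), age)
# (1197, 14) 625
# (1194, 14) 697
```

Both errors are therefore at least 625 power-on hours old, roughly 26 days of running time. Since the array was created on Oct 25 and this report is dated Oct 30, they appear to predate the array, assuming the drive's hour counter is accurate.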

$ sudo smartctl -a /dev/sde
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.11.0-38-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Desktop HDD.15
Device Model:     ST4000DM000-1F2168
LU WWN Device Id: 5 000c50 05070b3a5
Firmware Version: CC52
User Capacity:    4.000.787.030.016 bytes [4,00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5900 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 1.5 Gb/s)
Local Time is:    Sat Oct 30 09:15:01 2021 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (  623) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 540) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x1085) SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   111   099   006    Pre-fail  Always       -       33492856
  3 Spin_Up_Time            0x0003   093   091   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       256
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   067   060   030    Pre-fail  Always       -       5757534
  9 Power_On_Hours          0x0032   092   092   000    Old_age   Always       -       7873
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       131
183 Runtime_Bad_Block       0x0032   087   087   000    Old_age   Always       -       13
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   099   087   000    Old_age   Always       -       1002 1042 1081
189 High_Fly_Writes         0x003a   099   099   000    Old_age   Always       -       1
190 Airflow_Temperature_Cel 0x0022   066   045   045    Old_age   Always   In_the_past 34 (Min/Max 25/36)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       101
193 Load_Cycle_Count        0x0032   090   090   000    Old_age   Always       -       21749
194 Temperature_Celsius     0x0022   034   055   000    Old_age   Always       -       34 (0 1 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   199   194   000    Old_age   Always       -       790247
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       1330h+40m+55.940s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       39004091013
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       16225042845

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      7861         -
# 2  Short offline       Interrupted (host reset)      90%      7762         -
# 3  Short offline       Completed without error       00%         0         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

I’m at a loss as to how to troubleshoot or fix this (I’m new to RAID), so any help would be greatly appreciated.
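
One way to screen an attribute table like the one above is to pull out the raw values of a few attributes and list the ones that are not zero. The Python sketch below assumes the `smartctl -A` column layout shown in the output; the `WATCH` set and the function name `nonzero_watched` are illustrative choices, not anything defined by smartmontools.

```python
import re

# A few rows copied verbatim from the attribute table above.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
183 Runtime_Bad_Block       0x0032   087   087   000    Old_age   Always       -       13
188 Command_Timeout         0x0032   099   087   000    Old_age   Always       -       1002 1042 1081
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x003e   199   194   000    Old_age   Always       -       790247
"""

# Hypothetical watch list (an assumption for this sketch): attribute IDs
# whose raw counters are often checked first when a disk logs I/O errors.
WATCH = {5, 183, 187, 188, 197, 198, 199}

# ID, NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, RAW_VALUE
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]{4}"
    r"\s+\d+\s+\d+\s+\d+\s+\S+\s+\S+\s+\S+\s+(\S.*)$"
)


def nonzero_watched(text, watch=WATCH):
    """Return (id, name, raw) for watched attributes with a nonzero raw value."""
    hits = []
    for line in text.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # header, blank line, or anything that is not a table row
        attr_id = int(match.group(1))
        name = match.group(2)
        raw = match.group(3).strip()
        first = raw.split()[0]  # raw may hold several fields; test the first
        if attr_id in watch and first.isdigit() and int(first) != 0:
            hits.append((attr_id, name, raw))
    return hits


if __name__ == "__main__":
    for attr_id, name, raw in nonzero_watched(SAMPLE):
        print(f"{attr_id:>3} {name:<24} raw={raw}")
```

On the sample rows this lists attributes 183, 188, and 199. A nonzero raw value is only a prompt for closer inspection, since the encoding of raw values is vendor-specific.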


