Contents
- Is this a TrueNAS error or a Hardware issue?
- NASbox
- NASbox
- Samuel Tai
- NASbox
- Samuel Tai
- NASbox
- Cam status: SCSI Status Error
- Degraded pool: Please help me fix it.
- Mike G
- cam status ATA status error
- Aschtra
- kdragon75
- Aschtra
- kdragon75
- Aschtra
- kdragon75
- Aschtra
- kdragon75
- Aschtra
- Jailer
- kdragon75
- Johnnie Black
- kdragon75
- Chris Moore
Is this a TrueNAS error or a Hardware issue?
NASbox
I have had errors showing up shortly after I upgraded from FreeNAS 11.3-U5 to TrueNAS-12.0-U5. (I plan on upgrading to TrueNAS-12.0-U6 next weekend, assuming that the release goes off smoothly.) I don’t know if there is a connection or if it is just coincidence.
I tried searching, but I couldn’t find a readable explanation of what causes/what is happening when an IDNF error is reported. (Is this a non-existent LBA or is it something else?)
I got a message about an unrecoverable error on my main pool, so I had a look at the SMART parameters, and it appears that the issue had to do with the error message: Error: IDNF at LBA = ... The first time it happened, I was able to clear the error and I had about 100 KB resilvered. I ran badblocks (read-only) on the disk as well as a full SMART surface test and a zpool scrub; other than the IDNF errors there is no other indication of a problem.
I had a recurrence of the pool error condition, and that was easily cleared, and the pool passed a scrub without incident.
Any insight as to whether this is a TrueNAS error (ZFS issue), a hardware issue (drive da3, the HBA, a memory fault on non-ECC RAM) or just corruption (the system experienced a bad shutdown due to a UPS failure)?
I have included a grep of the relevant messages from /var/log/messages as well as the output from smartctl below. Any insight/assistance would be much appreciated. Thanks in advance.
Log Entries Related to First incident
Log Entries Related to Second incident
Latest Smart Output:
Happy FreeNAS User since 2012
NASbox
Can anyone tell me what an IDNF error is? I did a camcontrol identify on the drive. IIUC these IDNF errors seem to indicate that the command requested an LBA larger than the size of the drive. Do I have that correct? If so, then I assume that means a faulty HBA card/cable, memory corruption, or a TrueNAS software error. If someone can offer any clues it would be much appreciated.
Samuel Tai
Never underestimate your own stupidity
NASbox
IIUC it seems like a corrupt/damaged spot on the platter could also be responsible, or am I misinterpreting this quote from the article?
IDNF - Sector ID Not Found. If the sector that holds this information is corrupt there is no way for the hard drive to locate this sector and it will return the result IDNF.
It appears that the error is referring to an LBA48 address. If so, then both addresses should be valid; otherwise they are both invalid.
Any thoughts/comments? (The pool is RAID-Z2, and I have a burned in spare on the shelf if necessary, but I don’t want to condemn the drive if it’s something else.)
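The question of whether the reported LBAs lie past the end of the disk can be checked with a little arithmetic. The sketch below uses the capacity and logical sector size from the smartctl output posted for this drive and the two IDNF addresses from its SMART error log; the function names are made up for illustration. Under those assumptions both addresses fall inside the drive, which would suggest the request itself was addressable, although this does not say what actually caused the IDNF.

```python
# Values copied from the smartctl output posted in this thread.
USER_CAPACITY_BYTES = 6_001_175_126_016  # "User Capacity"
LOGICAL_SECTOR_BYTES = 512               # "Sector Sizes: 512 bytes logical"

# LBAs from the "Error: IDNF at LBA = ..." lines in the SMART error log.
IDNF_LBAS = [0x4F206A80, 0x214E59DC0]


def max_lba(capacity_bytes: int, sector_bytes: int) -> int:
    """Highest addressable LBA: number of logical sectors minus one."""
    return capacity_bytes // sector_bytes - 1


def lba_in_range(lba: int, capacity_bytes: int, sector_bytes: int) -> bool:
    """True if the LBA is addressable on a drive of the given size."""
    return 0 <= lba <= max_lba(capacity_bytes, sector_bytes)


top = max_lba(USER_CAPACITY_BYTES, LOGICAL_SECTOR_BYTES)
for lba in IDNF_LBAS:
    verdict = "within" if lba_in_range(lba, USER_CAPACITY_BYTES, LOGICAL_SECTOR_BYTES) else "beyond"
    print(f"LBA {lba:#x} ({lba}) is {verdict} the drive's max LBA {top}")
```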
Samuel Tai
Never underestimate your own stupidity
NASbox
@Samuel Tai — This one is kind of confusing because:
5 Reallocated_Sector_Ct, 196 Reallocated_Event_Count, 197 Current_Pending_Sector, and 198 Offline_Uncorrectable are all 0!
AND the drive passed a long test. I’m really surprised that the long self test didn’t flag it.
I’ll keep watching closely, but I’m wondering whether damage caused by a bad shutdown is a likely cause. I have also had a number of crashes on my workstation, which has a R/W NFS share on the pool. There might have been some activity during that crash; I don’t know.
I also had 43 errors picked up by a scrub after a UPS failure that offlined the disk. I ran badblocks and a long SMART test and found nothing, all OK, so I resilvered.
If these areas were not in use at the time, or they contained deleted data, they wouldn’t get picked up by a scrub. I don’t know how the SMART test determines a bad surface: whether it’s by reading ECC, or whether the drive actively reads/writes/compares. The pool is a library, so there are a lot of files that have been sitting untouched for years. Maybe it’s just a bit of fade/bit rot: refresh the surface by resilvering, and we are good to go. The drive is definitely getting up in age, but usually surface errors show as surface errors, and if there is a real problem there are a few of them. I had a drive go several months ago, and it was pretty obvious it was going. I’ve never had a drive go in such a “sneaky” way.
Is my thinking on the subject sound? Thoughts?
Source
Cam status: SCSI Status Error
Dabbler
I built my TrueNAS a couple of months ago. Space is (temporarily) about 80% full, but the NAS is only used lightly with SMB.
I get this error about once a week:
smartctl -a /dev/da5:
zpool status tank:
dmesg | grep mps:
It is always disk5.
Read write cksum is either 0 0 0 or 0 1 0 — never more.
After a scrub it is fine — for a few days.
PSU and SAS controller (Dell H310) are new.
What I’ve tried:
- mounted a fan on the SAS controller
- checked and swapped power and SATA connectors (still the same cables) for disk5
I will change power and SATA cables next.
Could the SAS controller be faulty and lead to such an error?
Should I replace disk5? SMART has 2 stored errors for that disk.
Dabbler
Dabbler
After replacing the SAS/SATA cable, the machine ran smoothly for 12 days.
But now the error is back!
SMART results are good. zpool status has 1 Write error, resilvered 84K.
I just realized the following:
The last few times the error / command was very similar:
(da5:mps0:0:5:0): WRITE(16). CDB: 8a 00 00 00 00 01 01 c8 e8 08 00 00 00 08 00 00
(da5:mps0:0:5:0): WRITE(16). CDB: 8a 00 00 00 00 02 7b 77 df 08 00 00 00 08 00 00
(da5:mps0:0:5:0): WRITE(16). CDB: 8a 00 00 00 00 02 68 66 7d 98 00 00 00 10 00 00
(da5:mps0:0:5:0): WRITE(16). CDB: 8a 00 00 00 00 02 68 61 9e b0 00 00 00 08 00 00
Could it be a defective sector on the drive?
Can I leave the disk installed as long as it’s just those single write errors? Or do you recommend replacing that drive?
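To judge whether the failing writes keep hitting the same spot, the CDBs above can be decoded. The sketch below assumes the standard WRITE(16) layout (opcode 0x8a in byte 0, a big-endian 64-bit LBA in bytes 2-9, a big-endian 32-bit transfer length in bytes 10-13); `decode_write16` is a name made up for this example. With the four CDBs from the log it yields four different LBAs, so the commands look alike mainly in their small transfer sizes rather than in their target address.

```python
from typing import Tuple


def decode_write16(cdb_hex: str) -> Tuple[int, int]:
    """Return (lba, block_count) from a WRITE(16) CDB given as hex text."""
    cdb = bytes.fromhex(cdb_hex)  # whitespace between bytes is ignored
    if len(cdb) != 16 or cdb[0] != 0x8A:
        raise ValueError("not a 16-byte WRITE(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")      # bytes 2..9
    blocks = int.from_bytes(cdb[10:14], "big")  # bytes 10..13
    return lba, blocks


# CDBs copied from the dmesg lines quoted in the post.
CDBS = [
    "8a 00 00 00 00 01 01 c8 e8 08 00 00 00 08 00 00",
    "8a 00 00 00 00 02 7b 77 df 08 00 00 00 08 00 00",
    "8a 00 00 00 00 02 68 66 7d 98 00 00 00 10 00 00",
    "8a 00 00 00 00 02 68 61 9e b0 00 00 00 08 00 00",
]

for text in CDBS:
    lba, blocks = decode_write16(text)
    print(f"LBA {lba:#012x}, {blocks} blocks")
```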
Source
Degraded pool: Please help me fix it.
Mike G
Dabbler
Hello, and thank you all in advance for any help you can provide for me.
My FreeNAS system is composed of the following hardware built in January 2016:
Six (6) WD Red 4TB WD40EFRX NAS hard drives that are NOT connected to the Marvell SATA ports. Instead they are on the four blue and two white SATA ports, all adjacent to each other.
Four (4) of MICRON MT18KSF1G72AZ-1G6E1ZE 8GB (1X8GB)1600MHZ PC3-12800 CL11 ECC REGISTERED DUAL RANK DDR3 SDRAM 240 PIN DIMM
Power Supply: Antec Earthwatts Green 380W EA-380E HT
I don’t know if the SATA controller hardware on my ASRock C2750D4I is bad, or if I have a bad WD Red drive, or something else.
Starting on February 21st my daily system report emails showed long strings that repeat this:
(ada3:ahcich13:0:0:0): READ_DMA48. ACB: 25 00 68 38 7c 40 84 01 00 00 d0 00
> (ada3:ahcich13:0:0:0): CAM status: ATA Status Error
> (ada3:ahcich13:0:0:0): ATA status: 51 (DRDY SERV ERR), error: 40 (UNC )
> (ada3:ahcich13:0:0:0): RES: 51 40 af 38 7c 40 84 01 00 7f 00
> (ada3:ahcich13:0:0:0): Retrying command
or
> (ada3:ahcich13:0:0:0): Retrying command
> (ada3:ahcich13:0:0:0): READ_FPDMA_QUEUED. ACB: 60 e8 48 40 3f 40 83 01 00 00 00 00
> (ada3:ahcich13:0:0:0): CAM status: ATA Status Error
> (ada3:ahcich13:0:0:0): ATA status: 41 (DRDY ERR), error: 40 (UNC )
> (ada3:ahcich13:0:0:0): RES: 41 40 9f 40 3f 40 83 01 00 00 00
> (ada3:ahcich13:0:0:0): Error 5, Retries exhausted
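The status and error bytes in these messages are bit fields, and the names in parentheses are just the set bits spelled out. The sketch below decodes them with bit-name tables that are assumed to match what the FreeBSD kernel prints (the names come from the ATA register definitions; treat the exact tables as an assumption, not a copy of kernel code). In both excerpts the error byte is 0x40, UNC, meaning the drive reported uncorrectable data on a read.

```python
# Bit masks (MSB first) and the short names assumed for each bit.
STATUS_BITS = {
    0x80: "BSY", 0x40: "DRDY", 0x20: "DF", 0x10: "SERV",
    0x08: "DRQ", 0x04: "CORR", 0x02: "IDX", 0x01: "ERR",
}
ERROR_BITS = {
    0x80: "ICRC", 0x40: "UNC", 0x20: "MC", 0x10: "IDNF",
    0x08: "MCR", 0x04: "ABRT", 0x02: "NM", 0x01: "ILI",
}


def decode_bits(value: int, table: dict) -> list:
    """Return the names of all bits set in value, highest bit first."""
    return [name for mask, name in table.items() if value & mask]


# Register values taken from the two log excerpts above.
for status, error in [(0x51, 0x40), (0x41, 0x40)]:
    print(f"status {status:02x}: {' '.join(decode_bits(status, STATUS_BITS))}; "
          f"error {error:02x}: {' '.join(decode_bits(error, ERROR_BITS))}")
```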
So I assumed my ada3 device has an issue, and after looking at forum threads I tried to run some short and long SMART tests, although I don’t really know how to read the long test results. I saved most of the output of whatever test I ran and I didn’t judge it a bad result, but I really am not well educated enough on this to troubleshoot properly. I checked my SATA and power cable connections on each end and rebooted, but when the ATA status errors persisted, I swapped out the SATA cable. The errors remained. I have not yet tried to move any SATA connections on the motherboard; I’m just scared to try anything without expert guidance, and I read that I shouldn’t use the Marvell ports.
Starting on March 26th I got an email report with the following information, so I turned the NAS off and unplugged it:
NAME          SIZE   ALLOC  FREE   EXPANDSZ  FRAG  CAP  DEDUP  HEALTH    ALTROOT
freenas-boot  29.8G  562M   29.2G  -         -     1%   1.00x  ONLINE    -
pool1         21.8T  13.4T  8.39T  -         10%   61%  1.00x  DEGRADED  /mnt

  pool: pool1
 state: DEGRADED
status: One or more devices has been removed by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: scrub in progress since Sun Mar 25 00:00:04 2018
        7.02T scanned out of 13.3T at 75.7M/s, 24h19m to go
        0 repaired, 52.61% done
config:

        NAME                                            STATE     READ WRITE CKSUM
        pool1                                           DEGRADED     0     0     0
          raidz1-0                                      DEGRADED     0     0     0
            gptid/fd6d1689-88f5-11e5-ae91-d05099c00684  ONLINE       0     0     0
            gptid/fe521d35-88f5-11e5-ae91-d05099c00684  ONLINE       0     0     0
            gptid/ff38e329-88f5-11e5-ae91-d05099c00684  ONLINE       0     0     0
            14083642073395126713                        REMOVED      0     0     0  was /dev/gptid/0021127e-88f6-11e5-ae91-d05099c00684
            gptid/010698d4-88f6-11e5-ae91-d05099c00684  ONLINE       0     0     0
            gptid/01ebee95-88f6-11e5-ae91-d05099c00684  ONLINE       0     0     0
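For anyone who would rather filter a report like this than read it by eye, here is a minimal sketch that picks out the entries of the config table that are not ONLINE or that carry non-zero error counters. It is a simplification and not part of FreeNAS: the function name is hypothetical, the sample rows are copied from the report above, and counters that zpool abbreviates (such as `1.2K`) are simply skipped.

```python
def unhealthy_entries(config_text: str) -> list:
    """Return (name, state, read, write, cksum) for rows needing attention."""
    flagged = []
    for line in config_text.splitlines():
        parts = line.split()
        if len(parts) < 5 or parts[0] == "NAME":
            continue  # blank line, header, or not a device row
        name, state = parts[0], parts[1]
        try:
            read, write, cksum = (int(p) for p in parts[2:5])
        except ValueError:
            continue  # abbreviated counters are out of scope for this sketch
        if state != "ONLINE" or read or write or cksum:
            flagged.append((name, state, read, write, cksum))
    return flagged


# Rows copied from the status report quoted above.
SAMPLE = """\
NAME STATE READ WRITE CKSUM
pool1 DEGRADED 0 0 0
raidz1-0 DEGRADED 0 0 0
gptid/fd6d1689-88f5-11e5-ae91-d05099c00684 ONLINE 0 0 0
14083642073395126713 REMOVED 0 0 0 was /dev/gptid/0021127e-88f6-11e5-ae91-d05099c00684
gptid/010698d4-88f6-11e5-ae91-d05099c00684 ONLINE 0 0 0
"""

for entry in unhealthy_entries(SAMPLE):
    print(entry)
```

Here the pool and the raidz1 vdev show up only because a member is missing; the row to act on is the REMOVED one, whose trailing "was /dev/gptid/..." text names the partition that used to fill that slot.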
Just today I booted this NAS back up and looked in the BIOS, because I knew the system time had been wrong for a while and wanted to fix it, and then realized that the BIOS did not show SMART as enabled. I enabled it. I had been trying to set the SMART schedules to run for quite some time, but I couldn’t tell if it was working or how to make an output file to review. Anyway, once it booted up, I saw this:
This is where I ask for help and wait to be told that my SMART isn’t turned on/get yelled at for not doing things right and not reading the manual. I tried to read the manual, and I feel like I made a good attempt at selecting hardware based on the advice on this board at the time. If I need to buy a new drive, let me know. If you have any instructions for me, please give me as much detail in the steps as possible, since I have to do internet searches to find the proper commands.
Thanks.
Source
cam status ATA status error
Aschtra
Dabbler
I’ve been experiencing some problems with my FreeNAS server. When I run dmesg from the shell I see a lot of cam status ATA status errors, and I don’t know what they mean. Here is a screenshot from when it occurred again. It does not always occur:
I have the following hardware:
HP Microserver Gen8
16GB ECC RAM
4x4TB WD Red NAS drives (which I bought new 2.5 weeks ago to replace all my previous drives, which were 4x2TB)
FreeNAS 11.1-U4
The error message says something about ada1, but all the hard drives are new so I don’t expect anything to be wrong with it.
I also got this error sometimes before I replaced my 4 drives. I searched for this error on Google, but everyone’s situation is different, so that’s why I made a new topic for this.
I read about replacing SATA cables, but the Gen8 server does not use normal SATA cables. It uses this one:
zpool status -v output:
Does anyone have an idea what I can do to fix this?
kdragon75
Wizard
Aschtra
Dabbler
I will reseat them first. I did not burn my drives in; I don’t know how.
I just did a short smartctl test and I think these are the results:
I am afraid I can’t really reseat the SATA cables. They look like this (yes, I put the screws back in again)
kdragon75
Wizard
Aschtra
Dabbler
Done that. I’ll wait.
My server is running the following:
— Plex
— Radarr
— Sonarr
— Sabnzbd
4x4TB in a raidz1.
Anything else I can do for now to provide more information maybe?
It happened again, right after my girlfriend tried to watch a show on Plex. I will try something with that file, such as deleting it, then play another show and see if that one works.
kdragon75
Wizard
Aschtra
Dabbler
Yes, always on ada1. I’ve replaced the episode she was watching with a different copy and now the error doesn’t occur anymore when she watches it. Weird issue; it crashed my whole FreeNAS. :p
Will keep an eye on it though.
kdragon75
Wizard
Aschtra
Dabbler
The drives are brand new so I don’t suspect the HDD right now. I will swap them around though, just to see what happens.
I will also do more scrubs. I had them scheduled every 35 days but will run them every 7 days now.
Jailer
Not strong, but bad
kdragon75
Wizard
Exactly. A brand spanking new drive is more likely to fail than a drive that’s been running for 6 months.
As @Jailer mentioned, SMART tests, both long and short, need to be scheduled to help monitor and test your drives.
Johnnie Black
It may be new but this is not a good sign:
kdragon75
Wizard
Yeah, it’s normal to have some raw read errors and for the count to go up over the life of the drive, but if it’s shooting up, better get a new drive.
Do some searching on the forum about drive testing and burn-in. It’s important for all drives but especially new drives.
FreeNAS Generalissimo
Chris Moore
Hall of Famer
I have a server at work that has 60 hard drives in it and in the first 6 months four of them failed. One of them catastrophically.
Just because a drive is new, that doesn’t mean it can’t fail.
This one was built in 2018, but I reused the name from a previous build. This is the 8th FreeNAS unit I have built for home. Eight systems in ten years. I made some mistakes along the way, learned some and I try to share some of those lessons learned experiences here in the forum. I have even put together some hardware just to test things out a time or two.
For a while I had three systems, all at once, at home but I am making some hardware changes right now and only one NAS is online.
The three pools in this one system represent the three NAS systems I had before the consolidation. For a home NAS, this chassis is huge, able to hold 48 data drives and two boot drives with a couple spaces internally for non-hot-swap drives.
Source
Log Entries Related to First incident
Code:
Sep 24 05:31:36 freenas (da3:mps0:0:5:0): CAM status: SCSI Status Error
Sep 24 05:31:36 freenas (da3:mps0:0:5:0): SCSI status: Check Condition
Sep 24 05:31:36 freenas (da3:mps0:0:5:0): SCSI sense: ILLEGAL REQUEST asc:21,0 (Logical block address out of range)
Sep 24 05:31:36 freenas (da3:mps0:0:5:0): Info: 0x214e59dc0
Sep 24 05:31:36 freenas (da3:mps0:0:5:0): Error 22, Unretryable error
Log Entries Related to Second incident
Code:
Oct 2 14:02:00 freenas mps0: Controller reported scsi ioc terminated tgt 5 SMID 1646 loginfo 31080000
Oct 2 14:02:00 freenas mps0: (da3:mps0:0:5:0): WRITE(10). CDB: 2a 00 4f 20 6a b0 00 00 30 00
Oct 2 14:02:00 freenas Controller reported scsi ioc terminated tgt 5 SMID 909 loginfo 31080000
Oct 2 14:02:00 freenas mps0: Controller reported scsi ioc terminated tgt 5 SMID 1816 loginfo 31080000
Oct 2 14:02:00 freenas mps0: Controller reported scsi ioc terminated tgt 5 SMID 1934 loginfo 31080000
Oct 2 14:02:00 freenas mps0: (da3:mps0:0:5:0): CAM status: CCB request completed with an error
Oct 2 14:02:00 freenas (da3:mps0:0:5:0): Retrying command, 3 more tries remain
Oct 2 14:02:00 freenas Controller reported scsi ioc terminated tgt 5 SMID 147 loginfo 31080000
Oct 2 14:02:00 freenas mps0: Controller reported scsi ioc terminated tgt 5 SMID 812 loginfo 31080000
Oct 2 14:02:00 freenas mps0: Controller reported scsi ioc terminated tgt 5 SMID 521 loginfo 31080000
Oct 2 14:02:00 freenas mps0: Controller reported scsi ioc terminated tgt 5 SMID 2072 loginfo 31080000
Oct 2 14:02:00 freenas mps0: Controller reported scsi ioc terminated tgt 5 SMID 1786 loginfo 31080000
Oct 2 14:02:00 freenas (da3:mps0:0:5:0): WRITE(10). CDB: 2a 00 4f 20 6a 58 00 00 28 00
Oct 2 14:02:00 freenas (da3:mps0:0:5:0): CAM status: CCB request completed with an error
Oct 2 14:02:00 freenas (da3:mps0:0:5:0): Retrying command, 3 more tries remain
Oct 2 14:02:00 freenas (da3:mps0:0:5:0): WRITE(10). CDB: 2a 00 4f 20 6b 08 00 00 30 00
Oct 2 14:02:00 freenas (da3:mps0:0:5:0): CAM status: CCB request completed with an error
Oct 2 14:02:00 freenas (da3:mps0:0:5:0): Retrying command, 3 more tries remain
Oct 2 14:02:00 freenas (da3:mps0:0:5:0): WRITE(10). CDB: 2a 00 4f 20 6a e0 00 00 28 00
Oct 2 14:02:00 freenas (da3:mps0:0:5:0): CAM status: CCB request completed with an error
Oct 2 14:02:00 freenas (da3:mps0:0:5:0): Retrying command, 3 more tries remain
Oct 2 14:02:00 freenas (da3:mps0:0:5:0): WRITE(10). CDB: 2a 00 4f 20 6b 38 00 00 28 00
Oct 2 14:02:00 freenas (da3:mps0:0:5:0): CAM status: CCB request completed with an error
Oct 2 14:02:00 freenas (da3:mps0:0:5:0): Retrying command, 3 more tries remain
Oct 2 14:02:00 freenas (da3:mps0:0:5:0): WRITE(10). CDB: 2a 00 4f 20 6b 90 00 00 30 00
Oct 2 14:02:00 freenas (da3:mps0:0:5:0): CAM status: CCB request completed with an error
Oct 2 14:02:00 freenas (da3:mps0:0:5:0): Retrying command, 3 more tries remain
Oct 2 14:02:00 freenas (da3:mps0:0:5:0): WRITE(10). CDB: 2a 00 4f 20 6b c0 00 00 28 00
Oct 2 14:02:00 freenas (da3:mps0:0:5:0): CAM status: CCB request completed with an error
Oct 2 14:02:00 freenas (da3:mps0:0:5:0): Retrying command, 3 more tries remain
Oct 2 14:02:00 freenas (da3:mps0:0:5:0): WRITE(10). CDB: 2a 00 4f 20 6b 68 00 00 28 00
Oct 2 14:02:00 freenas (da3:mps0:0:5:0): CAM status: CCB request completed with an error
Oct 2 14:02:00 freenas (da3:mps0:0:5:0): Retrying command, 3 more tries remain
Oct 2 14:02:00 freenas (da3:mps0:0:5:0): WRITE(10). CDB: 2a 00 4f 20 6b e8 00 00 30 00
Oct 2 14:02:00 freenas (da3:mps0:0:5:0): CAM status: CCB request completed with an error
Oct 2 14:02:01 freenas (da3:mps0:0:5:0): Retrying command, 3 more tries remain
Oct 2 14:02:01 freenas (da3:mps0:0:5:0): WRITE(10). CDB: 2a 00 4f 20 6a 80 00 00 30 00
Oct 2 14:02:01 freenas (da3:mps0:0:5:0): CAM status: SCSI Status Error
Oct 2 14:02:01 freenas (da3:mps0:0:5:0): SCSI status: Check Condition
Oct 2 14:02:01 freenas (da3:mps0:0:5:0): SCSI sense: ILLEGAL REQUEST asc:21,0 (Logical block address out of range)
Oct 2 14:02:01 freenas (da3:mps0:0:5:0): Info: 0x4f206a80
Oct 2 14:02:01 freenas (da3:mps0:0:5:0): Error 22, Unretryable error
Latest Smart Output:
Code:
smartctl 7.2 2020-12-30 r5155 [FreeBSD 12.2-RELEASE-p9 amd64] (local build) Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org === START OF INFORMATION SECTION === Model Family: Western Digital Red Device Model: WDC WD60EFRX-68MYMN1 Serial Number: WD-REDACTED LU WWN Device Id: 5 0014ee 20b559a84 Firmware Version: 82.00A82 User Capacity: 6,001,175,126,016 bytes [6.00 TB] Sector Sizes: 512 bytes logical, 4096 bytes physical Rotation Rate: 5700 rpm Device is: In smartctl database [for details use: -P show] ATA Version is: ACS-2, ACS-3 T13/2161-D revision 3b SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s) Local Time is: Mon Oct 4 16:21:41 2021 EDT SMART support is: Available - device has SMART capability. SMART support is: Enabled AAM feature is: Unavailable APM feature is: Unavailable Rd look-ahead is: Enabled Write cache is: Enabled DSN feature is: Unavailable ATA Security is: Disabled, NOT FROZEN [SEC1] Wt Cache Reorder: Enabled === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED General SMART Values: Offline data collection status: (0x00) Offline data collection activity was never started. Auto Offline Data Collection: Disabled. Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run. Total time to complete Offline data collection: ( 4784) seconds. Offline data collection capabilities: (0x7b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported. SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer. Error logging capability: (0x01) Error logging supported. General Purpose Logging supported. Short self-test routine recommended polling time: ( 2) minutes. 
Extended self-test routine recommended polling time: ( 702) minutes. Conveyance self-test routine recommended polling time: ( 5) minutes. SCT capabilities: (0x303d) SCT Status supported. SCT Error Recovery Control supported. SCT Feature Control supported. SCT Data Table supported. SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE 1 Raw_Read_Error_Rate POSR-K 200 200 051 - 0 3 Spin_Up_Time POS--K 221 190 021 - 7941 4 Start_Stop_Count -O--CK 100 100 000 - 472 5 Reallocated_Sector_Ct PO--CK 200 200 140 - 0 7 Seek_Error_Rate -OSR-K 200 200 000 - 0 9 Power_On_Hours -O--CK 031 031 000 - 50928 10 Spin_Retry_Count -O--CK 100 100 000 - 0 11 Calibration_Retry_Count -O--CK 100 100 000 - 0 12 Power_Cycle_Count -O--CK 100 100 000 - 113 192 Power-Off_Retract_Count -O--CK 200 200 000 - 77 193 Load_Cycle_Count -O--CK 187 187 000 - 40408 194 Temperature_Celsius -O---K 108 095 000 - 44 196 Reallocated_Event_Count -O--CK 200 200 000 - 0 197 Current_Pending_Sector -O--CK 200 200 000 - 0 198 Offline_Uncorrectable ----CK 100 253 000 - 0 199 UDMA_CRC_Error_Count -O--CK 200 200 000 - 0 200 Multi_Zone_Error_Rate ---R-- 200 200 000 - 0 ||||||_ K auto-keep |||||__ C event count ||||___ R error rate |||____ S speed/performance ||_____ O updated online |______ P prefailure warning General Purpose Log Directory Version 1 SMART Log Directory Version 1 [multi-sector log support] Address Access R/W Size Description 0x00 GPL,SL R/O 1 Log Directory 0x01 SL R/O 1 Summary SMART error log 0x02 SL R/O 5 Comprehensive SMART error log 0x03 GPL R/O 6 Ext. 
Comprehensive SMART error log 0x06 SL R/O 1 SMART self-test log 0x07 GPL R/O 1 Extended self-test log 0x09 SL R/W 1 Selective self-test log 0x10 GPL R/O 1 NCQ Command Error log 0x11 GPL R/O 1 SATA Phy Event Counters log 0x21 GPL R/O 1 Write stream error log 0x22 GPL R/O 1 Read stream error log 0x30 GPL,SL R/O 9 IDENTIFY DEVICE data log 0x80-0x9f GPL,SL R/W 16 Host vendor specific log 0xa0-0xa7 GPL,SL VS 16 Device vendor specific log 0xa8-0xb6 GPL,SL VS 1 Device vendor specific log 0xb7 GPL,SL VS 40 Device vendor specific log 0xbd GPL,SL VS 1 Device vendor specific log 0xc0 GPL,SL VS 1 Device vendor specific log 0xc1 GPL VS 93 Device vendor specific log 0xe0 GPL,SL R/W 1 SCT Command/Status 0xe1 GPL,SL R/W 1 SCT Data Transfer SMART Extended Comprehensive Error Log Version: 1 (6 sectors) Device Error Count: 6 CR = Command Register FEATR = Features Register COUNT = Count (was: Sector Count) Register LBA_48 = Upper bytes of LBA High/Mid/Low Registers ] ATA-8 LH = LBA High (was: Cylinder High) Register ] LBA LM = LBA Mid (was: Cylinder Low) Register ] Register LL = LBA Low (was: Sector Number) Register ] DV = Device (was: Device/Head) Register DC = Device Control Register ER = Error register ST = Status register Powered_Up_Time is measured from power on, and printed as DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes, SS=sec, and sss=millisec. It "wraps" after 49.710 days. Error 6 [5] occurred at disk power-on lifetime: 50878 hours (2119 days + 22 hours) When the command that caused the error occurred, the device was active or idle. 
After command completion occurred, registers were: ER -- ST COUNT LBA_48 LH LM LL DV DC -- -- -- == -- == == == -- -- -- -- -- 10 -- 51 00 00 00 00 4f 20 6a 80 40 00 Error: IDNF at LBA = 0x4f206a80 = 1327524480 Commands leading to the command that caused the error were: CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name -- == -- == -- == == == -- -- -- -- -- --------------- -------------------- 61 00 30 00 00 00 00 4f 20 6a 80 40 00 7d+03:19:07.767 WRITE FPDMA QUEUED ea 00 00 00 00 00 00 00 00 00 00 40 00 7d+03:19:06.998 FLUSH CACHE EXT 61 00 08 00 10 00 02 ba a0 f4 00 40 00 7d+03:19:06.997 WRITE FPDMA QUEUED 61 00 08 00 00 00 02 ba a0 f2 00 40 00 7d+03:19:06.997 WRITE FPDMA QUEUED 61 00 08 00 08 00 00 00 40 04 00 40 00 7d+03:19:06.997 WRITE FPDMA QUEUED Error 5 [4] occurred at disk power-on lifetime: 50677 hours (2111 days + 13 hours) When the command that caused the error occurred, the device was active or idle. After command completion occurred, registers were: ER -- ST COUNT LBA_48 LH LM LL DV DC -- -- -- == -- == == == -- -- -- -- -- 10 -- 51 00 00 00 02 14 e5 9d c0 40 00 Error: IDNF at LBA = 0x214e59dc0 = 8940527040 Commands leading to the command that caused the error were: CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name -- == -- == -- == == == -- -- -- -- -- --------------- -------------------- 61 00 08 00 00 00 02 14 e5 9d c0 40 00 41d+02:52:12.844 WRITE FPDMA QUEUED ea 00 00 00 00 00 00 00 00 00 00 40 00 41d+02:51:52.870 FLUSH CACHE EXT 61 00 08 00 00 00 02 ba a0 f4 48 40 00 41d+02:51:52.869 WRITE FPDMA QUEUED 61 00 08 00 10 00 02 ba a0 f2 48 40 00 41d+02:51:52.869 WRITE FPDMA QUEUED 61 00 08 00 08 00 00 00 40 04 48 40 00 41d+02:51:52.869 WRITE FPDMA QUEUED Error 4 [3] occurred at disk power-on lifetime: 45881 hours (1911 days + 17 hours) When the command that caused the error occurred, the device was active or idle. 
After command completion occurred, registers were: ER -- ST COUNT LBA_48 LH LM LL DV DC -- -- -- == -- == == == -- -- -- -- -- 40 -- 51 00 00 00 02 16 47 8b f0 40 00 Error: UNC at LBA = 0x216478bf0 = 8963722224 Commands leading to the command that caused the error were: CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name -- == -- == -- == == == -- -- -- -- -- --------------- -------------------- 60 00 30 00 00 00 02 16 47 8b f0 40 00 3d+05:45:24.873 READ FPDMA QUEUED 60 00 28 00 00 00 02 16 47 8b 98 40 00 3d+05:45:24.832 READ FPDMA QUEUED 60 00 30 00 00 00 02 16 47 8b 68 40 00 3d+05:45:24.832 READ FPDMA QUEUED 60 00 28 00 00 00 02 16 47 8b 40 40 00 3d+05:45:24.832 READ FPDMA QUEUED 60 00 28 00 00 00 02 16 47 8b 10 40 00 3d+05:45:24.831 READ FPDMA QUEUED Error 3 [2] occurred at disk power-on lifetime: 44438 hours (1851 days + 14 hours) When the command that caused the error occurred, the device was active or idle. After command completion occurred, registers were: ER -- ST COUNT LBA_48 LH LM LL DV DC -- -- -- == -- == == == -- -- -- -- -- 40 -- 51 00 00 00 02 5a 4c 71 00 40 00 Error: UNC at LBA = 0x25a4c7100 = 10104893696 Commands leading to the command that caused the error were: CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name -- == -- == -- == == == -- -- -- -- -- --------------- -------------------- 60 00 30 00 00 00 02 5a 4c 71 00 40 00 7d+06:54:25.014 READ FPDMA QUEUED 60 00 28 00 00 00 02 5a 4c 70 a8 40 00 7d+06:54:25.014 READ FPDMA QUEUED 60 00 30 00 00 00 02 5a 4c 70 78 40 00 7d+06:54:25.013 READ FPDMA QUEUED 60 00 28 00 08 00 02 5a 4c 70 50 40 00 7d+06:54:25.013 READ FPDMA QUEUED 60 00 28 00 00 00 02 5a 4c 70 20 40 00 7d+06:54:25.013 READ FPDMA QUEUED Error 2 [1] occurred at disk power-on lifetime: 43391 hours (1807 days + 23 hours) When the command that caused the error occurred, the device was active or idle. 
After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
40 -- 51 00 00 00 01 70 9d 21 c0 40 00 Error: UNC at LBA = 0x1709d21c0 = 6184313280
Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
60 00 30 00 00 00 01 70 9d 21 c0 40 00 8d+04:08:24.792 READ FPDMA QUEUED
60 00 28 00 00 00 01 70 9d 21 98 40 00 8d+04:08:24.777 READ FPDMA QUEUED
60 00 28 00 00 00 01 70 9d 21 68 40 00 8d+04:08:24.777 READ FPDMA QUEUED
60 00 28 00 00 00 01 70 9d 1f 20 40 00 8d+04:08:24.776 READ FPDMA QUEUED
60 00 28 00 00 00 01 70 9d 21 10 40 00 8d+04:08:24.776 READ FPDMA QUEUED
Error 1 [0] occurred at disk power-on lifetime: 40486 hours (1686 days + 22 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
40 -- 51 00 00 00 02 b7 0b 30 d8 40 00 Error: UNC at LBA = 0x2b70b30d8 = 11660898520
Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
60 00 58 00 08 00 02 b7 0b 89 f8 40 00 33d+00:38:34.307 READ FPDMA QUEUED
60 00 58 00 00 00 02 b7 0b 30 d0 40 00 33d+00:38:34.296 READ FPDMA QUEUED
60 00 58 00 08 00 02 b7 0a d7 c0 40 00 33d+00:38:34.284 READ FPDMA QUEUED
60 00 60 00 00 00 02 b7 0a 7e 88 40 00 33d+00:38:34.284 READ FPDMA QUEUED
60 00 58 00 00 00 02 b7 0a 25 48 40 00 33d+00:38:34.270 READ FPDMA QUEUED
SMART Extended Self-test Log Version: 1 (1 sectors)
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 49665 -
# 2 Extended offline Completed without error 00% 49598 -
# 3 Extended offline Completed without error 00% 46102 -
# 4 Extended offline Completed without error 00% 18012 -
# 5 Extended offline Completed without error 00% 177 -
# 6 Extended offline Completed without error 00% 127 -
# 7 Extended offline Completed without error 00% 90 -
# 8 Extended offline Completed without error 00% 55 -
# 9 Extended offline Completed without error 00% 12 -
#10 Short offline Completed without error 00% 0 -
#11 Conveyance offline Completed without error 00% 0 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
SCT Status Version: 3
SCT Version (vendor specific): 258 (0x0102)
Device State: Active (0)
Current Temperature: 44 Celsius
Power Cycle Min/Max Temperature: 38/47 Celsius
Lifetime Min/Max Temperature: 2/57 Celsius
Under/Over Temperature Limit Count: 0/0
Vendor specific:
00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
SCT Temperature History Version: 2
Temperature Sampling Period: 1 minute
Temperature Logging Interval: 1 minute
Min/Max recommended Temperature: 0/60 Celsius
Min/Max Temperature Limit: -41/85 Celsius
Temperature History Size (Index): 478 (218)
Index Estimated Time Temperature Celsius
219 2021-10-04 08:24 44 *************************
... ..(476 skipped). .. *************************
218 2021-10-04 16:21 44 *************************
SCT Error Recovery Control:
Read: 70 (7.0 seconds)
Write: 70 (7.0 seconds)
Device Statistics (GP/SMART Log 0x04) not supported
Pending Defects log (GP Log 0x0c) not supported
SATA Phy Event Counters (GP Log 0x11)
ID Size Value Description
0x0001 2 0 Command failed due to ICRC error
0x0002 2 0 R_ERR response for data FIS
0x0003 2 0 R_ERR response for device-to-host data FIS
0x0004 2 0 R_ERR response for host-to-device data FIS
0x0005 2 0 R_ERR response for non-data FIS
0x0006 2 0 R_ERR response for device-to-host non-data FIS
0x0007 2 0 R_ERR response for host-to-device non-data FIS
0x0008 2 0 Device-to-host non-data FIS retries
0x0009 2 1 Transition from drive PhyRdy to drive PhyNRdy
0x000a 2 4 Device-to-host register FISes sent due to a COMRESET
0x000b 2 0 CRC errors within host-to-device FIS
0x000f 2 0 R_ERR response for host-to-device data FIS, CRC
0x0012 2 0 R_ERR response for host-to-device non-data FIS, CRC
0x8000 4 797757 Vendor specific
smartctl --xall /dev/sdj
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-8-amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Device Model: WDC WD40EFAX-68JH4N0
Serial Number: WD-...
LU WWN Device Id: ...
Firmware Version: 82.00A82
User Capacity: 4,000,787,030,016 bytes [4.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5400 rpm
Form Factor: 3.5 inches
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ACS-3 T13/2161-D revision 5
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Fri Apr 17 12:44:20 2020 AEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is: Unavailable
APM feature is: Unavailable
Rd look-ahead is: Enabled
Write cache is: Enabled
DSN feature is: Unavailable
ATA Security is: Disabled, NOT FROZEN [SEC1]
Wt Cache Reorder: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (21404) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 525) minutes.
Conveyance self-test routine
recommended polling time: ( 3) minutes.
SCT capabilities: (0x3039) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
1 Raw_Read_Error_Rate POSR-K 200 200 051 - 0
3 Spin_Up_Time POS--K 100 253 021 - 0
4 Start_Stop_Count -O--CK 100 100 000 - 1
5 Reallocated_Sector_Ct PO--CK 200 200 140 - 0
7 Seek_Error_Rate -OSR-K 200 200 000 - 0
9 Power_On_Hours -O--CK 100 100 000 - 540
10 Spin_Retry_Count -O--CK 100 253 000 - 0
11 Calibration_Retry_Count -O--CK 100 253 000 - 0
12 Power_Cycle_Count -O--CK 100 100 000 - 1
192 Power-Off_Retract_Count -O--CK 200 200 000 - 0
193 Load_Cycle_Count -O--CK 200 200 000 - 7
194 Temperature_Celsius -O---K 126 113 000 - 21
196 Reallocated_Event_Count -O--CK 200 200 000 - 0
197 Current_Pending_Sector -O--CK 200 200 000 - 0
198 Offline_Uncorrectable ----CK 100 253 000 - 0
199 UDMA_CRC_Error_Count -O--CK 200 200 000 - 0
200 Multi_Zone_Error_Rate ---R-- 200 200 000 - 0
||||||_ K auto-keep
|||||__ C event count
||||___ R error rate
|||____ S speed/performance
||_____ O updated online
|______ P prefailure warning
General Purpose Log Directory Version 1
SMART Log Directory Version 1 [multi-sector log support]
Address Access R/W Size Description
0x00 GPL,SL R/O 1 Log Directory
0x01 SL R/O 1 Summary SMART error log
0x02 SL R/O 5 Comprehensive SMART error log
0x03 GPL R/O 6 Ext. Comprehensive SMART error log
0x04 GPL R/O 256 Device Statistics log
0x04 SL R/O 8 Device Statistics log
0x06 SL R/O 1 SMART self-test log
0x07 GPL R/O 1 Extended self-test log
0x09 SL R/W 1 Selective self-test log
0x0c GPL R/O 2048 Pending Defects log
0x10 GPL R/O 1 NCQ Command Error log
0x11 GPL R/O 1 SATA Phy Event Counters log
0x24 GPL R/O 294 Current Device Internal Status Data log
0x30 GPL,SL R/O 9 IDENTIFY DEVICE data log
0x80-0x9f GPL,SL R/W 16 Host vendor specific log
0xa0-0xa7 GPL,SL VS 16 Device vendor specific log
0xa8-0xb6 GPL,SL VS 1 Device vendor specific log
0xb7 GPL,SL VS 78 Device vendor specific log
0xb9 GPL,SL VS 4 Device vendor specific log
0xbd GPL,SL VS 1 Device vendor specific log
0xc0 GPL,SL VS 1 Device vendor specific log
0xc1 GPL VS 93 Device vendor specific log
0xe0 GPL,SL R/W 1 SCT Command/Status
0xe1 GPL,SL R/W 1 SCT Data Transfer
SMART Extended Comprehensive Error Log Version: 1 (6 sectors)
Device Error Count: 7487 (device log contains only the most recent 24 errors)
CR = Command Register
FEATR = Features Register
COUNT = Count (was: Sector Count) Register
LBA_48 = Upper bytes of LBA High/Mid/Low Registers ] ATA-8
LH = LBA High (was: Cylinder High) Register ] LBA
LM = LBA Mid (was: Cylinder Low) Register ] Register
LL = LBA Low (was: Sector Number) Register ]
DV = Device (was: Device/Head) Register
DC = Device Control Register
ER = Error register
ST = Status register
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 7487 [22] occurred at disk power-on lifetime: 540 hours (22 days + 12 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
10 -- 51 00 00 00 01 d1 c0 bc 10 40 00 Error: IDNF at LBA = 0x1d1c0bc10 = 7814036496
Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
61 00 38 00 20 00 01 a8 8f ca 78 40 00 22d+12:23:01.073 WRITE FPDMA QUEUED
61 00 18 00 18 00 01 a8 8f ca b8 40 00 22d+12:23:01.073 WRITE FPDMA QUEUED
61 00 10 00 10 00 00 02 32 ba 10 40 00 22d+12:23:01.073 WRITE FPDMA QUEUED
61 00 10 00 08 00 01 d1 c0 bc 10 40 00 22d+12:23:01.073 WRITE FPDMA QUEUED
61 00 10 00 00 00 01 d1 c0 ba 10 40 00 22d+12:23:01.073 WRITE FPDMA QUEUED
Error 7486 [21] occurred at disk power-on lifetime: 540 hours (22 days + 12 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
10 -- 51 00 00 00 01 a8 8f c9 78 40 00 Error: IDNF at LBA = 0x1a88fc978 = 7122962808
Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
61 00 58 00 00 00 01 a8 8f c9 98 40 00 22d+12:22:53.109 WRITE FPDMA QUEUED
61 00 18 00 20 00 01 a5 64 0d e8 40 00 22d+12:22:52.710 WRITE FPDMA QUEUED
61 00 10 00 18 00 01 a5 64 0d d8 40 00 22d+12:22:52.383 WRITE FPDMA QUEUED
61 00 10 00 10 00 01 a5 64 0d c8 40 00 22d+12:22:52.368 WRITE FPDMA QUEUED
61 00 18 00 08 00 01 a8 8f c9 78 40 00 22d+12:22:52.368 WRITE FPDMA QUEUED
Error 7485 [20] occurred at disk power-on lifetime: 540 hours (22 days + 12 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
10 -- 51 00 00 00 01 a8 8e f9 78 40 00 Error: IDNF at LBA = 0x1a88ef978 = 7122909560
Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
61 00 38 00 00 00 01 a8 8e f9 98 40 00 22d+12:22:15.406 WRITE FPDMA QUEUED
61 00 40 00 38 00 01 a5 64 07 e8 40 00 22d+12:22:15.152 WRITE FPDMA QUEUED
61 00 40 00 30 00 01 a5 64 07 a8 40 00 22d+12:22:14.895 WRITE FPDMA QUEUED
61 00 20 00 28 00 01 a8 fc 7f 68 40 00 22d+12:22:14.895 WRITE FPDMA QUEUED
61 00 28 00 20 00 01 a8 fc 7f 40 40 00 22d+12:22:14.895 WRITE FPDMA QUEUED
Error 7484 [19] occurred at disk power-on lifetime: 540 hours (22 days + 12 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
10 -- 51 00 00 00 01 80 96 22 e8 40 00 Error: IDNF at LBA = 0x1809622e8 = 6452290280
Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
61 00 10 00 18 00 01 d1 c0 bc 10 40 00 22d+12:21:34.814 WRITE FPDMA QUEUED
61 00 10 00 10 00 01 d1 c0 ba 10 40 00 22d+12:21:34.603 WRITE FPDMA QUEUED
61 00 10 00 08 00 00 02 32 ba 10 40 00 22d+12:21:34.406 WRITE FPDMA QUEUED
61 00 08 00 00 00 01 80 96 27 10 40 00 22d+12:21:34.362 WRITE FPDMA QUEUED
61 00 08 00 20 00 01 80 96 22 e8 40 00 22d+12:21:33.737 WRITE FPDMA QUEUED
Error 7483 [18] occurred at disk power-on lifetime: 540 hours (22 days + 12 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
10 -- 51 00 00 00 01 a9 39 f2 e0 40 00 Error: IDNF at LBA = 0x1a939f2e0 = 7134114528
Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
61 00 08 00 00 00 01 a9 3d 99 60 40 00 22d+12:21:26.189 WRITE FPDMA QUEUED
61 00 08 00 08 00 01 a9 39 f2 e0 40 00 22d+12:21:26.189 WRITE FPDMA QUEUED
61 00 08 00 00 00 01 a9 38 d8 10 40 00 22d+12:21:26.189 WRITE FPDMA QUEUED
61 00 08 00 08 00 01 a9 38 81 68 40 00 22d+12:21:26.189 WRITE FPDMA QUEUED
61 00 08 00 00 00 01 a8 d2 c5 30 40 00 22d+12:21:26.188 WRITE FPDMA QUEUED
Error 7482 [17] occurred at disk power-on lifetime: 540 hours (22 days + 12 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
10 -- 51 00 00 00 01 a8 8d 5f d8 40 00 Error: IDNF at LBA = 0x1a88d5fd8 = 7122804696
Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
61 00 98 00 20 00 01 a8 8d 5f d8 40 00 22d+12:20:49.161 WRITE FPDMA QUEUED
60 00 10 00 18 00 01 d1 c0 bc 10 40 00 22d+12:20:49.159 READ FPDMA QUEUED
60 00 10 00 10 00 01 d1 c0 ba 10 40 00 22d+12:20:49.159 READ FPDMA QUEUED
60 00 10 00 08 00 00 02 32 ba 10 40 00 22d+12:20:49.159 READ FPDMA QUEUED
61 00 08 00 00 00 01 a8 8c 90 e8 40 00 22d+12:20:49.159 WRITE FPDMA QUEUED
Error 7481 [16] occurred at disk power-on lifetime: 540 hours (22 days + 12 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
10 -- 51 00 00 00 01 a8 8d 5f 98 40 00 Error: IDNF at LBA = 0x1a88d5f98 = 7122804632
Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
61 00 10 00 18 00 01 a5 64 05 00 40 00 22d+12:20:46.550 WRITE FPDMA QUEUED
61 00 18 00 10 00 01 a8 fa 57 60 40 00 22d+12:20:46.534 WRITE FPDMA QUEUED
61 00 18 00 08 00 01 a8 8d 5f b8 40 00 22d+12:20:42.224 WRITE FPDMA QUEUED
61 00 18 00 00 00 01 a8 8d 5f 98 40 00 22d+12:20:42.223 WRITE FPDMA QUEUED
61 00 18 00 08 00 01 a8 8d 5f 78 40 00 22d+12:20:42.222 WRITE FPDMA QUEUED
Error 7480 [15] occurred at disk power-on lifetime: 540 hours (22 days + 12 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
10 -- 51 00 00 00 01 d1 c0 bc 10 40 00 Error: IDNF at LBA = 0x1d1c0bc10 = 7814036496
Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
61 00 18 00 20 00 01 a5 64 03 58 40 00 22d+12:19:13.755 WRITE FPDMA QUEUED
61 00 28 00 18 00 01 a8 f8 79 58 40 00 22d+12:19:13.745 WRITE FPDMA QUEUED
61 00 20 00 00 00 01 a8 f8 7a d0 40 00 22d+12:19:13.743 WRITE FPDMA QUEUED
61 00 10 00 10 00 00 02 32 ba 10 40 00 22d+12:19:08.985 WRITE FPDMA QUEUED
61 00 10 00 08 00 01 d1 c0 bc 10 40 00 22d+12:19:08.985 WRITE FPDMA QUEUED
SMART Extended Self-test Log Version: 1 (1 sectors)
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 0 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
SCT Status Version: 3
SCT Version (vendor specific): 258 (0x0102)
SCT Support Level: 1
Device State: Active (0)
Current Temperature: 21 Celsius
Power Cycle Min/Max Temperature: 16/31 Celsius
Lifetime Min/Max Temperature: 16/31 Celsius
Under/Over Temperature Limit Count: 0/0
Vendor specific:
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
SCT Temperature History Version: 2
Temperature Sampling Period: 1 minute
Temperature Logging Interval: 1 minute
Min/Max recommended Temperature: 0/65 Celsius
Min/Max Temperature Limit: -41/85 Celsius
Temperature History Size (Index): 478 (46)
Index Estimated Time Temperature Celsius
47 2020-04-17 04:47 26 *******
48 2020-04-17 04:48 26 *******
49 2020-04-17 04:49 26 *******
50 2020-04-17 04:50 27 ********
51 2020-04-17 04:51 26 *******
... ..( 6 skipped). .. *******
58 2020-04-17 04:58 26 *******
59 2020-04-17 04:59 25 ******
... ..( 9 skipped). .. ******
69 2020-04-17 05:09 25 ******
70 2020-04-17 05:10 24 *****
... ..(115 skipped). .. *****
186 2020-04-17 07:06 24 *****
187 2020-04-17 07:07 25 ******
... ..( 12 skipped). .. ******
200 2020-04-17 07:20 25 ******
201 2020-04-17 07:21 26 *******
... ..( 7 skipped). .. *******
209 2020-04-17 07:29 26 *******
210 2020-04-17 07:30 27 ********
... ..( 77 skipped). .. ********
288 2020-04-17 08:48 27 ********
289 2020-04-17 08:49 26 *******
... ..( 11 skipped). .. *******
301 2020-04-17 09:01 26 *******
302 2020-04-17 09:02 25 ******
... ..( 11 skipped). .. ******
314 2020-04-17 09:14 25 ******
315 2020-04-17 09:15 24 *****
... ..( 19 skipped). .. *****
335 2020-04-17 09:35 24 *****
336 2020-04-17 09:36 23 ****
... ..( 19 skipped). .. ****
356 2020-04-17 09:56 23 ****
357 2020-04-17 09:57 24 *****
... ..( 11 skipped). .. *****
369 2020-04-17 10:09 24 *****
370 2020-04-17 10:10 25 ******
371 2020-04-17 10:11 24 *****
... ..( 4 skipped). .. *****
376 2020-04-17 10:16 24 *****
377 2020-04-17 10:17 23 ****
... ..( 9 skipped). .. ****
387 2020-04-17 10:27 23 ****
388 2020-04-17 10:28 24 *****
... ..( 15 skipped). .. *****
404 2020-04-17 10:44 24 *****
405 2020-04-17 10:45 25 ******
406 2020-04-17 10:46 24 *****
... ..( 13 skipped). .. *****
420 2020-04-17 11:00 24 *****
421 2020-04-17 11:01 25 ******
422 2020-04-17 11:02 24 *****
... ..( 16 skipped). .. *****
439 2020-04-17 11:19 24 *****
440 2020-04-17 11:20 23 ****
... ..( 5 skipped). .. ****
446 2020-04-17 11:26 23 ****
447 2020-04-17 11:27 22 ***
448 2020-04-17 11:28 23 ****
449 2020-04-17 11:29 23 ****
450 2020-04-17 11:30 23 ****
451 2020-04-17 11:31 22 ***
452 2020-04-17 11:32 23 ****
453 2020-04-17 11:33 23 ****
454 2020-04-17 11:34 23 ****
455 2020-04-17 11:35 24 *****
456 2020-04-17 11:36 23 ****
457 2020-04-17 11:37 23 ****
458 2020-04-17 11:38 24 *****
459 2020-04-17 11:39 24 *****
460 2020-04-17 11:40 24 *****
461 2020-04-17 11:41 23 ****
462 2020-04-17 11:42 24 *****
... ..( 17 skipped). .. *****
2 2020-04-17 12:00 24 *****
3 2020-04-17 12:01 23 ****
... ..( 12 skipped). .. ****
16 2020-04-17 12:14 23 ****
17 2020-04-17 12:15 22 ***
18 2020-04-17 12:16 23 ****
... ..( 5 skipped). .. ****
24 2020-04-17 12:22 23 ****
25 2020-04-17 12:23 24 *****
26 2020-04-17 12:24 24 *****
27 2020-04-17 12:25 23 ****
28 2020-04-17 12:26 19 -
29 2020-04-17 12:27 19 -
30 2020-04-17 12:28 21 **
31 2020-04-17 12:29 21 **
32 2020-04-17 12:30 22 ***
... ..( 9 skipped). .. ***
42 2020-04-17 12:40 22 ***
43 2020-04-17 12:41 23 ****
44 2020-04-17 12:42 23 ****
45 2020-04-17 12:43 23 ****
46 2020-04-17 12:44 21 **
SCT Error Recovery Control:
Read: 70 (7.0 seconds)
Write: 70 (7.0 seconds)
Device Statistics (GP/SMART Log 0x04) not supported
Pending Defects log (GP Log 0x0c) supported [please try: '-l defects']
SATA Phy Event Counters (GP Log 0x11)
ID Size Value Description
0x0001 2 0 Command failed due to ICRC error
0x0002 2 0 R_ERR response for data FIS
0x0003 2 0 R_ERR response for device-to-host data FIS
0x0004 2 0 R_ERR response for host-to-device data FIS
0x0005 2 0 R_ERR response for non-data FIS
0x0006 2 0 R_ERR response for device-to-host non-data FIS
0x0007 2 0 R_ERR response for host-to-device non-data FIS
0x0008 2 0 Device-to-host non-data FIS retries
0x0009 2 82 Transition from drive PhyRdy to drive PhyNRdy
0x000a 2 83 Device-to-host register FISes sent due to a COMRESET
0x000b 2 0 CRC errors within host-to-device FIS
0x000d 2 0 Non-CRC errors within host-to-device FIS
0x000f 2 0 R_ERR response for host-to-device data FIS, CRC
0x0012 2 0 R_ERR response for host-to-device non-data FIS, CRC
0x8000 4 1945390 Vendor specific
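For anyone wanting to check where the failing LBAs in a log like the one above sit on the disk, here is a minimal Python sketch. It rebuilds the 48-bit LBA from the register row and compares it with the sector count implied by the reported capacity. The helper names (`decode_lba48`, `decode_error_bits`) are hypothetical and not part of smartmontools, and the bit table is an assumed subset of the error register labels that smartctl prints.

```python
# Sketch only: helper names are hypothetical, not part of smartmontools.

SECTOR_SIZE = 512  # logical sector size reported in the INFORMATION SECTION

# Error register bits as smartctl labels them (assumed subset).
ERROR_BITS = {0x80: "ICRC", 0x40: "UNC", 0x10: "IDNF", 0x04: "ABRT"}


def decode_lba48(register_row):
    """Rebuild the LBA from a row under 'ER -- ST COUNT LBA_48 LH LM LL DV DC'."""
    fields = register_row.split()
    # fields: ER, '--', ST, COUNT (2 bytes), LBA_48 (3 bytes), LH, LM, LL, DV, DC
    return int("".join(fields[5:11]), 16)


def decode_error_bits(register_row):
    """Name the bits set in the ER column (first field of the row)."""
    er = int(register_row.split()[0], 16)
    return [name for bit, name in ERROR_BITS.items() if er & bit]


row = "10 -- 51 00 00 00 01 d1 c0 bc 10 40 00"    # Error 7487 above
total_sectors = 4_000_787_030_016 // SECTOR_SIZE   # from 'User Capacity'
lba = decode_lba48(row)

print(decode_error_bits(row))   # ['IDNF']
print(lba)                      # 7814036496
print(lba < total_sectors)      # True: the LBA is inside the reported capacity
print(total_sectors - lba)      # 672 sectors before the end of the disk
```

In this case the LBA falls inside the reported capacity, so the address being out of range does not by itself explain the IDNF; the check only rules that one cause in or out.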
From: Bruce Allen <ba…@gr…> — 2004-05-10 22:05:22 |
Tobias,

I'm referring to:
http://www.hitachigst.com/tech/techlib.nsf/techdocs/53989D390D44D88F86256D1F0058368D
which are the OEM specs for your disk.

> SMART Error Log Version: 1
> Error 98 occurred at disk power-on lifetime: 1415 hours (58 days + 23 hours)
> When the command that caused the error occurred, the device was active or idle.
>
> After command completion occurred, registers were:
> ER ST SC SN CL CH DH
> -- -- -- -- -- -- --
> 10 59 01 e0 00 d6 e6 Error: IDNF at LBA = 0x06d600e0 = 114688224

This means that the specified sector, in this case Logical Block Address 114688224, was not found. According to the specs for your disk, the total number of sectors is 117,210,240 (see page 25 of the URL above). So the failure to find block 114,688,224 can mean one of only two things, I think:
(1) the drive is faulty, or
(2) the drive has been set (using a software command) with a protected area which you can't read. This is done using the Read Native Max and Set Max Address commands. These are documented in the manual above.

There is probably some standard utility for doing the 'Set Max Address' command. A few seconds of Google searching turned up this:
http://www.win.tue.nl/~aeb/linux/setmax.c

> The strange thing is that those IDNF at LBA errors appear to be in a
> space which isn't user-addressable.

According to the manual, it could/should be user addressable.

> My hard drive comes with a hidden Predesktop Area "partition" installed
> by the manufacturer - this area contains Win XP recovery data as well
> as some system tools and its default is to be invisible to operating
> systems (you can define this in the machine's BIOS).

That's the 'protected area'. You're running into trouble, I think, because of this.

> Here are the available hard drive sectors (hdparm's output):
>
> LBA user addressable sectors: 110833010 (Predesktop Area enabled)
> LBA user addressable sectors: 117210244 (Predesktop Area disabled)
>
> The IDNF at LBA errors are at 114688224

Exactly. This may be because your disk partitioning utilities are writing a partition table using ALL 117210244 sectors, not the 110833010 that are available.

> Are those errors a concern?

Yes. There is some utility in your toolset (fdisk, cfdisk, mkfs?) that appears to be trying to use more than 110833010 sectors.

> What about the invalid checksum of the Self-Test Log Structure?

It's the SELECTIVE Self-Test Log. I wouldn't worry about it -- the disk is ATA-6 not ATA-7, so the support for selective self-tests is not obeying the ATA specs, which only introduced selective self-tests with ATA-7.

> Is there a way to correct those errors? I'm scared..

I think it's OK.

> The hard drive was in a perfect condition before I tried to configure
> my partitions with the SUSE installer. :/ There hasn't been any error
> before.

Use fdisk -lu to understand if the partition table extends beyond 110833010 sectors. If so, fix it. If not, try to understand what utility is ignoring this limit and trying to read/write beyond it.

> Could you please CC me as I'm currently not subscribed to this list?

Likewise please continue to cc the list so that there is a record of this in the mail archive.

Bruce
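The reasoning in the message above comes down to comparing the failing LBA with two limits. A minimal Python sketch of that comparison follows, using the sector counts quoted from hdparm. The function name `classify_lba` and its return labels are made up for illustration.

```python
# Sketch only: classify_lba and its labels are hypothetical helpers.

def classify_lba(lba, user_max_sectors, native_max_sectors):
    """Say which region of the disk an LBA falls into.

    user_max_sectors   -- sectors visible with the protected area enabled
    native_max_sectors -- sectors visible with the protected area disabled
    """
    if lba < user_max_sectors:
        return "user-addressable"
    if lba < native_max_sectors:
        return "protected area"
    return "beyond native max"


# Figures quoted in the message above.
print(classify_lba(114688224, 110833010, 117210244))  # protected area
```

With the protected area enabled, any partition or filesystem that reaches past sector 110833010 would land in the middle case, which matches the explanation given in the message.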
- The drive firmware runs the tests.
- The details of the tests can be read in e.g. www.t13.org/Documents/UploadedDocuments/technical/e01137r0.pdf, which summarises the elements of the short and long tests thus:
(1) an electrical segment wherein the drive tests its own electronics. The particular tests in this segment are vendor specific, but as examples: this segment might include such tests as a buffer RAM test, a read/write circuitry test, and/or a test of the read/write head elements.
(2) a seek/servo segment wherein the drive tests it capability to find and servo on data tracks. The particular methodology used in this test is also vendor specific.
(3) a read/verify scan segment wherein the drive performs read scanning of some portion of the disk surface. The amount and location of the surface scanned are dependent on the completion time constraint and are vendor specific.
The criteria for the extended self-test are the same as the short self-test with two exceptions: segment (3) of the extended self-test shall be a read/verify scan of all of the user data area, and there is no maximum time limit for the drive to perform the test.
- It is safe to perform non-destructive testing while the OS is running, though some performance impact is likely. As the smartctl man page says for both -t short and -t long,
This command can be given in normal system operation (unless run in captive mode)
If you invoke captive mode with -C, smartctl assumes the drive can be busied-out to unavailability. This should not be done on a drive the OS is using.
As the man page also suggests, the offline testing (which simply means periodic background testing) is not reliable, and never officially became part of the ATA specifications. I run mine from cron, instead; that way I know when they should happen, and I can stop it if I need to.
- The results can be seen in the smartctl output. Here’s one with a test running:
[root@risby images]# smartctl -a /dev/sdb
smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.1.6-201.fc22.x86_64] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org
[...]
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 20567 -
# 2 Extended offline Completed without error 00% 486 -
SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Self_test_in_progress [90% left] (0-65535)
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Note the two previously completed tests (at 20567 and 486 hours power-on, respectively) and the currently running one (10% complete).
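If you poll a running test from a script, the progress can be pulled out of that status line. The following is a rough Python sketch; it assumes the `Self_test_in_progress [NN% left]` wording shown in the output above, and `self_test_percent_done` is a hypothetical helper name.

```python
import re

# Matches the status text shown in the selective self-test table above.
PROGRESS_RE = re.compile(r"Self_test_in_progress \[(\d+)% left\]")


def self_test_percent_done(smartctl_output):
    """Return percent complete, or None if no test is reported as running."""
    match = PROGRESS_RE.search(smartctl_output)
    if match is None:
        return None
    return 100 - int(match.group(1))


sample = "1 0 0 Self_test_in_progress [90% left] (0-65535)"
print(self_test_percent_done(sample))                  # 10
print(self_test_percent_done("2 0 0 Not_testing"))     # None
```

Feeding it captured `smartctl -a` output would give the same 10% figure read off by hand above.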
Is it good practice to ignore IDNF errors in SMART logs if no other
evidence of problems (including a SMART self-test and a badblocks run) has been
found? Does anyone have experience with this? The details follow.
While checking whether a 2-year-old drive is still fine (for my notebook), I noticed the following errors in the SMART log (excerpt from smartctl -a /dev/sdd):
SMART Error Log Version: 1
ATA Error Count: 33 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 33 occurred at disk power-on lifetime: 6 hours (0 days + 6 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
10 51 00 00 00 00 00 Error:
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
a1 00 00 00 00 00 a0 00 00:05:34.511 IDENTIFY PACKET DEVICE
25 00 00 00 00 00 e0 ff 00:05:34.500 READ DMA EXT
25 00 01 00 00 00 e0 00 00:05:30.790 READ DMA EXT
25 00 01 00 00 00 e0 00 00:05:29.550 READ DMA EXT
25 00 01 00 00 00 e0 00 00:05:29.549 READ DMA EXT
Error 32 occurred at disk power-on lifetime: 6 hours (0 days + 6 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
10 51 32 9c fd ff 0f Error: IDNF 50 sectors at LBA = 0x0ffffd9c = 268434844
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 32 9c fd ff e0 00 00:00:51.163 READ DMA EXT
25 00 32 9c fd ff 0f 04 00:00:51.156 READ DMA EXT
25 00 32 9c fd ff e0 00 00:00:51.074 READ DMA EXT
25 00 32 9c fd ff 0f 04 00:00:51.068 READ DMA EXT
25 00 32 9c fd ff e0 00 00:00:50.985 READ DMA EXT
All 3 remaining errors stored in the SMART log are the same as the last one (IDNF error at 0x0ffffd9c).
As I understand it, the error means sector ID not found, so it does not directly mean that the drive is bad, but it’s still fishy.
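To double-check the LBA that smartctl prints for these older-style register rows, here is a small Python sketch that rebuilds the address from the SN, CL, CH and DH columns. It assumes the 28-bit layout in which the low four bits of DH hold the top of the address; `decode_lba28` is a hypothetical helper name.

```python
# Sketch only: decode_lba28 is a hypothetical helper, assuming the 28-bit
# register layout 'ER ST SC SN CL CH DH' used by the summary error log.

def decode_lba28(register_row):
    """Return (lba, sector_count) from a row under 'ER ST SC SN CL CH DH'."""
    er, st, sc, sn, cl, ch, dh = (int(b, 16) for b in register_row.split()[:7])
    # Low nibble of DH is LBA bits 27..24; CH, CL, SN follow in that order.
    lba = ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn
    return lba, sc


print(decode_lba28("10 51 32 9c fd ff 0f"))  # (268434844, 50)
```

The result agrees with the "IDNF 50 sectors at LBA = 0x0ffffd9c = 268434844" text in the log above.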
No SMART attribute shows a problem (e.g. no reallocated sectors):
# smartctl --attributes /dev/sdd
smartctl 6.2 2013-07-26 r3841 [i686-linux-3.13.10-200.fc20.i686] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 100 100 062 Pre-fail Always - 0
2 Throughput_Performance 0x0005 100 100 040 Pre-fail Offline - 0
3 Spin_Up_Time 0x0007 217 217 033 Pre-fail Always - 1
4 Start_Stop_Count 0x0012 099 099 000 Old_age Always - 1659
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0
7 Seek_Error_Rate 0x000b 100 100 067 Pre-fail Always - 0
8 Seek_Time_Performance 0x0005 100 100 040 Pre-fail Offline - 0
9 Power_On_Hours 0x0012 092 092 000 Old_age Always - 3856
10 Spin_Retry_Count 0x0013 100 100 060 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 099 099 000 Old_age Always - 1659
191 G-Sense_Error_Rate 0x000a 100 100 000 Old_age Always - 0
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 29
193 Load_Cycle_Count 0x0012 056 056 000 Old_age Always - 441546
194 Temperature_Celsius 0x0002 206 206 000 Old_age Always - 29 (Min/Max 14/46)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always - 0
223 Load_Retry_Count 0x000a 100 100 000 Old_age Always - 0
The SMART self-test also reports no problems:
# smartctl -t long /dev/sdd
... after a while ...
# smartctl -l selftest /dev/sdd
smartctl 6.2 2013-07-26 r3841 [i686-linux-3.13.10-200.fc20.i686] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 3853 -
# 2 Extended offline Completed without error 00% 3851 -
# 3 Short offline Completed without error 00% 3847 -
And just to be sure, I ran badblocks, overwriting the whole disk with random data to check for bad sectors. No problems were found there either:
# badblocks -s -w -v -t random /dev/sdd
Checking for bad blocks in read-write mode
From block 0 to 312571223
Testing with random pattern: done
Reading and comparing: done
Pass completed, 0 bad blocks found. (0/0/0 errors)
Update: I have added the output of smartctl short tests for all drives. One drive seems to be generating errors, but the overall tests have passed.
A few days ago I set up a fresh software RAID10 (mdadm) using 4 4TB hard drives.
According to the output from mdadm -D and cat /proc/mdstat, all RAID devices are functioning properly. However, straight away, I noticed that operations such as creating a directory would take a few seconds around half of the time.
Resyncing is also taking an incredibly long time (up and running since 25th of October, but only 10% synced).
And lastly, since today, I have had issues accessing the mounted drive array. When using cd, the shell hangs once I use tab to autocomplete the path past the drive’s root directory. When using df -h on a remote host that mounts the RAID array as a Samba share, I get the message ‘Host is down’ or ‘Resource temporarily unavailable’ after a minute or two.
$ sudo mdadm -D /dev/md1
/dev/md1:
Version : 1.2
Creation Time : Mon Oct 25 20:18:15 2021
Raid Level : raid10
Array Size : 7813769216 (7451.79 GiB 8001.30 GB)
Used Dev Size : 3906884608 (3725.90 GiB 4000.65 GB)
Raid Devices : 4
Total Devices : 4
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Fri Oct 29 05:24:29 2021
State : active, resyncing
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Layout : near=2
Chunk Size : 512K
Consistency Policy : bitmap
Resync Status : 10% complete
Name : i7-harvester:1 (local to host i7-harvester)
UUID : eb089c26:4b9b1dc7:7d9481f9:69e5a3df
Events : 42562
Number Major Minor RaidDevice State
0 8 17 0 active sync set-A /dev/sdb1
1 8 33 1 active sync set-B /dev/sdc1
2 8 49 2 active sync set-A /dev/sdd1
3 8 65 3 active sync set-B /dev/sde1
(Note: the speed reported by mdstat used to be around 3000K/sec, but since today it has dropped to 3K/sec.)
$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : active raid10 sde1[3] sdd1[2] sdc1[1] sdb1[0]
7813769216 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]
[==>..................] resync = 10.7% (837141440/7813769216) finish=33547280.3min speed=3K/sec
bitmap: 53/59 pages [212KB], 65536KB chunk
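To put the resync numbers in perspective, here is a rough Python sketch that pulls the block counts out of the progress line above and estimates the remaining time at a given speed. The function names are made up for illustration, and it assumes the counts are in 1 KiB blocks, matching the K/sec unit that mdstat prints. The speed shown by mdstat is rounded, so the result will not match its finish= value exactly.

```python
import re

# Resync progress line copied from the /proc/mdstat output above.
MDSTAT_LINE = ("[==>..................] resync = 10.7% (837141440/7813769216) "
               "finish=33547280.3min speed=3K/sec")


def parse_resync(line):
    """Return (blocks done, blocks total, reported speed in K/sec)."""
    m = re.search(r"\((\d+)/(\d+)\).*?speed=(\d+)K/sec", line)
    if m is None:
        raise ValueError("no resync progress found in line")
    done, total, speed = (int(g) for g in m.groups())
    return done, total, speed


def remaining_minutes(done, total, speed_k_per_sec):
    """Minutes left if the resync kept running at the given speed."""
    return (total - done) / speed_k_per_sec / 60


done, total, speed = parse_resync(MDSTAT_LINE)
print(f"{100 * done / total:.1f}% done")
for k_per_sec in (speed, 3000):
    days = remaining_minutes(done, total, k_per_sec) / (60 * 24)
    print(f"at {k_per_sec}K/sec: about {days:.0f} days remaining")
```

Even at the earlier 3000K/sec the remaining 90% would take several weeks, so the slowdown to 3K/sec makes an already slow resync effectively stall.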
df
on remote system
$ df -h
df: /mnt/data: Resource temporarily unavailable
Filesystem Size Used Avail Use% Mounted on
/dev/root 59G 13G 44G 22% /
devtmpfs 1.8G 0 1.8G 0% /dev
tmpfs 1.9G 8.0K 1.9G 1% /dev/shm
tmpfs 1.9G 202M 1.7G 11% /run
tmpfs 5.0M 4.0K 5.0M 1% /run/lock
tmpfs 1.9G 0 1.9G 0% /sys/fs/cgroup
/dev/mmcblk0p1 253M 49M 204M 20% /boot
/dev/sda1 1.8T 131G 1.6T 8% /mnt/nas
$ sudo smartctl -a /dev/sdb
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.11.0-38-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Hitachi/HGST Ultrastar 7K4000
Device Model: HGST HUS724040ALA640
LU WWN Device Id: 5 000cca 22bf6d532
Add. Product Id: DELL(tm)
Firmware Version: MFAOAB50
User Capacity: 4.000.787.030.016 bytes [4,00 TB]
Sector Size: 512 bytes logical/physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.0, 3.0 Gb/s (current: 3.0 Gb/s)
Local Time is: Sat Oct 30 09:11:05 2021 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 24) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 533) minutes.
SCT capabilities: (0x003d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 100 100 016 Pre-fail Always - 0
2 Throughput_Performance 0x0004 138 138 000 Old_age Offline - 75
3 Spin_Up_Time 0x0007 127 127 024 Pre-fail Always - 609 (Average 607)
4 Start_Stop_Count 0x0012 100 100 000 Old_age Always - 30
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0
7 Seek_Error_Rate 0x000a 100 100 000 Old_age Always - 0
8 Seek_Time_Performance 0x0004 140 140 000 Old_age Offline - 26
9 Power_On_Hours 0x0012 092 092 000 Old_age Always - 62287
10 Spin_Retry_Count 0x0012 100 100 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 30
192 Power-Off_Retract_Count 0x0032 099 099 000 Old_age Always - 1473
193 Load_Cycle_Count 0x0012 099 099 000 Old_age Always - 1473
194 Temperature_Celsius 0x0002 157 157 000 Old_age Always - 38 (Min/Max 18/47)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always - 0
223 Load_Retry_Count 0x000a 100 100 000 Old_age Always - 0
241 Total_LBAs_Written 0x0012 100 100 000 Old_age Always - 265905350843
242 Total_LBAs_Read 0x0012 100 100 000 Old_age Always - 1599707339787
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 62275 -
# 2 Short offline Completed without error 00% 61911 -
# 3 Short offline Completed without error 00% 12 -
# 4 Extended offline Completed without error 00% 12 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
$ sudo smartctl -a /dev/sdc
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.11.0-38-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Hitachi/HGST Ultrastar 7K4000
Device Model: HGST HUS724040ALA640
LU WWN Device Id: 5 000cca 23dc4c637
Add. Product Id: DELL(tm)
Firmware Version: MFAOAB50
User Capacity: 4.000.787.030.016 bytes [4,00 TB]
Sector Size: 512 bytes logical/physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.0, 3.0 Gb/s (current: 3.0 Gb/s)
Local Time is: Sat Oct 30 09:13:13 2021 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 24) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 553) minutes.
SCT capabilities: (0x003d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 100 100 016 Pre-fail Always - 0
2 Throughput_Performance 0x0004 139 139 000 Old_age Offline - 73
3 Spin_Up_Time 0x0007 125 125 024 Pre-fail Always - 624 (Average 611)
4 Start_Stop_Count 0x0012 100 100 000 Old_age Always - 27
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0
7 Seek_Error_Rate 0x000a 100 100 000 Old_age Always - 0
8 Seek_Time_Performance 0x0004 138 138 000 Old_age Offline - 27
9 Power_On_Hours 0x0012 093 093 000 Old_age Always - 50333
10 Spin_Retry_Count 0x0012 100 100 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 27
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 380
193 Load_Cycle_Count 0x0012 100 100 000 Old_age Always - 380
194 Temperature_Celsius 0x0002 162 162 000 Old_age Always - 37 (Min/Max 20/46)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always - 0
223 Load_Retry_Count 0x000a 100 100 000 Old_age Always - 0
241 Total_LBAs_Written 0x0012 100 100 000 Old_age Always - 186666664018
242 Total_LBAs_Read 0x0012 100 100 000 Old_age Always - 1482419943771
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 50321 -
# 2 Short offline Completed without error 00% 49927 -
# 3 Short offline Completed without error 00% 11 -
# 4 Extended offline Completed without error 00% 11 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
$ sudo smartctl -a /dev/sdd
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.11.0-38-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Western Digital Red
Device Model: WDC WD40EFRX-68WT0N0
LU WWN Device Id: 5 0014ee 2b6822d7f
Firmware Version: 82.00A82
User Capacity: 4.000.787.030.016 bytes [4,00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5400 rpm
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-2 (minor revision not indicated)
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Sat Oct 30 09:14:05 2021 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (51720) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 517) minutes.
Conveyance self-test routine
recommended polling time: ( 5) minutes.
SCT capabilities: (0x703d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
3 Spin_Up_Time 0x0027 209 177 021 Pre-fail Always - 6508
4 Start_Stop_Count 0x0032 098 098 000 Old_age Always - 2532
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
7 Seek_Error_Rate 0x002e 100 253 000 Old_age Always - 0
9 Power_On_Hours 0x0032 060 060 000 Old_age Always - 29367
10 Spin_Retry_Count 0x0032 100 100 000 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 21
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 15
193 Load_Cycle_Count 0x0032 196 196 000 Old_age Always - 14465
194 Temperature_Celsius 0x0022 115 097 000 Old_age Always - 37
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 100 253 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 100 253 000 Old_age Offline - 0
SMART Error Log Version: 1
ATA Error Count: 2
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 2 occurred at disk power-on lifetime: 28742 hours (1197 days + 14 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
10 51 10 e0 32 1e e0 Error: IDNF at LBA = 0x001e32e0 = 1979104
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
ca 00 10 e0 32 1e e0 08 3d+00:35:32.256 WRITE DMA
ca 00 08 00 29 0c e0 08 3d+00:35:32.245 WRITE DMA
ca 00 08 c0 0f 19 e0 08 3d+00:35:32.234 WRITE DMA
ca 00 40 a8 44 1f e0 08 3d+00:35:32.175 WRITE DMA
Error 1 occurred at disk power-on lifetime: 28670 hours (1194 days + 14 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
10 51 08 80 06 4c e0 Error: IDNF at LBA = 0x004c0680 = 4982400
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
ca 00 08 80 06 4c e0 08 13d+13:29:55.786 WRITE DMA
b0 d0 01 00 4f c2 00 08 13d+13:29:47.191 SMART READ DATA
ec 00 00 00 00 00 00 08 13d+13:29:47.191 IDENTIFY DEVICE
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 29354 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
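For anyone reading the two error entries in the /dev/sdd log above, this is a small Python sketch of how the LBA that smartctl prints follows from the register dump. It assumes 28-bit LBA addressing (the failing command is WRITE DMA), where the low four bits of DH hold the top LBA bits; the helper name lba28 is made up for illustration. Bit 4 (0x10) of the ATA error register is the IDNF ("ID not found") flag.

```python
IDNF_BIT = 0x10  # bit 4 of the ATA error register: "ID not found"


def lba28(sn, cl, ch, dh):
    """Combine the ATA task-file registers into a 28-bit LBA."""
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# (ER, SC, SN, CL, CH, DH) copied from the two error entries above.
errors = {
    2: (0x10, 0x10, 0xE0, 0x32, 0x1E, 0xE0),
    1: (0x10, 0x08, 0x80, 0x06, 0x4C, 0xE0),
}

for number, (er, sc, sn, cl, ch, dh) in errors.items():
    lba = lba28(sn, cl, ch, dh)
    flag = "IDNF" if er & IDNF_BIT else "other"
    print(f"Error {number}: {flag} at LBA 0x{lba:08x} = {lba}, {sc} sectors requested")
```

Both entries decode to the same LBAs smartctl reports (1979104 and 4982400), and in both cases the failing command was a write.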
$ sudo smartctl -a /dev/sde
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.11.0-38-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Seagate Desktop HDD.15
Device Model: ST4000DM000-1F2168
LU WWN Device Id: 5 000c50 05070b3a5
Firmware Version: CC52
User Capacity: 4.000.787.030.016 bytes [4,00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5900 rpm
Form Factor: 3.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 1.5 Gb/s)
Local Time is: Sat Oct 30 09:15:01 2021 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.
General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 623) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 540) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x1085) SCT Status supported.
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 111 099 006 Pre-fail Always - 33492856
3 Spin_Up_Time 0x0003 093 091 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 256
5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 067 060 030 Pre-fail Always - 5757534
9 Power_On_Hours 0x0032 092 092 000 Old_age Always - 7873
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 131
183 Runtime_Bad_Block 0x0032 087 087 000 Old_age Always - 13
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
188 Command_Timeout 0x0032 099 087 000 Old_age Always - 1002 1042 1081
189 High_Fly_Writes 0x003a 099 099 000 Old_age Always - 1
190 Airflow_Temperature_Cel 0x0022 066 045 045 Old_age Always In_the_past 34 (Min/Max 25/36)
191 G-Sense_Error_Rate 0x0032 100 100 000 Old_age Always - 0
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 101
193 Load_Cycle_Count 0x0032 090 090 000 Old_age Always - 21749
194 Temperature_Celsius 0x0022 034 055 000 Old_age Always - 34 (0 1 0 0 0)
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 199 194 000 Old_age Always - 790247
240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 1330h+40m+55.940s
241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 39004091013
242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 16225042845
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 7861 -
# 2 Short offline Interrupted (host reset) 90% 7762 -
# 3 Short offline Completed without error 00% 0 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
I’m at a loss as to how to troubleshoot or fix this (I’m new to RAID), so any help would be greatly appreciated.
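With four full smartctl dumps it is easy to miss the few counters that are not zero. The following Python sketch is one way to skim an attribute table for them. The WATCH list is my own choice of attributes, not an official set, and the sample lines are copied from the /dev/sde output above. Raw values are vendor-specific, so a hit is a hint about where to look, not a verdict.

```python
# Attributes to check for a non-zero raw value (an assumed, illustrative list).
WATCH = {
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
    "Command_Timeout",
}

# A few attribute lines copied from the /dev/sde output above.
SDE_ATTRIBUTES = """\
5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0
188 Command_Timeout 0x0032 099 087 000 Old_age Always - 1002 1042 1081
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 199 194 000 Old_age Always - 790247
"""


def nonzero_watched(table):
    """Map watched attribute names to their first raw number, if non-zero."""
    hits = {}
    for line in table.splitlines():
        fields = line.split()
        # Columns: ID, name, flag, value, worst, thresh, type, updated,
        # when_failed, raw value (possibly followed by extra numbers).
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCH:
            raw = int(fields[9])
            if raw:
                hits[fields[1]] = raw
    return hits


for name, raw in nonzero_watched(SDE_ATTRIBUTES).items():
    print(f"{name}: {raw}")
```

On this sample it reports Command_Timeout and UDMA_CRC_Error_Count for /dev/sde. CRC errors are commonly associated with the cable or the SATA link rather than the platters, and the same dump shows that drive's link running at 1.5 Gb/s instead of 6.0 Gb/s, so the cabling and port for that drive may be worth checking first.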