Topic: Disk error: I/O error, dev sda, sector XXXXX

p4sh
When the PC starts, I see a lot of errors in dmesg:
https://paste.ubuntu.com/p/YKY74JTwsD/
If I read any individual sector manually, I sometimes get:
root@mail:~# hdparm --read-sector 25523880 /dev/sda
/dev/sda:
reading sector 25523880: SG_IO: bad/missing sense data, sb[]: 70 00 03 00 00 00 00 0a 40 51 e1 01 11 04 00 00 00 a8 00 00 00 00 00 00 00 00 00 00 00 00 00 00
succeeded
But sometimes it is just "succeeded", meaning the sectors can be read.
I checked SMART, and it reports no errors on the disk.
The problem is that after a while one of the file systems (/var) goes read-only and many programs stop working.
What would you advise?
The original poster has not appeared on the forum for more than half a year as of 22/07/2019 (last visit: 23/11/2018). The section moderator has decided to close the thread.
- zg_nico

bearpuh
p4sh wrote: I checked SMART, and it reports no errors on the disk.
And doesn't this tell you anything?
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed: read failure 90% 20772 25523880
# 2 Short offline Completed: read failure 90% 20677 1057345043
# 3 Short offline Completed: read failure 90% 20677 1057345043
# 4 Short offline Completed: read failure 90% 20677 1057345043
# 5 Short offline Completed: read failure 90% 20677 1057345043
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed: read failure 90% 29074 24928270
There are unreadable sectors on the disks.
Back up first, then check with Victoria or badblocks:
sudo /usr/sbin/badblocks -o /path/to/file/badblocks.list -b 4096 -s -v /dev/sdX
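The failing LBAs in self-test logs like the ones quoted above can be collected mechanically before they are checked one by one. This is a minimal Python sketch, assuming the log text has already been copied out of smartctl; the function name `failing_lbas` and the regular expression are illustrative and not part of smartmontools.

```python
import re

# Matches self-test rows that ended in a read failure and captures the LBA column.
# The pattern is an assumption based on the log lines quoted in this thread.
ROW = re.compile(
    r"^#\s*\d+\s+.+?\s+Completed: read failure\s+\d+%\s+\d+\s+(\d+)$"
)

def failing_lbas(log_text):
    """Return the distinct LBA_of_first_error values, in order of appearance."""
    found = []
    for line in log_text.splitlines():
        match = ROW.match(line.strip())
        if match:
            lba = int(match.group(1))
            if lba not in found:
                found.append(lba)
    return found

sample = """\
# 1 Extended offline Completed: read failure 90% 20772 25523880
# 2 Short offline Completed: read failure 90% 20677 1057345043
# 3 Short offline Completed: read failure 90% 20677 1057345043
# 1 Short offline Completed: read failure 90% 29074 24928270
"""
print(failing_lbas(sample))
```

Each LBA in the resulting list is a candidate for a targeted read test.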

p4sh
That's just it, the sectors are readable (or am I wrong? Please correct me):
root@mail:~# hdparm --read-sector 1873032872 /dev/sda
/dev/sda:
reading sector 1873032872: succeeded
0000 0000 f40f 0c01 4442 4537 4136 3534
3937 3335 6857 5806 1400 0c01 4433 3639
.......
root@mail:~# hdparm --read-sector 148453280 /dev/sda
/dev/sda:
reading sector 148453280: succeeded
bb10 5600 0c00 0102 2e00 0000 ba10 5600
3000 0202 2e2e 0000 bc10 5600 2400 1c01
root@mail:~# hdparm --read-sector 1285929908 /dev/sda
/dev/sda:
reading sector 1285929908: SG_IO: bad/missing sense data, sb[]: 70 00 03 00 00 00 00 0a 40 51 e0 01 11 04 00 00 a0 b4 00 00 00 00 00 00 00 00 00 00 00 00 00 00
succeeded
0000 0000 0000 0000 0000 0000 0000 0000
Thanks!

bearpuh
p4sh wrote: That's just it, the sectors are readable
To be sure of that, you need to test it.
I have already written how.
I would also connect the disk to another controller or computer to check.

Sly_tom_cat
Trouble reported in SMART cannot be fixed by replacing the controller.
A controller problem usually shows up in:
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
But here it is clean.

bearpuh
Sly_tom_cat wrote: Trouble reported in SMART cannot be fixed by replacing the controller.
Agreed. This is just what threw me off:
SG_IO: bad/missing sense data

snowin
p4sh wrote: That's just it, the sectors are readable (or am I wrong? Please correct me)
You are wrong.

ReNzRv
It is better to check and repair from the bootable Seagate Tools for DOS image, using the Zero All command (overwrites all sectors) and Long Test (DST), a full check of all sectors that remaps bad ones at the level of the drive's controller.

p4sh
man hdparm
--read-sector
Reads from the specified sector number, and dumps the contents in hex to standard output. The sector number must be given (base10) after this option. hdparm will issue a
low-level read (completely bypassing the usual block layer read/write mechanisms) for the specified sector. This can be used to definitively check whether a given sector
is bad (media error) or not (doing so through the usual mechanisms can sometimes give false positives).
snowin wrote: You are wrong.
I don't understand. Could you explain in more detail why a read with hdparm reports "succeeded" while the sectors are "unreadable"? Is the software unreliable?
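One way to approach this question is to decode the sb[] bytes that hdparm printed next to "succeeded". The Python sketch below assumes they are SCSI fixed-format sense data (response code 0x70), where the low nibble of byte 2 is the sense key and bytes 12 and 13 are the additional sense code and its qualifier. The lookup tables hold only a few entries, and the function name `decode_fixed_sense` is hypothetical.

```python
# A few sense keys and ASC/ASCQ pairs from the SCSI standard; deliberately incomplete.
SENSE_KEYS = {
    0x0: "No Sense",
    0x1: "Recovered Error",
    0x3: "Medium Error",
    0x4: "Hardware Error",
    0xB: "Aborted Command",
}
ASC_ASCQ = {
    (0x11, 0x00): "Unrecovered read error",
    (0x11, 0x04): "Unrecovered read error - auto reallocate failed",
}

def decode_fixed_sense(hex_dump):
    """Decode sense key and ASC/ASCQ from a fixed-format sense buffer."""
    sb = bytes.fromhex(hex_dump)  # whitespace between byte pairs is ignored
    if len(sb) < 14 or (sb[0] & 0x7F) not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    key = sb[2] & 0x0F
    asc, ascq = sb[12], sb[13]
    return {
        "sense_key": SENSE_KEYS.get(key, hex(key)),
        "additional": ASC_ASCQ.get((asc, ascq), f"ASC {asc:#04x} / ASCQ {ascq:#04x}"),
    }

# Bytes copied from the hdparm output for sector 25523880 in this thread.
sample = ("70 00 03 00 00 00 00 0a 40 51 e1 01 11 04 00 00 "
          "00 a8 00 00 00 00 00 00 00 00 00 00 00 00 00 00")
print(decode_fixed_sense(sample))
```

For the bytes in this thread the sense key is 3 (Medium Error) with ASC/ASCQ 0x11/0x04, which suggests that the read itself failed even though hdparm went on to print "succeeded".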

bearpuh
p4sh wrote: a read with hdparm reports "succeeded" while the sectors are "unreadable"?
And how long does it take to read that sector?
By what criterion does Victoria HDD, for example, decide that a sector is "bad"?
Read this sector with Victoria; it may become clearer.
If you want theory, here it is from the author of smartmontools: https://www.smartmontools.org/wiki/BadBlockHowto#ext2ext3secondexample
One more thing to note.
Several sectors on both of your disks are candidates for reallocation:
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 2
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 4
They can be given a "kick": forced reallocation.
The details are in the smartmontools link above.
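The smartmontools page linked above converts the LBA of a bad sector into a file system block number with b = (int)((L - S) * 512 / B), where L is the LBA, S is the starting sector of the partition, and B is the file system block size in bytes. The Python sketch below shows that arithmetic only. The partition start of 2048 and the block size of 4096 are assumed values for illustration (the real ones would come from fdisk and tune2fs), and if the partition is part of a RAID array or LVM volume, further offsets apply.

```python
def fs_block_for_lba(lba, partition_start, block_size, sector_size=512):
    """Map an absolute disk LBA to a file system block number inside a partition."""
    if lba < partition_start:
        raise ValueError("LBA lies before the start of the partition")
    return (lba - partition_start) * sector_size // block_size

# LBA taken from the self-test log in this thread; the partition start and
# block size are hypothetical placeholders.
print(fs_block_for_lba(25523880, partition_start=2048, block_size=4096))
```

The resulting block number is what tools such as debugfs expect when looking up which file, if any, owns the damaged block.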

snowin
bearpuh wrote: They can be given a "kick": forced reallocation.
It is enough to simply write to them and read them back, several times if needed.
If they are real bad sectors, the drive will remap them itself; otherwise they are just so-called "soft bads" and should disappear from SMART.

p4sh
What I did:
Booted from a live USB, assembled the array, and checked the file system:
e2fsck -ct /dev/…
Ran the tests again.
Rebooted, and I am now monitoring the state of the file system.
SMART has also been updated:
Multi_Zone_Error_Rate has changed.
One sector pending reallocation remains on sda: Current_Pending_Sector 1
/dev/sda
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
1 Raw_Read_Error_Rate POSR-K 200 200 051 - 24
3 Spin_Up_Time POS--K 179 172 021 - 4033
4 Start_Stop_Count -O--CK 099 099 000 - 1401
5 Reallocated_Sector_Ct PO--CK 200 200 140 - 0
7 Seek_Error_Rate -OSR-K 200 200 000 - 0
9 Power_On_Hours -O--CK 072 072 000 - 20819
10 Spin_Retry_Count -O--CK 100 100 000 - 0
11 Calibration_Retry_Count -O--CK 100 100 000 - 0
12 Power_Cycle_Count -O--CK 100 100 000 - 876
192 Power-Off_Retract_Count -O--CK 199 199 000 - 758
193 Load_Cycle_Count -O--CK 200 200 000 - 642
194 Temperature_Celsius -O---K 102 081 000 - 45
196 Reallocated_Event_Count -O--CK 200 200 000 - 0
197 Current_Pending_Sector -O--CK 200 200 000 - 1
198 Offline_Uncorrectable ----CK 200 200 000 - 1
199 UDMA_CRC_Error_Count -O--CK 200 200 000 - 0
200 Multi_Zone_Error_Rate ---R-- 200 200 000 - 1
/dev/sdb
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
1 Raw_Read_Error_Rate POSR-K 200 200 051 - 209
3 Spin_Up_Time POS--K 191 173 021 - 3416
4 Start_Stop_Count -O--CK 099 099 000 - 1680
5 Reallocated_Sector_Ct PO--CK 200 200 140 - 0
7 Seek_Error_Rate -OSR-K 100 253 000 - 0
9 Power_On_Hours -O--CK 060 060 000 - 29215
10 Spin_Retry_Count -O--CK 100 100 000 - 0
11 Calibration_Retry_Count -O--CK 100 100 000 - 0
12 Power_Cycle_Count -O--CK 100 100 000 - 773
192 Power-Off_Retract_Count -O--CK 200 200 000 - 650
193 Load_Cycle_Count -O--CK 200 200 000 - 1029
194 Temperature_Celsius -O---K 103 091 000 - 44
196 Reallocated_Event_Count -O--CK 200 200 000 - 0
197 Current_Pending_Sector -O--CK 200 200 000 - 0
198 Offline_Uncorrectable ----CK 200 200 000 - 4
199 UDMA_CRC_Error_Count -O--CK 200 200 000 - 0
200 Multi_Zone_Error_Rate ---R-- 200 200 000 - 3
Now I will check with Victoria (it used to do auto-remap, I think).
Thanks to everyone for the answers, this is a very useful thread for me!
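The counters discussed in this thread (IDs 5, 196, 197 and 198) can be picked out of an attribute table like the two above with a few lines of Python. This sketch assumes the eight-column brief layout shown here, where the raw value is the eighth field; the full smartctl layout has more columns and would need a different index. The function name and the set of watched IDs are illustrative choices.

```python
# Attribute IDs related to reallocated, pending, or uncorrectable sectors.
WATCHED_IDS = {5, 196, 197, 198}

def nonzero_sector_counters(table_text):
    """Return {attribute_name: raw_value} for watched attributes with raw > 0."""
    hits = {}
    for line in table_text.splitlines():
        fields = line.split()
        # Skip headers and anything that is not an 8-column attribute row.
        if len(fields) < 8 or not fields[0].isdigit():
            continue
        raw = fields[7]
        if int(fields[0]) in WATCHED_IDS and raw.isdigit() and int(raw) > 0:
            hits[fields[1]] = int(raw)
    return hits

sda = """\
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
5 Reallocated_Sector_Ct PO--CK 200 200 140 - 0
196 Reallocated_Event_Count -O--CK 200 200 000 - 0
197 Current_Pending_Sector -O--CK 200 200 000 - 1
198 Offline_Uncorrectable ----CK 200 200 000 - 1
"""
print(nonzero_sector_counters(sda))
```

A non-empty result means that at least one sector is still waiting to be reallocated or failed an offline scan, regardless of the overall health summary.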

snowin
p4sh wrote: Now I will check with Victoria (it used to do auto-remap, I think).
You don't need a remap.
For a start, replace the cable,
on both drives.
As for
p4sh wrote: That's just it, the sectors are readable (or am I wrong? Please correct me):
it seems you don't really understand what you are doing or why.
You take a random sector on the disk, test it for reading with hdparm, and claim that it reads,
while you don't test the problem sectors.
Still, your random, reckless actions (reassembling the RAID) led to better results,
but that is a crude method.
Hello. Could you please tell me what can be done with this disk (other than replacing it)?
Are these errors critical?
[531787.817056] ata4.00: configured for UDMA/133
[531787.817074] sd 3:0:0:0: [sdd] tag#4 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[531787.817078] sd 3:0:0:0: [sdd] tag#4 Sense Key : Medium Error [current]
[531787.817081] sd 3:0:0:0: [sdd] tag#4 Add. Sense: Unrecovered read error
[531787.817085] sd 3:0:0:0: [sdd] tag#4 CDB: Read(10) 28 00 05 a8 c0 80 00 04 00 00
[531787.817088] print_req_error: I/O error, dev sdd, sector 94945457
[531787.817844] ata4: EH complete
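The CDB line in the log above can be decoded to see which range of blocks the failed command asked for. The Python sketch below assumes the standard 10-byte READ(10) layout: opcode 0x28, a big-endian 32-bit LBA in bytes 2 to 5, and a big-endian 16-bit transfer length in bytes 7 to 8. The function name is hypothetical.

```python
def decode_read10(cdb_hex):
    """Return (start_lba, block_count) from a READ(10) command descriptor block."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    start_lba = int.from_bytes(cdb[2:6], "big")
    block_count = int.from_bytes(cdb[7:9], "big")
    return start_lba, block_count

start, count = decode_read10("28 00 05 a8 c0 80 00 04 00 00")
failed_sector = 94945457  # from the print_req_error line
print(start, count, start <= failed_sector < start + count)
```

Here the request covers 1024 blocks starting at LBA 94945408, and the reported sector 94945457 falls inside that range, so the kernel message points at one specific block within a larger read.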
Output of smartctl -a /dev/sdd:
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.15.18-15-pve] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Western Digital Re
Device Model: WDC WD2004FBYZ-01YCBB1
Serial Number: WD-WMC6N0D0Y4MT
LU WWN Device Id: 5 0014ee 05994de88
Firmware Version: RR04
User Capacity: 2,000,398,934,016 bytes [2.00 TB]
Sector Size: 512 bytes logical/physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-3 T13/2161-D revision 5
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is: Tue Jun 11 15:10:59 2019 +07
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 0) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 220) minutes.
Conveyance self-test routine
recommended polling time: ( 5) minutes.
SCT capabilities: (0x003d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 200 199 051 Pre-fail Always - 1
3 Spin_Up_Time 0x0027 188 182 021 Pre-fail Always - 3583
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 229
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
7 Seek_Error_Rate 0x002e 100 200 000 Old_age Always - 0
9 Power_On_Hours 0x0032 073 073 000 Old_age Always - 20394
10 Spin_Retry_Count 0x0032 100 100 000 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 100 100 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 227
16 Unknown_Attribute 0x0022 003 197 000 Old_age Always - 191014087496
183 Runtime_Bad_Block 0x0032 100 100 000 Old_age Always - 0
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 182
193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 111
194 Temperature_Celsius 0x0022 117 107 000 Old_age Always - 30
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 4
198 Offline_Uncorrectable 0x0030 100 253 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 100 253 000 Old_age Offline - 0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 19429 -
# 2 Short offline Completed without error 00% 19379 -
# 3 Short offline Completed without error 00% 19330 -
# 4 Short offline Completed without error 00% 19280 -
# 5 Short offline Completed without error 00% 19230 -
# 6 Short offline Completed without error 00% 19180 -
# 7 Short offline Completed without error 00% 9228 -
# 8 Short offline Completed without error 00% 9205 -
# 9 Short offline Completed without error 00% 9181 -
#10 Short offline Completed without error 00% 9157 -
#11 Short offline Completed without error 00% 9133 -
#12 Short offline Completed without error 00% 9109 -
#13 Short offline Completed without error 00% 9085 -
#14 Short offline Completed without error 00% 9061 -
#15 Short offline Completed without error 00% 9037 -
#16 Short offline Completed without error 00% 9013 -
#17 Short offline Completed without error 00% 8989 -
#18 Short offline Completed without error 00% 8965 -
#19 Short offline Completed without error 00% 8941 -
#20 Short offline Completed without error 00% 8917 -
#21 Short offline Completed without error 00% 8893 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
The disk seems to be fine.
To start with, check the cable itself and the connectors.
Although it is rather odd that it points to a specific sector. Try backing up what matters and running a test with Victoria.
If the disk works, make a copy of the disk and test the disk itself with MHDD.
Well, it's a read failure. Copy off the data if there is anything valuable and the disk is not in a RAID, then pull the disk, put it in a test-bench machine, and test it with MHDD. If the error disappears after the sector is formatted in MHDD, fine, we'll live. Otherwise, you can try remapping the sector with MHDD as well, if that works, of course.
If every trick still ends in a read failure, then R.I.P.
This looks like a soft bad sector. On a power loss the sector was not written completely, so its checksum is wrong.
You need to force a write of something to this sector, for example zeros with hdparm.
Hello! I know the HDD topic has come up more than once, but in two weeks of searching I have not found a solution. In short, I have a computer assembled from whatever parts were at hand, with two hard drives, 500 GB and 250 GB. Win7 was installed on the 500 GB drive and Mint 19 on the 250 GB drive. After installation (and during it as well), Mint does not see the 500 GB disk and therefore Win7. More precisely, when Mint boots the disk sometimes appears, but on an attempt to mount it the message "Could not mount the 500 GB disk. Operation cancelled" is shown and the disk disappears. In one of those lucky moments when the disk was visible, I ran fdisk on it. Here is the output of sudo fdisk -l:
vladimir@vladimir-PC:~$ sudo fdisk -l
[sudo] password for vladimir:
Disk /dev/sda: 465.8 GiB, 500107862016 bytes, 976773168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0x11a6c81a
Device     Boot     Start       End   Sectors   Size Id Type
/dev/sda1  *         2048    206847    204800   100M  7 HPFS/NTFS
/dev/sda2          206848 976771071 976564224 465.7G  7 HPFS/NTFS
Disk /dev/sdb: 232.9 GiB, 250059350016 bytes, 488397168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x6e4283a3
Device     Boot     Start       End   Sectors   Size Id Type
/dev/sdb1  *         2048 195311615 195309568  93.1G 83 Linux
/dev/sdb2       195313662 210935807  15622146   7.5G  5 Extended
/dev/sdb3       210935808 488396799 277460992 132.3G 83 Linux
/dev/sdb5       195313664 210935807  15622144   7.5G 82 Linux
Partition table entries are not in disk order.
vladimir@vladimir-PC:~$
GParted (when the disk is visible) thinks for a long time and then reports something like "error saving files or syncing on /dev/sda".
Once GParted did see the 500 GB disk, but I noticed nothing suspicious apart from a 100 MB NTFS partition at the start of the disk and a small 18 MB unallocated area at the end of the disk.
What I have tried:
disabled hybrid sleep in Win7;
tried changing the disk controller modes in the BIOS;
ran chkdsk on the disk in Win7.
I suspect that the disk was once part of a RAID array and still keeps fond memories of it to this day. But I am not sure.
Could you please suggest which direction to look in, and how?
Hello,
I am setting up nextcloudpi on a Raspberry Pi 4 4GB. Unfortunately, after a few minutes or hours, the mounted external SSD, plugged into a USB 3 port, fails.
The kernel log shows:
[ 114.481827] usb 2-1: USB disconnect, device number 2
[ 114.487346] print_req_error: I/O error, dev sda, sector 1951951376
[ 114.487377] EXT4-fs warning (device sda1): ext4_end_bio:323: I/O error 10 writing to inode 60950252 (offset 0 size 520192 starting block 243994049)
[ 114.487392] Buffer I/O error on device sda1, logical block 243993666
[ 114.487417] Buffer I/O error on device sda1, logical block 243993667
[ 114.487440] Buffer I/O error on device sda1, logical block 243993668
[ 114.487459] Buffer I/O error on device sda1, logical block 243993669
[ 114.487475] Buffer I/O error on device sda1, logical block 243993670
[ 114.487491] Buffer I/O error on device sda1, logical block 243993671
[ 114.487507] Buffer I/O error on device sda1, logical block 243993672
[ 114.487522] Buffer I/O error on device sda1, logical block 243993673
[ 114.487538] Buffer I/O error on device sda1, logical block 243993674
[ 114.487554] Buffer I/O error on device sda1, logical block 243993675
[ 114.490420] print_req_error: I/O error, dev sda, sector 1950870336
[ 114.490444] EXT4-fs warning (device sda1): ext4_end_bio:323: I/O error 10 writing to inode 60950253 (offset 0 size 40960 starting block 243858802)
[ 114.490549] print_req_error: I/O error, dev sda, sector 1951952392
[ 114.490573] EXT4-fs warning (device sda1): ext4_end_bio:323: I/O error 10 writing to inode 60950254 (offset 0 size 724992 starting block 243994226)
[ 114.500871] print_req_error: I/O error, dev sda, sector 1946159120
[ 114.500890] Buffer I/O error on dev sda1, logical block 243269634, lost async page write
[ 114.500933] print_req_error: I/O error, dev sda, sector 1946160184
[ 114.500946] Buffer I/O error on dev sda1, logical block 243269767, lost async page write
[ 114.501030] JBD2: Detected IO errors while flushing file data on sda1-8
[ 114.501057] print_req_error: I/O error, dev sda, sector 973346008
[ 114.501091] Aborting journal on device sda1-8.
[ 114.501152] print_req_error: I/O error, dev sda, sector 973342720
[ 114.501166] print_req_error: I/O error, dev sda, sector 973342720
[ 114.501179] Buffer I/O error on dev sda1, logical block 121667584, lost sync page write
[ 114.501203] JBD2: Error -5 detected when updating journal superblock for sda1-8.
[ 114.501243] print_req_error: I/O error, dev sda, sector 1946636536
[ 114.501257] Buffer I/O error on dev sda1, logical block 243329311, lost async page write
[ 114.501287] print_req_error: I/O error, dev sda, sector 1950353440
[ 114.501299] Buffer I/O error on dev sda1, logical block 243793924, lost async page write
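The log lines above pair ext4 logical block numbers on sda1 with absolute sectors on sda, and the numbers are consistent with sector = partition_start + block * 8. The Python sketch below shows that relation, assuming 4096-byte file system blocks, 512-byte sectors, and a partition that starts at sector 2048. These values are inferred from the numbers in the log, not confirmed by the poster.

```python
def block_to_sector(block, partition_start=2048, block_size=4096, sector_size=512):
    """Map a file system block inside a partition to an absolute disk sector."""
    return partition_start + block * (block_size // sector_size)

# (logical block, sector) pairs that appear next to each other in the log.
pairs = [
    (243269634, 1946159120),
    (243269767, 1946160184),
    (121667584, 973342720),
    (243329311, 1946636536),
]
for block, sector in pairs:
    print(block, block_to_sector(block), block_to_sector(block) == sector)
```

Because every pair agrees with the same offset, the "Buffer I/O error" and "print_req_error" lines appear to describe the same failed requests at two different layers, not separate faults.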
I tried to fix the error by running fsck from another computer, and the disk seems to be fine:
$ sudo fsck /dev/sdc1
fsck from util-linux 2.35.1
e2fsck 1.45.6 (20-Mar-2020)
NEXTCLOUD: clean, 3691/61054976 files, 4856017/244190385 blocks
After running fsck, plugging the external SSD back to the pi and rebooting the pi, the drive behaves normally for some time before failing (sometimes a few minutes, sometimes a few hours).
I thought it was a power issue so I tried 3 different connectors:
- 1 powered usb 3 to sata HDD enclosure (Orico)
- 1 powered usb3 to sata adapter (Unitek)
- 1 unpowered usb3 to sata hdd enclosure (Orico)
The same problem occurs every time.
I also tried plugging in other drives:
- the SSD : Crucial MX500 1TB 2.5″ SSD
- an HDD : a toshiba drive I salvaged from another laptop
The problem happened to the HDD after a longer period of time. I was using the HDD as a backup of the SSD and I noticed that it failed only after a few days.
At this point, my guesses are:
- either there was a power issue that corrupted the drives and they are now unusable (but then why does fsck report the file system as "clean"? I'm not an expert, so I might be reading it wrong)
- or all my cheap enclosures are not working at all
- or the USB ports of the Pi are failing
I’d like to test those guesses but I’m lost trying to figure out how.
Thank you for your help.
EDIT: The members of this forum just helped me fix and repair a nasty hard disk error. I had run file system checks before, but what I never knew was that the default check does not update the bad block inode list.
p.H wrote:e2fsck detects and marks bad blocks only when run with the -c option.
With that one sentence, p.H saved my computer. And the advice that he and L_V gave me in this thread was priceless.
What ultimately worked for me was checking both my / (root) and /home partitions with the non-destructive read-write option, -cc from a Live CD:
e2fsck -f -y -cc -C0 /dev/sda5
e2fsck -f -y -cc -C0 /dev/sda7
That check identified and repaired the affected inodes. It also wrote over the damaged files. Keep a list of those files. You will have to replace them (as explained below).
Next, I ran the checks again with the read-only option -c:
e2fsck -f -y -c -C0 /dev/sda5
e2fsck -f -y -c -C0 /dev/sda7
Running the check a second time was an important step because it added a few more blocks to the bad blocks list.
Having repaired the file system, the next step was to repair the affected files:
p.H wrote:Note that e2fsck can remap bad blocks but cannot restore the unreadable contents of the affected files, so these files must be reinstalled from their respective packages.
In my case, I had a fresh install of Debian Buster and a Debian Buster Live CD, so I just copied them from the Live CD:
mkdir /media/inspiron
mount /dev/sda5 /media/inspiron
cp /usr/bin/$FILE01 /media/inspiron/usr/bin/$FILE01
cp /usr/bin/$FILE02 /media/inspiron/usr/bin/$FILE02
...
umount /dev/sda5
After that, the computer booted like a charm. Importantly, it shut down like a charm too. There were no priority 0 or 1 messages in my journalctl.
Thank you to p.H and L_V for helping me rescue this old machine!
—————————————-
ORIGINAL POST:
After a fresh installation of Debian Buster on an old machine, the partition that holds /home does not unmount at shutdown. The problem seems to be caused by an I/O error. At first glance, smartctl does not show any errors, but a deeper look shows that the disk experienced a few errors on the / (root) partition a few years ago.
If I followed Linux Admins’ «Fixing disk problems» guide would that resolve the issue?
Thanks in advance,
— Soul
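The journal excerpt in this post contains libata result lines such as "res 41/40:08:cb:a3:b6/00:00:09:00:00/00", and the failing LBA can be recovered from them. The Python sketch below assumes the fields are status/error:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device, which is how the kernel appears to print the taskfile; the function name is hypothetical.

```python
def lba48_from_taskfile(taskfile):
    """Rebuild the 48-bit LBA from a libata 'res' or 'cmd' taskfile string."""
    groups = taskfile.split("/")
    if len(groups) != 4:
        raise ValueError("expected four '/'-separated groups")
    low = [int(x, 16) for x in groups[1].split(":")]
    high = [int(x, 16) for x in groups[2].split(":")]
    if len(low) != 5 or len(high) != 5:
        raise ValueError("expected five ':'-separated bytes per group")
    # Low group: error/feature, nsect, lbal, lbam, lbah.
    _, _, lbal, lbam, lbah = low
    # High-order-byte group: hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah.
    _, _, hob_lbal, hob_lbam, hob_lbah = high
    return (
        (hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24)
        | (lbah << 16) | (lbam << 8) | lbal
    )

print(lba48_from_taskfile("41/40:08:cb:a3:b6/00:00:09:00:00/00"))
print(lba48_from_taskfile("41/40:08:f6:00:08/00:00:0c:00:00/00"))
```

The two results, 162964427 and 201851126, match the sectors named in the print_req_error lines of the same excerpt, which supports the assumed field order.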
$ journalctl -r -b -1 -p3
-- Logs begin at Sun 2019-05-19 13:22:05 EDT, end at Sun 2019-05-19 15:26:53 EDT. --
May 19 14:51:02 inspiron systemd[1]: Failed unmounting /home.
May 19 14:51:02 inspiron kernel: print_req_error: I/O error, dev sda, sector 162964427
May 19 14:51:02 inspiron kernel: ata1.00: error: { UNC }
May 19 14:51:02 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 14:51:02 inspiron kernel: ata1.00: cmd 60/08:88:c8:a3:b6/00:00:09:00:00/40 tag 17 ncq dma 4096 in
res 41/40:08:cb:a3:b6/00:00:09:00:00/00 Emask 0x409 (media error) <F>
May 19 14:51:02 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 14:51:02 inspiron kernel: ata1.00: irq_stat 0x40000008
May 19 14:51:02 inspiron kernel: ata1.00: exception Emask 0x0 SAct 0x20000 SErr 0x0 action 0x0
May 19 14:50:59 inspiron kernel: print_req_error: I/O error, dev sda, sector 162964427
May 19 14:50:59 inspiron kernel: ata1.00: error: { UNC }
May 19 14:50:59 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 14:50:59 inspiron kernel: ata1.00: cmd 60/20:a8:c0:a3:b6/00:00:09:00:00/40 tag 21 ncq dma 16384 in
res 41/40:20:cb:a3:b6/00:00:09:00:00/00 Emask 0x409 (media error) <F>
May 19 14:50:59 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 14:50:59 inspiron kernel: ata1.00: irq_stat 0x40000008
May 19 14:50:59 inspiron kernel: ata1.00: exception Emask 0x0 SAct 0x200000 SErr 0x0 action 0x0
May 19 14:50:43 inspiron wpa_supplicant[509]: dbus: wpa_dbus_property_changed: no property SessionLength in object /fi/w1/wpa_supplicant1/Interfaces/1
May 19 14:47:06 inspiron root[7585]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 14:40:19 inspiron root[7277]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 14:34:55 inspiron root[7129]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 14:27:20 inspiron root[6970]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 14:19:26 inspiron root[6425]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 14:13:30 inspiron root[6164]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 14:07:00 inspiron root[4631]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 14:01:11 inspiron root[3004]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 13:53:40 inspiron root[2451]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 13:46:29 inspiron root[2355]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 13:41:26 inspiron root[2260]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 13:33:31 inspiron root[1803]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 13:28:13 inspiron root[1633]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 13:26:48 inspiron kernel: print_req_error: I/O error, dev sda, sector 201851126
May 19 13:26:48 inspiron kernel: ata1.00: error: { UNC }
May 19 13:26:48 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:48 inspiron kernel: ata1.00: cmd 60/08:c8:f0:00:08/00:00:0c:00:00/40 tag 25 ncq dma 4096 in
res 41/40:08:f6:00:08/00:00:0c:00:00/00 Emask 0x409 (media error) <F>
May 19 13:26:48 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:48 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:48 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:48 inspiron kernel: ata1.00: cmd 60/08:98:90:f6:3c/00:00:0a:00:00/40 tag 19 ncq dma 4096 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:48 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:48 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:48 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:48 inspiron kernel: ata1.00: cmd 60/08:90:08:3e:d1/00:00:30:00:00/40 tag 18 ncq dma 4096 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:48 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:48 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:48 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:48 inspiron kernel: ata1.00: cmd 60/08:88:08:2d:8d/00:00:15:00:00/40 tag 17 ncq dma 4096 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:48 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:48 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:48 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:48 inspiron kernel: ata1.00: cmd 60/08:80:c0:3e:59/00:00:09:00:00/40 tag 16 ncq dma 4096 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:48 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:48 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:48 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:48 inspiron kernel: ata1.00: cmd 61/18:30:18:7f:c5/00:00:2f:00:00/40 tag 6 ncq dma 12288 out
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:48 inspiron kernel: ata1.00: failed command: WRITE FPDMA QUEUED
May 19 13:26:48 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:48 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:48 inspiron kernel: ata1.00: cmd 60/08:28:c8:6a:71/00:00:15:00:00/40 tag 5 ncq dma 4096 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:48 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:48 inspiron kernel: ata1.00: irq_stat 0x40000001
May 19 13:26:48 inspiron kernel: ata1.00: exception Emask 0x0 SAct 0x20f0060 SErr 0x0 action 0x0
May 19 13:26:45 inspiron kernel: print_req_error: I/O error, dev sda, sector 361573512
May 19 13:26:45 inspiron kernel: print_req_error: I/O error, dev sda, sector 156843584
May 19 13:26:45 inspiron kernel: print_req_error: I/O error, dev sda, sector 201851126
May 19 13:26:45 inspiron kernel: print_req_error: I/O error, dev sda, sector 359754432
May 19 13:26:45 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:45 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:45 inspiron kernel: ata1.00: cmd 60/d0:f0:88:2c:8d/00:00:15:00:00/40 tag 30 ncq dma 106496 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:45 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:45 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:45 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:45 inspiron kernel: ata1.00: cmd 60/00:e8:40:3e:59/01:00:09:00:00/40 tag 29 ncq dma 131072 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:45 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:45 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:45 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:45 inspiron kernel: ata1.00: cmd 60/08:58:08:3e:d1/00:00:30:00:00/40 tag 11 ncq dma 4096 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:45 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:45 inspiron kernel: ata1.00: error: { UNC }
May 19 13:26:45 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:45 inspiron kernel: ata1.00: cmd 60/08:48:f0:00:08/00:00:0c:00:00/40 tag 9 ncq dma 4096 in
res 41/40:08:f6:00:08/00:00:0c:00:00/00 Emask 0x409 (media error) <F>
May 19 13:26:45 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:45 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:45 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:45 inspiron kernel: ata1.00: cmd 60/08:18:90:f6:3c/00:00:0a:00:00/40 tag 3 ncq dma 4096 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:45 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:45 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:45 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:45 inspiron kernel: ata1.00: cmd 60/40:10:c0:6a:71/00:00:15:00:00/40 tag 2 ncq dma 32768 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:45 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:45 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:45 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:45 inspiron kernel: ata1.00: cmd 61/08:00:00:70:cc/00:00:31:00:00/40 tag 0 ncq dma 4096 out
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:45 inspiron kernel: ata1.00: failed command: WRITE FPDMA QUEUED
May 19 13:26:45 inspiron kernel: ata1.00: irq_stat 0x40000001
May 19 13:26:45 inspiron kernel: ata1.00: exception Emask 0x0 SAct 0x60000a0d SErr 0x0 action 0x0
May 19 13:26:40 inspiron kernel: print_req_error: I/O error, dev sda, sector 804169080
May 19 13:26:40 inspiron kernel: print_req_error: I/O error, dev sda, sector 361572440
May 19 13:26:40 inspiron kernel: print_req_error: I/O error, dev sda, sector 201851904
May 19 13:26:40 inspiron kernel: print_req_error: I/O error, dev sda, sector 201851126
May 19 13:26:40 inspiron kernel: print_req_error: I/O error, dev sda, sector 813441024
May 19 13:26:40 inspiron kernel: print_req_error: I/O error, dev sda, sector 201848320
May 19 13:26:39 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:39 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:39 inspiron kernel: ata1.00: cmd 60/40:e0:78:a5:ee/00:00:2f:00:00/40 tag 28 ncq dma 32768 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:39 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:39 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:39 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:39 inspiron kernel: ata1.00: cmd 60/08:c8:30:59:d1/00:00:30:00:00/40 tag 25 ncq dma 4096 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:39 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:39 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:39 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:39 inspiron kernel: ata1.00: cmd 60/08:c0:58:60:92/00:00:31:00:00/40 tag 24 ncq dma 4096 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:39 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:39 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:39 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:39 inspiron kernel: ata1.00: cmd 60/58:60:58:28:8d/00:00:15:00:00/40 tag 12 ncq dma 45056 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:39 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:39 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:39 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:39 inspiron kernel: ata1.00: cmd 60/d8:58:00:04:08/06:00:0c:00:00/40 tag 11 ncq dma 897024 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:39 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:39 inspiron kernel: ata1.00: error: { UNC }
May 19 13:26:39 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:39 inspiron kernel: ata1.00: cmd 60/00:50:00:00:08/04:00:0c:00:00/40 tag 10 ncq dma 524288 in
res 41/40:00:f6:00:08/00:04:0c:00:00/00 Emask 0x409 (media error) <F>
May 19 13:26:39 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:39 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:39 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:39 inspiron kernel: ata1.00: cmd 60/40:48:00:20:7c/00:00:30:00:00/40 tag 9 ncq dma 32768 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:39 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:39 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:39 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:39 inspiron kernel: ata1.00: cmd 60/00:40:00:f6:07/06:00:0c:00:00/40 tag 8 ncq dma 786432 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:39 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:39 inspiron kernel: ata1.00: irq_stat 0x40000001
May 19 13:26:39 inspiron kernel: ata1.00: exception Emask 0x0 SAct 0x13001f00 SErr 0x0 action 0x0
May 19 13:22:09 inspiron kernel: mei mei::55213584-9a29-4916-badf-0fb7ed682aeb:01: FW version command failed -5
May 19 13:22:09 inspiron kernel: mei mei::55213584-9a29-4916-badf-0fb7ed682aeb:01: Could not read FW version
May 19 13:22:05 inspiron kernel: ACPI: SPCR: Unexpected SPCR Access Width. Defaulting to byte size
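The `res` line under each failed command carries the drive's result taskfile, and the failing LBA can be read back out of it. Below is a minimal sketch, assuming libata prints the fields in the order status/error:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device. The function name `res_to_lba` is made up for illustration.

```python
def res_to_lba(res: str) -> int:
    """Decode the 48-bit LBA from a libata 'res' taskfile string.

    Assumed layout (not confirmed by the thread itself):
    status/error:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device
    """
    groups = res.split("/")
    if len(groups) != 4:
        raise ValueError(f"unexpected taskfile format: {res!r}")
    # Second group: error, nsect, then the low three LBA bytes.
    _, _, lbal, lbam, lbah = (int(x, 16) for x in groups[1].split(":"))
    # Third group: hob_feature, hob_nsect, then the high three LBA bytes.
    _, _, hob_lbal, hob_lbam, hob_lbah = (int(x, 16) for x in groups[2].split(":"))
    return (hob_lbah << 40 | hob_lbam << 32 | hob_lbal << 24
            | lbah << 16 | lbam << 8 | lbal)


# The "res" string of the { UNC } (media error) entry in the log above.
lba = res_to_lba("41/40:08:f6:00:08/00:00:0c:00:00/00")
print(hex(lba), lba)
```

For the `{ UNC }` entries this gives 0x0c0800f6 = 201851126, the same sector that appears in the `print_req_error` lines and in the SMART error log. The `{ ABRT }` entries appear to be the other queued commands that were aborted together with the failing read.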
# fdisk -l
Disk /dev/sda: 465.8 GiB, 500107862016 bytes, 976773168 sectors
Disk model: ST9500325AS
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x07f2837e
Device Boot Start End Sectors Size Id Type
/dev/sda1 63 208844 208782 102M de Dell Utility
/dev/sda2 * 208845 30928844 30720000 14.7G 7 HPFS/NTFS/exFAT
/dev/sda3 30928845 155775023 124846179 59.5G 7 HPFS/NTFS/exFAT
/dev/sda4 155782305 976768064 820985760 391.5G 5 Extended
/dev/sda5 * 155782368 177305599 21523232 10.3G 83 Linux
/dev/sda6 177307648 199903231 22595584 10.8G 82 Linux swap / Solaris
/dev/sda7 199905280 976766975 776861696 370.4G 83 Linux
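To see which file system a failing sector belongs to, the sector number can be compared against the partition boundaries in the fdisk listing. The sketch below does that for the two LBAs reported as UNC in this thread. The 4096-byte file system block size is an assumption (check it with the file system's own tools), and `locate` is a hypothetical helper name.

```python
# (name, first sector, last sector) copied from the fdisk listing, 512-byte sectors.
# sda4 is left out because it is only the extended container for sda5..sda7.
PARTITIONS = [
    ("sda1", 63, 208844),
    ("sda2", 208845, 30928844),
    ("sda3", 30928845, 155775023),
    ("sda5", 155782368, 177305599),
    ("sda6", 177307648, 199903231),
    ("sda7", 199905280, 976766975),
]
SECTOR_SIZE = 512
FS_BLOCK_SIZE = 4096  # assumption: typical ext4 default, not shown in the thread


def locate(sector):
    """Return (partition, sector offset in partition, fs block number) or None."""
    for name, start, end in PARTITIONS:
        if start <= sector <= end:
            offset = sector - start
            return name, offset, offset * SECTOR_SIZE // FS_BLOCK_SIZE
    return None  # sector falls in a gap between partitions


for bad in (201851126, 162964427):
    print(bad, locate(bad))
```

With these boundaries, 201851126 lands in sda7 and 162964427 in sda5, so both Linux partitions contain unreadable sectors.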
# smartctl -l selftest /dev/sda
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-5-amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 0 -
# smartctl -a /dev/sda
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-5-amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Seagate Momentus 5400.6
Device Model: ST9500325AS
Serial Number: 6VEGMVRP
LU WWN Device Id: 5 000c50 03067dd6f
Firmware Version: D005DEM1
User Capacity: 500,107,862,016 bytes [500 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: 5400 rpm
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 2.6, 3.0 Gb/s
Local Time is: Sun May 19 15:05:07 2019 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.
General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 0) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 139) minutes.
Conveyance self-test routine
recommended polling time: ( 3) minutes.
SCT capabilities: (0x103f) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 101 089 006 Pre-fail Always - 29958806
3 Spin_Up_Time 0x0003 099 099 085 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 091 091 020 Old_age Always - 9917
5 Reallocated_Sector_Ct 0x0033 088 088 036 Pre-fail Always - 246
7 Seek_Error_Rate 0x000f 083 060 030 Pre-fail Always - 207791365
9 Power_On_Hours 0x0032 073 073 000 Old_age Always - 23876
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 094 094 020 Old_age Always - 6861
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 001 001 000 Old_age Always - 1097
188 Command_Timeout 0x0032 100 096 000 Old_age Always - 3759
189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 051 036 045 Old_age Always In_the_past 49 (Min/Max 49/49 #998)
191 G-Sense_Error_Rate 0x0032 100 100 000 Old_age Always - 20
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 78
193 Load_Cycle_Count 0x0032 001 001 000 Old_age Always - 578157
194 Temperature_Celsius 0x0022 049 064 000 Old_age Always - 49 (0 18 0 0 0)
195 Hardware_ECC_Recovered 0x001a 053 045 000 Old_age Always - 29958806
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 4
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 4
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 22868 (153 213 0)
241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 3790333358
242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 1937597633
254 Free_Fall_Sensor 0x0032 100 100 000 Old_age Always - 0
SMART Error Log Version: 1
ATA Error Count: 987 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 987 occurred at disk power-on lifetime: 23876 hours (994 days + 20 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 cb a3 b6 09 Error: UNC at LBA = 0x09b6a3cb = 162964427
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 08 c8 a3 b6 49 00 05:50:04.649 READ FPDMA QUEUED
60 00 28 e0 a3 b6 49 00 05:50:04.617 READ FPDMA QUEUED
60 00 08 c0 a3 b6 49 00 05:50:04.515 READ FPDMA QUEUED
27 00 00 00 00 00 e0 00 05:50:04.513 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
ec 00 00 00 00 00 a0 00 05:50:04.512 IDENTIFY DEVICE
Error 986 occurred at disk power-on lifetime: 23876 hours (994 days + 20 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 cb a3 b6 09 Error: UNC at LBA = 0x09b6a3cb = 162964427
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 20 c0 a3 b6 49 00 05:50:02.009 READ FPDMA QUEUED
60 00 08 10 50 bb 49 00 05:50:01.961 READ FPDMA QUEUED
ea 00 00 00 00 00 a0 00 05:49:55.547 FLUSH CACHE EXT
61 00 08 a0 33 4a 49 00 05:49:55.547 WRITE FPDMA QUEUED
ea 00 00 00 00 00 a0 00 05:49:55.538 FLUSH CACHE EXT
Error 985 occurred at disk power-on lifetime: 23874 hours (994 days + 18 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 f6 00 08 0c Error: UNC at LBA = 0x0c0800f6 = 201851126
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 08 ff ff ff 4f 00 04:25:50.279 READ FPDMA QUEUED
60 00 08 f0 00 08 4c 00 04:25:50.257 READ FPDMA QUEUED
61 00 08 ff ff ff 4f 00 04:25:50.256 WRITE FPDMA QUEUED
60 00 08 90 f6 3c 4a 00 04:25:50.256 READ FPDMA QUEUED
60 00 08 ff ff ff 4f 00 04:25:50.256 READ FPDMA QUEUED
Error 984 occurred at disk power-on lifetime: 23874 hours (994 days + 18 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 f6 00 08 0c Error: UNC at LBA = 0x0c0800f6 = 201851126
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 40 ff ff ff 4f 00 04:25:47.981 READ FPDMA QUEUED
60 00 80 28 9e 57 49 00 04:25:47.954 READ FPDMA QUEUED
60 00 40 ff ff ff 4f 00 04:25:47.953 READ FPDMA QUEUED
60 00 40 ff ff ff 4f 00 04:25:47.945 READ FPDMA QUEUED
60 00 40 ff ff ff 4f 00 04:25:47.941 READ FPDMA QUEUED
Error 983 occurred at disk power-on lifetime: 23874 hours (994 days + 18 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 f6 00 08 0c Error: UNC at LBA = 0x0c0800f6 = 201851126
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 08 ff ff ff 4f 00 04:25:42.214 READ FPDMA QUEUED
60 00 d8 00 04 08 4c 00 04:25:42.209 READ FPDMA QUEUED
60 00 00 00 00 08 4c 00 04:25:42.207 READ FPDMA QUEUED
60 00 00 00 f6 07 4c 00 04:25:42.207 READ FPDMA QUEUED
60 00 08 ff ff ff 4f 00 04:25:42.163 READ FPDMA QUEUED
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 0 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
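The overall result above is PASSED even though several attributes look unhealthy, so it helps to read the attribute table directly. The sketch below is one way to pull out the suspicious rows from `smartctl -a` text. The watch list (IDs 5, 187, 197, 198) is an assumption about which counters matter most, not an official threshold, and `flag_attributes` is a hypothetical helper.

```python
# Assumption: attributes commonly watched for failing media.
WATCHED_IDS = {5, 187, 197, 198}

# A few rows copied from the attribute table above.
SAMPLE = """\
5 Reallocated_Sector_Ct 0x0033 088 088 036 Pre-fail Always - 246
9 Power_On_Hours 0x0032 073 073 000 Old_age Always - 23876
187 Reported_Uncorrect 0x0032 001 001 000 Old_age Always - 1097
190 Airflow_Temperature_Cel 0x0022 051 036 045 Old_age Always In_the_past 49 (Min/Max 49/49 #998)
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 4
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 4
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
"""


def flag_attributes(smartctl_text):
    """Return (id, name, when_failed, raw) for rows that deserve a closer look."""
    flagged = []
    for line in smartctl_text.splitlines():
        fields = line.split()
        # Attribute rows: numeric ID, name, 0x.. flag, then seven more columns.
        if len(fields) < 10 or not fields[0].isdigit() or not fields[2].startswith("0x"):
            continue
        attr_id, name, when_failed, raw = int(fields[0]), fields[1], fields[8], fields[9]
        if not raw.isdigit():
            continue
        # Flag anything that has ever crossed its threshold, plus watched
        # counters whose raw value is non-zero.
        if when_failed != "-" or (attr_id in WATCHED_IDS and int(raw) > 0):
            flagged.append((attr_id, name, when_failed, int(raw)))
    return flagged


for row in flag_attributes(SAMPLE):
    print(row)
```

On the sample rows this reports attributes 5, 187, 190, 197 and 198, which lines up with the UNC errors in the error log.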
-
#1
Everything works fine, but during shutdown I see I/O errors (see the attached screenshot):
print_req_error: I/O error, dev sda (sdb) sector …
sda and sdb are both in rpool:
root@telemachus:~# zpool status rpool
pool: rpool
state: ONLINE
scan: none requested
config:
NAME STATE READ WRITE CKSUM
rpool ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
wwn-0x50000398c84b19ad-part2 ONLINE 0 0 0
wwn-0x50000398c84b12e9-part2 ONLINE 0 0 0
errors: No known data errors
root@telemachus:~#
Why do I get these errors?
Attachments
-
20181115_152802-s.jpg
rhonda
Proxmox Retired Staff
-
#2
That sounds like you might have a hardware issue with your sda and sdb disks. Look at the output of «lsblk» to see what this might affect. Depending on which partitions and file systems you have on them, you may want to look into repair tools to find out whether it is just a few flipped bits or a real hardware fault.
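As a rough sketch of the `lsblk` check suggested above: the JSON output of `lsblk -J -o NAME,FSTYPE,MOUNTPOINT` can be walked to list everything that sits on the affected disks. The sample JSON below is made up for illustration and is not the poster's actual layout, and `mounts_on` is a hypothetical helper name.

```python
import json


def mounts_on(lsblk_json, disks):
    """Return (device, fstype, mountpoint) for each named disk and its children."""
    found = []

    def walk(node):
        found.append((node["name"], node.get("fstype"), node.get("mountpoint")))
        for child in node.get("children", []):
            walk(child)

    for dev in json.loads(lsblk_json)["blockdevices"]:
        if dev["name"] in disks:
            walk(dev)
    return found


# Hypothetical sample, NOT taken from the thread. On a live system the text
# would come from running: lsblk -J -o NAME,FSTYPE,MOUNTPOINT
SAMPLE = """
{"blockdevices": [
  {"name": "sda", "fstype": null, "mountpoint": null, "children": [
    {"name": "sda1", "fstype": null, "mountpoint": null},
    {"name": "sda2", "fstype": "zfs_member", "mountpoint": null}
  ]},
  {"name": "sr0", "fstype": null, "mountpoint": null}
]}
"""

for entry in mounts_on(SAMPLE, {"sda", "sdb"}):
    print(entry)
```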
-
#3
Hi.
So what did the problem turn out to be?