print_req_error: I/O error, dev sda, sector: fixing errors and finding optimal solutions to problems

Topic: Disk error: I/O error, dev sda, sector XXXXX (read 10010 times)

p4sh

When the PC boots, I see a lot of errors in dmesg:

https://paste.ubuntu.com/p/YKY74JTwsD/

But if I read an individual sector by hand, I sometimes get:

root@mail:~# hdparm --read-sector 25523880 /dev/sda


/dev/sda:
reading sector 25523880: SG_IO: bad/missing sense data, sb[]:  70 00 03 00 00 00 00 0a 40 51 e1 01 11 04 00 00 00 a8 00 00 00 00 00 00 00 00 00 00 00 00 00 00
succeeded

But sometimes it is just "succeeded", so the sectors do read.
I checked SMART: it says there are no errors on the disk.

The problem is that after a while one of the file systems (/var) goes read-only and many programs stop working.
What would you advise?

The topic starter had not appeared on the forum for more than six months as of 22/07/2019 (last visit: 23/11/2018). The section moderator decided to close the topic.
—zg_nico

« Last edited: 22 July 2019, 15:23:03 by zg_nico »

ALiEN175

« Last edited: 13 August 2018, 13:11:14 by ALiEN175 »

p4sh

bearpuh

I checked SMART: it says there are no errors on the disk.

And doesn't this tell you anything?

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%     20772         25523880
# 2  Short offline       Completed: read failure       90%     20677         1057345043
# 3  Short offline       Completed: read failure       90%     20677         1057345043
# 4  Short offline       Completed: read failure       90%     20677         1057345043
# 5  Short offline       Completed: read failure       90%     20677         1057345043

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       90%     29074         24928270

The disks have unreadable sectors.
Back up first, then check with Victoria or badblocks:

sudo /usr/sbin/badblocks -o /path/to/file/badblocks.list -b 4096 -s -v -t random /dev/sdX

p4sh

That's the thing: the sectors do read (or am I wrong? Please correct me):

root@mail:~# hdparm --read-sector 1873032872 /dev/sda
/dev/sda:
reading sector 1873032872: succeeded
0000 0000 f40f 0c01 4442 4537 4136 3534
3937 3335 6857 5806 1400 0c01 4433 3639
.......


root@mail:~# hdparm --read-sector 148453280 /dev/sda
/dev/sda:
reading sector 148453280: succeeded
bb10 5600 0c00 0102 2e00 0000 ba10 5600
3000 0202 2e2e 0000 bc10 5600 2400 1c01
root@mail:~# hdparm --read-sector 1285929908 /dev/sda
/dev/sda:
reading sector 1285929908: SG_IO: bad/missing sense data, sb[]:  70 00 03 00 00 00 00 0a 40 51 e0 01 11 04 00 00 a0 b4 00 00 00 00 00 00 00 00 00 00 00 00 00 00
succeeded
0000 0000 0000 0000 0000 0000 0000 0000

Thanks!

bearpuh

That's the thing: the sectors do read

To be sure of that, you have to test it.
I have already written how.
I would also connect the disk to another controller or computer to check.

Sly_tom_cat

Problems reported in SMART cannot be fixed by replacing the controller.

Controller problems usually show up in:

199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
But that is clean here.

bearpuh

Problems reported in SMART cannot be fixed by replacing the controller.

Agreed. I was just thrown off by this:

SG_IO: bad/missing sense data
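
The sense buffer that hdparm printed after "SG_IO: bad/missing sense data" can be decoded by hand, which may explain why a read reported as "succeeded" still returned zeros. The sketch below assumes the buffer is fixed-format SCSI sense data (response code 0x70), where the low nibble of byte 2 is the sense key and bytes 12 and 13 are the additional sense code and its qualifier. The function name and the deliberately partial lookup tables are illustrative, not part of hdparm.

```python
# Sketch: decode the fixed-format SCSI sense bytes shown in the hdparm output.
# Only the few codes relevant to this thread are listed; the tables are partial.

SENSE_KEYS = {
    0x0: "No Sense",
    0x1: "Recovered Error",
    0x3: "Medium Error",
    0x4: "Hardware Error",
    0xB: "Aborted Command",
}

ASC_ASCQ = {
    (0x11, 0x00): "Unrecovered read error",
    (0x11, 0x04): "Unrecovered read error - auto reallocate failed",
}


def decode_fixed_sense(hex_bytes: str) -> dict:
    """Return sense key, ASC and ASCQ from a space-separated hex dump."""
    sb = bytes.fromhex(hex_bytes)  # fromhex ignores the spaces between bytes
    if (sb[0] & 0x7F) not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    key = sb[2] & 0x0F
    asc, ascq = sb[12], sb[13]
    return {
        "sense_key": SENSE_KEYS.get(key, f"unknown (0x{key:x})"),
        "asc": asc,
        "ascq": ascq,
        "meaning": ASC_ASCQ.get((asc, ascq), "not in this partial table"),
    }


# Sense bytes copied from the hdparm output for sector 25523880.
SB = ("70 00 03 00 00 00 00 0a 40 51 e1 01 11 04 00 00 "
      "00 a8 00 00 00 00 00 00 00 00 00 00 00 00 00 00")

if __name__ == "__main__":
    print(decode_fixed_sense(SB))
```

For the bytes quoted in the first post this yields sense key 3 (Medium Error) with ASC/ASCQ 0x11/0x04, so the drive itself seems to have reported a failed read even though hdparm went on to print "succeeded".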

snowin

That's the thing: the sectors do read (or am I wrong? Please correct me)

You are mistaken.

ReNzRv

It is better to test and repair from the Seagate Tools for DOS boot image,
using the Zero All command (wipes every sector) and Long Test (DST), a full check of all sectors with bad sectors remapped at the drive controller level.

p4sh

Z

man hdparm

--read-sector Reads from the specified sector number, and dumps the contents in hex to standard output. The sector number must be given (base10) after this option. hdparm will issue a low-level read (completely bypassing the usual block layer read/write mechanisms) for the specified sector. This can be used to definitively check whether a given sector is bad (media error) or not (doing so through the usual mechanisms can sometimes give false positives).

You are mistaken.

I don't understand. Could you explain in more detail why hdparm reports "SUCCESS" when reading, yet the sectors are "unreadable"? Is the software bad?

« Last edited: 15 August 2018, 09:49:18 by p4sh »

bearpuh

hdparm reports "SUCCESS", yet the sectors are "unreadable"?

And how long does it take to read that sector?
By what criterion does Victoria HDD mark a sector as "bad"?
Read that sector with Victoria; it may become clearer.
If you want the theory, here it is, from the author of smartmontools: https://www.smartmontools.org/wiki/BadBlockHowto#ext2ext3secondexample

User added a message on 15 August 2018, 10:13:40:

One more thing to note.
Both of your disks have several sectors that are candidates for reallocation.

197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 2
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 4

User added a message on 15 August 2018, 10:15:17:

You can "give them a kick": force reallocation.
The details are in the smartmontools link above.

« Last edited: 15 August 2018, 10:15:17 by bearpuh »

snowin

You can "give them a kick": force reallocation.

It is enough to write to them and read them back, several times if needed.
If they are real bad sectors, the drive will reallocate them itself; otherwise they are just so-called "soft bads" and should disappear from SMART.

p4sh

What I did:
Booted from a live USB, assembled the array and checked the file system:
e2fsck -ct /dev/…
Ran the tests again.
Rebooted and am now monitoring the state of the file system.
SMART has also changed:
Multi_Zone_Error_Rate changed.
One sector pending reallocation remains on sda: Current_Pending_Sector 1

/dev/sda
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    24
  3 Spin_Up_Time            POS--K   179   172   021    -    4033
  4 Start_Stop_Count        -O--CK   099   099   000    -    1401
  5 Reallocated_Sector_Ct   PO--CK   200   200   140    -    0
  7 Seek_Error_Rate         -OSR-K   200   200   000    -    0
  9 Power_On_Hours          -O--CK   072   072   000    -    20819
 10 Spin_Retry_Count        -O--CK   100   100   000    -    0
 11 Calibration_Retry_Count -O--CK   100   100   000    -    0
 12 Power_Cycle_Count       -O--CK   100   100   000    -    876
192 Power-Off_Retract_Count -O--CK   199   199   000    -    758
193 Load_Cycle_Count        -O--CK   200   200   000    -    642
194 Temperature_Celsius     -O---K   102   081   000    -    45
196 Reallocated_Event_Count -O--CK   200   200   000    -    0
197 Current_Pending_Sector  -O--CK   200   200   000    -    1
198 Offline_Uncorrectable   ----CK   200   200   000    -    1
199 UDMA_CRC_Error_Count    -O--CK   200   200   000    -    0
200 Multi_Zone_Error_Rate   ---R--   200   200   000    -    1
/dev/sdb
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    209
  3 Spin_Up_Time            POS--K   191   173   021    -    3416
  4 Start_Stop_Count        -O--CK   099   099   000    -    1680
  5 Reallocated_Sector_Ct   PO--CK   200   200   140    -    0
  7 Seek_Error_Rate         -OSR-K   100   253   000    -    0
  9 Power_On_Hours          -O--CK   060   060   000    -    29215
 10 Spin_Retry_Count        -O--CK   100   100   000    -    0
 11 Calibration_Retry_Count -O--CK   100   100   000    -    0
 12 Power_Cycle_Count       -O--CK   100   100   000    -    773
192 Power-Off_Retract_Count -O--CK   200   200   000    -    650
193 Load_Cycle_Count        -O--CK   200   200   000    -    1029
194 Temperature_Celsius     -O---K   103   091   000    -    44
196 Reallocated_Event_Count -O--CK   200   200   000    -    0
197 Current_Pending_Sector  -O--CK   200   200   000    -    0
198 Offline_Uncorrectable   ----CK   200   200   000    -    4
199 UDMA_CRC_Error_Count    -O--CK   200   200   000    -    0
200 Multi_Zone_Error_Rate   ---R--   200   200   000    -    3

Now I'll check with Victoria (I think it used to do auto-remap).
Thanks everyone for the answers; this thread has been very useful for me!

snowin

Now I'll check with Victoria (I think it used to do auto-remap).

You don't need a remap.
For a start, replace the cable,
on both drives.
As for

That's the thing: the sectors do read (or am I wrong? Please correct me):

it seems you don't really understand what you are doing or why.
You take a random sector on the disk, test-read it with hdparm and claim that it reads,
while you never test the problem sectors.
Nevertheless, your random, reckless actions (reassembling the RAID) led to better results,
but it is a crude method.

« Last edited: 16 August 2018, 15:17:24 by snowin »

Source

Hello. Could you tell me what can be done with this disk (other than replacing it)?
Are these errors critical?

[531787.817056] ata4.00: configured for UDMA/133
[531787.817074] sd 3:0:0:0: [sdd] tag#4 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[531787.817078] sd 3:0:0:0: [sdd] tag#4 Sense Key : Medium Error [current]
[531787.817081] sd 3:0:0:0: [sdd] tag#4 Add. Sense: Unrecovered read error
[531787.817085] sd 3:0:0:0: [sdd] tag#4 CDB: Read(10) 28 00 05 a8 c0 80 00 04 00 00
[531787.817088] print_req_error: I/O error, dev sdd, sector 94945457
[531787.817844] ata4: EH complete
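
The CDB line in this log can be checked against the sector that print_req_error reports. The sketch below assumes the standard READ(10) layout (opcode 0x28, big-endian LBA in bytes 2 to 5, transfer length in bytes 7 and 8); the function name is illustrative.

```python
# Sketch: decode the READ(10) CDB from the kernel log and check that the
# sector named by print_req_error lies inside the failed request.

def decode_read10(cdb_hex: str) -> tuple:
    """Return (start_lba, sector_count) for a READ(10) command block."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    start_lba = int.from_bytes(cdb[2:6], "big")
    sector_count = int.from_bytes(cdb[7:9], "big")
    return start_lba, sector_count


if __name__ == "__main__":
    start, count = decode_read10("28 00 05 a8 c0 80 00 04 00 00")
    failed = 94945457  # sector from the print_req_error line
    print(f"request covers LBA {start}..{start + count - 1}")
    print("failed sector inside request:", start <= failed < start + count)
```

Under that assumption the request started at LBA 94945408 and covered 1024 sectors, so the reported sector 94945457 is a specific sector 49 sectors into the read rather than the start of the request, which fits a media error on the disk surface better than a cable fault.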

Output of smartctl -a /dev/sdd:

smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.15.18-15-pve] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Re
Device Model:     WDC WD2004FBYZ-01YCBB1
Serial Number:    WD-WMC6N0D0Y4MT
LU WWN Device Id: 5 0014ee 05994de88
Firmware Version: RR04
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Tue Jun 11 15:10:59 2019 +07
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (    0) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 220) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   199   051    Pre-fail  Always       -       1
  3 Spin_Up_Time            0x0027   188   182   021    Pre-fail  Always       -       3583
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       229
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   073   073   000    Old_age   Always       -       20394
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       227
 16 Unknown_Attribute       0x0022   003   197   000    Old_age   Always       -       191014087496
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       182
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       111
194 Temperature_Celsius     0x0022   117   107   000    Old_age   Always       -       30
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       4
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     19429         -
# 2  Short offline       Completed without error       00%     19379         -
# 3  Short offline       Completed without error       00%     19330         -
# 4  Short offline       Completed without error       00%     19280         -
# 5  Short offline       Completed without error       00%     19230         -
# 6  Short offline       Completed without error       00%     19180         -
# 7  Short offline       Completed without error       00%      9228         -
# 8  Short offline       Completed without error       00%      9205         -
# 9  Short offline       Completed without error       00%      9181         -
#10  Short offline       Completed without error       00%      9157         -
#11  Short offline       Completed without error       00%      9133         -
#12  Short offline       Completed without error       00%      9109         -
#13  Short offline       Completed without error       00%      9085         -
#14  Short offline       Completed without error       00%      9061         -
#15  Short offline       Completed without error       00%      9037         -
#16  Short offline       Completed without error       00%      9013         -
#17  Short offline       Completed without error       00%      8989         -
#18  Short offline       Completed without error       00%      8965         -
#19  Short offline       Completed without error       00%      8941         -
#20  Short offline       Completed without error       00%      8917         -
#21  Short offline       Completed without error       00%      8893         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
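
The overall result above is PASSED, yet one of the sector-related counters in the attribute table is not zero. A minimal sketch for pulling those counters out of saved smartctl output follows; it assumes the raw value is the last whitespace-separated field of each attribute line, which holds for the tables quoted on this page. The function name and the choice of watched IDs (5, 197, 198) are illustrative.

```python
# Sketch: report non-zero raw values for the sector-related SMART attributes.
# Assumes the raw value is the last column, as in the tables on this page.

WATCHED_IDS = {5, 197, 198}  # reallocated, pending, offline uncorrectable


def nonzero_sector_counters(report: str) -> dict:
    """Map attribute name to raw value for watched attributes that are not 0."""
    found = {}
    for line in report.splitlines():
        fields = line.split()
        if len(fields) < 3 or not fields[0].isdigit():
            continue  # headers, self-test log rows ("# 1 ..."), blank lines
        raw = fields[-1]
        if not raw.isdigit():
            continue  # e.g. selective self-test rows ending in "Not_testing"
        if int(fields[0]) in WATCHED_IDS and int(raw) != 0:
            found[fields[1]] = int(raw)
    return found


# Three lines copied from the attribute table for /dev/sdd.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       4
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
"""

if __name__ == "__main__":
    print(nonzero_sector_counters(SAMPLE))
```

For this disk the only non-zero counter is Current_Pending_Sector with a raw value of 4, which is consistent with the unrecovered read error in the kernel log.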

The disk looks more or less fine.
For a start, check the cable and the connectors.
Although it is rather odd that it reports a specific sector. Try backing up anything important and running a test with Victoria.

If the disk works, make a copy of it and test the disk itself with MHDD.

Well, it is a read failure. Copy the data off if there is anything valuable and the disk is not in a RAID, pull the disk, put it in a test machine and test it with MHDD. If the error disappears after the sector is formatted in MHDD, fine, we'll live. Otherwise you can remap the sector with MHDD as well, if that works, of course.
If every trick still ends in a read failure, then R.I.P.

This looks like a soft bad sector. On power loss the sector was not written completely, so its checksum is wrong.

You need to force a write to this sector. For example, write zeros with hdparm.

Source

Hello! I know the HDD topic has come up more than once, but after two weeks of searching I have not found a solution. In short, I have a PC assembled from assorted spare parts, with two hard drives of 500 GB and 250 GB. Win7 was installed on the 500 GB drive and Mint 19 on the 250 GB drive. After installation (and during it as well) Mint does not see the 500 GB disk and therefore does not see Win7. More precisely, when Mint boots the disk sometimes appears, but on an attempt to mount it the message "Failed to mount the 500 GB disk. Operation cancelled" is shown and the disk disappears. In one of those lucky moments when the disk was visible I ran fdisk on it. Here is the output of sudo fdisk -l:

vladimir@vladimir-PC:~$ sudo fdisk -l
[sudo] password for vladimir:
Disk /dev/sda: 465,8 GiB, 500107862016 bytes, 976773168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0x11a6c81a

Device     Boot     Start       End   Sectors   Size Id Type
/dev/sda1  *         2048    206847    204800   100M  7 HPFS/NTFS
/dev/sda2          206848 976771071 976564224 465,7G  7 HPFS/NTFS

Disk /dev/sdb: 232,9 GiB, 250059350016 bytes, 488397168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x6e4283a3

Device     Boot     Start       End   Sectors   Size Id Type
/dev/sdb1  *         2048 195311615 195309568  93,1G 83 Linux
/dev/sdb2       195313662 210935807  15622146   7,5G  5 Extended
/dev/sdb3       210935808 488396799 277460992 132,3G 83 Linux
/dev/sdb5       195313664 210935807  15622144   7,5G 82 Linux

Partition table entries are not in disk order.
vladimir@vladimir-PC:~$

GParted (when the disk is visible) thinks for a long time and then reports something like "error saving files or synchronizing on /dev/sda".
Once GParted did see the 500 GB disk, but I noticed nothing suspicious apart from a 100 MB NTFS partition at the start of the disk and a small 18 MB unallocated area at the end.

What I have tried:

disabled hybrid sleep in Win7;

tried changing the disk controller modes in the BIOS;

ran chkdsk on the disk under Win7.

I suspect the disk was once part of a RAID array and still keeps fond memories of it. But I am not sure.

Could you please tell me where to look, and how?

Source

Hello,

I am setting up nextcloudpi on a Raspberry pi 4 4GB. Unfortunately, after a few minutes/hours the mounted external SSD drive, plugged on a USB 3 port, fails.

The kernel log shows:

[  114.481827] usb 2-1: USB disconnect, device number 2
[  114.487346] print_req_error: I/O error, dev sda, sector 1951951376
[  114.487377] EXT4-fs warning (device sda1): ext4_end_bio:323: I/O error 10 writing to inode 60950252 (offset 0 size 520192 starting block 243994049)
[  114.487392] Buffer I/O error on device sda1, logical block 243993666
[  114.487417] Buffer I/O error on device sda1, logical block 243993667
[  114.487440] Buffer I/O error on device sda1, logical block 243993668
[  114.487459] Buffer I/O error on device sda1, logical block 243993669
[  114.487475] Buffer I/O error on device sda1, logical block 243993670
[  114.487491] Buffer I/O error on device sda1, logical block 243993671
[  114.487507] Buffer I/O error on device sda1, logical block 243993672
[  114.487522] Buffer I/O error on device sda1, logical block 243993673
[  114.487538] Buffer I/O error on device sda1, logical block 243993674
[  114.487554] Buffer I/O error on device sda1, logical block 243993675
[  114.490420] print_req_error: I/O error, dev sda, sector 1950870336
[  114.490444] EXT4-fs warning (device sda1): ext4_end_bio:323: I/O error 10 writing to inode 60950253 (offset 0 size 40960 starting block 243858802)
[  114.490549] print_req_error: I/O error, dev sda, sector 1951952392
[  114.490573] EXT4-fs warning (device sda1): ext4_end_bio:323: I/O error 10 writing to inode 60950254 (offset 0 size 724992 starting block 243994226)
[  114.500871] print_req_error: I/O error, dev sda, sector 1946159120
[  114.500890] Buffer I/O error on dev sda1, logical block 243269634, lost async page write
[  114.500933] print_req_error: I/O error, dev sda, sector 1946160184
[  114.500946] Buffer I/O error on dev sda1, logical block 243269767, lost async page write
[  114.501030] JBD2: Detected IO errors while flushing file data on sda1-8
[  114.501057] print_req_error: I/O error, dev sda, sector 973346008
[  114.501091] Aborting journal on device sda1-8.
[  114.501152] print_req_error: I/O error, dev sda, sector 973342720
[  114.501166] print_req_error: I/O error, dev sda, sector 973342720
[  114.501179] Buffer I/O error on dev sda1, logical block 121667584, lost sync page write
[  114.501203] JBD2: Error -5 detected when updating journal superblock for sda1-8.
[  114.501243] print_req_error: I/O error, dev sda, sector 1946636536
[  114.501257] Buffer I/O error on dev sda1, logical block 243329311, lost async page write
[  114.501287] print_req_error: I/O error, dev sda, sector 1950353440
[  114.501299] Buffer I/O error on dev sda1, logical block 243793924, lost async page write
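
The sector numbers (relative to the whole disk) and the logical block numbers (relative to sda1) in this log can be tied together with a little arithmetic. The sketch below assumes a 4096-byte ext4 block size, which agrees with the block count fsck prints for this 1 TB drive, and a partition that starts at sector 2048. That start sector is not stated anywhere in the post; it is inferred because it makes every pair in the log line up. The function name is illustrative.

```python
# Sketch: map an ext4 logical block on sda1 to the whole-disk sector that
# print_req_error reports. Block size and partition start are assumptions.

SECTOR_SIZE = 512


def block_to_sector(logical_block: int,
                    block_size: int = 4096,
                    partition_start: int = 2048) -> int:
    """Convert a filesystem block number to a 512-byte disk sector."""
    return partition_start + logical_block * (block_size // SECTOR_SIZE)


# (logical block on sda1, sector on sda) pairs taken from the log above.
PAIRS = [
    (243993666, 1951951376),
    (243269634, 1946159120),
    (243269767, 1946160184),
    (121667584, 973342720),
]

if __name__ == "__main__":
    for block, sector in PAIRS:
        print(block, "->", block_to_sector(block), "logged:", sector)
```

All four pairs match, and the failing sectors are spread from roughly the middle of the disk to near its end within a few milliseconds of the "USB disconnect" line. That pattern looks more like the whole device dropping off the bus than like individual bad sectors, which would also be consistent with fsck finding the file system clean on another computer.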

I tried to fix the error by running fsck from another computer, and the disk seems to be fine:

$ sudo fsck /dev/sdc1
fsck from util-linux 2.35.1
e2fsck 1.45.6 (20-Mar-2020)
NEXTCLOUD: clean, 3691/61054976 files, 4856017/244190385 blocks

After running fsck, plugging the external SSD back to the pi and rebooting the pi, the drive behaves normally for some time before failing (sometimes a few minutes, sometimes a few hours).

I thought it was a power issue so I tried 3 different connectors:

1 powered usb 3 to sata HDD enclosure (Orico)
1 powered usb3 to sata adapter (Unitek)
1 unpowered usb3 to sata hdd enclosure (Orico)

The same problem occurs every time.

I also tried plugging in other drives:

the SSD : Crucial MX500 1TB 2.5″ SSD
an HDD : a toshiba drive I salvaged from another laptop

The problem happened to the HDD after a longer period of time. I was using the HDD as a backup of the SSD and I noticed that it failed only after a few days.

At this point, my guesses are:

either there was a power issue that corrupted the drives and they are now unusable (but then how come fsck reports the file system as "clean"? I'm not an expert, so I might be reading it wrong)
or all my cheap enclosures are not working at all
or the USB ports of the Pi are failing

I’d like to test those guesses but I’m lost trying to figure out how.
Thank you for your help.

Source

EDIT: The members of this forum just helped me fix and repair a nasty hard disk error. I had run file system checks before, but what I never knew was that the default check does not update the bad block inode list.

p.H wrote:e2fsck detects and marks bad blocks only when run with the -c option.

With that one sentence, p.H saved my computer. And the advice that he and L_V gave me in this thread was priceless.

What ultimately worked for me was checking both my / (root) and /home partitions with the non-destructive read-write option, -cc from a Live CD:

e2fsck -f -y -cc -C0 /dev/sda5
e2fsck -f -y -cc -C0 /dev/sda7

That check identified and repaired the affected inodes. It also wrote over the damaged files. Keep a list of those files. You will have to replace them (as explained below).

Next, I ran the checks again with the read-only option -c:

e2fsck -f -y -c -C0 /dev/sda5
e2fsck -f -y -c -C0 /dev/sda7

Running the check a second time was an important step because it added a few more blocks to the bad blocks list.

Having repaired the file system, the next step was to repair the affected files:

p.H wrote:Note that e2fsck can remap bad blocks but cannot restore the unreadable contents of the affected files, so these files must be reinstalled from their respective packages.

In my case, I had a fresh install of Debian Buster and a Debian Buster Live CD, so I just copied them from the Live CD:

mkdir /media/inspiron
mount /dev/sda5 /media/inspiron
cp /usr/bin/$FILE01  /media/inspiron/usr/bin/$FILE01
cp /usr/bin/$FILE02  /media/inspiron/usr/bin/$FILE02
...
umount /dev/sda5

After that, the computer booted like a charm. Importantly, it shut down like a charm too. There were no priority 0 or 1 messages in my journalctl.

Thank you to p.H and L_V for helping me rescue this old machine!

—————————————-

ORIGINAL POST:

After a fresh installation of Debian Buster on an old machine, the partition that contains /home does not unmount at shutdown. The problem seems to be caused by an I/O error. At first glance, smartctl does not show any errors, but a deeper look shows that the disk experienced a few errors on the / (root) partition a few years ago.

If I followed Linux Admins’ «Fixing disk problems» guide would that resolve the issue?

Thanks in advance,
— Soul
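
The "cmd" and "res" lines in the journal output quoted in this post carry the failing address in hexadecimal. The sketch below assumes the usual libata layout of those lines (command or status, then feature or error, sector count and three low LBA bytes, then the same five fields for the high-order bytes, then the device register) and rebuilds the 48-bit LBA from them. The function names are illustrative.

```python
# Sketch: rebuild the 48-bit LBA and the error flags from a libata taskfile
# string such as "41/40:08:cb:a3:b6/00:00:09:00:00/00".

ERROR_BITS = {0x40: "UNC", 0x04: "ABRT"}  # partial list of error-register bits


def _split(taskfile: str):
    _first, low, high, _device = taskfile.strip().split("/")
    return ([int(x, 16) for x in low.split(":")],
            [int(x, 16) for x in high.split(":")])


def taskfile_lba48(taskfile: str) -> int:
    """Combine the six LBA bytes of a libata cmd/res string."""
    low, high = _split(taskfile)
    lbal, lbam, lbah = low[2], low[3], low[4]
    hob_lbal, hob_lbam, hob_lbah = high[2], high[3], high[4]
    return (hob_lbah << 40 | hob_lbam << 32 | hob_lbal << 24
            | lbah << 16 | lbam << 8 | lbal)


def result_error_flags(res: str) -> list:
    """Name the bits set in the error register of a 'res' string."""
    low, _high = _split(res)
    return [name for bit, name in ERROR_BITS.items() if low[0] & bit]


if __name__ == "__main__":
    res = "41/40:08:cb:a3:b6/00:00:09:00:00/00"
    print(taskfile_lba48(res), result_error_flags(res))
```

Decoded this way, the two "media error" results give LBA 162964427 and LBA 201851126, the same sectors that print_req_error names, and both carry the UNC (uncorrectable) flag. The results marked "device error" with ABRT all repeat the address 201851126, so they look like queued commands aborted alongside that one failing read rather than additional bad sectors.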

$ journalctl -r -b -1 -p3

-- Logs begin at Sun 2019-05-19 13:22:05 EDT, end at Sun 2019-05-19 15:26:53 EDT. --
May 19 14:51:02 inspiron systemd[1]: Failed unmounting /home.
May 19 14:51:02 inspiron kernel: print_req_error: I/O error, dev sda, sector 162964427
May 19 14:51:02 inspiron kernel: ata1.00: error: { UNC }
May 19 14:51:02 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 14:51:02 inspiron kernel: ata1.00: cmd 60/08:88:c8:a3:b6/00:00:09:00:00/40 tag 17 ncq dma 4096 in
                                          res 41/40:08:cb:a3:b6/00:00:09:00:00/00 Emask 0x409 (media error) <F>
May 19 14:51:02 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 14:51:02 inspiron kernel: ata1.00: irq_stat 0x40000008
May 19 14:51:02 inspiron kernel: ata1.00: exception Emask 0x0 SAct 0x20000 SErr 0x0 action 0x0
May 19 14:50:59 inspiron kernel: print_req_error: I/O error, dev sda, sector 162964427
May 19 14:50:59 inspiron kernel: ata1.00: error: { UNC }
May 19 14:50:59 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 14:50:59 inspiron kernel: ata1.00: cmd 60/20:a8:c0:a3:b6/00:00:09:00:00/40 tag 21 ncq dma 16384 in
                                          res 41/40:20:cb:a3:b6/00:00:09:00:00/00 Emask 0x409 (media error) <F>
May 19 14:50:59 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 14:50:59 inspiron kernel: ata1.00: irq_stat 0x40000008
May 19 14:50:59 inspiron kernel: ata1.00: exception Emask 0x0 SAct 0x200000 SErr 0x0 action 0x0
May 19 14:50:43 inspiron wpa_supplicant[509]: dbus: wpa_dbus_property_changed: no property SessionLength in object /fi/w1/wpa_supplicant1/Interfaces/1
May 19 14:47:06 inspiron root[7585]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 14:40:19 inspiron root[7277]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 14:34:55 inspiron root[7129]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 14:27:20 inspiron root[6970]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 14:19:26 inspiron root[6425]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 14:13:30 inspiron root[6164]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 14:07:00 inspiron root[4631]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 14:01:11 inspiron root[3004]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 13:53:40 inspiron root[2451]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 13:46:29 inspiron root[2355]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 13:41:26 inspiron root[2260]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 13:33:31 inspiron root[1803]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 13:28:13 inspiron root[1633]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 13:26:48 inspiron kernel: print_req_error: I/O error, dev sda, sector 201851126
May 19 13:26:48 inspiron kernel: ata1.00: error: { UNC }
May 19 13:26:48 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:48 inspiron kernel: ata1.00: cmd 60/08:c8:f0:00:08/00:00:0c:00:00/40 tag 25 ncq dma 4096 in
                                          res 41/40:08:f6:00:08/00:00:0c:00:00/00 Emask 0x409 (media error) <F>
May 19 13:26:48 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:48 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:48 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:48 inspiron kernel: ata1.00: cmd 60/08:98:90:f6:3c/00:00:0a:00:00/40 tag 19 ncq dma 4096 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:48 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:48 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:48 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:48 inspiron kernel: ata1.00: cmd 60/08:90:08:3e:d1/00:00:30:00:00/40 tag 18 ncq dma 4096 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:48 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:48 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:48 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:48 inspiron kernel: ata1.00: cmd 60/08:88:08:2d:8d/00:00:15:00:00/40 tag 17 ncq dma 4096 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:48 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:48 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:48 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:48 inspiron kernel: ata1.00: cmd 60/08:80:c0:3e:59/00:00:09:00:00/40 tag 16 ncq dma 4096 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:48 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:48 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:48 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:48 inspiron kernel: ata1.00: cmd 61/18:30:18:7f:c5/00:00:2f:00:00/40 tag 6 ncq dma 12288 out
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:48 inspiron kernel: ata1.00: failed command: WRITE FPDMA QUEUED
May 19 13:26:48 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:48 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:48 inspiron kernel: ata1.00: cmd 60/08:28:c8:6a:71/00:00:15:00:00/40 tag 5 ncq dma 4096 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:48 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:48 inspiron kernel: ata1.00: irq_stat 0x40000001
May 19 13:26:48 inspiron kernel: ata1.00: exception Emask 0x0 SAct 0x20f0060 SErr 0x0 action 0x0
May 19 13:26:45 inspiron kernel: print_req_error: I/O error, dev sda, sector 361573512
May 19 13:26:45 inspiron kernel: print_req_error: I/O error, dev sda, sector 156843584
May 19 13:26:45 inspiron kernel: print_req_error: I/O error, dev sda, sector 201851126
May 19 13:26:45 inspiron kernel: print_req_error: I/O error, dev sda, sector 359754432
May 19 13:26:45 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:45 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:45 inspiron kernel: ata1.00: cmd 60/d0:f0:88:2c:8d/00:00:15:00:00/40 tag 30 ncq dma 106496 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:45 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:45 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:45 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:45 inspiron kernel: ata1.00: cmd 60/00:e8:40:3e:59/01:00:09:00:00/40 tag 29 ncq dma 131072 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:45 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:45 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:45 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:45 inspiron kernel: ata1.00: cmd 60/08:58:08:3e:d1/00:00:30:00:00/40 tag 11 ncq dma 4096 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:45 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:45 inspiron kernel: ata1.00: error: { UNC }
May 19 13:26:45 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:45 inspiron kernel: ata1.00: cmd 60/08:48:f0:00:08/00:00:0c:00:00/40 tag 9 ncq dma 4096 in
                                          res 41/40:08:f6:00:08/00:00:0c:00:00/00 Emask 0x409 (media error) <F>
May 19 13:26:45 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:45 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:45 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:45 inspiron kernel: ata1.00: cmd 60/08:18:90:f6:3c/00:00:0a:00:00/40 tag 3 ncq dma 4096 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:45 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:45 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:45 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:45 inspiron kernel: ata1.00: cmd 60/40:10:c0:6a:71/00:00:15:00:00/40 tag 2 ncq dma 32768 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:45 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:45 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:45 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:45 inspiron kernel: ata1.00: cmd 61/08:00:00:70:cc/00:00:31:00:00/40 tag 0 ncq dma 4096 out
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:45 inspiron kernel: ata1.00: failed command: WRITE FPDMA QUEUED
May 19 13:26:45 inspiron kernel: ata1.00: irq_stat 0x40000001
May 19 13:26:45 inspiron kernel: ata1.00: exception Emask 0x0 SAct 0x60000a0d SErr 0x0 action 0x0
May 19 13:26:40 inspiron kernel: print_req_error: I/O error, dev sda, sector 804169080
May 19 13:26:40 inspiron kernel: print_req_error: I/O error, dev sda, sector 361572440
May 19 13:26:40 inspiron kernel: print_req_error: I/O error, dev sda, sector 201851904
May 19 13:26:40 inspiron kernel: print_req_error: I/O error, dev sda, sector 201851126
May 19 13:26:40 inspiron kernel: print_req_error: I/O error, dev sda, sector 813441024
May 19 13:26:40 inspiron kernel: print_req_error: I/O error, dev sda, sector 201848320
May 19 13:26:39 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:39 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:39 inspiron kernel: ata1.00: cmd 60/40:e0:78:a5:ee/00:00:2f:00:00/40 tag 28 ncq dma 32768 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:39 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:39 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:39 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:39 inspiron kernel: ata1.00: cmd 60/08:c8:30:59:d1/00:00:30:00:00/40 tag 25 ncq dma 4096 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:39 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:39 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:39 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:39 inspiron kernel: ata1.00: cmd 60/08:c0:58:60:92/00:00:31:00:00/40 tag 24 ncq dma 4096 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:39 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:39 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:39 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:39 inspiron kernel: ata1.00: cmd 60/58:60:58:28:8d/00:00:15:00:00/40 tag 12 ncq dma 45056 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:39 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:39 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:39 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:39 inspiron kernel: ata1.00: cmd 60/d8:58:00:04:08/06:00:0c:00:00/40 tag 11 ncq dma 897024 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:39 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:39 inspiron kernel: ata1.00: error: { UNC }
May 19 13:26:39 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:39 inspiron kernel: ata1.00: cmd 60/00:50:00:00:08/04:00:0c:00:00/40 tag 10 ncq dma 524288 in
                                          res 41/40:00:f6:00:08/00:04:0c:00:00/00 Emask 0x409 (media error) <F>
May 19 13:26:39 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:39 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:39 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:39 inspiron kernel: ata1.00: cmd 60/40:48:00:20:7c/00:00:30:00:00/40 tag 9 ncq dma 32768 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:39 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:39 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:39 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:39 inspiron kernel: ata1.00: cmd 60/00:40:00:f6:07/06:00:0c:00:00/40 tag 8 ncq dma 786432 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:39 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:39 inspiron kernel: ata1.00: irq_stat 0x40000001
May 19 13:26:39 inspiron kernel: ata1.00: exception Emask 0x0 SAct 0x13001f00 SErr 0x0 action 0x0
May 19 13:22:09 inspiron kernel: mei mei::55213584-9a29-4916-badf-0fb7ed682aeb:01: FW version command failed -5
May 19 13:22:09 inspiron kernel: mei mei::55213584-9a29-4916-badf-0fb7ed682aeb:01: Could not read FW version
May 19 13:22:05 inspiron kernel: ACPI: SPCR: Unexpected SPCR Access Width.  Defaulting to byte size
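
For anyone trying to read the ata1.00 lines above: the cmd/res register dumps can be turned into sector numbers. Below is a minimal Python sketch, assuming the usual libata layout (command or status byte, then feature/error, count and the three low LBA bytes, then the high-order bytes, then the device register). The name decode_lba48 is made up for illustration.

```python
def decode_lba48(taskfile: str) -> int:
    """Extract the 48-bit LBA from a libata 'cmd' or 'res' register dump.

    Assumed layout (hypothetical helper, not a kernel API):
    CMD_OR_STATUS/FEAT_OR_ERR:NSECT:LBAL:LBAM:LBAH/HOB_FEAT:HOB_NSECT:HOB_LBAL:HOB_LBAM:HOB_LBAH/DEVICE
    """
    _first, low, high, _device = taskfile.split("/")
    # Low-order group: feature/error, sector count, LBA bits 0..23
    _feat, _nsect, lbal, lbam, lbah = (int(b, 16) for b in low.split(":"))
    # High-order ("hob") group: LBA bits 24..47 sit in the last three bytes
    _hfeat, _hnsect, hlbal, hlbam, hlbah = (int(b, 16) for b in high.split(":"))
    return (
        (hlbah << 40) | (hlbam << 32) | (hlbal << 24)
        | (lbah << 16) | (lbam << 8) | lbal
    )


# Values copied from the log above
failed_at = decode_lba48("41/40:08:f6:00:08/00:00:0c:00:00/00")  # res of the UNC command
read_from = decode_lba48("60/08:c8:f0:00:08/00:00:0c:00:00/40")  # cmd of the same command
print(failed_at, read_from)
```

The res value decodes to sector 201851126, the same sector print_req_error keeps reporting, and the read that hit it started at 201851120. It is the only command in each batch flagged as a media error (UNC). The other queued commands carry the same res bytes and only ABRT, so they appear to have been aborted along with it rather than failing on their own.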

# fdisk -l

Disk /dev/sda: 465.8 GiB, 500107862016 bytes, 976773168 sectors
Disk model: ST9500325AS     
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x07f2837e

Device     Boot     Start       End   Sectors   Size Id Type
/dev/sda1              63    208844    208782   102M de Dell Utility
/dev/sda2  *       208845  30928844  30720000  14.7G  7 HPFS/NTFS/exFAT
/dev/sda3        30928845 155775023 124846179  59.5G  7 HPFS/NTFS/exFAT
/dev/sda4       155782305 976768064 820985760 391.5G  5 Extended
/dev/sda5  *    155782368 177305599  21523232  10.3G 83 Linux
/dev/sda6       177307648 199903231  22595584  10.8G 82 Linux swap / Solaris
/dev/sda7       199905280 976766975 776861696 370.4G 83 Linux
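
To see which partition a failing sector belongs to, the Start/End columns above are enough. Here is a rough Python sketch. The table is copied from the fdisk output, the extended container sda4 is left out on purpose so that only the logical partitions inside it match, and locate is a hypothetical helper name.

```python
from typing import Optional, Tuple

# (device, first sector, last sector), copied from the fdisk listing above.
# /dev/sda4 (Extended) is omitted: it only wraps sda5..sda7.
PARTITIONS = [
    ("/dev/sda1", 63, 208844),
    ("/dev/sda2", 208845, 30928844),
    ("/dev/sda3", 30928845, 155775023),
    ("/dev/sda5", 155782368, 177305599),
    ("/dev/sda6", 177307648, 199903231),
    ("/dev/sda7", 199905280, 976766975),
]


def locate(sector: int) -> Optional[Tuple[str, int]]:
    """Return (partition, sector offset inside it), or None if the sector is in a gap."""
    for name, start, end in PARTITIONS:
        if start <= sector <= end:
            return name, sector - start
    return None


# Sectors taken from the print_req_error lines in the log
for sector in (201851126, 156843584, 804169080):
    print(sector, locate(sector))
```

The sector that keeps coming back with UNC, 201851126, lands in /dev/sda7 at offset 1945846 (in 512-byte sectors).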

# smartctl -l selftest /dev/sda

smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-5-amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%         0         -

# smartctl -a /dev/sda

smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-5-amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Momentus 5400.6
Device Model:     ST9500325AS
Serial Number:    6VEGMVRP
LU WWN Device Id: 5 000c50 03067dd6f
Firmware Version: D005DEM1
User Capacity:    500,107,862,016 bytes [500 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 2.6, 3.0 Gb/s
Local Time is:    Sun May 19 15:05:07 2019 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status:  (0x82)	Offline data collection activity
					was completed without error.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		(    0) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 139) minutes.
Conveyance self-test routine
recommended polling time: 	 (   3) minutes.
SCT capabilities: 	       (0x103f)	SCT Status supported.
					SCT Error Recovery Control supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   101   089   006    Pre-fail  Always       -       29958806
  3 Spin_Up_Time            0x0003   099   099   085    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   091   091   020    Old_age   Always       -       9917
  5 Reallocated_Sector_Ct   0x0033   088   088   036    Pre-fail  Always       -       246
  7 Seek_Error_Rate         0x000f   083   060   030    Pre-fail  Always       -       207791365
  9 Power_On_Hours          0x0032   073   073   000    Old_age   Always       -       23876
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   094   094   020    Old_age   Always       -       6861
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always       -       1097
188 Command_Timeout         0x0032   100   096   000    Old_age   Always       -       3759
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   051   036   045    Old_age   Always   In_the_past 49 (Min/Max 49/49 #998)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       20
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       78
193 Load_Cycle_Count        0x0032   001   001   000    Old_age   Always       -       578157
194 Temperature_Celsius     0x0022   049   064   000    Old_age   Always       -       49 (0 18 0 0 0)
195 Hardware_ECC_Recovered  0x001a   053   045   000    Old_age   Always       -       29958806
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       4
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       4
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       22868 (153 213 0)
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       3790333358
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       1937597633
254 Free_Fall_Sensor        0x0032   100   100   000    Old_age   Always       -       0
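
The overall result says PASSED, yet several raw counters in the table above are nonzero. Here is a small Python sketch that pulls those out of attribute-table text like the above. The set of watched IDs (5, 187, 197, 198, 199) is my assumption about which counters matter most for media problems, and the sample lines are copied from the output above.

```python
# A few rows copied verbatim from the attribute table above
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   088   088   036    Pre-fail  Always       -       246
187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always       -       1097
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       4
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       4
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
"""

# Assumption: these IDs are the ones worth flagging when their raw value is > 0
WATCHED = {5, 187, 197, 198, 199}


def nonzero_watched(text: str) -> dict:
    """Return {attribute name: raw value} for watched attributes with a nonzero raw value."""
    hits = {}
    for line in text.splitlines():
        # Columns: ID, NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, RAW...
        parts = line.split(None, 9)
        if len(parts) < 10 or not parts[0].isdigit():
            continue  # header or unrelated line
        attr_id = int(parts[0])
        raw = parts[9].split()[0]  # RAW_VALUE may carry extra text after the number
        if attr_id in WATCHED and raw.isdigit() and int(raw) > 0:
            hits[parts[1]] = int(raw)
    return hits


print(nonzero_watched(SAMPLE))
```

On this disk that flags 246 reallocated sectors, 1097 reported uncorrectable errors and 4 pending/offline-uncorrectable sectors, while the CRC counter (which would point at cabling) stays at 0.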

SMART Error Log Version: 1
ATA Error Count: 987 (device log contains only the most recent five errors)
	CR = Command Register [HEX]
	FR = Features Register [HEX]
	SC = Sector Count Register [HEX]
	SN = Sector Number Register [HEX]
	CL = Cylinder Low Register [HEX]
	CH = Cylinder High Register [HEX]
	DH = Device/Head Register [HEX]
	DC = Device Command Register [HEX]
	ER = Error register [HEX]
	ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 987 occurred at disk power-on lifetime: 23876 hours (994 days + 20 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 cb a3 b6 09  Error: UNC at LBA = 0x09b6a3cb = 162964427

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 08 c8 a3 b6 49 00      05:50:04.649  READ FPDMA QUEUED
  60 00 28 e0 a3 b6 49 00      05:50:04.617  READ FPDMA QUEUED
  60 00 08 c0 a3 b6 49 00      05:50:04.515  READ FPDMA QUEUED
  27 00 00 00 00 00 e0 00      05:50:04.513  READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
  ec 00 00 00 00 00 a0 00      05:50:04.512  IDENTIFY DEVICE

Error 986 occurred at disk power-on lifetime: 23876 hours (994 days + 20 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 cb a3 b6 09  Error: UNC at LBA = 0x09b6a3cb = 162964427

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 20 c0 a3 b6 49 00      05:50:02.009  READ FPDMA QUEUED
  60 00 08 10 50 bb 49 00      05:50:01.961  READ FPDMA QUEUED
  ea 00 00 00 00 00 a0 00      05:49:55.547  FLUSH CACHE EXT
  61 00 08 a0 33 4a 49 00      05:49:55.547  WRITE FPDMA QUEUED
  ea 00 00 00 00 00 a0 00      05:49:55.538  FLUSH CACHE EXT

Error 985 occurred at disk power-on lifetime: 23874 hours (994 days + 18 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 f6 00 08 0c  Error: UNC at LBA = 0x0c0800f6 = 201851126

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 08 ff ff ff 4f 00      04:25:50.279  READ FPDMA QUEUED
  60 00 08 f0 00 08 4c 00      04:25:50.257  READ FPDMA QUEUED
  61 00 08 ff ff ff 4f 00      04:25:50.256  WRITE FPDMA QUEUED
  60 00 08 90 f6 3c 4a 00      04:25:50.256  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00      04:25:50.256  READ FPDMA QUEUED

Error 984 occurred at disk power-on lifetime: 23874 hours (994 days + 18 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 f6 00 08 0c  Error: UNC at LBA = 0x0c0800f6 = 201851126

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 40 ff ff ff 4f 00      04:25:47.981  READ FPDMA QUEUED
  60 00 80 28 9e 57 49 00      04:25:47.954  READ FPDMA QUEUED
  60 00 40 ff ff ff 4f 00      04:25:47.953  READ FPDMA QUEUED
  60 00 40 ff ff ff 4f 00      04:25:47.945  READ FPDMA QUEUED
  60 00 40 ff ff ff 4f 00      04:25:47.941  READ FPDMA QUEUED

Error 983 occurred at disk power-on lifetime: 23874 hours (994 days + 18 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 f6 00 08 0c  Error: UNC at LBA = 0x0c0800f6 = 201851126

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 08 ff ff ff 4f 00      04:25:42.214  READ FPDMA QUEUED
  60 00 d8 00 04 08 4c 00      04:25:42.209  READ FPDMA QUEUED
  60 00 00 00 00 08 4c 00      04:25:42.207  READ FPDMA QUEUED
  60 00 00 00 f6 07 4c 00      04:25:42.207  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00      04:25:42.163  READ FPDMA QUEUED
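
The "ER ST SC SN CL CH DH" rows in the error log can be decoded by hand and checked against the "Error: UNC at LBA" text smartctl prints. Here is a hedged Python sketch, assuming the 28-bit layout (low nibble of DH, then CH, CL, SN) and the standard ATA error-register bit values. decode_row is a hypothetical name, and only the error bits I am sure of are listed.

```python
# ATA error register bits (subset)
ERROR_BITS = {0x80: "ICRC", 0x40: "UNC", 0x10: "IDNF", 0x04: "ABRT"}


def decode_row(row: str):
    """Decode one 'ER ST SC SN CL CH DH' row into (error flag names, LBA)."""
    er, _st, _sc, sn, cl, ch, dh = (int(b, 16) for b in row.split()[:7])
    # 28-bit LBA: bits 24..27 live in the low nibble of the Device/Head register
    lba = ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn
    flags = [name for bit, name in ERROR_BITS.items() if er & bit]
    return flags, lba


# Register rows copied from errors 987 and 985 above
print(decode_row("40 51 00 cb a3 b6 09"))
print(decode_row("40 51 00 f6 00 08 0c"))
```

Both rows decode to UNC, at LBA 162964427 and 201851126 respectively, so the five logged errors cover only two distinct sectors.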

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%         0         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

Everything works fine, but when executing shutdown, I see I/O errors (see attached screenshot):

print_req_error: I/O error, dev sda (sdb) sector …

Both sda and sdb are in rpool:

root@telemachus:~# zpool status rpool
  pool: rpool
 state: ONLINE
  scan: none requested
config:

        NAME                              STATE     READ WRITE CKSUM
        rpool                             ONLINE       0     0     0
          mirror-0                        ONLINE       0     0     0
            wwn-0x50000398c84b19ad-part2  ONLINE       0     0     0
            wwn-0x50000398c84b12e9-part2  ONLINE       0     0     0

errors: No known data errors
root@telemachus:~#

Why do I get these errors?

Attachment: 20181115_152802-s.jpg (247.3 KB)

rhonda

Proxmox Retired Staff

That sounds like you might have a hardware issue with your sda and sdb disks. Look at the output of «lsblk» to see what this might affect. Depending on which partitions and file systems are on those disks, you might want to look into repair tools to find out whether it is just a few flipped bits or a real hardware problem.

Hi.
So what turned out to be the problem?
