Ext4 fs error count since last fsck

Mihail_Boyanskiy

Advanced Member

Hi. I have two questions. First: what do these errors, logged when the router boots, mean? Second: how do I fix them?

EXT4-fs (sda2): error count since last fsck: 64
Jun 17 17:29:43 kernel
EXT4-fs (sda2): initial error at time 1533467161: mb_free_blocks:1303: inode 110226: block 233994
Jun 17 17:29:43 kernel
EXT4-fs (sda2): last error at time 1623671029: ext4_mb_generate_buddy:756
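
The numbers after "at time" in the summary lines above are Unix epoch seconds, so they can be turned into readable dates. The following is only a minimal sketch, assuming the line layout shown in this post; the names `SUMMARY_RE` and `decode_summary` are made up for illustration and are not part of any tool.

```python
import re
from datetime import datetime, timezone

# Matches the "initial error" / "last error" summary lines shown above.
SUMMARY_RE = re.compile(
    r"EXT4-fs \((?P<dev>[^)]+)\): (?P<kind>initial|last) error at time "
    r"(?P<epoch>\d+): (?P<func>\w+):(?P<line>\d+)"
)


def decode_summary(log_line):
    """Return (device, kind, UTC time, kernel function), or None if the line does not match."""
    m = SUMMARY_RE.search(log_line)
    if m is None:
        return None
    # The kernel stores the time of the error as seconds since the Unix epoch.
    when = datetime.fromtimestamp(int(m.group("epoch")), tz=timezone.utc)
    return m.group("dev"), m.group("kind"), when.isoformat(), m.group("func")


if __name__ == "__main__":
    for line in (
        "EXT4-fs (sda2): initial error at time 1533467161: mb_free_blocks:1303: inode 110226: block 233994",
        "EXT4-fs (sda2): last error at time 1623671029: ext4_mb_generate_buddy:756",
    ):
        print(decode_summary(line))
```

Decoding both lines shows when the first and the most recent error were recorded, which helps judge whether the problem is old or still ongoing.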

Илья Картавенко

Honored Flooder

Do you have a hard drive or USB flash drive with an ext4 filesystem connected to the router?

Mamay

Honored Flooder

Just now, Mihail_Boyanskiy said:

EXT4-fs (sda2): error count since last fsck: 64

The ext4 filesystem is damaged. It needs to be checked.

Search for GParted, for example; it can run from a live USB stick…

I'm not suggesting that you install a full GNU/Linux distribution.

Mihail_Boyanskiy

Advanced Member

The thing is, I formatted the flash drive as ext2 a couple of years ago and installed Entware on it, and this problem started appearing only recently.

Can it be repaired without removing the flash drive from the router? If not, please recommend software for Windows, if any exists.

Mamay

Honored Flooder

1 minute ago, Mihail_Boyanskiy said:

Can it be repaired without removing the flash drive from the router? If not, please recommend software for Windows, if any exists.

It would help to know your firmware version. If I remember correctly, 2.16 has a console tool for this. As for 3.6, I don't remember; I don't think it does.

I already pointed out the shortest path above.

Mihail_Boyanskiy

Advanced Member

The router firmware is 3.5.10. As for Entware, I updated it in March of this year, so it is some 3.x version.

Unfortunately, the path above doesn't work for me, because all of this is a couple of thousand kilometers away from me.

OK, I have Ubuntu Server 18.04 in a virtual machine. Can you suggest a diagnostic utility that will work on the server edition?

Mihail_Boyanskiy

Advanced Member

UP. I have another question on this topic:

When the router boots, the log contains these entries:

EXT4-fs (sda2): mounting ext2 file system using the ext4 subsystem
Jun 14 14:43:49 kernel
EXT4-fs (sda2): warning: mounting unchecked fs, running e2fsck is recommended

What is it recommending?

And then there is this:

Opkg::Manager: /tmp/mnt/e159-f240-b6e3-b521bb97ad2e initialized.
Jun 14 14:43:50 kernel
EXT4-fs error (device sda2): ext4_mb_generate_buddy:756: group 7, block bitmap and bg descriptor inconsistent: 32216 vs 32218 free clusters

The flash drive does basically work. My question is: what is the recommendation above?

This definitely started after I updated the router firmware to 3.5.10 a week ago.

AndreBA

Honored Flooder

20 minutes ago, Mihail_Boyanskiy said:

EXT4-fs (sda2): mounting ext2 file system using the ext4 subsystem
Jun 14 14:43:49 kernel
EXT4-fs (sda2): warning: mounting unchecked fs, running e2fsck is recommended

The filesystem is not in a correct state. The recommendation is to check it with e2fsck.

Check the flash drive, as you were advised.


Edited June 17, 2021 by AndreBA

Well that disk started as sdo:

Oct 14 20:07:28 Brunnhilde kernel: usb-storage 4-2.2:1.0: USB Mass Storage device detected
Oct 14 20:07:28 Brunnhilde kernel: scsi host10: usb-storage 4-2.2:1.0
Oct 14 20:07:29 Brunnhilde kernel: scsi 10:0:0:0: Direct-Access     Elite    Pro USB          0    PQ: 0 ANSI: 6
Oct 14 20:07:29 Brunnhilde kernel: sd 10:0:0:0: Attached scsi generic sg12 type 0
Oct 14 20:07:29 Brunnhilde kernel: sd 10:0:0:0: [sdo] 1953525168 512-byte logical blocks: (1.00 TB/932 GiB)
Oct 14 20:07:29 Brunnhilde kernel: sd 10:0:0:0: [sdo] Write Protect is off
Oct 14 20:07:29 Brunnhilde kernel: sd 10:0:0:0: [sdo] Mode Sense: 43 00 00 00
Oct 14 20:07:29 Brunnhilde kernel: sd 10:0:0:0: [sdo] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 14 20:07:33 Brunnhilde kernel: sdo: sdo1

Then it got disconnected and reconnected as sdp:

Oct 14 20:44:23 Brunnhilde kernel: usb 4-2.2: USB disconnect, device number 7
Oct 14 20:44:23 Brunnhilde kernel: blk_update_request: I/O error, dev sdo, sector 0
Oct 14 20:44:23 Brunnhilde kernel: sd 10:0:0:0: [sdo] Synchronizing SCSI cache
Oct 14 20:44:23 Brunnhilde kernel: sd 10:0:0:0: [sdo] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=0x00
Oct 14 20:44:23 Brunnhilde rc.diskinfo[17670]: PHP Warning: Missing argument 2 for force_reload() in /etc/rc.d/rc.diskinfo on line 691
Oct 14 20:44:23 Brunnhilde rc.diskinfo[17670]: SIGHUP received, forcing refresh of disks info.
Oct 14 20:44:23 Brunnhilde kernel: usb 4-2: new SuperSpeed USB device number 8 using xhci_hcd
Oct 14 20:44:23 Brunnhilde kernel: hub 4-2:1.0: USB hub found
Oct 14 20:44:23 Brunnhilde kernel: hub 4-2:1.0: 4 ports detected
Oct 14 20:44:23 Brunnhilde kernel: usb 4-2.2: new SuperSpeed USB device number 9 using xhci_hcd
Oct 14 20:44:23 Brunnhilde kernel: usb-storage 4-2.2:1.0: USB Mass Storage device detected
Oct 14 20:44:23 Brunnhilde kernel: scsi host11: usb-storage 4-2.2:1.0
Oct 14 20:44:24 Brunnhilde rc.diskinfo[17670]: PHP Warning: Missing argument 2 for force_reload() in /etc/rc.d/rc.diskinfo on line 691
Oct 14 20:44:24 Brunnhilde rc.diskinfo[17670]: SIGHUP received, forcing refresh of disks info.
Oct 14 20:44:25 Brunnhilde kernel: scsi 11:0:0:0: Direct-Access     Elite    Pro USB          0    PQ: 0 ANSI: 6
Oct 14 20:44:25 Brunnhilde kernel: sd 11:0:0:0: Attached scsi generic sg12 type 0
Oct 14 20:44:25 Brunnhilde kernel: sd 11:0:0:0: [sdp] 1953525168 512-byte logical blocks: (1.00 TB/932 GiB)
Oct 14 20:44:25 Brunnhilde kernel: sd 11:0:0:0: [sdp] Write Protect is off
Oct 14 20:44:25 Brunnhilde kernel: sd 11:0:0:0: [sdp] Mode Sense: 43 00 00 00
Oct 14 20:44:25 Brunnhilde kernel: sd 11:0:0:0: [sdp] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 14 20:44:25 Brunnhilde kernel: sdp: sdp1

The same thing happened repeatedly, and each time it gets reconnected with a different letter.

Your log is full of USB disconnects, which is one of the reasons USB devices are not really recommended.


Edited November 5, 2017 by johnnie.black
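
The disconnect-and-rename pattern in the log above can also be spotted mechanically. This is a rough sketch that assumes the kernel message wording seen in this post; `summarize_usb_events` and both regular expressions are illustrative names, not part of any existing tool.

```python
import re

# "usb 4-2.2: USB disconnect, device number 7"
DISCONNECT_RE = re.compile(r"usb (?P<port>[\d.-]+): USB disconnect")
# "sd 10:0:0:0: [sdo] 1953525168 512-byte logical blocks: ..."
ATTACH_RE = re.compile(r"\[(?P<name>sd[a-z]+)\] \d+ \d+-byte logical blocks")


def summarize_usb_events(lines):
    """Count USB disconnects per port and list disk names in order of appearance."""
    disconnects = {}
    names = []
    for line in lines:
        m = DISCONNECT_RE.search(line)
        if m:
            port = m.group("port")
            disconnects[port] = disconnects.get(port, 0) + 1
            continue
        m = ATTACH_RE.search(line)
        # Record a name only when it differs from the previous one (a rename).
        if m and (not names or names[-1] != m.group("name")):
            names.append(m.group("name"))
    return disconnects, names
```

Feeding it the syslog lines would give a per-port disconnect count and the sequence of letters the disk went through (here sdo, then sdp), which makes a flaky cable, hub, or port easier to point at.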

I have an 8 TB disk attached via USB3 (it’s plugged into a SATA cradle) and formatted into 3 EXT3 partitions, which I use as a backup drive.

The disk has been attached and mounted for several days without being explicitly written to (I backed up some data a couple of days ago).

I happened to take a look at dmesg and spotted the following (this is filtered to show only entries matching the disk name, sdg):

[393945.628890] EXT4-fs (sdg2): error count since last fsck: 4
[393945.628894] EXT4-fs (sdg2): initial error at time 1589268773: ext4_validate_block_bitmap:406
[393945.628897] EXT4-fs (sdg2): last error at time 1589336019: ext4_validate_block_bitmap:406
[394076.698059] EXT4-fs (sdg1): error count since last fsck: 103
[394076.698063] EXT4-fs (sdg1): initial error at time 1589216157: ext4_validate_block_bitmap:406
[394076.698066] EXT4-fs (sdg1): last error at time 1589372294: ext4_lookup:1590: inode 186081476

I’ve not run fsck on this disk since it was partitioned and formatted. Given that fsck has not been run, what is finding the errors, and how concerned should I be?

When I rebooted the system this morning I checked dmesg again and found (again filtered to show only entries matching sdg)

[  261.721822] sd 9:0:0:0: [sdg] Spinning up disk...
[  274.051062] sd 9:0:0:0: [sdg] 15628053168 512-byte logical blocks: (8.00 TB/7.28 TiB)
[  274.051065] sd 9:0:0:0: [sdg] 4096-byte physical blocks
[  274.051137] sd 9:0:0:0: [sdg] Write Protect is off
[  274.051140] sd 9:0:0:0: [sdg] Mode Sense: 43 00 00 00
[  274.051297] sd 9:0:0:0: [sdg] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[  274.051498] sd 9:0:0:0: [sdg] Optimal transfer size 33553920 bytes not a multiple of physical block size (4096 bytes)
[  274.134309]  sdg: sdg1 sdg2 sdg3
[  274.135296] sd 9:0:0:0: [sdg] Attached SCSI disk
[  274.654835] EXT4-fs (sdg3): mounting ext3 file system using the ext4 subsystem
[  274.696860] EXT4-fs (sdg3): warning: mounting fs with errors, running e2fsck is recommended
[  274.766709] EXT4-fs (sdg1): mounting ext3 file system using the ext4 subsystem
[  274.795109] EXT4-fs (sdg1): warning: mounting fs with errors, running e2fsck is recommended
[  274.825210] EXT4-fs (sdg2): mounting ext3 file system using the ext4 subsystem
[  274.891191] EXT4-fs (sdg2): warning: mounting fs with errors, running e2fsck is recommended
[  275.713323] EXT4-fs (sdg2): mounted filesystem with ordered data mode. Opts: (null)
[  276.460528] EXT4-fs (sdg3): mounted filesystem with ordered data mode. Opts: (null)
[  276.499085] EXT4-fs (sdg1): mounted filesystem with ordered data mode. Opts: (null)
[  578.549827] EXT4-fs (sdg1): error count since last fsck: 103
[  578.549830] EXT4-fs (sdg1): initial error at time 1589216157: ext4_validate_block_bitmap:406
[  578.549832] EXT4-fs (sdg1): last error at time 1589372294: ext4_lookup:1590: inode 186081476
[  578.549836] EXT4-fs (sdg3): error count since last fsck: 47
[  578.549837] EXT4-fs (sdg3): initial error at time 1589268525: htree_dirblock_to_tree:1022: inode 31604737: block 126419458
[  578.549840] EXT4-fs (sdg3): last error at time 1589380312: ext4_lookup:1594: inode 33701921
[  578.549844] EXT4-fs (sdg2): error count since last fsck: 4
[  578.549845] EXT4-fs (sdg2): initial error at time 1589268773: ext4_validate_block_bitmap:406
[  578.549847] EXT4-fs (sdg2): last error at time 1589336019: ext4_validate_block_bitmap:406
[  639.938843] EXT4-fs (sdg1): mounting ext3 file system using the ext4 subsystem
[  640.950738] EXT4-fs (sdg1): mounted filesystem with ordered data mode. Opts: (null)
[  650.900006] EXT4-fs (sdg2): mounting ext3 file system using the ext4 subsystem
[  651.207658] EXT4-fs (sdg2): mounted filesystem with ordered data mode. Opts: (null)
[  658.836040] EXT4-fs (sdg3): mounting ext3 file system using the ext4 subsystem
[  659.084558] EXT4-fs (sdg3): mounted filesystem with ordered data mode. Opts: (null)

So the system knows there are errors and has still mounted the disk without displaying any warnings other than the entries in dmesg.

Roughly 30 minutes later I checked again because I was curious now and found:

[  955.353027] EXT4-fs (sdg2): error count since last fsck: 3248
[  955.353031] EXT4-fs (sdg2): initial error at time 1589268773: ext4_validate_block_bitmap:406
[  955.353033] EXT4-fs (sdg2): last error at time 1589437923: ext4_map_blocks:604: inode 103686210: block 1947002998
[  955.353039] EXT4-fs (sdg1): error count since last fsck: 103
[  955.353040] EXT4-fs (sdg1): initial error at time 1589216157: ext4_validate_block_bitmap:406
[  955.353042] EXT4-fs (sdg1): last error at time 1589372294: ext4_lookup:1590: inode 186081476
[  956.751484] EXT4-fs error (device sdg2): ext4_map_blocks:604: inode #103686210: block 1947002998: comm updatedb.mlocat: lblock 12 mapped to illegal pblock 1947002998 (length 1)
[  956.767496] EXT4-fs error (device sdg2): ext4_map_blocks:604: inode #103686210: block 1947002998: comm updatedb.mlocat: lblock 12 mapped to illegal pblock 1947002998 (length 1)
[  956.782683] EXT4-fs warning (device sdg2): htree_dirblock_to_tree:994: inode #103686210: lblock 12: comm updatedb.mlocat: error -117 reading directory block

Eeek! The error count has increased for sdg2!

Again I’ve not explicitly written to the disk all this time.

Before partitioning & formatting the drive with gparted I used fsck to run a bad block scan (took several days) and no errors were found. This is also a new disk. For this reason, I’m reasonably confident that the hardware is good.

What is possibly going on here? How worried should I be about the integrity of filesystems on this disk? What should my next steps be?
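
The jump in the sdg2 error count described above is easier to follow if the counts are pulled out of dmesg per device. The sketch below assumes only the "error count since last fsck" line format shown in this post; `error_count_history` and `growing_devices` are hypothetical helper names.

```python
import re

COUNT_RE = re.compile(
    r"EXT4-fs \((?P<dev>[^)]+)\): error count since last fsck: (?P<count>\d+)"
)


def error_count_history(lines):
    """Map each device to the distinct error counts seen, in log order."""
    history = {}
    for line in lines:
        m = COUNT_RE.search(line)
        if not m:
            continue
        counts = history.setdefault(m.group("dev"), [])
        count = int(m.group("count"))
        # Skip repeats of the same value so only changes are kept.
        if not counts or counts[-1] != count:
            counts.append(count)
    return history


def growing_devices(history):
    """Return devices whose latest count is higher than the first one seen."""
    return sorted(dev for dev, counts in history.items() if counts[-1] > counts[0])
```

A device that shows up in `growing_devices` is still accumulating errors between samples, which is a stronger warning sign than a count that stays constant.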

Hi kind people,

First of all: I do not have physical access to the Pi, just SSH.

The problem:
I have a Pi 3B+ booting from a 1 TB USB HDD with Stretch.
Someone there unplugged the power and the Pi was gone for some days. Now it's back online, but I found some kernel messages in kern.log:

Jul 8 06:25:51 kernel: [56256.560073] EXT4-fs error (device sda2): ext4_lookup:1578: inode #1516286: comm updatedb.mlocat: deleted inode referenced: 1517646
Jul 8 06:25:51 kernel: [56256.575081] EXT4-fs error (device sda2): ext4_lookup:1578: inode #1516286: comm updatedb.mlocat: deleted inode referenced: 1517644
Jul 8 06:25:51 kernel: [56256.586311] EXT4-fs error (device sda2): ext4_lookup:1578: inode #1516286: comm updatedb.mlocat: deleted inode referenced: 1517648
Jul 8 06:25:54 kernel: [56259.776821] EXT4-fs error (device sda2): ext4_lookup:1578: inode #37191: comm updatedb.mlocat: deleted inode referenced: 1517629
Jul 9 06:25:03 kernel: [142608.634548] EXT4-fs error (device sda2): ext4_lookup:1578: inode #1516286: comm updatedb.mlocat: deleted inode referenced: 1517646
Jul 9 06:25:03 kernel: [142608.656301] EXT4-fs error (device sda2): ext4_lookup:1578: inode #1516286: comm updatedb.mlocat: deleted inode referenced: 1517644
Jul 9 06:25:03 kernel: [142608.667414] EXT4-fs error (device sda2): ext4_lookup:1578: inode #1516286: comm updatedb.mlocat: deleted inode referenced: 1517648
Jul 9 06:25:03 kernel: [142608.898870] EXT4-fs error (device sda2): ext4_lookup:1578: inode #37191: comm updatedb.mlocat: deleted inode referenced: 1517629
Jul 9 06:56:14 kernel: [144479.464526] EXT4-fs (sda2): error count since last fsck: 8
Jul 9 06:56:14 kernel: [144479.464535] EXT4-fs (sda2): initial error at time 1562559951: ext4_lookup:1578: inode 1516286
Jul 9 06:56:14 kernel: [144479.464545] EXT4-fs (sda2): last error at time 1562646303: ext4_lookup:1578: inode 37191

I have no idea why it's back online (whether it was doing something while booting or someone reconnected the power).

What's the best practice now?

The internet says running fsck on sda2 is a bad idea because it's mounted as /.

Adding fsck.mode=force to cmdline.txt is suggested, but the official docs don't say anything about this parameter.

I asked in the chat, and someone said that since it's not documented, he would run fsck even though the partition is mounted. But I'm not comfortable with that, to be honest, so I'm asking for help here.

My current cmdline.txt is the following:

dwc_otg.lpm_enable=0 console=serial0,115200 console=tty1 root=PARTUUID=0862402d-02 rootfstype=ext4 elevator=deadline fsck.repair=yes rootwait ipv6.disable=1

Thank you in advance!
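
For reference, `fsck.mode=` and `fsck.repair=` are kernel command-line options read by systemd-fsck, so whether `fsck.mode=force` takes effect on this Pi depends on the image using systemd's fsck units for the root filesystem; that is an assumption here, not something the post confirms. A minimal sketch of editing the line safely (the helper name `with_forced_fsck` is made up):

```python
def with_forced_fsck(cmdline):
    """Return the kernel command line with fsck.mode=force set exactly once."""
    # Drop any existing fsck.mode= setting so the option is never duplicated.
    params = [p for p in cmdline.split() if not p.startswith("fsck.mode=")]
    params.append("fsck.mode=force")
    # cmdline.txt must stay a single line, so join with plain spaces.
    return " ".join(params)


if __name__ == "__main__":
    current = (
        "dwc_otg.lpm_enable=0 console=serial0,115200 console=tty1 "
        "root=PARTUUID=0862402d-02 rootfstype=ext4 elevator=deadline "
        "fsck.repair=yes rootwait ipv6.disable=1"
    )
    print(with_forced_fsck(current))
```

Since physical access is not available, it would be prudent to keep a copy of the original cmdline.txt and to remove the option again after one successful boot, because otherwise the check would presumably run on every boot.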

Thanks. I tried that but I’m still getting the same error.

Nov 13 09:54:49 pve kernel: EXT4-fs (dm-9): warning: mounting fs with errors, running e2fsck is recommended
Nov 13 09:54:49 pve kernel: EXT4-fs (dm-9): mounted filesystem with ordered data mode. Opts: (null)

Nov 13 09:59:49 pve kernel: EXT4-fs (dm-9): error count since last fsck: 4
Nov 13 09:59:49 pve kernel: EXT4-fs (dm-9): initial error at time 1478905387: ext4_journal_check_start:56
Nov 13 09:59:49 pve kernel: EXT4-fs (dm-9): last error at time 1478908776: ext4_put_super:813

It’s a pretty new SSD drive, not that that means anything…

root@pve:~# fdisk -l

Disk /dev/ram0: 64 MiB, 67108864 bytes, 131072 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk /dev/ram1: 64 MiB, 67108864 bytes, 131072 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk /dev/ram2: 64 MiB, 67108864 bytes, 131072 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk /dev/ram3: 64 MiB, 67108864 bytes, 131072 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk /dev/ram4: 64 MiB, 67108864 bytes, 131072 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk /dev/ram5: 64 MiB, 67108864 bytes, 131072 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk /dev/ram6: 64 MiB, 67108864 bytes, 131072 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk /dev/ram7: 64 MiB, 67108864 bytes, 131072 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk /dev/ram8: 64 MiB, 67108864 bytes, 131072 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk /dev/ram9: 64 MiB, 67108864 bytes, 131072 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk /dev/ram10: 64 MiB, 67108864 bytes, 131072 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk /dev/ram11: 64 MiB, 67108864 bytes, 131072 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk /dev/ram12: 64 MiB, 67108864 bytes, 131072 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk /dev/ram13: 64 MiB, 67108864 bytes, 131072 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk /dev/ram14: 64 MiB, 67108864 bytes, 131072 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk /dev/ram15: 64 MiB, 67108864 bytes, 131072 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk /dev/sda: 119.2 GiB, 128035676160 bytes, 250069680 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: F6E39634-6FBF-4BCE-ADDD-F30AB03DA489

Device Start End Sectors Size Type
/dev/sda1 34 2047 2014 1007K BIOS boot
/dev/sda2 2048 262143 260096 127M EFI System
/dev/sda3 262144 250069646 249807503 119.1G Linux LVM

Disk /dev/mapper/pve-root: 29.8 GiB, 31943819264 bytes, 62390272 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/mapper/pve-swap: 7 GiB, 7516192768 bytes, 14680064 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/mapper/pve-vm--100--state--Oct_2016: 4.5 GiB, 4819255296 bytes, 9412608 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 65536 bytes / 65536 bytes
Disk /dev/mapper/pve-vm--100--state--Nov_2016: 4.5 GiB, 4819255296 bytes, 9412608 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 65536 bytes / 65536 bytes
Disk /dev/mapper/pve-vm--102--disk--1: 8 GiB, 8589934592 bytes, 16777216 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 65536 bytes / 65536 bytes
Disk /dev/mapper/pve-vm--101--disk--1: 18 GiB, 19327352832 bytes, 37748736 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 65536 bytes / 65536 bytes
Disk /dev/mapper/pve-vm--103--disk--1: 8 GiB, 8589934592 bytes, 16777216 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 65536 bytes / 65536 bytes
Disk /dev/mapper/pve-vm--100--disk--1: 32 GiB, 34359738368 bytes, 67108864 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 65536 bytes / 65536 bytes
Disklabel type: dos
Disk identifier: 0x0002290a

Device Boot Start End Sectors Size Id Type
/dev/mapper/pve-vm--100--disk--1p1 * 2048 718848 716801 350M 83 Linux
/dev/mapper/pve-vm--100--disk--1p2 718849 4913152 4194304 2G 82 Linux swap / Solaris
/dev/mapper/pve-vm--100--disk--1p3 4913153 7010304 2097152 1G 83 Linux
/dev/mapper/pve-vm--100--disk--1p4 7010305 67108863 60098559 28.7G f W95 Ext'd (LBA)
/dev/mapper/pve-vm--100--disk--1p5 7010306 27076608 20066303 9.6G 83 Linux
/dev/mapper/pve-vm--100--disk--1p6 27076610 38340608 11263999 5.4G 83 Linux
/dev/mapper/pve-vm--100--disk--1p7 38340610 64704512 26363903 12.6G 83 Linux
/dev/mapper/pve-vm--100--disk--1p8 64704514 66648064 1943551 949M 83 Linux

Partition 3 does not start on physical sector boundary.

Partition 4 does not start on physical sector boundary.

Partition 5 does not start on physical sector boundary.

Partition 6 does not start on physical sector boundary.

Partition 7 does not start on physical sector boundary.

Partition 8 does not start on physical sector boundary.

Partition 9 does not start on physical sector boundary.

Disk /dev/mapper/pve-vm--100--state--Nov_2016_2: 4.5 GiB, 4819255296 bytes, 9412608 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 65536 bytes / 65536 bytes

Nicias
Guru

Joined: 06 Dec 2005
Posts: 446

Posted: Sun Dec 17, 2017 6:40 pm    Post subject: EXT errors

I’ve been getting these errors every so often:

Code:
Dec 09 22:07:17 [kernel] [784634.848851] EXT4-fs (sda3): error count since last fsck: 1

Dec 09 22:07:17 [kernel] [784634.848855] EXT4-fs (sda3): initial error at time 1512259322: ext4_mb_generate_buddy:758

Dec 09 22:07:17 [kernel] [784634.848859] EXT4-fs (sda3): last error at time 1512259322: ext4_mb_generate_buddy:758

and earlier:

Code:
Nov 29 11:19:58 [kernel] [1552192.479561] EXT4-fs (sda3): error count since last fsck: 710

Nov 29 11:19:58 [kernel] [1552192.479565] EXT4-fs (sda3): initial error at time 1511004226: ext4_mb_complex_scan_group:1972

Nov 29 11:19:58 [kernel] [1552192.479569] EXT4-fs (sda3): last error at time 1511596639: ext4_mb_generate_buddy:758

Code:
Nov 28 10:51:58 [kernel] [1464112.095562] EXT4-fs (sda3): error count since last fsck: 710

Nov 28 10:51:58 [kernel] [1464112.095566] EXT4-fs (sda3): initial error at time 1511004226: ext4_mb_complex_scan_group:1972

Nov 28 10:51:58 [kernel] [1464112.095570] EXT4-fs (sda3): last error at time 1511596639: ext4_mb_generate_buddy:758

Is this a sign of a bad disk? A bad motherboard? Something else?

This is in a quite-old (it has a Core 2 Duo) laptop with an about 6 year old SSD in it.

Any suggestions about how to proceed would be helpful.

NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 51961
Location: 56N 3W

Posted: Sun Dec 17, 2017 6:47 pm

Nicias,

Maybe all of these things, maybe none of them.

What other errors are there in dmesg?

Put it all on a pastebin site please.

The output of

Code:
smartctl -a /dev/sda

would be useful.

Don’t run fsck unless you have a known good set of backups.

fsck often makes a bad situation worse.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-

those that do backups

those that have never had a hard drive fail.

Nicias
Guru

Joined: 06 Dec 2005
Posts: 446

Posted: Mon Dec 18, 2017 11:48 am

got a ton of new errors in the last day:

https://pastebin.com/s7KVxKmR

but SMART looks fine.

Code:
# smartctl -a /dev/sda

smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.12.12-gentoo] (local build)

Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===

Model Family:     Samsung based SSDs

Device Model:     Samsung SSD 840 Series

Serial Number:    S14CNEACA81371V

LU WWN Device Id: 5 002538 55002d356

Add. Product Id:  00000000

Firmware Version: DXT06B0Q

User Capacity:    120,034,123,776 bytes [120 GB]

Sector Size:      512 bytes logical/physical

Rotation Rate:    Solid State Device

Device is:        In smartctl database [for details use: -P show]

ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4c

SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)

Local Time is:    Mon Dec 18 06:47:20 2017 EST

SMART support is: Available - device has SMART capability.

SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===

SMART overall-health self-assessment test result: PASSED

General SMART Values:

Offline data collection status:  (0x80)   Offline data collection activity

               was never started.

               Auto Offline Data Collection: Enabled.

Self-test execution status:      (   0)   The previous self-test routine completed

               without error or no self-test has ever

               been run.

Total time to complete Offline

data collection:       (  240) seconds.

Offline data collection

capabilities:           (0x53) SMART execute Offline immediate.

               Auto Offline data collection on/off support.

               Suspend Offline collection upon new

               command.

               No Offline surface scan supported.

               Self-test supported.

               No Conveyance Self-test supported.

               Selective Self-test supported.

SMART capabilities:            (0x0003)   Saves SMART data before entering

               power-saving mode.

               Supports SMART auto save timer.

Error logging capability:        (0x01)   Error logging supported.

               General Purpose Logging supported.

Short self-test routine

recommended polling time:     (   2) minutes.

Extended self-test routine

recommended polling time:     (  30) minutes.

SCT capabilities:           (0x003d)   SCT Status supported.

               SCT Error Recovery Control supported.

               SCT Feature Control supported.

               SCT Data Table supported.

SMART Attributes Data Structure revision number: 1

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       24961
 12 Power_Cycle_Count       0x0032   097   097   000    Old_age   Always       -       2075
177 Wear_Leveling_Count     0x0013   094   094   000    Pre-fail  Always       -       52
179 Used_Rsvd_Blk_Cnt_Tot   0x0013   100   100   010    Pre-fail  Always       -       0
181 Program_Fail_Cnt_Total  0x0032   100   100   010    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   010    Old_age   Always       -       0
183 Runtime_Bad_Block       0x0013   100   100   010    Pre-fail  Always       -       0
187 Uncorrectable_Error_Cnt 0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0032   067   053   000    Old_age   Always       -       33
195 ECC_Error_Rate          0x001a   200   200   000    Old_age   Always       -       0
199 CRC_Error_Count         0x003e   100   100   000    Old_age   Always       -       0
235 POR_Recovery_Count      0x0012   099   099   000    Old_age   Always       -       214
241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       2562291420

SMART Error Log Version: 1

No Errors Logged

SMART Self-test log structure revision number 1

Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      1035         -
# 2  Extended offline    Completed without error       00%      1023         -
# 3  Short offline       Completed without error       00%      1000         -
# 4  Short offline       Completed without error       00%       999         -
# 5  Short offline       Completed without error       00%       975         -
# 6  Short offline       Completed without error       00%       951         -
# 7  Short offline       Completed without error       00%       927         -
# 8  Short offline       Completed without error       00%       903         -
# 9  Short offline       Completed without error       00%       869         -
#10  Extended offline    Completed without error       00%       855         -
#11  Short offline       Completed without error       00%       832         -
#12  Short offline       Completed without error       00%       831         -
#13  Short offline       Completed without error       00%       807         -
#14  Short offline       Completed without error       00%       783         -
#15  Short offline       Completed without error       00%       759         -
#16  Short offline       Completed without error       00%       735         -
#17  Short offline       Completed without error       00%       711         -
#18  Extended offline    Completed without error       00%       687         -
#19  Short offline       Completed without error       00%       664         -
#20  Short offline       Completed without error       00%       663         -
#21  Short offline       Completed without error       00%       633         -

SMART Selective self-test log data structure revision number 1

 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS

    1        0        0  Not_testing

    2        0        0  Not_testing

    3        0        0  Not_testing

    4        0        0  Not_testing

    5        0        0  Not_testing

  255        0    65535  Read_scanning was never started

Selective self-test flags (0x0):

  After scanning selected spans, do NOT read-scan remainder of disk.

If Selective self-test is pending on power-up, resume after 0 minute delay.



I have a known good backup.

NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 51961
Location: 56N 3W

Posted: Mon Dec 18, 2017 12:15 pm

Nicias,

There are no underlying drive errors in dmesg or in the smartctl output.
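To double-check a long self-test log like the one posted, a small script can scan it for entries that did not finish cleanly. This is only a sketch: the function name `failed_self_tests` and the regular expression are illustrative assumptions, and the column layout is taken from the rows shown in this thread, not from any formal smartctl specification.

```python
import re

# Matches self-test rows such as:
# "# 1  Short offline       Completed without error       00%      1035         -"
# The layout is inferred from the log in this thread (an assumption).
ROW = re.compile(
    r"^#\s*(?P<num>\d+)\s+"
    r"(?P<test>.+?)\s{2,}"
    r"(?P<status>.+?)\s{2,}"
    r"(?P<remaining>\d+)%\s+"
    r"(?P<hours>\d+)\s+"
    r"(?P<lba>\S+)\s*$"
)


def failed_self_tests(log_text):
    """Return (num, test, status, lba) for every entry that did not complete cleanly."""
    failures = []
    for line in log_text.splitlines():
        match = ROW.match(line.strip())
        if match is None:
            continue  # header, blank line, or unrelated output
        if match.group("status") != "Completed without error":
            failures.append(
                (
                    int(match.group("num")),
                    match.group("test"),
                    match.group("status"),
                    match.group("lba"),
                )
            )
    return failures
```

An empty list means every parsed self-test row reported "Completed without error". It says nothing about the SMART attributes or about filesystem-level damage.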

If you know your backup is good, remake the filesystem and restore the backup.

If the backup was made with those filesystem errors, you don’t know that it’s good, even if it seems to be.

Try fsck, but be warned that all it does is make the filesystem metadata self-consistent. It may trash your user data in the process of fixing the metadata.

That’s because in the face of missing or conflicting information, it guesses, and it can guess incorrectly.

All the bits that fsck doesn’t know what to do with end up in /lost+found, which should normally be empty.

You cannot fsck a mounted partition.
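A minimal sketch of that precaution: look the device up in /proc/mounts-style text and only build the e2fsck command when it is not mounted. The helper names `mounted_at` and `fsck_command` are hypothetical. The lookup is a plain string comparison on the device field, so a root filesystem listed as /dev/root or under another alias would not be caught.

```python
def mounted_at(device, mounts_text):
    """Return the mount points at which `device` appears in /proc/mounts-style text."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Field 0 is the device, field 1 is the mount point.
        if len(fields) >= 2 and fields[0] == device:
            points.append(fields[1])
    return points


def fsck_command(device, mounts_text):
    """Build an e2fsck command line, refusing if the device is mounted."""
    where = mounted_at(device, mounts_text)
    if where:
        raise RuntimeError(
            f"{device} is mounted at {', '.join(where)}; "
            "unmount it or boot from a live USB first"
        )
    # -f forces a check even if the filesystem is marked clean.
    return ["e2fsck", "-f", device]
```

On a live system the text would come from reading /proc/mounts, and the returned list could be passed to `subprocess.run`. For the system partition itself, booting from a rescue USB remains the practical route.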
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-

those that do backups

those that have never had a hard drive fail.

Nicias (Guru)
Posted: Mon Dec 18, 2017 3:22 pm

I ran fsck as recently as a couple of weeks ago (from a sysrescuecd USB stick).

This is just the system drive; it has no actual data on it, so I’m not worried about data loss. I’d clobber the whole thing and do a reinstall except for the time that would take.

I’ll wipe the disk and reinstall from the last backup. Why would these errors keep popping up?

NeddySeagoon (Administrator)
Posted: Mon Dec 18, 2017 3:49 pm

Nicias,

Unclean shutdowns or PSU problems of some sort. Maybe even RAM issues.

It’s worth a few cycles of memtest86. Be aware that memtest86 exercises most of the rest of the system too, so not all errors reported by memtest are due to RAM.

Nicias (Guru)
Posted: Wed Dec 20, 2017 3:06 am

Memtest ran for 24 hours and found no errors.

fsck found a ton of errors on sda3. sdb1 had no errors. sda is an internal SATA SSD. sdb1 is an externally (USB) attached spinning-rust drive.

Any suggestions? Bad drive? Bad motherboard/controller?

NeddySeagoon (Administrator)
Posted: Wed Dec 20, 2017 11:10 am

Nicias,

Bad SSD firmware?

Do you use trim/discard?

There are SSDs with problem firmware where trim can erase the wrong things.

There is one famous example where LBA 0 (the boot sector) would be trimmed, making the system impossible to boot.

-- edit --

Hmm …

Code:
Device Model:     Samsung SSD 840 Series

Let’s just add that that device has some history.

Nicias (Guru)
Posted: Wed Dec 20, 2017 12:47 pm

I don’t have trim or discard set.

So it seems like maybe I should get a new drive :/

NeddySeagoon (Administrator)
Posted: Wed Dec 20, 2017 1:26 pm

Nicias,

I wouldn’t go that far yet.

You have

Code:
Model Family:     Samsung based SSDs

Device Model:     Samsung SSD 840 Series



with

Code:
Firmware Version: DXT06B0Q

Is there a newer firmware?

What does it fix?

This tool may help. It’s probably Windows only.

In increasing order of risk:

It’s worth doing nothing and seeing whether the problems recur.

It’s worth making a new filesystem and restoring from backup.

That will issue a trim command to the entire partition at the start of mke2fs.

If the backup is not known to be good, it may not help.

Reinstall after making a new filesystem.

Very last: update the drive firmware, if there is an update.

Nicias (Guru)
Posted: Wed Dec 20, 2017 8:31 pm

The backup is file-level, not filesystem-level. Doesn’t that mean that if it is an accurate copy of a bad filesystem, it will just have files that are screwed up, not a corrupted filesystem?

So if I reformat the drive, recreate the filesystem and restore from backup, then I might just have some bad files, not a bad filesystem. In that case, would doing an emerge -e world (and recompiling the kernel) fix those files?

NeddySeagoon (Administrator)
Posted: Wed Dec 20, 2017 9:02 pm

Nicias,

That’s correct.

You run the risk that something important like glibc is broken, so you won’t be able to boot, or something in the toolchain is broken, so you won’t be able to build packages.

However, challenges like that can be fixed if they arise.

You would already have noticed both of those particular examples, but you get the idea.

It’s possible that the restored backup will not work as expected.

NeddySeagoon (Administrator)
Posted: Wed Dec 20, 2017 9:56 pm

Nicias,

Good luck!

Nicias (Guru)
Posted: Fri Dec 22, 2017 2:29 pm

So far everything is running smoothly.

I reformatted and reinstalled from backup, then rebuilt the toolchain, kernel, and world. I’m now doing the gcc upgrade for PIE (and a world rebuild); no fs errors yet. When this world rebuild is done, I’ll reboot to a live USB to check for fs errors.

In terms of trim/discard, it seems like best practice is to do that via a cron job. Is this correct?

NeddySeagoon (Administrator)
Posted: Fri Dec 22, 2017 5:58 pm

Nicias,

There are divided opinions on the use of trim/discard.

Once you issue a trim command, your data can be removed by the drive at any time.

There is generally no possibility of recovering data from trimmed space.

If that might matter to you, run fstrim manually when you are sure you won’t want anything back.

Beware that some drives take a long time to become ready after an fstrim.

I have one that takes over 10 minutes. If the system stays up, that’s fine; if you reboot, you might get a fright, as the drive seems to have failed.

Personally, I have the discard option in /etc/fstab but only the installed system is on the SSD.

/home is on rotating rust, so trim/discard does not apply.
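To see which of the two approaches a system currently uses, one option is to list the /etc/fstab entries that carry the discard mount option. Entries without it would rely on a periodic fstrim run instead. This is a sketch under assumptions: the helper name `discard_mounts` is hypothetical, and it only inspects the fourth (options) field of each fstab line.

```python
def discard_mounts(fstab_text):
    """Return (mount_point, fs_type) for fstab entries whose options include 'discard'."""
    found = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 4:
            continue  # not a complete fstab entry
        options = fields[3].split(",")
        if "discard" in options:
            found.append((fields[1], fields[2]))
    return found
```

Passing it the contents of /etc/fstab gives the mount points that trim continuously. Anything else on an SSD would need a manual or scheduled fstrim.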

Nicias (Guru)
Posted: Fri Dec 22, 2017 8:20 pm

NeddySeagoon,

There is no data on the SSD here either, so I set up a daily fstrim cron job.

Thanks for all of your help. After it finished emerging, I restarted from a thumb drive. Checked the filesystems, no errors. Hopefully this fixes it.

-Nick

NeddySeagoon (Administrator)
Posted: Fri Dec 22, 2017 11:31 pm

Nicias,

We don’t know what happened, so we cannot take any steps to stop it happening again.

All you can do is to watch for the errors recurring.