blockdev: ioctl error on BLKRRPART: Device or resource busy

Sometimes, when resizing or otherwise mucking about with partitions on a disk, cfdisk will say:

Wrote partition table, but re-read table failed. Reboot to update table.

(This also happens with other partitioning tools, so I’m thinking this is a Linux issue rather than a cfdisk issue.) Why is this, why does it only happen sometimes, and what can I do to avoid it?

Note: Please assume that none of the partitions I am actually editing are opened, mounted or otherwise in use.


Update:

cfdisk uses ioctl(fd, BLKRRPART, NULL) to tell Linux to reread the partition table. Two of the other tools recommended so far (hdparm -z DEVICE, sfdisk -R DEVICE) do exactly the same thing. The partprobe DEVICE command, on the other hand, seems to use a newer ioctl called BLKPG, which might be better; I don’t know. (It also falls back on BLKRRPART if BLKPG fails.)

BLKPG seems to be a “this partition has changed; here is the new size” operation, and it looked like partprobe called it individually on all the partitions on the device passed, so it should work if the individual partitions are unused. However, I have not had the opportunity to try it.

Below are the suggested solutions for the question above.

Suggestion: 1:

IMHO the most reliable/best answer is

partprobe /dev/sdX

Suggestion: 2:

Rereading partition table information doesn’t always work, but try

hdparm -z /dev/sda

or

sfdisk -R /dev/sda

If it works the values in /proc/partitions will change.
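
A minimal sketch of that check, assuming a plain before/after comparison of /proc/partitions is enough. The function name partitions_changed and the snapshot path in the comments are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: report whether the kernel's partition list changed.
# $1 = snapshot taken earlier
# $2 = current listing (defaults to /proc/partitions)
partitions_changed() {
    before=$1
    after=${2:-/proc/partitions}
    if cmp -s "$before" "$after"; then
        echo "unchanged"
        return 1
    fi
    echo "changed"
    return 0
}

# Typical use (the re-read itself needs root):
#   cat /proc/partitions > /tmp/partitions.before
#   hdparm -z /dev/sda
#   partitions_changed /tmp/partitions.before
```

An "unchanged" result after editing the table suggests the kernel is still using the old layout.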

Suggestion: 4:

Note: Please assume that none of the partitions I am actually editing
are opened, mounted or otherwise in use.

Given that assumption, the partition table can be successfully rescanned, and the issue won’t arise. If you’re getting that error, it’s because the partition table is currently in use, and hence can’t be re-scanned without creating inconsistencies.

Suggestion: 5:

It does not depend only on the partition that you are editing.

Suppose you have only one hard disk (/dev/sda) with two partitions (/dev/sda1, /dev/sda2), and you have mounted only one of them (/dev/sda1). If you delete or change anything about the other partition (/dev/sda2), which is not even mounted, you will get the error that re-reading the partition table failed, and the kernel will keep using the old table.

But if you have two hard disks (/dev/sda, /dev/sdb) and none of the partitions of /dev/sdb are in use, then you can add, delete, resize or edit partitions of /dev/sdb and they will be re-read without any problem. If even one partition of /dev/sdb is mounted during the change, however, the kernel will keep using the old table.

Suggestion: 6:

I (the original questioner) had a situation a few days ago when none of the other answers (including partprobe /dev/sdX, currently the accepted and highest-voted answer) worked. What did work, however, was this:

blockdev --rereadpt /dev/sdX

(I don’t know why this worked and the others didn’t, but I’m happy it did work, as it saved me a reboot on a busy server.)

Suggestion: 7:

I’m on CentOS 6.5 x64 (kernel 2.6.32), and I’m testing the fdisk trick to resize.

/dev/sda1 /boot
/dev/sda2 /

None of the following commands made the kernel reread the partition table:

I still need a reboot to make it work.

Suggestion: 8:

With all mount points unmounted, running Yocto 2.4:

partprobe /dev/sda 

Still failed to re-load the partition table after partitions had been deleted on the device. Also tried, and also failed:

udevadm trigger --subsystem-match=block; udevadm settle
hdparm -z /dev/sda
blockdev --rereadpt /dev/sda

All reported similar “BLKRRPART failed : device or resource busy…” errors instructing me to reboot. Is this failure of previously working methods possibly due to the fact that udev is now under systemd control? Thinking along those lines I tried:

systemctl restart systemd-udevd.service

And suddenly my disk is available again, without a reboot!

Suggestion: 9:

When a command like blockdev --rereadpt /dev/sdX fails with

blockdev: ioctl error on BLKRRPART: Device or resource busy

this usually means that some (old) partition is indeed still somehow used by the kernel.

Possible causes/fixes:

  1. an sdX partition – say sdX1 – is still mounted – check with mount and umount it
  2. /dev/sdX1 is part of a software raid – check cat /proc/mdstat and possibly stop the relevant arrays, e.g. mdadm --stop /dev/md126
  3. /dev/sdX1 is part of an LVM physical volume – check with pvdisplay/vgdisplay and possibly deactivate with vgchange
  4. /dev/sdX1 is part of some device mapping – e.g. via cryptsetup – check /dev/mapper and lsblk and possibly remove the mapping (e.g. cryptsetup luksClose)
  5. Race condition with some udev probing – check running processes with ps and possibly kill one

If one tool, say blockdev --rereadpt, fails, similar ones (partx -uv, kpartx, partprobe, kpartprobe) usually fail in the same way until the root cause is eliminated.
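
Causes 2 to 4 all show up in sysfs as "holders" of the partition, so one quick check is to list them. This is only a sketch: list_holders is a hypothetical name, and the optional second argument exists just so the function can be pointed at a different sysfs root.

```shell
#!/bin/sh
# Hypothetical helper: print "partition holder" pairs for every md or
# device-mapper device stacked on top of a partition of the given disk.
# $1 = disk name without /dev/ (e.g. sda)
# $2 = sysfs root (defaults to /sys)
list_holders() {
    disk=$1
    sysfs=${2:-/sys}
    for holder in "$sysfs/block/$disk/$disk"*/holders/*; do
        # An unmatched glob stays literal; skip it.
        [ -e "$holder" ] || continue
        part=${holder%/holders/*}
        echo "${part##*/} ${holder##*/}"
    done
}

# Example: list_holders sda
# A line such as "sda2 dm-0" means dm-0 is still stacked on sda2.
```

Empty output does not rule out causes 1 and 5, which do not appear as holders.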

Suggestion: 10:

You can also try:

echo 1 > /sys/block/sdX/device/rescan

(But won’t work, see the comment below)

Suggestion: 11:

kpartx -a <partition> can be run twice on a newly created partition instead of rebooting the system.

Suggestion: 12:

Remember to check that the udev service is running. This is especially relevant when partprobe, hdparm, blockdev, and various other commands do not seem to make any difference to which device files are available in the /dev/ directory.

Suggestion: 13:

For me, neither the partprobe nor the blockdev solution worked. However, this one works:

udevadm settle --exit-if-exists=/dev/sdb1

Suggestion: 14:

If you read the manpage for oracleasm-scandisks (man oracleasm-scandisks) you will note the text below. oracleasm uses /proc/partitions as the source for all the scanning it performs. You must get your raw devices listed in /proc/partitions before you can do a scandisk. The Scanorder and Scanexclude parameters you place in /etc/sysconfig/oracleasm relate to the names found in /proc/partitions (!!!!).

---------- man oracleasm-scandisks ------

HOW SCANNING HAPPENS
The scan proceeds in four basic stages.

   First, the list of disks to scan is created. If disks were specified on the command line, this is the list.
   If not, /proc/partitions is read, and each block device is added, subject to the -o and -x options.

   Second, the partition tables of each disk in the scan are reloaded unless the -s option was specified. Any
   disks that no longer exist are dropped.

   Third, the list of disks is recreated based on the new partition tables.

   Finally, each disk in the list is checked to see if it is marked for ASM use. Disks that are marked are
   instantiated.

Source: Reread partition table without rebooting?

Comments

@jl-montes

We’ve seen for the past several months occasional and random behavior when attempting disk installations after PXE booting to an in-memory version of CoreOS.

What we notice is that the coreos-install will fail and the last error indicates BLKRRPART: Device or resource busy

The work-around we’ve typically employed is to reboot, PXE boot again to CoreOS in-memory, then attempt the disk install again. 98-99% of the time we never see the error again and we get a successful install to disk.

Attached is a sample screen of when the random failure happens.

We’ve seen this on bare-metal blade servers and pizza-box servers, KVM VMs, and Hyper-V VMs in the past.

[Screenshot: coreos-blkrrpart-devicebusy]

@marineam

Hm, it is possible that we are racing with udev. I don’t know for sure, but maybe if udev triggered BLKRRPART before us and is now probing the filesystems on the partitions, the open partition device node(s) will cause the disk to be considered in use. We could call udevadm settle, but due to the way udev detects changes to disks it is hard to avoid racing with it. I will need to do more research to figure out the best solution.

@jumanjiman

FYI: we’ve seen the same issue and employed the same workaround as @jl-montes

@wdennis

Same problem experienced tonight (v444.4.0) on a bare-metal install from CD; booted again & retried install, happened again. Funny thing is after that I booted from disk to see what would happen, CoreOS was actually installed, just didn’t pick up my cloud-init (which was on a path mounted from USB drive.)

@ChrisGuilbault

I ran into this same issue repeatedly. I posted to the Google group, but I ran into this issue 19 out of 20 times, even with rebooting.

However, I found a second workaround that does not require a reboot.

After receiving the BLKRRPART error, I unmounted the drive I was installing to, fixed the GPT to span the entire drive, and ran the install script again. This worked 100% of the time. I tested it on three machines on which I had received the BLKRRPART or segfault errors while trying to install to disk.

@cimnine

Same for me as for @wdennis: the script failed with BLKRRPART, but CoreOS was actually installed and it booted. To be sure everything was all right, I shut down the instance of CoreOS that got installed, booted ISOLINUX again, and re-ran the install script. This time: success!

Some more info that might be helpful:
I have an Intel NUC. The disk is an SSD.
The disk had a multi-partition layout before. (Previously installed was XBMCbuntu with the ‘default’ XBMCbuntu partition layout. Not that I expect XBMCbuntu to be the cause of the trouble, but it might help you to get an idea of the previously configured partition layout.)

@apscomp

This error also happens when installing via ISO (CD-ROM). However, if you actually eject the CD and reboot, you will find that it has indeed installed CoreOS.

@domq

It seems unlikely that udev is the culprit: I just tried killing it until systemd gave up restarting it, and still

bash-4.2# sfdisk -R /dev/sda
sfdisk: BLKRRPART: Device or resource busy
sfdisk: This disk is currently in use.

Using CoreOS stable (CoreOS 607.0.0), also PXE-booted. I am at a loss to determine what is actually keeping the partitions busy, but since I can reproduce this at will, please feel free to suggest commands to try.

@marineam

Interesting, so there must be two races going on. There is a race with
udev, it uses inotify to detect if the full-disk node has been written to
and will then issue the same ioctl to have the kernel reprobe the device
and then as it receives the uevent back from the kernel it will recreate
/dev/disk/by-* symlinks so if you mount by label or similar too quickly it
can fail due to the missing link. We observe this while building CoreOS
images from time to time.

The explicit reprobe in this script is there because only relatively recent
versions of udev have this automatic behavior and even in versions that do
there isn’t a reliable way to wait for udev to do its magic. I previously
assumed that the busy errors were due to our reprobe happening while udev
was working but if that is not the case we have a whole new mystery on our
hands.

@cybertk

Encountered this issue on 633.1.0. Any update?

@threedliteguy

I had the same error installing from ISO. From USB I got Input/Output Error. Pulling up gparted from the Ubuntu live cd, it fixed errors in the GPT layout, which was all small partitions (I was installing to a 1TB HD). Reran twice, same error.

@scari

Same error here. PXE booted and Installing from stable channel. Just followed same workaround as @jl-montes but the success rate is way lower.

@tdeheurles

Same issue installing with 745.1.0.

I had the issue installing from an Ubuntu USB key, and then it worked.

Now I am trying to reinstall from inside the running CoreOS.
Can someone confirm that the coreos-install command can be executed from the running OS?

@stresler

We ran into this several times, and I think I found the culprit in at least one scenario.

We kept encountering this with client provisions and couldn’t reproduce in lab until we used the client’s exact userdata, and then it was intermittent.

When cloud-init contains anything that involves a download it seems to lock the device. The weird part is it does this even if the disk doesn’t have a filesystem on it at all.

I suspect docker or something else in third-party expects a disk to be present and if it isn’t it picks what it thinks should be the first one /dev/sda and tries to use it.

For our installer image, I removed any access to userdata until the second boot and haven’t encountered it again yet. Will update if I see it again.

@domq

So in the case of our farm of bare-metal boxes, I think that the issue was pre-existing LVM volumes.

Zapping them with vgremove prior to running coreos-install solves the issue for me. Teaching coreos-install to do same could be worthwhile, although slightly trickier for multi-disk systems.

@blemmenes

I can confirm using vgremove as @domq mentioned just worked for me on baremetal booted via IPMI mounted ISO.

@Evidlo

I had this issue, too. The disk used to be part of a RAID.

I went ahead and reformatted the disk with fdisk and it installed successfully. Not sure if this is actually the cause of the success though.

@Wintereise

Just ran into this, device used to be part of a LVM volume group.

Will remove metadata manually and see if it works.

@wgzhao

I had this issue, too.
I booted from the SystemRescueCd live OS, then ran coreos-install /dev/sda, but it gave me
BLKRRPART: Device or resource busy
I commented out
blockdev --rereadpt "${DEVICE}"

at about line 303

and it works.

@stresler

It seems like this issue is getting a bit muddied. Unfortunately the BLKRRPART error can be caused by a variety of reasons and sometimes is legitimate reporting (device is actually busy).

I think for this issue to remain valid we need to create specific reproduction steps and treat each verified set of reproduction steps as a new issue. The issue we had has been resolved (see this comment).

Does anyone have specific reproduction steps? Otherwise I suggest archiving this for reference and treating new instances as new issues with a focus on tracking it down per hardware setup.

Does that sound reasonable?

@gbock

One case is definitely active LVM Volume groups on DEVICE from a previous install of another OS (my use case is having to install CoreOS via grub from a previous CentOS install).

Pre Install state:

localhost ~ # parted -s /dev/sda print
Model: ATA CentOS to CoreOS (scsi)
Disk /dev/sda: 68.7GB
Sector size (logical/physical): 512B/4096B
Partition Table: msdos
Disk Flags: 

Number  Start   End     Size    Type     File system  Flags
 1      1049kB  525MB   524MB   primary  ext4         boot
 2      525MB   68.7GB  68.2GB  primary               lvm

localhost ~ # lsblk 
NAME                 MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                    8:0    0    64G  0 disk 
|-sda1                 8:1    0   500M  0 part 
`-sda2                 8:2    0  63.5G  0 part 
  |-VolGroup-lv_home 254:0    0    31G  0 lvm  
  |-VolGroup-lv_root 254:1    0  30.5G  0 lvm  
  `-VolGroup-lv_swap 254:2    0     2G  0 lvm  
sr0                   11:0    1  96.4M  0 rom  
sr1                   11:1    1   395M  0 rom  
loop0                  7:0    0 182.1M  0 loop /usr

Post Install state:

localhost ~ # parted -s /dev/sda print
Warning: Not all of the space available to /dev/sda appears to be used, you can fix the GPT to use all of the space (an extra 124928000 blocks) or continue with the current setting? 
Model: ATA CentOS to CoreOS (scsi)
Disk /dev/sda: 68.7GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags: pmbr_boot

Number  Start   End     Size    File system  Name        Flags
 1      2097kB  136MB   134MB   fat16        EFI-SYSTEM  boot, legacy_boot, esp
 2      136MB   138MB   2097kB               BIOS-BOOT   bios_grub
 3      138MB   1212MB  1074MB  ext2         USR-A
 4      1212MB  2286MB  1074MB               USR-B
 6      2286MB  2420MB  134MB   ext4         OEM
 7      2420MB  2487MB  67.1MB               OEM-CONFIG
 9      2487MB  4754MB  2267MB  ext4         ROOT

localhost ~ # lsblk 
NAME                 MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                    8:0    0    64G  0 disk 
|-sda1                 8:1    0   500M  0 part 
`-sda2                 8:2    0  63.5G  0 part 
  |-VolGroup-lv_home 254:0    0    31G  0 lvm  
  |-VolGroup-lv_root 254:1    0  30.5G  0 lvm  
  `-VolGroup-lv_swap 254:2    0     2G  0 lvm  
sr0                   11:0    1  96.4M  0 rom  
sr1                   11:1    1   395M  0 rom  
loop0                  7:0    0 182.1M  0 loop /usr

These should be set inactive before the image is written to DEVICE:

--- /usr/bin/coreos-install    2015-12-01 02:01:29.000000000 +0000
+++ /tmp/coreos-install 2015-12-03 16:20:37.535278241 +0000
@@ -284,6 +284,9 @@
 echo "Downloading the signature for ${IMAGE_URL}..."
 wget --inet4-only --no-verbose -O "${WORKDIR}/${SIG_NAME}" "${SIG_URL}"

+# Deactivate any LVM volume groups on DEVICE
+vgs -a -o +devices | awk "/ ${DEVICE//\//\\/}[0-9]/{print \$1}" | sort -u | xargs -n1 vgchange -an
+
 echo "Downloading, writing and verifying ${IMAGE_NAME}..."
 declare -a EEND
 if ! wget --inet4-only --no-verbose -O - "${IMAGE_URL}" 

I’m using this unit for now in my oem cloud config (a bit more of a hammer):

coreos:
  units:
    - name: deactivate-lvm.service
      command: start
      content: |
        [Service]
        Type=oneshot
        ExecStart=/bin/sh -c '/usr/sbin/vgs --noheadings -o vg_name | /usr/bin/xargs -n1 /usr/sbin/vgchange -an'

@coconutpilot

I can reproduce this quite easily PXE booting on a test cluster of Dell Optiplex 960 towers (desktop class machines) and running coreos-install. I wonder how this lower spec of hardware comes into play … (slow disks, unsafe caching, etc).

This is an example of a failed install:

core@fox3 ~ $ sudo coreos-install -C alpha -V current -d /dev/sda -c fox3.yaml 
Checking availability of "local-file"
Fetching user-data from datasource of type "local-file"
Downloading the signature for http://alpha.release.core-os.net/amd64-usr/current/coreos_production_image.bin.bz2...
2015-12-07 23:15:29 URL:http://alpha.release.core-os.net/amd64-usr/current/coreos_production_image.bin.bz2.sig [543/543] -> "/tmp/coreos-install.RsziI7s9km/coreos_production_image.bin.bz2.sig" [1]
Downloading, writing and verifying coreos_production_image.bin.bz2...
2015-12-07 23:16:13 URL:http://alpha.release.core-os.net/amd64-usr/current/coreos_production_image.bin.bz2 [231125324/231125324] -> "-" [1]
gpg: Signature made Thu Dec  3 02:51:35 2015 UTC using RSA key ID 1CB5FA26
gpg: key 93D2DCB4 marked as ultimately trusted
gpg: checking the trustdb
gpg: 3 marginal(s) needed, 1 complete(s) needed, PGP trust model
gpg: depth: 0  valid:   1  signed:   0  trust: 0-, 0q, 0n, 0m, 0f, 1u
gpg: Good signature from "CoreOS Buildbot (Offical Builds) <buildbot@coreos.com>" [ultimate]
blockdev: ioctl error on BLKRRPART: Device or resource busy

Both sync and blockdev --flushbufs work around this problem (for me at least).

$ diff -u /usr/bin/coreos-install /tmp/coreos-install 
--- /usr/bin/coreos-install     2015-11-05 02:11:46.000000000 +0000
+++ /tmp/coreos-install 2015-12-07 23:11:29.251420878 +0000
@@ -300,7 +300,7 @@
 fi

 # inform the OS of partition table changes
-blockdev --rereadpt "${DEVICE}"
+blockdev --flushbufs --rereadpt "${DEVICE}"

 if [[ -n "${CLOUDINIT}" ]] || [[ -n "${COPY_NET}" ]]; then
     # The ROOT partition should be #9 but make no assumptions here!

@coconutpilot

Before I submitted my pull request I wanted to do some more testing. A simpler test case of the bug is (may need to run a few times):

blockdev --rereadpt /dev/sda & blockdev --rereadpt /dev/sda & wait

Looking at the kernel ioctl: http://lxr.free-electrons.com/source/block/ioctl.c#L184

It’s wrapped in a mutex, so I am at a loss as to why this is happening.

As noted by @marineam in the first comment to the issue report it seems to be a race with udev, this is where udev does BLKRRPART:

https://github.com/systemd/systemd/blob/564c44436cf64adc7a9eff8c17f386899194a893/src/udev/udevd.c#L1043

This means my proposed fix blockdev --flushbufs only worked because it gave enough time for the rescan called by udevd to complete. Additionally @marineam had it figured out from the start.

A proposal for a solution:

  • add a pre-flight check that the install device has no mounted partitions
  • remove the call to blockdev --rereadpt as udevd already effectively makes the same ioctl.
  • add a sleep 1 between writing the filesystem to /dev/sdX and mounting. Perhaps udevadm settle could be used here, but docs focus on using that command at boot time.

Does this sound sane?
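
The fixed sleep in the last bullet could instead be sketched as a bounded retry. This is an illustration only, not what coreos-install does; retry is a hypothetical helper and /dev/sdX is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to N times, pausing between attempts.
# $1 = maximum number of attempts
# $2 = delay in seconds between attempts
# remaining arguments = the command to run
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            return 1    # give up; the last attempt also failed
        fi
        n=$((n + 1))
        sleep "$delay"
    done
    return 0
}

# Example (needs root; /dev/sdX is a placeholder):
#   udevadm settle
#   retry 5 1 blockdev --rereadpt /dev/sdX
```

A retry only helps when the device is busy for a short time, as in a race with udev. If something holds the disk open permanently, every attempt fails.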

@danilochilene

I got the same issue.

Only worked after deleting the VG.

@jotasixto

When I tried to install CoreOS stable (835.12.0) to disk on a XEN 3.16 PV guest, I got the BLKRRPART error.
My mistake was that the LVM volume had a malformed name. I had added a number to the end of the volume label, and I think this created a conflict in which the volume was interpreted as a partition.
I renamed the volume label without numbers at the end and it works fine.

@thereallukl

I just hit it when I am installing CoreOS on machines that had CentOS installed and for me removing DM mapping prior to running installer fixes the issue:

dmsetup remove centos_cnlvr01r07s2-root
dmsetup remove centos_cnlvr01r07s2-home
dmsetup remove centos_cnlvr01r07s2-swap

centos_blah_blah names can be listed from /dev/mapper/*.

After that I can write CoreOS to /dev/sda
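
A small sketch for listing those names, assuming every entry under /dev/mapper other than the control node is a mapping. list_dm_names is a hypothetical helper; the optional argument only overrides the directory.

```shell
#!/bin/sh
# Hypothetical helper: print device-mapper names found under /dev/mapper,
# skipping the "control" node.
# $1 = directory to scan (defaults to /dev/mapper)
list_dm_names() {
    dir=${1:-/dev/mapper}
    for node in "$dir"/*; do
        # An unmatched glob stays literal; skip it.
        [ -e "$node" ] || continue
        name=${node##*/}
        if [ "$name" != "control" ]; then
            echo "$name"
        fi
    done
}

# Example: list_dm_names
# Review the output before passing any name to "dmsetup remove".
```

The list covers every mapping on the system, not just those on the target disk, so it is worth checking each name before removing it.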

@ivarec

Just bumped into this issue on a machine that was previously a RAID-1 setup. Got it to work after trashing both disks’ MBR and rebooting:

BEWARE: this will trash all data on those disks!

dd if=/dev/zero of=/dev/sda bs=512 count=1
dd if=/dev/zero of=/dev/sdb bs=512 count=1
reboot

@crawford

@haolez which version of CoreOS was being used to run coreos-install?

@ivarec

@crawford 1122.2.0

EDIT Ah, sorry! I was using 1122.2.0 to run coreos-install. The one mentioned previously was what I was trying to install locally.

(Not a production environment)

@pfischer8989

This still happens on 1153.0.0 as well. Fixed it with @haolez dd trick.

@crawford

@haolez interesting. coreos-install dd’s over the MBR of the target disk, so one of those calls you mentioned shouldn’t be necessary. Is the firmware booting from the wrong device? What is the behavior you are seeing?

@crawford

Closing due to inactivity.

@honeyankit

Hi,

I am getting the below error while doing a pxeboot installation of CoreOS v1122.3.0 in virtualbox:

What we notice is that the coreos-install will fail and the last error indicates BLKRRPART: Device or resource busy

I tried a number of the suggestions mentioned above, but none of them solved this problem.
Any help?

@Merlin83b

Experienced this on current stable (1235.12.0). The storage was previously set up with LVM. As @wdennis stated, the install had actually completed and on reboot came up to CoreOS. The contents of my cloud-config.yaml hadn’t been applied so had to reinstall again. The second time round there were no errors and cloud-config.yaml settings were applied.

@samganesan

I was installing CoreOS 1576.4.0 on a machine that had Ubuntu installed with LVM active. I experienced the same thing, so I tried to reproduce it on a different machine that also had LVM installed, as well as on a negative control: a machine with no LVM in its previous install. This is completely reproducible on 4 attempts. For the fourth attempt, I dd-ed an LVM Linux system onto a machine before doing the CoreOS install. It installs, but without the cloud-config details, so I reboot and re-install, and that clears it up.

I was installing after booting into a live USB of ubuntu and I tried it with a live USB of a Centos as well. I wonder what is holding on to the disk. I will try later this weekend(we are getting a snow storm … so not much to do out there) with adding the deactivate LVM to my install script…

@bgilbert

It sounds as though the primary culprits are open LVM PVs and RAID volumes. coreos-install shouldn’t automatically close them, but it could check for this case by running blockdev --rereadpt early (followed by udevadm settle) and refuse to proceed with installation if that fails.

An ambitious implementation could also check whether the disk has open LVM PVs (pvs or similar?), RAID volumes (/proc/mdstat), mounted filesystems (/proc/mounts), or swap (/proc/swaps) and provide remediation instructions if found.
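
Part of such a check can be sketched in shell. The example below covers only mounted filesystems and swap, reading files in /proc/mounts and /proc/swaps format; LVM and RAID detection is left out. The function name disk_in_use is an assumption and is not part of coreos-install.

```shell
#!/bin/sh
# Hypothetical pre-flight check: list mounted filesystems and active swap
# whose device path starts with the given disk node.
# $1 = disk node (e.g. /dev/sda)
# $2 = file in /proc/mounts format (defaults to /proc/mounts)
# $3 = file in /proc/swaps format (defaults to /proc/swaps)
# Note: this is a plain prefix match, so /dev/sda also matches /dev/sdaa.
disk_in_use() {
    disk=$1
    mounts=${2:-/proc/mounts}
    swaps=${3:-/proc/swaps}
    awk -v d="$disk" 'index($1, d) == 1 { print "mounted: " $1 " on " $2 }' "$mounts"
    # The first line of /proc/swaps is a header, so skip it.
    awk -v d="$disk" 'NR > 1 && index($1, d) == 1 { print "swap: " $1 }' "$swaps"
}

# Example: disk_in_use /dev/sda
# Any output means the kernel is likely to refuse BLKRRPART for that disk.
```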

@shivarammysore

I did a disk install of CoreOS and I see the exact same issue.

blockdev: ioctl error on BLKRRPART: Device or resource busy
Failed to reread partition on /dev/sda

Here are the steps that I took:

  1. Download ISO image and burn it on a CD
  2. Boot CD and login
  3. Run command: coreos-install -d /dev/sda -i ignition.json
  4. ignition.json is similar to https://gist.github.com/shivarammysore/28d2d5fe520805451a5ff47ed8f0dfe4 with the complete RSA key.

@crawford

I just ran into this, and in my case it was because the disk currently had Container Linux installed. When I booted the ISO image, it mounted /dev/sda3 and /dev/sda9 to /usr and / respectively. I had to use dd to destroy both the primary and secondary GPT tables.

@leonardochaia

This is still happening and I can’t figure out the fix from this issue, can someone point me to a workaround?

$ sudo coreos-install -d /dev/sdb -i ignition.json
Current version of CoreOS Container Linux stable is 2191.5.0
Downloading the signature for https://stable.release.core-os.net/amd64-usr/2191.5.0/coreos_production_image.bin.bz2...
Loaded CA certificate '/etc/ssl/certs/ca-certificates.crt'
2019-09-17 17:40:36 URL:https://stable.release.core-os.net/amd64-usr/2191.5.0/coreos_production_image.bin.bz2.sig [566/566] -> "/tmp/coreos-install.ILrflXrixd/coreos_production_image.bin.bz2.sig" [1]
Downloading, writing and verifying coreos_production_image.bin.bz2...
Loaded CA certificate '/etc/ssl/certs/ca-certificates.crt'
^[
2019-09-17 17:48:10 URL:https://stable.release.core-os.net/amd64-usr/2191.5.0/coreos_production_image.bin.bz2 [481116178/481116178] -> "-" [1]
gpg: Signature made mié 04 sep 2019 01:27:14 -03
gpg:                using RSA key FD986FB096482F906F55B2EA01C9CAE767B3CA0E
gpg: key 50E0885593D2DCB4 marked as ultimately trusted
gpg: checking the trustdb
gpg: marginals needed: 3  completes needed: 1  trust model: pgp
gpg: depth: 0  valid:   1  signed:   0  trust: 0-, 0q, 0n, 0m, 0f, 1u
gpg: Good signature from "CoreOS Buildbot (Offical Builds) <buildbot@coreos.com>" [ultimate]
blockdev: ioctl error on BLKRRPART: Device or resource busy
Failed to reread partitions on /dev/sdb
blockdev: ioctl error on BLKRRPART: Device or resource busy
Failed to reread partitions on /dev/sdb
blockdev: ioctl error on BLKRRPART: Device or resource busy
Failed to reread partitions on /dev/sdb
blockdev: ioctl error on BLKRRPART: Device or resource busy
Failed to reread partitions on /dev/sdb

@leonardochaia

I managed to ‘fix’ the install on my thumb drive using @ivarec’s solution:

dd if=/dev/zero of=/dev/sda bs=512 count=1

If we’re going to wipe the device anyway, maybe the coreos-install script could do this for us?

@Cytrian

We had a similar problem with Flatcar. After some investigation I found that udevadm settle triggered a filesystem check.
We masked systemd-fsck@.service before starting the installer:

systemctl mask systemd-fsck@.service

[Impact]
The grml2usb autopkgtest frequently fails in Ubuntu’s CI. For example:
  https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-hirsute/hirsute/amd64/g/grml2usb/20201116_194744_cbe0f@/log.gz

An excerpt showing the error:
2020-11-17 14:14:49,362 Installing default MBR
2020-11-17 14:14:49,363 executing: dd if='/dev/loop0' of='/tmp/tmpa2umy9df' bs=512 count=1
2020-11-17 14:14:49,381 executing: dd if=/usr/share/grml2usb/mbr/mbrldr of=/tmp/tmpa2umy9df bs=439 count=1 conv=notrunc
2020-11-17 14:14:49,385 executing: dd if='/tmp/tmpa2umy9df' of='/dev/loop0' bs=512 count=1 conv=notrunc
2020-11-17 14:14:49,401 executing: sync
2020-11-17 14:14:49,433 Probing device via 'blockdev --rereadpt /dev/loop0'
blockdev: ioctl error on BLKRRPART: Device or resource busy
2020-11-17 14:14:49,452 Execution failed: ("Couldn't execute blockdev on '%s' (install util-linux?)", '/dev/loop0')

[Test Case]
Run the built-in autopkgtest on a cloud instance.

[Where problems might occur]
We may find that the system wide sync is necessary in some other environments, which could cause a similar issue to this one to appear there.

Hi there! I have been using UnRaid for a while, and I have 2 idle disks that I would like to preclear. IIRC I did this to them once before, but I am not sure, so I would like to preclear them again. However, I am having an issue with sdj and sdm.

When I use binhex-preclear docker to preclear sdj and sdm with command below:

preclear_binhex.sh -f -M 4 /dev/sdj

I got the error below:

blockdev: ioctl error on BLKRRPART: Device or resource busy

So I tried to delete the partitions in UD for sdj and sdm, (destructive mode enabled), and got the errors as below:

Aug 6 11:04:09 UnRaid unassigned.devices: Removing partition '1' from disk '/dev/sdj'.
Aug 6 11:04:09 UnRaid unassigned.devices: Remove parition failed result 'Error: Partition doesn't exist. '
Aug 6 11:04:23 UnRaid unassigned.devices: Removing partition '1' from disk '/dev/sdm'.
Aug 6 11:04:23 UnRaid unassigned.devices: Remove parition failed result 'Error: Partition doesn't exist. '

P.S. I did all this while the array was started, but this should not be a problem.

My overall goal is just to preclear both of them and then take them out of UnRaid. How can I do that? Thanks!


Edited August 6, 2020 by PzrrL

Solved

From DWIKI

Linux is a free Unix-type operating system originally created by Linus Torvalds with the assistance of developers around the world. Developed under the GNU General Public License, the source code for Linux is freely available to everyone.

Links

  • linux kernel
  • Distributions
  • https://kb.novaordis.com/index.php/Events_OS_Metrics
  • Linux on ircnet: #linux2

Rescue CDs

  • Scientific Linux
  • SystemRescueCd
  • Knoppix

Checking resources

  • inxi
  • http://www.slashroot.in/linux-system-io-monitoring

show what is doing most disk accesses

iotop
glances

IO statistics

iostat
vmstat
dstat
ioping
atop

CPU usage etc

top
atop
htop
vtop
fio
pidstat

Administration

  • About swap
  • POSIX ACLS
  • LVM
  • Linux Software Raid
  • Boot loader showdown: Getting to know LILO and GRUB
  • Linux Power Management

FAQ

check if interface exists

/sys/class/net/$IF
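
A minimal sketch of that check as a shell test. The helper name iface_exists and the default interface name eth0 are assumptions for illustration, not part of the original note:

```shell
# iface_exists is a hypothetical helper: it succeeds if the kernel exposes
# the named network interface under /sys/class/net.
iface_exists() {
    [ -e "/sys/class/net/$1" ]
}

IF=${IF:-eth0}   # assumed interface name; set IF to the one you care about
if iface_exists "$IF"; then
    echo "$IF exists"
else
    echo "$IF does not exist"
fi
```

Because the helper only tests for a sysfs entry, it needs no root privileges and no extra tools.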

check if virtual or physical machine

dmidecode -s system-manufacturer
virt-who
lscpu
virt-what

Find bios version

dmidecode -s bios-version

watch which files a process opens

watch ls -l /proc/`pidof clamd`/fd

pam_succeed_if(sudo:auth): requirement "uid >= 1000" not met

Grow (GPT) last partition to max available

Where ‘3’ is partition number

parted /dev/sdf resizepart 3 100%

Older parted releases used "resize" here; parted 3.x no longer accepts it. An alternative:

gdisk

timestamp to human readable

date -d @1522142497

List hardware

  • lshw
  • dmidecode
  • lsusb
  • hwinfo
  • lspci
  • lscpu
dmidecode -t baseboard

BLKRRPART: Device or resource busy

spend all day or reboot

calling ioctl to re-read partition table: Device or resource busy

rescan partition table

Keep an eye on

cat /proc/partitions

Partition(s) have been written, but we have been unable to inform the kernel of the change

partprobe
kpartx /dev/sdg
partx -uv /dev/sdg (this one worked on centos 7.3!)

This one gave new disk size!
Verified: CentOS 6, Ubuntu 22

echo 1 > /sys/block/sde/device/rescan

The only one that seemed to work on CentOS 7.x:

echo "- - -" > /sys/class/scsi_host/host2/scan

or scan all:

echo "- - -" | tee /sys/class/scsi_host/host*/scan

The winner so far:

blockdev --rereadpt /dev/sdg

but may throw blockdev: ioctl error on BLKRRPART: Device or resource busy

maybe

hdparm -z /dev/sdg
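
The commands above can be chained so that the next tool is tried when the previous one fails. This is only a sketch: reread_pt is a hypothetical helper name, the order of the tools is an assumption, and whether partprobe or partx succeeds where BLKRRPART returns "busy" depends on what is holding the disk.

```shell
# reread_pt is a hypothetical helper: try several ways of telling the kernel
# about a changed partition table and print the name of the tool that worked.
reread_pt() {
    dev=$1
    # Whole-disk re-read (BLKRRPART); this is the call that can fail with
    # "Device or resource busy".
    if blockdev --rereadpt "$dev" 2>/dev/null; then
        echo "blockdev"
        return 0
    fi
    # partprobe and partx update partitions one at a time, which may still
    # work when the whole-disk re-read is refused.
    if command -v partprobe >/dev/null 2>&1 && partprobe "$dev" 2>/dev/null; then
        echo "partprobe"
        return 0
    fi
    if command -v partx >/dev/null 2>&1 && partx -u "$dev" 2>/dev/null; then
        echo "partx"
        return 0
    fi
    echo "none"
    return 1
}
```

Run as root, for example reread_pt /dev/sdg, then compare /proc/partitions before and after to confirm the kernel picked up the change.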

Force reboot

I found sometimes ‘reboot’ and ‘shutdown’ don’t work in virtual machines, in that case try:

#sync
echo s > /proc/sysrq-trigger
#optionally umount
echo u > /proc/sysrq-trigger
#reboot
echo b > /proc/sysrq-trigger

Midnight Commander/dialog strange characters

Quick fix:

export LANG=en_US.ISO-8859-1

This problem seems to be related to screen/su —

kpartx failing silently

No output/result when trying to create:

kpartx -lv /dev/mapper/foo

this might mean the device hasn’t been partitioned

No output when trying to delete:

kpartx -d /dev/mapper/foop1

means you should use

kpartx -d /dev/mapper/foo

Create swapfile

fallocate -l 4G /var/swapfile
chmod 600 /var/swapfile
mkswap /var/swapfile

Temporary:

swapon -v /var/swapfile

In fstab: /var/swapfile swap swap sw,pri=-1

swapon failed: Invalid argument

Probably caused by creating the swapfile with fallocate on RHEL/CentOS; use dd instead.
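
A sketch of the dd variant, assuming the failure comes from swapon treating the preallocated file as having holes. The helper name make_swapfile and its arguments are hypothetical; the path and size in the usage note mirror the fallocate example.

```shell
# make_swapfile is a hypothetical helper: build a swapfile with dd instead of
# fallocate, so every block of the file is actually written.
make_swapfile() {
    file=$1      # path of the swapfile, e.g. /var/swapfile
    size_mb=$2   # size in MiB, e.g. 4096 for 4G
    dd if=/dev/zero of="$file" bs=1M count="$size_mb" || return 1
    # Swap contents should not be readable by other users.
    chmod 600 "$file" || return 1
    mkswap "$file"
}
```

As root, make_swapfile /var/swapfile 4096 followed by swapon -v /var/swapfile would replace the fallocate steps. Writing 4G of zeros takes noticeably longer than fallocate.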

partx: error adding partition

try kpartx instead

Find maximum depth of directory

find /var/www -type d | awk -F"/" 'NF > max {max = NF} END {print max}'

pvcreate: Cannot use /dev/sdb: device is partitioned

Even /proc/partitions will confirm there are no partitions, so to convince pvcreate:

wipefs --all /dev/sdb

Assuming that you’re getting this as a result of automating (e.g., using expect) the fdisk operation (and that the partition isn’t actually mounted), try adding a few seconds of delay after modifying the partition and before writing the partition table.

I got the same error when I was trying to automate a call to fdisk on CentOS 7.6, along these lines:

# (echo "d"; echo "";
        echo "n"; echo ""; echo 3; echo 2001954; echo "";
        echo "w") | fdisk /dev/sdb
Welcome to fdisk (util-linux 2.23.2).

Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.


Command (m for help): Partition number (1-3, default 3): Partition 3 is deleted

Command (m for help): Partition type:
   p   primary (2 primary, 0 extended, 2 free)
   e   extended
Select (default p): Using default response p
Partition number (3,4, default 3): First sector (2001954-31116287, default 2002944): Last sector, +sectors or +size{K,M,G} (2001954-31116287, default 31116287): Using default value 31116287
Partition 3 of type Linux and of size 13.9 GiB is set

Command (m for help): The partition table has been altered!

Calling ioctl() to re-read partition table.

WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table. The new table will be used at
the next reboot or after you run partprobe(8) or kpartx(8)
Syncing disks.

My suspicion was that my piped-in command stream was surfacing a timing issue in fdisk (that wouldn’t be triggered by slower/manual input) so I started sprinkling sleep commands to delay various inputs until the error went away. The problem in my case was that the w was happening too soon after the new partition was defined.

A sleep 5 before the w results in consistent success:

# (echo "d"; echo "";
        echo "n"; echo ""; echo 3; echo 2001954; echo "";
        sleep 5; echo "w") | fdisk /dev/sdb
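
The same idea can be wrapped in a small function so the delay is not buried in a chain of echo calls. This is a sketch only: fdisk_answers is a hypothetical helper name, and the 5-second delay is the value that happened to work in the case above, not a guaranteed fix.

```shell
# fdisk_answers is a hypothetical helper: print each fdisk answer on its own
# line, sleeping for the given number of seconds just before the final "w".
fdisk_answers() {
    delay=$1
    shift
    for answer in "$@"; do
        if [ "$answer" = "w" ]; then
            sleep "$delay"
        fi
        printf '%s\n' "$answer"
    done
}

# Usage, equivalent to the pipeline above (run as root):
#   fdisk_answers 5 d "" n "" 3 2001954 "" w | fdisk /dev/sdb
```

An empty string argument stands for pressing Enter to accept fdisk's default.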
