Bug 769359

Summary: virt-resize on RHEL 6 kernel fails to re-read the partition table
Product: Red Hat Enterprise Linux 6 Reporter: Richard W.M. Jones <rjones>
Component: libguestfs Assignee: Richard W.M. Jones <rjones>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 6.2 CC: grant_williamson, jzheng, leiwang, mbooth, pcao, qguan, qwan, virt-maint
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: libguestfs-1.16.3-1.el6 Doc Type: Bug Fix
Doc Text:
Cause: When a block device is closed, udev fires off a process which opens the block device again.
Consequence: libguestfs operations which write to a disk then rely on the disk being immediately free (eg. so the kernel can reread the partition table) would occasionally fail. There are many such operations, but one in particular was 'virt-resize'.
Fix: After libguestfs has written to a block device, it is now careful to wait for the udev action to finish before returning.
Result: virt-resize and other operations won't fail intermittently for no apparent reason.
Story Points: ---
Clone Of: 769304 Environment:
Last Closed: 2012-06-20 07:00:09 UTC Type: ---
Bug Depends On: 769304    
Bug Blocks:    
Attachments:
Description Flags
Simple reproducer. none

Description Richard W.M. Jones 2011-12-20 15:26:05 UTC
+++ This bug was initially created as a clone of Bug #769304 +++

Description of problem:

Unable to resize Windows XP guest using virt-resize.

Fatal error: exception Guestfs.Error("part_set_bootable: do_part_set_bootable: parted: /dev/vdb: Warning: WARNING: the kernel failed to re-read the partition table on /dev/vdb (Device or resource busy).  As a result, it may not reflect all of your changes until after reboot.")

Version-Release number of selected component (if applicable):

RHEL 6.2 + libguestfs 1.14.1 (from preview repository)

Trace output is:

libguestfs: trace: part_add "/dev/sdb" "primary" 63 16772863
libguestfs: trace: part_add = 0
libguestfs: trace: copy_device_to_device "/dev/sda1" "/dev/sdb1" "size:6432136704"
libguestfs: trace: copy_device_to_device = 0
libguestfs: trace: part_set_bootable "/dev/sdb" 1 true
libguestfs: trace: part_set_bootable = -1 (error)
Fatal error: exception Guestfs.Error("part_set_bootable: do_part_set_bootable: parted: /dev/vdb: Warning: WARNING: the kernel failed to re-read the partition table on /dev/vdb (Device or resource busy).  As a result, it may not reflect all of your changes until after reboot.")

Debug output doesn't particularly add anything.
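
When a trace is long, the failing call can be picked out mechanically. The following is a minimal Python sketch, assuming result lines have the shape shown in the excerpt above ("libguestfs: trace: NAME = VALUE (error)"). The function name failed_calls is hypothetical and is not part of libguestfs.

```python
import re

# Matches trace *result* lines that are flagged as errors, for example:
#   libguestfs: trace: part_set_bootable = -1 (error)
# Assumption: the line format is the one seen in the trace excerpt above.
_ERROR_RESULT = re.compile(r"^libguestfs: trace: (\w+) = .*\(error\)\s*$")


def failed_calls(trace_lines):
    """Return the names of API calls whose result line is marked (error)."""
    failed = []
    for line in trace_lines:
        match = _ERROR_RESULT.match(line)
        if match:
            failed.append(match.group(1))
    return failed
```

Fed the six trace lines above, this reports only part_set_bootable, which matches the fatal error.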

Comment 1 Richard W.M. Jones 2011-12-20 15:27:34 UTC
Reproducer is simple.  Take any Windows XP guest and do:

  truncate -s 10G resized.img      # 10G > current size
  virt-resize winxp.img resized.img --expand sda1

Comment 2 Richard W.M. Jones 2011-12-22 00:02:56 UTC
Some important points to note:

The failure only seems to occur on baremetal, not when
virt-resize is run inside a virtual machine.  This may
instead indicate that the problem is timing-related,
since virt-resize runs much slower in a VM.

I tested baremetal vs in a VM, for identical versions of
the kernel, qemu-kvm and parted, so it doesn't appear to
be caused by differing versions of any of these.

Also it doesn't seem to matter whether or not
libguestfs-winsupport is installed.  Without
libguestfs-winsupport, NTFS resizing is not done, but that
doesn't appear to make any difference to whether we can
reproduce this bug.

Target disk size (eg. 8G, 10G, etc) doesn't appear to
make any difference.

Comment 3 Richard W.M. Jones 2012-01-30 11:23:57 UTC
Might be caused by udev/blkid:
https://rwmj.wordpress.com/2012/01/19/udev-unexpectedness/#content

Comment 4 Richard W.M. Jones 2012-02-06 10:57:11 UTC
This was also reported for a CentOS 6.2 *guest* using
the RHEL 6.3 preview packages.  See:

https://bugzilla.redhat.com/show_bug.cgi?id=769304#c6

Comment 5 Richard W.M. Jones 2012-02-06 20:12:01 UTC
Created attachment 559745
Simple reproducer.

Simple reproducer for this bug.

Comment 6 Richard W.M. Jones 2012-02-06 21:38:33 UTC
(In reply to comment #5)
> Created attachment 559745
> Simple reproducer.
> 
> Simple reproducer for this bug.

Instructions:

(1) Use RHEL 6.3, preferably on baremetal.

(2) Download the attachment.

(3) chmod +x reproducer769359.pl

(4) ./reproducer769359.pl

The program should fail relatively quickly with an error like this one:

part_set_bootable: do_part_set_bootable: parted: /dev/vda: Warning: WARNING: the kernel failed to re-read the partition table on /dev/vda (Device or resource busy).  ...

If the program keeps running for a long period of time (eg. > 5 minutes)
then the bug is probably not present.  However note that the bug is
very timing-dependent.
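
The "run until it fails or five minutes pass" procedure can be sketched as a small harness. This is a hedged Python illustration only; it is not the attached Perl reproducer, whose contents are not shown in this report. The name run_until_failure and its parameters are hypothetical, and the operation passed in is whatever libguestfs call sequence is being tested.

```python
import time


def run_until_failure(operation, max_seconds=300.0, clock=time.monotonic):
    """Call operation() repeatedly until it raises or max_seconds elapse.

    Returns (iterations_completed, exception), where exception is None
    if the time limit was reached without any failure.
    """
    deadline = clock() + max_seconds
    iterations = 0
    while clock() < deadline:
        try:
            operation()
        except Exception as exc:
            # First failure: report how many clean iterations preceded it.
            return iterations, exc
        iterations += 1
    return iterations, None
```

Because the bug is timing-dependent, a run that ends with no exception only suggests that the bug is absent; it does not prove it.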

Comment 8 Qixiang Wan 2012-03-15 05:10:42 UTC
Verified with libguestfs-1.16.10-1.el6.

With the old package libguestfs-1.16.2-1, the defect can be reproduced within 1 minute using the reproducer from comment 5. After updating to libguestfs-1.16.10-1.el6, the reproducer runs without error for more than 20 minutes.

Comment 9 Richard W.M. Jones 2012-04-26 13:33:52 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause:
When a block device is closed, udev fires off a process which opens the block device again.

Consequence:
libguestfs operations which write to a disk then rely on the disk being immediately free (eg. so the kernel can reread the partition table) would occasionally fail.  There are many such operations, but one in particular was 'virt-resize'.

Fix:
After libguestfs has written to a block device, it is now careful to wait for the udev action to finish before returning.

Result:
virt-resize and other operations won't fail intermittently for no apparent reason.
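
The pattern described in the fix (write to the device, then wait for udev before returning) can be illustrated outside libguestfs. The following is a minimal Python sketch, assuming the "udevadm settle" command is available on the host. The helper name run_then_settle and the injectable runner parameter are hypothetical. The actual change is in libguestfs itself and is not reproduced here.

```python
import subprocess


def run_then_settle(write_cmd, settle_cmd=("udevadm", "settle"),
                    runner=subprocess.run):
    """Run a command that writes to a block device, then wait for udev.

    Assumption: settle_cmd blocks until pending udev events have been
    handled, so a follow-up operation is less likely to find the
    device busy.
    """
    # Perform the write; check=True raises CalledProcessError on failure.
    runner(list(write_cmd), check=True)
    # Wait for udev to finish processing the events the write triggered.
    runner(list(settle_cmd), check=True)
```

For example, run_then_settle(["parted", "-s", "/dev/sdb", "set", "1", "boot", "on"]) would set the boot flag and return only after udev has gone idle. The device name here is illustrative.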

Comment 11 errata-xmlrpc 2012-06-20 07:00:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-0774.html