Bug 618317
Summary: | RFE: RHEL5 Xen: support online dynamic resize of guest virtual disks | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 5 | Reporter: | Pasi Karkkainen <pasik> |
Component: | kernel-xen | Assignee: | Laszlo Ersek <lersek> |
Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
Severity: | medium | Docs Contact: | |
Priority: | low | ||
Version: | 5.6 | CC: | bmr, drjones, jentrena, jzheng, leiwang, lersek, mrezanin, pbonzini, qwan, xen-maint, yufang521247, yuzhang, yuzhou |
Target Milestone: | rc | Keywords: | FutureFeature |
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | kernel-2.6.18-284.el5 | Doc Type: | Enhancement |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2012-02-21 03:28:15 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 514490, 648851 | ||
Attachments: |
Description
Pasi Karkkainen
2010-07-26 16:27:07 UTC
Support for online dynamic resize of Xen PV domU disks was added in the upstream kernel.org Linux 2.6.36 xen-blkfront driver. Related RHEL6 RFE: https://bugzilla.redhat.com/show_bug.cgi?id=654982. The patch is simple; a backport should be possible.

Reproducing the problem (recorded here also to help QE):

dom0 running x86_64 -259. Create a new LV (PE size is 32 MB in VolGroup0):

    lvcreate --extents=1 --name=bz618317 VolGroup0

Attach it to the guest (running x86_64 -260):

    xm block-attach rhel56-64bit-pv phy:/dev/mapper/VolGroup0-bz618317 xvdb w

In the guest:

    fdisk -ul /dev/xvdb
    255 heads, 63 sectors/track, 4 cylinders, total 65536 sectors

Resize the LV in dom0:

    lvresize --extents=+1 /dev/VolGroup0/bz618317
    Extending logical volume bz618317 to 64.00 MB

Repeat the fdisk check in the guest -- the size is unchanged:

    fdisk -ul /dev/xvdb
    255 heads, 63 sectors/track, 4 cylinders, total 65536 sectors

(In reply to comment #0)
> Upstream patch here:

(Upstream changed their URL scheme.) linux-2.6.18-xen.hg c/s 1005:f7f420bd7b7a: http://xenbits.xen.org/hg/linux-2.6.18-xen.hg/rev/f7f420bd7b7a

Also backport 1006:13e25228ce40 (http://xenbits.xen.org/hg/linux-2.6.18-xen.hg/rev/13e25228ce40), which moves the definition of struct backend_info from xenbus.c to common.h -- the new vbd_resize() function in vbd.c needs the complete type to dereference blkif->be.

Also backport some include changes from xen-unstable c/s 12333:4eaadb2ae198... Now that common.h defines struct backend_info, client code needs to know struct xenbus_watch as well. Move the inclusion of xenbus.h from the individual C files into "common.h".

Created attachment 498339 [details]
move definition of struct backend_info around
Backport of linux-2.6.18-xen.hg c/s 1006:13e25228ce40.
Plus some #include massaging according to xen-unstable c/s 12333:4eaadb2ae198.
Created attachment 498340 [details]
vbd resizing
backport of linux-2.6.18-xen.hg c/s 1005:f7f420bd7b7a
Brew build: https://brewweb.devel.redhat.com/taskinfo?taskID=3316497

I installed the kernel-xen package in both dom0 and domU, then increased the LV size twice, checking with fdisk in the domU each time (see comment 3). The first check went okay: dom0 logged "VBD Resize: new size 131072", fdisk displayed the increased size, and the domU logged "Setting capacity to 131072". However, after the second lvresize command returned successfully in dom0 (dmesg: "VBD Resize: new size 196608") and I issued "fdisk -ul /dev/xvdb" in the domU, the fdisk command hung.

dom0 message:

    INFO: task blkback.1.xvdb:4875 blocked for more than 120 seconds.
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    blkback.1.xvd D ffff88019bed37a0 0 4875 19 4671 (L-TLB)
    ffff88019bec3d90 0000000000000246 ffffffff880756b7 ffffffff880bcc4a
    000000000000000a ffff88019bed37a0 ffff8801da7590c0 0000000000007e8e
    ffff88019bed3988 0000000000000015
    Call Trace:
    [<ffffffff880756b7>] :scsi_mod:scsi_done+0x0/0x18
    [<ffffffff880bcc4a>] :libata:ata_scsi_rw_xlat+0x0/0x188
    [<ffffffff8028df64>] printk+0x52/0xc6
    [<ffffffff802634a3>] __down_read+0x82/0x9a
    [<ffffffff803bd22e>] xenbus_transaction_start+0x15/0x62
    [<ffffffff888f4e5b>] :blkbk:vbd_resize+0x7e/0x12f
    [<ffffffff888f3a46>] :blkbk:blkif_schedule+0x6f/0x4c9
    [<ffffffff888f39d7>] :blkbk:blkif_schedule+0x0/0x4c9
    [<ffffffff8029d046>] keventd_create_kthread+0x0/0xc4
    [<ffffffff802339a3>] kthread+0xfe/0x132
    [<ffffffff8025fb2c>] child_rip+0xa/0x12
    [<ffffffff8029d046>] keventd_create_kthread+0x0/0xc4
    [<ffffffff802338a5>] kthread+0x0/0x132
    [<ffffffff8025fb22>] child_rip+0x0/0x12

domU message:

    INFO: task fdisk:2467 blocked for more than 120 seconds.
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    fdisk D 000000a7d1bc1dc1 0 2467 2421 (NOTLB)
    ffff88002ec99c08 0000000000000282 ffff88002f3ca040 0000000000000000
    0000000000000008 ffff88003f389080 ffffffff804feb80 0000000000007675
    ffff88003f389268 ffffffff80263909
    Call Trace:
    [<ffffffff80263909>] _spin_lock_irqsave+0x9/0x14
    [<ffffffff8022947b>] sync_page+0x0/0x43
    [<ffffffff8022947b>] sync_page+0x0/0x43
    [<ffffffff802625e5>] io_schedule+0x3f/0x67
    [<ffffffff802294b9>] sync_page+0x3e/0x43
    [<ffffffff80262729>] __wait_on_bit_lock+0x36/0x66
    [<ffffffff8024164e>] __lock_page+0x5e/0x64
    [<ffffffff8029d28c>] wake_bit_function+0x0/0x23
    [<ffffffff8020ca5c>] do_generic_mapping_read+0x1de/0x391
    [<ffffffff8020d8fd>] file_read_actor+0x0/0x101
    [<ffffffff8020cd5b>] __generic_file_aio_read+0x14c/0x198
    [<ffffffff802c02ad>] generic_file_read+0xac/0xc5
    [<ffffffff8031e886>] inode_has_perm+0x56/0x63
    [<ffffffff8029d25e>] autoremove_wake_function+0x0/0x2e
    [<ffffffff80243d20>] do_ioctl+0x21/0x6b
    [<ffffffff8032140b>] selinux_file_permission+0x9f/0xb4
    [<ffffffff8020bd4f>] vfs_read+0xcb/0x171
    [<ffffffff802126ef>] sys_read+0x45/0x6e
    [<ffffffff8025f2f9>] tracesys+0xab/0xb6

The domU side waits for the dom0 side. The dom0 side is blocked on a semaphore operation in xenbus_transaction_start(). This looks like a deadlock. (A printk() within __down_read() seems outright garbled.)

The upstream version's context has try_to_freeze(). Related changesets:

    http://xenbits.xen.org/hg/linux-2.6.18-xen.hg/rev/c09686d2bbff
    http://xenbits.xen.org/hg/linux-2.6.18-xen.hg/rev/cb50d25a9468

I have no idea whether they would fix the deadlock. I assume not: according to Documentation/power/kernel_threads.txt, freezing only matters when the system (dom0) is suspended.

... Okay, after some digging, this is my understanding: the original patch was submitted by Ky Srinivasan on 09 Mar 2010 [1]. On 16 Mar 2010, Joost Roeleveld reported the same problem as described above [2]. He also hit it on the second resizing attempt.
On 18 Mar 2010, Ky resubmitted the corrected patch as a full patch [3], and then as a delta [4]. The difference: blkfront also calls revalidate_disk() on the resized disk. Further parts of the thread discuss that revalidate_disk() is not present in linux-2.6.18-xen.hg. As Pasi pointed out [5] [6], RHEL-5 luckily does have revalidate_disk(). (See commits 39bf2c01 and 56d76e5a, and bug 444964.) The patch fixed the problem for Joost [7].

So it seems we can't rely purely on upstream for now, because they still need to port the RHEL-5 revalidate_disk() to their tree (as of 1080:c896d26c6b7c). I will add the revalidate_disk() call to blkfront and retest.

[1] http://lists.xensource.com/archives/html/xen-devel/2010-03/msg00467.html
[2] http://lists.xensource.com/archives/html/xen-devel/2010-03/msg00930.html
[3] http://lists.xensource.com/archives/html/xen-devel/2010-03/msg01047.html
[4] http://lists.xensource.com/archives/html/xen-devel/2010-03/msg01049.html
[5] http://lists.xensource.com/archives/html/xen-devel/2010-03/msg01159.html
[6] http://lists.xensource.com/archives/html/xen-devel/2010-03/msg01201.html
[7] http://lists.xensource.com/archives/html/xen-devel/2010-04/msg00097.html

(In reply to comment #8)
> Joost Roeleveld reported the same problem as described above [2]. He also
> faced the problem on the second resizing attempt.

Lack of the revalidate_disk() call probably leaves the system in such a state that further resize attempts won't work.

Created attachment 498371 [details]
revalidate virtual disk in blkfront after size change

Taken from http://lists.xensource.com/archives/html/xen-devel/2010-03/msg01049.html

Brew build: https://brewweb.devel.redhat.com/taskinfo?taskID=3317051

The 2nd resize attempt hung with the same symptoms as in comment 8. I notice that if vbd_resize() reaches the call to xenbus_transaction_end(), then for any return value except -EAGAIN it will call xenbus_transaction_end() twice in a row.
If the first call was successful (the commit succeeded), then this is definitely wrong. xenbus_transaction_end() calls

    up_read(&xs_state.suspend_mutex);

unconditionally, which is the other half of the

    down_read(&xs_state.suspend_mutex);

called by xenbus_transaction_start(). vbd_resize() unbalances this rwsem by calling xenbus_transaction_end() twice in a row, and that may be why blkback hangs in xenbus_transaction_start() / __down_read() on the second try. I'll try to fix this, and if it works, we'll have to submit it to upstream.

Created attachment 498486 [details]
fix xenbus xact start deadlock by removing double xact end
With this patch everything worked fine, through multiple increases and decreases. Host & guest: 2.6.18-260.el5.vbd_resize_bz618317_4.local.xen. (Brew build completed in the meantime: https://brewweb.devel.redhat.com/taskinfo?taskID=3318970)

host dmesg:

    VBD Resize: new size 131072
    VBD Resize: new size 196608
    VBD Resize: new size 262144
    VBD Resize: new size 196608
    VBD Resize: new size 131072
    VBD Resize: new size 65536

guest fdisk:

    255 heads, 63 sectors/track, 4 cylinders, total 65536 sectors
    255 heads, 63 sectors/track, 8 cylinders, total 131072 sectors
    255 heads, 63 sectors/track, 12 cylinders, total 196608 sectors
    255 heads, 63 sectors/track, 16 cylinders, total 262144 sectors
    255 heads, 63 sectors/track, 12 cylinders, total 196608 sectors
    255 heads, 63 sectors/track, 8 cylinders, total 131072 sectors
    255 heads, 63 sectors/track, 4 cylinders, total 65536 sectors

guest dmesg:

    Setting capacity to 131072
    Setting capacity to 196608
    Setting capacity to 262144
    end_request: I/O error, dev xvdb, sector 262136
    Buffer I/O error on device xvdb, logical block 32767
    Setting capacity to 196608
    end_request: I/O error, dev xvdb, sector 262136
    Buffer I/O error on device xvdb, logical block 32767
    end_request: I/O error, dev xvdb, sector 196600
    Buffer I/O error on device xvdb, logical block 24575
    Setting capacity to 131072
    end_request: I/O error, dev xvdb, sector 196600
    Buffer I/O error on device xvdb, logical block 24575
    end_request: I/O error, dev xvdb, sector 131064
    Buffer I/O error on device xvdb, logical block 16383
    Setting capacity to 65536
    end_request: I/O error, dev xvdb, sector 131064
    Buffer I/O error on device xvdb, logical block 16383

The guest dmesg errors are a consequence of the decrease steps. Upstream knows about them and they seem to be harmless: http://lists.xensource.com/archives/html/xen-devel/2010-03/msg00752.html

Mailed the patch in attachment 498486 [details] to upstream: http://lists.xensource.com/archives/html/xen-devel/2011-05/msg00670.html

Great, thanks!
Could you please add more debugging information to the log entries, i.e. which VBD was resized (on the host), and for which VM? And in the guest: "Setting capacity to 131072" -- what unit is that? Sectors? And which block device (xvdX) was it? I *think* there was an earlier patch that adds more logging; see the xen-devel mailing list archives. I think it was sent by Ky Srinivasan.

(In reply to comment #14)
> Could you please add more debugging information to the log entries.. ie. which
> VBD was resized (on the host), for which VM?
>
> And in the guest: "Setting capacity to 131072" - what unit is that? sectors?
> And which block device (xvdX) it was?
>
> I *think* earlier there was a patch that adds more logging.. see xen-devel
> mailinglist archives. I think it was sent by Ky Srinivasan.

http://lists.xensource.com/archives/html/xen-devel/2010-07/msg01588.html

Created attachment 498564 [details]
vbd resizing + revalidate vbd + xenbus xact deadlock removed
Created attachment 498565 [details]
more informative dmesg entries after resizing
Both host & guest: 2.6.18-260.el5.vbd_resize_bz618317_5.local.xen

host dmesg:

    VBD Resize: Domid: 2, Device: (253, 4), New Size: 131072 sectors
    VBD Resize: Domid: 2, Device: (253, 4), New Size: 196608 sectors
    VBD Resize: Domid: 2, Device: (253, 4), New Size: 262144 sectors
    VBD Resize: Domid: 2, Device: (253, 4), New Size: 196608 sectors
    VBD Resize: Domid: 2, Device: (253, 4), New Size: 131072 sectors
    VBD Resize: Domid: 2, Device: (253, 4), New Size: 65536 sectors

    [root@lacos-workstation ~]# ls -l /dev/mapper/VolGroup0-bz618317
    brw-rw---- 1 root disk 253, 4 May 12 16:42 /dev/mapper/VolGroup0-bz618317

guest fdisk:

    255 heads, 63 sectors/track, 8 cylinders, total 131072 sectors
    255 heads, 63 sectors/track, 12 cylinders, total 196608 sectors
    255 heads, 63 sectors/track, 16 cylinders, total 262144 sectors
    255 heads, 63 sectors/track, 12 cylinders, total 196608 sectors
    255 heads, 63 sectors/track, 8 cylinders, total 131072 sectors
    255 heads, 63 sectors/track, 4 cylinders, total 65536 sectors

guest dmesg:

    Changing capacity of (202, 16) to 131072 sectors
    Changing capacity of (202, 16) to 196608 sectors
    Changing capacity of (202, 16) to 262144 sectors
    end_request: I/O error, dev xvdb, sector 262136
    Buffer I/O error on device xvdb, logical block 32767
    Changing capacity of (202, 16) to 196608 sectors
    end_request: I/O error, dev xvdb, sector 262136
    Buffer I/O error on device xvdb, logical block 32767
    end_request: I/O error, dev xvdb, sector 196600
    Buffer I/O error on device xvdb, logical block 24575
    Changing capacity of (202, 16) to 131072 sectors
    end_request: I/O error, dev xvdb, sector 196600
    Buffer I/O error on device xvdb, logical block 24575
    end_request: I/O error, dev xvdb, sector 131064
    Buffer I/O error on device xvdb, logical block 16383
    Changing capacity of (202, 16) to 65536 sectors
    end_request: I/O error, dev xvdb, sector 131064
    Buffer I/O error on device xvdb, logical block 16383

without errors:

    Changing capacity of (202, 16) to 131072 sectors
    Changing capacity of (202, 16) to 196608 sectors
    Changing capacity of (202, 16) to 262144 sectors
    Changing capacity of (202, 16) to 196608 sectors
    Changing capacity of (202, 16) to 131072 sectors
    Changing capacity of (202, 16) to 65536 sectors

    [root@localhost ~]# ls -l /dev/xvdb
    brw-r----- 1 root disk 202, 16 May 12 10:53 /dev/xvdb

Created attachment 500795 [details]
online dynamic resize of guest virtual disks (4)

blkback: don't call vbd_size() if bd_disk is NULL
http://lists.xensource.com/archives/html/xen-devel/2011-05/msg01710.html

Laszlo: Are you planning to add the online resize patches to the rhel6 kernel as well? It's tracked here: https://bugzilla.redhat.com/show_bug.cgi?id=654982 .

(In reply to comment #24)
> Laszlo: Are you planning to add the online resize patches also to the rhel6
> kernel?

Yes, at some point one of us will do it.

Patch(es) available in kernel-2.6.18-284.el5. You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5 . Detailed testing feedback is always welcomed.
Verified with:

Host: kernel-xen-2.6.18-300.el5
Guests: RHEL5.7 PV guests with kernel-xen-2.6.18-300.el5 (both i386 and x86_64)

Test steps (following comment 2), in the host (PE Size = 4096K):

    # xm block-attach rhel5x32pv phy:/dev/VolGroup0/testlv xvdb w
    # lvresize --extents=+1 /dev/VolGroup0/testlv
    # lvresize --extents=+1 /dev/VolGroup0/testlv
    # lvresize --extents=+1 /dev/VolGroup0/testlv
    # lvresize --extents=+1024 /dev/VolGroup0/testlv
    # lvresize --extents=-512 /dev/VolGroup0/testlv

host dmesg:

    VBD Resize: Domid: 4, Device: (253, 0), New Size: 16384 sectors
    VBD Resize: Domid: 4, Device: (253, 0), New Size: 24576 sectors
    VBD Resize: Domid: 4, Device: (253, 0), New Size: 32768 sectors
    VBD Resize: Domid: 4, Device: (253, 0), New Size: 8421376 sectors
    VBD Resize: Domid: 4, Device: (253, 0), New Size: 4227072 sectors

in the guest:

    $ fdisk -ul /dev/xvdb
    255 heads, 63 sectors/track, 0 cylinders, total 8192 sectors
    255 heads, 63 sectors/track, 1 cylinders, total 16384 sectors
    255 heads, 63 sectors/track, 1 cylinders, total 24576 sectors
    255 heads, 63 sectors/track, 2 cylinders, total 32768 sectors
    255 heads, 63 sectors/track, 524 cylinders, total 8421376 sectors
    255 heads, 63 sectors/track, 263 cylinders, total 4227072 sectors

    # dmesg
    Changing capacity of (202, 16) to 16384 sectors
    Changing capacity of (202, 16) to 24576 sectors
    Changing capacity of (202, 16) to 32768 sectors
    Changing capacity of (202, 16) to 8421376 sectors
    end_request: I/O error, dev xvdb, sector 8421368
    Buffer I/O error on device xvdb, logical block 1052671
    Changing capacity of (202, 16) to 4227072 sectors
    end_request: I/O error, dev xvdb, sector 8421368
    Buffer I/O error on device xvdb, logical block 1052671

The I/O error messages printed while decreasing the LV are harmless according to comment 13.

Is this supposed to work with the boot VBD /dev/xvda?
    [root@guest ~]# dmesg | tail -1
    Changing capacity of (202, 0) to 33554432 sectors
    [root@guest ~]# fdisk -ul /dev/xvda | grep sectors$
    255 heads, 63 sectors/track, 1044 cylinders, total 16777216 sectors
    [root@guest ~]# /sbin/blockdev --rereadpt /dev/xvda
    BLKRRPART: Device or resource busy

(In reply to comment #31)
> Is this supposed to work with the boot VBD /dev/xvda?
>
> [root@guest ~]# dmesg | tail -1
> Changing capacity of (202, 0) to 33554432 sectors
> [root@guest ~]# fdisk -ul /dev/xvda | grep sectors$
> 255 heads, 63 sectors/track, 1044 cylinders, total 16777216 sectors
> [root@guest ~]# /sbin/blockdev --rereadpt /dev/xvda
> BLKRRPART: Device or resource busy

I have no idea. Can you trace why the ioctl returns -EBUSY?

(In reply to comment #32)
> (In reply to comment #31)
> > Is this supposed to work with the boot VBD /dev/xvda?
> I have no idea.

By that I mean I can't recall any argument for or against such support in the upstream discussion. Looking at blkdev_reread_part(), it can immediately return -EBUSY:

    if (!mutex_trylock(&bdev->bd_mutex))
        return -EBUSY;

Can you reread the partition table (inside the guest) without resizing the LV first in the host? The patch in attachment 500795 [details] adds a call to revalidate_disk().
connect() [drivers/xen/blkfront/blkfront.c]
-> revalidate_disk() [fs/block_dev.c]
-> no blkfront specific revalidation, see "xlvbd_block_fops" in
"drivers/xen/blkfront/vbd.c"
-> mutex_lock(&bdev->bd_mutex)
-> check_disk_size_change(disk, bdev)
-> mutex_unlock(&bdev->bd_mutex)
That's the same mutex (but most probably not the only other locking site).
I recall from my testing that you have to read a file or do something similar on the vbd first to have the guest kernel notice the size change. Can you please try that?
1. LV resizing in the host,
2. in the guest, cat /etc/motd
3. reread the partition table
Thanks.
> Can you reread the partition table (inside the guest) without resizing the LV
> first in the host?
It still returns -EBUSY indeed.
I think in Julio's case we are not even getting that far (although I did just check, and BLKPG ioctls also fail to modify non-busy partitions):

    # partx -a /dev/xvda
    BLKPG: Device or resource busy
    error adding partition 1
    BLKPG: Device or resource busy
    error adding partition 2
    BLKPG: Device or resource busy
    error adding partition 3
    BLKPG: Device or resource busy
    error adding partition 4

Does xvda do any of its own partition logic? It seems odd to have xvda3 and xvda4 (not currently defined in the MBR) as start/size 0 devices that return ENXIO on access and don't permit changes.

I took a quick look at the fdisk side, and it seems that BLKGETSIZE/BLKGETSIZE64 are still returning the old size value even though proc & sys report the new size:

    # grep xvda /proc/partitions
    202 0 16777216 xvda
    202 1 104391 xvda1
    202 2 8281507 xvda2
    # cat /sys/block/xvda/size
    33554432

[ both report the correct size, 17179869184 bytes ]

    # blockdev --getsz /dev/xvda
    16777216
    # fdisk -l /dev/xvda
    Disk /dev/xvda: 8589 MB, 8589934592 bytes
    255 heads, 63 sectors/track, 1044 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Device Boot Start End Blocks Id System
    /dev/xvda1 * 1 13 104391 83 Linux
    /dev/xvda2 14 1044 8281507+ 8e Linux LVM

[ both report the wrong (old) size as returned by BLKGETSIZE/BLKGETSIZE64 ]

dmesg also reports the resize:

    Changing capacity of (202, 0) to 33554432 sectors

(In reply to comment #38)
> I think in Julio's case we are not even getting that far (although I did just
> check and BLKPG ioctls to also fail to modify non-busy partitions:
>
> # partx -a /dev/xvda
> BLKPG: Device or resource busy
> error adding partition 1
> BLKPG: Device or resource busy
> error adding partition 2
> BLKPG: Device or resource busy
> error adding partition 3
> BLKPG: Device or resource busy
> error adding partition 4

In a -274 guest on a -303 host, the same is printed right after booting into the guest (no resizing whatsoever, so whether the host side has the host-side patch should not matter).
There's a difference in the output between the first and any further attempts. First attempt:

    [root@dhcp-1-154 ~]# partx -a /dev/xvda
    BLKPG: Device or resource busy
    error adding partition 1
    BLKPG: Device or resource busy
    error adding partition 2

Second and further attempts:

    [root@dhcp-1-154 ~]# partx -a /dev/xvda
    BLKPG: Device or resource busy
    error adding partition 1
    BLKPG: Device or resource busy
    error adding partition 2
    BLKPG: Device or resource busy
    error adding partition 3
    BLKPG: Device or resource busy
    error adding partition 4

However, /dev/xvdb works with both "partx -a" and "blockdev --rereadpt".

> Does xvda do any of its own partition logic?

What do you mean? Blkfront has the following fops [drivers/xen/blkfront/vbd.c]:

    static struct block_device_operations xlvbd_block_fops =
    {
        .owner = THIS_MODULE,
        .open = blkif_open,
        .release = blkif_release,
        .ioctl = blkif_ioctl,
        .getgeo = blkif_getgeo
    };

In the guest, the xvda block device should be partitioned like any other block device, I think. Boot disk partitions cannot be reread on a physical machine & physical disk either.

> Seems odd to have xvda3 and xvda4 (not currently defined in MBR) as
> start/size 0 devices that return ENXIO on access and don't permit changes.)

It may be odd, but they were not introduced by the patches for this bug. Please always compare results with RHEL-5.7 guests.

> I took a quick look at the fdisk side and it seems that
> BLKGETSIZE/BLKGETSIZE64 are still returning the old size value even though
> proc & sys report the new size:

I think it's due to the same mutex being held. The BLKGETSIZE64 ioctl is served by

    i_size_read(bdev->bd_inode)

That quantity is set by:

    connect() [drivers/xen/blkfront/blkfront.c]
    -> revalidate_disk() [fs/block_dev.c]
       -> bdget_disk()
       -> mutex_lock(&bdev->bd_mutex)
       -> check_disk_size_change(disk, bdev)
          -> i_size_write(bdev->bd_inode, disk_size)
       -> mutex_unlock(&bdev->bd_mutex)

If bdget_disk() fails (or the mutex is held), the new value is not written.
The latter case should also cause a thread to hang -- therefore I assume it's bdget_disk() that fails for the boot disk. But we should summon block layer experts for that. BTW, check_disk_size_change() prints "detected capacity change" at KERN_INFO level, and the above logs don't contain that. Please repeat the same resizing test (and compare the dmesgs) under a -304 PV guest and a -304 host, between xvda and xvdb.

check_disk_size_change() compares the gendisk size with the blockdev size, and if they differ, the blockdev size is adapted to the gendisk size. This happens under the blockdev mutex. The vbd resizing happens on the gendisk side first -- see the set_capacity() call in attachment 500795 [details] -- then revalidate_disk() triggers check_disk_size_change(), if bdget_disk() succeeds and the mutex can be acquired.

The "Changing capacity of (202, 0) to 33554432 sectors" message corresponds to the gendisk-level change, printed by blkfront (see the patch). The missing "... detected capacity change from ..." message would be printed by the blockdev layer, and it's not (for xvda).

So, it's not supposed to work with the boot vbd, for the same reason the partition table of a physical boot disk can't be re-read without rebooting. In the guest, the boot gendisk is in fact resized, but the kernel prevents the blockdev from following suit. I'd greatly appreciate it if you could run this by a block layer expert.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-0150.html