Thanks. I had plenty of space available in the VG and could have created or resized the pool to a bigger size, but the point is that even though I started with a small size (128M) and a virtual size of 12G, a system hang is not cool.
Here's how I got to this point. We have an in-house GUI that talks to libvirt and creates guests. When creating volumes we have been using the capacity XML tag to pass the size to libvirt, with the allocation XML tag set to 0 because the LV pool in libvirt did not support sparse volumes. RHEL 6.4, however, adds sparse LV support, so by using an allocation of 0 we now tell libvirt to create a sparse volume. This is obviously something we will fix in our code, but I went ahead and started to test it.
Default snapshot-based sparse LVs are useless. Once they are filled to 100%, the snapshot is invalidated by design and all data is gone. Since we create the snapshot at 32M (1 extent) in size, we fill it up very quickly because dmeventd is not able to keep up with the autoextend. That led me to thin volumes, where we see the same behaviour (dmeventd not able to resize fast enough), and once we fill the thin volume the system hangs.
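To make the capacity/allocation distinction concrete, here is a minimal sketch of the kind of volume XML involved. It is not our GUI's actual code: the helper name build_volume_xml and the volume name guest01 are hypothetical, and the sizes are given in bytes (libvirt's default when no unit attribute is present).

```python
import xml.etree.ElementTree as ET

GIB = 1024 ** 3


def build_volume_xml(name, capacity_bytes, allocation_bytes):
    """Return libvirt storage-volume XML with explicit capacity and allocation.

    Hypothetical helper for illustration; sizes are in bytes.
    """
    volume = ET.Element("volume")
    ET.SubElement(volume, "name").text = name
    ET.SubElement(volume, "capacity").text = str(capacity_bytes)
    ET.SubElement(volume, "allocation").text = str(allocation_bytes)
    return ET.tostring(volume, encoding="unicode")


# allocation = 0: what we used to send; with sparse LV support this now
# requests a sparse volume.
sparse_xml = build_volume_xml("guest01", 12 * GIB, 0)

# allocation = capacity: asks for the full size to be allocated up front.
full_xml = build_volume_xml("guest01", 12 * GIB, 12 * GIB)

print(sparse_xml)
print(full_xml)
```

Setting allocation equal to capacity is the likely fix on our side if we want fully allocated volumes again.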
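A rough back-of-the-envelope sketch of why autoextend struggles to keep up when starting this small. The 128M and 12G figures are from my setup; the 20% growth per round is an assumed value for illustration, and the model ignores extent rounding and the time each resize takes.

```python
def autoextend_steps(start_mib, target_mib, percent):
    """Count the autoextend rounds needed to grow start_mib to at least target_mib.

    Each round grows the pool by `percent` of its current size. This is a
    simplified model, not a description of what dmeventd does internally.
    """
    if percent <= 0:
        raise ValueError("percent must be positive")
    size = float(start_mib)
    steps = 0
    while size < target_mib:
        size += size * percent / 100.0
        steps += 1
    return steps


# 128 MiB pool growing towards the 12 GiB virtual size, assumed 20% per round.
pool_steps = autoextend_steps(128, 12 * 1024, 20)
print(pool_steps)
```

Under these assumptions the pool needs 26 separate resize rounds to reach the virtual size, and a fast writer only has to outrun one of them to fill the pool.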
Seems you need to be more conservative with the amount of space you add to the thin-pool. I understand your concern but you're not using the thin-pool device as designed. What exactly are you saying you want to happen?
I understand that "system hang is not cool" but it isn't a system hang; it is a hang of the IO being issued to the thin pool. Now if this leads to a system-wide deadlock due to interdependent writeback needed to free memory in the VM (as in memory mgmt VM ;) then yes, that certainly isn't cool.
Again, what would be the ideal response you're looking for? Do you just want the thin-pool's metadata to transition to read-only mode? This means writes will fail with -EIO.
Ideally I would like to see behavior similar to (or the same as) regular LVs. When we're running out of space and the currently allocated size is <= the virtual size, I would expect delayed IO or EBUSY until the pool is enlarged. When we hit the virtual size boundary, I would expect an ENOSPC error while keeping the LV mounted RW so that one can clean up when necessary.
Agreed with Milos. If we encounter a situation where we inadvertently run out of space, I can see "hanging" being an answer until the pool is expanded; however, most folks would expect an ENOSPC error to kick back. Indeed it's hard to unwedge the situation since the LVM tools also lock up when they stat the thin volume that's hung. Once you run a pool out of space - you are SOL. That's not Enterprise Linux, sorry :-)
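To spell out the behaviour I am asking for, here is a small model of it. This is only a sketch of the request, not of what dm-thin does today. The function name and parameters are hypothetical, and it assumes for simplicity that every write needs freshly allocated pool space.

```python
import errno


def requested_write_outcome(offset, length, virtual_size, pool_free):
    """Return 0 on success or the errno value requested for this write.

    Hypothetical model: all sizes are in bytes and every write is assumed
    to need `length` bytes of new space from the pool.
    """
    if offset + length > virtual_size:
        # Past the virtual size boundary: a hard limit, report ENOSPC.
        return errno.ENOSPC
    if length > pool_free:
        # Pool is exhausted but still below the virtual size: delay the IO
        # or return EBUSY until the pool is enlarged.
        return errno.EBUSY
    return 0


GIB = 1024 ** 3
MIB = 1024 ** 2

# Pool full, volume still within its 12G virtual size.
busy = requested_write_outcome(1 * GIB, 4 * MIB, 12 * GIB, 0)
# Write that would cross the 12G virtual size boundary.
full = requested_write_outcome(12 * GIB - MIB, 4 * MIB, 12 * GIB, 64 * MIB)
print(errno.errorcode[busy], errno.errorcode[full])
```

In both failure cases the volume would stay mounted RW, so data could still be cleaned up.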
I was excited about thin volumes until I discovered this. Pretty much unusable until this is fixed.
For example here is an strace of the lvs command:
ioctl(3, DM_TABLE_STATUS, 0x1813fa0) = 0
stat("/dev/vg-local-test01/thin", {st_mode=S_IFBLK|0660, st_rdev=makedev(253, 5), ...}) = 0
open("/dev/vg-local-test01/thin", O_RDONLY|O_DIRECT|O_NOATIME
[hang]
Hi,
I've noticed that this BZ was moved to needinfo. I'm not sure whether I should provide any additional information, but I've taken a look at the code and found an upstream commit (3e1a0699095803e53072699a4a1485af7744601d) that seems to enhance error handling in this particular case. I hope to find some time to test this in the coming days. I'm attaching the patch.