Bug 1150015 - VM abnormal stop after LV refreshing when using thin provisioning on block storage
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.4.0
Hardware: Unspecified
OS: Unspecified
Importance: high urgent
Target Milestone: ---
Target Release: 3.4.3-1
Assignee: Nir Soffer
QA Contact: Aharon Canan
URL:
Whiteboard: storage
Duplicates: 1150012 (view as bug list)
Depends On: 1149705
Blocks: 1155566 1156075
 
Reported: 2014-10-07 09:01 UTC by rhev-integ
Modified: 2016-02-10 17:25 UTC (History)
CC: 30 users

Fixed In Version: vdsm-4.14.17-2.el6ev
Doc Type: Bug Fix
Doc Text:
Cause: When using thin provisioning on block storage, RHEVM creates a 1GiB LV. When the disk fills up to a certain threshold, RHEVM attempts to extend the LV. Extending an LV triggers a udev change event and vdsm's udev rule is evaluated, setting the permissions of the LV. In recent versions of systemd (RHEL 7, Fedora), udev changed its behavior, removing the selinux label from devices when setting device permissions (see bug 1147910). This caused the LV to lose the selinux label assigned by libvirt, which caused the VM to lose access to the LV and pause. When the VM is restarted, libvirt assigns the selinux label to the VM again.
Consequence: After a thin provisioned disk on block storage is extended automatically, the VM pauses, and you cannot resume it. The only way to resume it is to shut it down and start it up again.
Fix: The vdsm udev rules were modified so that vdsm images do not use OWNER and GROUP for setting device permissions. Instead, the chown command is run to set device permissions, so udev does not modify the device's selinux label.
Result: VMs with thinly provisioned disks based on block storage no longer pause when an extension is required, and operate properly on RHEL 7 hosts.
Clone Of: 1149705
Environment:
Last Closed: 2014-11-12 02:29:26 UTC
oVirt Team: Storage
Target Upstream Version:


Attachments
/var/log/ from the host and engine.log (8.22 MB, application/x-gzip)
2014-11-04 09:42 UTC, Elad
vdsm logs (part 2) (12.98 MB, application/x-gzip)
2014-11-04 14:09 UTC, Elad
vdsm logs (part 1) (10.28 MB, application/x-gzip)
2014-11-04 14:29 UTC, Elad
vdsm logs (part 1-1) (9.45 MB, application/x-gzip)
2014-11-04 14:51 UTC, Elad


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2014:1844 0 normal SHIPPED_LIVE vdsm 3.4.3-1 bug fix async release 2014-11-12 07:28:40 UTC
oVirt gerrit 33492 0 None None None Never
oVirt gerrit 33555 0 None None None Never
oVirt gerrit 33620 0 None None None Never
oVirt gerrit 33627 0 None None None Never
oVirt gerrit 33628 0 None None None Never
oVirt gerrit 33632 0 None None None Never

Comment 1 Tal Nisan 2014-10-07 12:29:47 UTC
*** Bug 1150012 has been marked as a duplicate of this bug. ***

Comment 3 Eyal Edri 2014-10-28 14:54:47 UTC
still missing the patch for 3.4.3-1

Comment 5 Elad 2014-11-04 09:31:31 UTC
I tested the scenario using a thin disk created on an FC storage domain and installed an OS on the guest to simulate an extension of the LV.
During the OS installation, the VM stopped:

vdsm.log:

libvirtEventLoop::INFO::2014-11-04 09:08:04,563::vm::4602::vm.Vm::(_onIOError) vmId=`bf87e50e-b931-4504-b5c8-4d704369da34`::abnormal vm stop device virtio-disk0 error enospc

engine.log:

2014-11-04 10:06:53,133 INFO  [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-89) [6c63f4ff] VM vm_fc_01 bf87e50e-b931-4504-b5c8-4d704369da34 moved from Up --> Paused
2014-11-04 10:06:53,251 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-89) [6c63f4ff] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM vm_fc_01 has paused due to no Storage space error.



libvirt.log:

2014-11-04 09:08:04.530+0000: 107968: debug : qemuProcessHandleIOError:938 : Transitioned guest vm_fc_01 to paused state due to IO error


The VM is unpaused immediately and OS installation is resumed.

I'm moving the bug to ASSIGNED since the VM still gets paused on storage space error.



Checked on:
vdsm-4.14.17-1.pkvm2_1.1.ppc64
libvirt-1.1.3-1.pkvm2_1.17.11.ppc64
qemu-kvm-1.6.0-2.pkvm2_1.17.10.ppc64



Attaching:
/var/log directory from host and engine.log

Comment 6 Elad 2014-11-04 09:42:47 UTC
Created attachment 953508 [details]
/var/log/ from the host and engine.log

Comment 7 Michal Skrivanek 2014-11-04 11:08:36 UTC
what do you expect the VM to do while the storage is being extended/allocated?

Comment 8 Elad 2014-11-04 12:02:32 UTC
(In reply to Michal Skrivanek from comment #7)
> what do you expect the VM to do while the storage is being
> extended/allocated?

The VM shouldn't get paused; the volume extend operation should occur before the disk runs out of space.

Comment 9 Michal Skrivanek 2014-11-04 12:23:12 UTC
I can't find any regular extension request in vdsm.log. Seems you were either writing too quickly or the highWrite monitoring doesn't work.
Please verify settings and behavior around the threshold of extension... before you reach disk full. Comment #5 just shows that once you reach ENOSPC the drives get extended and it continues OK.

Comment 10 Elad 2014-11-04 12:30:38 UTC
(In reply to Michal Skrivanek from comment #9)
> I can't find any regular extension request in vdsm.log. Seems you were
> either writing too quickly or the highWrite monitoring doesn't work.
> Please verify settings and behavior around the threshold of
> extension...before you reach disk full. Comment #5 just shows once you reach
> ENOSPC the drives get extended and it continues ok

Just for clarification - I'm not extending the volume manually. I've created a thin provisioned disk on an FC domain and installed an OS on it. I expect vdsm to perform the lvextend operation automatically when necessary, extending the volume when it reaches the defined threshold for extension.
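For context, the automatic extension being discussed is a watermark check of roughly this shape. This is an illustrative sketch only, not vdsm's actual code; the function names, the 512MiB watermark, and the 1GiB chunk size are assumptions made up for the example:

```python
# Illustrative sketch of watermark-based thin LV extension logic.
# Names and thresholds are hypothetical, not vdsm's real values.

GiB = 1024 ** 3
MiB = 1024 ** 2

def needs_extension(allocated, highest_write_offset,
                    free_space_threshold=512 * MiB):
    """Request an extension when the guest's highest written offset
    gets within the free-space threshold of the allocated size."""
    return allocated - highest_write_offset < free_space_threshold

def next_size(allocated, chunk=1 * GiB):
    # Extend in fixed chunks, mirroring the 1GiB initial allocation.
    return allocated + chunk

# A guest that has written 700MiB into a 1GiB LV has less than 512MiB
# of allocated space left, so an extension should be requested.
assert needs_extension(1 * GiB, 700 * MiB)
assert next_size(1 * GiB) == 2 * GiB
```

If the guest writes faster than the monitoring interval can react, the LV can still hit ENOSPC before the extension completes, which is the transient pause described above.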

Comment 11 Michal Skrivanek 2014-11-04 12:33:30 UTC
(In reply to Elad from comment #10)
> Just for clarification - I'm not extending the volume manually. I've created
> a thin provisioned disk on an FC domain and installed an OS on it. I expect
> vdsm to perform the lvextend operation automatically when necessary,
> extending the volume when it reaches the defined threshold for extension.

I'm not saying you are. I'm saying you should verify your threshold and monitoring interval settings and make sure you're not writing at a higher rate than that. If there is an issue in the code, with the highWrite function, then please attach vdsm.log since vdsm startup. I do see some related issues with that from ~5 days ago. Since then vdsm was restarted so it may not be connected, but still... more logs always help. But please check what I said in comment #5 first.

Comment 12 Elad 2014-11-04 14:09:33 UTC
Created attachment 953617 [details]
vdsm logs (part 2)

Comment 13 Nir Soffer 2014-11-04 14:27:53 UTC
The fact that the VM was unpaused automatically proves that this bug is fixed.

This fix handles the case where a VM is paused after the disk lost its selinux label and the VM cannot access it. In this state, not only will the VM never unpause, it cannot be resumed manually. The only way to use such a VM is to shut it down and start it again.

What you describe here is an unrelated issue: the VM getting paused for a short time during heavy I/O usage. Please open another bug for this issue.

Note that we cannot guarantee that a VM will never pause during heavy I/O workload. We only guarantee that the VM will be unpaused in this case after the disk was extended.

Comment 14 Elad 2014-11-04 14:29:12 UTC
Created attachment 953630 [details]
vdsm logs (part 1)

Comment 15 Elad 2014-11-04 14:51:58 UTC
Created attachment 953645 [details]
vdsm logs (part 1-1)

Comment 16 Elad 2014-11-04 15:11:43 UTC
(In reply to Nir Soffer from comment #13)
> The fact that the VM was unpaused automatically proves that this bug is fixed.
> 
> This fix handles the case where a VM is paused after the disk lost its
> selinux label and the VM cannot access it. In this state, not only will the
> VM never unpause, it cannot be resumed manually. The only way to use such a
> VM is to shut it down and start it again.
> 
> What you describe here is an unrelated issue: the VM getting paused for a
> short time during heavy I/O usage. Please open another bug for this issue.
> 
> Note that we cannot guarantee that a VM will never pause during heavy I/O
> workload. We only guarantee that the VM will be unpaused in this case after
> the disk was extended.

Since the described behavior is the expected one, I'm moving the bug to VERIFIED (details in comment #5).

Comment 18 errata-xmlrpc 2014-11-12 02:29:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2014-1844.html

Comment 19 Sven Kieske 2014-11-12 08:35:53 UTC
Neither this BZ nor the errata described at:
https://rhn.redhat.com/errata/RHBA-2014-1844.html
contains a clear description of what the actual bug is and what this fix does.

Could this information be provided somehow?
Thanks in advance

Comment 20 Nir Soffer 2014-12-21 13:58:40 UTC
(In reply to Sven Kieske from comment #19)
> There is neither in this BZ nor in the errata described at:
> https://rhn.redhat.com/errata/RHBA-2014-1844.html
> a clear description what the actual bug is, and what this fix does.
> 
> Could this information get provided somehow?

The bug:
After a thin provisioned disk on block storage is extended automatically, the VM pauses and you cannot resume it. The only way to resume is to shut down the VM and start it again.

The root cause:
When using thin provisioning on block storage, oVirt creates a 1GiB LV. When the disk becomes too full, oVirt extends the LV. Extending an LV triggers a udev change event and vdsm's udev rule is evaluated, setting the permissions of the LV. In recent versions of systemd (el7, Fedora), udev changed its behavior, removing the selinux label from devices when setting device permissions (bug 1147910). This causes the LV to lose the selinux label assigned by libvirt, which causes the VM to lose access to the LV and pause. When the VM is restarted, libvirt assigns the selinux label to the VM again.

The fix:
The vdsm udev rules were modified so that vdsm images do not use OWNER and GROUP for setting device permissions. Instead we run the chown command to set device permissions, so udev does not modify the device's selinux label.
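The shape of that rule change can be sketched in udev rule syntax. This is illustrative only: the device match, user, and group below are hypothetical, not the exact rule shipped in vdsm:

```
# Before (hypothetical): OWNER/GROUP make udev itself (re)apply ownership
# on every change event; on newer systemd this also resets the selinux
# label that libvirt assigned to the device.
#   KERNEL=="dm-*", OWNER="vdsm", GROUP="qemu"

# After (hypothetical): run chown from the rule instead, so udev never
# touches the device's selinux label.
KERNEL=="dm-*", RUN+="/bin/chown vdsm:qemu $env{DEVNAME}"
```

The trade-off is that RUN spawns an external process per event, but for LV extension events that cost is negligible and it avoids udev's permission-setting path entirely.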

Comment 21 Allon Mureinik 2014-12-21 14:08:25 UTC
I've added the above explanation (with some minor spelling and grammar fixes) to the doc-text field.
I think it's too late to be added in to the errata, but at least it will appear at the standard location of the bug.

