Bug 1113327 - Disk image ownership changed after failed VM start due to existing lock
Summary: Disk image ownership changed after failed VM start due to existing lock
Keywords:
Status: CLOSED DUPLICATE of bug 547546
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: pre-dev-freeze
Target Release: 7.1
Assignee: Michal Privoznik
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2014-06-26 00:52 UTC by Need Real Name
Modified: 2014-08-06 08:15 UTC
CC List: 8 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-08-05 14:06:33 UTC
Target Upstream Version:
Embargoed:



Description Need Real Name 2014-06-26 00:52:19 UTC
Description of problem:

After a VM fails to start because it cannot acquire the direct lock on an NFS disk image, the ownership of the image file is changed from qemu:qemu to root:root, so the existing lock holder loses access permission and begins experiencing I/O errors in the guest.

Version-Release number of selected component (if applicable):

libvirt-1.1.1-29.el7.x86_64 on a CentOS 7 public QA build.

How reproducible:

Always.

Steps to Reproduce:
1. enable virtlockd.socket service on two hosts (may need reboot)
2. configure lock_manager=lockd in /etc/libvirt/qemu.conf on both hosts
3. use a shared NFS space for the disk image files on both hosts (see the configuration sketch after these steps)
4. create VMs A and B on separate hosts, both using same disk image file
5. start VM A and confirm that it boots normally
6. attempt to start VM B
7. VM B fails to start due to existing lock on disk image file
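
A minimal sketch of the configuration in steps 1-3, assuming a stock libvirt/systemd install; the restart command and mount layout here are indicative, not an exact record of the setup:

# on both hosts (step 1): enable socket activation for the lock daemon
systemctl enable virtlockd.socket

# on both hosts (step 2): /etc/libvirt/qemu.conf
lock_manager = "lockd"

# on both hosts (step 3): back the image directory with the same NFS export,
# e.g. /var/lib/libvirt/images/teststore mounted from a shared server

# restart libvirtd so the qemu.conf change takes effect
systemctl restart libvirtd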

Actual results:

8. VM A experiences virtual disk failure due to lost permissions
9. stopping and restarting VM A restores permissions

Expected results:

8. VM A continues to run normally since it holds the lock

Additional info:

My datastore is configured as a "dir" type pool, but the actual /var/lib/libvirt/images/teststore directory is an autofs mount, which lets me control the NFS mount options to get NFSv4.1 in-band locks and callbacks plus a soft mount.
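
For reference, an autofs map entry along these lines gives that kind of mount; the server name, export path, and exact option spelling are illustrative rather than an exact copy of the real map:

# hypothetical autofs map entry (referenced from /etc/auto.master), mounting
# teststore under /var/lib/libvirt/images with NFSv4.1 and a soft mount:
teststore  -fstype=nfs4,vers=4.1,soft  nfs-server:/export/teststore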

Comment 1 Need Real Name 2014-06-26 04:39:30 UTC
I just did another test and determined that this can be reproduced on a single host using a regular local filesystem for image storage. Follow the steps above on a single host, using a local filesystem, and create two VMs that use the same image file as their virtual disk.

When the second one fails to start due to the lock, the first loses contact with its disk.  However, the lock remains held, and multiple attempts to start the second VM will continue to fail until the first VM is powered off and the lock released.
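
A minimal single-host sketch of the sequence (domain and image names here are placeholders, not the real ones):

virsh start vmA                            # boots normally, holds the lockd lease
ls -l /var/lib/libvirt/images/shared.img   # image is owned by qemu:qemu while vmA runs
virsh start vmB                            # refuses to start because the lease is held
ls -l /var/lib/libvirt/images/shared.img   # ownership has flipped to root:root,
                                           # and vmA begins seeing disk I/O errors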

Comment 2 Michal Privoznik 2014-06-26 09:42:47 UTC
When starting a new domain, libvirt executes the following operations (P = parent (libvirtd), C = child):

1P) prepare cmd line
2P) fork()
3C) lock_domain_disks()
4P) do the labeling
5P) Notify child
6C) exec(qemu, ...)


The problem is that disk locking and labeling are done in parallel, so if one of them fails, we may unfortunately have already done the other. Looking deeper into the code, this diff should fix the bug:

diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 5b598be..7cb0ae9 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -2736,6 +2736,12 @@ static int qemuProcessHook(void *data)
     if (virSecurityManagerClearSocketLabel(h->driver->securityManager, h->vm->def) < 0)
         goto cleanup;
 
+    VIR_DEBUG("Setting domain security labels");
+    if (virSecurityManagerSetAllLabel(driver->securityManager,
+                                      vm->def, stdin_path) < 0)
+        goto cleanup;
+    VIR_DEBUG("Labelling done");
+
     if (virNumaSetupMemoryPolicy(h->vm->def->numatune, h->nodemask) < 0)
         goto cleanup;
 
@@ -4092,17 +4098,6 @@ int qemuProcessStart(virConnectPtr conn,
         qemuProcessInitCpuAffinity(driver, vm, nodemask) < 0)
         goto cleanup;
 
-    VIR_DEBUG("Setting domain security labels");
-    if (virSecurityManagerSetAllLabel(driver->securityManager,
-                                      vm->def, stdin_path) < 0)
-        goto cleanup;
-
-    /* Security manager labeled all devices, therefore
-     * if any operation from now on fails and we goto cleanup,
-     * where virSecurityManagerRestoreAllLabel() is called
-     * (hidden under qemuProcessStop) we need to restore labels. */
-    stop_flags &= ~VIR_QEMU_PROCESS_STOP_NO_RELABEL;
-
     if (stdin_fd != -1) {
         /* if there's an fd to migrate from, and it's a pipe, put the
          * proper security label on it
@@ -4121,12 +4116,17 @@ int qemuProcessStart(virConnectPtr conn,
             goto cleanup;
     }
 
-    VIR_DEBUG("Labelling done, completing handshake to child");
     if (virCommandHandshakeNotify(cmd) < 0) {
         goto cleanup;
     }
     VIR_DEBUG("Handshake complete, child running");
 
+    /* Security manager labeled all devices, therefore
+     * if any operation from now on fails and we goto cleanup,
+     * where virSecurityManagerRestoreAllLabel() is called
+     * (hidden under qemuProcessStop) we need to restore labels. */
+    stop_flags &= ~VIR_QEMU_PROCESS_STOP_NO_RELABEL;
+
     if (migrateFrom)
         flags |= VIR_QEMU_PROCESS_START_PAUSED;

Comment 4 Martin Kletzander 2014-06-26 11:22:37 UTC
I had missed your comment and already had a patch prepared; the proposal is here:

https://www.redhat.com/archives/libvir-list/2014-June/msg01352.html

Comment 6 Jiri Denemark 2014-08-05 14:06:33 UTC

*** This bug has been marked as a duplicate of bug 547546 ***

Comment 7 Need Real Name 2014-08-05 18:36:40 UTC
How is this a duplicate of 547546?

The bug isn't that permissions are not restored, as described in 547546, but that they should never have been adjusted in the first place when locks had not yet been acquired.  Anything other than strict ordering of lock acquisition prior to any modification to disk image state or permissions will inherently have race conditions and endanger the existing domain that already holds valid locks...

Comment 8 Michal Privoznik 2014-08-06 08:15:34 UTC
(In reply to Need Real Name from comment #7)
> How is this a duplicate of 547546?
> 
> The bug isn't that permissions are not restored, as described in 547546, but
> that they should never have been adjusted in the first place when locks had
> not yet been acquired.  Anything other than strict ordering of lock
> acquisition prior to any modification to disk image state or permissions
> will inherently have race conditions and endanger the existing domain that
> already holds valid locks...

The patchset proposed in the referenced bug should fix this bug too.

