Bug 1004673

Summary: when live migrating, both nodes change the ownership of files

Product: [Community] Virtualization Tools
Component: libvirt
Reporter: YunQiang Su <wzssyqa>
Assignee: Libvirt Maintainers <libvirt-maint>
Status: CLOSED DEFERRED
Severity: unspecified
Priority: unspecified
Version: unspecified
CC: bloch, crobinso, eblake, gsun, rbalakri, rhbugzilla, stephan, tharbaug
Hardware: Unspecified
OS: Unspecified
Last Closed: 2016-05-02 14:24:56 UTC
Type: Bug

Description YunQiang Su 2013-09-05 08:02:58 UTC
Description of problem:
Here we have 2 nodes sharing the same NFS storage for VM disks.

When libvirt is compiled with
      --with-qemu-user=libvirt-qemu  \
      --with-qemu-group=kvm    \

and a VM is started as libvirt-qemu:kvm,

live migrating the VM messes up the ownership of these files.
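
For reference, a quick way to check the effective QEMU user/group on each node (a
sketch; the default qemu.conf path is an assumption about this self-built setup):

  # show any explicit user/group/dynamic_ownership settings in qemu.conf
  grep -E '^\s*(user|group|dynamic_ownership)\s*=' /etc/libvirt/qemu.conf
  # show which user/group running guests actually execute as
  ps axo user,group,args | grep [q]emu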

Version-Release number of selected component (if applicable):
1.0.1 was tested

How reproducible:
root@node1:/var/lib/nova/instances/instance-00000004# ls -l
total 18916
-rw-rw---- 1 libvirt-qemu kvm     17004 Sep  5 15:56 console.log
-rw-r--r-- 1 libvirt-qemu kvm  35651584 Sep  5 15:56 disk
-rw-rw-r-- 1 libvirt-qemu kvm   4731440 Sep  5 13:51 kernel
-rw-rw-r-- 1 nova         nova     1673 Sep  5 13:51 libvirt.xml
-rw-rw-r-- 1 libvirt-qemu kvm   2254249 Sep  5 13:51 ramdisk

root@node1:/var/lib/nova/instances/instance-00000004# virsh migrate --live instance-00000004 qemu+tcp://node2/system

Actual results:
root@node1:/var/lib/nova/instances/instance-00000004# ls -l
total 18896
-rw-rw---- 1 root root        0 Sep  5 15:57 console.log
-rw-r--r-- 1 root root 35651584 Sep  5 15:56 disk
-rw-rw-r-- 1 root root  4731440 Sep  5 13:51 kernel
-rw-rw-r-- 1 nova nova     1673 Sep  5 13:51 libvirt.xml
-rw-rw-r-- 1 root root  2254249 Sep  5 13:51 ramdisk

Expected results:
root@node1:/var/lib/nova/instances/instance-00000004# ls -l
total 18916
-rw-rw---- 1 libvirt-qemu kvm     17004 Sep  5 15:56 console.log
-rw-r--r-- 1 libvirt-qemu kvm  35651584 Sep  5 15:56 disk
-rw-rw-r-- 1 libvirt-qemu kvm   4731440 Sep  5 13:51 kernel
-rw-rw-r-- 1 nova         nova     1673 Sep  5 13:51 libvirt.xml
-rw-rw-r-- 1 libvirt-qemu kvm   2254249 Sep  5 13:51 ramdisk

Additional info:
It seems that the ownership of these files is changed again by node1 after it has been changed by node2.
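
One way to check this guess is to watch the DAC security-driver messages on both
nodes while the migration runs (a sketch; the log file path, and routing info-level
output there via libvirtd.conf, are assumptions):

  # run on node1 and node2 before issuing the migrate command
  tail -f /var/log/libvirt/libvirtd.log | grep -i 'DAC user and group'

Whichever host logs the chown back to root:root last is the one resetting the ownership.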

Comment 1 Eric Blake 2013-09-05 11:56:04 UTC
What is the content of /etc/libvirt/qemu.conf?  You MUST have the same user/group setup on BOTH sides of the migration (either by explicit listing in the conf file, or by the defaults built into the binary at configure time), otherwise you are using an unsupported configuration.  It is NOT supported to change owner/group across migration.  This is probably invalid usage rather than a bug, although I need to know your conf file contents to know for sure.
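
For reference, a quick way to list just the effective (non-comment, non-blank)
settings from that file on each node (a sketch, assuming the default path):

  grep -Ev '^\s*(#|$)' /etc/libvirt/qemu.conf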

Comment 2 YunQiang Su 2013-09-07 10:40:39 UTC
I use the default configuration file, which is full of comments; effectively, it is an empty file.

What I am reporting is that when the instance stops, the disk files are restored to root:root, and when the instance starts, the disk files are chowned to libvirt-qemu:kvm.

I expect both sides to have the same user:group; due to this bug, after migration the user:group is no longer the same.

I guess that the user:group is changed by the destination node, and then changed back to root:root by the source node.

Comment 3 Eric Blake 2013-09-09 13:56:23 UTC
(In reply to YunQiang Su from comment #2)
> I use the default configure file full of comment, yes, it is a empty file.

Are you running the same self-built libvirtd on both nodes?  What happens if you put an explicit:
user="libvirt-qemu"
group="kvm"
in your /etc/libvirt/qemu.conf on both sides?
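
Spelled out, that would be a stanza like the following on both hosts, followed by a
libvirtd restart (a sketch; the restart command depends on the distro's init system):

  # /etc/libvirt/qemu.conf -- must be identical on source and destination
  user = "libvirt-qemu"
  group = "kvm"

  # then restart libvirtd on both nodes, e.g.:
  service libvirtd restart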

> 
> What I report is that, when the instance stop, the disk file will restore to
> root:root, while when instance start, disks file will be chowned by
> libvirt-qemu:kvm.

That is normal behavior.  If you don't want libvirt doing the chown, set:
dynamic_ownership=0
in your /etc/libvirt/qemu.conf on both sides (but then YOU are responsible for correct ownership, rather than relying on libvirt to auto-change it as needed).
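
A minimal sketch of that configuration, with ownership then maintained by hand (the
instance path is the one from the report above):

  # /etc/libvirt/qemu.conf on both nodes -- disable libvirt's automatic chown
  dynamic_ownership = 0

  # ownership must then be set manually, e.g.:
  chown -R libvirt-qemu:kvm /var/lib/nova/instances/instance-00000004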

> 
> I wish that both sides they have the same user:group.
> While due to this bug:
> 
> 
> After migration, the user:group will be not the same.

You still haven't proven to me that you have identical libvirt configurations on both source and destination; if permissions are botched because of mismatched config, the bug is on your end; only if we can prove that permissions are botched by identically-configured libvirtd is it a libvirt bug.  I'm still leaning towards this being usage error and not a libvirt bug.

Comment 4 Stephan Ellis 2013-12-17 21:33:54 UTC
I can confirm this problem on a fresh CentOS 6.5 machine.  On live migration, libvirt changes the owner and group on disk images to 0:0.  A destroy/start of that same VM in place on the host it migrated to changes them back to qemu:kvm.  I've even tried setting dynamic_ownership to 0, but it still changes the owner and group on live migration.  I've got 3 hosts and the shared storage is a gluster volume.  The same problem shows up on NFS.

The relevant entries from libvirtd.log look like:

2013-12-17 21:27:17.714+0000: 11549: info : virSecurityDACRestoreSecurityFileLabel:340 : Restoring DAC user and group on '/var/lib/libvirt/images/vol1/test2.qcow2'
2013-12-17 21:27:17.726+0000: 11549: info : virSecurityDACSetOwnership:296 : Setting DAC user and group on '/var/lib/libvirt/images/vol1/test2.qcow2' to '0:0'
2013-12-17 21:27:20.444+0000: 11545: info : remoteDispatchAuthList:2398 : Bypass polkit auth for privileged client pid:11859,uid:0
2013-12-17 21:27:20.541+0000: 11545: info : virSecuritySELinuxSetFileconHelper:794 : Setting SELinux context on '/var/lib/libvirt/images/vol1/test2.qcow2' to 'unconfined_u:object_r:svirt_image_t:s0:c303,c544'
2013-12-17 21:27:20.543+0000: 11545: info : virSecuritySELinuxSetFileconHelper:833 : Setting security context 'unconfined_u:object_r:svirt_image_t:s0:c303,c544' on '/var/lib/libvirt/images/vol1/test2.qcow2' not supported
2013-12-17 21:27:20.543+0000: 11545: info : virSecurityDACSetOwnership:296 : Setting DAC user and group on '/var/lib/libvirt/images/vol1/test2.qcow2' to '107:36'
2013-12-17 21:27:20.544+0000: 11545: info : virSecurityDACSetOwnership:296 : Setting DAC user and group on '/var/lib/libvirt/images/vol1/bi-win2k8-r2dc64-20130821.qcow2' to '107:36'


As you can see, when the VM was migrated, the disk image's ownership was changed to 0:0.  A destroy/start resulted in the ownership being set correctly.
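
For reference, the numeric ids in that log can be resolved on the host to confirm
they are the expected qemu:kvm pair (a sketch):

  getent passwd 107    # typically qemu on CentOS 6
  getent group 36      # typically kvm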

Comment 5 Stephan Ellis 2013-12-17 23:00:00 UTC
Just realized that it was probably because I didn't have the storage.owner-* options set on the volume properly.
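
For reference, those options are set per-volume roughly like this (a sketch; the
volume name "vol1" is inferred from the image paths above, and 107:36 are the ids
from the log):

  gluster volume set vol1 storage.owner-uid 107
  gluster volume set vol1 storage.owner-gid 36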


Comment 6 Mark R. 2014-02-18 17:53:17 UTC
(In reply to Stephan Ellis from comment #5)
> Just realized that it was probably because I didn't have the storage.owner-*
> options set on the volume properly.
> 

Following up on this since Stephan hasn't reported back: setting storage.owner-*id doesn't help with this issue.  I also have VM images on a gluster volume, and I'm running libvirtd with qemu.conf as provided (defaults) in CentOS 6.5.

On migration, the images are chown'ed to 0:0 regardless of what options are configured for the gluster volume.

As mentioned in this comment ( https://bugzilla.redhat.com/show_bug.cgi?id=1057645#c7 ), if libvirt can detect gluster as a shared filesystem as it does for others, and therefore skip the chown on migration, this issue is squashed.

You can also set the 'user=qemu', 'group=qemu', and 'dynamic_ownership=0' options in qemu.conf, restart libvirtd, and successfully do a live migration.  However, that is much less administration-friendly, e.g. a VM that has an iSCSI multipath device on the host as one of its disks can no longer start until you chown qemu:qemu that multipath disk's /dev/dm-X entry. On shutdown/destroy of the VM, /dev/dm-X is chown'ed back to root:disk regardless of 'dynamic_ownership=0', so the VM is unable to start w/o intervention each time.  

Of the two possibilities, the ability to detect gluster and not chown on migration fixes all of this and makes life much simpler for us end-users.

Comment 7 Thayne Harbaugh 2014-12-03 19:30:28 UTC
This may be related to bug 714997:

  https://bugzilla.redhat.com/show_bug.cgi?id=714997

Comment 8 Cole Robinson 2016-04-10 14:36:27 UTC
Can anyone confirm this is still an issue with recent-ish libvirt versions?  If so, please report which distro and libvirt version you are using.

Comment 9 Cole Robinson 2016-05-02 14:24:56 UTC
Since there's no response, closing as DEFERRED.  But if anyone is still affected with newer libvirt versions, please re-open and we can triage from there.