Created attachment 1698779 [details]
libvirt-logs

After further investigation, the DAC driver is the one changing the ownership.

In the working logs:
libvirt-4.5.0-23.el7_7.6.x86_64
libvirt-daemon-4.5.0-23.el7_7.6.x86_64
a677858f-2e47-49f4-ba2f-a3e385c1620a/b5e19c0e-03e7-4d62-ba2c-d292394d49d2

[root@dhcp-0-194 ~]# cat /etc/exports
/root/storage_domains *(rw,no_subtree_check,anonuid=36,anongid=36)

2020-06-25 11:42:52.046+0000: 31037: info : virSecuritySELinuxSetFileconHelper:1156 : Setting SELinux context on '/rhev/data-center/mnt/10.35.0.194:_root_storage__domains_sd5/a9e2fc52-9700-4ba1-9909-041bd4b2d45b/images/a677858f-2e47-49f4-ba2f-a3e385c1620a/b5e19c0e-03e7-4d62-ba2c-d292394d49d2' to 'system_u:object_r:mnt_t:s0'
2020-06-25 11:42:52.050+0000: 31037: debug : virFileIsSharedFSType:3670 : Check if path /rhev/data-center/mnt/10.35.0.194:_root_storage__domains_sd5/a9e2fc52-9700-4ba1-9909-041bd4b2d45b/images/a677858f-2e47-49f4-ba2f-a3e385c1620a/b5e19c0e-03e7-4d62-ba2c-d292394d49d2 with FS magic 26985 is shared
2020-06-25 11:42:52.050+0000: 31037: info : virSecuritySELinuxSetFileconHelper:1200 : Setting security context 'system_u:object_r:mnt_t:s0' on '/rhev/data-center/mnt/10.35.0.194:_root_storage__domains_sd5/a9e2fc52-9700-4ba1-9909-041bd4b2d45b/images/a677858f-2e47-49f4-ba2f-a3e385c1620a/b5e19c0e-03e7-4d62-ba2c-d292394d49d2' not supported
2020-06-25 11:42:52.050+0000: 31037: info : virSecurityDACRestoreFileLabelInternal:665 : Restoring DAC user and group on '/rhev/data-center/mnt/10.35.0.194:_root_storage__domains_sd5/a9e2fc52-9700-4ba1-9909-041bd4b2d45b/images/a677858f-2e47-49f4-ba2f-a3e385c1620a/b5e19c0e-03e7-4d62-ba2c-d292394d49d2'
2020-06-25 11:42:52.050+0000: 31037: info : virSecurityDACSetOwnershipInternal:567 : Setting DAC user and group on '/rhev/data-center/mnt/10.35.0.194:_root_storage__domains_sd5/a9e2fc52-9700-4ba1-9909-041bd4b2d45b/images/a677858f-2e47-49f4-ba2f-a3e385c1620a/b5e19c0e-03e7-4d62-ba2c-d292394d49d2' to '0:0'
2020-06-25 11:42:52.053+0000: 31037: info : virSecurityDACSetOwnershipInternal:611 : Setting user and group to '0:0' on '/rhev/data-center/mnt/10.35.0.194:_root_storage__domains_sd5/a9e2fc52-9700-4ba1-9909-041bd4b2d45b/images/a677858f-2e47-49f4-ba2f-a3e385c1620a/b5e19c0e-03e7-4d62-ba2c-d292394d49d2' not permitted

Which makes sense, since the default should be root_squash.

While our problem is seen in:
libvirt-4.5.0-36.el7.x86_64
libvirt-daemon-4.5.0-36.el7.x86_64
421709f9-610f-44fb-b646-a96c41ca95bd/bdf1a7d6-0634-4b7f-83f5-bcbd1130c70a

/Compute_NFS 10.0.0.0/255.0.0.0(rw,no_root_squash)
/RHV_NFS *(rw,no_root_squash)
/Automation_NFS 10.0.0.0/255.0.0.0(rw,no_root_squash)
/RHOS_infra 10.0.0.0/255.0.0.0(rw,no_root_squash)
/QE_images 10.0.0.0/255.0.0.0(rw,no_root_squash)
#/ *(fsid=0,rw,no_root_squash)

2020-06-25 11:21:32.766+0000: 38504: info : virSecuritySELinuxSetFileconHelper:1156 : Setting SELinux context on '/rhev/data-center/mnt/3par-nfs-vfs1.scl.lab.tlv.redhat.com:_vfs1_vfs1_rhv_compute_compute-he-6_nfs__0/d6e38a95-a4ec-4a8d-a8d7-ae81dcadb72f/images/421709f9-610f-44fb-b646-a96c41ca95bd/bdf1a7d6-0634-4b7f-83f5-bcbd1130c70a' to 'system_u:object_r:mnt_t:s0'
2020-06-25 11:21:32.766+0000: 38504: debug : virFileIsSharedFSType:3670 : Check if path /rhev/data-center/mnt/3par-nfs-vfs1.scl.lab.tlv.redhat.com:_vfs1_vfs1_rhv_compute_compute-he-6_nfs__0/d6e38a95-a4ec-4a8d-a8d7-ae81dcadb72f/images/421709f9-610f-44fb-b646-a96c41ca95bd/bdf1a7d6-0634-4b7f-83f5-bcbd1130c70a with FS magic 26985 is shared
2020-06-25 11:21:32.766+0000: 38504: info : virSecuritySELinuxSetFileconHelper:1200 : Setting security context 'system_u:object_r:mnt_t:s0' on '/rhev/data-center/mnt/3par-nfs-vfs1.scl.lab.tlv.redhat.com:_vfs1_vfs1_rhv_compute_compute-he-6_nfs__0/d6e38a95-a4ec-4a8d-a8d7-ae81dcadb72f/images/421709f9-610f-44fb-b646-a96c41ca95bd/bdf1a7d6-0634-4b7f-83f5-bcbd1130c70a' not supported
2020-06-25 11:21:32.766+0000: 38504: info : virSecurityDACRestoreFileLabelInternal:665 : Restoring DAC user and group on '/rhev/data-center/mnt/3par-nfs-vfs1.scl.lab.tlv.redhat.com:_vfs1_vfs1_rhv_compute_compute-he-6_nfs__0/d6e38a95-a4ec-4a8d-a8d7-ae81dcadb72f/images/421709f9-610f-44fb-b646-a96c41ca95bd/bdf1a7d6-0634-4b7f-83f5-bcbd1130c70a'
2020-06-25 11:21:32.766+0000: 38504: info : virSecurityDACSetOwnershipInternal:567 : Setting DAC user and group on '/rhev/data-center/mnt/3par-nfs-vfs1.scl.lab.tlv.redhat.com:_vfs1_vfs1_rhv_compute_compute-he-6_nfs__0/d6e38a95-a4ec-4a8d-a8d7-ae81dcadb72f/images/421709f9-610f-44fb-b646-a96c41ca95bd/bdf1a7d6-0634-4b7f-83f5-bcbd1130c70a' to '0:0'

With the storage exported as no_root_squash, libvirt has the privilege to change ownership on the mount point.
Since this VM went up using virDomainRestoreFlags, we can't provide a seclabel to prevent the ownership change as we do for other disks (which are available in the domxml).
The 3par-nfs is also no_root_squash. In libvirt 6 we see the same behavior, but without the DAC lines in the debug log.
Workaround options:
1. Configure the storage to root_squash.
2. Change the file ownership (set the volume path ownership with # chown 36:36 <path> or # chown vdsm:kvm <path>).
   This will be required each time the restore is executed and you wish to restore the same snapshot with memory again.
3. Edit the host configuration: in /etc/libvirt/qemu.conf change dynamic_ownership to 0.

# vi /etc/libvirt/qemu.conf
dynamic_ownership=0
# systemctl restart libvirtd
(In reply to Liran Rotenberg from comment #3)
> Workaround options:
> 1. Configure the storage to root_squash.

This is possible.

> 2. Change the file ownership (set the volume path to by # chown 36:36 <path>
> or # chown vdsm:kvm <path>)
> - it will be required each time the restore executed and you wish to restore
> again with memory the same snapshot.

This works, but it is not something we can ask users to do.

> 3. Edit the host configuration: in /etc/libvirt/qemu.conf change
> dynamic_ownership to 0.
>
> # vi /etc/libvirt/qemu.conf
> dynamic_ownership=0
> # systemctl restart libvirtd

This is not possible, since vdsm removed the code fixing ownership of anything but storage, and depends on libvirt for this.

I think this should be fixed in libvirt. If not, we need to add an API in supervdsm to change volume ownership, so we can clean up after libvirt.

Basically, the storage code assumes that nobody is modifying volume owner and group. If some program modifies them, it is responsible for fixing it later.
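The supervdsm API proposed above would essentially just chown the volume back after libvirt resets it to root:root. A minimal sketch of such a helper; the function name fix_volume_ownership is hypothetical (no such vdsm verb exists), only the 36:36 vdsm:kvm mapping comes from this bug:

```python
import os

# uid/gid used by vdsm for storage volumes (vdsm:kvm, i.e. 36:36)
VDSM_UID = 36
VDSM_GID = 36

def fix_volume_ownership(path, uid=VDSM_UID, gid=VDSM_GID):
    """Restore a volume's owner/group after another process (here, libvirt's
    DAC security driver) chown()-ed it back to root:root.

    Returns True if ownership was changed, False if it was already correct.
    """
    st = os.stat(path)
    if st.st_uid == uid and st.st_gid == gid:
        return False
    # Needs root (or CAP_CHOWN), which is why this would live in supervdsm.
    os.chown(path, uid, gid)
    return True
```

This only papers over the race, of course: between libvirt's chown and the cleanup, vdsm cannot access the file, which is why fixing it in libvirt is preferable.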
Michal, can you please have a look and triage this one? Thanks.
I had a long discussion with Liran yesterday in IRC and we agreed that this is indeed a libvirt bug and it's worth fixing. I think I have an idea how to fix it. But what is bothering me is that libvirt has been chown()-ing the file on restore since like forever. Why hasn't this bug demonstrated itself sooner? One thing is that the file lives on a root-squashed NFS, but not every NFS that vdsm uses is root squashed. Anyway, stay tuned for more info.
(In reply to Michal Privoznik from comment #6)
> I had a long discussion with Liran yesterday in IRC and we agreed that this
> is indeed a libvirt bug and it's worth fixing. I think I have an idea how to
> fix it. But what is bothering me is that libvirt has been chown()-ing the
> file on restore since like forever. Why this bug hasn't demonstrated itself
> sooner? One thing is that the file lives on a root squashed NFS, but not
> every NFS vdsm uses is root squashed. Anyway, stay tuned for more info.

Hi Michal,
For this issue, I am a little confused about what the expected result is. So no matter what has been done, libvirt should always set the images' ownership back to the original one (which is vdsm 36:36 in this bz)?

During my test, not only did virDomainRestoreFlags+shutdown cause the image ownership to change; a plain 'virsh start + virsh destroy' (virDomainCreate?) also triggered the ownership modification. So are the following steps valid test cases? Please note I used qemu:qemu as the ownership in the test.

Tested with libvirt-6.0.0-19.module+el8.2.1+6538+c148631f.x86_64

Scenario 1. Use a no_root_squash nfs config to reproduce this issue

NFS server setup:
1. Edit /etc/exports as follows:
root@yisun-test2 ~ ## cat /etc/exports
/nfsexp *(rw,no_root_squash)

2. Create a qcow2 image and chown it to qemu:qemu
root@yisun-test2 ~ ## qemu-img create -f qcow2 /nfsexp/test.qcow2 1G
Formatting '/nfsexp/test.qcow2', fmt=qcow2 size=1073741824 cluster_size=65536 lazy_refcounts=off refcount_bits=16
root@yisun-test2 ~ ## chown 107:107 /nfsexp/test.qcow2
root@yisun-test2 ~ ## ll /nfsexp/
-rw-r--r--. 1 qemu qemu 196624 Jun 28 16:01 test.qcow2

3. Restart the nfs service
root@yisun-test2 ~ ## systemctl restart nfs

Using this image in a VM:
1. Mount the nfs dir
root@yisun-test1 ~ 04:22:45$ mount -t nfs 10.66.85.212:/nfsexp /nfs
root@yisun-test1 ~ 04:24:31$ ll /nfs
total 196
-rw-r--r--. 1 qemu qemu 196624 Jun 28 04:23 test.qcow2  <== group and owner is qemu

2. Use the image in the vm
root@yisun-test1 ~ 04:26:02$ virsh domblklist test
 Target   Source
---------------------------
 vda      /nfs/test.qcow2

3. Start the vm:
root@yisun-test1 ~ 04:26:33$ virsh start test
Domain test started

4. Destroy the vm
root@yisun-test1 ~ 04:27:02$ virsh destroy test
Domain test destroyed
root@yisun-test1 ~ 04:27:39$ ll /nfs/test.qcow2
-rw-r--r--. 1 root root 196624 Jun 28 04:23 /nfs/test.qcow2  <==== group and owner changed to root

Scenario 2. Use a root_squash nfs config to avoid this issue

NFS server setup:
1. Edit /etc/exports as follows:
root@yisun-test2 ~ ## cat /etc/exports
/nfsexp *(rw,root_squash)

2. Create a qcow2 image and chown it to qemu:qemu
root@yisun-test2 ~ ## qemu-img create -f qcow2 /nfsexp/test.qcow2 1G
Formatting '/nfsexp/test.qcow2', fmt=qcow2 size=1073741824 cluster_size=65536 lazy_refcounts=off refcount_bits=16
root@yisun-test2 ~ ## chown 107:107 /nfsexp/test.qcow2
root@yisun-test2 ~ ## ll /nfsexp/
-rw-r--r--. 1 qemu qemu 196624 Jun 28 16:01 test.qcow2

3. Restart the nfs service
root@yisun-test2 ~ ## systemctl restart nfs

Using this image in a VM:
1. Mount the nfs dir:
root@yisun-test1 ~ 04:21:26$ mount -t nfs 10.66.85.212:/nfsexp /nfs
root@yisun-test1 ~ 04:21:28$ ll /nfs
-rw-r--r--. 1 qemu qemu 196624 Jun 28 04:01 test.qcow2  <==== group and owner is qemu

2. Use the image in the vm
root@yisun-test1 ~ 04:21:38$ virsh domblklist test
 Target   Source
---------------------------
 vda      /nfs/test.qcow2

3. Start the vm
root@yisun-test1 ~ 04:22:01$ virsh start test
Domain test started
root@yisun-test1 ~ 04:22:03$ ll -Z /nfs/test.qcow2
-rw-r--r--. 1 qemu qemu system_u:object_r:nfs_t:s0 196624 Jun 28 04:01 /nfs/test.qcow2

4. Destroy the vm
root@yisun-test1 ~ 04:22:11$ virsh destroy test
Domain test destroyed
root@yisun-test1 ~ 04:22:16$ ll -Z /nfs/test.qcow2
-rw-r--r--. 1 qemu qemu system_u:object_r:nfs_t:s0 196624 Jun 28 04:01 /nfs/test.qcow2  <==== owner not changed, still qemu:qemu
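The before/after `ll` checks in the two scenarios boil down to a single pass criterion; a small sketch of it (the helper names are illustrative, only the 107:107 qemu:qemu mapping comes from the transcripts above):

```python
import os

# uid/gid used in the scenarios above: 107:107 is qemu:qemu on the test hosts
QEMU_UID = 107
QEMU_GID = 107

def owner_of(path):
    """Return (uid, gid) of a file -- the same fields the `ll` output shows."""
    st = os.stat(path)
    return (st.st_uid, st.st_gid)

def ownership_preserved(path, expected_uid=QEMU_UID, expected_gid=QEMU_GID):
    """Pass criterion: after 'virsh destroy' (or shutdown) the image must
    still be owned by the uid/gid it was given before 'virsh start'."""
    return owner_of(path) == (expected_uid, expected_gid)
```

In scenario 1 (no_root_squash) the check fails after destroy because the file ends up as 0:0; in scenario 2 it passes because the server squashes libvirt's chown.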
We had a similar bug: BZ 1666795, where we saw the ownership change for every disk at destroy.
The difference here is that we don't see the disk in the domxml. The disk is an argument provided to virDomainRestoreFlags (the path of the volume).
From the RHV point of view, we can't add the seclabel element to prevent it.
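For disks that do appear in the domain XML, the ownership change can be suppressed with a per-source DAC seclabel; this is the element referred to here. A sketch of such a disk definition, with illustrative device names and a placeholder path:

```xml
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source file='/rhev/data-center/mnt/.../images/.../volume'>
    <!-- tell libvirt's DAC driver not to touch owner/group of this file -->
    <seclabel model='dac' relabel='no'/>
  </source>
  <target dev='vda' bus='virtio'/>
</disk>
```

The restore file, however, is passed as a bare path to virDomainRestoreFlags, so there is nowhere to attach such an element.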
(In reply to Liran Rotenberg from comment #8)
> We had similar bug: BZ 1666795 where we saw the ownership change for every
> disk at destory.
> The difference here is that we don't see the disk in the domxml. The disk is
> an argument provided to virDomainRestoreFlags (the path of the volume).
> From RHV point of view, we can't add the seclabel element to prevent it.

Thanks for the info. Involving save/restore QE to track.
(In reply to yisun from comment #7)
> Hi Michal,
> For this issue, I am a little confused about what is the expected result. So
> no matter what has be done, libvirt should always set the images' ownership
> to original ones(which is vdsm 36:36 in this bz)?

Yes, this is the idea behind remembering the original owner/SELinux label of the file. However, the way it is implemented, it needs help from the filesystem - to store extended attributes (XATTRs). But NFS still lacks those. In such cases libvirt falls back to its old behavior - restore to root:root. The rationale for this behavior is that files attached to a domain may contain sensitive information (e.g. disks may contain passwords; well, hashes of passwords, but still).

> During my test, not only
> virDomainRestoreFlags+shutdown will cause image ownership changed, but just
> a 'virsh start + virsh destroy' (virDomainCreate?) will also trigger the
> ownership modified.

Yes, but as explained above, NFS lacks XATTRs (there were some patches sent at the end of last year to implement XATTR support for NFSv4, but I don't know what happened to them), therefore libvirt has no place to store the original owner of the file.

The way vdsm/RHV addresses this issue is that for paths they don't want libvirt to touch (e.g. disks), they set <seclabel model='dac' relabel='no'/>. However, as Liran points out, there is no interface to put the norelabel onto the file the domain is restoring from. Therefore, the idea is to change libvirt so that it does not change ownership of the restore file. I could call the code that saves the original owner into XATTRs, so that when the domain is up & running and the file is chown()-ed back, the XATTR would be read and the original owner (e.g. vdsm:kvm) restored. BUT, since this is NFS, it won't work. And while I could blame NFS for that, we will need a different approach. I'm working on a PoC solution.
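The XATTR mechanism described above can be illustrated with a small sketch. libvirt stores its label under a root-only `trusted.libvirt.security.*` name; the sketch below uses the unprivileged `user.` namespace instead, and the helper names are hypothetical. The point is the fallback path: on a filesystem without XATTR support (the NFS case), storing the owner simply fails.

```python
import os

# libvirt's real attribute lives in the root-only "trusted." namespace;
# "user." is used here so the sketch runs unprivileged.
XATTR_NAME = "user.libvirt.security.dac"

def remember_owner(path):
    """Store the file's current uid:gid in an extended attribute.

    Returns True on success, False when the filesystem does not support
    XATTRs -- the caller then has to fall back to the old behavior of
    restoring ownership to root:root.
    """
    st = os.stat(path)
    value = f"{st.st_uid}:{st.st_gid}".encode()
    try:
        os.setxattr(path, XATTR_NAME, value)
        return True
    except OSError:
        # e.g. ENOTSUP on filesystems without XATTR support, such as NFS
        return False

def recall_owner(path):
    """Read the remembered uid:gid back, or None if it was never stored."""
    try:
        uid, gid = os.getxattr(path, XATTR_NAME).decode().split(":")
        return int(uid), int(gid)
    except OSError:
        return None
```

With remember_owner() returning False on NFS, there is simply nowhere to record vdsm:kvm, which is why the fix has to avoid the chown on the restore file altogether rather than rely on restoring it afterwards.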
Patches posted upstream: https://www.redhat.com/archives/libvir-list/2020-July/msg00007.html
I've just merged patches upstream:

77ef118456 qemu_security: Complete renaming of virSecurityManagerSetAllLabel() argument
f68a14d17f secdrivers: Rename @stdin_path argument of virSecurityDomainSetAllLabel()
7e235954e5 Revert "qemuSecurityDomainRestorePathLabel: Introduce @ignoreNS argument"
824e349397 qemu: Use qemuSecuritySetSavedStateLabel() to label restore path
d665b1ef3b security_selinux: Implement virSecurityManager{Set,Restore}SavedStateLabel
e69df41b6d qemu_security: Implement virSecurityManager{Set,Restore}SavedStateLabel
228a27f59b security: Reintroduce virSecurityManager{Set,Restore}SavedStateLabel

v6.5.0-142-g77ef118456
Reproduced this issue on rhel-av8.2:
libvirt-daemon-6.0.0-25.2.module+el8.2.1+7722+a9e38cf3.x86_64
qemu-kvm-4.2.0-29.module+el8.2.1+7712+3c3fe332.2.x86_64

Steps:
# cp avocado-norootsquash.save /yanqzhan-nfs/
# chown 107:107 /yanqzhan-nfs/avocado-norootsquash.save
# ll -Z /yanqzhan-nfs/avocado-norootsquash.save
-rw-------. 1 qemu qemu system_u:object_r:nfs_t:s0 556000011 Aug 27 06:03 /yanqzhan-nfs/avocado-norootsquash.save
# virsh restore /yanqzhan-nfs/avocado-norootsquash.save
Domain restored from /yanqzhan-nfs/avocado-norootsquash.save
# ll -Z /yanqzhan-nfs/avocado-norootsquash.save
-rw-------. 1 root root system_u:object_r:nfs_t:s0 556000011 Aug 27 06:03 /yanqzhan-nfs/avocado-norootsquash.save
# virsh shutdown avocado-vt-vm1
Domain avocado-vt-vm1 is being shutdown
# ll -Z /yanqzhan-nfs/avocado-norootsquash.save
-rw-------. 1 root root system_u:object_r:nfs_t:s0 556000011 Aug 27 06:03 /yanqzhan-nfs/avocado-norootsquash.save

Verified this bug on rhel-av8.3:
libvirt-daemon-6.6.0-2.module+el8.3.0+7567+dc41c0a9.x86_64
qemu-kvm-5.1.0-3.module+el8.3.0+7708+740a1315.x86_64

Steps:
# cp avocado-norootsquash.save /yanqzhan-nfs/
# chown 107:107 avocado-norootsquash.save
# ll -Z /yanqzhan-nfs/avocado-norootsquash.save
-rw-------. 1 qemu qemu system_u:object_r:nfs_t:s0 388200099 Aug 27 05:56 /yanqzhan-nfs/avocado-norootsquash.save
# virsh restore /yanqzhan-nfs/avocado-norootsquash.save
Domain restored from /yanqzhan-nfs/avocado-norootsquash.save
# ll -Z /yanqzhan-nfs/avocado-norootsquash.save
-rw-------. 1 qemu qemu system_u:object_r:nfs_t:s0 388200099 Aug 27 05:56 /yanqzhan-nfs/avocado-norootsquash.save
# virsh shutdown avocado-vt-vm1
Domain avocado-vt-vm1 is being shutdown
# ll -Z /yanqzhan-nfs/avocado-norootsquash.save
-rw-------. 1 qemu qemu system_u:object_r:nfs_t:s0 388200099 Aug 27 05:56 /yanqzhan-nfs/avocado-norootsquash.save
Hi Liran,
Could you please check whether comment 18 meets your requirement? Or do you want to try your RHV operations again before this bug is set to 'verified'? Thanks.
Yes, as long as virsh restore triggers virDomainRestoreFlags of avocado-norootsquash.save and the NFS is, as the name suggests, configured as no_root_squash.
Thanks again.

The log msg for comment15 is:
2020-08-27 10:16:28.681+0000: 18246: debug : virDomainRestore:955 : conn=0x7fdd68007960, from=/yanqzhan-nfs/avocado-norootsquash.save

To test virDomainRestoreFlags:
# ll -Z /yanqzhan-nfs
total 1101788
-rw-------. 1 qemu qemu system_u:object_r:nfs_t:s0 388200099 Aug 27 23:33 avocado-norootsquash.save
-rw-r--r--. 1 qemu qemu system_u:object_r:nfs_t:s0 740098048 Aug 27 23:44 RHEL-8.2.0-RC-1.3-x86_64.qcow2
# virsh restore /yanqzhan-nfs/avocado-norootsquash.save --xml avocado.xml-dac
Domain restored from /yanqzhan-nfs/avocado-norootsquash.save
# ll -Z /yanqzhan-nfs
total 1101788
-rw-------. 1 qemu qemu system_u:object_r:nfs_t:s0 388200099 Aug 27 23:33 avocado-norootsquash.save
-rw-r--r--. 1 qemu qemu system_u:object_r:nfs_t:s0 740098048 Aug 27 23:44 RHEL-8.2.0-RC-1.3-x86_64.qcow2
# virsh list --all
 Id   Name             State
--------------------------------------
 2    avocado-vt-vm1   running
# virsh shutdown avocado-vt-vm1
Domain avocado-vt-vm1 is being shutdown
# virsh list --all
 Id   Name   State
--------------------------------------
# ll -Z /yanqzhan-nfs
total 1101788
-rw-------. 1 qemu qemu system_u:object_r:nfs_t:s0 388200099 Aug 27 23:33 avocado-norootsquash.save
-rw-r--r--. 1 qemu qemu system_u:object_r:nfs_t:s0 740098048 Aug 27 23:46 RHEL-8.2.0-RC-1.3-x86_64.qcow2
# cat libvirtd.log | grep from=
2020-08-28 03:45:28.043+0000: 39513: debug : virDomainRestoreFlags:1025 : conn=0x7fa63c0076d0, from=/yanqzhan-nfs/avocado-norootsquash.save, dxml=<domain type='kvm'>

Since the test result is as expected, marking this bug as verified.
(In reply to yanqzhan from comment #21)
> Thanks again.
>
> The log msg for comment15 is:

Sorry, that should be 'for comment18' instead.

> 2020-08-27 10:16:28.681+0000: 18246: debug : virDomainRestore:955 :
> conn=0x7fdd68007960, from=/yanqzhan-nfs/avocado-norootsquash.save
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (virt:8.3 bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:5137