Description of problem:
When creating a snapshot that includes memory, libvirt (with dynamic_ownership enabled) attempts to chown the memory backing file to 0:0. This fails in a regular oVirt NFS setup (anonuid=36,anongid=36,all_squash). To work with dynamic_ownership enabled, oVirt needs libvirt to respect the NFS permissions, or at least to offer a mechanism similar to seclabel (DAC) for selectively disabling the dynamic_ownership behavior.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Prepare an NFS share with options (anonuid=36,anongid=36,all_squash) -- preferably for an oVirt host.
2. Use that NFS store as a target path for virDomainSnapshotCreateXML. An oVirt example of the <domainsnapshot> XML looks as follows:
<?xml version='1.0' encoding='utf-8'?>
<domainsnapshot>
  <disks>
    <disk name="sda" snapshot="external" type="file">
      <source file="/rhev/data-center/mnt/REDACTED/e96cb7b3-97d9-459a-98f0-4b38659dccca/images/cf999339-0f1b-4490-9279-b94064221b10/4c79cf0e-4a65-4af1-9164-273fa74dc1de" type="file"/>
    </disk>
  </disks>
  <memory file="/rhev/data-center/mnt/REDACTED/e96cb7b3-97d9-459a-98f0-4b38659dccca/images/ecdba3a1-d38c-4f12-ad4e-9ed07f734f9c/27c0049d-adc4-4022-bd3b-0c6a64188d20" snapshot="external"/>
</domainsnapshot>
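For completeness, a server-side export matching the options from step 1 might look as follows (the exported path is hypothetical; only the option list comes from this report):

```
# /etc/exports on the NFS server -- path is illustrative
/exports/ovirt-storage *(rw,all_squash,anonuid=36,anongid=36)
```

After editing, `exportfs -ra` re-reads the export table.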
The permissions under that NFS domain typically look as follows:
# ls -l
-rw-rw----. 1 vdsm kvm 2949120 Jun 8 13:39 93769776-01d8-4a3c-9f7b-c55ced7e9d81
-rw-rw----. 1 vdsm kvm 1048576 Jun 8 13:36 93769776-01d8-4a3c-9f7b-c55ced7e9d81.lease
-rw-r--r--. 1 vdsm kvm 265 Jun 8 13:36 93769776-01d8-4a3c-9f7b-c55ced7e9d81.meta
-rw-rw----. 1 vdsm kvm 42949672960 Jun 8 13:36 e7bd7620-693e-4139-a688-542f133cc9cd
-rw-rw----. 1 vdsm kvm 1048576 May 31 11:33 e7bd7620-693e-4139-a688-542f133cc9cd.lease
-rw-r--r--. 1 vdsm kvm 327 Jun 8 13:36 e7bd7620-693e-4139-a688-542f133cc9cd.meta
Snapshot creation fails. From journal:
Jun 08 13:32:04 localhost.localdomain libvirtd: 2018-06-08 11:32:04.914+0000: 16327: error : virFileOpenForceOwnerMode:2153 : cannot chown '/rhev/data-center/mnt/REDACTED/e96cb7b3-97d9-459a-98f0-4b38659dccca/images/ecdba3a1-d38c-4f12-ad4e-9ed07f734f9c/27c0049d-adc4-4022-bd3b-0c6a64188d20' to (0, 0): Operation not permitted
Jun 08 13:32:04 localhost.localdomain libvirtd: 2018-06-08 11:32:04.918+0000: 16327: error : qemuOpenFileAs:3212 : Error from child process creating '/rhev/data-center/mnt/REDACTED/e96cb7b3-97d9-459a-98f0-4b38659dccca/images/ecdba3a1-d38c-4f12-ad4e-9ed07f734f9c/27c0049d-adc4-4022-bd3b-0c6a64188d20': Operation not permitted
A disk-only snapshot (without memory) succeeds, and the files end up with the correct permissions.
I am able to provide oVirt environment to reproduce/investigate the issue. This behavior does not happen with dynamic_ownership=0.
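For reference, the dynamic_ownership=0 workaround mentioned above is a qemu.conf setting (fragment below; libvirtd must be restarted after changing it):

```
# /etc/libvirt/qemu.conf
# Leave image and memory files owned as-is; the management layer
# (vdsm in oVirt's case) is then responsible for permissions.
dynamic_ownership = 0
```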
Bug reproduced on libvirt-4.4 and libvirt-3.9.0-14.el7_5.6.x86_64 with these steps:
1. Set up an NFS server exported with options (anonuid=36,anongid=36,all_squash,rw)
2. Mount the NFS export to /mnt
3. Create a domain snapshot with the following XML:
<domainsnapshot>
  <disks>
    <disk name="vda" snapshot="external" type="file">
      <source file="/mnt/snap" type="file"/>
    </disk>
  </disks>
  <memory file="/mnt/mem" snapshot="external"/>
</domainsnapshot>
Got the following error:
# virsh -k0 -K0 snapshot-create pc snap.xml
error: Error from child process creating '/mnt/mem': Operation not permitted
(In reply to Martin Polednik from comment #0)
> Description of problem:
> When creating a snapshot that includes memory, libvirt (with enabled
> dynamic_ownership) attempts to chown the memory backing file to 0:0. This
> fails in regular oVirt NFS setup (anonuid=36,anongid=36,all_squash). To work
> with dynamic_ownership enabled, oVirt needs libvirt to respect NFS
> permissions or at least allow developers to use mechanism similar to
> seclabel (DAC) to selectively disable dynamic_ownership behavior.
Libvirt takes the DAC label of your domain and uses that to access the file as a fallback if accessing it as root:root fails. What does your domain XML look like? I'm not fully convinced that you can access the NFS share (which squashes to 36:36) if your domain is running as, say, 40:40.
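To make that flow concrete, here is a minimal Python sketch (illustrative only -- libvirt's actual logic lives in C, in virFileOpenForceOwnerMode(); the squashing NFS server is modeled as an injected chown function so the behaviour is reproducible without a real mount):

```python
import errno
import os

def open_force_owner(path, uid, gid, fallback_uid, fallback_gid,
                     chown=os.chown):
    """Create `path` and try to chown it to uid:gid (0:0 for libvirt);
    if that is refused (e.g. by NFS root squash), retry with the
    domain's DAC label as a fallback."""
    fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o600)
    os.close(fd)
    try:
        chown(path, uid, gid)
        return (uid, gid)
    except PermissionError:
        # The server refused 0:0 -- fall back to the uid:gid the
        # domain process runs under (e.g. 107:107).
        chown(path, fallback_uid, fallback_gid)
        return (fallback_uid, fallback_gid)

def squashing_chown(path, uid, gid):
    """Stand-in for an all_squash NFS server that only admits the
    anonymous identity 36:36; any other chown gets EPERM."""
    if (uid, gid) != (36, 36):
        raise PermissionError(errno.EPERM, "Operation not permitted")
```

With a domain running as 107:107, both the 0:0 attempt and the fallback are refused, so the error propagates -- matching the "Operation not permitted" in the journal above. A domain whose DAC label were 36:36 would succeed via the fallback.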
Created attachment 1453847 [details]
Attaching an example domain.xml.
The NFS share should normally be accessible to all users, since all_squash is used.
(In reply to Milan Zamazal from comment #4)
> Created attachment 1453847 [details]
> Attaching an example domain.xml.
So the domain is running under 107:107 but you have NFS squashed to 36:36.
> NFS should be normally accessible for all users as all_squash is used.
Well, this is something libvirt can't know. Libvirt only knows that the domain is running as 107:107, that dynamic_ownership is set, and that the path where you're trying to save the memory is on NFS.
I see two options here:
1) Make dynamic_ownership best effort on NFS mounts. But this has serious security implications on non-squashed NFS if chown() fails.
2) Invent a <seclabel/> for /domainsnapshot/memory so that a different uid:gid can be specified. But that seems like a hack to work around this one particular issue.
Peter, I recall talking to you about this. Do you have any other ideas? I'm willing to go with 2), since it seems to be the only feasible solution.
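As for option 1), detecting that a path sits on NFS is the easy part -- libvirt already has virFileIsSharedFS() for that. Below is a rough Python equivalent of that mount-table lookup, written over /proc/mounts-style text (passed in as a string, so the check is easy to exercise; it is a sketch, not libvirt's code):

```python
def path_is_on_nfs(path, mounts_text):
    """Return True if the longest mount point prefixing `path`
    (per /proc/mounts contents in `mounts_text`) has an nfs* fstype."""
    best_mnt, best_is_nfs = "", False
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mnt, fstype = fields[1], fields[2]
        prefix = mnt.rstrip("/") + "/"
        # Longest matching mount point wins (handles nested mounts).
        if (path == mnt or path.startswith(prefix)) and len(mnt) > len(best_mnt):
            best_mnt, best_is_nfs = mnt, fstype.startswith("nfs")
    return best_is_nfs
```

The hard part, as noted above, is not the detection but deciding what a failed chown() should mean on such a mount.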
I've posted a patch upstream to start some discussion:
Basically, it's a variant of 1), because it stops enforcing dynamic_ownership for the memory snapshot (which kind of makes sense -- see the commit message for the explanation).
1. create snapshot: PASSED
2. create snapshot with --reuse-external: FAILED (detailed info in test steps)
Steps as follows:
1. On the vdsm host, prepare an NFS dir
# cat /etc/exports
# ll /home/ | grep nfs
drwxr-xr-x. 2 vdsm kvm 23 Aug 28 06:09 nfs
# service nfs restart
Redirecting to /bin/systemctl restart nfs.service
2. mount the nfs locally
# mount | grep /home/nfs
10.73.73.57:/home/nfs on /mnt type nfs4 (rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.73.73.57,local_lock=none,addr=10.73.73.57)
3. prepare a running vm and a snapshot xml
# cat /etc/libvirt/qemu.conf | grep dynamic_ownership
#dynamic_ownership = 1
# cat snap.xml
<domainsnapshot>
  <disks>
    <disk name="vda" snapshot="external" type="file">
      <source file="/mnt/disk.snap" type="file"/>
    </disk>
  </disks>
  <memory file="/mnt/mem.snap" snapshot="external"/>
</domainsnapshot>
# virsh domstate vm2; virsh domblklist vm2
4. create the snapshot
# virsh snapshot-create vm2 snap.xml
Domain snapshot 1535449611 created from 'snap.xml'
# ll /mnt
-rw-------. 1 vdsm kvm 196768 Aug 28 05:46 disk.snap
-rw-------. 1 vdsm kvm 506736552 Aug 28 05:46 mem.snap
5. create the snapshot again with the --reuse-external flag
# virsh snapshot-create vm2 snap.xml --reuse-external
error: internal error: unable to execute QEMU command 'cont': Failed to get "write" lock
<===== failed with lock issue
# virsh domstate vm2
<===== vm paused
# ll /mnt
-rw-------. 1 vdsm kvm 1769472 Aug 28 05:47 disk.snap
<===== memory snapshot gone, only the disk snapshot left
Please help check whether this is expected behaviour or whether I have misconfigured something.
I've managed to reproduce the issue. But I think what you came across are two new bugs:
1) qemu fails to get the write lock because it collides with itself (after the first snapshot it has /mnt/disk.snap plugged in, and it tries to lock it a second time when --reuse-external is run).
2) libvirt deletes the memory even if the memory snapshot was done correctly.
I will post a patch for 2). Not sure how to handle 1). Maybe Peter has an idea?
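Bug 1) can be demonstrated without qemu at all: two open file descriptions on the same image cannot both hold an exclusive advisory lock. A small Python sketch (qemu's image locking actually uses OFD fcntl locks; plain flock() is used here because it conflicts between two descriptors in the same way, even within one process):

```python
import fcntl

def second_lock_fails(path):
    """Lock `path` exclusively, then try to lock it again through a
    second descriptor -- roughly what qemu ends up doing when
    --reuse-external points at an image the VM already has open."""
    with open(path, "wb") as first, open(path, "wb") as second:
        fcntl.flock(first, fcntl.LOCK_EX | fcntl.LOCK_NB)
        try:
            fcntl.flock(second, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            # The qemu-side symptom: Failed to get "write" lock
            return True
        return False
```

Only once the first descriptor is released (i.e. the image is genuinely unused) can the second lock succeed, which is why --reuse-external requires an unused image.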
Patch for 2) posted upstream:
Case 1 is simply incorrect usage. If you are using --reuse-external you need to make sure that the image is unused -- the data in the image _will be destroyed_!
It was correct for qemu to fail the snapshot for that reason. The only thing we could do is walk the backing chains to check whether the file is already in use, but that would be a very complex fix, and it still would not protect you from reusing a file belonging to some other VM.
Also note that qemu should consider acquiring the write locks at the time the 'transaction' command is issued, even when the vCPUs are paused, since that state can't be rolled back in any sane way.
Also, as Peter pointed out in the discussion of the patch for 2), removing the file actually is expected behaviour. The patch will therefore not be merged.
(In reply to Michal Privoznik from comment #16)
> Also, as Peter pointed out in discussion to the patch for 2), removing the
> file actually is expected behaviour. The patch will not be merged then.
Thanks Michal and Peter; setting VERIFIED then.
*** Bug 1636846 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.