Created attachment 1639804 [details]
vdsm log with ERROR

Description of problem:

I encounter a permissions failure when starting a VM created on NFS storage.
I am not sure the following fact is related, but I'll mention it as well: the
failing VM is created on a RHV 4.4 env from a template that originates from a
4.3 env.

The error is:

2019-11-25 08:57:18,000+0000 ERROR (vm/24c88f53) [virt.vm] (vmId='24c88f53-7875-4c41-9811-89d801a03845') The vm start process failed (vm:841)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 775, in _startUnderlyingVm
    self._run()
  File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 2600, in _run
    dom.createWithFlags(flags)
  File "/usr/lib/python3.6/site-packages/vdsm/common/libvirtconnection.py", line 131, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/common/function.py", line 94, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python3.6/site-packages/libvirt.py", line 1166, in createWithFlags
    if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
libvirt.libvirtError: internal error: child reported (status=125): unable to open /rhev/data-center/mnt/172.30.83.204:_mnt_red01__l1__group/25fdae2d-d184-4c48-a9c3-7a792e3a5bb3/images/013d1cf5-588f-4148-9d75-2daf3e12d997/12ab80ae-bd80-42b3-9f0a-255e68d0baf7: Permission denied

Checking permissions for this path on the vdsm host, it appears to be normal
as far as I know:

[root@f01-h04-000-1029u mnt]# ll /rhev/data-center/mnt/172.30.83.204:_mnt_red01__l1__group/25fdae2d-d184-4c48-a9c3-7a792e3a5bb3/images/013d1cf5-588f-4148-9d75-2daf3e12d997/12ab80ae-bd80-42b3-9f0a-255e68d0baf7
-rw-rw----. 2 vdsm kvm 3221225472 Nov 25 08:38 /rhev/data-center/mnt/172.30.83.204:_mnt_red01__l1__group/25fdae2d-d184-4c48-a9c3-7a792e3a5bb3/images/013d1cf5-588f-4148-9d75-2daf3e12d997/12ab80ae-bd80-42b3-9f0a-255e68d0baf7

Checking this path on the NFS server shows nothing weird as well:

[root@nfs-server /]# ll /mnt/red01_l1_group/25fdae2d-d184-4c48-a9c3-7a792e3a5bb3/images/013d1cf5-588f-4148-9d75-2daf3e12d997/
total 3147984
-rw-rw----. 2 36 36 3221225472 Nov 25 03:38 12ab80ae-bd80-42b3-9f0a-255e68d0baf7
-rw-rw----. 2 36 36    1048576 Nov 25 03:37 12ab80ae-bd80-42b3-9f0a-255e68d0baf7.lease
-rw-r--r--. 2 36 36        341 Nov 25 03:47 12ab80ae-bd80-42b3-9f0a-255e68d0baf7.meta
-rw-rw----. 1 36 36     198208 Nov 25 03:47 519fbd7f-b04c-430e-8df6-7d27cb2f495a
-rw-rw----. 1 36 36    1048576 Nov 25 03:47 519fbd7f-b04c-430e-8df6-7d27cb2f495a.lease
-rw-r--r--. 1 36 36        252 Nov 25 03:47 519fbd7f-b04c-430e-8df6-7d27cb2f495a.meta

[root@nfs-server /]# ll -Z /mnt/red01_l1_group/25fdae2d-d184-4c48-a9c3-7a792e3a5bb3/images/013d1cf5-588f-4148-9d75-2daf3e12d997/12ab80ae-bd80-42b3-9f0a-255e68d0baf7
-rw-rw----. 36 36 system_u:object_r:unlabeled_t:s0 /mnt/red01_l1_group/25fdae2d-d184-4c48-a9c3-7a792e3a5bb3/images/013d1cf5-588f-4148-9d75-2daf3e12d997/12ab80ae-bd80-42b3-9f0a-255e68d0baf7

Version-Release number of selected component (if applicable):

engine:
rhv-release-4.4.0-5-999.noarch
redhat-release-server-7.7-10.el7.x86_64

vdsm host:
rhv-release-4.4.0-5-999.noarch
redhat-release-eula-8.2-0.5.el8.x86_64
vdsm-4.40.0-141.gitb9d2120.el8ev.x86_64
python3-libvirt-5.6.0-1.module+el8.1.0+3891+3b51c436.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create an NFS storage domain
2. Create a VM from an existing template
3. Start the VM

Actual results:
The VM fails to start.

Expected results:
The VM should start successfully.

Additional info:
Attaching the vdsm log as a file, and providing a URL (in a private message
below) for downloading a log-collector archive from the relevant vdsm host.
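For context on why the listings above look "normal" yet the open still fails: the image is mode 0660 and owned by 36:36 (vdsm:kvm), so the standard POSIX mode-bit check passes for a process acting as 36:36 but fails for any other non-root identity — such as root squashed to nobody by the NFS server. A minimal illustrative sketch of that check (hypothetical helper, not vdsm code; it ignores root's bypass, ACLs and SELinux):

```python
def can_open_rw(mode, file_uid, file_gid, uid, gids):
    """Return True if (uid, gids) may open a file of the given mode/owner
    for read-write, per the basic POSIX owner/group/other classes."""
    if uid == file_uid:          # owner class
        bits = (mode >> 6) & 0o7
    elif file_gid in gids:       # group class
        bits = (mode >> 3) & 0o7
    else:                        # other class
        bits = mode & 0o7
    return (bits & 0o6) == 0o6   # need both read and write

# The image is -rw-rw----. owned by 36:36 (vdsm:kvm):
print(can_open_rw(0o660, 36, 36, 36, {36}))        # as vdsm/qemu (36:36) -> True
print(can_open_rw(0o660, 36, 36, 65534, {65534}))  # as root squashed to nobody -> False
```

This matches the failure mode discussed later in the thread: the open is denied not because the mode bits are wrong, but because the identity doing the open is not what the listings suggest.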
To make it easier to find that particular failure, grep for '2019-11-25 08:57:18' in the vdsm log.
Ilan, can you:
- attach output of /etc/exports on the NFS server
- attach output of "exportfs -v" on the NFS server
This is the vm xml that failed to start:

<?xml version="1.0" encoding="utf-8"?>
<domain type="kvm" xmlns:ns0="http://ovirt.org/vm/tune/1.0" xmlns:ovirt-vm="http://ovirt.org/vm/1.0">
    <name>ovirt_enabled</name>
    <uuid>24c88f53-7875-4c41-9811-89d801a03845</uuid>
    <memory>1048576</memory>
    <currentMemory>1048576</currentMemory>
    <iothreads>1</iothreads>
    <maxMemory slots="16">4194304</maxMemory>
    <vcpu current="1">16</vcpu>
    <sysinfo type="smbios">
        <system>
            <entry name="manufacturer">Red Hat</entry>
            <entry name="product">RHEL</entry>
            <entry name="version">8.2-0.5.el8</entry>
            <entry name="serial">00000000-0000-0000-0000-0cc47af9641e</entry>
            <entry name="uuid">24c88f53-7875-4c41-9811-89d801a03845</entry>
        </system>
    </sysinfo>
    <clock adjustment="0" offset="variable">
        <timer name="rtc" tickpolicy="catchup"/>
        <timer name="pit" tickpolicy="delay"/>
        <timer name="hpet" present="no"/>
    </clock>
    <features>
        <acpi/>
    </features>
    <cpu match="exact">
        <model>SandyBridge</model>
        <topology cores="1" sockets="16" threads="1"/>
        <numa>
            <cell cpus="0-15" id="0" memory="1048576"/>
        </numa>
        <feature name="vmx" policy="require"/>
    </cpu>
    <cputune/>
    <devices>
        <input bus="usb" type="tablet"/>
        <channel type="unix">
            <target name="ovirt-guest-agent.0" type="virtio"/>
            <source mode="bind" path="/var/lib/libvirt/qemu/channels/24c88f53-7875-4c41-9811-89d801a03845.ovirt-guest-agent.0"/>
        </channel>
        <channel type="unix">
            <target name="org.qemu.guest_agent.0" type="virtio"/>
            <source mode="bind" path="/var/lib/libvirt/qemu/channels/24c88f53-7875-4c41-9811-89d801a03845.org.qemu.guest_agent.0"/>
        </channel>
        <rng model="virtio">
            <backend model="random">/dev/urandom</backend>
            <alias name="ua-2aa66d71-4fd0-47f8-a0ba-b39ad1e9e047"/>
        </rng>
        <graphics autoport="yes" keymap="en-us" passwd="*****" passwdValidTo="1970-01-01T00:00:01" port="-1" type="vnc">
            <listen network="vdsm-display" type="network"/>
        </graphics>
        <controller index="0" ports="16" type="virtio-serial">
            <alias name="ua-337d0bde-92d9-47df-894f-ffd2308d3798"/>
        </controller>
        <sound model="ich6">
            <alias name="ua-33bcd779-7b4a-4361-b3bf-f3760a536582"/>
        </sound>
        <memballoon model="virtio">
            <stats period="5"/>
            <alias name="ua-53ddd9b9-ad6b-4d45-af15-5edb3ac82975"/>
        </memballoon>
        <video>
            <model heads="1" ram="65536" type="qxl" vgamem="16384" vram="8192"/>
            <alias name="ua-611c2d05-001b-4ff9-9edb-173afc748793"/>
        </video>
        <controller index="0" model="qemu-xhci" ports="8" type="usb"/>
        <graphics autoport="yes" passwd="*****" passwdValidTo="1970-01-01T00:00:01" port="-1" tlsPort="-1" type="spice">
            <channel mode="secure" name="main"/>
            <channel mode="secure" name="inputs"/>
            <channel mode="secure" name="cursor"/>
            <channel mode="secure" name="playback"/>
            <channel mode="secure" name="record"/>
            <channel mode="secure" name="display"/>
            <channel mode="secure" name="smartcard"/>
            <channel mode="secure" name="usbredir"/>
            <listen network="vdsm-display" type="network"/>
        </graphics>
        <controller index="0" model="virtio-scsi" type="scsi">
            <driver iothread="1"/>
            <alias name="ua-d1a7651e-f0c7-4b1c-8589-d24a29d1c495"/>
        </controller>
        <channel type="spicevmc">
            <target name="com.redhat.spice.0" type="virtio"/>
        </channel>
        <disk device="cdrom" snapshot="no" type="file">
            <driver error_policy="report" name="qemu" type="raw"/>
            <source file="" startupPolicy="optional">
                <seclabel model="dac" relabel="no" type="none"/>
            </source>
            <target bus="sata" dev="sdc"/>
            <readonly/>
            <alias name="ua-5f022f0a-66c4-4712-bea6-1cba00244791"/>
        </disk>
        <disk device="disk" snapshot="no" type="file">
            <target bus="virtio" dev="vda"/>
            <source file="/rhev/data-center/mnt/172.30.83.204:_mnt_red01__l1__group/25fdae2d-d184-4c48-a9c3-7a792e3a5bb3/images/013d1cf5-588f-4148-9d75-2daf3e12d997/519fbd7f-b04c-430e-8df6-7d27cb2f495a">
                <seclabel model="dac" relabel="no" type="none"/>
            </source>
            <driver cache="none" error_policy="stop" io="threads" iothread="1" name="qemu" type="qcow2"/>
            <alias name="ua-013d1cf5-588f-4148-9d75-2daf3e12d997"/>
            <boot order="1"/>
            <serial>013d1cf5-588f-4148-9d75-2daf3e12d997</serial>
        </disk>
        <interface type="bridge">
            <model type="virtio"/>
            <link state="up"/>
            <source bridge="vm"/>
            <alias name="ua-c5c4cb1d-eff5-46df-8e17-0988cab1b52a"/>
            <mac address="00:1a:4a:16:34:01"/>
            <mtu size="1500"/>
            <filterref filter="vdsm-no-mac-spoofing"/>
            <bandwidth/>
        </interface>
        <interface type="bridge">
            <model type="virtio"/>
            <link state="up"/>
            <source bridge="display"/>
            <alias name="ua-ab8d097f-0bdf-466e-94ca-9d1f04fd6c98"/>
            <mac address="00:1a:4a:16:36:01"/>
            <mtu size="1500"/>
            <filterref filter="vdsm-no-mac-spoofing"/>
            <bandwidth/>
        </interface>
    </devices>
    <pm>
        <suspend-to-disk enabled="no"/>
        <suspend-to-mem enabled="no"/>
    </pm>
    <os>
        <type arch="x86_64" machine="pc-q35-rhel8.0.0">hvm</type>
        <smbios mode="sysinfo"/>
    </os>
    <metadata>
        <ns0:qos/>
        <ovirt-vm:vm>
            <ovirt-vm:minGuaranteedMemoryMb type="int">1024</ovirt-vm:minGuaranteedMemoryMb>
            <ovirt-vm:clusterVersion>4.4</ovirt-vm:clusterVersion>
            <ovirt-vm:custom/>
            <ovirt-vm:device mac_address="00:1a:4a:16:36:01">
                <ovirt-vm:custom/>
            </ovirt-vm:device>
            <ovirt-vm:device mac_address="00:1a:4a:16:34:01">
                <ovirt-vm:custom/>
            </ovirt-vm:device>
            <ovirt-vm:device devtype="disk" name="vda">
                <ovirt-vm:poolID>82771514-8edd-49de-ad13-392f04b09b95</ovirt-vm:poolID>
                <ovirt-vm:volumeID>519fbd7f-b04c-430e-8df6-7d27cb2f495a</ovirt-vm:volumeID>
                <ovirt-vm:imageID>013d1cf5-588f-4148-9d75-2daf3e12d997</ovirt-vm:imageID>
                <ovirt-vm:domainID>25fdae2d-d184-4c48-a9c3-7a792e3a5bb3</ovirt-vm:domainID>
            </ovirt-vm:device>
            <ovirt-vm:launchPaused>false</ovirt-vm:launchPaused>
            <ovirt-vm:resumeBehavior>auto_resume</ovirt-vm:resumeBehavior>
        </ovirt-vm:vm>
    </metadata>
</domain>
(In reply to Nir Soffer from comment #2)
> Ilan, can you:
> - attach output of /etc/exports on the NFS server
> - attach output of "exportfs -v" on the NFS server

[root@nfs-server ~]# cat /etc/exports
/mnt *(rw,no_subtree_check)

[root@nfs-server ~]# exportfs -v
/mnt          	<world>(sync,wdelay,hide,no_subtree_check,sec=sys,rw,secure,root_squash,no_all_squash)
Ilan, can you change the export to:

/mnt *(rw,no_subtree_check,anonuid=36,anongid=36)

And reload:

exportfs -r

I think the issue will be resolved after that, based on the same issue I had
on Fedora 30.

This does not mean this is not a bug, but we have a workaround and it should
not block testing.
(In reply to Ilan Zuckerman from comment #0)
> [root@nfs-server /]# ll -Z /mnt/red01_l1_group/25fdae2d-d184-4c48-a9c3-7a792e3a5bb3/images/013d1cf5-588f-4148-9d75-2daf3e12d997/12ab80ae-bd80-42b3-9f0a-255e68d0baf7
> -rw-rw----. 36 36 system_u:object_r:unlabeled_t:s0 /mnt/red01_l1_group/25fdae2d-d184-4c48-a9c3-7a792e3a5bb3/images/013d1cf5-588f-4148-9d75-2daf3e12d997/12ab80ae-bd80-42b3-9f0a-255e68d0baf7

unlabeled_t looks suspicious; I have these labels on a Fedora 30 host and a
Fedora 29 NFS server:

$ ls -lhZ /rhev/data-center/mnt/nfs1\:_export_2/
total 0
drwxr-xr-x. 5 vdsm kvm system_u:object_r:nfs_t:s0 48 Nov  9 01:26 55255570-983a-4d82-907a-19b964abf7ed

[nsoffer@host1 ~]$ ls -lhRZ /rhev/data-center/mnt/nfs1\:_export_2/
'/rhev/data-center/mnt/nfs1:_export_2/':
total 0
drwxr-xr-x. 5 vdsm kvm system_u:object_r:nfs_t:s0 48 Nov  9 01:26 55255570-983a-4d82-907a-19b964abf7ed

'/rhev/data-center/mnt/nfs1:_export_2/55255570-983a-4d82-907a-19b964abf7ed':
total 0
drwxr-xr-x. 2 vdsm kvm system_u:object_r:nfs_t:s0  89 Nov  9 01:25 dom_md
drwxr-xr-x. 8 vdsm kvm system_u:object_r:nfs_t:s0 270 Nov 26 05:32 images
drwxr-xr-x. 4 vdsm kvm system_u:object_r:nfs_t:s0  30 Nov  9 01:26 master

'/rhev/data-center/mnt/nfs1:_export_2/55255570-983a-4d82-907a-19b964abf7ed/dom_md':
total 3.3M
-rw-rw----. 1 vdsm kvm system_u:object_r:nfs_t:s0 1.0M Nov 26 16:37 ids
-rw-rw----. 1 vdsm kvm system_u:object_r:nfs_t:s0  16M Nov 26 13:35 inbox
-rw-rw----. 1 vdsm kvm system_u:object_r:nfs_t:s0 2.0M Nov 26 14:21 leases
-rw-r--r--. 1 vdsm kvm system_u:object_r:nfs_t:s0  482 Nov  9 01:25 metadata
-rw-rw----. 1 vdsm kvm system_u:object_r:nfs_t:s0  16M Nov 26 14:21 outbox
-rw-rw----. 1 vdsm kvm system_u:object_r:nfs_t:s0 1.3M Nov  9 01:25 xleases

'/rhev/data-center/mnt/nfs1:_export_2/55255570-983a-4d82-907a-19b964abf7ed/images':
total 0
drwxr-xr-x. 2 vdsm kvm system_u:object_r:nfs_t:s0 149 Nov  9 17:40 03bf2260-5779-4b6d-8792-d2fc486c79c0
...

'/rhev/data-center/mnt/nfs1:_export_2/55255570-983a-4d82-907a-19b964abf7ed/images/03bf2260-5779-4b6d-8792-d2fc486c79c0':
total 1.2G
-rw-rw----. 1 vdsm kvm system_u:object_r:nfs_t:s0 6.0G Nov 23 23:03 ed0d16bd-6337-4c31-b4c2-de36f7457cd4
-rw-rw----. 1 vdsm kvm system_u:object_r:nfs_t:s0 1.0M Nov  9 17:40 ed0d16bd-6337-4c31-b4c2-de36f7457cd4.lease
-rw-r--r--. 1 vdsm kvm system_u:object_r:nfs_t:s0  298 Nov  9 17:40 ed0d16bd-6337-4c31-b4c2-de36f7457cd4.meta

Let's also run restorecon on the NFS server to make sure selinux is
configured properly on the server.
(In reply to Nir Soffer from comment #7)
> Ilan, can you change the export to:
>
> /mnt *(rw,no_subtree_check,anonuid=36,anongid=36)
>
> And reload:
>
> exportfs -r
>
> I think the issue will be resolved after that, based on the same issue I
> had on Fedora 30.
>
> This does not mean this is not a bug, but we have a workaround and it
> should not block testing.

I can confirm that this issue is resolved if you change the export to
/mnt *(rw,no_subtree_check,anonuid=36,anongid=36). The VM was able to spin up.
Nir, this seems like a recurring issue with 4.4 NFS shares. Do we know the root cause? Is there an action item on our side?
This is an issue with libvirt trying to access storage as root. Probably they
changed the behavior recently.

It may be related to this change by the virt team in 4.3:

commit c54f8211dd4f98c748540378f72dd8d80cb4e2ef
Author: Martin Polednik <mpolednik>
Date:   Tue Mar 27 15:44:37 2018 +0200

    devices: enable dynamic_ownership

    Libvirt provides dynamic_ownership option. When enabled, libvirt takes
    care of ownership and permissions of endpoints required for VM run. It
    was previously disabled due to legacy shared storage issues; these
    should be fixed by now or, if any problem is discovered, fixed ASAP.

    Enabling the option allows us to drop ownership handling from VDSM for
    host devices, and allows OVMF guest vars store without additional
    permission handling code.

    The new behavior is explicitly disabled for storage code - the disk
    ownership is still handled by VDSM.

    One notable device not handled by libvirt dynamic ownership is hwrng.
    We must still handle ownership of that device ourselves.

    Change-Id: Ibfd8f67f88e361600f417aaf52625b5bf6ea1214
    Signed-off-by: Martin Polednik <mpolednik>

With this change libvirt handles ownership changes for anything but storage.
We disable dynamic ownership for storage by adding:

    <seclabel type='none' relabel='no' model='dac'/>

With this, libvirt should not modify disks, so it should not need to access
storage as root. Maybe libvirt changed the way it handles VMs when dynamic
ownership is enabled.

Accessing storage as root is possible only if you squash root to 36:36, and
this is not something we control (we don't manage the exports on the storage).

We may need to document this in the storage admin guide.
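To spell out the squashing interaction above: with the default root_squash, requests from root arrive at the server as the anonymous user (uid 65534, nobody, unless anonuid says otherwise), so a root-owned open of a 36:36 image fails; setting anonuid=36/anongid=36 makes squashed root arrive as vdsm:kvm instead. An illustrative sketch of that mapping (hypothetical helper — the real mapping lives in knfsd, not Python):

```python
def effective_uid(uid, root_squash=True, all_squash=False, anonuid=65534):
    """Return the uid the NFS server acts as for a client request,
    per the exports(5) squashing options."""
    if all_squash or (root_squash and uid == 0):
        return anonuid
    return uid

# Default export (root_squash, anonuid=65534): libvirt acting as root
# arrives as nobody and cannot open the 0660 image owned by 36:36.
print(effective_uid(0))              # 65534
# Workaround export (anonuid=36): squashed root arrives as vdsm (36).
print(effective_uid(0, anonuid=36))  # 36
# vdsm/qemu itself (36) is unaffected either way.
print(effective_uid(36))             # 36
```

This is why the anonuid=36,anongid=36 workaround from comment 7 helps even though nothing about the files' ownership changed.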
(In reply to Nir Soffer from comment #11)
> With this change libvirt handles ownership changes for anything but storage.
> We disable dynamic ownership for storage by adding:
>
>     <seclabel type='none' relabel='no' model='dac'/>
>
> With this libvirt should not modify disks so they should not need to access
> storage as root. Maybe libvirt changed the way they handle vms when dynamic
> ownership is enabled.

Well, exactly. So why the heck is it failing? IIUC the only difference in
mount settings is that per comment #6 root was squashed (I assume to
nobody?), and now you squash all including root. Good, but we don't do
anything with drives, so no one should try to touch them, except for the
qemu process under vdsm:kvm, which works since the permissions are 36:36.
So what exactly is trying to access those files? I guess we need more debug
info.
(In reply to Michal Skrivanek from comment #12)
Someone from libvirt should answer this.
Nir, do we need to open a libvirt bug for that?
(In reply to Eyal Shenitzky from comment #14)
> Nir, do we need to open a libvirt bug for that?

No, we had a libvirt bug and it should be fixed now. I don't remember the
bug number.

We can move this to ON_QA since the issue should be resolved.
Should be fixed in 4.4, let's test this.
Hi,

I am encountering this issue right now on 4.4 - https://bugzilla.redhat.com/show_bug.cgi?id=1853568

My /etc/exports on the NFS storage looks like so:

/mnt/nfs_share/storage *(rw,no_subtree_check,anonuid=36,anongid=36)

This is how the attached NFS storage looks under my node:

[root@ovirt-node1 7439af82-d83b-4659-aa3a-ed9e69385ac6]# pwd
/rhev/data-center/mnt/ovirt-storage1.home.robinopletal.com:_mnt_nfs__share_storage/c2873dcf-dd1c-4069-867c-592345e6be7d/images/7439af82-d83b-4659-aa3a-ed9e69385ac6
[root@ovirt-node1 7439af82-d83b-4659-aa3a-ed9e69385ac6]# ls -laZ
total 700420
drwxr-xr-x. 2 vdsm kvm system_u:object_r:nfs_t:s0       149 Jul  3 16:17 .
drwxr-xr-x. 7 vdsm kvm system_u:object_r:nfs_t:s0       226 Jul  3 16:54 ..
-rw-rw----. 1 vdsm kvm system_u:object_r:nfs_t:s0 716176896 Jul  3 16:34 a71bca37-81ba-40c7-8c2f-d3a644ea2729
-rw-rw----. 1 vdsm kvm system_u:object_r:nfs_t:s0   1048576 Jul  3 16:17 a71bca37-81ba-40c7-8c2f-d3a644ea2729.lease
-rw-r--r--. 1 vdsm kvm system_u:object_r:nfs_t:s0       367 Jul  3 16:17 a71bca37-81ba-40c7-8c2f-d3a644ea2729.meta

Reading through this thread I believe this should be fixed, but in that case
I have no explanation for this behaviour. Thank you for any pointers in this
regard.

All the best!
*** Bug 1853568 has been marked as a duplicate of this bug. ***
Verified on rhv-release-4.4.1-11-001.noarch

1. Create an NFS mount:

[root@yellow-vdsb /]# vi /etc/exports
/ilan_test_nfs *(rw,no_subtree_check)

[root@yellow-vdsb /]# exportfs -v
/ilan_test_nfs	<world>(sync,wdelay,hide,no_subtree_check,sec=sys,rw,secure,root_squash,no_all_squash)

[root@yellow-vdsb /]# ll ilan_test_nfs/
total 4
drwxr-xr-x. 5 36 36 4096 Jul  9 12:15 5ea779aa-6f6b-4d54-b96c-6b9e053ba31e

2. Create an NFS SD pointing to this export
3. Copy over a template disk to the newly created SD
4. Create the VM and start it

Expected: Starting the VM should succeed.
Actual: VM started, no errors.
This bugzilla is included in oVirt 4.4.1 release, published on July 8th 2020. Since the problem described in this bug report should be resolved in oVirt 4.4.1 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.