Bug 1856461
| Summary: | [OSP16.1] After a complete OC reboot it is impossible to spawn new VMs (qemu-kvm permission denied) | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Michele Baldessari <michele> |
| Component: | openstack-nova | Assignee: | OSP DFG:Compute <osp-dfg-compute> |
| Status: | CLOSED DUPLICATE | QA Contact: | OSP DFG:Compute <osp-dfg-compute> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 16.1 (Train) | CC: | cjeanner, dasmith, eglynn, jhakimra, kchamart, sbauza, sgordon, vromanso |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2020-07-15 06:32:12 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description of problem: compose: RHOS-16.1-RHEL-8-20200710.n.0 After rebooting the whole overcloud with infrared on a largish composable HA overcloud (3xctrl, 3xdb, 3xnetworker, 3xmessaging, 2xcomputes) we observe that spinning up a new VM is failing with: nova-conductor.log:2020-07-13 11:25:56.482 22 ERROR nova.scheduler.utils [req-f0bc5b86-155e-4bb4-b4f6-a57525896804 e2a8ef60be7e41e7ad3bc071842df17d 66d436466a8043aa8aac6fcc85a35ca2 - default default] [instance: fd15ea72-95d9-4f62-b982-29dec5895750] Error from last host: compute-0.redhat.local (node compute-0.redhat.local): ['Traceback (most recent call last):\n', ' File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 2437, in _build_and_run_instance\n block_device_info=block_device_info)\n', ' File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 3647, in spawn\n cleanup_instance_disks=created_disks)\n', ' File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 6473, in _create_domain_and_network\n cleanup_instance_disks=cleanup_instance_disks)\n', ' File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__\n self.force_reraise()\n', ' File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise\n six.reraise(self.type_, self.value, self.tb)\n', ' File "/usr/lib/python3.6/site-packages/six.py", line 693, in reraise\n raise value\n', ' File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 6439, in _create_domain_and_network\n post_xml_callback=post_xml_callback)\n', ' File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 6368, in _create_domain\n guest.launch(pause=pause)\n', ' File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/guest.py", line 143, in launch\n self._encoded_xml, errors=\'ignore\')\n', ' File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__\n self.force_reraise()\n', ' File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise\n six.reraise(self.type_, self.value, self.tb)\n', ' File "/usr/lib/python3.6/site-packages/six.py", line 693, in reraise\n raise value\n', ' File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/guest.py", line 138, in launch\n return self._domain.createWithFlags(flags)\n', ' File "/usr/lib/python3.6/site-packages/eventlet/tpool.py", line 190, in doit\n result = proxy_call(self._autowrap, f, *args, **kwargs)\n', ' File "/usr/lib/python3.6/site-packages/eventlet/tpool.py", line 148, in proxy_call\n rv = execute(f, *args, **kwargs)\n', ' File "/usr/lib/python3.6/site-packages/eventlet/tpool.py", line 129, in execute\n six.reraise(c, e, tb)\n', ' File "/usr/lib/python3.6/site-packages/six.py", line 693, in reraise\n raise value\n', ' File "/usr/lib/python3.6/site-packages/eventlet/tpool.py", line 83, in tworker\n rv = meth(*args, **kwargs)\n', ' File "/usr/lib64/python3.6/site-packages/libvirt.py", line 1265, in createWithFlags\n if ret == -1: raise libvirtError (\'virDomainCreateWithFlags() failed\', dom=self)\n', 'libvirt.libvirtError: internal error: process exited while connecting to monitor: libvirt: error : cannot execute binary /usr/libexec/qemu-kvm: Permission denied\n', '\nDuring handling of the above exception, another exception occurred:\n\n', 'Traceback (most recent call last):\n', ' File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 2161, in _do_build_and_run_instance\n filter_properties, request_spec)\n', ' File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 2537, in _build_and_run_instance\n instance_uuid=instance.uuid, reason=six.text_type(e))\n', 'nova.exception.RescheduledException: Build of instance fd15ea72-95d9-4f62-b982-29dec5895750 was re-scheduled: internal error: process exited while connecting to monitor: libvirt: error : cannot execute binary /usr/libexec/qemu-kvm: Permission denied\ And on compute-0 I see: 2020-07-13 17:14:07.630+0000: 3595: error : qemuMonitorIORead:511 : Unable to read from monitor: Connection reset by peer 2020-07-13 17:14:07.631+0000: 3595: error : qemuProcessReportLogError:2064 : internal error: qemu unexpectedly closed the monitor: libvirt: error : cannot execute binary /usr/libexec/qemu-kvm: Permission denied 2020-07-13 17:14:07.631+0000: 3730: error : qemuProcessReportLogError:2064 : internal error: process exited while connecting to monitor: libvirt: error : cannot execute binary /usr/libexec/qemu-kvm: Permission denied It does seem to me that selinux is hating us: [root@compute-0 libvirt]# perl -pe 's/(\d+)/localtime/e' /var/log/audit/audit.log|grep denied type=AVC msg=audit(Mon Jul 13 17:23:51 2020.703:18435): avc: denied { entrypoint } for pid=101351 comm="libvirtd" path="/usr/libexec/qemu-kvm" dev="overlay" ino=68136 scontext=system_u:system_r:svirt_t:s0:c174,c324 tcontext=system_u:object_r:container _file_t:s0:c358,c374 tclass=file permissive=0 type=AVC msg=audit(Mon Jul 13 17:23:51 2020.337:28226): avc: denied { entrypoint } for pid=145579 comm="libvirtd" path="/usr/libexec/qemu-kvm" dev="overlay" ino=68136 scontext=system_u:system_r:svirt_t:s0:c98,c1009 tcontext=system_u:object_r:container _file_t:s0:c358,c374 tclass=file permissive=0 What I don't understand is that on my node, I do see the same permission denied in libvirt.log before the problem: 2020-07-13 11:25:54.181+0000: 57688: error : qemuProcessReportLogError:2064 : internal error: qemu unexpectedly closed the monitor: libvirt: error : cannot execute binary /usr/libexec/qemu-kvm: Permission denied 2020-07-13 11:25:54.184+0000: 57695: error : qemuProcessReportLogError:2064 : internal error: process exited while connecting to monitor: libvirt: error : cannot execute binary /usr/libexec/qemu-kvm: Permission denied 2020-07-13 12:10:25.318+0000: 3595: info : libvirt version: 6.0.0, package: 17.2.module+el8.2.0+6629+3fc0f2c2 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2020-05-14-11:03:02, ) 2020-07-13 12:10:25.318+0000: 3595: info : hostname: compute-0.redhat.local 2020-07-13 12:10:25.318+0000: 3595: error : qemuMonitorIORead:511 : Unable to read from monitor: Connection reset by peer 2020-07-13 12:10:25.321+0000: 3595: error : qemuProcessReportLogError:2064 : internal error: qemu unexpectedly closed the monitor: libvirt: error : cannot execute binary /usr/libexec/qemu-kvm: Permission denied 2020-07-13 12:10:25.322+0000: 3727: error : qemuProcessReportLogError:2064 : internal error: process exited while connecting to monitor: libvirt: error : cannot execute binary /usr/libexec/qemu-kvm: Permission denied Yet at those times (11:24 and 12:10) tempest ran just fine *and* I can't find any corresponding selinux denied entries. I'll leave a NEEDINFO on me while I narrow the scope of this issue. I raise it now mainly for awareness and to see if anyone else is experiencing this.