Description of problem:

I deployed the overcloud according to the following document:
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/17.1/html/configuring_the_compute_service_for_instance_creation/assembly_configuring-instance-security_vgpu#assembly_configuring-compute-nodes-to-provide-emulated-TPM-devices-for-instances_TPM

Then I rebooted the compute node. After the reboot, I tried to create an instance with vTPM, but it failed due to a SELinux issue.

~~~
(central) [stack@undercloud ~]$ openstack server create --network yatanaka_network0 --image cirros-0.6.2 --flavor vtpm-flavor cirros_vtpm --host central-novacompute-1.yatanaka.example.com --os-compute-api-version 2.74

[root@central-novacompute-1 ~]# vi /var/log/containers/nova/nova-compute.log
2024-05-07 13:51:43.079 2 ERROR nova.compute.manager [req-e48b5e0f-2ced-41eb-9f14-58c8f19444d3 7dbca1b3b5d54daf96000e422f3acfda 7309ecd94e5245be928ef9e4c4ea83dc - default default] [instance: c4f226d7-03fa-4a12-bb89-22b2140c9983] Failed to build and run instance: libvirt.libvirtError: operation failed: swtpm died and reported:
====> Instance creation fails because swtpm is not running.

[root@central-novacompute-1 ~]# grep swtpm /var/log/audit/audit.log|grep AVC
type=AVC msg=audit(1715057384.359:465): avc: denied { write } for pid=4861 comm="swtpm" path="/run/libvirt/qemu/swtpm/1-instance-00000054-swtpm.pid" dev="tmpfs" ino=2705 scontext=system_u:system_r:svirt_t:s0:c135,c269 tcontext=system_u:object_r:container_ro_file_t:s0 tclass=file permissive=0
type=AVC msg=audit(1715057384.360:466): avc: denied { write } for pid=4861 comm="swtpm" name="swtpm" dev="tmpfs" ino=2704 scontext=system_u:system_r:svirt_t:s0:c135,c269 tcontext=system_u:object_r:container_ro_file_t:s0 tclass=dir permissive=0
====> The reason why swtpm couldn't start was the above SELinux error.
====> Because /run/libvirt/qemu/swtpm/1-instance-00000054-swtpm.pid is container_ro_file_t, swtpm cannot write to the file.

[root@central-novacompute-1 ~]# podman exec -it nova_virtqemud mount |grep run
tmpfs on /run type tmpfs (rw,nosuid,nodev,seclabel,size=9812140k,nr_inodes=819200,mode=755,inode64)
none on /run/credentials/systemd-tmpfiles-setup-dev.service type ramfs (ro,nosuid,nodev,noexec,relatime,seclabel,mode=700)
none on /run/credentials/systemd-sysctl.service type ramfs (ro,nosuid,nodev,noexec,relatime,seclabel,mode=700)
none on /run/credentials/systemd-tmpfiles-setup.service type ramfs (ro,nosuid,nodev,noexec,relatime,seclabel,mode=700)
tmpfs on /run/netns type tmpfs (rw,nosuid,nodev,seclabel,size=9812140k,nr_inodes=819200,mode=755,inode64)
tmpfs on /run/user/0 type tmpfs (rw,nosuid,nodev,relatime,seclabel,size=3228348k,nr_inodes=807087,mode=700,inode64)
tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,seclabel,size=3228348k,nr_inodes=807087,mode=700,uid=1000,gid=1000,inode64)
tmpfs on /run/systemd/journal/dev-log type tmpfs (rw,nosuid,nodev,seclabel,size=9812140k,nr_inodes=819200,mode=755,inode64)
tmpfs on /run/libvirt type tmpfs (rw,nosuid,nodev,seclabel,size=9812140k,nr_inodes=819200,mode=755,inode64)
tmpfs on /run/secrets type tmpfs (rw,seclabel,size=9812140k,nr_inodes=819200,mode=755,inode64)

[root@central-novacompute-1 ~]# podman exec -it nova_virtqemud ls -lZd /run/libvirt/qemu/
drwxr-xr-x. 7 qemu qemu system_u:object_r:container_ro_file_t:s0 180 May 7 13:51 /run/libvirt/qemu/
===> I can see that the SELinux label of /run/libvirt/qemu/ is container_ro_file_t
~~~

If I restart tripleo_nova_virtlogd_wrapper.service, the SELinux context of /run/libvirt/qemu/ changes to container_file_t and instance creation works.

~~~
[root@central-novacompute-1 ~]# systemctl restart tripleo_nova_virtlogd_wrapper.service

[root@central-novacompute-1 ~]# podman exec -it nova_virtqemud ls -lZd /run/libvirt/qemu/
drwxr-xr-x. 7 qemu qemu system_u:object_r:container_file_t:s0 180 May 7 13:51 /run/libvirt/qemu/

(central) [stack@undercloud ~]$ openstack server create --network yatanaka_network0 --image cirros-0.6.2 --flavor vtpm-flavor cirros_vtpm --host central-novacompute-1.yatanaka.example.com --os-compute-api-version 2.74

(central) [stack@undercloud ~]$ openstack server list
+--------------------------------------+-----------------------------------+---------+----------------------------------------------------+--------------------------+----------------+
| ID                                   | Name                              | Status  | Networks                                           | Image                    | Flavor         |
+--------------------------------------+-----------------------------------+---------+----------------------------------------------------+--------------------------+----------------+
| a28a531d-7ea2-4e03-ad8b-c81cde5f3a46 | cirros_vtpm                       | ACTIVE  | yatanaka_network0=192.168.0.235                    | cirros-0.6.2             | vtpm-flavor    |

[root@central-novacompute-1 ~]# podman exec -it nova_virtqemud ls -laZ /run/libvirt/qemu/swtpm/
total 4
drwxrwx---. 2 qemu tss system_u:object_r:container_file_t:s0 80 May 7 14:46 .
drwxr-xr-x. 7 qemu qemu system_u:object_r:container_file_t:s0 220 May 7 14:46 ..
-rw-r--r--. 1 root root system_u:object_r:container_file_t:s0 4 May 7 14:46 1-instance-00000060-swtpm.pid
srw-------. 1 qemu qemu system_u:object_r:svirt_image_t:s0:c728,c755 0 May 7 14:46 1-instance-00000060-swtpm.sock
~~~

If we restart nova_virtqemud, the SELinux context becomes container_ro_file_t again and instance creation fails.

~~~
[root@central-novacompute-1 ~]# systemctl restart tripleo_nova_virtqemud.service

[root@central-novacompute-1 ~]# podman exec -it nova_virtqemud ls -laZ /run/libvirt/qemu/swtpm/
total 4
drwxrwx---. 2 qemu tss system_u:object_r:container_ro_file_t:s0 80 May 7 13:58 .
drwxr-xr-x. 7 qemu qemu system_u:object_r:container_ro_file_t:s0 220 May 7 14:04 ..
-rw-r--r--. 1 root root system_u:object_r:container_ro_file_t:s0 4 May 7 13:58 2-instance-0000005a-swtpm.pid
srw-------. 1 qemu qemu system_u:object_r:container_ro_file_t:s0 0 May 7 13:58 2-instance-0000005a-swtpm.sock
~~~

It seems that starting the nova_virtqemud container is what triggers the change of the SELinux label, but I couldn't find the root cause of the issue. I'm wondering whether this is a RHEL (kernel, podman) issue or an RHOSP issue.

Version-Release number of selected component (if applicable):
RHOSP 17.1.2

How reproducible:

Steps to Reproduce:
1. Deploy the overcloud according to https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/17.1/html/configuring_the_compute_service_for_instance_creation/assembly_configuring-instance-security_vgpu#assembly_configuring-compute-nodes-to-provide-emulated-TPM-devices-for-instances_TPM
2. Create an instance with vTPM. This succeeds.
3. Reboot the compute node and create a new instance with vTPM on the compute node. This fails.
4. Restart tripleo_nova_virtlogd_wrapper.service and create an instance with vTPM. This succeeds.
5. Restart tripleo_nova_virtqemud.service and create an instance with vTPM. This fails.

Actual results:
Instance creation fails. The SELinux label of /run/libvirt/qemu/swtpm/X-instance-XXXXXXXXXX-swtpm.pid is changed to container_ro_file_t when nova_virtqemud starts.

Expected results:
Instance creation succeeds. The SELinux label of /run/libvirt/qemu/swtpm/X-instance-XXXXXXXXXX-swtpm.pid is always container_file_t.

Additional info:
I found the following BZs which mention the container_ro_file_t label, but they sound a bit different from this issue.
- https://bugzilla.redhat.com/show_bug.cgi?id=2122239
- https://bugzilla.redhat.com/show_bug.cgi?id=2219795
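The manual label check in the description could be scripted roughly as follows. This is only a sketch: the function names are hypothetical helpers (not part of RHOSP, podman, or libvirt), and it assumes the `ls -lZd` output layout shown in the transcripts above, where the SELinux context is the fifth whitespace-separated field.

```shell
#!/bin/sh
# Hypothetical helpers for checking the label of /run/libvirt/qemu/.
# Assumption: input lines look like the `ls -lZd` output in this report.

# Print the SELinux context (5th field) from one `ls -lZd` output line.
context_of_ls_line() {
    printf '%s\n' "$1" | awk '{print $5}'
}

# Print the type (3rd colon-separated field) of a context such as
# system_u:object_r:container_ro_file_t:s0
selinux_type() {
    printf '%s\n' "$1" | cut -d: -f3
}

# Succeed only if the context carries the container_file_t type,
# which is the label under which instance creation worked in this report.
is_writable_label() {
    [ "$(selinux_type "$1")" = "container_file_t" ]
}

# Example use on a compute node (not executed here):
#   line=$(podman exec nova_virtqemud ls -lZd /run/libvirt/qemu/)
#   ctx=$(context_of_ls_line "$line")
#   is_writable_label "$ctx" || echo "unexpected label: $ctx"
```

Running such a check after each container restart would show which restart flips the label, without having to create an instance each time.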
Hi,

So, as you can see, the problem is that the /var/log/swtpm location inside the container is labelled incorrectly as "container_ro_file_t". It should be "container_file_t".

I think this is already solved in some update. I'm CCing a couple of colleagues (Julie and Bogdan) who might know which update it is.

Also, this was previously discussed extensively in this bug that I filed in the past:

https://bugzilla.redhat.com/show_bug.cgi?id=2093956 -- 'swtpm' binary is denied write/"append" permissions to log files under /var/log/swtpm/
From the SELinux side, as investigated in the other bug Kashyap linked, there isn't much that can be done, since svirt_t can already write to container_file_t [1] and, as indicated in the description here, once the label is correct the instance can start.

It looks like an ordering issue based on which container starts first... A deployment SME should be able to confirm what is going on, and whether there was a related patch.

[1] https://github.com/redhat-openstack/openstack-selinux/commit/61b604b10af6315bb570b71776b8ccdec884222
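To see at a glance which source and target types are involved in a denial, the AVC records from the description can be reduced to their type fields. The following is a minimal sketch; `avc_types` is a hypothetical helper, and it assumes the record layout shown in the audit.log excerpt in this report (scontext, then tcontext, then tclass).

```shell
#!/bin/sh
# Hypothetical helper: reduce one AVC record to "source_type target_type class".
# Assumption: scontext=, tcontext= and tclass= appear in that order,
# as in the audit.log lines quoted in this bug.
avc_types() {
    printf '%s\n' "$1" | sed -n \
        's/.*scontext=[^:]*:[^:]*:\([^:]*\):.* tcontext=[^:]*:[^:]*:\([^:]*\):.* tclass=\([^ ]*\).*/\1 \2 \3/p'
}

# Example use on a compute node (not executed here):
#   grep swtpm /var/log/audit/audit.log | grep AVC | while IFS= read -r rec; do
#       avc_types "$rec"
#   done
```

For the denials quoted in the description, the target type is container_ro_file_t rather than container_file_t, which is consistent with the label, not the policy, being the problem.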
Please try the pushed fix 451875
Hello Kashyap & Bogdan,

Thanks for the follow-up. I am assuming that my customer should be fine if the fix is included in the 17.1.4 release, because they haven't responded since 5th July and the support case was auto-closed on 15th July.

Regards,
Partheeban
*** Bug 2250047 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: RHOSP 17.1.4 (openstack-tripleo-heat-templates) security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2024:9978
*** Bug 2331316 has been marked as a duplicate of this bug. ***