Bug 2154413

Summary: VMs requiring vTPM fail to create [rhel-8.6.0.z]
Product: Red Hat Enterprise Linux 8 Reporter: RHEL Program Management Team <pgm-rhel-tools>
Component: libvirt Assignee: Michal Privoznik <mprivozn>
Status: CLOSED ERRATA QA Contact: Yanqiu Zhang <yanqzhan>
Severity: high Docs Contact:
Priority: high    
Version: 8.7 CC: acardace, danken, fdeutsch, jdenemar, kbidarka, lmen, lpivarc, mprivozn, sgott, virt-maint, yalzhang, yanqzhan, ycui, ymankad
Target Milestone: rc Keywords: AutomationTriaged, Triaged, ZStream
Target Release: ---   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: libvirt-8.0.0-5.6.module+el8.6.0+17751+d6559882 Doc Type: Bug Fix
Doc Text:
Cause: When starting or stopping the swtpm emulator or the vhost-user-gpu helper, libvirt would read their pidfile and then check /proc/[pid]/exe to confirm that the PID was still valid. This works, but it requires the CAP_SYS_PTRACE capability, which might not be available in restricted containers. Fortunately, libvirt can validate the PID stored in the pidfile without accessing /proc.
Consequence: The CAP_SYS_PTRACE capability was needed, which did not play nicely with rootless containers.
Fix: The code was changed so that the pidfile is locked when swtpm or vhost-user-gpu is started. Now, whenever libvirt wants to read the PID, it also tries to lock the pidfile. If the pidfile can be locked successfully, the process that held the lock is gone and the PID is invalid. And vice versa: if the file cannot be locked, the process is still running and the PID is valid.
Result: The swtpm PID and/or vhost-user-gpu PID can be read without the CAP_SYS_PTRACE capability.
Story Points: ---
Clone Of: 2152188 Environment:
Last Closed: 2023-01-24 14:39:35 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 2152188    
Bug Blocks:    

Comment 4 Michal Privoznik 2022-12-22 10:22:54 UTC
The old package has:
24637 03:19:00.467638 openat(AT_FDCWD, "/run/libvirt/qemu/swtpm/5-avocado-vt-vm1-swtpm.pid", O_RDONLY) = 40 <0.000019>
24637 03:19:00.467688 read(40, "26087", 21) = 5 <0.000015>
24637 03:19:00.467731 read(40, "", 16)  = 0 <0.000011>
24637 03:19:00.467769 close(40)         = 0 <0.000012>
24637 03:19:00.467806 kill(26087, 0)    = 0 <0.000021>
24637 03:19:00.467860 lstat("/proc/26087/exe", {st_mode=S_IFLNK|0777, st_size=0, ...}) = 0 <0.000194>
24637 03:19:00.468099 stat("/proc/26087/exe", {st_mode=S_IFREG|0755, st_size=38888, ...}) = 0 <0.000091>
24637 03:19:00.468228 stat("/usr/bin/swtpm", {st_mode=S_IFREG|0755, st_size=38888, ...}) = 0 <0.000014>

This corresponds to the virPidFileReadPathIfAlive() function. What's happening here is that the swtpm pid file is opened, the PID value is read (26087), and then the /proc/PID/exe symlink is resolved to see whether it still points to the swtpm binary. This last part (/proc symlink resolution) needs extra permissions (CAP_SYS_PTRACE).
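For illustration only, the sequence in that trace can be approximated in a few lines of Python. This is a rough sketch, not libvirt's C code: the name read_pid_if_alive and the use of os.path.samefile() to compare the two stat() results are assumptions made for this example.

```python
import os
from typing import Optional


def read_pid_if_alive(pidfile: str, expected_binary: str) -> Optional[int]:
    """Hypothetical analogue of the old check: trust the PID only if
    /proc/PID/exe is the same file as expected_binary."""
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        return None  # no pid file, or it does not contain a number
    try:
        os.kill(pid, 0)  # signal 0 only checks that the process exists
    except ProcessLookupError:
        return None
    except PermissionError:
        pass  # the process exists but belongs to another user
    try:
        # stat() follows the /proc/PID/exe symlink; this is the step that
        # can be denied in a restricted container.
        same = os.path.samefile(f"/proc/{pid}/exe", expected_binary)
    except OSError:
        return None
    return pid if same else None
```

When the /proc lookup is denied, this sketch returns None for a process that is in fact running, which is the kind of failure the bug describes.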

The new package has (in contrast):

27152 03:24:55.951254 openat(AT_FDCWD, "/run/libvirt/qemu/swtpm/2-avocado-vt-vm1-swtpm.pid", O_RDWR) = 38 <0.000024>
27152 03:24:55.951310 fcntl(38, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=1}) = -1 EAGAIN (Resource temporarily unavailable) <0.000018>
27152 03:24:55.951365 openat(AT_FDCWD, "/run/libvirt/qemu/swtpm/2-avocado-vt-vm1-swtpm.pid", O_RDONLY) = 39 <0.000017>
27152 03:24:55.951413 read(39, "27481", 21) = 5 <0.000016>
27152 03:24:55.951460 read(39, "", 16)  = 0 <0.000012>
27152 03:24:55.951501 close(39)         = 0 <0.000015>
27152 03:24:55.951545 kill(27481, 0)    = 0 <0.000085>

This corresponds to virPidFileReadPathIfLocked(). What's happening here is that the swtpm pid file is opened and libvirt attempts to lock it. The attempt fails (the EAGAIN errno), which means the pid file lock is owned by another process (the swtpm process) that is still running, and thus the value in the pid file is still valid. Therefore, the pid file is read (the swtpm pid is 27481). Nothing here requires special permissions, so we can conclude that the bug is fixed with the new package.
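The lock-based check can be sketched the same way. Again, this is only an approximation in Python under stated assumptions: read_pid_if_locked is a made-up name, and the real logic lives in libvirt's C code. The lock request mirrors the F_SETLK call in the trace (a non-blocking write lock on the first byte of the file).

```python
import errno
import fcntl
import os
from typing import Optional


def read_pid_if_locked(pidfile: str) -> Optional[int]:
    """Hypothetical analogue of the new check: trust the PID only if
    another process currently holds the lock on the pid file."""
    try:
        fd = os.open(pidfile, os.O_RDWR)
    except OSError:
        return None  # no pid file, so no running helper
    try:
        try:
            # Non-blocking write lock on byte 0 (l_start=0, l_len=1).
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 1, 0, os.SEEK_SET)
        except OSError as err:
            if err.errno not in (errno.EAGAIN, errno.EACCES):
                raise
            # The lock is held elsewhere: the owner is alive, so read on.
        else:
            return None  # we got the lock, so the owner is gone and the PID is stale
        with open(pidfile) as f:
            pid = int(f.read().strip())
        try:
            os.kill(pid, 0)  # signal 0 only checks that the process exists
        except ProcessLookupError:
            return None
        except PermissionError:
            pass  # the process exists but belongs to another user
        return pid
    except ValueError:
        return None  # the pid file did not contain a number
    finally:
        os.close(fd)  # closing also drops any lock this process took
```

Note that POSIX record locks are per process, so this check only gives a meaningful answer when it is called from a process other than the one holding the lock. No /proc access is involved at any point.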

Comment 5 Yanqiu Zhang 2023-01-03 02:24:29 UTC
Thanks Michal very much!

Then we mark pre-verification as PASS.

Comment 8 Yanqiu Zhang 2023-01-10 08:54:29 UTC
Verified with:
libvirt-8.0.0-5.6.module+el8.6.0+17751+d6559882.x86_64
qemu-kvm-6.2.0-11.module+el8.6.0+17576+33ee06a8.7.x86_64
swtpm-0.7.0-3.20211109gitb79fd91.module+el8.6.0+16156+d5629340.x86_64
libtpms-0.9.1-0.20211126git1ff6fe1f43.module+el8.6.0+14480+c0a3aa0f.x86_64

36397 03:13:28.109981 openat(AT_FDCWD, "/run/libvirt/qemu/swtpm/1-avocado-vt-vm1-swtpm.pid", O_RDWR) = 37 <0.000021>
36397 03:13:28.110031 fcntl(37, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=1}) = -1 EAGAIN (Resource temporarily unavailable) <0.000017>
36397 03:13:28.110083 openat(AT_FDCWD, "/run/libvirt/qemu/swtpm/1-avocado-vt-vm1-swtpm.pid", O_RDONLY) = 38 <0.000017>
36397 03:13:28.110130 read(38, "36544", 21) = 5 <0.000016>
36397 03:13:28.110176 read(38, "", 16)  = 0 <0.000012>
36397 03:13:28.110217 close(38)         = 0 <0.000026>
36397 03:13:28.110277 kill(36544, 0)    = 0 <0.000016>
36397 03:13:28.110320 close(37)         = 0 <0.000012>

Comment 12 errata-xmlrpc 2023-01-24 14:39:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: virt:rhel and virt-devel:rhel security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:0432