Created attachment 1873398 [details] vm gather with namespace specified. Description of problem: On running gather_vm_details against 4.11, we are seeing the following warning on console: =================== Warning: would violate PodSecurity "restricted:latest": allowPrivilegeEscalation != false (containers "gather", "copy" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (containers "gather", "copy" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or containers "gather", "copy" must set securityContext.runAsNonRoot=true), seccompProfile (pod or containers "gather", "copy" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost") =================== And rulestable, qemu log information are no longer getting collected. Version-Release number of selected component (if applicable): 4.11.0-224 How reproducible: 100% Steps to Reproduce: 1. Run vm gather commmand: oc adm must-gather --image=registry.redhat.io/container-native-virtualization/cnv-must-gather-rhel8@sha256:e49438a284e8e9121137132f3810ecf49909892574668cb9cd2b19b8b46d08a0 --dest-dir=/tmp/pytest-EX4WtwouaHUoyZacyicyzk/must_gather_test_vm_gather_alternate_namespace0 -- NS=must-gather-alternate gather_vms_details or oc adm must-gather --image=registry.redhat.io/container-native-virtualization/cnv-must-gather-rhel8@sha256:e49438a284e8e9121137132f3810ecf49909892574668cb9cd2b19b8b46d08a0 --dest-dir=/tmp/pytest-EX4WtwouaHUoyZacyicyzk/must_gather_scope_module1 -- gather_vms_details 2. 3. Actual results: rulestable and qemu logs collected are empty. Following error message is seen on mustgather output: ======================= [must-gather-9vccn] POD 2022-04-18T20:43:26.119439364Z cat: /var/log/libvirt/qemu/must-gather-alternate_must-gather-vm-2-1650314512-3492553.log: Permission denied [must-gather-9vccn] POD 2022-04-18T20:43:26.128515074Z command terminated with exit code 1 [must-gather-9vccn] POD 2022-04-18T20:43:26.193723643Z cat: /var/log/libvirt/qemu/must-gather-alternate_must-gather-vm-3-1650314512-5202754.log: Permission denied [must-gather-9vccn] POD 2022-04-18T20:43:26.195080289Z command terminated with exit code 1 [must-gather-9vccn] POD 2022-04-18T20:43:26.277463123Z cat: /var/log/libvirt/qemu/must-gather-alternate_must-gather-vm-0-1650314511-7316048.log: Permission denied [must-gather-9vccn] POD 2022-04-18T20:43:26.289674838Z command terminated with exit code 1 [must-gather-9vccn] POD 2022-04-18T20:43:26.322750933Z cat: /var/log/libvirt/qemu/must-gather-alternate_must-gather-vm-1-1650314511-9379501.log: Permission denied [must-gather-9vccn] POD 2022-04-18T20:43:26.325656916Z command terminated with exit code 1 [must-gather-9vccn] POD 2022-04-18T20:43:26.370199606Z cat: /var/log/libvirt/qemu/must-gather-alternate_must-gather-vm-4-1650314512-6850836.log: Permission denied [must-gather-9vccn] POD 2022-04-18T20:43:26.371093484Z command terminated with exit code 1 [must-gather-9vccn] POD 2022-04-18T20:43:27.814192970Z Defaulted container "virt-handler" out of: virt-handler, virt-launcher (init) [must-gather-9vccn] POD 2022-04-18T20:43:28.049858518Z command terminated with exit code 1 [must-gather-9vccn] POD 2022-04-18T20:43:28.178720318Z Defaulted container "virt-handler" out of: virt-handler, virt-launcher (init) [must-gather-9vccn] POD 2022-04-18T20:43:28.257773254Z Defaulted container "virt-handler" out of: virt-handler, virt-launcher (init) [must-gather-9vccn] POD 2022-04-18T20:43:28.310118977Z Defaulted container "virt-handler" out of: virt-handler, virt-launcher (init) [must-gather-9vccn] POD 2022-04-18T20:43:28.670392002Z Defaulted container "virt-handler" out of: virt-handler, virt-launcher (init) [must-gather-9vccn] POD 2022-04-18T20:43:28.911945137Z command terminated with exit code 1 [must-gather-9vccn] POD 2022-04-18T20:43:28.919012468Z command terminated with exit code 1 [must-gather-9vccn] POD 2022-04-18T20:43:28.982849444Z command terminated with exit code 1 [must-gather-9vccn] POD 2022-04-18T20:43:28.995467881Z command terminated with exit code 1 ======================= Expected results: All valid must-gather information should be collected. Additional info: Will attach vm gather output.
The issue is clearly reproducible: must-gather execute this command to fetch qemu logs: https://github.com/kubevirt/must-gather/blob/main/collection-scripts/gather_vms_details#L24 but it fails when executed against CNV 4.11 virt-launcher images: stirabos@t14s:~$ oc exec virt-launcher-testvm-r9bml -n default -c compute -- cat "/var/log/libvirt/qemu/default_virt-launcher-testvm-r9bml.log" > "virt-launcher-testvm-r9bml.qemu.log" cat: /var/log/libvirt/qemu/default_virt-launcher-testvm-r9bml.log: Permission denied command terminated with exit code 1 the reason is that now the pod is running as qemu user: stirabos@t14s:~$ oc exec virt-launcher-testvm-r9bml -n default -c compute -- whoami qemu while /var/log/libvirt is accessible only by root: stirabos@t14s:~$ oc exec virt-launcher-testvm-r9bml -n default -c compute -- ls -l "/var/log" total 88 -rw-r--r--. 1 root root 20961 Apr 14 01:52 dnf.librepo.log -rw-r--r--. 1 root root 50816 Apr 14 01:52 dnf.log -rw-r--r--. 1 root root 7809 Apr 14 01:52 dnf.rpm.log -rw-r--r--. 1 root root 600 Apr 14 01:52 hawkey.log drwx------. 3 root root 18 Apr 14 01:52 libvirt drwxr-xr-x. 1 root root 22 Apr 14 01:50 rhsm drwxr-xr-x. 3 root root 21 Apr 14 01:52 swtpm I think that this is a side effect of the NonRoot feature gate which is now enabled by default: https://github.com/kubevirt/hyperconverged-cluster-operator/pull/1846 @Stu, should this be fixed on the virt-launcher image?
The is still failing with this error against version v4.11.0-459 AssertionError: Following errors found in must-gather dict_keys(['file_not_found']). defaultdict(<class 'dict'>, {'file_not_found': {'must-gather-vm-0-1655142241-572693': ['qemu.log'], 'must-gather-vm-1-1655142241-7908125': ['qemu.log'], 'must-gather-vm-2-1655142242-0629196': ['qemu.log'], 'must-gather-vm-3-1655142242-3891873': ['qemu.log'], 'must-gather-vm-4-1655142242-6430938': ['qemu.log'] This is from must_gather output: 13:46:08 [must-gather-wqxnv] POD 2022-06-13T17:45:47.910589030Z tar: Removing leading `/' from member names 13:46:08 [must-gather-wqxnv] POD 2022-06-13T17:45:47.910589030Z tar: /var/log/libvirt/qemu: Warning: Cannot stat: Permission denied 13:46:08 [must-gather-wqxnv] POD 2022-06-13T17:45:47.910589030Z tar: /var/run/libvirt/qemu/log: Warning: Cannot stat: No such file or directory 13:46:08 [must-gather-wqxnv] POD 2022-06-13T17:45:47.995736930Z tar: Removing leading `/' from member names 13:46:08 [must-gather-wqxnv] POD 2022-06-13T17:45:47.996125170Z tar: /var/log/libvirt/qemu: Warning: Cannot stat: Permission denied 13:46:08 [must-gather-wqxnv] POD 2022-06-13T17:45:47.996125170Z tar: /var/run/libvirt/qemu/log: Warning: Cannot stat: No such file or directory 13:46:08 [must-gather-wqxnv] POD 2022-06-13T17:45:48.111512128Z tar: Removing leading `/' from member names 13:46:08 [must-gather-wqxnv] POD 2022-06-13T17:45:48.111512128Z tar: /var/log/libvirt/qemu: Warning: Cannot stat: Permission denied 13:46:08 [must-gather-wqxnv] POD 2022-06-13T17:45:48.111512128Z tar: /var/run/libvirt/qemu/log: Warning: Cannot stat: No such file or directory 13:46:08 [must-gather-wqxnv] POD 2022-06-13T17:45:48.141320316Z tar: Removing leading `/' from member names 13:46:08 [must-gather-wqxnv] POD 2022-06-13T17:45:48.141320316Z tar: /var/log/libvirt/qemu: Warning: Cannot stat: Permission denied 13:46:08 [must-gather-wqxnv] POD 2022-06-13T17:45:48.141320316Z tar: /var/run/libvirt/qemu/log: Warning: Cannot stat: No such file or directory 13:46:08 [must-gather-wqxnv] POD 2022-06-13T17:45:48.209867896Z tar: Removing leading `/' from member names
Now we are using tar to collect the whole content of /var/run/libvirt/qemu/log/ (for rootless virt-launcher) and /var/log/libvirt/qemu/ (for backward compatibility). This provides a few advantages: - a single command can work for for the scenarios - we honor the original file names - we also collect log-rotated files (.log.0 log.1) for long running VMs Since only one of /var/run/libvirt/qemu/log/ or /var/log/libvirt/qemu/ is actually containing logs (depending on how virt-launcher is configured) the other is supposed to produce a warning that we can safely ignore. We can easily suppress tar stderr to keep the collection logs clean but this can potentially hide real problems. Moving back to ON_QA for further verification.
@stirabos Did the location and name of the qemu log changed? I still don't see any qemu related files under the previous location: ================== [cnv-qe-jenkins@c01-dbn-411-qgqm5-executor must_gather_scope_module0]$ ls /tmp/pytest-DN4RLd9XwiVKgWkAX885t8/must_gather_scope_module0/registry-redhat-io-container-native-virtualization-cnv-must-gather-rhel8-sha256-64915e55c37ab9167e96eb81998aa081c751a0dc0957c6833cb8752a99d4ae9e/namespaces/node-gather-unprivileged/vms/must-gather-vm-2-1656360982-6516314/ virt-launcher-must-gather-vm-2-1656360982-6516314-4vfbz.bridge.txt virt-launcher-must-gather-vm-2-1656360982-6516314-4vfbz.ip.txt virt-launcher-must-gather-vm-2-1656360982-6516314-4vfbz.dumpxml.xml virt-launcher-must-gather-vm-2-1656360982-6516314-4vfbz.ruletables.txt [cnv-qe-jenkins@c01-dbn-411-qgqm5-executor must_gather_scope_module0]$ Attaching the current command and output files.
With https://github.com/kubevirt/kubevirt/pull/7713 the path of the logs in virt-launcher instances changed again, we need also https://github.com/kubevirt/must-gather/pull/155 to collect also from the latest path.
Verified against build CNV-v4.11.0-566, file is getting collected: =============== [cnv-qe-jenkins@c01-db-411-8xdfz-executor must-gather-vm-2-1657737005-1774642]$ ls -lt total 36 drwxr-xr-x. 2 cnv-qe-jenkins cnv-qe-jenkins 6 Jul 13 18:32 log -rw-r--r--. 1 cnv-qe-jenkins cnv-qe-jenkins 1496 Jul 13 18:32 virt-launcher-must-gather-vm-2-1657737005-1774642-ltmz7.ruletables.txt -rw-r--r--. 1 cnv-qe-jenkins cnv-qe-jenkins 2008 Jul 13 18:32 virt-launcher-must-gather-vm-2-1657737005-1774642-ltmz7.bridge.txt -rw-r--r--. 1 cnv-qe-jenkins cnv-qe-jenkins 2179 Jul 13 18:32 virt-launcher-must-gather-vm-2-1657737005-1774642-ltmz7.ip.txt -rw-r--r--. 1 cnv-qe-jenkins cnv-qe-jenkins 9449 Jul 13 18:32 virt-launcher-must-gather-vm-2-1657737005-1774642-ltmz7.dumpxml.xml -rw-------. 1 cnv-qe-jenkins cnv-qe-jenkins 8264 Jul 13 18:32 node-gather-unprivileged_must-gather-vm-2-1657737005-1774642.log [cnv-qe-jenkins@c01-db-411-8xdfz-executor must-gather-vm-2-1657737005-1774642]$
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Virtualization 4.11.0 Images security and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:6526