This bug was initially created as a copy of Bug #2214454

I am copying this bug because:

Description of problem:

The must-gather pod runs the following command to get the PID of the virt-launcher process, which it then feeds into nsenter:

    oc exec -n virt-handler -- /bin/bash -c "pgrep -f 'virt-launcher .*${vmuid}'"

However, if the must-gather pod is running on the same node as the VM, pgrep also matches this `oc exec pgrep` process and returns two PIDs.

~~~
must-gather pod running in node openshift-master-orion-2

[root@dell-per7525-03 ~]# oc get pod -o wide
NAME                READY   STATUS    RESTARTS   AGE   IP            NODE                       NOMINATED NODE   READINESS GATES
must-gather-n2tzd   2/2     Running   0          16s   10.130.1.43   openshift-master-orion-2   <none>           <none>

pgrep of VM with uuid 91a7d4cf-5607-47d8-81ee-702e2837b554 which is running in the same node will give two pids:

# oc rsh must-gather-sxkkf
Defaulted container "gather" out of: gather, copy
sh-4.4# oc exec -n openshift-cnv virt-handler-25twk -- /bin/bash -c "pgrep -f 'virt-launcher .*91a7d4cf-5607-47d8-81ee-702e2837b554'"
Defaulted container "virt-handler" out of: virt-handler, virt-launcher (init)
1872042
3357272

3357272 is the oc exec process:

root     3360536  0.0  0.2 2151792 84060 pts/0    Sl+  03:03   0:00 oc exec -n openshift-cnv virt-handler-25twk -- /bin/bash -c pgrep -f 'virt-launcher .*91a7d4cf-5607-47d8-81ee-702e2837b554'
~~~

As a result, nsenter receives two PIDs and `nft list ruleset` fails, so the rules are not collected.

Version-Release number of selected component (if applicable):

OpenShift Virtualization 4.13.0

How reproducible:

100%

Steps to Reproduce:

1. Run must-gather with vms_details:

    # oc adm must-gather --image=registry.redhat.io/container-native-virtualization/cnv-must-gather-rhel9:v4.13.0 -- /usr/bin/gather --vms_details

2. Check the collected nft rules for VMs that were running on the same node as the must-gather pod. The file will be empty:

    "must-gather.local.3686294122690156338/registry-redhat-io-container-native-virtualization-cnv-must-gather-rhel9-sha256-b2193e480a95557ab4b377f3bbde6c111e7c7db2f9927dc18699debfa4d34da1/namespaces/nijin-cnv/vms/centos7-c4xa6uojyeu0osx3/virt-launcher-centos7-c4xa6uojyeu0osx3-q8llx.ruletables.txt" was empty.

Actual results:

nft rules are not collected if the VMs are running on the node where must-gather is running.

Expected results:

It should collect nft rules.

Additional info:
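
The self-match described above can be reproduced without a cluster. The sketch below is only an illustration and is not necessarily the fix that was shipped: the command lines are simplified, hypothetical stand-ins for what `pgrep -f` would see on the node, `count_matches` is a helper made up for this example, and the bracket expression `[v]irt-launcher` is a generic shell idiom assumed here. `grep -E` is used in place of pgrep because both take extended regular expressions.

```shell
#!/bin/bash
# Simplified, hypothetical command lines as "pgrep -f" would see them on the node.
uid='91a7d4cf-5607-47d8-81ee-702e2837b554'
launcher_cmd="virt-launcher --uid ${uid}"
naive_exec_cmd="oc exec -n openshift-cnv virt-handler-25twk -- /bin/bash -c pgrep -f 'virt-launcher .*${uid}'"
safe_exec_cmd="oc exec -n openshift-cnv virt-handler-25twk -- /bin/bash -c pgrep -f '[v]irt-launcher .*${uid}'"

# Hypothetical helper: count how many of the given command lines match a pattern.
count_matches() {
    # $1 = extended regex, remaining arguments = command lines to test
    local pattern=$1
    shift
    printf '%s\n' "$@" | grep -E -c -- "$pattern"
}

# Plain pattern: the oc exec command line contains the pattern text, so it matches too.
naive_count=$(count_matches "virt-launcher .*${uid}" "$launcher_cmd" "$naive_exec_cmd")

# Bracketed pattern: the regex still matches "virt-launcher", but the oc exec
# command line now contains the literal text "[v]irt-launcher", which does not match.
safe_count=$(count_matches "[v]irt-launcher .*${uid}" "$launcher_cmd" "$safe_exec_cmd")

echo "plain pattern matches: ${naive_count}"      # prints 2
echo "bracketed pattern matches: ${safe_count}"   # prints 1
```

With the plain pattern both the launcher and the `oc exec` line match, which is the two-PID result shown in the description. With the bracketed pattern only the launcher line matches.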
Please backport the fix.
Tested with CNV v4.11.5-64 with the following steps:

1. Created a Fedora VM and ran it.

2. Found the node on which the VM is running (for example, worker1):

    # oc get vmi

3. Ran the must-gather pod on the node obtained in step 2 while the VM was running:

    # oc adm must-gather --node-name=<worker1> --image=registry.redhat.io/container-native-virtualization/cnv-must-gather-rhel8@sha256:82055386739d09788c5ad9fa70fed7d6f62fcea91a2a9a1dedec86144bfbaf6b -- /usr/bin/gather --vms_details

4. Listed the contents of ruletables.txt from the collected must-gather output:

    # cat must-gather.local.3490267327958448045/registry-redhat-io-container-native-virtualization-cnv-must-gather-rhel8-sha256-82055386739d09788c5ad9fa70fed7d6f62fcea91a2a9a1dedec86144bfbaf6b/namespaces/default/vms/fedora-cute-bear/virt-launcher-fedora-cute-bear-gc5pd.ruletables.txt
    table ip filter {
            chain INPUT {
                    type filter hook input priority filter; policy accept;
            }
    ....

The contents of ruletables.txt are present and not empty.
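
When many VMs are involved, the check in step 4 can be done for all of them at once. This is a minimal sketch, assuming the collected files keep the `*.ruletables.txt` suffix seen above. `find_empty_ruletables` is a hypothetical helper name and is not part of must-gather.

```shell
#!/bin/bash
# Hypothetical helper: print every empty *.ruletables.txt file under a
# must-gather output directory. No output means no empty files were found.
find_empty_ruletables() {
    # $1 = top-level must-gather output directory
    find "$1" -type f -name '*.ruletables.txt' -size 0
}

# Run the check only when a directory is passed on the command line.
if [ -n "${1:-}" ]; then
    find_empty_ruletables "$1"
fi
```

Saved as a script, it would be called with the must-gather output directory as its only argument, for example must-gather.local.3490267327958448045.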
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Virtualization 4.11.5 Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2023:4271