Created attachment 1868176 [details] HPP Operator log

Description of problem:
After updating the HPP CR with pvcTemplate to run HPP on a specific node, the hpp-pool pods running on other nodes are deleted and cleanup-pool-hpp-csi pods are created to clean up. However, the cleanup pods stay in Completed status and are never deleted.

Version-Release number of selected component (if applicable):
4.10

How reproducible:
Always

Steps to Reproduce:
1. Update Node c01-jp410-3-s69ln-worker-0-f2hrj: {'metadata': {'labels': {'hpp-key': 'hpp-val1'}, 'name': 'c01-jp410-3-s69ln-worker-0-f2hrj'}}
2. Update HostPathProvisioner hostpath-provisioner: {'spec': {'workload': {'nodeSelector': {'hpp-key': 'hpp-val1'}}}, 'metadata': {'name': 'hostpath-provisioner'}}
3. Wait until only one hpp-pool pod is running
4. Restore HostPathProvisioner hostpath-provisioner: {'spec': {'workload': {'nodeSelector': {'hpp-key': None}}}, 'metadata': {'name': 'hostpath-provisioner'}}
5. Restore Node c01-jp410-3-s69ln-worker-0-f2hrj: {'metadata': {'labels': {'hpp-key': None}, 'name': 'c01-jp410-3-s69ln-worker-0-f2hrj'}}

Actual results:
cleanup-pool-hpp-csi pods are Completed, but not deleted

$ oc get pods -n openshift-cnv | grep hpp
cleanup-pool-hpp-csi-pvc-block-c01-jp410-3-s69ln-worker-0-9xnxc   0/1   Completed   0   3h1m
cleanup-pool-hpp-csi-pvc-block-c01-jp410-3-s69ln-worker-0-stcwp   0/1   Completed   0   3h1m
hpp-pool-hpp-csi-pvc-block-c01-jp410-3-s69ln-worker-0-f2hr4svnk   1/1   Running     0   26s
hpp-pool-hpp-csi-pvc-block-c01-jp410-3-s69ln-worker-0-g2pr64b9c   1/1   Running     0   26s
hpp-pool-hpp-csi-pvc-block-c01-jp410-3-s69ln-worker-0-kdk2clx4x   1/1   Running     0   26s

$ oc logs -n openshift-cnv cleanup-pool-hpp-csi-pvc-block-c01-jp410-3-s69ln-worker-0-9xnxc
{"level":"info","ts":1648114968.5318494,"logger":"mounter","msg":"Go Version: go1.16.12"}
{"level":"info","ts":1648114968.5318856,"logger":"mounter","msg":"Go OS/Arch: linux/amd64"}
{"level":"error","ts":1648114969.56956,"logger":"mounter","msg":"unable to determine filesystem type on device","error":"exit status 32","stacktrace":"runtime.main\n\t/usr/lib/golang/src/runtime/proc.go:225"}
{"level":"info","ts":1648114969.5696182,"logger":"mounter","msg":"Output","out":"umount: /var/hpp-csi-pvc-block/csi: not mounted.\n"}

Expected results:
cleanup-pool-hpp-csi pods are deleted

Additional info:
Manual deletion of the cleanup pods is not recommended because it might affect CR deletion/installation.
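
To check for leftover cleanup pods without deleting anything, the pod listing under "Actual results" can be filtered with a short script. This is only a minimal sketch: it assumes the default `oc get pods` column order (NAME READY STATUS RESTARTS AGE), and the function name `completed_cleanup_pods` and its `prefix` argument are illustrative, not part of any HPP tooling.

```python
# Sample taken from the listing in this report.
SAMPLE = """\
cleanup-pool-hpp-csi-pvc-block-c01-jp410-3-s69ln-worker-0-9xnxc 0/1 Completed 0 3h1m
cleanup-pool-hpp-csi-pvc-block-c01-jp410-3-s69ln-worker-0-stcwp 0/1 Completed 0 3h1m
hpp-pool-hpp-csi-pvc-block-c01-jp410-3-s69ln-worker-0-f2hr4svnk 1/1 Running 0 26s
hpp-pool-hpp-csi-pvc-block-c01-jp410-3-s69ln-worker-0-g2pr64b9c 1/1 Running 0 26s
hpp-pool-hpp-csi-pvc-block-c01-jp410-3-s69ln-worker-0-kdk2clx4x 1/1 Running 0 26s
"""


def completed_cleanup_pods(oc_output, prefix="cleanup-pool-"):
    """Return names of pods that start with `prefix` and are in Completed status.

    Assumes whitespace-separated columns: NAME READY STATUS RESTARTS AGE.
    """
    names = []
    for line in oc_output.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # blank or malformed line
        name, status = fields[0], fields[2]
        if name.startswith(prefix) and status == "Completed":
            names.append(name)
    return names


leftover = completed_cleanup_pods(SAMPLE)
for pod in leftover:
    print(pod)
```

In practice the input would come from `oc get pods -n openshift-cnv`; the script only reports the pods and does not delete them.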
A quick look at the code indicates that the cleanup job is deleted only when the HPP CR is deleted. The operator should also delete the cleanup job once it has finished after a node selector change.
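
The rule suggested in this comment can be sketched as a small decision function. This is not the operator's code (the operator is written in Go and its internals are not shown in this report); `CleanupJob`, `finished`, and `split_cleanup_jobs` are hypothetical names used only to illustrate the idea that a finished cleanup job is removed regardless of what triggered it.

```python
from dataclasses import dataclass


@dataclass
class CleanupJob:
    # Hypothetical, simplified view of a cleanup job.
    name: str
    finished: bool  # True once the job has run to completion


def split_cleanup_jobs(jobs):
    """Split jobs into (to_delete, to_keep) by completion state.

    Finished jobs are deleted whether they were created by a CR deletion
    or by a node selector change; unfinished jobs are left to run.
    """
    to_delete = [job.name for job in jobs if job.finished]
    to_keep = [job.name for job in jobs if not job.finished]
    return to_delete, to_keep


jobs = [
    CleanupJob("cleanup-pool-a", finished=True),
    CleanupJob("cleanup-pool-b", finished=False),
]
to_delete, to_keep = split_cleanup_jobs(jobs)
print("delete:", to_delete)
print("keep:", to_keep)
```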
Verified on CNV 4.10.3-4, CNV 4.11.0-441
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Virtualization 4.10.3 Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2022:5675