Bug 2068269 - HPP cleanup-pool-hpp-csi pods are not deleted
Summary: HPP cleanup-pool-hpp-csi pods are not deleted
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Storage
Version: 4.10.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.10.3
Assignee: Alexander Wels
QA Contact: Jenia Peimer
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-03-24 19:04 UTC by Jenia Peimer
Modified: 2022-07-20 16:01 UTC
CC: 4 users

Fixed In Version: 4.10.3-4, 4.11.0-441
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-07-20 16:01:13 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
HPP Operator log (107.08 KB, text/plain)
2022-03-24 19:04 UTC, Jenia Peimer


Links
System ID Private Priority Status Summary Last Updated
Github kubevirt hostpath-provisioner-operator pull 234 0 None Merged Fix cleanup jobs not being removed when node selector changes 2022-06-17 12:45:42 UTC
Github kubevirt hostpath-provisioner-operator pull 237 0 None Merged [release-v0.12] Fix cleanup jobs not being removed when node selector changes 2022-05-23 17:01:32 UTC
Red Hat Product Errata RHEA-2022:5675 0 None None None 2022-07-20 16:01:27 UTC

Description Jenia Peimer 2022-03-24 19:04:18 UTC
Created attachment 1868176 [details]
HPP Operator log

Description of problem:
After updating the HPP CR (with a pvcTemplate) to run HPP on a specific node, the hpp-pool pods on the other nodes are deleted and cleanup-pool-hpp-csi pods are created to perform the cleanup. However, the cleanup pods remain in Completed status and are never deleted.

Version-Release number of selected component (if applicable):
4.10

How reproducible:
Always

Steps to Reproduce:

1. Update Node c01-jp410-3-s69ln-worker-0-f2hrj:
{'metadata': {'labels': {'hpp-key': 'hpp-val1'}, 'name': 'c01-jp410-3-s69ln-worker-0-f2hrj'}}

2. Update HostPathProvisioner hostpath-provisioner:
{'spec': {'workload': {'nodeSelector': {'hpp-key': 'hpp-val1'}}}, 'metadata': {'name': 'hostpath-provisioner'}}

3. Wait until only one hpp-pool pod is running

4. Restore HostPathProvisioner hostpath-provisioner:
{'spec': {'workload': {'nodeSelector': {'hpp-key': None}}}, 'metadata': {'name': 'hostpath-provisioner'}}

5. Restore Node c01-jp410-3-s69ln-worker-0-f2hrj:
{'metadata': {'labels': {'hpp-key': None}, 'name': 'c01-jp410-3-s69ln-worker-0-f2hrj'}}


Actual results:
cleanup-pool-hpp-csi pods are Completed, but not deleted

$ oc get pods -n openshift-cnv | grep hpp
cleanup-pool-hpp-csi-pvc-block-c01-jp410-3-s69ln-worker-0-9xnxc   0/1     Completed   0              3h1m
cleanup-pool-hpp-csi-pvc-block-c01-jp410-3-s69ln-worker-0-stcwp   0/1     Completed   0              3h1m
hpp-pool-hpp-csi-pvc-block-c01-jp410-3-s69ln-worker-0-f2hr4svnk   1/1     Running     0              26s
hpp-pool-hpp-csi-pvc-block-c01-jp410-3-s69ln-worker-0-g2pr64b9c   1/1     Running     0              26s
hpp-pool-hpp-csi-pvc-block-c01-jp410-3-s69ln-worker-0-kdk2clx4x   1/1     Running     0              26s

$ oc logs -n openshift-cnv cleanup-pool-hpp-csi-pvc-block-c01-jp410-3-s69ln-worker-0-9xnxc
{"level":"info","ts":1648114968.5318494,"logger":"mounter","msg":"Go Version: go1.16.12"}
{"level":"info","ts":1648114968.5318856,"logger":"mounter","msg":"Go OS/Arch: linux/amd64"}
{"level":"error","ts":1648114969.56956,"logger":"mounter","msg":"unable to determine filesystem type on device","error":"exit status 32","stacktrace":"runtime.main\n\t/usr/lib/golang/src/runtime/proc.go:225"}
{"level":"info","ts":1648114969.5696182,"logger":"mounter","msg":"Output","out":"umount: /var/hpp-csi-pvc-block/csi: not mounted.\n"}

Expected results:
cleanup-pool-hpp-csi pods are deleted

Additional info:
Manually deleting the cleanup pods is not recommended because it may interfere with CR deletion/installation.

Comment 1 Alexander Wels 2022-03-24 19:36:58 UTC
A quick look at the code indicates the cleanup job is only deleted when the HPP CR is deleted. It should, of course, also delete the cleanup job after it finishes when the node selector changes.
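As a rough illustration of the fix described above (a hypothetical sketch, not the actual operator code from PR 234; the type and function names here are invented), the reconciler's deletion condition needs to cover the finished-job case in addition to CR deletion:

```go
package main

import "fmt"

// JobStatus mirrors the one field of a Kubernetes batch/v1 Job status that
// matters here (simplified, hypothetical type for illustration).
type JobStatus struct {
	Succeeded int32 // number of pods that ran to completion
}

// shouldDeleteCleanupJob sketches the corrected condition: before the fix,
// the operator only removed cleanup jobs when the HPP CR itself was being
// deleted; after the fix, a job is also removed once it has completed
// (e.g. after a node-selector change shrank the pool).
func shouldDeleteCleanupJob(s JobStatus, crBeingDeleted bool) bool {
	if crBeingDeleted {
		return true // pre-existing behavior: clean up on CR deletion
	}
	return s.Succeeded > 0 // the fix: also reap finished cleanup jobs
}

func main() {
	fmt.Println(shouldDeleteCleanupJob(JobStatus{Succeeded: 1}, false)) // finished job: delete
	fmt.Println(shouldDeleteCleanupJob(JobStatus{Succeeded: 0}, false)) // still running: keep
}
```

With a condition like this in the reconcile loop, the Completed cleanup-pool-hpp-csi pods shown in the listing above would be garbage-collected together with their jobs instead of lingering.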

Comment 5 Jenia Peimer 2022-06-26 08:36:38 UTC
Verified on CNV 4.10.3-4, CNV 4.11.0-441

Comment 12 errata-xmlrpc 2022-07-20 16:01:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Virtualization 4.10.3 Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2022:5675

