Bug 2121091

Summary: [4.11.0] HPP CR cleanup jobs can't complete when hpp-pool mount wasn't successful
Product: Container Native Virtualization (CNV) Reporter: Yan Du <yadu>
Component: Storage Assignee: Alexander Wels <awels>
Status: CLOSED WONTFIX QA Contact: Jenia Peimer <jpeimer>
Severity: medium Docs Contact:
Priority: medium    
Version: 4.11.0 CC: alitke, awels, jpeimer, ngavrilo, yadu
Target Milestone: ---   
Target Release: 4.11.4   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 2118273 Environment:
Last Closed: 2023-01-18 14:20:13 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 2118273    
Bug Blocks:    

Description Yan Du 2022-08-24 12:32:18 UTC
+++ This bug was initially created as a clone of Bug #2118273 +++

Description of problem:
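We installed an HPP CR with a pvcTemplate based on a CEPH storage class. CEPH became CriticallyFull, so some hpp-pool pods could not be created. When the HPP CR is then deleted, the cleanup jobs for those pools never complete and the deletion hangs.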

Version-Release number of selected component (if applicable):
4.12, 4.11, 4.10

How reproducible:
Only when there's a problem with the underlying storage class

Steps to Reproduce:
1. Create HPP CR with pvcTemplate based on CEPH
2. Use all CEPH storage
3. Delete HPP CR

Actual results:
$ oc get pods -A | grep hpp
openshift-cnv    hpp-pool-29ab9406-85bc665cdb-wqz7j   1/1  Running  0  46h
openshift-cnv    hpp-pool-4356e54b-7ccf5c44d-95tkr    1/1  Running  0  46h
openshift-cnv    hpp-pool-7dfd761c-6ffd959c85-tfqds   0/1  ContainerCreating  0 19m

$ oc delete hostpathprovisioner hostpath-provisioner
hostpathprovisioner.hostpathprovisioner.kubevirt.io "hostpath-provisioner" deleted
(STUCK)

$ oc get jobs -n openshift-cnv 
NAME                    COMPLETIONS   DURATION   AGE
cleanup-pool-4dd1b8bf   0/1           13m        13m
cleanup-pool-d1954b6a   0/1           13m        13m
cleanup-pool-edb68ab8   1/1           6s         13m


Expected results:
The HPP CR is deleted.


Additional info:
Workaround: delete the cleanup pods manually. They are recreated and then complete successfully.
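
The workaround could be scripted roughly as follows. This is a minimal sketch, not a supported procedure: the function names incomplete_jobs and print_workaround are hypothetical, and it assumes the cleanup pods carry the standard job-name label that Kubernetes sets on pods created by a Job. It only prints the delete commands so they can be reviewed before being run.

```shell
# List jobs whose COMPLETIONS column (e.g. "0/1") shows fewer
# successes than required. Reads `oc get jobs` table output on stdin.
incomplete_jobs() {
  awk 'NR > 1 && NF >= 2 { split($2, c, "/"); if (c[1] + 0 < c[2] + 0) print $1 }'
}

# Print (do not run) one delete command per incomplete job.
# Assumption: the job's pods are labelled job-name=<job name>.
print_workaround() {
  ns="$1"
  incomplete_jobs | while read -r job; do
    echo "oc delete pod -n $ns -l job-name=$job"
  done
}
```

Example use against the namespace from this report:
$ oc get jobs -n openshift-cnv | print_workaround openshift-cnv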


HPP CR:
apiVersion: hostpathprovisioner.kubevirt.io/v1beta1
kind: HostPathProvisioner
metadata:
  name: hostpath-provisioner
spec:
  imagePullPolicy: IfNotPresent
  storagePools: 
    - name: hpp-csi-local-basic
      path: "/var/hpp-csi-local-basic"
    - name: hpp-csi-pvc-block
      pvcTemplate: 
        volumeMode: Block
        storageClassName: ocs-storagecluster-ceph-rbd
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 100Gi
      path: "/var/hpp-csi-pvc-block"
  workload:
    nodeSelector:
      kubernetes.io/os: linux

Comment 2 Adam Litke 2023-01-18 14:20:13 UTC
Given the relatively low severity of this issue, I think we can avoid z-stream backports. Therefore, I am closing this and will retarget the parent bug to the current release under development (4.13).