Bug 2121091 - [4.11.0] HPP CR cleanup jobs can't complete when hpp-pool mount wasn't successful
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Storage
Version: 4.11.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.11.4
Assignee: Alexander Wels
QA Contact: Jenia Peimer
URL:
Whiteboard:
Depends On: 2118273
Blocks:
 
Reported: 2022-08-24 12:32 UTC by Yan Du
Modified: 2023-01-18 14:20 UTC
CC List: 5 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 2118273
Environment:
Last Closed: 2023-01-18 14:20:13 UTC
Target Upstream Version:
Embargoed:




Links:
Red Hat Issue Tracker CNV-20774 (Private: 0, Priority: None, Status: None, Last Updated: 2022-12-07 13:20:16 UTC)

Description Yan Du 2022-08-24 12:32:18 UTC
+++ This bug was initially created as a clone of Bug #2118273 +++

Description of problem:
We installed an HPP CR with a pvcTemplate based on a Ceph storage class. Ceph became CriticallyFull and some hpp-pool pods could not be created, so deleting the HPP CR hangs because its cleanup jobs never complete.
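
To confirm a pool pod is blocked by its backing storage, a quick diagnostic sketch (the pod name is the stuck one from the Actual results below; the PVC listing just shows which claims are Pending):

$ oc get pvc -n openshift-cnv
$ oc describe pod -n openshift-cnv hpp-pool-7dfd761c-6ffd959c85-tfqds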

Version-Release number of selected component (if applicable):
4.12, 4.11, 4.10

How reproducible:
Only when there's a problem with the underlying storage class

Steps to Reproduce:
1. Create an HPP CR with a pvcTemplate based on Ceph
2. Use up all of the Ceph storage
3. Delete the HPP CR (a minimal sketch of the flow follows below)
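
A minimal sketch of the repro flow, assuming the CR shown under "Additional info" is saved as hpp-cr.yaml (a hypothetical file name; step 2 can be any method that drives the Ceph pool to CriticallyFull):

$ oc apply -f hpp-cr.yaml                              # step 1: create the HPP CR
# step 2: fill the Ceph pool until it reports CriticallyFull
$ oc delete hostpathprovisioner hostpath-provisioner   # step 3: hangs while a pool pod is stuck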

Actual results:
$ oc get pods -A | grep hpp
openshift-cnv    hpp-pool-29ab9406-85bc665cdb-wqz7j   1/1  Running  0  46h
openshift-cnv    hpp-pool-4356e54b-7ccf5c44d-95tkr    1/1  Running  0  46h
openshift-cnv    hpp-pool-7dfd761c-6ffd959c85-tfqds   0/1  ContainerCreating  0 19m

$ oc delete hostpathprovisioner hostpath-provisioner
hostpathprovisioner.hostpathprovisioner.kubevirt.io "hostpath-provisioner" deleted
(STUCK)

$ oc get jobs -n openshift-cnv 
NAME                    COMPLETIONS   DURATION   AGE
cleanup-pool-4dd1b8bf   0/1           13m        13m
cleanup-pool-d1954b6a   0/1           13m        13m
cleanup-pool-edb68ab8   1/1           6s         13m


Expected results:
The HPP CR is deleted and all cleanup jobs complete.


Additional info:
As a workaround, delete the stuck cleanup pods manually; they will be recreated and will complete successfully (see the sketch below).
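
A sketch of that workaround, using the stuck job names from the output above (they will differ per environment). Pods created by a Job carry the job-name label, so deleting by label removes only the relevant cleanup pods and the Job controller recreates them:

$ oc get jobs -n openshift-cnv | grep cleanup-pool
$ oc delete pod -n openshift-cnv -l job-name=cleanup-pool-4dd1b8bf
$ oc delete pod -n openshift-cnv -l job-name=cleanup-pool-d1954b6a
# the recreated pods finish, the jobs report 1/1, and the HPP CR deletion can proceed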


HPP CR:
apiVersion: hostpathprovisioner.kubevirt.io/v1beta1
kind: HostPathProvisioner
metadata:
  name: hostpath-provisioner
spec:
  imagePullPolicy: IfNotPresent
  storagePools: 
    - name: hpp-csi-local-basic
      path: "/var/hpp-csi-local-basic"
    - name: hpp-csi-pvc-block
      pvcTemplate: 
        volumeMode: Block
        storageClassName: ocs-storagecluster-ceph-rbd
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 100Gi
      path: "/var/hpp-csi-pvc-block"
  workload:
    nodeSelector:
      kubernetes.io/os: linux

Comment 2 Adam Litke 2023-01-18 14:20:13 UTC
Given the relatively low severity of this issue I think we can avoid z-stream backports.  Therefore, I am closing this and will retarget the parent bug to the current release under development (4.13).

