Bug 2057157 - [4.10.0] HPP-CSI-PVC fails to bind PVC when node fqdn is long
Summary: [4.10.0] HPP-CSI-PVC fails to bind PVC when node fqdn is long
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Storage
Version: 4.10.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.11.0
Assignee: Alexander Wels
QA Contact: Jenia Peimer
URL:
Whiteboard:
Depends On:
Blocks: 2067190
 
Reported: 2022-02-22 19:49 UTC by Lukas Bednar
Modified: 2023-11-13 08:15 UTC
CC: 10 users

Fixed In Version: CNV v4.11.0-596
Doc Type: Known Issue
Doc Text:
Adding known issue to the CNV 4.10 (4.10.1) release notes per https://bugzilla.redhat.com/show_bug.cgi?id=2067190. Important: When this bug is fixed, please change the "Doc Type" to Bug Fix and replace this text with a short description of the fix. Thanks
Clone Of:
Cloned to: 2067190
Environment:
Last Closed: 2022-09-14 19:28:43 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github kubernetes-csi external-provisioner pull 717 0 None Merged fix managed-by label being too long when the node name is long. 2022-04-27 17:19:22 UTC
Github kubernetes-csi external-provisioner pull 739 0 None Merged [release-3.1] backport 'fix managed-by label being too long when the node name is long.' 2022-06-27 19:40:13 UTC
Red Hat Issue Tracker CNV-16583 0 None None None 2023-11-13 08:15:13 UTC
Red Hat Product Errata RHSA-2022:6526 0 None None None 2022-09-14 19:29:07 UTC

Description Lukas Bednar 2022-02-22 19:49:24 UTC
Description of problem:

logs from csi-provisioner container in the hostpath-provisioner-csi pod

E0222 17:52:54.088950       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1beta1.CSIStorageCapacity: failed to list *v1beta1.CSIStorageCapacity: unable to parse requirement: values[0][csi.storage.k8s.io/managed-by]: Invalid value: "external-provisioner-TRIMMED": must be no more than 63 characters

Version-Release number of selected component (if applicable):
OCP-4.10.0
CNV-4.10.0-686

How reproducible:

The issue reproduces with both the hpp-csi-basic and the hpp-csi-pvc-block storage classes, both using WaitForFirstConsumer (WFFC) bindingMode. Note that changing the bindingMode to Immediate makes it work.


Steps to Reproduce:
1. Deploy HPP-CSI on a cluster whose nodes have long FQDNs.
2. Try to bind a PVC.
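For step 2, a minimal PVC against the default hostpath-csi-basic storage class is enough to exercise the issue (the claim name and size below are illustrative, not taken from the original report):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: hpp-test-pvc           # hypothetical name
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: hostpath-csi-basic
```

Because the class uses WaitForFirstConsumer, a pod consuming the claim is also needed before binding is attempted.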

Actual results: The PVC is stuck in Pending state.


Expected results: The PVC should bind to a PV.


Additional info:

apiVersion: v1
items:
- apiVersion: hostpathprovisioner.kubevirt.io/v1beta1
  kind: HostPathProvisioner
  metadata:
    creationTimestamp: "2022-02-22T17:52:08Z"
    finalizers:
    - finalizer.delete.hostpath-provisioner
    generation: 12
    name: hostpath-provisioner
    resourceVersion: "28215"
    uid: 70dcf437-0b0b-4626-b6ae-fd405d397865
  spec:
    imagePullPolicy: IfNotPresent
    storagePools:
    - name: hpp-csi-local-basic
      path: /var/hpp-csi-local-basic
    - name: hpp-csi-pvc-block
      path: /var/hpp-csi-pvc-block
      pvcTemplate:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 5Gi
        storageClassName: local-block-hpp
        volumeMode: Block
    workload:
      nodeSelector:
        kubernetes.io/os: linux
  status:
    conditions:
    - lastHeartbeatTime: "2022-02-22T17:52:31Z"
      lastTransitionTime: "2022-02-22T17:52:31Z"
      message: Application Available
      reason: Complete
      status: "True"
      type: Available
    - lastHeartbeatTime: "2022-02-22T17:52:31Z"
      lastTransitionTime: "2022-02-22T17:52:31Z"
      status: "False"
      type: Progressing
    - lastHeartbeatTime: "2022-02-22T17:52:31Z"
      lastTransitionTime: "2022-02-22T17:52:09Z"
      status: "False"
      type: Degraded
    observedVersion: v4.10.0
    operatorVersion: v4.10.0
    storagePoolStatuses:
    - name: hpp-csi-local-basic
      phase: Ready
    - claimStatuses:
      - name: hpp-pool-5e8d1dd5
        status:
          accessModes:
          - ReadWriteOnce
          capacity:
            storage: 446Gi
          phase: Bound
      currentReady: 1
      desiredReady: 1
      name: hpp-csi-pvc-block
      phase: Ready
    targetVersion: v4.10.0
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""


---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
  creationTimestamp: "2022-02-22T17:52:08Z"
  name: hostpath-csi-basic
  resourceVersion: "27230"
  uid: 54847180-2570-4a32-9a95-4d4d208c5d69
parameters:
  storagePool: hpp-csi-local-basic
provisioner: kubevirt.io.hostpath-provisioner
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer


---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  creationTimestamp: "2022-02-22T19:02:09Z"
  name: hostpath-csi-pvc-block
  resourceVersion: "178561"
  uid: e3dead94-111d-469a-a8af-c8796eaf3d54
parameters:
  storagePool: hpp-csi-pvc-block
provisioner: kubevirt.io.hostpath-provisioner
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer

Comment 3 Alexander Wels 2022-02-23 13:59:36 UTC
The pertinent message is definitely this:

E0222 17:52:54.088950       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1beta1.CSIStorageCapacity: failed to list *v1beta1.CSIStorageCapacity: unable to parse requirement: values[0][csi.storage.k8s.io/managed-by]: Invalid value: "external-provisioner-TRIMMED": must be no more than 63 characters

Basically it is saying it is unable to watch and manage the CSIStorageCapacity object because the csi.storage.k8s.io/managed-by label value is too long. This object is created by the CSI external-provisioner, which is one of the sidecars used by the HPP CSI driver. I have opened an issue on the external-provisioner [0]

[0] https://github.com/kubernetes-csi/external-provisioner/issues/707
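The arithmetic behind the failure is simple: the external-provisioner builds the csi.storage.k8s.io/managed-by value as "external-provisioner-" plus the node name, while Kubernetes caps label values at 63 characters. A quick sketch with a made-up long FQDN (the node name below is an assumption, not from this cluster):

```shell
# The managed-by label value is "external-provisioner-<node name>".
# Kubernetes label values may be at most 63 characters, so any node
# FQDN longer than 42 characters pushes the value over the limit.
node="worker-0.very-long-subdomain.example-cluster.example.com"  # hypothetical FQDN
label="external-provisioner-${node}"
echo "${#label}"   # 77 characters, well over the 63-character limit
```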

Comment 5 Yan Du 2022-03-16 13:26:27 UTC
Is it too late to have a release note for this?

Comment 6 Yan Du 2022-03-23 13:44:50 UTC
Opened a new doc bug for this: bug 2067190

Comment 7 Alexander Wels 2022-03-31 16:57:23 UTC
external-provisioner PR is merged. Just need to get that into a release upstream and then into a release downstream.

Comment 8 Alexander Wels 2022-04-21 12:06:40 UTC
So we have to wait for upstream to release a version with the fix in it. The current latest release is 3.1.0 which does NOT contain the fix. We can't really target a release until we have the upstream release.

Comment 9 Yan Du 2022-04-27 12:04:18 UTC
Alexander, could you please reply once there is an upstream release, so we can target this bug?

Comment 13 Alexander Wels 2022-05-02 12:00:28 UTC
I will let you know when a new release happens; it likely won't happen until upstream k8s 1.25 is released.

Comment 18 Maya Rashish 2022-08-25 12:32:57 UTC
As of https://gitlab.cee.redhat.com/cpaas-midstream/openshift-virtualization/hco-bundle-registry/-/commit/e97186308828c20bb44b3e3f122dda5614f3d1e3 this is now downstream.
After some digging (the CNV version explorer being down), it seems that the CNV v4.11.0-596 bundle contains this commit.
This means it will make its way to CNV 4.11.0. Adjusting target release.

Comment 19 Jenia Peimer 2022-08-31 18:33:32 UTC
Verified on CNV v4.11.0-601

Comment 23 errata-xmlrpc 2022-09-14 19:28:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Virtualization 4.11.0 Images security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:6526

