Bug 2043028 - the CSI-Addons sidecar is not automatically deployed, requires enabling in Rook ConfigMap
Summary: the CSI-Addons sidecar is not automatically deployed, requires enabling in Ro...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: ocs-operator
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ODF 4.10.0
Assignee: yati padia
QA Contact: Jilju Joy
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-01-20 14:03 UTC by Jilju Joy
Modified: 2023-08-09 17:00 UTC

Fixed In Version: 4.10.0-132
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-04-13 18:51:56 UTC
Embargoed:


Links
System                  ID                                      Private  Priority  Status  Summary                                  Last Updated
Github                  red-hat-storage ocs-operator pull 1458  0        None      open    enables csi-addons sidecar in configmap  2022-01-21 12:35:50 UTC
Red Hat Product Errata  RHSA-2022:1372                          0        None      None    None                                     2022-04-13 18:52:08 UTC

Description Jilju Joy 2022-01-20 14:03:08 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
ReclaimSpaceJob failed due to the error "Controller and Node Client not found".
The PVC is in Bound state and the app-pod where the PVC is attached is in Running state.

$ oc get ReclaimSpaceJob reclaim-pvcrbd1 -o yaml
apiVersion: csiaddons.openshift.io/v1alpha1
kind: ReclaimSpaceJob
metadata:
  creationTimestamp: "2022-01-20T13:16:12Z"
  generation: 1
  name: reclaim-pvcrbd1
  namespace: test-project
  resourceVersion: "84469"
  uid: b3507d9e-8042-4222-aad0-8bc81b1bd8ac
spec:
  backOffLimit: 10
  retryDeadlineSeconds: 900
  target:
    persistentVolumeClaim: pvcrbd1
status:
  completionTime: "2022-01-20T13:16:17Z"
  conditions:
  - lastTransitionTime: "2022-01-20T13:16:17Z"
    message: Controller and Node Client not found
    observedGeneration: 1
    reason: failed
    status: "True"
    type: Failed
  message: Maximum retry limit reached
  reclaimedSpace: "0"
  result: Failed
  retries: 10
  startTime: "2022-01-20T13:16:12Z"


$ oc get pvc
NAME      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                  AGE
pvcrbd    Bound    pvc-31ae909a-697d-497c-a596-9e1d9906071b   1Mi        RWO            ocs-storagecluster-ceph-rbd   46m
pvcrbd1   Bound    pvc-60b66128-9d0c-449a-824f-9b4ed698a23a   5Gi        RWO            ocs-storagecluster-ceph-rbd   2m48s


============================================================
Version of all relevant components (if applicable):
ODF 4.10.0-113
OCP 4.10.0-0.nightly-2022-01-19-150530

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
Yes, RBD Reclaim Space feature is not working.

Is there any workaround available to the best of your knowledge?

Add this to the ConfigMap rook-ceph-operator-config:
data:
  CSI_ENABLE_CSIADDONS: "true"

Rakshith suggested this workaround. ReclaimSpaceJob succeeded after adding this.
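
Editorial sketch of the workaround above as a single command, not part of the original report. It assumes the ConfigMap lives in the openshift-storage namespace (the namespace used in the verification comment further down) and that `oc` is logged in to the cluster; the `NAMESPACE` and `PATCH` variable names are illustrative only.

```shell
#!/bin/sh
# Assumption: rook-ceph-operator-config is in openshift-storage.
NAMESPACE="${NAMESPACE:-openshift-storage}"

# JSON merge patch that sets only the key named in the workaround.
PATCH='{"data":{"CSI_ENABLE_CSIADDONS":"true"}}'

if command -v oc >/dev/null 2>&1; then
  # A merge patch leaves the other keys of the ConfigMap untouched.
  oc -n "$NAMESPACE" patch configmap rook-ceph-operator-config \
    --type merge -p "$PATCH"
else
  echo "oc not found; patch not applied: $PATCH"
fi
```

A merge patch is used here so that the existing entries in the ConfigMap are preserved; editing the ConfigMap by hand with `oc edit` would achieve the same result.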


============================================
Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
Yes

Can this issue be reproduced from the UI?


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Create an RBD PVC and attach it to a pod.
2. Create a ReclaimSpaceJob.
3. Check the status of the ReclaimSpaceJob.

Sample ReclaimSpaceJob yaml:
apiVersion: csiaddons.openshift.io/v1alpha1
kind: ReclaimSpaceJob
metadata:
  name: reclaim-pvcrbd1
spec:
  target:
    persistentVolumeClaim: pvcrbd1
  backOffLimit: 10
  retryDeadlineSeconds: 900
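
Editorial sketch for step 3, not part of the original report. It reads the `.status.result` field shown in the output at the top of this description. The helper names `job_result` and `describe_result` are hypothetical, the namespace and job name are taken from that output, and only the value "Failed" is known from this report, so any other value is printed without interpretation.

```shell
#!/bin/sh
# Assumptions: namespace and job name as in the output shown in this report.
NAMESPACE="${NAMESPACE:-test-project}"
JOB="${JOB:-reclaim-pvcrbd1}"

# Print the job's .status.result field.
job_result() {
  oc -n "$NAMESPACE" get ReclaimSpaceJob "$JOB" -o jsonpath='{.status.result}'
}

# Turn a result string into a short message. Only "Failed" appears in
# this report; other values are echoed back unchanged.
describe_result() {
  case "$1" in
    Failed) echo "job failed" ;;
    "")     echo "no result yet" ;;
    *)      echo "result: $1" ;;
  esac
}

if command -v oc >/dev/null 2>&1; then
  describe_result "$(job_result)"
fi
```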


Actual results:
ReclaimSpaceJob Failed

Expected results:
ReclaimSpaceJob should succeed.

Additional info:

Comment 3 Niels de Vos 2022-01-20 16:01:14 UTC
Rook is not configured to deploy Ceph-CSI with the csi-addons sidecar by default.

The ConfigMap rook-ceph-operator-config needs to have the `CSI_ENABLE_CSIADDONS: "true"` parameter set. This is a limitation currently inherited from the (upstream) Rook deployment.

It should be possible to have this adjusted by OCS-Operator. Enabling the feature by default has my support :-)

Comment 9 Jilju Joy 2022-02-16 12:13:25 UTC
Verified in version:
ODF 4.10.0-156
OCP 4.10.0-0.nightly-2022-02-15-041303
Tested in AWS

The CSI_ENABLE_CSIADDONS parameter in the ConfigMap 'rook-ceph-operator-config' is set to "true" as its default value.

$ oc -n openshift-storage get configmap rook-ceph-operator-config -o yaml
apiVersion: v1
data:
  CSI_ENABLE_CSIADDONS: "true"
  CSI_LOG_LEVEL: "5"
  CSI_PLUGIN_TOLERATIONS: |2-

    - key: node.ocs.openshift.io/storage
      operator: Equal
      value: "true"
      effect: NoSchedule
  CSI_PROVISIONER_TOLERATIONS: |2-

    - key: node.ocs.openshift.io/storage
      operator: Equal
      value: "true"
      effect: NoSchedule
kind: ConfigMap
metadata:
  creationTimestamp: "2022-02-16T06:31:17Z"
  name: rook-ceph-operator-config
  namespace: openshift-storage
  resourceVersion: "33134"
  uid: 40f327bd-6d35-4136-8593-a343db56e123

Comment 11 errata-xmlrpc 2022-04-13 18:51:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.10.0 enhancement, security & bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1372

Comment 12 Ramakrishnan Periyasamy 2022-08-17 07:10:11 UTC
Hi Jilju, Do we have any automation coverage for this BZ? If no then can we consider this part of Automation Backlogs?

Comment 13 Jilju Joy 2022-08-17 09:34:40 UTC
(In reply to Ramakrishnan Periyasamy from comment #12)
> Hi Jilju, Do we have any automation coverage for this BZ? If no then can we
> consider this part of Automation Backlogs?

We have automated ReclaimSpace tests that pass only if the value of CSI_ENABLE_CSIADDONS is 'true', so I think this bug can be considered covered by the existing tests.
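
Editorial sketch, not part of the original comment. The automated ReclaimSpace tests mentioned above are not shown in this report; as a separate and much smaller check, the setting itself could be asserted directly. The helper names `csiaddons_value` and `is_enabled` are hypothetical, and the openshift-storage namespace is assumed from comment 9.

```shell
#!/bin/sh
# Assumption: rook-ceph-operator-config is in openshift-storage.
NAMESPACE="${NAMESPACE:-openshift-storage}"

# Read the current value of the key from the operator ConfigMap.
csiaddons_value() {
  oc -n "$NAMESPACE" get configmap rook-ceph-operator-config \
    -o jsonpath='{.data.CSI_ENABLE_CSIADDONS}'
}

# Succeed only for the exact string "true".
is_enabled() {
  [ "$1" = "true" ]
}

if command -v oc >/dev/null 2>&1; then
  if is_enabled "$(csiaddons_value)"; then
    echo "CSI_ENABLE_CSIADDONS is enabled"
  else
    echo "CSI_ENABLE_CSIADDONS is not enabled" >&2
  fi
fi
```

The comparison is deliberately strict, so a misspelled or empty value is reported as not enabled.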

