Bug 2315651

Summary: [Provider Mode] Some pods under openshift-storage namespace do not contain 'node.ocs.openshift.io/storage' toleration
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation
Reporter: Jilju Joy <jijoy>
Component: ceph-csi-operator
Assignee: Leela Venkaiah Gangavarapu <lgangava>
Status: CLOSED ERRATA
QA Contact: Jilju Joy <jijoy>
Severity: high
Docs Contact:
Priority: unspecified
Version: 4.17
CC: lgangava, muagarwa, nberry, odf-bz-bot, omitrani, resoni
Target Milestone: ---
Target Release: ODF 4.17.0
Hardware: Unspecified
OS: Unspecified
Whiteboard: isf-provider
Fixed In Version: 4.17.0-117
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2024-10-30 14:36:09 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Jilju Joy 2024-09-30 10:40:12 UTC
Description of problem:

In the provider cluster, the "ctrlplugin" and "nodeplugin" pods do not have the toleration "node.ocs.openshift.io/storage". In the client cluster, in addition to the "ctrlplugin" and "nodeplugin" pods, the ceph-csi-controller-manager and csi-addons-controller-manager pods do not have the toleration "node.ocs.openshift.io/storage".

>>> check_toleration_on_pods()
The pod openshift-storage.cephfs.csi.ceph.com-ctrlplugin-76db9dcf5lcspz does not have toleration node.ocs.openshift.io/storage
The pod openshift-storage.cephfs.csi.ceph.com-ctrlplugin-76db9dcf5wdzxm does not have toleration node.ocs.openshift.io/storage
The pod openshift-storage.cephfs.csi.ceph.com-nodeplugin-h7qt4 does not have toleration node.ocs.openshift.io/storage
The pod openshift-storage.cephfs.csi.ceph.com-nodeplugin-ldp8t does not have toleration node.ocs.openshift.io/storage
The pod openshift-storage.cephfs.csi.ceph.com-nodeplugin-qk2nd does not have toleration node.ocs.openshift.io/storage
The pod openshift-storage.rbd.csi.ceph.com-ctrlplugin-7f75d86c95-497df does not have toleration node.ocs.openshift.io/storage
The pod openshift-storage.rbd.csi.ceph.com-ctrlplugin-7f75d86c95-5988p does not have toleration node.ocs.openshift.io/storage
The pod openshift-storage.rbd.csi.ceph.com-nodeplugin-6662j does not have toleration node.ocs.openshift.io/storage
The pod openshift-storage.rbd.csi.ceph.com-nodeplugin-jsvmd does not have toleration node.ocs.openshift.io/storage
The pod openshift-storage.rbd.csi.ceph.com-nodeplugin-l7ztl does not have toleration node.ocs.openshift.io/storage


The output above is from a provider cluster running 4.17.0-101.


The output below is from a client cluster running 4.17.0-103. In the client cluster, in addition to the "ctrlplugin" and "nodeplugin" pods, the ceph-csi-controller-manager and csi-addons-controller-manager pods do not have the toleration "node.ocs.openshift.io/storage". If use of the taint is not restricted on the client cluster, these pods need the toleration as well.

>>> check_toleration_on_pods()
The pod ceph-csi-controller-manager-74b54fc579-2qcs7 does not have toleration node.ocs.openshift.io/storage
The pod csi-addons-controller-manager-7d7b754d7-2jwcs does not have toleration node.ocs.openshift.io/storage
The pod openshift-storage.cephfs.csi.ceph.com-ctrlplugin-bff885c654hfsp does not have toleration node.ocs.openshift.io/storage
The pod openshift-storage.cephfs.csi.ceph.com-ctrlplugin-bff885c65df7qc does not have toleration node.ocs.openshift.io/storage
The pod openshift-storage.cephfs.csi.ceph.com-nodeplugin-n26j9 does not have toleration node.ocs.openshift.io/storage
The pod openshift-storage.cephfs.csi.ceph.com-nodeplugin-rqmkw does not have toleration node.ocs.openshift.io/storage
The pod openshift-storage.rbd.csi.ceph.com-ctrlplugin-846986475f-kh9cf does not have toleration node.ocs.openshift.io/storage
The pod openshift-storage.rbd.csi.ceph.com-ctrlplugin-846986475f-q64pd does not have toleration node.ocs.openshift.io/storage
The pod openshift-storage.rbd.csi.ceph.com-nodeplugin-mwnvv does not have toleration node.ocs.openshift.io/storage
The pod openshift-storage.rbd.csi.ceph.com-nodeplugin-z52bp does not have toleration node.ocs.openshift.io/storage
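
To illustrate why the missing toleration matters, here is a minimal sketch of the standard Kubernetes taint/toleration matching rules. It is not ODF or scheduler code. The taint value "true" and effect "NoSchedule" used in the usage note are assumptions about how storage nodes are tainted; this report only names the key. The function names are hypothetical.

```python
def tolerates(toleration, taint):
    """Return True if one toleration matches one taint (Kubernetes rules)."""
    # An empty or missing effect on the toleration matches any taint effect.
    if toleration.get("effect") and toleration["effect"] != taint.get("effect"):
        return False
    operator = toleration.get("operator", "Equal")  # "Equal" is the default
    key = toleration.get("key")
    if not key:
        # An empty key is only meaningful with "Exists": it tolerates every taint.
        return operator == "Exists"
    if key != taint.get("key"):
        return False
    if operator == "Exists":
        return True
    return toleration.get("value", "") == taint.get("value", "")


def can_schedule(pod_tolerations, node_taints):
    """A pod fits a node only if every NoSchedule/NoExecute taint is tolerated."""
    return all(
        any(tolerates(tol, taint) for tol in pod_tolerations)
        for taint in node_taints
        if taint.get("effect") in ("NoSchedule", "NoExecute")
    )
```

With an assumed taint of {"key": "node.ocs.openshift.io/storage", "value": "true", "effect": "NoSchedule"}, can_schedule([], [taint]) is False, so a pod without the toleration cannot be placed on a tainted storage node.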

Version-Release number of selected component (if applicable):
ODF 4.17.0-103 and 4.17.0-101
OCP 4.16 and 4.17

How reproducible:
Always

Steps to Reproduce:
1. On the provider cluster, check whether the toleration "node.ocs.openshift.io/storage" is present on the pods in the openshift-storage namespace.
Example command:
% oc get pod <pod name> -n openshift-storage -o=jsonpath='{.spec.tolerations}' | jq
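
The per-pod check above can be extended to every pod in the namespace. This is a minimal sketch, not the check_toleration_on_pods() helper whose output is quoted in this report; its implementation is not shown here. The function names pods_missing_toleration and fetch_pods are hypothetical, and the script assumes the oc CLI is logged in to the cluster.

```python
import json
import subprocess

TOLERATION_KEY = "node.ocs.openshift.io/storage"


def pods_missing_toleration(pod_list, key=TOLERATION_KEY):
    """Return names of pods that lack a toleration with the given key.

    pod_list is the parsed JSON document printed by `oc get pods -o json`.
    """
    missing = []
    for pod in pod_list.get("items", []):
        # spec.tolerations may be absent or null on a pod.
        tolerations = pod.get("spec", {}).get("tolerations") or []
        if not any(t.get("key") == key for t in tolerations):
            missing.append(pod["metadata"]["name"])
    return missing


def fetch_pods(namespace="openshift-storage"):
    """Run `oc get pods -n <namespace> -o json` and parse the result."""
    result = subprocess.run(
        ["oc", "get", "pods", "-n", namespace, "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)
```

Calling pods_missing_toleration(fetch_pods()) returns the list of pod names to report. Only the toleration key is compared here; operator, value, and effect are not validated.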

Actual results:
Toleration "node.ocs.openshift.io/storage" is not present on some pods.

Expected results:
Pods should have the toleration "node.ocs.openshift.io/storage"

Additional info:

Comment 9 Sunil Kumar Acharya 2024-10-08 13:17:11 UTC
Please update the RDT flag/text appropriately.

Comment 11 errata-xmlrpc 2024-10-30 14:36:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.17.0 Security, Enhancement, & Bug Fix Update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:8676