Description of problem (please be as detailed as possible and provide log snippets):

This is a negative case: one deployment pod consuming a PV that was not provisioned by CSI and another deployment pod using an RBD RWO volume are scheduled on the same node, and node fencing is triggered for that node. In this case the Rook operator goes into a panic state. More details: https://github.com/rook/rook/issues/12558

Version of all relevant components (if applicable): 4.14

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)? Yes

Is there any workaround available to the best of your knowledge? No

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?

Is this issue reproducible? Yes

Can this issue reproduce from the UI?

If this is a regression, please provide more details to justify this:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
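For context, this is roughly what a PV not provisioned by CSI looks like: a statically created local PV with no spec.csi block, which is the case the fencing logic apparently did not handle. A minimal sketch; the name, StorageClass, and path below are illustrative, not taken from the reproducer:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-example               # illustrative name
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-sc           # illustrative StorageClass
  volumeMode: Filesystem
  local:
    path: /mnt/local-storage/disk1     # illustrative path on the node
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - compute-0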
Already in builds.
How can I simulate the situation where the operator panics while fencing a node whose PV is not provisioned by CSI? Can you please provide reproduction instructions? Was there any automated test that ran and caught this?
You can use LSO to create the PV (as mentioned in the upstream link) and use that PV to bind the application pod, then follow similar steps afterward; see the sketch below.

> Was there any automated test that ran and caught this?

No, it was detected by an upstream user: https://github.com/rook/rook/issues/12558
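A rough sketch of that setup, assuming LSO has already created a Filesystem-mode local PV exposed through an illustrative StorageClass named local-sc (the PVC name, image, and sizes are also illustrative):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: local-pvc                  # illustrative name
  namespace: test2
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-sc       # assumed LSO-backed StorageClass
  resources:
    requests:
      storage: 10Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: simple-app
  namespace: test2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: simple-app
  template:
    metadata:
      labels:
        app: simple-app
    spec:
      nodeSelector:
        kubernetes.io/hostname: compute-0   # same node as the CSI-backed pod
      containers:
        - name: app
          image: registry.access.redhat.com/ubi9/ubi-minimal   # illustrative image
          command: ["sleep", "infinity"]
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: local-pvc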
Verified with OCP 4.14.0-0.nightly-2023-11-05-194730 and ODF 4.14.0-161.

Created a non-CSI deployment pod and a CSI deployment pod on the same node (compute-0) and shut down the node compute-0. Added the out-of-service taint to compute-0:

oc adm taint nodes compute-0 node.kubernetes.io/out-of-service=nodeshutdown:NoExecute

All pods in the openshift-storage namespace came back online after a 2-3 minute delay.

[jopinto@jopinto new]$ oc get pods -n openshift-storage
NAME   READY  STATUS  RESTARTS  AGE
csi-addons-controller-manager-6749c89487-bww85   2/2  Running  1  5m54s
csi-cephfsplugin-hfsqn   2/2  Running  0  9h
csi-cephfsplugin-hw4cb   2/2  Running  0  9h
csi-cephfsplugin-provisioner-54c89b944d-7svgs   5/5  Running  0  9h
csi-cephfsplugin-provisioner-54c89b944d-mv9dt   5/5  Running  0  5m54s
csi-rbdplugin-provisioner-669449fdcb-7zff2   6/6  Running  0  9h
csi-rbdplugin-provisioner-669449fdcb-m55s2   6/6  Running  0  9h
csi-rbdplugin-v4fxr   3/3  Running  0  9h
csi-rbdplugin-vzkqg   3/3  Running  0  9h
noobaa-core-0   1/1  Running  0  5m50s
noobaa-db-pg-0   1/1  Running  0  5m50s
noobaa-endpoint-b69796f8-njl74   1/1  Running  0  5m54s
noobaa-operator-686c6444d9-9hg9l   2/2  Running  1  5m56s
ocs-metrics-exporter-65c7d9bbbb-529f5   1/1  Running  0  9h
ocs-operator-5d87659678-g7lkv   1/1  Running  3 (2m4s ago)  5m54s
odf-console-674bbff5d9-jw6d7   1/1  Running  0  9h
odf-operator-controller-manager-7bf98567cb-gnt8j   2/2  Running  2 (155m ago)  9h
rook-ceph-crashcollector-compute-1-5c5bf77958-2pjdr   1/1  Running  0  9h
rook-ceph-crashcollector-compute-2-7774c577bf-789lf   1/1  Running  0  9h
rook-ceph-exporter-compute-1-55f5d44457-nh58q   1/1  Running  0  9h
rook-ceph-exporter-compute-2-6c5c857c9d-nr47l   1/1  Running  0  9h
rook-ceph-mds-ocs-storagecluster-cephfilesystem-a-6968bc788lzbz   2/2  Running  2 (61s ago)  9h
rook-ceph-mds-ocs-storagecluster-cephfilesystem-b-5d94d89dwkpc7   2/2  Running  2 (28s ago)  9h
rook-ceph-mgr-a-79666b4f-7gz5r   2/2  Running  0  9h
rook-ceph-mon-a-7bdff6fbf8-f4sdn   0/2  Pending  0  5m54s
rook-ceph-mon-b-74d6676bf4-tdjdz   2/2  Running  0  9h
rook-ceph-mon-c-6bbcb64766-l22fz   2/2  Running  0  9h
rook-ceph-operator-595c4f8ddf-b6swb   1/1  Running  0  5m54s
rook-ceph-osd-0-b9779fffb-wg85l   2/2  Running  0  9h
rook-ceph-osd-1-864567b969-5h5fh   2/2  Running  0  9h
rook-ceph-osd-2-7bf4bf998f-4llvc   0/2  Pending  0  5m56s
rook-ceph-osd-prepare-593ac2d7fb3c46046eaebe7605f35856-f976t   0/1  Completed  0  9h
rook-ceph-osd-prepare-ec1c36da76df2d2d8ecb6228fbd1c000-6bzj5   0/1  Completed  0  9h
rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-6c89f5899v6m   2/2  Running  0  9h
rook-ceph-tools-5bbc55fdf-g878r   1/1  Running  0  9h

[jopinto@jopinto new]$ oc get pods -n test2
NAME   READY  STATUS  RESTARTS  AGE
simple-app-7649fdb746-nrtrd   1/1  Running  0  9m39s
simple-app1-779d9ddf59-8fgh7   1/1  Running  0  9m39s
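For reference, a minimal sketch of what the second test deployment in the test2 namespace (simple-app1, the CSI-backed one) could look like: an RBD RWO PVC mounted by a pod pinned to compute-0. The PVC name, image, and size are illustrative, and ocs-storagecluster-ceph-rbd is assumed to be the default ODF RBD StorageClass:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-pvc                                    # illustrative name
  namespace: test2
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ocs-storagecluster-ceph-rbd    # assumed ODF RBD StorageClass
  resources:
    requests:
      storage: 5Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: simple-app1
  namespace: test2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: simple-app1
  template:
    metadata:
      labels:
        app: simple-app1
    spec:
      nodeSelector:
        kubernetes.io/hostname: compute-0          # co-locate with the non-CSI pod
      containers:
        - name: app
          image: registry.access.redhat.com/ubi9/ubi-minimal   # illustrative image
          command: ["sleep", "infinity"]
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: rbd-pvc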
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.14.0 security, enhancement & bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2023:6832