Description of problem (please be as detailed as possible and provide log snippets):
The status of all the rook-ceph pods not associated with the worker node doesn't change after the worker node is shut down. However, the expectation is that after shutting down a worker node, the status of at least one rook-ceph pod not associated with that node will change, or one of those pods will be deleted.

Version of all relevant components (if applicable):
vSphere UPI OCP 4.16, ODF 4.16

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
No.

Is there any workaround available to the best of your knowledge?
Yes. After powering on the worker node, the pods return to a Ready state.

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
Yes, but not consistently.

Can this issue be reproduced from the UI?
No.

If this is a regression, please provide more details to justify this:
I am not sure. I have also seen this error in ODF 4.15.

Steps to Reproduce:
1. Shut down a worker node.
2. Check the status of the rook-ceph pods not associated with the worker node.

Actual results:
The status of all rook-ceph pods not associated with the worker node didn't change.

Expected results:
At least one of the rook-ceph pods not associated with the worker node should have its status changed, or one of the pods should be deleted.

Additional info:
Report portal link: https://reportportal-ocs4.apps.ocp-c1.prod.psi.redhat.com/ui/#ocs/launches/all/20685/991918/991929/log
Versions: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j-020vup1cs33-t4a/j-020vup1cs33-t4a_20240423T010622/logs/test_report_1713834101.html
(In reply to Itzhak from comment #0)
> Description of problem (please be detailed as possible and provide log
> snippets):
> Actual results:
> The status of all rook-ceph pods not associated with the worker node didn't
> change.

Slight confusion: why would the status of the pods "not" associated with the worker node that was shut down change? Are you referring to the pods that were on the node that was shut down?

> Expected results:
> At least one of the rook-ceph pods not associated with the worker node
> should have its status changed, or one of the pods should be deleted.
TBH, I am not quite sure where this step originated. I remember that we also expected a change in the pods "not" on the worker node. If this doesn't make sense, I can close the bug and delete this step from the ocs-ci test.
To the best of my knowledge, for both graceful and non-graceful node shutdowns, only the pods that were running on the node that was shut down should be affected. The shutdown should not affect the status of pods on other nodes. So my suggestion would be to close this BZ. Or, if there is a valid reason why ocs-ci has this test to check the status of the "not" associated pods, please update the BZ with that reasoning. For now, I'll move it to 4.17.
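To illustrate the expectation described above, here is a minimal sketch of how a check could compare pod state before and after the shutdown. This is not the actual ocs-ci code: `PodInfo`, `changed_pods`, and `unexpected_changes` are hypothetical names, and the snapshots are assumed to be plain dictionaries mapping pod name to node and status, however the test collects them.

```python
from typing import Dict, List, NamedTuple


class PodInfo(NamedTuple):
    """Hypothetical snapshot of one pod (not an ocs-ci type)."""
    node: str    # node the pod was scheduled on
    status: str  # e.g. "Running", "Pending", "Terminating"


def changed_pods(before: Dict[str, PodInfo],
                 after: Dict[str, PodInfo]) -> List[str]:
    """Return the sorted names of pods that changed status or were deleted."""
    changed = []
    for name, info in before.items():
        if name not in after or after[name].status != info.status:
            changed.append(name)
    return sorted(changed)


def unexpected_changes(before: Dict[str, PodInfo],
                       after: Dict[str, PodInfo],
                       shutdown_node: str) -> List[str]:
    """Return changed pods that were NOT on the node that was shut down.

    Under the assumption stated in this comment, this list should be empty.
    """
    return [name for name in changed_pods(before, after)
            if before[name].node != shutdown_node]
```

With this framing, a test would assert that `unexpected_changes(...)` is empty, rather than expecting at least one pod on another node to change.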
After a discussion with other QE members, we decided that we can remove this step. So, I am closing the BZ and will fix the test in ocs-ci accordingly.