Description of problem (please be as detailed as possible and provide log snippets):
=============================================================================
When a worker node is powered off / shut down, the DC app pods on the failed node are respun on another healthy node, but the new pod is stuck in "ContainerCreating" status due to a Multi-Attach error.

Version of all relevant components (if applicable):
v4.4.0-428

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
We have a workaround at the moment.

Is there any workaround available to the best of your knowledge?
Yes. When the old pod is forcefully deleted with the command below, the Fedora-based DC app pod reaches Running state (a slightly more general sequence is sketched at the end of this report).

oc delete pod pod-test-rbd-abfb49d50f214d849f5db0f062ba49b0-1-tgjwt --force --grace-period=0

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
2

Is this issue reproducible?
Yes

Can this issue be reproduced from the UI?
Yes

If this is a regression, please provide more details to justify this:
I don't think this is a regression.

Steps to Reproduce:
===================
This issue can be reproduced by running the following ocs-ci test:
https://github.com/red-hat-storage/ocs-ci/blob/master/tests/manage/z_cluster/nodes/test_automated_recovery_from_failed_nodes_reactive_IPI.py

The test script performs the steps below (a rough oc/aws equivalent is also sketched at the end of this report):
1) Create two Fedora-based DC app pods using a node selector.
2) Identify the node running both the DC app pods and an OSD, and increase the machineset replica count.
3) Wait until the new node comes up and label it with the OCS storage label.
4) Power off the node identified in step 2 from the AWS console.
5) Wait until the OCS pods on the failed node fail over to another node in the same AZ.
6) The Fedora-based DC app pod should automatically respin on another node and reach Running state.
7) Run the sanity and health checks.

Actual results:
===============
The Fedora-based DC app pod is stuck in ContainerCreating state due to a Multi-Attach error.

pod-test-rbd-abfb49d50f214d849f5db0f062ba49b0-1-jcwhf   0/1   ContainerCreating   0   78m
pod-test-rbd-abfb49d50f214d849f5db0f062ba49b0-1-tgjwt   1/1   Terminating         0   89m

Events:
  Type     Reason              Age                From                                                 Message
  ----     ------              ----               ----                                                 -------
  Normal   Scheduled           <unknown>          default-scheduler                                    Successfully assigned namespace-test-75a6b474bf9c4e3980dcbd5c43c11813/pod-test-rbd-9cb5a95ea8114995883f664d7dfdc5c1-1-mstsw to ip-10-0-144-141.us-east-2.compute.internal
  Warning  FailedAttachVolume  20m                attachdetach-controller                              Multi-Attach error for volume "pvc-d64c1f43-e14b-46b7-8c40-ef0316ba8639" Volume is already used by pod(s) pod-test-rbd-9cb5a95ea8114995883f664d7dfdc5c1-1-2swbx
  Warning  FailedMount         37s (x9 over 18m)  kubelet, ip-10-0-144-141.us-east-2.compute.internal  Unable to attach or mount volumes: unmounted volumes=[fedora-vol], unattached volumes=[fedora-vol]: timed out waiting for the condition

Expected results:
=================
The DC app pod should reach Running status on the new node.
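For reference, roughly the oc/aws equivalent of the node power-off portion of the test above. This is only a sketch: the machineset, node, namespace, and instance IDs are placeholders, and the OCS storage label shown is an assumption based on common OCS 4.x deployments and should be verified against the version in use.

# Scale the machineset so a replacement worker comes up in the same AZ.
oc scale machineset <machineset-name> -n openshift-machine-api --replicas=<current+1>

# Once the new node is Ready, add the OCS storage label to it.
oc label node <new-node-name> cluster.ocs.openshift.io/openshift-storage=""

# Power off the node that runs the DC app pods and an OSD (AWS CLI instead of the console).
aws ec2 stop-instances --instance-ids <instance-id-of-failed-node>

# Watch the app pods get rescheduled onto another node.
oc get pods -n <app-namespace> -o wide -w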
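And the workaround in a slightly more general form; the node, namespace, and pod names below are placeholders to fill in from the actual cluster.

# List pods that are still reported against the powered-off node.
oc get pods -A -o wide --field-selector spec.nodeName=<failed-node-name>

# Force-delete the stale pod stuck in Terminating so the RBD volume can be
# attached to the replacement pod on the healthy node.
oc delete pod <stuck-pod-name> -n <app-namespace> --force --grace-period=0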
> Warning  FailedAttachVolume  20m  attachdetach-controller  Multi-Attach error for volume
> "pvc-d64c1f43-e14b-46b7-8c40-ef0316ba8639" Volume is already used by pod(s)
> pod-test-rbd-9cb5a95ea8114995883f664d7dfdc5c1-1-2swbx

Indeed, this is working as per the current design. Unless there is an acknowledgment in the API object that the pod/node has actually been removed, switching the attachment would be difficult/risky; this is an extra safety measure to make sure we don't end up with data corruption or similar problems. However, we are also looking into what improvements we could make. Fencing the failed node, as cloud providers do, is one solution.

Niels, I am assigning this bug to you based on the experiments you are planning in this area. Please feel free to reassign if required.
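For context, a minimal illustration of what fencing a failed node can look like at the RBD level, assuming administrative access to the Ceph cluster. The pool, image, and IP below are placeholders, and this only sketches the general concept rather than the actual mechanism being considered for ceph-csi.

# Check which client(s) still hold a watch on the RBD image backing the stuck PVC.
rbd status <pool>/<image-name>

# Blocklist the failed node's client address so its stale watch/lock is cleared;
# note that older Ceph releases call this "blacklist" instead of "blocklist".
ceph osd blocklist add <failed-node-ip>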
Discussed with Niels; this is not something that can be done in 4.5 (or the near future). It can be moved out.
This is not much different from https://bugzilla.redhat.com/show_bug.cgi?id=1845666
(In reply to Mudit Agarwal from comment #5)
> This is not much different from
> https://bugzilla.redhat.com/show_bug.cgi?id=1845666

So why not close as a dup?
As Humble mentioned, this issue can be resolved with node fencing, so I am duping this bug to https://bugzilla.redhat.com/show_bug.cgi?id=1845666. Although this is the older bug and the other one would normally have been duped here, https://bugzilla.redhat.com/show_bug.cgi?id=1845666 was opened specifically to address node fencing in ceph-csi and carries a lot more detail about the problem, hence duping this one there.

*** This bug has been marked as a duplicate of bug 1845666 ***