Description of problem:
When the AWS EBS CSI driver returns "RequestLimitExceeded" to a ControllerUnpublishVolume request (= "detach"), csi-external-attacher marks the volume as detached:

I0805 16:05:08.948172 1 connection.go:180] GRPC call: /csi.v1.Controller/ControllerUnpublishVolume
I0805 16:05:08.948182 1 connection.go:181] GRPC request: {"node_id":"i-0eb51d1f90e270d91","volume_id":"vol-0766d9cc3c1966b86"}
I0805 16:05:15.323947 1 connection.go:183] GRPC response: {}
I0805 16:05:15.324406 1 connection.go:184] GRPC error: rpc error: code = Internal desc = Could not detach volume "vol-0766d9cc3c1966b86" from node "i-0eb51d1f90e270d91": could not detach volume "vol-0766d9cc3c1966b86" from node "i-0eb51d1f90e270d91": RequestLimitExceeded: Request limit exceeded. status code: 503, request id: 10ceab6c-4a6d-4da5-add2-d46d4fdb652a
I0805 16:05:15.324422 1 csi_handler.go:369] Detached "csi-51ad5c68abe99844c08cfce659c0f8375d8b0255391a6b28b543501577b3dee6" with error rpc error: code = Internal desc = Could not detach volume "vol-0766d9cc3c1966b86" from node "i-0eb51d1f90e270d91": could not detach volume "vol-0766d9cc3c1966b86" from node "i-0eb51d1f90e270d91": RequestLimitExceeded: Request limit exceeded. status code: 503, request id: 10ceab6c-4a6d-4da5-add2-d46d4fdb652a
I0805 16:05:15.324459 1 util.go:70] Marking as detached "csi-51ad5c68abe99844c08cfce659c0f8375d8b0255391a6b28b543501577b3dee6"

The volume remains attached to the node and the external-attacher never retries the detach.

Version-Release number of selected component (if applicable):
4.2.0-0.okd-2019-08-05-143844

How reproducible:
rarely

Steps to Reproduce:
1. Run a pod with an AWS EBS volume provided by CSI.
2. Delete the pod and hope for a RequestLimitExceeded response.

Actual results:
1. The volume is still attached to the node (as seen in the AWS console).
2. The VolumeAttachment Kubernetes object is deleted.

Expected results:
1. The volume is detached from the node (after a while).
2. The VolumeAttachment Kubernetes object is deleted.
Filed https://github.com/kubernetes-csi/external-attacher/pull/165
Created 400 PVCs, volumes, and pods, but did not hit the AWS "RequestLimitExceeded" error. Marking this bug as verified. Will re-open it if the issue happens again.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:2922