Description of problem:
A pod created after re-deployment cannot attach its EBS volume, since the old pod does not terminate and detach the volume; the error "VolumeInUse: vol-9771d635 is already attached to an instance" is reported in the pod's events. There is no such issue if the pod is scheduled to node ip-172-31-2-203.ec2.internal.

Version-Release number of selected component (if applicable):
ded-stage-aws
atomic-openshift-3.2.0.8-1.git.0.f4edaed.el7.x86_64

How reproducible:
Most of the time, when pods are scheduled to nodes other than ip-172-31-2-203.ec2.internal

Steps to Reproduce:
1. Process the mysql-persistent template:
   $ oc process -f https://raw.githubusercontent.com/openshift/origin/master/examples/db-templates/mysql-persistent-template.json | oc create -f -
2. Wait until the pod is running, then change the db password and re-deploy via $ oc deploy mysql --latest
3. Check the pod status
4. Describe the pod

Actual results:
3. $ oc get pods
NAME             READY     STATUS              RESTARTS   AGE       NODE
mysql-1-n4eac    0/1       ContainerCreating   0          2h        ip-172-31-2-202.ec2.internal
mysql-2-deploy   0/1       Error               0          3h        ip-172-31-2-202.ec2.internal

4. The errors below appear in the event list:
Events:
  FirstSeen  LastSeen   Count  From                                     SubobjectPath  Type     Reason       Message
  ---------  --------   -----  ----                                     -------------  ----     ------       -------
  58m        58m        1      {default-scheduler }                                    Normal   Scheduled    Successfully assigned mysql-1-n4eac to ip-172-31-2-203.ec2.internal
  57m        <invalid>  53     {kubelet ip-172-31-2-203.ec2.internal}                  Warning  FailedMount  Unable to mount volumes for pod "mysql-1-n4eac_wzheng2(77057ac5-f7c7-11e5-89b5-0eb4d24322f9)": Could not attach EBS Disk "aws://us-east-1c/vol-9771d635": Error attaching EBS volume: VolumeInUse: vol-9771d635 is already attached to an instance status code: 400, request id:
  57m        <invalid>  53     {kubelet ip-172-31-2-203.ec2.internal}                  Warning  FailedSync   Error syncing pod, skipping: Could not attach EBS Disk "aws://us-east-1c/vol-9771d635": Error attaching EBS volume: VolumeInUse: vol-9771d635 is already attached to an instance status code: 400, request id:

Expected results:
The pod should be running, as it does on node ip-172-31-2-203.ec2.internal.

Additional info:
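For reference, a minimal diagnostic sketch (not part of the original report) to confirm from both sides where the volume is actually attached while the new pod is stuck. The pod name and volume ID are taken from the output above; the AWS CLI call assumes credentials for the cluster's account:

  # Which node is the stuck pod scheduled on, and what do its events say?
  $ oc get pods -o wide
  $ oc describe pod mysql-1-n4eac

  # Which instance does AWS think the EBS volume is attached to?
  $ aws ec2 describe-volumes --volume-ids vol-9771d635 \
      --query 'Volumes[0].Attachments[*].{Instance:InstanceId,State:State,Device:Device}'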
I believe this issue is in the attach/detach logic and is addressed by https://github.com/kubernetes/kubernetes/issues/20262. Basically, Node A has a reconciliation loop in the Kubelet that detaches/unmounts orphaned volumes (volumes no pod needs anymore). Unless and until the Kubelet on Node A processes that orphaned volume, it remains inaccessible to Node B (the "volume already in use" error). The centralized attach/detach controller tracked in the issue above seeks to mitigate this problem.
(In reply to Mark Turansky from comment #2)
> I believe this issue is in the attach/detach logic and is addressed by
> https://github.com/kubernetes/kubernetes/issues/20262
>
> Basically, Node A has a reconciliation loop in the Kubelet that
> detaches/unmounts orphaned volumes (volumes no pod needs anymore). Unless
> and until the Kubelet on Node A processes that orphaned volume, it remains
> inaccessible to Node B (the "volume already in use" error).
>
> The centralized attach/detach controller tracked in the issue above seeks
> to mitigate this problem.

From what I saw in the environment, during the second deploy the first mysql pod was stuck in 'Terminating' status and its volume was not detached from the instance, while the new mysql pod was trying to mount the same EBS volume, which produced the error above.
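For completeness, a possible manual mitigation in this situation, sketched only as an illustration: the stuck pod's name is not recorded in this report, so a placeholder is used, and forcing a detach from the AWS side is an assumption rather than a documented procedure.

  # Force-remove the pod stuck in Terminating so its volume is no longer
  # considered in use (<old-mysql-pod> is a placeholder; the actual name
  # is not captured in this report)
  $ oc delete pod <old-mysql-pod> --grace-period=0

  # If the attachment still lingers on the old instance, release it from the AWS side
  $ aws ec2 detach-volume --volume-id vol-9771d635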
Did you find out why it was stuck in "Terminating" status? As long as that pod remains on the original node (even in an error state), its volumes are considered needed on that node. Only when the pod is completely gone from a node will the Kubelet unmount and detach all of its volumes.
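A quick way to check that state, sketched for illustration (the pod placeholder and the volume ID from this report are used; adjust as needed):

  # Is the old pod still known to the API server (e.g. stuck in Terminating)?
  $ oc get pods -o wide | grep -i terminating

  # Which volumes does the old pod still reference?
  $ oc describe pod <old-mysql-pod>

  # Has AWS released the attachment yet?
  $ aws ec2 describe-volumes --volume-ids vol-9771d635 --query 'Volumes[0].State'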
Clayton (I believe) mentioned on the bug scrub call today that pods getting stuck in the Terminating state is an issue Derek has been chasing, and he is currently working on a fix.
I think the pod-stuck-in-Terminating issue could have been resolved by https://github.com/kubernetes/kubernetes/pull/23746. The pod could have gotten stuck on the node if the docker image pull for mysql:latest returned a 50x error response each time the node attempted to start or restart the container.
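A rough way to check for that failure mode on the affected node, sketched here with assumptions: the image reference is the plain Docker Hub mysql:latest, and atomic-openshift-node is assumed to be the node service name on this 3.2 host.

  # Can the node pull the image right now?
  $ docker pull mysql:latest

  # Look for image pull or pod teardown errors in the node service logs
  $ journalctl -u atomic-openshift-node --since "2 hours ago" | grep -iE 'mysql|pull|terminat'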
No such issue in ded-stage-aws now; the pod is running normally after re-deployment.