Description of problem:
Using a dynamically provisioned EBS volume: when the PVC is deleted, the PV enters Failed status, and only after some time is the PV deleted.

Version-Release number of selected component (if applicable):
openshift v3.3.0.27
kubernetes v1.3.0+507d3a7
etcd 2.3.0+git

How reproducible:
Always

Steps to Reproduce:
1. oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/persistent-volumes/ebs/dynamic-provisioning-pvc.json
2. Create a pod using the above PVC (a sketch of such a pod is shown after this section).
3. After the pod is running, delete the pod and the PVC.
4. Check the PV status:

[root@ip-172-18-4-224 ~]# oc get pv
NAME                                       CAPACITY   ACCESSMODES   STATUS    CLAIM          REASON    AGE
pvc-b3da0e4c-6e60-11e6-b1d8-0e1d1ce05611   1Gi        RWO           Failed    default/ebsc             4m

5. Describe the PV:

[root@ip-172-18-4-224 ~]# oc describe pv pvc-b3da0e4c-6e60-11e6-b1d8-0e1d1ce05611
Name:            pvc-b3da0e4c-6e60-11e6-b1d8-0e1d1ce05611
Labels:          failure-domain.beta.kubernetes.io/region=us-east-1
                 failure-domain.beta.kubernetes.io/zone=us-east-1d
Status:          Failed
Claim:           default/ebsc
Reclaim Policy:  Delete
Access Modes:    RWO
Capacity:        1Gi
Message:         Delete of volume "pvc-b3da0e4c-6e60-11e6-b1d8-0e1d1ce05611" failed: error deleting EBS volumes: VolumeInUse: Volume vol-2658d981 is currently attached to i-09ba4911
                 status code: 400, request id:
Source:
    Type:        AWSElasticBlockStore (a Persistent Disk resource in AWS)
    VolumeID:    aws://us-east-1d/vol-2658d981
    FSType:      ext4
    Partition:   0
    ReadOnly:    false
Events:
  FirstSeen  LastSeen  Count  From                            SubobjectPath  Type     Reason              Message
  ---------  --------  -----  ----                            -------------  -------  ------              -------
  10s        10s       1      {persistentvolume-controller }                 Warning  VolumeFailedDelete  Delete of volume "pvc-b3da0e4c-6e60-11e6-b1d8-0e1d1ce05611" failed: error deleting EBS volumes: VolumeInUse: Volume vol-2658d981 is currently attached to i-09ba4911  status code: 400, request id:

Actual results:
PV entered "Failed" status.

Expected results:
PV should be "Released" and then deleted.

Additional info:
Aug 29 23:25:04 ip-172-18-4-224 atomic-openshift-master: I0829 23:25:04.592697 11569 controller.go:398] volume "pvc-b3da0e4c-6e60-11e6-b1d8-0e1d1ce05611" is released and reclaim policy "Delete" will be executed
Aug 29 23:25:04 ip-172-18-4-224 atomic-openshift-master: I0829 23:25:04.608665 11569 controller.go:618] volume "pvc-b3da0e4c-6e60-11e6-b1d8-0e1d1ce05611" entered phase "Released"
Aug 29 23:25:04 ip-172-18-4-224 atomic-openshift-master: I0829 23:25:04.619587 11569 controller.go:1079] isVolumeReleased[pvc-b3da0e4c-6e60-11e6-b1d8-0e1d1ce05611]: volume is released
Aug 29 23:25:04 ip-172-18-4-224 atomic-openshift-master: I0829 23:25:04.772467 11569 aws_util.go:51] Error deleting EBS Disk volume aws://us-east-1d/vol-2658d981: error deleting EBS volumes: VolumeInUse: Volume vol-2658d981 is currently attached to i-09ba4911
Aug 29 23:25:04 ip-172-18-4-224 atomic-openshift-master: status code: 400, request id:
Aug 29 23:25:04 ip-172-18-4-224 atomic-openshift-master: I0829 23:25:04.777807 11569 controller.go:618] volume "pvc-b3da0e4c-6e60-11e6-b1d8-0e1d1ce05611" entered phase "Failed"
Aug 29 23:25:04 ip-172-18-4-224 atomic-openshift-master: I0829 23:25:04.790657 11569 controller.go:1079] isVolumeReleased[pvc-b3da0e4c-6e60-11e6-b1d8-0e1d1ce05611]: volume is released
Aug 29 23:25:05 ip-172-18-4-224 atomic-openshift-master: I0829 23:25:05.154037 11569 aws_util.go:51] Error deleting EBS Disk volume aws://us-east-1d/vol-2658d981: error deleting EBS volumes: VolumeInUse: Volume vol-2658d981 is currently attached to i-09ba4911
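For step 2, a minimal pod along these lines can be used. This is only a sketch: the pod name, image, command and mount path are placeholders, and claimName "ebsc" matches the claim shown in the oc output above (the claim name in the referenced test file may differ).

# Minimal pod that mounts the dynamically provisioned PVC (placeholder names/image):
oc create -f - <<EOF
{
  "apiVersion": "v1",
  "kind": "Pod",
  "metadata": { "name": "ebs-test-pod" },
  "spec": {
    "containers": [
      {
        "name": "ebs-test",
        "image": "busybox",
        "command": ["sleep", "3600"],
        "volumeMounts": [
          { "name": "ebs-vol", "mountPath": "/mnt/ebs" }
        ]
      }
    ],
    "volumes": [
      { "name": "ebs-vol", "persistentVolumeClaim": { "claimName": "ebsc" } }
    ]
  }
}
EOF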
Cinder has the same issue: https://github.com/kubernetes/kubernetes/issues/31511
I admit the PV enters the Failed state, but it should "self-heal" ~30 seconds after the volume is detached from all nodes. Lowering the priority. #xtian/jhou, please confirm and raise the priority if a volume stays stuck in the Failed state for longer than 1 minute after it has been detached from all nodes.
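One way to check this (a sketch only; it assumes the AWS CLI is configured for the cluster's account and region, and it reuses the volume and PV names from comment 0):

# Is the EBS volume still attached to any instance? An empty list means fully detached.
aws ec2 describe-volumes --volume-ids vol-2658d981 --query 'Volumes[0].Attachments'

# Once detached, watch whether the controller retries the delete and removes the Failed PV within ~1 minute.
oc get pv pvc-b3da0e4c-6e60-11e6-b1d8-0e1d1ce05611 -w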
The issue is reproduced on:
openshift v3.4.0.19+346a31d
kubernetes v1.4.0+776c994
etcd 3.1.0-rc.0
Sorry, I forgot to create the Origin PR; it is now tracked as https://github.com/openshift/origin/pull/11746.
This passed on:
openshift v3.5.0.18+9a5d1aa
kubernetes v1.5.2+43a9be4
etcd 3.1.0
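For anyone re-checking on this build, a possible final check (a sketch reusing the placeholder pod and claim names from above):

oc delete pod ebs-test-pod
oc delete pvc ebsc
# The dynamically provisioned PV should go to Released and then be deleted, without entering Failed.
oc get pv -w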
@chaoyang, I forgot to mark it as MODIFIED; I am doing that right now. Please move it to VERIFIED if you think it's fixed.
Already tested this bug on v3.5.0.18 and it passed.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:1129