Bug 1489603 - Volume unmounted but not being detached from node
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 3.6.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Release: 3.9.0
Assigned To: Hemant Kumar
QA Contact: chaoyang
Reported: 2017-09-07 17:12 EDT by Hemant Kumar
Modified: 2018-03-28 10:06 EDT
Last Closed: 2018-03-28 10:06:20 EDT
Type: Bug
External Trackers
Tracker                  ID              Priority  Status  Summary  Last Updated
Red Hat Product Errata   RHBA-2018:0489  None      None    None     2018-03-28 10:06 EDT

Description Hemant Kumar 2017-09-07 17:12:29 EDT
I am seeing many instances of volumes being unmounted but not detached from the node on various OpenShift clusters. We need to find out why this is happening.
Comment 5 Hemant Kumar 2017-09-14 15:57:58 EDT
As I have stated above, the root cause of this bug was:

1. The user created a pod with a volume, but the volume was stuck in the "attaching" state for more than 1 hour.
2. The attach/detach (A/D) controller gave up after a certain time, so the volume was never added to the controller's actual_state_of_world.
3. The attach eventually succeeded, but the A/D controller no longer knew about the volume and therefore never detached it.
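
The sequence above can be illustrated with a small toy model. This is only a sketch of the bookkeeping problem, not the controller's real code: the `Controller` class, its method names, and the `ATTACH_TIMEOUT_S` value are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical expiry window; the real value is not stated in this bug.
ATTACH_TIMEOUT_S = 3600


@dataclass
class Controller:
    """Toy stand-in for the attach/detach controller's bookkeeping."""
    actual_state: set = field(default_factory=set)  # volumes believed attached
    pending: dict = field(default_factory=dict)     # volume -> attach request time

    def request_attach(self, volume, now):
        self.pending[volume] = now

    def attach_finished(self, volume, now):
        """The cloud reports the attach as complete at time `now`."""
        started = self.pending.pop(volume)
        if now - started <= ATTACH_TIMEOUT_S:
            self.actual_state.add(volume)  # normal path: the volume is recorded
        # Otherwise the controller has already given up and the result is dropped.

    def detach_unused(self, in_use):
        """Detach every known volume that no pod needs; return what was detached."""
        to_detach = self.actual_state - set(in_use)
        self.actual_state -= to_detach
        return to_detach


def simulate(attach_duration_s):
    """Return the volumes still attached in the cloud after the pod is deleted."""
    cloud_attached = set()
    ctrl = Controller()
    ctrl.request_attach("vol-1", now=0)
    cloud_attached.add("vol-1")  # the cloud-side attach eventually succeeds
    ctrl.attach_finished("vol-1", now=attach_duration_s)
    # The pod is deleted, so no volume is in use any more.
    cloud_attached -= ctrl.detach_unused(in_use=[])
    return cloud_attached
```

With a short attach, `simulate(60)` leaves nothing attached. With an attach that outlasts the window, `simulate(7200)` leaves `vol-1` attached, because the controller can only detach volumes it has recorded.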

Obviously, the main point is that the volume shouldn't have been stuck in the attaching state for such a long time. We have to work with Amazon to find a solution for that problem.
Comment 9 David Caldwell 2017-10-16 09:04:34 EDT
Hey guys, 

Any updates on this issue? 

Is there a workaround?

Thanks,

David.
Comment 10 Hemant Kumar 2017-10-16 14:01:14 EDT
Each instance of this problem is caused by a different underlying problem. Can you give some more details about the customer's problem?

The bug I opened is caused by a volume being stuck in the "attaching" state for too long, with the user deleting the pod while waiting for it to come up. The volume attach eventually succeeds, but because it finishes outside the expiry window of the attach/detach controller, the controller doesn't know about the volume, and hence the volume never gets detached.

I am not sure whether the incident you linked is the same as what I outlined above. The symptoms may look similar from the outside while the root cause is different.

I would request that you open a new bug with the following details:

1. PV and PVC YAML
2. Output of describe for the PV and PVC
3. Node logs from where this happened
4. Controller logs from the same time period
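
For items 1 and 2, a small helper like the one below could assemble the commands to run. It is a sketch only: `diagnostic_commands` and its parameters are hypothetical names, and it does not cover the node or controller logs, whose location depends on how the cluster was installed.

```python
import shlex


def diagnostic_commands(pv, pvc, namespace):
    """Build the oc commands that dump and describe a PV and its PVC."""
    return [
        ["oc", "get", "pv", pv, "-o", "yaml"],
        ["oc", "get", "pvc", pvc, "-n", namespace, "-o", "yaml"],
        ["oc", "describe", "pv", pv],
        ["oc", "describe", "pvc", pvc, "-n", namespace],
    ]


def as_shell_lines(commands):
    """Render each command as a single, safely quoted shell line."""
    return [shlex.join(cmd) for cmd in commands]
```

PVs are cluster-scoped, so only the PVC commands take a namespace.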
Comment 11 Hemant Kumar 2017-12-19 20:48:15 EST
We have opened a PR against OpenShift 3.8 that will cause all dangling volumes to correct themselves: https://github.com/openshift/origin/pull/17544

The specific commit that includes the fix is: https://github.com/openshift/origin/pull/17544/commits/2885375c4d0f1738dc45a013e11d64d638f0f050
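
As a rough illustration of what "correcting" a dangling volume could mean, the sketch below compares the attachments the cloud reports against those the controller has recorded, and records any that are missing so the regular detach path can handle them. This is not the code from the PR; the function names and the `(volume, node)` pair representation are assumptions made for the example.

```python
def find_dangling(cloud_attachments, actual_state):
    """Return (volume, node) pairs attached in the cloud but unknown to the controller."""
    return sorted(set(cloud_attachments) - set(actual_state))


def reconcile(cloud_attachments, actual_state):
    """Record dangling attachments; return the updated state and what was added."""
    dangling = find_dangling(cloud_attachments, actual_state)
    return set(actual_state) | set(dangling), dangling
```

Once a dangling attachment is part of the recorded state, a controller that detaches every recorded volume no pod needs would pick it up on its next pass.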
Comment 13 Hemant Kumar 2018-01-18 17:55:46 EST
Yes, that is fine. The fix has been merged in 3.9. Moving to MODIFIED.
Comment 15 chaoyang 2018-01-24 02:41:06 EST
This passed on:
oc v3.9.0-0.23.0
kubernetes v1.9.1+a0ce1bc657
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://ip-172-18-14-251.ec2.internal:443
openshift v3.9.0-0.23.0
kubernetes v1.9.1+a0ce1bc657


1. Make sure the pod is in ContainerCreating because the volume could not attach
[root@ip-172-18-14-251 ~]# oc get pods
NAME      READY     STATUS              RESTARTS   AGE
mypod1    0/1       ContainerCreating   0          1h
2. Wait for the pod to reach Running
[root@ip-172-18-14-251 ~]# oc get pods
NAME      READY     STATUS    RESTARTS   AGE
mypod1    1/1       Running   0          1h
3. Delete the pod, then check that the volume is detached and becomes available
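
The comment does not say how the check in step 3 was done. Assuming the volume is an AWS EBS volume (the cluster runs on EC2), one possible way is to inspect the JSON printed by `aws ec2 describe-volumes --volume-ids <volume-id> --output json`. The helper below is a hypothetical sketch that parses that output.

```python
import json


def volume_is_detached(describe_volumes_json):
    """Return True if the first volume in the output is available and unattached."""
    data = json.loads(describe_volumes_json)
    volume = data["Volumes"][0]
    # A detached EBS volume reports State "available" and has no attachments.
    return volume["State"] == "available" and not volume.get("Attachments")
```

A volume that is still attached reports the state `in-use` and lists the instance under `Attachments`.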
Comment 18 errata-xmlrpc 2018-03-28 10:06:20 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0489
