Bug 1493483

Summary: Pod's status is "Terminating" for a long time.
Product: OpenShift Container Platform
Reporter: Qin Ping <piqin>
Component: Storage
Assignee: Bradley Childs <bchilds>
Status: CLOSED WORKSFORME
QA Contact: Jianwei Hou <jhou>
Severity: low
Docs Contact:
Priority: unspecified
Version: 3.7.0
CC: aos-bugs, aos-storage-staff
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-01-02 18:50:09 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Qin Ping 2017-09-20 10:10:24 UTC
Description of problem:
After deleting a Pod that uses local storage (a plain directory, not a mountpoint), the Pod's status is still "Terminating" about 20 minutes later.


Version-Release number of selected component (if applicable):
openshift v3.7.0-0.126.4
kubernetes v1.7.0+80709908fd

How reproducible:
Always

Steps to Reproduce:
1. On one node, create 2 directories (/mnt/disks/vol1, /mnt/disks/vol2).
2. Ensure that 2 PVs were created for vol1 and vol2.
3. Create a PVC that uses one of the PVs created in the previous step.
4. Create a Pod that uses the PVC created in the previous step.
5. Read from and write to the Pod.
$ oc exec local-volume-pod-2 -- touch /mnt/local/testfile1
$ oc exec local-volume-pod-2 -- touch /mnt/local/testfiles
6. Delete the Pod.
$ oc delete pod local-volume-pod-2
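
To put a number on how long the Pod lingers after step 6, a small polling helper can be run right after the delete. This is only a sketch: the function name wait_for_pod_gone, the 300-second default timeout, and the 5-second interval are illustrative choices, not part of the original report. It relies on `oc get pod NAME` exiting non-zero once the Pod object no longer exists (it also exits non-zero on API errors, so treat the result with care).

```shell
# Hypothetical helper: poll until the Pod object is gone or a timeout expires.
# Arguments: pod name, timeout in seconds (default 300), poll interval (default 5).
wait_for_pod_gone() {
    pod=$1
    timeout=${2:-300}
    interval=${3:-5}
    elapsed=0
    # `oc get pod NAME` fails once the Pod can no longer be found.
    while oc get pod "$pod" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "pod $pod still present after ${elapsed}s"
            return 1
        fi
        sleep "$interval"
        # Approximate: the time spent inside `oc` itself is not counted.
        elapsed=$((elapsed + interval))
    done
    echo "pod $pod gone after ${elapsed}s"
}

# Example usage (after step 6), allowing up to 20 minutes:
#   wait_for_pod_gone local-volume-pod-2 1200
```

With the behavior described in this report, the helper would be expected to hit the timeout rather than report the Pod as gone.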


Actual results:
The Pod's status is still "Terminating" after about 20 minutes.

Expected results:
The Pod is deleted successfully within a few minutes.

Master Log:

Node Log (of failed PODs):
Sep 20 05:51:40 host-8-241-40 atomic-openshift-node: W0920 05:51:40.751977    1742 util.go:87] Warning: "/var/lib/origin/openshift.local.volumes/pods/063cf03e-9de9-11e7-9400-fa163e501ea4/volumes/kubernetes.io~local-volume/local-pv-b77559f6" is not a mountpoint, deleting
Sep 20 05:51:40 host-8-241-40 atomic-openshift-node: E0920 05:51:40.752027    1742 nestedpendingoperations.go:262] Operation for "\"kubernetes.io/local-volume/063cf03e-9de9-11e7-9400-fa163e501ea4-local-pv-b77559f6\" (\"063cf03e-9de9-11e7-9400-fa163e501ea4\")" failed. No retries permitted until 2017-09-20 05:51:42.752007921 -0400 EDT (durationBeforeRetry 2s). Error: UnmountVolume.TearDown failed for volume "pvol" (UniqueName: "kubernetes.io/local-volume/063cf03e-9de9-11e7-9400-fa163e501ea4-local-pv-b77559f6") pod "063cf03e-9de9-11e7-9400-fa163e501ea4" (UID: "063cf03e-9de9-11e7-9400-fa163e501ea4") : remove /var/lib/origin/openshift.local.volumes/pods/063cf03e-9de9-11e7-9400-fa163e501ea4/volumes/kubernetes.io~local-volume/local-pv-b77559f6: device or resource busy
Sep 20 05:51:42 host-8-241-40 atomic-openshift-node: I0920 05:51:42.756039    1742 reconciler.go:186] operationExecutor.UnmountVolume started for volume "pvol" (UniqueName: "kubernetes.io/local-volume/063cf03e-9de9-11e7-9400-fa163e501ea4-local-pv-b77559f6") pod "063cf03e-9de9-11e7-9400-fa163e501ea4" (UID: "063cf03e-9de9-11e7-9400-fa163e501ea4")
Sep 20 05:51:42 host-8-241-40 atomic-openshift-node: W0920 05:51:42.756188    1742 util.go:87] Warning: "/var/lib/origin/openshift.local.volumes/pods/063cf03e-9de9-11e7-9400-fa163e501ea4/volumes/kubernetes.io~local-volume/local-pv-b77559f6" is not a mountpoint, deleting
Sep 20 05:51:42 host-8-241-40 atomic-openshift-node: E0920 05:51:42.756248    1742 nestedpendingoperations.go:262] Operation for "\"kubernetes.io/local-volume/063cf03e-9de9-11e7-9400-fa163e501ea4-local-pv-b77559f6\" (\"063cf03e-9de9-11e7-9400-fa163e501ea4\")" failed. No retries permitted until 2017-09-20 05:51:46.756216316 -0400 EDT (durationBeforeRetry 4s). Error: UnmountVolume.TearDown failed for volume "pvol" (UniqueName: "kubernetes.io/local-volume/063cf03e-9de9-11e7-9400-fa163e501ea4-local-pv-b77559f6") pod "063cf03e-9de9-11e7-9400-fa163e501ea4" (UID: "063cf03e-9de9-11e7-9400-fa163e501ea4") : remove /var/lib/origin/openshift.local.volumes/pods/063cf03e-9de9-11e7-9400-fa163e501ea4/volumes/kubernetes.io~local-volume/local-pv-b77559f6: device or resource busy
Sep 20 05:51:44 host-8-241-40 atomic-openshift-node: I0920 05:51:44.221594    1742 openstack_instances.go:45] Claiming to support Instances
Sep 20 05:51:45 host-8-241-40 atomic-openshift-node: W0920 05:51:45.665453    1742 helpers.go:771] eviction manager: no observation found for eviction signal allocatableNodeFs.available
Sep 20 05:51:46 host-8-241-40 atomic-openshift-node: I0920 05:51:46.763666    1742 reconciler.go:186] operationExecutor.UnmountVolume started for volume "pvol" (UniqueName: "kubernetes.io/local-volume/063cf03e-9de9-11e7-9400-fa163e501ea4-local-pv-b77559f6") pod "063cf03e-9de9-11e7-9400-fa163e501ea4" (UID: "063cf03e-9de9-11e7-9400-fa163e501ea4")
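
The excerpt above shows the unmount being retried with a growing delay (durationBeforeRetry 2s, then 4s). To see how the delay evolves over a longer log, the values can be pulled out with a short filter. This is a sketch only: the function name retry_delays is made up for illustration, and it assumes the log lines keep the "(durationBeforeRetry <value>)" form seen here.

```shell
# Hypothetical helper: read node log lines on stdin and print the
# durationBeforeRetry value from each line that contains one.
retry_delays() {
    # -n with the p flag prints only lines where the substitution matched.
    sed -n 's/.*(durationBeforeRetry \([^)]*\)).*/\1/p'
}

# Example usage (the journald unit name is an assumption based on the
# "atomic-openshift-node" tag in the log lines):
#   journalctl -u atomic-openshift-node | retry_delays
```

Lines without a retry delay, such as the reconciler.go and util.go messages, are skipped.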


PV Dump:

PVC Dump:

StorageClass Dump (if StorageClass used by PV/PVC):

Additional info:
$ oc get pod
NAME                                       READY     STATUS        RESTARTS   AGE
local-volume-pod-2                         0/1       Terminating   0          2m
local-volume-provisioner-5lfv2             1/1       Running       1          52m
local-volume-provisioner-bootstrap-4bwgb   0/1       Completed     0          52m
local-volume-provisioner-bvgcp             1/1       Running       0          52m
local-volume-provisioner-ghs53             1/1       Running       0          52m

$ oc get pod
NAME                                       READY     STATUS        RESTARTS   AGE
local-volume-pod-2                         0/1       Terminating   0          23m
local-volume-provisioner-5lfv2             1/1       Running       1          1h
local-volume-provisioner-bootstrap-4bwgb   0/1       Completed     0          1h
local-volume-provisioner-bvgcp             1/1       Running       0          1h
local-volume-provisioner-ghs53             1/1       Running       0          1h

Comment 1 Bradley Childs 2017-10-05 21:59:45 UTC
From the logs provided, there are API quota errors, which doesn't make sense if these are EBS PVs.

Can you please provide the full PV, PVC, and Pod YAML?

Comment 2 Qin Ping 2017-10-09 05:49:26 UTC
Cannot reproduce this bug on version:
openshift v3.7.0-0.144.2
kubernetes v1.7.6+a08f5eeb62

Comment 3 Bradley Childs 2018-01-02 18:50:09 UTC
Closing per Qin Ping's comment.