Bug 1493483 - Pod's status is "Terminating" for a long time.
Summary: Pod's status is "Terminating" for a long time.
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 3.7.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: Bradley Childs
QA Contact: Jianwei Hou
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-09-20 10:10 UTC by Qin Ping
Modified: 2018-01-02 18:50 UTC
CC: 2 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-01-02 18:50:09 UTC
Target Upstream Version:
Embargoed:


Attachments

Description Qin Ping 2017-09-20 10:10:24 UTC
Description of problem:
After deleting a Pod that uses local storage (a plain directory, not a mountpoint), the Pod's status is still "Terminating" about 20 minutes later.


Version-Release number of selected component (if applicable):
openshift v3.7.0-0.126.4
kubernetes v1.7.0+80709908fd

How reproducible:
Always

Steps to Reproduce:
1. On one node, create two directories (/mnt/disks/vol1, /mnt/disks/vol2).
2. Ensure two PVs were created for vol1 and vol2.
3. Create a PVC that binds to one of the PVs created in the previous step.
4. Create a Pod that uses the PVC created in the previous step (a sketch of both objects follows these steps).
5. Read from and write to the Pod's volume:
$ oc exec local-volume-pod-2 -- touch /mnt/local/testfile1
$ oc exec local-volume-pod-2 -- touch /mnt/local/testfiles
6. Delete the Pod:
$ oc delete pod local-volume-pod-2
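
For reference, a minimal sketch of the PVC and Pod from steps 3-4. The actual YAML was not attached to this bug; the PVC name, the storage class name "local-storage", the image, and the requested size are assumptions, while the Pod name, the volume name "pvol", and the /mnt/local mount path match the transcript and the node log below.

# Hypothetical PVC/Pod matching the reproduction steps (not the reporter's actual YAML).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: local-volume-pvc-2      # assumed name
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: local-storage   # assumed; must match the class of the provisioned local PVs
  resources:
    requests:
      storage: 1Gi              # assumed size
---
apiVersion: v1
kind: Pod
metadata:
  name: local-volume-pod-2      # matches the pod name in the transcript below
spec:
  containers:
  - name: test
    image: registry.access.redhat.com/rhel7   # assumed image
    command: ["sleep", "3600"]
    volumeMounts:
    - name: pvol                # volume name "pvol" appears in the node log
      mountPath: /mnt/local     # matches the oc exec paths above
  volumes:
  - name: pvol
    persistentVolumeClaim:
      claimName: local-volume-pvc-2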


Actual results:
The Pod's status is still "Terminating" after about 20 minutes.

Expected results:
The Pod is deleted successfully within a few minutes.

Master Log:

Node Log (of failed PODs):
Sep 20 05:51:40 host-8-241-40 atomic-openshift-node: W0920 05:51:40.751977    1742 util.go:87] Warning: "/var/lib/origin/openshift.local.volumes/pods/063cf03e-9de9-11e7-9400-fa163e501ea4/volumes/kubernetes.io~local-volume/local-pv-b77559f6" is not a mountpoint, deleting
Sep 20 05:51:40 host-8-241-40 atomic-openshift-node: E0920 05:51:40.752027    1742 nestedpendingoperations.go:262] Operation for "\"kubernetes.io/local-volume/063cf03e-9de9-11e7-9400-fa163e501ea4-local-pv-b77559f6\" (\"063cf03e-9de9-11e7-9400-fa163e501ea4\")" failed. No retries permitted until 2017-09-20 05:51:42.752007921 -0400 EDT (durationBeforeRetry 2s). Error: UnmountVolume.TearDown failed for volume "pvol" (UniqueName: "kubernetes.io/local-volume/063cf03e-9de9-11e7-9400-fa163e501ea4-local-pv-b77559f6") pod "063cf03e-9de9-11e7-9400-fa163e501ea4" (UID: "063cf03e-9de9-11e7-9400-fa163e501ea4") : remove /var/lib/origin/openshift.local.volumes/pods/063cf03e-9de9-11e7-9400-fa163e501ea4/volumes/kubernetes.io~local-volume/local-pv-b77559f6: device or resource busy
Sep 20 05:51:42 host-8-241-40 atomic-openshift-node: I0920 05:51:42.756039    1742 reconciler.go:186] operationExecutor.UnmountVolume started for volume "pvol" (UniqueName: "kubernetes.io/local-volume/063cf03e-9de9-11e7-9400-fa163e501ea4-local-pv-b77559f6") pod "063cf03e-9de9-11e7-9400-fa163e501ea4" (UID: "063cf03e-9de9-11e7-9400-fa163e501ea4")
Sep 20 05:51:42 host-8-241-40 atomic-openshift-node: W0920 05:51:42.756188    1742 util.go:87] Warning: "/var/lib/origin/openshift.local.volumes/pods/063cf03e-9de9-11e7-9400-fa163e501ea4/volumes/kubernetes.io~local-volume/local-pv-b77559f6" is not a mountpoint, deleting
Sep 20 05:51:42 host-8-241-40 atomic-openshift-node: E0920 05:51:42.756248    1742 nestedpendingoperations.go:262] Operation for "\"kubernetes.io/local-volume/063cf03e-9de9-11e7-9400-fa163e501ea4-local-pv-b77559f6\" (\"063cf03e-9de9-11e7-9400-fa163e501ea4\")" failed. No retries permitted until 2017-09-20 05:51:46.756216316 -0400 EDT (durationBeforeRetry 4s). Error: UnmountVolume.TearDown failed for volume "pvol" (UniqueName: "kubernetes.io/local-volume/063cf03e-9de9-11e7-9400-fa163e501ea4-local-pv-b77559f6") pod "063cf03e-9de9-11e7-9400-fa163e501ea4" (UID: "063cf03e-9de9-11e7-9400-fa163e501ea4") : remove /var/lib/origin/openshift.local.volumes/pods/063cf03e-9de9-11e7-9400-fa163e501ea4/volumes/kubernetes.io~local-volume/local-pv-b77559f6: device or resource busy
Sep 20 05:51:44 host-8-241-40 atomic-openshift-node: I0920 05:51:44.221594    1742 openstack_instances.go:45] Claiming to support Instances
Sep 20 05:51:45 host-8-241-40 atomic-openshift-node: W0920 05:51:45.665453    1742 helpers.go:771] eviction manager: no observation found for eviction signal allocatableNodeFs.available
Sep 20 05:51:46 host-8-241-40 atomic-openshift-node: I0920 05:51:46.763666    1742 reconciler.go:186] operationExecutor.UnmountVolume started for volume "pvol" (UniqueName: "kubernetes.io/local-volume/063cf03e-9de9-11e7-9400-fa163e501ea4-local-pv-b77559f6") pod "063cf03e-9de9-11e7-9400-fa163e501ea4" (UID: "063cf03e-9de9-11e7-9400-fa163e501ea4")
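
The log shows the kubelet treating the volume path as a plain directory ("is not a mountpoint, deleting") and then failing to remove it with "device or resource busy", retrying with exponential backoff (durationBeforeRetry 2s, then 4s). On the node, commands like the following can help confirm whether the path is actually still mounted or is held open by a process. This is a diagnostic sketch: the path is copied from the log above, and availability of findmnt/fuser/lsof on the node is assumed.

$ VOLDIR=/var/lib/origin/openshift.local.volumes/pods/063cf03e-9de9-11e7-9400-fa163e501ea4/volumes/kubernetes.io~local-volume/local-pv-b77559f6
$ mountpoint "$VOLDIR"     # is the path really not a mountpoint, as the kubelet claims?
$ findmnt "$VOLDIR"        # show any mount table entry for the path
$ fuser -vm "$VOLDIR"      # list processes holding the filesystem busy
$ lsof +D "$VOLDIR"        # list open files under the directory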


PV Dump:

PVC Dump:

StorageClass Dump (if StorageClass used by PV/PVC):

Additional info:
$ oc get pod
NAME                                       READY     STATUS        RESTARTS   AGE
local-volume-pod-2                         0/1       Terminating   0          2m
local-volume-provisioner-5lfv2             1/1       Running       1          52m
local-volume-provisioner-bootstrap-4bwgb   0/1       Completed     0          52m
local-volume-provisioner-bvgcp             1/1       Running       0          52m
local-volume-provisioner-ghs53             1/1       Running       0          52m

$ oc get pod
NAME                                       READY     STATUS        RESTARTS   AGE
local-volume-pod-2                         0/1       Terminating   0          23m
local-volume-provisioner-5lfv2             1/1       Running       1          1h
local-volume-provisioner-bootstrap-4bwgb   0/1       Completed     0          1h
local-volume-provisioner-bvgcp             1/1       Running       0          1h
local-volume-provisioner-ghs53             1/1       Running       0          1h

Comment 1 Bradley Childs 2017-10-05 21:59:45 UTC
From the logs provided, there are API quota errors, which doesn't make sense if these are EBS PVs.

Can you please provide the full PV, PVC, and Pod YAML?

Comment 2 Qin Ping 2017-10-09 05:49:26 UTC
Cannot reproduce this bug on version:
openshift v3.7.0-0.144.2
kubernetes v1.7.6+a08f5eeb62

Comment 3 Bradley Childs 2018-01-02 18:50:09 UTC
Closing per Qin Ping's comment.

