Bug 1339295 - cinder disk doesn't get detached
Summary: cinder disk doesn't get detached
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 3.2.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: hchen
QA Contact: Jianwei Hou
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-05-24 15:18 UTC by Miheer Salunke
Modified: 2020-01-17 15:46 UTC
CC List: 9 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, containerized installations of OCP would fail to detach cinder volumes due to the method used to find the volume. OCP has been updated to correctly detach cinder volumes when running in a containerized environment.
Clone Of:
: 1359720 (view as bug list)
Environment:
Last Closed: 2016-08-11 18:38:11 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2016:1608 0 normal SHIPPED_LIVE Red Hat OpenShift Enterprise 3.2.1.13 bug fix and enhancement update 2016-08-11 22:37:53 UTC

Description Miheer Salunke 2016-05-24 15:18:12 UTC
When a pod is destroyed, the cinder disk does not get detached from the instance.



May 18 19:06:11 node1.openshift.com atomic-openshift-node[28748]: I0518 19:06:11.690529   28793 kubelet.go:2443] SyncLoop (SYNC): 1 pods; jenkins-1-4pw73_maci(7a9f5e89-1d1a-11e6-b3dc-fa163e5e26b0)
May 18 19:06:11 node1.openshift.com atomic-openshift-node[28748]: I0518 19:06:11.690575   28793 kubelet.go:3258] Generating status for "jenkins-1-4pw73_maci(7a9f5e89-1d1a-11e6-b3dc-fa163e5e26b0)"
May 18 19:06:11 node1.openshift.com atomic-openshift-node[28748]: I0518 19:06:11.690602   28793 kubelet.go:3225] pod waiting > 0, pending
May 18 19:06:11 node1.openshift.com atomic-openshift-node[28748]: I0518 19:06:11.690702   28793 manager.go:277] Ignoring same status for pod "jenkins-1-4pw73_maci(7a9f5e89-1d1a-11e6-b3dc-fa163e5e26b0)", status: {Phase:Pending Conditions:[{Type:Ready Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2016-05-18 19:03:44 +0200 CEST Reason:ContainersNotReady Message:containers with unready status: [jenkins]}] Message: Reason: HostIP:192.168.192.113 PodIP: StartTime:2016-05-18 19:03:44 +0200 CEST ContainerStatuses:[{Name:jenkins State:{Waiting:0xc2097a8560 Running:<nil> Terminated:<nil>} LastTerminationState:{Waiting:<nil> Running:<nil> Terminated:<nil>} Ready:false RestartCount:0 Image:registry.access.redhat.com/openshift3/jenkins-1-rhel7:latest ImageID: ContainerID:}]}
May 18 19:06:11 node1.openshift.com atomic-openshift-node[28748]: I0518 19:06:11.695057   28793 cinder.go:235] Cinder SetUp ee0a4fde-97c2-42dc-bc2c-c612cf85a963 to /var/lib/origin/openshift.local.volumes/pods/7a9f5e89-1d1a-11e6-b3dc-fa163e5e26b0/volumes/kubernetes.io~cinder/pv-cinder-dzpgg
May 18 19:06:11 node1.openshift.com atomic-openshift-node[28748]: I0518 19:06:11.695073   28793 keymutex.go:49] LockKey(...) called for id "ee0a4fde-97c2-42dc-bc2c-c612cf85a963"
May 18 19:06:11 node1.openshift.com atomic-openshift-node[28748]: I0518 19:06:11.695082   28793 keymutex.go:52] LockKey(...) for id "ee0a4fde-97c2-42dc-bc2c-c612cf85a963" completed.
May 18 19:06:11 node1.openshift.com atomic-openshift-node[28748]: I0518 19:06:11.695095   28793 nsenter_mount.go:175] findmnt: directory /var/lib/origin/openshift.local.volumes/pods/7a9f5e89-1d1a-11e6-b3dc-fa163e5e26b0/volumes/kubernetes.io~cinder/pv-cinder-dzpgg does not exist
May 18 19:06:11 node1.openshift.com atomic-openshift-node[28748]: I0518 19:06:11.760611   28793 openstack.go:1037] ee0a4fde-97c2-42dc-bc2c-c612cf85a963 kubernetes-dynamic-pv-cinder-dzpgg [map[server_id:fb40d180-2733-4208-8ea3-b9258b9d35bc attachment_id:78c1f486-928c-4b8d-9b3f-c3e4f2e1b68a host_name:<nil> volume_id:ee0a4fde-97c2-42dc-bc2c-c612cf85a963 device:/dev/vdc id:ee0a4fde-97c2-42dc-bc2c-c612cf85a963]]
May 18 19:06:11 node1.openshift.com atomic-openshift-node[28748]: E0518 19:06:11.760649   28793 openstack.go:972] Disk "ee0a4fde-97c2-42dc-bc2c-c612cf85a963" is attached to a different compute: "fb40d180-2733-4208-8ea3-b9258b9d35bc", should be detached before proceeding
May 18 19:06:11 node1.openshift.com atomic-openshift-node[28748]: I0518 19:06:11.760657   28793 cinder.go:252] AttachDisk failed: Disk "ee0a4fde-97c2-42dc-bc2c-c612cf85a963" is attached to a different compute: "fb40d180-2733-4208-8ea3-b9258b9d35bc", should be detached before proceeding
May 18 19:06:11 node1.openshift.com atomic-openshift-node[28748]: I0518 19:06:11.760663   28793 keymutex.go:58] UnlockKey(...) called for id "ee0a4fde-97c2-42dc-bc2c-c612cf85a963"
May 18 19:06:11 node1.openshift.com atomic-openshift-node[28748]: I0518 19:06:11.760668   28793 keymutex.go:65] UnlockKey(...) for id. Mutex found, trying to unlock it. "ee0a4fde-97c2-42dc-bc2c-c612cf85a963"
May 18 19:06:11 node1.openshift.com atomic-openshift-node[28748]: I0518 19:06:11.760673   28793 keymutex.go:68] UnlockKey(...) for id "ee0a4fde-97c2-42dc-bc2c-c612cf85a963" completed.
May 18 19:06:11 node1.openshift.com atomic-openshift-node[28748]: E0518 19:06:11.760706   28793 kubelet.go:1796] Unable to mount volumes for pod "jenkins-1-4pw73_maci(7a9f5e89-1d1a-11e6-b3dc-fa163e5e26b0)": Disk "ee0a4fde-97c2-42dc-bc2c-c612cf85a963" is attached to a different compute: "fb40d180-2733-4208-8ea3-b9258b9d35bc", should be detached before proceeding; skipping pod
May 18 19:06:11 node1.openshift.com atomic-openshift-node[28748]: E0518 19:06:11.760716   28793 pod_workers.go:138] Error syncing pod 7a9f5e89-1d1a-11e6-b3dc-fa163e5e26b0, skipping: Disk "ee0a4fde-97c2-42dc-bc2c-c612cf85a963" is attached to a different compute: "fb40d180-2733-4208-8ea3-b9258b9d35bc", should be detached before proceeding
May 18 19:06:11 node1.openshift.com atomic-openshift-node[28748]: I0518 19:06:11.760736   28793 server.go:606] Event(api.ObjectReference{Kind:"Pod", Namespace:"maci", Name:"jenkins-1-4pw73", UID:"7a9f5e89-1d1a-11e6-b3dc-fa163e5e26b0", APIVersion:"v1", ResourceVersion:"6481", FieldPath:""}): type: 'Warning' reason: 'FailedMount' Unable to mount volumes for pod "jenkins-1-4pw73_maci(7a9f5e89-1d1a-11e6-b3dc-fa163e5e26b0)": Disk "ee0a4fde-97c2-42dc-bc2c-c612cf85a963" is attached to a different compute: "fb40d180-2733-4208-8ea3-b9258b9d35bc", should be detached before proceeding
May 18 19:06:11 node1.openshift.com atomic-openshift-node[28748]: I0518 19:06:11.760775   28793 server.go:606] Event(api.ObjectReference{Kind:"Pod", Namespace:"maci", Name:"jenkins-1-4pw73", UID:"7a9f5e89-1d1a-11e6-b3dc-fa163e5e26b0", APIVersion:"v1", ResourceVersion:"6481", FieldPath:""}): type: 'Warning' reason: 'FailedSync' Error syncing pod, skipping: Disk "ee0a4fde-97c2-42dc-bc2c-c612cf85a963" is attached to a different compute: "fb40d180-2733-4208-8ea3-b9258b9d35bc", should be detached before proceeding

Version

oc v3.2.0.20
kubernetes v1.2.0-36-g4a3f9c5
Steps To Reproduce

    Set up containerized OSE on RHEL Atomic Host (RHEL-AH) on top of OSP.
    Use a cinder volume in OSE, for example with the jenkins-persistent template; it works fine.
    Scale the deployment down to 0.
    Scale it back up to 1.
    The pod cannot start on a different node.

Current Result

The pod cannot start on a different node.
Expected Result

The pod starts on a different node and uses the cinder volume.
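
The "attached to a different compute" error in the log above comes from the OpenStack cloud provider refusing to attach a cinder volume that Nova still reports as attached to the old instance. The following minimal Go sketch shows that kind of guard; the names (volumeAttachment, checkAttachable) and the new node's instance ID are illustrative assumptions, not the actual openstack.go code.

// A minimal sketch (assumed names, not the real cinder/openstack.go code) of
// the guard that produces the "attached to a different compute" error.
package main

import "fmt"

// volumeAttachment mirrors the attachment metadata visible in the openstack.go
// log line: the volume records which Nova server currently holds it.
type volumeAttachment struct {
	VolumeID string
	ServerID string // instance the volume is currently attached to
	Device   string // e.g. /dev/vdc
}

// checkAttachable refuses to attach a volume that is still attached to a
// different instance, which is the condition the kubelet hit after the pod
// moved to a new node before the old attachment was cleaned up.
func checkAttachable(att *volumeAttachment, localInstanceID string) error {
	if att == nil || att.ServerID == localInstanceID {
		return nil // not attached anywhere, or already attached to this node
	}
	return fmt.Errorf("Disk %q is attached to a different compute: %q, should be detached before proceeding",
		att.VolumeID, att.ServerID)
}

func main() {
	att := &volumeAttachment{
		VolumeID: "ee0a4fde-97c2-42dc-bc2c-c612cf85a963",
		ServerID: "fb40d180-2733-4208-8ea3-b9258b9d35bc", // the old node's instance ID
		Device:   "/dev/vdc",
	}
	// The new node has a different (hypothetical) instance ID, so attach is refused.
	if err := checkAttachable(att, "00000000-0000-0000-0000-000000000000"); err != nil {
		fmt.Println("AttachDisk failed:", err)
	}
}

Until the stale attachment on the old node is removed, this check keeps failing and the pod stays Pending on the new node, which matches the FailedMount/FailedSync events in the log.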

Comment 1 Miheer Salunke 2016-05-24 15:20:50 UTC
Upstream issue filed- https://github.com/openshift/origin/issues/8926

Comment 2 hchen 2016-05-25 13:58:59 UTC
Disk detach is triggered by pod deletion. If the pod is not removed, the disk is still attached to the node. 

Before you scale up to 1, can you check if the Pod is completely removed? The Pod is not immediately removed when it is told to scale down to 0.

Comment 4 hchen 2016-05-25 15:00:40 UTC
Yes, oc get pods

Comment 9 Marcel Wysocki 2016-05-30 08:06:17 UTC
@hchen, do you need any more information for this?

Comment 14 hchen 2016-06-14 22:41:50 UTC
The containerized kubelet failed to unmount/detach the Cinder volume.
A fix has been proposed to upstream Kubernetes:
https://github.com/kubernetes/kubernetes/pull/27380
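
For context on why the containerized case behaves differently: the kubelet runs inside a container with its own mount namespace, so to tell whether the cinder volume is still mounted on the host it has to enter the host's mount namespace, which is what the nsenter_mount.go findmnt line in the description is doing. The sketch below only illustrates that general approach (shelling out to nsenter plus findmnt against PID 1's namespace); the helper name and the /rootfs/proc path are assumptions for illustration, not the content of the upstream PR.

// A sketch (assumed helper name and /rootfs path) of how a containerized
// kubelet has to look up mounts in the host's mount namespace via nsenter,
// the operation shown by the nsenter_mount.go findmnt log line above.
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// findMountSourceOnHost runs findmnt inside PID 1's mount namespace and
// returns the backing device for target, or an error if the directory is not
// a mount point there. It assumes the host's /proc is visible at /rootfs/proc,
// which is one common way a containerized node is set up.
func findMountSourceOnHost(target string) (string, error) {
	out, err := exec.Command(
		"nsenter", "--mount=/rootfs/proc/1/ns/mnt", "--",
		"findmnt", "-n", "-o", "SOURCE", "--target", target,
	).CombinedOutput()
	if err != nil {
		// findmnt exits non-zero when the directory does not exist or is not
		// mounted in that namespace, matching the "does not exist" log message.
		return "", fmt.Errorf("findmnt on host failed for %s: %v: %s", target, err, strings.TrimSpace(string(out)))
	}
	return strings.TrimSpace(string(out)), nil
}

func main() {
	dir := "/var/lib/origin/openshift.local.volumes/pods/<pod-uid>/volumes/kubernetes.io~cinder/pv-cinder-dzpgg"
	src, err := findMountSourceOnHost(dir)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%s is backed by %s on the host\n", dir, src)
}

If the mount cannot be resolved in the host namespace, the kubelet cannot map the volume path back to a device, and the unmount/detach step never runs; that is consistent with the Doc Text's description of the fix (changing the method used to find the volume).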

Comment 17 Marcel Wysocki 2016-06-28 10:23:36 UTC
https://github.com/kubernetes/kubernetes/pull/28018

Can we get this into an errata release?

Comment 18 Troy Dawson 2016-07-22 19:39:08 UTC
Can we get a separate bugzilla for OSE 3.2 versus 3.3?

This has been merged and is in OSE v3.3.0.9 or newer.

Comment 19 Jianwei Hou 2016-07-25 10:53:53 UTC
Verified this on 

oc v3.3.0.9
kubernetes v1.3.0+57fb9ac

Using a containerized installation. The cinder volume is detached from the node when the pod is deleted or scaled down (PV and PVC remaining). Viewed from the OpenStack console, the volume has 'Available' status.

@tdawson I've cloned this as a 3.3 bug and verified it there (bug 1359720). I'll test that this is fixed in 3.2 as well once we have the fix.

Comment 20 hchen 2016-07-25 14:37:13 UTC
Can we close it now?

Comment 27 Jianwei Hou 2016-07-27 07:17:13 UTC
Verified on a containerized installation of

openshift v3.2.1.12
kubernetes v1.2.0-36-g4a3f9c5
etcd 2.2.5

The cinder volume is detached from the node when the pod is deleted or scaled to 0.

Comment 28 Scott Dodson 2016-08-09 16:00:55 UTC
This bug should be in VERIFIED state, as the errata has not yet been released. I cannot set that state, so I'm setting ON_QA so that QE can move it to VERIFIED. This will be released as part of v3.2.1.13.

Comment 31 errata-xmlrpc 2016-08-11 18:38:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1608

