Description of problem:
We created several iSCSI LUNs and the corresponding PVs for these LUNs. Several projects consume these PVs via PVCs. The pods run on 2 of the 3 nodes, which means 2 iSCSI mounts were on 1 node; all of these nodes had active iSCSI sessions. We then scaled down one of the pods on the node with the 2 iSCSI mounts. The second pod then crashed because the iSCSI session was gone as well. It looks like DetachDisk in https://github.com/openshift/origin/blob/85eb37b34f0657631592356d020cef5a58470f8e/vendor/k8s.io/kubernetes/pkg/volume/iscsi/iscsi_util.go#L165 performs the logout too early.

Version-Release number of selected component (if applicable): 3.3
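As a rough illustration of what we suspect is happening (this is not the actual origin/kubernetes code; aside from DetachDisk and the iscsiadm call, all names and values below are placeholders of ours), the detach path appears to boil down to: unmount the pod's mount, count the remaining references to the same target, and log the session out when the count hits zero.

###
package main

import (
	"fmt"
	"os/exec"
)

// countMountsWithPrefix stands in for the real reference counting done by
// getDevicePrefixRefCount in iscsi_util.go (sketched in a later comment);
// it is stubbed here only so that the control flow below compiles on its own.
func countMountsWithPrefix(devicePrefix string) (int, error) {
	return 0, nil // placeholder
}

// detachDiskSketch is a simplified, hypothetical rendering of the suspected
// flow in DetachDisk: unmount, count remaining mounts that still belong to
// the same target, and log the whole session out when the count drops to
// zero. Our report is that this logout fires while another pod on the node
// still depends on the same session.
func detachDiskSketch(mntPath, devicePrefix, portal, iqn string) error {
	if out, err := exec.Command("umount", mntPath).CombinedOutput(); err != nil {
		return fmt.Errorf("unmount %s failed: %v: %s", mntPath, err, out)
	}
	refCount, err := countMountsWithPrefix(devicePrefix)
	if err != nil {
		return err
	}
	if refCount == 0 {
		// The step that takes the session away from the still-running pod.
		out, err := exec.Command("iscsiadm", "-m", "node",
			"-p", portal, "-T", iqn, "--logout").CombinedOutput()
		if err != nil {
			return fmt.Errorf("iscsiadm logout failed: %v: %s", err, out)
		}
	}
	return nil
}

func main() {
	// Illustrative values only.
	_ = detachDiskSketch(
		"/var/lib/kubelet/pods/<uid>/volumes/kubernetes.io~iscsi/vol1",
		"10.0.0.1:3260-iqn.2016-01.com.example:storage",
		"10.0.0.1:3260",
		"iqn.2016-01.com.example:storage")
}
###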
Hi. The customer has taken a deep look into the GitHub sources and we have seen the following.

It starts in the function DetachDisk
https://github.com/openshift/origin/blob/master/vendor/k8s.io/kubernetes/pkg/volume/iscsi/iscsi_util.go#L165
which calls the function getDevicePrefixRefCount
https://github.com/openshift/origin/blob/master/vendor/k8s.io/kubernetes/pkg/volume/iscsi/iscsi_util.go#L182
https://github.com/openshift/origin/blob/master/vendor/k8s.io/kubernetes/pkg/volume/iscsi/iscsi_util.go#L66
which calls mounter.List()
https://github.com/openshift/origin/blob/master/vendor/k8s.io/kubernetes/pkg/util/mount/mount_linux.go#L156
which calls listProcMounts, which calls readProcMounts twice, which calls readProcMountsFrom
https://github.com/openshift/origin/blob/master/vendor/k8s.io/kubernetes/pkg/util/mount/mount_linux.go#L284

These are the relevant lines:
###
hash := adler32.New()
...
*out = append(*out, mp)
...
return hash.Sum32(), nil
###
They create the hashes that are compared in
https://github.com/openshift/origin/blob/master/vendor/k8s.io/kubernetes/pkg/util/mount/mount_linux.go#L264
Based on the result of this comparison, the iscsiadm logout is called.
https://github.com/openshift/origin/blob/master/vendor/k8s.io/kubernetes/pkg/volume/iscsi/iscsi_util.go#L192

We now think that the logout is called even though there are still mounts on this node from running pods.
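To make the chain above concrete, here is a minimal, self-contained Go sketch of the two pieces traced above: a /proc/mounts read that checksums its content with adler32 (so that two consecutive reads can be compared for consistency, as in the quoted lines), and a prefix-based reference count over the resulting mount list. The names used here (mountPoint, readMounts, devicePrefixRefCount) and the details of the prefix match are our simplification, not the upstream implementation; the linked sources remain authoritative.

###
package main

import (
	"fmt"
	"hash/adler32"
	"io/ioutil"
	"strings"
)

// Minimal local stand-in for the mount entries returned by mounter.List();
// the real type lives in pkg/util/mount.
type mountPoint struct {
	Device string
	Path   string
	Type   string
}

// readMounts sketches the readProcMountsFrom idea quoted above: parse the
// mount table and compute an adler32 checksum over its lines so that two
// consecutive reads can be compared for a consistent snapshot.
func readMounts(path string) ([]mountPoint, uint32, error) {
	content, err := ioutil.ReadFile(path)
	if err != nil {
		return nil, 0, err
	}
	hash := adler32.New()
	var out []mountPoint
	for _, line := range strings.Split(string(content), "\n") {
		fields := strings.Fields(line)
		if len(fields) < 3 { // the real parser checks the full field count
			continue
		}
		hash.Write([]byte(line))
		out = append(out, mountPoint{Device: fields[0], Path: fields[1], Type: fields[2]})
	}
	return out, hash.Sum32(), nil
}

// devicePrefixRefCount sketches the prefix-based reference count that
// DetachDisk relies on: how many current mounts still refer to a path
// starting with the given prefix. Whether the real code matches against the
// mount's device or its path, and how the prefix is built, is glossed over
// here and should be read from the linked iscsi_util.go.
func devicePrefixRefCount(mounts []mountPoint, devicePrefix string) int {
	refCount := 0
	for _, mp := range mounts {
		if strings.HasPrefix(mp.Path, devicePrefix) {
			refCount++
		}
	}
	return refCount
}

func main() {
	mounts, checksum, err := readMounts("/proc/mounts")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("read %d mounts (checksum %d)\n", len(mounts), checksum)
	// Illustrative prefix only.
	fmt.Println("refs:", devicePrefixRefCount(mounts, "/var/lib/kubelet/plugins/kubernetes.io/iscsi/"))
}
###

As far as we can tell, the prefix-based count exists because one iSCSI session can back several LUNs/devices, and the logout should only happen once no mount for any of them remains. Our suspicion is that the count reaches zero (or the logout is otherwise triggered) while another running pod on the node still depends on the same session.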
This bug needs to be cloned twice, so that we have one bug each for OCP 3.5, 3.4, and 3.3.
This has been merged into OCP and is in OCP v3.4.1.8 or newer.
Verified on:
openshift v3.3.1.15
kubernetes v1.3.0+52492b4
etcd 2.3.0+git

Steps:
1. Create two pods on the same node that mount the same iSCSI volume over the same session; make sure their only difference is the LUN number (a sketch of such a pair of pod specs follows below).
2. Verify that both pods are running.
3. Delete one of the pods.
4. The remaining pod is still running.
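For reference, a sketch of step 1 expressed with the Go client types. The image, portal, IQN, and node name are illustrative values, and the import paths reflect current upstream packaging rather than the exact vendoring used in 3.3; the two pods differ only in the Lun field, so they share a single iSCSI session on the node.

###
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// iscsiPod builds a pod that mounts one LUN from the shared target; the two
// verification pods are identical except for the LUN number.
func iscsiPod(name string, lun int32) *v1.Pod {
	return &v1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: name},
		Spec: v1.PodSpec{
			NodeName: "node1", // pin both pods to the same node (illustrative)
			Containers: []v1.Container{{
				Name:    "app",
				Image:   "registry.access.redhat.com/rhel7", // illustrative image
				Command: []string{"sleep", "3600"},
				VolumeMounts: []v1.VolumeMount{{
					Name:      "iscsi-vol",
					MountPath: "/mnt/iscsi",
				}},
			}},
			Volumes: []v1.Volume{{
				Name: "iscsi-vol",
				VolumeSource: v1.VolumeSource{
					ISCSI: &v1.ISCSIVolumeSource{
						TargetPortal: "10.0.0.1:3260",                   // illustrative portal
						IQN:          "iqn.2016-01.com.example:storage", // illustrative IQN
						Lun:          lun,                               // the only difference
						FSType:       "ext4",
					},
				},
			}},
		},
	}
}

func main() {
	podA := iscsiPod("iscsi-pod-a", 0)
	podB := iscsiPod("iscsi-pod-b", 1)
	fmt.Println(podA.Name, podB.Name) // create these with oc/kubectl or client-go
}
###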
Also verified on openshift v3.4.1.8 kubernetes v1.4.0+776c994 etcd 3.1.0-rc.0
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:0512