Bug 1279485 - ISCSI session not disconnected because of credentials mismatch when not using CHAP
Summary: ISCSI session not disconnected because of credentials mismatch when not using...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: Core
Version: 4.17.10
Hardware: x86_64
OS: Unspecified
high
high
Target Milestone: ovirt-3.6.2
: 4.17.16
Assignee: Nir Soffer
QA Contact: Elad
URL:
Whiteboard:
Depends On:
Blocks: 1140850
TreeView+ depends on / blocked
 
Reported: 2015-11-09 14:45 UTC by Elad
Modified: 2016-03-10 15:27 UTC (History)
8 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-02-18 11:11:57 UTC
oVirt Team: Storage
Embargoed:
rule-engine: ovirt-3.6.z+
rule-engine: blocker+
ylavi: planning_ack+
amureini: devel_ack+
acanan: testing_ack+


Attachments (Terms of Use)
/var/log/ + 'iscsiadm -m session -P3' output before and after disconnectStorageServer from 3.5 host + /var/log/ + 'iscsiadm -m session -P3' output before and after disconnectStorageServer from 3.6 hos (6.72 MB, application/x-gzip)
2015-11-09 14:45 UTC, Elad
no flags Details
3.5 and 3.6 engine logs (398.79 KB, application/x-gzip)
2015-11-09 15:13 UTC, Elad
no flags Details


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 48483 0 master MERGED iscsi: Fix credentials initialization when not using CHAP Never
oVirt gerrit 50939 0 ovirt-3.6 MERGED issci: Fix equality and hash when username or password are None 2016-01-03 19:28:26 UTC
oVirt gerrit 50940 0 ovirt-3.6 MERGED iscsi: Fix credentials initialization when not using CHAP 2016-01-03 19:28:39 UTC

Description Elad 2015-11-09 14:45:23 UTC
Created attachment 1091766 [details]
/var/log/ + 'iscsiadm -m session -P3' output before and after disconnectStorageServer from 3.5 host + /var/log/ + 'iscsiadm -m session -P3' output before and after disconnectStorageServer from 3.6 hos

Description of problem:
Host doesn't disconnect from the iSCSI sessions when deactivating the last iSCSI domain in the pool or when putting the host in maintenance.

Tested the same flow on rhevm-3.5.6 and the sessions are disconnected on host/domain maintenance.

Version-Release number of selected component (if applicable):
vdsm-4.17.10.1-0.el7ev.noarch
iscsi-initiator-utils-6.2.0.873-32.el7.x86_64
rhevm-3.6.0.3-0.1.el6.noarch

How reproducible:
Occurs on 2 different 3.6 setups. Not on 3.5.6

Steps to Reproduce:
1. Put host in maintenance while it is connected to a pool with an iSCSI domain


Actual results:
Host changes to maintenance status but the sessions remain open.


disconnectStorageServer is called:


Thread-2245::INFO::2015-11-09 13:21:25,314::logUtils::48::dispatcher::(wrapper) Run and protect: disconnectStorageServer(domType=3, spUUID=u'00000001-0001-0001-0001-000000000004', conList=[{u'id': u'bc9ed0f2-893a-4767-bd6a-0e4b5d5675cf', u'connection': u'10.35.160.106', u'iqn': u'iqn.1992-04.com.emc:cx.ckm00121000438.b6', u'user': u'', u'tpgt': u'3', u'password': '********', u'port': u'3260'}, {u'id': u'8d3a491d-0a4a-4f22-8c7c-f6b12c3fca15', u'connection': u'10.35.160.105', u'iqn': u'iqn.1992-04.com.emc:cx.ckm00121000438.a7', u'user': u'', u'tpgt': u'2', u'password': '********', u'port': u'3260'}, {u'id': u'2897a350-48bc-4153-a43e-5602532b2a7c', u'connection': u'10.35.160.104', u'iqn': u'iqn.1992-04.com.emc:cx.ckm00121000438.a6', u'user': u'', u'tpgt': u'1', u'password': '********', u'port': u'3260'}, {u'id': u'7c6450c3-5829-447c-ba97-a2cad9b8d030', u'connection': u'10.35.160.107', u'iqn': u'iqn.1992-04.com.emc:cx.ckm00121000438.b7', u'user': u'', u'tpgt': u'4', u'password': '********', u'port': u'3260'}], options=None)



'Session mismatches':

Thread-2245::DEBUG::2015-11-09 13:21:25,406::storageServer::525::Storage.Server.ISCSI::(getSessionInfo) Session mismatches: ["target.portal.hostname mismatch: '10.35.160.106' != '10.35.160.105'", "target.portal.hostname mismatch: '10.35.160.106' != '10.35.160.107'", 'cred mismatch', "target.portal.hostname mismatch: '10.35.160.106' != '10.35.146.129'", "target.portal.hostname mismatch: '10.35.160.106' != '10.35.146.161'", "target.portal.hostname mismatch: '10.35.160.106' != '10.35.146.193'", "target.portal.hostname mismatch: '10.35.160.106' != '10.35.146.225'", "target.portal.hostname mismatch: '10.35.160.106' != '10.35.160.104'"]
Thread-2245::DEBUG::2015-11-09 13:21:25,407::storageServer::543::Storage.Server.ISCSI::(disconnect) not connected
Thread-2245::INFO::2015-11-09 13:21:25,407::storageServer::538::Storage.Server.ISCSI::(disconnect) disconnecting
Thread-2245::DEBUG::2015-11-09 13:21:25,500::storageServer::525::Storage.Server.ISCSI::(getSessionInfo) Session mismatches: ['cred mismatch', "target.portal.hostname mismatch: '10.35.160.105' != '10.35.160.107'", "target.portal.hostname mismatch: '10.35.160.105' != '10.35.160.106'", "target.portal.hostname mismatch: '10.35.160.105' != '10.35.146.129'", "target.portal.hostname mismatch: '10.35.160.105' != '10.35.146.161'", "target.portal.hostname mismatch: '10.35.160.105' != '10.35.146.193'", "target.portal.hostname mismatch: '10.35.160.105' != '10.35.146.225'", "target.portal.hostname mismatch: '10.35.160.105' != '10.35.160.104'"]
Thread-2245::DEBUG::2015-11-09 13:21:25,500::storageServer::543::Storage.Server.ISCSI::(disconnect) not connected
Thread-2245::INFO::2015-11-09 13:21:25,500::storageServer::538::Storage.Server.ISCSI::(disconnect) disconnecting
Thread-2245::DEBUG::2015-11-09 13:21:25,589::storageServer::525::Storage.Server.ISCSI::(getSessionInfo) Session mismatches: ["target.portal.hostname mismatch: '10.35.160.104' != '10.35.160.105'", "target.portal.hostname mismatch: '10.35.160.104' != '10.35.160.107'", "target.portal.hostname mismatch: '10.35.160.104' != '10.35.160.106'", "target.portal.hostname mismatch: '10.35.160.104' != '10.35.146.129'", "target.portal.hostname mismatch: '10.35.160.104' != '10.35.146.161'", "target.portal.hostname mismatch: '10.35.160.104' != '10.35.146.193'", "target.portal.hostname mismatch: '10.35.160.104' != '10.35.146.225'", 'cred mismatch']
Thread-2245::DEBUG::2015-11-09 13:21:25,589::storageServer::543::Storage.Server.ISCSI::(disconnect) not connected
Thread-2245::INFO::2015-11-09 13:21:25,589::storageServer::538::Storage.Server.ISCSI::(disconnect) disconnecting
Thread-2245::DEBUG::2015-11-09 13:21:25,680::storageServer::525::Storage.Server.ISCSI::(getSessionInfo) Session mismatches: ["target.portal.hostname mismatch: '10.35.160.107' != '10.35.160.105'", 'cred mismatch', "target.portal.hostname mismatch: '10.35.160.107' != '10.35.160.106'", "target.portal.hostname mismatch: '10.35.160.107' != '10.35.146.129'", "target.portal.hostname mismatch: '10.35.160.107' != '10.35.146.161'", "target.portal.hostname mismatch: '10.35.160.107' != '10.35.146.193'", "target.portal.hostname mismatch: '10.35.160.107' != '10.35.146.225'", "target.portal.hostname mismatch: '10.35.160.107' != '10.35.160.104'"]
Thread-2245::DEBUG::2015-11-09 13:21:25,680::storageServer::543::Storage.Server.ISCSI::(disconnect) not connected


Expected results:
Host should disconnect the iSCSI session on host / last iSCSI domain deactivation.

Additional info: 
/var/log/ + 'iscsiadm -m session -P3' output before and after disconnectStorageServer from 3.5 host
+
/var/log/ + 'iscsiadm -m session -P3' output before and after disconnectStorageServer from 3.6 host


***Note that on the 3.5 host, after disconnectStorageServer, 3 sessions remain open, 4 get closed***


===============================
disconnectStorageServer on 3.5 host:

Thread-3247::INFO::2015-11-09 16:42:43,071::logUtils::44::dispatcher::(wrapper) Run and protect: disconnectStorageServer(domType=3, spUUID='b7014a1f-f5a7-4dfb-bb0f-491dd19596c0', conList=[{'connection': '10.35.160
.106', 'iqn': 'iqn.1992-04.com.emc:cx.ckm00121000438.b6', 'user': '', 'tpgt': '3', 'password': '******', 'id': '14fcf820-85e0-4264-8781-f9b4772948ed', 'port': '3260'}, {'connection': '10.35.160.104', 'iqn': 'iqn.1
992-04.com.emc:cx.ckm00121000438.a6', 'user': '', 'tpgt': '1', 'password': '******', 'id': '60932489-2a7c-4f69-85ea-07ae1a45b6e4', 'port': '3260'}, {'connection': '10.35.160.105', 'iqn': 'iqn.1992-04.com.emc:cx.ck
m00121000438.a7', 'user': '', 'tpgt': '2', 'password': '******', 'id': 'c50f1766-f95d-47a5-b570-56514ec1100e', 'port': '3260'}, {'connection': '10.35.160.107', 'iqn': 'iqn.1992-04.com.emc:cx.ckm00121000438.b7', 'u
ser': '', 'tpgt': '4', 'password': '******', 'id': 'd9518d62-5e44-476c-add5-d8f7430516d7', 'port': '3260'}], options=None)
Thread-3247::DEBUG::2015-11-09 16:42:43,124::iscsiadm::97::Storage.Misc.excCmd::(_runCmd) /usr/bin/sudo -n /sbin/iscsiadm -m session -r 58 -u (cwd None)
Thread-3247::DEBUG::2015-11-09 16:42:43,185::iscsiadm::97::Storage.Misc.excCmd::(_runCmd) SUCCESS: <err> = ''; <rc> = 0
Thread-3247::INFO::2015-11-09 16:42:43,185::iscsi::557::Storage.ISCSI::(setRpFilterIfNeeded) iSCSI iface.net_ifacename not provided. Skipping.
Thread-3247::DEBUG::2015-11-09 16:42:43,277::iscsiadm::97::Storage.Misc.excCmd::(_runCmd) /usr/bin/sudo -n /sbin/iscsiadm -m session -r 62 -u (cwd None)
Thread-3247::DEBUG::2015-11-09 16:42:43,333::iscsiadm::97::Storage.Misc.excCmd::(_runCmd) SUCCESS: <err> = ''; <rc> = 0
Thread-3247::INFO::2015-11-09 16:42:43,334::iscsi::557::Storage.ISCSI::(setRpFilterIfNeeded) iSCSI iface.net_ifacename not provided. Skipping.
Thread-3247::DEBUG::2015-11-09 16:42:43,359::iscsiadm::97::Storage.Misc.excCmd::(_runCmd) /usr/bin/sudo -n /sbin/iscsiadm -m session -r 56 -u (cwd None)
Thread-3247::DEBUG::2015-11-09 16:42:43,433::iscsiadm::97::Storage.Misc.excCmd::(_runCmd) SUCCESS: <err> = ''; <rc> = 0
Thread-3247::INFO::2015-11-09 16:42:43,434::iscsi::557::Storage.ISCSI::(setRpFilterIfNeeded) iSCSI iface.net_ifacename not provided. Skipping.
Thread-3247::DEBUG::2015-11-09 16:42:43,481::iscsiadm::97::Storage.Misc.excCmd::(_runCmd) /usr/bin/sudo -n /sbin/iscsiadm -m session -r 60 -u (cwd None)
Thread-3247::DEBUG::2015-11-09 16:42:43,552::iscsiadm::97::Storage.Misc.excCmd::(_runCmd) SUCCESS: <err> = ''; <rc> = 0

Comment 1 Elad 2015-11-09 15:13:43 UTC
Created attachment 1091793 [details]
3.5 and 3.6 engine logs

Comment 2 Tal Nisan 2015-11-10 14:01:16 UTC
Elad, it looks a bit similar to bug 1146115, does it happen in the same scenario or also when there are no other consumers for the same connections?

Comment 3 Elad 2015-11-10 17:27:53 UTC
According to 1146115#c19, it is for a scenario of more than one iscsi domain in the DC while here I found that the sessions remain open with a single iscsi domain.

Comment 4 Red Hat Bugzilla Rules Engine 2015-11-11 11:47:36 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 5 Nir Soffer 2015-11-12 09:31:50 UTC
Looking in 3_6 session before the disconnect:

$ egrep 'Target:|Portal:' iscsiadm_sessions_before_disconnectStorageServer 

Target: iqn.1992-04.com.emc:cx.ckm00121000438.a7 (non-flash)
	Current Portal: 10.35.160.105:3260,2
	Persistent Portal: 10.35.160.105:3260,2
Target: iqn.1992-04.com.emc:cx.ckm00121000438.b7 (non-flash)
	Current Portal: 10.35.160.107:3260,4
	Persistent Portal: 10.35.160.107:3260,4
Target: iqn.1992-04.com.emc:cx.ckm00121000438.b6 (non-flash)
	Current Portal: 10.35.160.106:3260,3
	Persistent Portal: 10.35.160.106:3260,3
Target: iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c00 (non-flash)
	Current Portal: 10.35.146.129:3260,1
	Persistent Portal: 10.35.146.129:3260,1
Target: iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c01 (non-flash)
	Current Portal: 10.35.146.161:3260,1
	Persistent Portal: 10.35.146.161:3260,1
Target: iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c04 (non-flash)
	Current Portal: 10.35.146.193:3260,1
	Persistent Portal: 10.35.146.193:3260,1
Target: iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c05 (non-flash)
	Current Portal: 10.35.146.225:3260,1
	Persistent Portal: 10.35.146.225:3260,1
Target: iqn.1992-04.com.emc:cx.ckm00121000438.a6 (non-flash)
	Current Portal: 10.35.160.104:3260,1
	Persistent Portal: 10.35.160.104:3260,1

Engines asked to disconnect these connections:

[{u'connection': u'10.35.160.106',
  u'id': u'bc9ed0f2-893a-4767-bd6a-0e4b5d5675cf',
  u'iqn': u'iqn.1992-04.com.emc:cx.ckm00121000438.b6',
  u'password': '********',
  u'port': u'3260',
  u'tpgt': u'3',
  u'user': u''},
 {u'connection': u'10.35.160.105',
  u'id': u'8d3a491d-0a4a-4f22-8c7c-f6b12c3fca15',
  u'iqn': u'iqn.1992-04.com.emc:cx.ckm00121000438.a7',
  u'password': '********',
  u'port': u'3260',
  u'tpgt': u'2',
  u'user': u''},
 {u'connection': u'10.35.160.104',
  u'id': u'2897a350-48bc-4153-a43e-5602532b2a7c',
  u'iqn': u'iqn.1992-04.com.emc:cx.ckm00121000438.a6',
  u'password': '********',
  u'port': u'3260',
  u'tpgt': u'1',
  u'user': u''},
 {u'connection': u'10.35.160.107',
  u'id': u'7c6450c3-5829-447c-ba97-a2cad9b8d030',
  u'iqn': u'iqn.1992-04.com.emc:cx.ckm00121000438.b7',
  u'password': '********',
  u'port': u'3260',
  u'tpgt': u'4',
  u'user': u''}]

Searching for sessions, we had these mismatches:

Trying to disconnect 10.35.160.106:

["target.portal.hostname mismatch: '10.35.160.106' != '10.35.160.105'",
 "target.portal.hostname mismatch: '10.35.160.106' != '10.35.160.107'",
 'cred mismatch',
 "target.portal.hostname mismatch: '10.35.160.106' != '10.35.146.129'",
 "target.portal.hostname mismatch: '10.35.160.106' != '10.35.146.161'",
 "target.portal.hostname mismatch: '10.35.160.106' != '10.35.146.193'",
 "target.portal.hostname mismatch: '10.35.160.106' != '10.35.146.225'",
 "target.portal.hostname mismatch: '10.35.160.106' != '10.35.160.104'"]

Trying to disconnect 10.35.160.105:

['cred mismatch',
 "target.portal.hostname mismatch: '10.35.160.105' != '10.35.160.107'",
 "target.portal.hostname mismatch: '10.35.160.105' != '10.35.160.106'",
 "target.portal.hostname mismatch: '10.35.160.105' != '10.35.146.129'",
 "target.portal.hostname mismatch: '10.35.160.105' != '10.35.146.161'",
 "target.portal.hostname mismatch: '10.35.160.105' != '10.35.146.193'",
 "target.portal.hostname mismatch: '10.35.160.105' != '10.35.146.225'",
 "target.portal.hostname mismatch: '10.35.160.105' != '10.35.160.104'"]

Trying to disconnect 10.35.160.104:

["target.portal.hostname mismatch: '10.35.160.104' != '10.35.160.105'",
 "target.portal.hostname mismatch: '10.35.160.104' != '10.35.160.107'",
 "target.portal.hostname mismatch: '10.35.160.104' != '10.35.160.106'",
 "target.portal.hostname mismatch: '10.35.160.104' != '10.35.146.129'",
 "target.portal.hostname mismatch: '10.35.160.104' != '10.35.146.161'",
 "target.portal.hostname mismatch: '10.35.160.104' != '10.35.146.193'",
 "target.portal.hostname mismatch: '10.35.160.104' != '10.35.146.225'",
 'cred mismatch']

Trying to disconnect 10.35.160.107:

["target.portal.hostname mismatch: '10.35.160.107' != '10.35.160.105'",
 'cred mismatch',
 "target.portal.hostname mismatch: '10.35.160.107' != '10.35.160.106'",
 "target.portal.hostname mismatch: '10.35.160.107' != '10.35.146.129'",
 "target.portal.hostname mismatch: '10.35.160.107' != '10.35.146.161'",
 "target.portal.hostname mismatch: '10.35.160.107' != '10.35.146.193'",
 "target.portal.hostname mismatch: '10.35.160.107' != '10.35.146.225'",
 "target.portal.hostname mismatch: '10.35.160.107' != '10.35.160.104'"]

We can clearly see that each connection is matched by target.portal.hostname
since each address we look for exits in all other attempts.

So it seems that the issue is caused by "cred mismatch".

The issue is engine user and password and stored username and passwords are
hidden (to prevent persisting of sensitive data in the logs).

To debug this, we need to add logs revealing both the sent user and password
values and the stored values we get from the active session, and repeat this
test.

Elad, would you like to try to add these logs and repeat the test?

Comment 6 Nir Soffer 2015-11-12 10:42:49 UTC
Elad, can you update version field with the version of vdsm you tested?

Comment 7 Aharon Canan 2015-11-12 11:40:46 UTC
(In reply to Nir Soffer from comment #6)
> Elad, can you update version field with the version of vdsm you tested?

You have it in the description already...

Comment 8 Nir Soffer 2015-11-12 14:53:20 UTC
Caused by mismatch in the way we initialize credentials when not using CHAP:

Adding more information when we have a cred mismatch show:

Thread-151::DEBUG::2015-11-12 15:26:25,236::storageServer::538::Storage.Server.ISCSI::(getSessionInfo) Session mismatches: ["cred mismatch u'':u'' != None:None", "target.portal.hostname mismatch: '10.35.146.129' != '10.35.146.161'", "target.portal.hostname mismatch: '10.35.146.129' != '10.35.146.193'", "target.portal.hostname mismatch: '10.35.146.129' != '10.35.146.225'"]

This info is *never* shown in vdsm log due to security reasons.

On the request side, we are using username='', and password='', and on the iscsi side, None, None.

Comment 9 Nir Soffer 2015-11-12 14:54:15 UTC
Tal, we need to backport this to 3.6 - do we need any flags/clones?

Comment 10 Tal Nisan 2015-11-15 09:49:33 UTC
No, all is set, feel free to backport

Comment 11 Allon Mureinik 2015-11-19 14:38:49 UTC
Returning to POST - this patch requires backporting to the 3.6 branch.

Comment 12 Elad 2016-01-17 15:10:51 UTC
Tested using:
rhevm-3.6.2.5-0.1.el6.noarch
vdsm-4.17.17-0.el7ev.noarch
iscsi-initiator-utils-6.2.0.873-32.el7.x86_64

iSCSI sessions are not disconnected while deactivating the last iSCSI storage domain in the pool. 

For the scenario of host maintenance while having an active iSCSI domain in the pool, the sessions are being closed.

Nir, this doesn't seem right to me. The sessions should be closed on last iSCSI domain deactivation.

Comment 13 Nir Soffer 2016-01-17 15:18:43 UTC
(In reply to Elad from comment #12)
> Tested using:
> rhevm-3.6.2.5-0.1.el6.noarch
> vdsm-4.17.17-0.el7ev.noarch
> iscsi-initiator-utils-6.2.0.873-32.el7.x86_64
> 
> iSCSI sessions are not disconnected while deactivating the last iSCSI
> storage domain in the pool. 
> 
> For the scenario of host maintenance while having an active iSCSI domain in
> the pool, the sessions are being closed.
> 
> Nir, this doesn't seem right to me. The sessions should be closed on last
> iSCSI domain deactivation.

This is not related to this bug, we have an another bug for managing iscsi
connections. This bug is *only* for the issue when not using chap.

If engine asked to disconnect a connection, it must be disconnected. If engine
did not ask to disconnect, it should not be disconnected (bug in engine).

Please check the logs and verify that vdsm disconnected everything engine ask
it to disconnect. Do we have the "cred mismatch" that we had before?

Ala, do you remember the engine bug about managing connection?

Comment 14 Ala Hino 2016-01-17 15:24:52 UTC
Bug 1146115

Comment 15 Elad 2016-01-17 15:38:10 UTC
Thanks Nir.

No iSCSI credentials mismatch while vdsm attempts to disconnect its iSCSI sessions. 

Relevant for host maintenance as described in comment #12.

Tested using:
rhevm-3.6.2.5-0.1.el6.noarch
vdsm-4.17.17-0.el7ev.noarch
iscsi-initiator-utils-6.2.0.873-32.el7.x86_64


Note You need to log in before you can comment on or make changes to this bug.