Created attachment 1091766 [details] /var/log/ + 'iscsiadm -m session -P3' output before and after disconnectStorageServer from 3.5 host + /var/log/ + 'iscsiadm -m session -P3' output before and after disconnectStorageServer from 3.6 hos Description of problem: Host doesn't disconnect from the iSCSI sessions when deactivating the last iSCSI domain in the pool or when putting the host in maintenance. Tested the same flow on rhevm-3.5.6 and the sessions are disconnected on host/domain maintenance. Version-Release number of selected component (if applicable): vdsm-4.17.10.1-0.el7ev.noarch iscsi-initiator-utils-6.2.0.873-32.el7.x86_64 rhevm-3.6.0.3-0.1.el6.noarch How reproducible: Occurs on 2 different 3.6 setups. Not on 3.5.6 Steps to Reproduce: 1. Put host in maintenance while it is connected to a pool with an iSCSI domain Actual results: Host changes to maintenance status but the sessions remain open. disconnectStorageServer is called: Thread-2245::INFO::2015-11-09 13:21:25,314::logUtils::48::dispatcher::(wrapper) Run and protect: disconnectStorageServer(domType=3, spUUID=u'00000001-0001-0001-0001-000000000004', conList=[{u'id': u'bc9ed0f2-893a-4767-bd6a-0e4b5d5675cf', u'connection': u'10.35.160.106', u'iqn': u'iqn.1992-04.com.emc:cx.ckm00121000438.b6', u'user': u'', u'tpgt': u'3', u'password': '********', u'port': u'3260'}, {u'id': u'8d3a491d-0a4a-4f22-8c7c-f6b12c3fca15', u'connection': u'10.35.160.105', u'iqn': u'iqn.1992-04.com.emc:cx.ckm00121000438.a7', u'user': u'', u'tpgt': u'2', u'password': '********', u'port': u'3260'}, {u'id': u'2897a350-48bc-4153-a43e-5602532b2a7c', u'connection': u'10.35.160.104', u'iqn': u'iqn.1992-04.com.emc:cx.ckm00121000438.a6', u'user': u'', u'tpgt': u'1', u'password': '********', u'port': u'3260'}, {u'id': u'7c6450c3-5829-447c-ba97-a2cad9b8d030', u'connection': u'10.35.160.107', u'iqn': u'iqn.1992-04.com.emc:cx.ckm00121000438.b7', u'user': u'', u'tpgt': u'4', u'password': '********', u'port': u'3260'}], options=None) 'Session mismatches': Thread-2245::DEBUG::2015-11-09 13:21:25,406::storageServer::525::Storage.Server.ISCSI::(getSessionInfo) Session mismatches: ["target.portal.hostname mismatch: '10.35.160.106' != '10.35.160.105'", "target.portal.hostname mismatch: '10.35.160.106' != '10.35.160.107'", 'cred mismatch', "target.portal.hostname mismatch: '10.35.160.106' != '10.35.146.129'", "target.portal.hostname mismatch: '10.35.160.106' != '10.35.146.161'", "target.portal.hostname mismatch: '10.35.160.106' != '10.35.146.193'", "target.portal.hostname mismatch: '10.35.160.106' != '10.35.146.225'", "target.portal.hostname mismatch: '10.35.160.106' != '10.35.160.104'"] Thread-2245::DEBUG::2015-11-09 13:21:25,407::storageServer::543::Storage.Server.ISCSI::(disconnect) not connected Thread-2245::INFO::2015-11-09 13:21:25,407::storageServer::538::Storage.Server.ISCSI::(disconnect) disconnecting Thread-2245::DEBUG::2015-11-09 13:21:25,500::storageServer::525::Storage.Server.ISCSI::(getSessionInfo) Session mismatches: ['cred mismatch', "target.portal.hostname mismatch: '10.35.160.105' != '10.35.160.107'", "target.portal.hostname mismatch: '10.35.160.105' != '10.35.160.106'", "target.portal.hostname mismatch: '10.35.160.105' != '10.35.146.129'", "target.portal.hostname mismatch: '10.35.160.105' != '10.35.146.161'", "target.portal.hostname mismatch: '10.35.160.105' != '10.35.146.193'", "target.portal.hostname mismatch: '10.35.160.105' != '10.35.146.225'", "target.portal.hostname mismatch: '10.35.160.105' != '10.35.160.104'"] Thread-2245::DEBUG::2015-11-09 13:21:25,500::storageServer::543::Storage.Server.ISCSI::(disconnect) not connected Thread-2245::INFO::2015-11-09 13:21:25,500::storageServer::538::Storage.Server.ISCSI::(disconnect) disconnecting Thread-2245::DEBUG::2015-11-09 13:21:25,589::storageServer::525::Storage.Server.ISCSI::(getSessionInfo) Session mismatches: ["target.portal.hostname mismatch: '10.35.160.104' != '10.35.160.105'", "target.portal.hostname mismatch: '10.35.160.104' != '10.35.160.107'", "target.portal.hostname mismatch: '10.35.160.104' != '10.35.160.106'", "target.portal.hostname mismatch: '10.35.160.104' != '10.35.146.129'", "target.portal.hostname mismatch: '10.35.160.104' != '10.35.146.161'", "target.portal.hostname mismatch: '10.35.160.104' != '10.35.146.193'", "target.portal.hostname mismatch: '10.35.160.104' != '10.35.146.225'", 'cred mismatch'] Thread-2245::DEBUG::2015-11-09 13:21:25,589::storageServer::543::Storage.Server.ISCSI::(disconnect) not connected Thread-2245::INFO::2015-11-09 13:21:25,589::storageServer::538::Storage.Server.ISCSI::(disconnect) disconnecting Thread-2245::DEBUG::2015-11-09 13:21:25,680::storageServer::525::Storage.Server.ISCSI::(getSessionInfo) Session mismatches: ["target.portal.hostname mismatch: '10.35.160.107' != '10.35.160.105'", 'cred mismatch', "target.portal.hostname mismatch: '10.35.160.107' != '10.35.160.106'", "target.portal.hostname mismatch: '10.35.160.107' != '10.35.146.129'", "target.portal.hostname mismatch: '10.35.160.107' != '10.35.146.161'", "target.portal.hostname mismatch: '10.35.160.107' != '10.35.146.193'", "target.portal.hostname mismatch: '10.35.160.107' != '10.35.146.225'", "target.portal.hostname mismatch: '10.35.160.107' != '10.35.160.104'"] Thread-2245::DEBUG::2015-11-09 13:21:25,680::storageServer::543::Storage.Server.ISCSI::(disconnect) not connected Expected results: Host should disconnect the iSCSI session on host / last iSCSI domain deactivation. Additional info: /var/log/ + 'iscsiadm -m session -P3' output before and after disconnectStorageServer from 3.5 host + /var/log/ + 'iscsiadm -m session -P3' output before and after disconnectStorageServer from 3.6 host ***Note that on the 3.5 host, after disconnectStorageServer, 3 sessions remain open, 4 get closed*** =============================== disconnectStorageServer on 3.5 host: Thread-3247::INFO::2015-11-09 16:42:43,071::logUtils::44::dispatcher::(wrapper) Run and protect: disconnectStorageServer(domType=3, spUUID='b7014a1f-f5a7-4dfb-bb0f-491dd19596c0', conList=[{'connection': '10.35.160 .106', 'iqn': 'iqn.1992-04.com.emc:cx.ckm00121000438.b6', 'user': '', 'tpgt': '3', 'password': '******', 'id': '14fcf820-85e0-4264-8781-f9b4772948ed', 'port': '3260'}, {'connection': '10.35.160.104', 'iqn': 'iqn.1 992-04.com.emc:cx.ckm00121000438.a6', 'user': '', 'tpgt': '1', 'password': '******', 'id': '60932489-2a7c-4f69-85ea-07ae1a45b6e4', 'port': '3260'}, {'connection': '10.35.160.105', 'iqn': 'iqn.1992-04.com.emc:cx.ck m00121000438.a7', 'user': '', 'tpgt': '2', 'password': '******', 'id': 'c50f1766-f95d-47a5-b570-56514ec1100e', 'port': '3260'}, {'connection': '10.35.160.107', 'iqn': 'iqn.1992-04.com.emc:cx.ckm00121000438.b7', 'u ser': '', 'tpgt': '4', 'password': '******', 'id': 'd9518d62-5e44-476c-add5-d8f7430516d7', 'port': '3260'}], options=None) Thread-3247::DEBUG::2015-11-09 16:42:43,124::iscsiadm::97::Storage.Misc.excCmd::(_runCmd) /usr/bin/sudo -n /sbin/iscsiadm -m session -r 58 -u (cwd None) Thread-3247::DEBUG::2015-11-09 16:42:43,185::iscsiadm::97::Storage.Misc.excCmd::(_runCmd) SUCCESS: <err> = ''; <rc> = 0 Thread-3247::INFO::2015-11-09 16:42:43,185::iscsi::557::Storage.ISCSI::(setRpFilterIfNeeded) iSCSI iface.net_ifacename not provided. Skipping. Thread-3247::DEBUG::2015-11-09 16:42:43,277::iscsiadm::97::Storage.Misc.excCmd::(_runCmd) /usr/bin/sudo -n /sbin/iscsiadm -m session -r 62 -u (cwd None) Thread-3247::DEBUG::2015-11-09 16:42:43,333::iscsiadm::97::Storage.Misc.excCmd::(_runCmd) SUCCESS: <err> = ''; <rc> = 0 Thread-3247::INFO::2015-11-09 16:42:43,334::iscsi::557::Storage.ISCSI::(setRpFilterIfNeeded) iSCSI iface.net_ifacename not provided. Skipping. Thread-3247::DEBUG::2015-11-09 16:42:43,359::iscsiadm::97::Storage.Misc.excCmd::(_runCmd) /usr/bin/sudo -n /sbin/iscsiadm -m session -r 56 -u (cwd None) Thread-3247::DEBUG::2015-11-09 16:42:43,433::iscsiadm::97::Storage.Misc.excCmd::(_runCmd) SUCCESS: <err> = ''; <rc> = 0 Thread-3247::INFO::2015-11-09 16:42:43,434::iscsi::557::Storage.ISCSI::(setRpFilterIfNeeded) iSCSI iface.net_ifacename not provided. Skipping. Thread-3247::DEBUG::2015-11-09 16:42:43,481::iscsiadm::97::Storage.Misc.excCmd::(_runCmd) /usr/bin/sudo -n /sbin/iscsiadm -m session -r 60 -u (cwd None) Thread-3247::DEBUG::2015-11-09 16:42:43,552::iscsiadm::97::Storage.Misc.excCmd::(_runCmd) SUCCESS: <err> = ''; <rc> = 0
Created attachment 1091793 [details] 3.5 and 3.6 engine logs
Elad, it looks a bit similar to bug 1146115, does it happen in the same scenario or also when there are no other consumers for the same connections?
According to 1146115#c19, it is for a scenario of more than one iscsi domain in the DC while here I found that the sessions remain open with a single iscsi domain.
This bug report has Keywords: Regression or TestBlocker. Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.
Looking in 3_6 session before the disconnect: $ egrep 'Target:|Portal:' iscsiadm_sessions_before_disconnectStorageServer Target: iqn.1992-04.com.emc:cx.ckm00121000438.a7 (non-flash) Current Portal: 10.35.160.105:3260,2 Persistent Portal: 10.35.160.105:3260,2 Target: iqn.1992-04.com.emc:cx.ckm00121000438.b7 (non-flash) Current Portal: 10.35.160.107:3260,4 Persistent Portal: 10.35.160.107:3260,4 Target: iqn.1992-04.com.emc:cx.ckm00121000438.b6 (non-flash) Current Portal: 10.35.160.106:3260,3 Persistent Portal: 10.35.160.106:3260,3 Target: iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c00 (non-flash) Current Portal: 10.35.146.129:3260,1 Persistent Portal: 10.35.146.129:3260,1 Target: iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c01 (non-flash) Current Portal: 10.35.146.161:3260,1 Persistent Portal: 10.35.146.161:3260,1 Target: iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c04 (non-flash) Current Portal: 10.35.146.193:3260,1 Persistent Portal: 10.35.146.193:3260,1 Target: iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c05 (non-flash) Current Portal: 10.35.146.225:3260,1 Persistent Portal: 10.35.146.225:3260,1 Target: iqn.1992-04.com.emc:cx.ckm00121000438.a6 (non-flash) Current Portal: 10.35.160.104:3260,1 Persistent Portal: 10.35.160.104:3260,1 Engines asked to disconnect these connections: [{u'connection': u'10.35.160.106', u'id': u'bc9ed0f2-893a-4767-bd6a-0e4b5d5675cf', u'iqn': u'iqn.1992-04.com.emc:cx.ckm00121000438.b6', u'password': '********', u'port': u'3260', u'tpgt': u'3', u'user': u''}, {u'connection': u'10.35.160.105', u'id': u'8d3a491d-0a4a-4f22-8c7c-f6b12c3fca15', u'iqn': u'iqn.1992-04.com.emc:cx.ckm00121000438.a7', u'password': '********', u'port': u'3260', u'tpgt': u'2', u'user': u''}, {u'connection': u'10.35.160.104', u'id': u'2897a350-48bc-4153-a43e-5602532b2a7c', u'iqn': u'iqn.1992-04.com.emc:cx.ckm00121000438.a6', u'password': '********', u'port': u'3260', u'tpgt': u'1', u'user': u''}, {u'connection': u'10.35.160.107', u'id': u'7c6450c3-5829-447c-ba97-a2cad9b8d030', u'iqn': u'iqn.1992-04.com.emc:cx.ckm00121000438.b7', u'password': '********', u'port': u'3260', u'tpgt': u'4', u'user': u''}] Searching for sessions, we had these mismatches: Trying to disconnect 10.35.160.106: ["target.portal.hostname mismatch: '10.35.160.106' != '10.35.160.105'", "target.portal.hostname mismatch: '10.35.160.106' != '10.35.160.107'", 'cred mismatch', "target.portal.hostname mismatch: '10.35.160.106' != '10.35.146.129'", "target.portal.hostname mismatch: '10.35.160.106' != '10.35.146.161'", "target.portal.hostname mismatch: '10.35.160.106' != '10.35.146.193'", "target.portal.hostname mismatch: '10.35.160.106' != '10.35.146.225'", "target.portal.hostname mismatch: '10.35.160.106' != '10.35.160.104'"] Trying to disconnect 10.35.160.105: ['cred mismatch', "target.portal.hostname mismatch: '10.35.160.105' != '10.35.160.107'", "target.portal.hostname mismatch: '10.35.160.105' != '10.35.160.106'", "target.portal.hostname mismatch: '10.35.160.105' != '10.35.146.129'", "target.portal.hostname mismatch: '10.35.160.105' != '10.35.146.161'", "target.portal.hostname mismatch: '10.35.160.105' != '10.35.146.193'", "target.portal.hostname mismatch: '10.35.160.105' != '10.35.146.225'", "target.portal.hostname mismatch: '10.35.160.105' != '10.35.160.104'"] Trying to disconnect 10.35.160.104: ["target.portal.hostname mismatch: '10.35.160.104' != '10.35.160.105'", "target.portal.hostname mismatch: '10.35.160.104' != '10.35.160.107'", "target.portal.hostname mismatch: '10.35.160.104' != '10.35.160.106'", "target.portal.hostname mismatch: '10.35.160.104' != '10.35.146.129'", "target.portal.hostname mismatch: '10.35.160.104' != '10.35.146.161'", "target.portal.hostname mismatch: '10.35.160.104' != '10.35.146.193'", "target.portal.hostname mismatch: '10.35.160.104' != '10.35.146.225'", 'cred mismatch'] Trying to disconnect 10.35.160.107: ["target.portal.hostname mismatch: '10.35.160.107' != '10.35.160.105'", 'cred mismatch', "target.portal.hostname mismatch: '10.35.160.107' != '10.35.160.106'", "target.portal.hostname mismatch: '10.35.160.107' != '10.35.146.129'", "target.portal.hostname mismatch: '10.35.160.107' != '10.35.146.161'", "target.portal.hostname mismatch: '10.35.160.107' != '10.35.146.193'", "target.portal.hostname mismatch: '10.35.160.107' != '10.35.146.225'", "target.portal.hostname mismatch: '10.35.160.107' != '10.35.160.104'"] We can clearly see that each connection is matched by target.portal.hostname since each address we look for exits in all other attempts. So it seems that the issue is caused by "cred mismatch". The issue is engine user and password and stored username and passwords are hidden (to prevent persisting of sensitive data in the logs). To debug this, we need to add logs revealing both the sent user and password values and the stored values we get from the active session, and repeat this test. Elad, would you like to try to add these logs and repeat the test?
Elad, can you update version field with the version of vdsm you tested?
(In reply to Nir Soffer from comment #6) > Elad, can you update version field with the version of vdsm you tested? You have it in the description already...
Caused by mismatch in the way we initialize credentials when not using CHAP: Adding more information when we have a cred mismatch show: Thread-151::DEBUG::2015-11-12 15:26:25,236::storageServer::538::Storage.Server.ISCSI::(getSessionInfo) Session mismatches: ["cred mismatch u'':u'' != None:None", "target.portal.hostname mismatch: '10.35.146.129' != '10.35.146.161'", "target.portal.hostname mismatch: '10.35.146.129' != '10.35.146.193'", "target.portal.hostname mismatch: '10.35.146.129' != '10.35.146.225'"] This info is *never* shown in vdsm log due to security reasons. On the request side, we are using username='', and password='', and on the iscsi side, None, None.
Tal, we need to backport this to 3.6 - do we need any flags/clones?
No, all is set, feel free to backport
Returning to POST - this patch requires backporting to the 3.6 branch.
Tested using: rhevm-3.6.2.5-0.1.el6.noarch vdsm-4.17.17-0.el7ev.noarch iscsi-initiator-utils-6.2.0.873-32.el7.x86_64 iSCSI sessions are not disconnected while deactivating the last iSCSI storage domain in the pool. For the scenario of host maintenance while having an active iSCSI domain in the pool, the sessions are being closed. Nir, this doesn't seem right to me. The sessions should be closed on last iSCSI domain deactivation.
(In reply to Elad from comment #12) > Tested using: > rhevm-3.6.2.5-0.1.el6.noarch > vdsm-4.17.17-0.el7ev.noarch > iscsi-initiator-utils-6.2.0.873-32.el7.x86_64 > > iSCSI sessions are not disconnected while deactivating the last iSCSI > storage domain in the pool. > > For the scenario of host maintenance while having an active iSCSI domain in > the pool, the sessions are being closed. > > Nir, this doesn't seem right to me. The sessions should be closed on last > iSCSI domain deactivation. This is not related to this bug, we have an another bug for managing iscsi connections. This bug is *only* for the issue when not using chap. If engine asked to disconnect a connection, it must be disconnected. If engine did not ask to disconnect, it should not be disconnected (bug in engine). Please check the logs and verify that vdsm disconnected everything engine ask it to disconnect. Do we have the "cred mismatch" that we had before? Ala, do you remember the engine bug about managing connection?
Bug 1146115
Thanks Nir. No iSCSI credentials mismatch while vdsm attempts to disconnect its iSCSI sessions. Relevant for host maintenance as described in comment #12. Tested using: rhevm-3.6.2.5-0.1.el6.noarch vdsm-4.17.17-0.el7ev.noarch iscsi-initiator-utils-6.2.0.873-32.el7.x86_64