Bug 1387085 - Hosted-Engine iSCSI target logged in twice on activated Host (to be solved via new HE installation flow)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 4.0.3
Hardware: x86_64
OS: Linux
Priority: high
Severity: low
Target Milestone: ovirt-4.2.0
Target Release: ---
Assignee: Maor
QA Contact: Nikolai Sednev
URL:
Whiteboard:
Depends On: 1393902 1455169
Blocks:
Reported: 2016-10-20 03:50 UTC by Germano Veit Michel
Modified: 2021-06-10 11:38 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-05-15 17:49:33 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:
lsvaty: testing_plan_complete-


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 1193961 | 0 | high | CLOSED | [RFE] [hosted-engine] [iSCSI multipath] Support hosted engine deployment based on multiple iSCSI initiators | 2021-02-22 00:41:40 UTC
Red Hat Bugzilla | 1267807 | 0 | medium | CLOSED | [RFE] HostedEngine - support for multiple iscsi targets | 2021-02-22 00:41:40 UTC
Red Hat Product Errata | RHEA-2018:1489 | 0 | None | None | None | 2018-05-15 17:51:04 UTC

Internal Links: 1193961 1267807

Description Germano Veit Michel 2016-10-20 03:50:11 UTC
Description of problem:

When the host boots up, it connects to the Hosted Engine storage according to the configuration in /etc/ovirt-hosted-engine/hosted-engine.conf.

# cat /etc/ovirt-hosted-engine/hosted-engine.conf | egrep 'iqn|connection'
iqn=iqn.2003-01.org.linux-iscsi.storage.x8664:hostedengine
connectionUUID=97d0b390-c93c-4ab9-a418-64f26608a691

The result is:

360014052d03c99fec334a71a88308fb6 dm-8 LIO-ORG ,hostedengine    
size=60G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 7:0:0:0  sdf 8:80  active ready running

tcp: [1] 192.168.100.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:hostedengine

Fine. But when the host is activated, the engine tells vdsm to connect to the same storage again. I believe the problem is that vdsm does not recognize this as the same storage that the ha-agent asked it to connect to, and so it connects again:

360014052d03c99fec334a71a88308fb6 dm-8 LIO-ORG ,hostedengine    
size=60G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 7:0:0:0  sdf 8:80  active ready running
|-+- policy='service-time 0' prio=1 status=enabled
| `- 8:0:0:0  sdg 8:96  active ready running

tcp: [1] 192.168.100.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:hostedengine
tcp: [2] 192.168.100.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:hostedengine

Putting the host into maintenance mode makes the "Activation" connection disconnect, and the session count goes back to 1 (the one requested by ha-agent).

I believe this is not desirable. If a fancier iSCSI bond multipath configuration is added later (load balancing, perhaps?), it might treat these as two different paths, when they are just two identical connections that do not provide any performance or reliability benefit. In fact, they just waste resources.
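
As an illustration only, the duplicate login can be spotted by grouping the `iscsiadm -m session` lines by transport, portal, and target. The sketch below is a minimal Python example and assumes the one-line-per-session format shown in this report. `duplicate_sessions` is a hypothetical helper, not part of vdsm or ovirt-hosted-engine-ha. It does not look at the iSCSI interface, so sessions deliberately bound to different interfaces would also be reported and would need a separate check.

```python
import re
from collections import Counter

# Matches lines such as:
#   tcp: [1] 192.168.100.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:hostedengine
# Groups: transport, session id, portal, target portal group tag, target IQN.
SESSION_RE = re.compile(r"^(\w+): \[(\d+)\] (\S+),(\d+) (\S+)")


def duplicate_sessions(output):
    """Return {(transport, portal, target): count} for entries seen more than once."""
    counts = Counter()
    for line in output.splitlines():
        match = SESSION_RE.match(line.strip())
        if match:
            transport, _sid, portal, _tpgt, target = match.groups()
            counts[(transport, portal, target)] += 1
    return {key: n for key, n in counts.items() if n > 1}


# Session list copied from this report (host activated, two logins).
SAMPLE = """\
tcp: [1] 192.168.100.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:hostedengine
tcp: [2] 192.168.100.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:hostedengine
"""

if __name__ == "__main__":
    for key, n in duplicate_sessions(SAMPLE).items():
        print(key, "logged in", n, "times")
```

An empty result means every (transport, portal, target) combination has a single session.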

Version-Release number of selected component (if applicable):
ovirt-engine-4.0.4
vdsm-4.18.11-1.el7ev.x86_64

How reproducible:
100%

Comment 1 Sandro Bonazzola 2016-11-07 10:05:40 UTC
Allon, what's the issue here? I see there are 2 storage listed above.

Comment 2 Allon Mureinik 2016-11-07 13:33:24 UTC
(In reply to Sandro Bonazzola from comment #1)
> Allon, what's the issue here? I see there are 2 storage listed above.
Germano?

Comment 3 Germano Veit Michel 2016-11-08 05:29:35 UTC
(In reply to Allon Mureinik from comment #2)
> (In reply to Sandro Bonazzola from comment #1)
> > Allon, what's the issue here? I see there are 2 storage listed above.
> Germano?

Hmmm....I thought comment #0 was quite clear. Sorry.

In the case of the Hosted-Engine Storage Domain, we are logging in TWICE to the exact same target (once when ha-agent asks vdsm to connect to storage, and again when rhev-m activates the host). It's not a big deal (Low Severity), but I am afraid this is not right.

I found this in my env and I think it is not desirable to have this duplicate entry. If one configures some fancier multipath setup (failover/load balancing), it may get in the way. Ideally, when the host is activated it should not log in again to the exact same target it is already logged in to, via the same IP and connection. I am not sure whether vdsm should figure this out or the engine should not tell vdsm to connect again.
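
To make the expected behavior concrete, here is a minimal Python sketch of an idempotent connect: a second request for a connection that is already active is skipped. This is not how vdsm or the engine is implemented. `Connection`, `SessionRegistry`, and the `"default"` interface name are hypothetical and exist only for this example. The interface is part of the key on the assumption that logins over different interfaces are legitimate separate paths.

```python
from typing import NamedTuple


class Connection(NamedTuple):
    """Identity of an iSCSI login request (hypothetical model)."""
    portal: str             # e.g. "192.168.100.1:3260"
    iqn: str                # target IQN
    iface: str = "default"  # assumed interface name; differing ifaces are distinct paths


class SessionRegistry:
    """Tracks active connections and skips duplicate login requests."""

    def __init__(self):
        self._active = set()
        self.logins = 0  # number of real logins performed

    def connect(self, conn):
        """Log in unless an identical connection is already active.

        Returns True if a login was performed, False if it was skipped.
        """
        if conn in self._active:
            return False
        self._login(conn)
        self._active.add(conn)
        return True

    def _login(self, conn):
        # Placeholder for the real login step; this sketch only counts calls.
        self.logins += 1


if __name__ == "__main__":
    registry = SessionRegistry()
    he = Connection("192.168.100.1:3260",
                    "iqn.2003-01.org.linux-iscsi.storage.x8664:hostedengine")
    print(registry.connect(he))  # request from ha-agent at boot
    print(registry.connect(he))  # request from the engine on host activation
    print(registry.logins)
```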

All clear?

Comment 4 Tal Nisan 2016-11-22 16:38:01 UTC
Simone, any idea why the double login?

Comment 5 Simone Tiraboschi 2016-11-22 16:47:40 UTC
Exactly as in the bug description:
the hosted-engine storage domain gets logged in once by ovirt-ha-agent just after boot, and then it gets logged in a second time by the engine once we have an engine.

Not sure we want to remove this behavior, as it's the basis for:
https://bugzilla.redhat.com/show_bug.cgi?id=1267807

The idea is simply to let the agent connect the first/initial path, and then let the engine (once it's active) connect the others using the iSCSI bond feature.

Comment 6 Simone Tiraboschi 2017-07-31 15:48:16 UTC
I think that the point is that ovirt-ha-agent ignores the undocumented 'netIfaceName' parameter.

See: https://bugzilla.redhat.com/show_bug.cgi?id=1193961#c33

Comment 7 Allon Mureinik 2017-11-15 15:33:40 UTC
Pushing to 4.2.1, as the zero-node installation is still not ready

Comment 8 Yaniv Lavi 2017-12-27 13:11:24 UTC
Zero node should solve this, please test with that flow.

Comment 9 Nikolai Sednev 2018-03-18 15:53:49 UTC
Works for me with these components on an Ansible deployment:
ovirt-hosted-engine-ha-2.2.7-1.el7ev.noarch
ovirt-hosted-engine-setup-2.2.13-1.el7ev.noarch
rhvm-appliance-4.2-20180202.0.el7.noarch
Linux 3.10.0-861.el7.x86_64 #1 SMP Wed Mar 14 10:21:01 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.5 (Maipo)

I saw only a single active session towards the storage after the deployment finished.

alma03 ~]# cat /etc/ovirt-hosted-engine/hosted-engine.conf | egrep 'iqn|connection'
connectionUUID=e29cf818-5ee5-46e1-85c1-8aeefa33e95d
iqn=iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c00
alma03 ~]# iscsiadm -m session
tcp: [1] 10.35.146.129:3260,1 iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c00 (non-flash)
alma03 ~]# multipath -ll
3514f0c5a51601655 dm-0 XtremIO ,XtremApp        
size=70G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='queue-length 0' prio=1 status=active
  `- 6:0:0:1 sdb 8:16 active ready running
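
For repeat verification, the `multipath -ll` side of this check could be scripted along the following lines. This is a minimal Python sketch that assumes the output layout shown in this report: a map header of the form `<WWID> dm-N ...` with no user-friendly alias, and path lines containing `H:C:T:L  device  major:minor`. `paths_per_map` is a hypothetical helper.

```python
import re

# Map header, e.g. "3514f0c5a51601655 dm-0 XtremIO ,XtremApp"
MAP_RE = re.compile(r"^(\S+)\s+(dm-\d+)\s")
# Path line, e.g. "  `- 6:0:0:1 sdb 8:16 active ready running"
PATH_RE = re.compile(r"(\d+:\d+:\d+:\d+)\s+(\w+)\s+\d+:\d+\s")


def paths_per_map(output):
    """Return {wwid: [device, ...]} parsed from multipath -ll style text."""
    result = {}
    current = None
    for line in output.splitlines():
        header = MAP_RE.match(line)
        if header:
            current = header.group(1)
            result[current] = []
            continue
        path = PATH_RE.search(line)
        if path and current is not None:
            result[current].append(path.group(2))
    return result


# Output copied from this comment (single path after the new deployment flow).
SAMPLE = """\
3514f0c5a51601655 dm-0 XtremIO ,XtremApp
size=70G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='queue-length 0' prio=1 status=active
  `- 6:0:0:1 sdb 8:16 active ready running
"""

if __name__ == "__main__":
    for wwid, devices in paths_per_map(SAMPLE).items():
        print(wwid, len(devices), devices)
```

A map with more paths than expected for the configured portals and interfaces would point back to the duplicate login described in this bug.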

Comment 14 errata-xmlrpc 2018-05-15 17:49:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:1489

Comment 15 Franta Kust 2019-05-16 13:05:51 UTC
BZ<2>Jira Resync

