Description of problem:

When the host boots up it connects to the Hosted Engine storage, according to the configuration in /etc/ovirt-hosted-engine/hosted-engine.conf:

# cat /etc/ovirt-hosted-engine/hosted-engine.conf | egrep 'iqn|connection'
iqn=iqn.2003-01.org.linux-iscsi.storage.x8664:hostedengine
connectionUUID=97d0b390-c93c-4ab9-a418-64f26608a691

The result is:

360014052d03c99fec334a71a88308fb6 dm-8 LIO-ORG ,hostedengine
size=60G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 7:0:0:0 sdf 8:80 active ready running

tcp: [1] 192.168.100.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:hostedengine

So far so good. But then, when the host is activated, the engine tells vdsm to connect to the same storage again. I believe the problem is that vdsm does not recognize that this is the same storage the ha-agent already asked it to connect to, and connects again:

360014052d03c99fec334a71a88308fb6 dm-8 LIO-ORG ,hostedengine
size=60G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 7:0:0:0 sdf 8:80 active ready running
|-+- policy='service-time 0' prio=1 status=enabled
| `- 8:0:0:0 sdg 8:96 active ready running

tcp: [1] 192.168.100.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:hostedengine
tcp: [2] 192.168.100.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:hostedengine

Putting the host into maintenance mode disconnects the "activation" connection, and the session count goes back to 1 (the one requested by ha-agent).

I believe this is not desirable: a fancier iSCSI bond/multipath configuration added later (load balancing, perhaps) might treat these as two different paths, when they are actually two identical connections that provide no performance or reliability benefit. In fact, it's just wasting resources.

Version-Release number of selected component (if applicable):
ovirt-engine-4.0.4
vdsm-4.18.11-1.el7ev.x86_64

How reproducible:
100%
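For anyone reproducing this, the duplication can be spotted without reading the multipath topology: two sessions to the same portal+target differ only in their session id. Below is a minimal sketch (the helper name count_duplicate_sessions is my own, not part of any tool) that strips the "[N]" session id from `iscsiadm -m session` output and counts portal+target pairs that appear more than once; on an affected host, piping `iscsiadm -m session` into it should print a nonzero count.

```shell
#!/bin/sh
# Hypothetical helper: read `iscsiadm -m session` lines on stdin, e.g.
#   tcp: [1] 192.168.100.1:3260,1 iqn.2003-01...:hostedengine
# Drop the per-session "[N]" id so identical portal+target pairs collapse,
# then count the pairs that occur more than once.
count_duplicate_sessions() {
    sed 's/\[[0-9]*\] //' | sort | uniq -d | wc -l
}

# Example using the two duplicated sessions from this report:
printf '%s\n' \
    'tcp: [1] 192.168.100.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:hostedengine' \
    'tcp: [2] 192.168.100.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:hostedengine' \
    | count_duplicate_sessions
```

On a live host you would run `iscsiadm -m session | count_duplicate_sessions` instead of the printf; a healthy single-path setup prints 0.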
Allon, what's the issue here? I see there are two storage connections listed above.
(In reply to Sandro Bonazzola from comment #1)
> Allon, what's the issue here? I see there are 2 storage listed above.

Germano?
(In reply to Allon Mureinik from comment #2)
> (In reply to Sandro Bonazzola from comment #1)
> > Allon, what's the issue here? I see there are 2 storage listed above.
> Germano?

Hmmm... I thought comment #0 was quite clear. Sorry.

In the case of the Hosted-Engine storage domain, we are logging in TWICE to the exact same target: once when ha-agent asks vdsm to connect to the storage, and again when RHV-M activates the host.

It's not a big deal (low severity), but I am afraid this is not right. I found this in my environment, and I think it is not desirable to have this duplicate entry. If one configures some fancier multipath setup (failover/load balancing), it may get in the way.

Ideally, when the host is activated it should not log in again to the exact same target it's already logged in to, via the same IP and connection. I am not sure whether vdsm should figure this out, or the engine simply shouldn't tell vdsm to connect again.

All clear?
Simone, any idea why the double login?
Exactly as in the bug description: the hosted-engine storage domain gets logged in once by ovirt-ha-agent just after boot, and then it gets logged in a second time by the engine once we have an engine.

I'm not sure we want to remove this behavior, as it's the basis for:
https://bugzilla.redhat.com/show_bug.cgi?id=1267807

The idea is to simply let the agent connect the first/initial path, and then let the engine (once it's active) connect the others using the iSCSI bond feature.
I think that the point is that ovirt-ha-agent ignores the undocumented 'netIfaceName' parameter. See: https://bugzilla.redhat.com/show_bug.cgi?id=1193961#c33
Pushing to 4.2.1, as the zero-node installation is still not ready
Zero-node should solve this; please test with that flow.
Works for me with these components on an Ansible deployment:

ovirt-hosted-engine-ha-2.2.7-1.el7ev.noarch
ovirt-hosted-engine-setup-2.2.13-1.el7ev.noarch
rhvm-appliance-4.2-20180202.0.el7.noarch
Linux 3.10.0-861.el7.x86_64 #1 SMP Wed Mar 14 10:21:01 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.5 (Maipo)

I saw only a single active session towards the storage after the deployment finished.

alma03 ~]# cat /etc/ovirt-hosted-engine/hosted-engine.conf | egrep 'iqn|connection'
connectionUUID=e29cf818-5ee5-46e1-85c1-8aeefa33e95d
iqn=iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c00

alma03 ~]# iscsiadm -m session
tcp: [1] 10.35.146.129:3260,1 iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c00 (non-flash)

alma03 ~]# multipath -ll
3514f0c5a51601655 dm-0 XtremIO ,XtremApp
size=70G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='queue-length 0' prio=1 status=active
  `- 6:0:0:1 sdb 8:16 active ready running
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2018:1489