1193961 – [RFE] [hosted-engine] [iSCSI multipath] Support hosted engine deployment based on multiple iSCSI initiators

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	ovirt-hosted-engine-setup
Classification:	oVirt
Component:	Plugins.Block
Sub Component:
Version:	---
Hardware:	x86_64
OS:	Unspecified
Priority:	high
Severity:	high with 2 votes
Target Milestone:	ovirt-4.2.5
Target Release:	---
Assignee:	Simone Tiraboschi
QA Contact:	Nikolai Sednev
Docs Contact:
URL:
Whiteboard:
Depends On:	1455169 1503799
Blocks:	RHV_DR RHV_HE_on_iscsi_multipath 1534978

Reported:	2015-02-18 15:55 UTC by Elad
Modified:	2020-08-20 20:35 UTC
CC List:	25 users
Fixed In Version:
Clone Of:
Clones:	RHV_HE_on_iscsi_multipath (view as bug list)
Environment:
Last Closed:	2018-08-09 10:48:54 UTC
oVirt Team:	Integration
Embargoed:
Dependent Products:
Flags:	rule-engine: ovirt-4.2+ sherold: Triaged+ rule-engine: exception+ nsednev: testing_plan_complete+ ylavi: planning_ack+ rule-engine: devel_ack+ mavital: testing_ack+

Links
System	ID	Priority	Status	Summary	Last Updated
Red Hat Bugzilla	1149579	high	CLOSED	[RFE][hosted-engine-setup] [iSCSI support] allow selecting more than one iSCSI target	2021-02-22 00:41:40 UTC
Red Hat Bugzilla	1387085	high	CLOSED	Hosted-Engine iSCSI target logged in twice on activated Host (to be solved via new HE installation flow)	2021-06-10 11:42:15 UTC
Red Hat Bugzilla	1613763	high	CLOSED	[BLOCKED][RFE] - Allow ISCSI bonding in hosted engine setup	2022-04-20 16:08:39 UTC
Red Hat Knowledge Base (Solution)	2138951	None	None	None	2016-01-25 21:51:58 UTC
oVirt gerrit	82352	master	MERGED	iscsi: multipath: support multiple iSCSI interfaces	2020-12-18 03:11:03 UTC
oVirt gerrit	92782	master	MERGED	conf: support optional values in conf files	2020-12-18 03:11:03 UTC
oVirt gerrit	92835	master	MERGED	better explain g/set-shared-config option	2020-12-18 03:11:03 UTC
oVirt gerrit	92836	master	MERGED	conf: skip commented lines	2020-12-18 03:11:03 UTC
oVirt gerrit	93075	ovirt-hosted-engine-setup-2.2	MERGED	better explain g/set-shared-config option	2020-12-18 03:11:33 UTC
oVirt gerrit	93076	v2.2.z	MERGED	conf: support optional values in conf files	2020-12-18 03:11:33 UTC
oVirt gerrit	93077	v2.2.z	MERGED	conf: skip commented lines	2020-12-18 03:11:34 UTC
oVirt gerrit	93078	v2.2.z	MERGED	iscsi: multipath: support multiple iSCSI interfaces	2020-12-18 03:11:04 UTC

Internal Links: 1149579 1387085 1613763

Description Elad 2015-02-18 15:55:10 UTC

Description of problem:

Currently, there is no option to deploy hosted engine using several iSCSI initiators.
It should be possible to create multiple iSCSI storage connections to the storage server using several NICs on the host.
iSCSI multipath has been supported since RHEV-3.4 for a regular RHEV environment.

Comment 1 Sandro Bonazzola 2015-02-26 08:11:08 UTC

Simone, this may be already working on master, can you take a look?

Comment 2 Simone Tiraboschi 2015-03-23 16:21:45 UTC

It already works if you manually configure multipathing before launching hosted-engine --deploy

Example with 2 NICs on the host, 2 NICs on the iSCSI host, and a single portal on each interface:

On the host:

 # iscsiadm -m iface -I eth0 --op=new
 # iscsiadm -m iface -I eth1 --op=new
 # iscsiadm -m discovery -t st -p 192.168.1.125:3260
 # iscsiadm -m discovery -t st -p 192.168.2.125:3260
 # iscsiadm --mode node --portal 192.168.1.125:3260,1 --login
 # iscsiadm --mode node --portal 192.168.2.125:3260,1 --login
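
The same sequence can be generated per NIC. The following is a minimal Python sketch that only builds the argument lists and does not execute anything. Two things in it are assumptions that are not part of the commands above: the `--op=update -n iface.net_ifacename` step and the `-I` option on discovery and login, both taken from common open-iscsi usage for binding an iface record to a specific NIC. `build_iscsi_commands` is a hypothetical helper name.

```python
def build_iscsi_commands(nic, portal):
    """Return the iscsiadm argument lists for one NIC/portal pair.

    Nothing is executed here; the caller decides whether to run them.
    """
    return [
        # Create an iface record named after the NIC (as in the comment above).
        ["iscsiadm", "-m", "iface", "-I", nic, "--op=new"],
        # ASSUMPTION: bind the iface record to the NIC explicitly.
        ["iscsiadm", "-m", "iface", "-I", nic, "--op=update",
         "-n", "iface.net_ifacename", "-v", nic],
        # Discover targets on the portal (ASSUMPTION: through this iface).
        ["iscsiadm", "-m", "discovery", "-t", "st", "-p", portal, "-I", nic],
        # Log in to the discovered node records for this portal and iface.
        ["iscsiadm", "--mode", "node", "--portal", portal, "-I", nic, "--login"],
    ]


# NIC/portal pairs from the example above.
PAIRS = [("eth0", "192.168.1.125:3260"), ("eth1", "192.168.2.125:3260")]

all_commands = [cmd for nic, portal in PAIRS
                for cmd in build_iscsi_commands(nic, portal)]

for cmd in all_commands:
    print(" ".join(cmd))
```

Keeping the commands as data makes it easy to review them before running them as root on the host.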


Then hosted-engine --deploy reports:

          Please specify the storage you would like to use (glusterfs, iscsi, fc, nfs3, nfs4)[nfs3]: iscsi
          Please specify the iSCSI portal IP address: 192.168.1.125
          Please specify the iSCSI portal port [3260]: 
          Please specify the iSCSI portal user: 
          Please specify the target name (iqn.2015-03.com.redhat:simone1, iqn.2015-03.com.redhat:simone1) [iqn.2015-03.com.redhat:simone1]: 
          The following luns have been found on the requested target:
                [1]     33000000017538b4b       56GiB   FreeBSD iSCSI Disk
                        status: used, paths: 4 active
         
                [2]     33000000031f26ca3       24GiB   FreeBSD iSCSI Disk
                        status: used, paths: 4 active
         
                [3]     33000000022a29f57       16GiB   FreeBSD iSCSI Disk
                        status: free, paths: 4 active
         
                [4]     330000000d0c91c54       1GiB    FreeBSD iSCSI Disk
                        status: free, paths: 4 active
         
                [5]     330000000c399efa0       1GiB    FreeBSD iSCSI Disk
                        status: free, paths: 4 active
         
                [6]     330000000e5380848       1GiB    FreeBSD iSCSI Disk
                        status: free, paths: 4 active
         
          Please select the destination LUN (1, 2, 3, 4, 5, 6) [1]: 


Is it enough?

Comment 3 Yaniv Lavi 2015-04-07 13:32:08 UTC

*** Bug 1149579 has been marked as a duplicate of this bug. ***

Comment 4 Sandro Bonazzola 2015-04-30 13:54:02 UTC

Need an answer to comment #2 before acknowledging this.

Comment 5 Nir Soffer 2015-09-14 14:07:44 UTC

(In reply to Simone Tiraboschi from comment #2)
> It already works if you manually configure multipathing before launching
> hosted-engine --deploy

Why do you have to configure this manually? This configuration is done by the engine, and probably needs to be part of the hosted engine installation.

Please check how the engine configures multipathing and do the same in hosted engine setup.

Note that the engine VM may need this configuration later, so you may prefer
to bootstrap the system without multipathing, and configure multipathing from
the engine later.

Multipathing configuration and persistence on both host and engine is handled
by the networking team; please consult them about this change.

Comment 6 Sandro Bonazzola 2015-10-05 12:18:34 UTC

*** Bug 1267807 has been marked as a duplicate of this bug. ***

Comment 10 Sandro Bonazzola 2016-08-17 09:41:07 UTC

Yaniv, can you please review this bug?

Comment 11 Yaniv Lavi 2016-08-21 11:29:46 UTC

(In reply to Sandro Bonazzola from comment #10)
> Yaniv, can you please review this bug?

We should look into allowing this to be configured post-setup via the iSCSI bonding feature of the storage domains. Can you please look into this with the storage and SLA teams?

Comment 12 Sandro Bonazzola 2016-08-23 13:26:17 UTC

I'll sync with them, thanks.

Comment 18 Red Hat Bugzilla Rules Engine 2017-01-21 08:56:08 UTC

This request has been proposed for two releases. This is invalid flag usage. The ovirt-future release flag has been cleared. If you wish to change the release flag, you must clear one release flag and then set the other release flag to ?.

Comment 22 Elad 2017-07-10 13:25:11 UTC

I'm not sure how this bug is ON_QA. 

The intention was to have the hosted_storage SD being created over multiple iSCSI initiators during HE deployment.

With the current state, the iSCSI bond can't exclude ovirtmgmt network from being part of it even though it was excluded by the user. 
So, hosted_storage SD will remain connected through ovirtmgmt interface and the whole concept of network separation where MGMT and data networks are separated cannot be achieved.

Simone, Yaniv, considering the above, please add your input.
Thanks

Comment 23 Simone Tiraboschi 2017-07-24 09:49:03 UTC

(In reply to Elad from comment #22)
> I'm not sure how this bug is ON_QA. 
> 
> The intention was to have the hosted_storage SD being created over multiple
> iSCSI initiators during HE deployment.

Here the idea is to use a single initiator at hosted-engine deployment time and let the user configure the others via iSCSI bonding feature from the engine when it's running.

> With the current state, the iSCSI bond can't exclude ovirtmgmt network from
> being part of it even though it was excluded by the user.

Sorry, why?

Comment 24 Elad 2017-07-24 13:53:09 UTC

Because in practice, when an iSCSI bond is created that doesn't include the ovirtmgmt network while ovirtmgmt is being used for the connection to the iSCSI target that exposes the hosted_storage LUN, the connection to the storage via ovirtmgmt remains open.

Comment 25 Simone Tiraboschi 2017-07-25 10:06:33 UTC

(In reply to Elad from comment #24)
> Because in practice, upon iSCSI bond creation, a bond that doesn't include
> the ovirtmgmt network, while the ovirtmgmt network is being used for the
> connection to the iSCSI target that exposes the hosted_storage LUN, the
> connection to the storage via ovirtmgmt remains open.

Nothing is enforcing that the iSCSI target is in the same subnet used for the management network.
If the storage subnet and the management one are distinct, the host will simply use a NIC in storage subnet to reach the iSCSI target according to its routing table.

At that point you should be able to create two storage logical networks in the engine and configure the iSCSI bonding over it without the need to involve the management network.

Comment 26 Nikolai Sednev 2017-07-25 10:50:43 UTC

(In reply to Simone Tiraboschi from comment #25)
> (In reply to Elad from comment #24)
> > Because in practice, upon iSCSI bond creation, a bond that doesn't include
> > the ovirtmgmt network, while the ovirtmgmt network is being used for the
> > connection to the iSCSI target that exposes the hosted_storage LUN, the
> > connection to the storage via ovirtmgmt remains open.
> 
> Nothing is enforcing that the iSCSI target is in the same subnet used for
> the management network.
> If the storage subnet and the management one are distinct, the host will
> simply use a NIC in storage subnet to reach the iSCSI target according to
> its routing table.
> 
> At that point you should be able to create two storage logical networks in
> the engine and configure the iSCSI bonding over it without the need to
> involve the management network.

The initial SHE deployment does enforce that, because the management network is used as the iSCSI storage network during the deployment. You can't exclude the management network while it's already being used for iSCSI storage on the host on which the HE-VM is deployed and running.

To move the HE-VM's iSCSI traffic off the management network you can create multiple storage-dedicated networks assigned to different NICs, but this won't remove the already running HE-VM iSCSI TCP session from the active management network at the host level.

Even if you power off the engine, that won't affect the active iSCSI TCP storage connection running over the management network from the host. To keep the HE-VM connected to the iSCSI target through the dedicated storage networks at all times, you would have to forcibly disconnect the active iSCSI session running over the management network and then block any further connections from the management network to the iSCSI HE LUN at the host level.

This would also have to be enforced on all other ha-hosts.

Comment 28 Red Hat Bugzilla Rules Engine 2017-07-30 15:01:32 UTC

Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 33 Simone Tiraboschi 2017-07-31 15:34:31 UTC

Ok, I think I got the issue analyzing Nikolai's env where he created an iSCSI bond using interfaces bond0 and enp5s0f1.

The engine is sending:
2017-07-31 18:02:40,176+0300 INFO  (jsonrpc/2) [vdsm.api] START connectStorageServer(domType=3, spUUID=u'00000001-0001-0001-0001-000000000311', conList=[{u'netIfaceName': u'bond0', u'id': u'51eb2856-28f0-455d-93a4-14b571e34352', u'connection': u'10.35.160.161', u'iqn': u'iqn.2005-10.org.freenas.ctl:iscsi-data-sd-lun1-target', u'user': u'', u'tpgt': u'1', u'ifaceName': u'bond0', u'password': '********', u'port': u'3260'}, {u'netIfaceName': u'bond0', u'id': u'343a7be5-73c0-4b65-9949-d918c6408a5d', u'connection': u'10.35.163.48', u'iqn': u'iqn.2005-10.org.freenas.ctl:iscsi-he-deployment-disk', u'user': u'', u'tpgt': u'1', u'ifaceName': u'bond0', u'password': '********', u'port': u'3260'}, {u'netIfaceName': u'enp5s0f1', u'id': u'ac64d6af-6ad7-4575-83c8-38e95eec9581', u'connection': u'10.35.160.161', u'iqn': u'iqn.2005-10.org.freenas.ctl:iscsi-data-sd-lun1-target', u'user': u'', u'tpgt': u'1', u'ifaceName': u'enp5s0f1', u'password': '********', u'port': u'3260'}, {u'netIfaceName': u'enp5s0f1', u'id': u'3f98d3fc-cc14-4571-ab8c-cfb7abe44917', u'connection': u'10.35.163.48', u'iqn': u'iqn.2005-10.org.freenas.ctl:iscsi-he-deployment-disk', u'user': u'', u'tpgt': u'1', u'ifaceName': u'enp5s0f1', u'password': '********', u'port': u'3260'}], options=None) from=::ffff:10.35.72.51,50890, flow_id=18c7a406, task_id=c3dee31e-a7c9-48b6-a8ba-0da205bc433b (api:46)


While ovirt-ha-agent simply sends:
2017-07-31 18:11:15,761+0300 INFO  (jsonrpc/4) [vdsm.api] START connectStorageServer(domType=3, spUUID=u'00000000-0000-0000-0000-000000000000', conList=[{u'id': u'29d4ea67-c692-4b59-8e0b-9e5b16d13c84', u'connection': u'10.35.163.48', u'iqn': u'iqn.2005-10.org.freenas.ctl:iscsi-he-deployment-disk', u'portal': u'1', u'user': u'', u'password': '********', u'port': u'3260'}], options=None) from=::1,57874, task_id=942f6534-a8d8-4efa-a19d-c6ebfb6abb98 (api:46)

Which leads to:
2017-07-31 18:11:15,804+0300 INFO  (jsonrpc/4) [storage.ISCSI] iSCSI iface.net_ifacename not provided. Skipping. (iscsi:590)


So the difference is that the engine is using, in its connectStorageServer, an undocumented (according to https://github.com/oVirt/vdsm/blob/master/lib/vdsm/api/vdsm-api.yml#L360 ) parameter called 'netIfaceName'.

This means that if we also want ovirt-ha-agent to consume the iSCSI bonding configuration when the engine VM is down (for any reason), ovirt-ha-agent has to know the iSCSI bonding configuration, and since the engine VM is down (and the shared storage is not yet accessible), that configuration has to be saved locally on the host.
But since the user could edit the iSCSI bonding configuration from the engine, every such edit would also have to update the local configuration on the host for ovirt-ha-agent, and so on...
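
To make the difference between the two requests concrete, here is a small Python sketch that compares the keys of the engine's connection entry for the HE target with the entry sent by ovirt-ha-agent. The values are copied from the two log lines above; `key_diff` is a hypothetical helper written only for this illustration, not part of vdsm or ovirt-hosted-engine-ha.

```python
# Entry the engine sent for the HE target over bond0 (from the first log line).
engine_entry = {
    "netIfaceName": "bond0",
    "id": "343a7be5-73c0-4b65-9949-d918c6408a5d",
    "connection": "10.35.163.48",
    "iqn": "iqn.2005-10.org.freenas.ctl:iscsi-he-deployment-disk",
    "user": "",
    "tpgt": "1",
    "ifaceName": "bond0",
    "password": "********",
    "port": "3260",
}

# Entry ovirt-ha-agent sent for the same target (from the second log line).
agent_entry = {
    "id": "29d4ea67-c692-4b59-8e0b-9e5b16d13c84",
    "connection": "10.35.163.48",
    "iqn": "iqn.2005-10.org.freenas.ctl:iscsi-he-deployment-disk",
    "portal": "1",
    "user": "",
    "password": "********",
    "port": "3260",
}


def key_diff(a, b):
    """Return (keys only in a, keys only in b), each as a sorted list."""
    return sorted(set(a) - set(b)), sorted(set(b) - set(a))


engine_only, agent_only = key_diff(engine_entry, agent_entry)
print("only in engine request:  ", engine_only)  # ['ifaceName', 'netIfaceName', 'tpgt']
print("only in ha-agent request:", agent_only)   # ['portal']
```

The agent's request carries no interface information at all, which matches the "iface.net_ifacename not provided" message in the vdsm log.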

Comment 34 Yaniv Lavi 2017-08-10 11:32:35 UTC

Please mind this RFE in the redesign you are doing for the HE installation and mount options.

Comment 35 Nir Soffer 2018-01-21 20:08:42 UTC

Why don't we configure all this via engine as part of the new installation flow?

1. Start bootstrap engine
2. Add the first host
3. Configure iscsi bonding for the he storage domain
   (up to this point, there should be no difference from standard engine flow)
4. Make the configuration available for hosted engine
5. Move the bootstrap vm to shared storage

When the hosted engine agent connects to storage after boot, it should use the same configuration
and vdsm apis to connect to storage and bring up the engine vm.

If the user modifies the hosted engine storage domain configuration, the new configuration
should be available to hosted engine agents.

Simone, can you explain what is missing?

Comment 36 Nir Soffer 2018-01-21 20:28:51 UTC

(In reply to Simone Tiraboschi from comment #33)
> So the difference is that the engine is using, in its connectStorageServer,
> an undocumented (according to
> https://github.com/oVirt/vdsm/blob/master/lib/vdsm/api/vdsm-api.yml#L360 )
> parameter called 'netIfaceName'.

Yes, this parameter is not documented, this is a bug. Can you file a bug to
document it?

> This means that if we want to have also ovirt-ha-agent consuming the iSCSI
> bonding configuration when the engine VM is down (for any reason),
> ovirt-ha-agent should know the iSCSI bonding configuration and since the
> engine VM is down (and the shared storage still not accessible) it should be
> saved locally on the host.

The configuration should be saved in shared storage just like any other
configuration needed to bootstrap the system, for example, mount options.

Comment 37 Simone Tiraboschi 2018-02-21 18:00:50 UTC

(In reply to Nir Soffer from comment #36)
> > This means that if we want to have also ovirt-ha-agent consuming the iSCSI
> > bonding configuration when the engine VM is down (for any reason),
> > ovirt-ha-agent should know the iSCSI bonding configuration and since the
> > engine VM is down (and the shared storage still not accessible) it should be
> > saved locally on the host.
> 
> The configuration should be save in shared storage just like any other
> configuration need to bootstrap the system, for example, mount options.

The issue is here: we need that info to connect to the shared storage, so it cannot be saved on the shared storage; it has to be replicated locally on each host and updated on all the hosts each time the user edits iSCSI bonds on the engine side.

We could also just send all the permutations but, since vdsm will try all of them sequentially, if some paths are implicitly invalid we will certainly pay N times the connection timeout.

E.g:

host with 3 nics on 3 distinct /24 IPv4 subnets and 3 iSCSI initiators:
192.168.1.100
192.168.2.100
192.168.3.100

SAN with 3 iSCSI portals on the same 3 subnets:
192.168.1.200
192.168.2.200
192.168.3.200

3 paths are valid:
192.168.1.100 -> 192.168.1.200, 192.168.2.100 -> 192.168.2.200, 192.168.3.100 -> 192.168.3.200
the other 6 are not.

If we don't keep the initiator/iSCSI bond mapping updated on all the hosts we are going to pay, in this example, 6 times the iSCSI connect timeout.
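
The arithmetic in the example above can be checked with a short Python sketch. It assumes, as the example does, that a path is usable only when initiator and portal share a /24 subnet (that is, no routing between the storage subnets); `same_subnet` is a hypothetical helper, and no real vdsm timeout value is used.

```python
import ipaddress
from itertools import product

# Addresses from the example above.
initiators = ["192.168.1.100", "192.168.2.100", "192.168.3.100"]
portals = ["192.168.1.200", "192.168.2.200", "192.168.3.200"]


def same_subnet(a, b, prefix=24):
    """True if address b lies in the /prefix network that contains a."""
    net = ipaddress.ip_network(f"{a}/{prefix}", strict=False)
    return ipaddress.ip_address(b) in net


# Every initiator/portal permutation that would be sent without a mapping.
pairs = list(product(initiators, portals))

# ASSUMPTION: only same-subnet pairs can connect; the rest would time out.
valid = [(i, p) for i, p in pairs if same_subnet(i, p)]
invalid = [(i, p) for i, p in pairs if not same_subnet(i, p)]

print(f"{len(pairs)} permutations, {len(valid)} valid, {len(invalid)} invalid")
for initiator, portal in valid:
    print(f"{initiator} -> {portal}")
```

With N initiators and N portals on N disjoint subnets the number of invalid pairs grows as N*N - N, which is why trying every permutation sequentially becomes expensive.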

Comment 38 Sandro Bonazzola 2018-03-14 09:56:32 UTC

Re-targeting to 4.2.4 and reducing priority to high according to comment #37.
It's too risky to rush this patch in, given that it will cause considerable change in the bootstrap process and we still lack a design for keeping the distributed copy of the multipath mappings up to date.

Comment 39 Sandro Bonazzola 2018-07-16 13:58:12 UTC

Re-targeting to 4.2.6, since the next build is for blockers only and this is not considered a blocker for 4.2.5.

Comment 40 Sandro Bonazzola 2018-07-17 06:29:28 UTC

Change is already in.

Comment 41 Nikolai Sednev 2018-07-31 16:00:29 UTC

I've tried to separate iSCSI traffic over iSCSI bond and got all interfaces participating in iSCSI sessions. Instead of taking all iSCSI traffic over enp3s0f1, traffic was distributed over default interface enp3s0f0 and enp3s0f1 in this pattern:

alma04 ~]# iscsiadm -m session -P1
Target: iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c01 (non-flash)
        Current Portal: 10.35.146.161:3260,1
        Persistent Portal: 10.35.146.161:3260,1
                **********
                Interface:
                **********
                Iface Name: default
                Iface Transport: tcp
                Iface Initiatorname: iqn.1994-05.com.redhat:20179eabd3d
                Iface IPaddress: 10.35.92.4
                Iface HWaddress: <empty>
                Iface Netdev: <empty>
                SID: 1
                iSCSI Connection State: LOGGED IN
                iSCSI Session State: LOGGED_IN
                Internal iscsid Session State: NO CHANGE
Target: iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c00 (non-flash)
        Current Portal: 10.35.146.129:3260,1
        Persistent Portal: 10.35.146.129:3260,1
                **********
                Interface:
                **********
                Iface Name: default
                Iface Transport: tcp
                Iface Initiatorname: iqn.1994-05.com.redhat:20179eabd3d
                Iface IPaddress: 10.35.92.4
                Iface HWaddress: <empty>
                Iface Netdev: <empty>
                SID: 2
                iSCSI Connection State: LOGGED IN
                iSCSI Session State: LOGGED_IN
                Internal iscsid Session State: NO CHANGE
Target: iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c04 (non-flash)
        Current Portal: 10.35.146.193:3260,1
        Persistent Portal: 10.35.146.193:3260,1
                **********
                Interface:
                **********
                Iface Name: default
                Iface Transport: tcp
                Iface Initiatorname: iqn.1994-05.com.redhat:20179eabd3d
                Iface IPaddress: 10.35.92.4
                Iface HWaddress: <empty>
                Iface Netdev: <empty>
                SID: 3
                iSCSI Connection State: LOGGED IN
                iSCSI Session State: LOGGED_IN
                Internal iscsid Session State: NO CHANGE
Target: iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c05 (non-flash)
        Current Portal: 10.35.146.225:3260,1
        Persistent Portal: 10.35.146.225:3260,1
                **********
                Interface:
                **********
                Iface Name: enp3s0f1
                Iface Transport: tcp
                Iface Initiatorname: iqn.1994-05.com.redhat:20179eabd3d
                Iface IPaddress: 10.35.71.93
                Iface HWaddress: <empty>
                Iface Netdev: enp3s0f1
                SID: 4
                iSCSI Connection State: LOGGED IN
                iSCSI Session State: LOGGED_IN
                Internal iscsid Session State: NO CHANGE

I can't verify this bug, because it's still impossible to keep iSCSI traffic from flowing over the management interface.
Tested on these components on hosts:
rhvm-appliance-4.2-20180727.1.el7.noarch, 
Engine setup within the rhvm-appliance-4.2-20180727.1.el7.noarch is ovirt-engine-setup-4.2.5.2-0.1.el7ev.noarch.
ovirt-hosted-engine-ha-2.2.16-1.el7ev.noarch
ovirt-hosted-engine-setup-2.2.24-1.el7ev.noarch
rhvm-appliance-4.2-20180727.1.el7.noarch
ovirt-vmconsole-1.0.4-1.el7ev.noarch
ovirt-vmconsole-host-1.0.4-1.el7ev.noarch
Linux 3.10.0-862.el7.x86_64 #1 SMP Wed Mar 21 18:14:51 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.5 (Maipo)

-------------
Summary:
-------------
1. At present, there is no way to completely separate HE's storage traffic between the default management network and the dedicated iSCSI bonded storage networks.

2. There is no way to detach the default management network interface from the HE storage target without losing the HE-VM on hosts.

3. There is no way to choose, during HE's deployment stage, which interfaces should be used for MPIO (aka iSCSI bond).

4. Once separate storage-dedicated networks have been created for the HE storage domain's iSCSI target, there is no way to predict whether HE's storage traffic will be forwarded over the dedicated iSCSI network or over the default management network.

Because this RFE does not cover the customer requirements, I'm moving it back to assigned.
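
The session listing above can also be checked mechanically. The following Python sketch groups sessions by iface name from `iscsiadm -m session -P1` style output; the sample text is abbreviated from the listing in this comment, and `sessions_by_iface` is a hypothetical helper that assumes the field labels appear exactly as shown there.

```python
from collections import defaultdict

# Abbreviated from the listing above: one "Target:" block per session.
SAMPLE = """\
Target: iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c01 (non-flash)
        Current Portal: 10.35.146.161:3260,1
                Iface Name: default
                Iface Netdev: <empty>
Target: iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c00 (non-flash)
        Current Portal: 10.35.146.129:3260,1
                Iface Name: default
                Iface Netdev: <empty>
Target: iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c04 (non-flash)
        Current Portal: 10.35.146.193:3260,1
                Iface Name: default
                Iface Netdev: <empty>
Target: iqn.2008-05.com.xtremio:xio00153500071-514f0c50023f6c05 (non-flash)
        Current Portal: 10.35.146.225:3260,1
                Iface Name: enp3s0f1
                Iface Netdev: enp3s0f1
"""

WANTED = ("Current Portal", "Iface Name", "Iface Netdev")


def sessions_by_iface(text):
    """Map each iface name to the list of portals its sessions use."""
    sessions = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Target:"):
            # A new "Target:" line starts a new session record.
            current = {"target": line.split()[1]}
            sessions.append(current)
        elif current is not None and ":" in line:
            key, _, value = line.partition(":")
            if key in WANTED:
                current[key] = value.strip()
    grouped = defaultdict(list)
    for session in sessions:
        iface = session.get("Iface Name", "unknown")
        grouped[iface].append(session.get("Current Portal", "unknown"))
    return dict(grouped)


by_iface = sessions_by_iface(SAMPLE)
for iface, iface_portals in by_iface.items():
    print(f"{iface}: {len(iface_portals)} session(s) -> {iface_portals}")
```

For this sample, three of the four sessions are on the `default` iface, which is not bound to any NIC, and only one is on `enp3s0f1`.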

Comment 42 Simone Tiraboschi 2018-08-09 10:48:54 UTC

Let's close this in its current state and add another patch to automatically configure iSCSI bonding from setup, as per bug 1613763.

Comment 43 Ai90iV 2020-08-20 20:35:37 UTC

Dear All,

Is there any progress on adding iSCSI bonding from setup?
