Bug 1474904 - iSCSI Multipath IO issues: vdsm tries to connect to unreachable paths. [NEEDINFO]
Status: CLOSED NOTABUG
Product: vdsm
Classification: oVirt
Component: Services
Version: 4.19.20
Hardware: x86_64
OS: All
Priority: unspecified
Severity: high (1 vote)
Target Milestone: ovirt-4.1.6
Target Release: ---
Assigned To: Maor
QA Contact: Raz Tamir
Reported: 2017-07-25 11:24 EDT by Vinícius Ferrão
Modified: 2017-09-03 09:04 EDT
CC: 9 users

Doc Type: If docs needed, set a value
Last Closed: 2017-09-03 09:04:44 EDT
Type: Bug
oVirt Team: Storage
Flags: tnisan: needinfo? (vinicius)
       rule-engine: ovirt-4.1+


Attachments: None
Description Vinícius Ferrão 2017-07-25 11:24:24 EDT
Description of problem:
Hello,

iSCSI Multipath simply does not work on oVirt/RHV. As observed by users on the oVirt mailing list: "the OVirt implementation of iSCSI-Bonding assumes that all network interfaces in the bond can connect/reach all targets, including those in the other net(s). The fact that you use separate, isolated networks means that this is not the case in your setup (and not in mine). I am not sure if this is a bug, a design flaw or a feature, but as a result of this OVirt's iSCSI-Bonding does not work".

Since the log files are too big, I've uploaded them to a web server; here's the link: http://www.if.ufrj.br/~ferrao/ovirt

Version-Release number of selected component (if applicable):
[root@ovirt3 ~]# imgbase w
2017-07-25 12:18:09,402 [INFO] You are on rhvh-4.1-0.20170706.0+1

[root@ovirt3 ~]# rpm -qa | grep -i vdsm
vdsm-xmlrpc-4.19.20-1.el7ev.noarch
vdsm-hook-vmfex-dev-4.19.20-1.el7ev.noarch
vdsm-client-4.19.20-1.el7ev.noarch
vdsm-hook-openstacknet-4.19.20-1.el7ev.noarch
vdsm-yajsonrpc-4.19.20-1.el7ev.noarch
vdsm-python-4.19.20-1.el7ev.noarch
vdsm-cli-4.19.20-1.el7ev.noarch
vdsm-hook-vhostmd-4.19.20-1.el7ev.noarch
vdsm-4.19.20-1.el7ev.x86_64
vdsm-gluster-4.19.20-1.el7ev.noarch
vdsm-hook-fcoe-4.19.20-1.el7ev.noarch
vdsm-jsonrpc-4.19.20-1.el7ev.noarch
vdsm-api-4.19.20-1.el7ev.noarch
vdsm-hook-ethtool-options-4.19.20-1.el7ev.noarch

How reproducible:
100%

Steps to Reproduce:
1. Installed oVirt Node 4.1.3 with the following network settings:

eno1 and eno2 on a 802.3ad (LACP) Bond, creating a bond0 interface.
eno3 with 9216 MTU.
eno4 with 9216 MTU.
vlan11 on eno3 with 9216 MTU and fixed IP addresses.
vlan12 on eno4 with 9216 MTU and fixed IP addresses.

eno3 and eno4 are my iSCSI MPIO interfaces, completely segregated, on different switches.
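
For reference, a rough sketch of the ifcfg files behind one of the iSCSI legs (illustrative only; exact contents depend on how the node writes its network configuration, and eno4/VLAN 12 mirrors this with a 192.168.12.x address):

    # /etc/sysconfig/network-scripts/ifcfg-eno3  (parent NIC, jumbo frames)
    DEVICE=eno3
    TYPE=Ethernet
    BOOTPROTO=none
    ONBOOT=yes
    MTU=9216

    # /etc/sysconfig/network-scripts/ifcfg-eno3.11  (VLAN 11, first iSCSI path)
    DEVICE=eno3.11
    VLAN=yes
    BOOTPROTO=none
    ONBOOT=yes
    MTU=9216
    IPADDR=192.168.11.3
    PREFIX=28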

2. Started the Self-hosted Engine installation (after three hours of waiting, due to https://bugzilla.redhat.com/show_bug.cgi?id=1454536).

3. Selected iSCSI as default interface for Hosted Engine. Everything was fine.

4. On the Hosted Engine I’ve done the following:

a. System > Data Centers > Default > Networks
. Created iSCSI1 with VLAN 11 and MTU 9216, removed VM Network option.
. Created iSCSI2 with VLAN 12 and MTU 9216, removed VM Network option.

b. System > Data Centers > Default > Clusters > Default > Hosts > ovirt3.cc.if.ufrj.br (my machine)

Selected Setup Host Networks and moved iSCSI1 to eno3 and iSCSI2 to eno4. Both icons went green, indicating an “up” state.

c. System > Data Centers > Default > Clusters

Selected Logical Networks and then Manage Network. Removed the Required checkbox from both iSCSI connections.

d. System > Data Centers > Default > Storage

Added an iSCSI share with two initiators. Both show up correctly.

e. System > Data Centers

Now the iSCSI Multipath tab is visible. Selected it and added an iSCSI Bond:
. iSCSI1 and iSCSI2 selected on Logical Networks.
. Two IQNs selected on Storage Targets.

5. oVirt just goes down. VDSM goes haywire and everything “crashes”. iSCSI itself is still alive, since we can still talk to the Self-Hosted Engine, but **NOTHING** works. If the iSCSI bond is removed, everything recovers to a usable state.

Actual results:
Broken iSCSI connectivity and everything going down on oVirt HE

Expected results:
MPIO on iSCSI paths.

Additional info:
I got a trial of RHV just to confirm that the bug exists in RHV too; that's why "imgbase w" shows RHV-H. For the record, my storage system is FreeNAS on commodity x86_64 hardware. It's tested, working as expected, and confirmed to work with MPIO in other hypervisor solutions.
Comment 1 Maor 2017-07-25 16:48:49 EDT
Hi Vinícius,

I first want to clarify the issue:
You have two interfaces, eno3.11 and eno4.12:
iface eno3.11 can only log in to 192.168.11.14
iface eno4.12 can only log in to 192.168.12.14
The host becomes non-operational since eno3.11 fails to log in to 192.168.12.14 and eno4.12 fails to log in to 192.168.11.14.
What you are suggesting is that oVirt should be able to group a network with a target and make those groups work with iSCSI multipath.
Is this correct?

Another question (if the above is indeed the case): what would happen if you configured 2 iSCSI bonds, each one with its own specific network and target?
Comment 2 Vinícius Ferrão 2017-07-27 00:58:23 EDT
That's exactly the topology. Just to be extremely precise, both networks are /28, so addresses range from 1 to 14.

Here's a drawing of the topology:

          +---------------+               
          |    FreeNAS    |
          +---------------+
             |         |
             |         | 10GbE Links
             |         |
+---------------+   +---------------+
| Nexus 3048 #1 |   | Nexus 3048 #2 |
+---------------+   +---------------+
             |         |
             |         | 1GbE Links
             |         |
          +---------------+
       5x |   oVirt/RHV   | 
          +---------------+

VLAN11: 192.168.11.0/28
VLAN12: 192.168.12.0/28
oVirt Servers from 1 to 5, Storage on 14.

So as you can see, this is a classic MPIO iSCSI topology: a plain layer-2 domain without any routing. VLAN11 only exists on the first Nexus and VLAN12 only exists on the second Nexus. 

Changing the configuration as you requested brings up the storage once again, but I don't know whether it's working with MPIO or not. It does not make sense to configure it this way. At this moment I have two iSCSI Multipath entries. Here are the screenshots:

http://www.if.ufrj.br/~ferrao/ovirt/separate-mpio0.png
http://www.if.ufrj.br/~ferrao/ovirt/separate-mpio1.png
http://www.if.ufrj.br/~ferrao/ovirt/separate-mpio2.png

multipath -ll reports two paths but only one active:
[root@ovirt3 ~]# multipath -ll
36589cfc0000003400be2813b01be08c3 dm-24 FreeNAS ,iSCSI Disk      
size=10T features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 9:0:0:0  sdc 8:32 active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 10:0:0:0 sdd 8:48 active ready running
36589cfc0000007e77a9df827b65176f2 dm-12 FreeNAS ,iSCSI Disk      
size=200G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  `- 8:0:0:0  sdb 8:16 active ready running

If more information is needed, please let me know.
Comment 3 Maor 2017-07-27 02:59:37 EDT
(In reply to Vinícius Ferrão from comment #2)
> That's the exactly topology, just to be extremely precise, both networks are
> /28 so it's addressable from 1 to 14.
> 
> Here's a drawing of the topology:
> 
>           +---------------+               
>           |    FreeNAS    |
>           +---------------+
>              |         |
>              |         | 10GbE Links
>              |         |
> +---------------+   +---------------+
> | Nexus 3048 #1 |   | Nexus 3048 #2 |
> +---------------+   +---------------+
>              |         |
>              |         | 1GbE Links
>              |         |
>           +---------------+
>        5x |   oVirt/RHV   | 
>           +---------------+
> 
> VLAN11: 192.168.11.0/28
> VLAN12: 192.168.12.0/28
> oVirt Servers from 1 to 5, Storage on 14.
> 
> So as you can see this is classic MPIO iSCSI topology. Plain layer2 domain
> without any routing. VLAN11 only exists on the first Nexus and VLAN12 only
> exists on the second Nexus. 
> 
> Changing the configuration as you requested brings up the storages once
> again. But I don't know if it's working with MPIO or not. It does not makes
> sense to configure this way. At this moment I have two iSCSI Multipaths.
> Here's the photos:
> 
> http://www.if.ufrj.br/~ferrao/ovirt/separate-mpio0.png
> http://www.if.ufrj.br/~ferrao/ovirt/separate-mpio1.png
> http://www.if.ufrj.br/~ferrao/ovirt/separate-mpio2.png

That doesn't mean you have two different iSCSI multipaths on your host.
This is only the oVirt configuration; eventually it is translated into iscsiadm login commands, and multipath on your host is still one process connected through its two NICs. With this configuration, though, you can achieve the isolation level you want.
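
To illustrate, the kind of iscsiadm calls that this boils down to look roughly like the following (a sketch only, not the exact commands VDSM runs; the iface names follow the netdev names, and the IQN/portals are the ones from this report):

    # one iscsiadm iface per host NIC/VLAN, bound to its netdev
    iscsiadm -m iface -I eno3.11 --op=new
    iscsiadm -m iface -I eno3.11 --op=update -n iface.net_ifacename -v eno3.11
    iscsiadm -m iface -I eno4.12 --op=new
    iscsiadm -m iface -I eno4.12 --op=update -n iface.net_ifacename -v eno4.12

    # each iface discovers and logs in only through the portal it can actually reach
    iscsiadm -m discovery -t sendtargets -p 192.168.11.14:3260 -I eno3.11
    iscsiadm -m node -T iqn.2017-07.br.ufrj.if.cc.storage.ctl:ovirt-mpio -p 192.168.11.14:3260 -I eno3.11 --login
    iscsiadm -m discovery -t sendtargets -p 192.168.12.14:3260 -I eno4.12
    iscsiadm -m node -T iqn.2017-07.br.ufrj.if.cc.storage.ctl:ovirt-mpio -p 192.168.12.14:3260 -I eno4.12 --login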

> 
> Multipath -ll report two paths but only one active:
> [root@ovirt3 ~]# multipath -ll
> 36589cfc0000003400be2813b01be08c3 dm-24 FreeNAS ,iSCSI Disk      
> size=10T features='0' hwhandler='0' wp=rw
> |-+- policy='service-time 0' prio=1 status=active
> | `- 9:0:0:0  sdc 8:32 active ready running
> `-+- policy='service-time 0' prio=1 status=enabled
>   `- 10:0:0:0 sdd 8:48 active ready running
> 36589cfc0000007e77a9df827b65176f2 dm-12 FreeNAS ,iSCSI Disk      
> size=200G features='0' hwhandler='0' wp=rw
> `-+- policy='service-time 0' prio=1 status=active
>   `- 8:0:0:0  sdb 8:16 active ready running
> 
> If more information is needed, please let me know.

Can you please also share the following output:
   iscsiadm -m session -P 3
Comment 4 Vinícius Ferrão 2017-07-27 03:28:26 EDT
Maor, here it is:

[root@ovirt3 ~]# iscsiadm -m session -P 3
iSCSI Transport Class version 2.0-870
version 6.2.0.873-35
Target: iqn.2017-07.br.ufrj.if.cc.storage.ctl:ovirt-he (non-flash)
	Current Portal: 192.168.12.14:3260,1
	Persistent Portal: 192.168.12.14:3260,1
		**********
		Interface:
		**********
		Iface Name: default
		Iface Transport: tcp
		Iface Initiatorname: iqn.1994-05.com.redhat:89c40b169bd9
		Iface IPaddress: 192.168.12.3
		Iface HWaddress: <empty>
		Iface Netdev: <empty>
		SID: 1
		iSCSI Connection State: LOGGED IN
		iSCSI Session State: LOGGED_IN
		Internal iscsid Session State: NO CHANGE
		*********
		Timeouts:
		*********
		Recovery Timeout: 5
		Target Reset Timeout: 30
		LUN Reset Timeout: 30
		Abort Timeout: 15
		*****
		CHAP:
		*****
		username: <empty>
		password: ********
		username_in: <empty>
		password_in: ********
		************************
		Negotiated iSCSI params:
		************************
		HeaderDigest: None
		DataDigest: None
		MaxRecvDataSegmentLength: 262144
		MaxXmitDataSegmentLength: 131072
		FirstBurstLength: 131072
		MaxBurstLength: 16776192
		ImmediateData: Yes
		InitialR2T: Yes
		MaxOutstandingR2T: 1
		************************
		Attached SCSI devices:
		************************
		Host Number: 8	State: running
		scsi8 Channel 00 Id 0 Lun: 0
			Attached scsi disk sdb		State: running
Target: iqn.2017-07.br.ufrj.if.cc.storage.ctl:ovirt-mpio (non-flash)
	Current Portal: 192.168.11.14:3260,1
	Persistent Portal: 192.168.11.14:3260,1
		**********
		Interface:
		**********
		Iface Name: eno3.11
		Iface Transport: tcp
		Iface Initiatorname: iqn.1994-05.com.redhat:89c40b169bd9
		Iface IPaddress: 192.168.11.3
		Iface HWaddress: <empty>
		Iface Netdev: eno3.11
		SID: 2
		iSCSI Connection State: LOGGED IN
		iSCSI Session State: LOGGED_IN
		Internal iscsid Session State: NO CHANGE
		*********
		Timeouts:
		*********
		Recovery Timeout: 5
		Target Reset Timeout: 30
		LUN Reset Timeout: 30
		Abort Timeout: 15
		*****
		CHAP:
		*****
		username: <empty>
		password: ********
		username_in: <empty>
		password_in: ********
		************************
		Negotiated iSCSI params:
		************************
		HeaderDigest: None
		DataDigest: None
		MaxRecvDataSegmentLength: 262144
		MaxXmitDataSegmentLength: 131072
		FirstBurstLength: 131072
		MaxBurstLength: 16776192
		ImmediateData: Yes
		InitialR2T: Yes
		MaxOutstandingR2T: 1
		************************
		Attached SCSI devices:
		************************
		Host Number: 9	State: running
		scsi9 Channel 00 Id 0 Lun: 0
			Attached scsi disk sdc		State: running
	Current Portal: 192.168.12.14:3260,1
	Persistent Portal: 192.168.12.14:3260,1
		**********
		Interface:
		**********
		Iface Name: eno4.12
		Iface Transport: tcp
		Iface Initiatorname: iqn.1994-05.com.redhat:89c40b169bd9
		Iface IPaddress: 192.168.12.3
		Iface HWaddress: <empty>
		Iface Netdev: eno4.12
		SID: 3
		iSCSI Connection State: LOGGED IN
		iSCSI Session State: LOGGED_IN
		Internal iscsid Session State: NO CHANGE
		*********
		Timeouts:
		*********
		Recovery Timeout: 5
		Target Reset Timeout: 30
		LUN Reset Timeout: 30
		Abort Timeout: 15
		*****
		CHAP:
		*****
		username: <empty>
		password: ********
		username_in: <empty>
		password_in: ********
		************************
		Negotiated iSCSI params:
		************************
		HeaderDigest: None
		DataDigest: None
		MaxRecvDataSegmentLength: 262144
		MaxXmitDataSegmentLength: 131072
		FirstBurstLength: 131072
		MaxBurstLength: 16776192
		ImmediateData: Yes
		InitialR2T: Yes
		MaxOutstandingR2T: 1
		************************
		Attached SCSI devices:
		************************
		Host Number: 10	State: running
		scsi10 Channel 00 Id 0 Lun: 0
			Attached scsi disk sdd		State: running
Comment 5 Maor 2017-07-27 05:16:23 EDT
That seems to be ok,
Target iqn.2017-07.br.ufrj.if.cc.storage.ctl:ovirt-mpio is connected through the following interfaces:
  iface eno4.12 with portal 192.168.12.14:3260,1 
  iface eno3.11 with portal 192.168.11.14:3260,1

Regarding what you asked before about two paths with only one active: that seems to be normal iSCSI multipath behavior.
'active' means that the path group is currently receiving I/O requests, and 'enabled' means that the path group will be tried if the active path group has no paths in the ready state.

Can you please test it and check whether that answers your requirement?
Comment 6 Vinícius Ferrão 2017-07-27 15:40:38 EDT
Maor, this works. I'm not sure whether multipath is in fact load balancing and whether it will fail over in case of a link failure. I should pull the cable to see if a VM keeps running. I can do this in the following days, since I don't have physical access to the datacenter at the moment.
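
Thinking about it, both points can probably be checked from the host itself, without pulling cables, with something like the following (a sketch only; sdc/sdd and the WWID are the ones from the multipath -ll output in this bug):

    # in one terminal: watch which of the two paths actually carries I/O
    iostat -xd 1 sdc sdd

    # in another terminal: read from the multipath device
    dd if=/dev/mapper/36589cfc0000003400be2813b01be08c3 of=/dev/null bs=1M count=4096 iflag=direct

    # simulate a link failure on one leg and confirm I/O continues on the other
    ip link set eno3 down
    multipath -ll        # the sdc path should now show as failed/faulty
    ip link set eno3 up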

But there's a problem. What's the point of the iSCSI Multipath tab on the hosted engine? If I remove everything from this tab, the output of "multipath -ll" and "iscsiadm -m session -P 3" is exactly the same.

So I really don't get it. What am I missing? Why does the iSCSI Multipath tab exist if the results are the same?

[root@ovirt3 ~]# iscsiadm -m session
tcp: [1] 192.168.12.14:3260,1 iqn.2017-07.br.ufrj.if.cc.storage.ctl:ovirt-he (non-flash)
tcp: [2] 192.168.12.14:3260,1 iqn.2017-07.br.ufrj.if.cc.storage.ctl:ovirt-mpio (non-flash)
tcp: [3] 192.168.11.14:3260,1 iqn.2017-07.br.ufrj.if.cc.storage.ctl:ovirt-mpio (non-flash)

[root@ovirt3 ~]# multipath -ll
36589cfc0000003400be2813b01be08c3 dm-26 FreeNAS ,iSCSI Disk      
size=10T features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 9:0:0:0  sdc 8:32 active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 10:0:0:0 sdd 8:48 active ready running
36589cfc0000007e77a9df827b65176f2 dm-12 FreeNAS ,iSCSI Disk      
size=200G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  `- 8:0:0:0  sdb 8:16 active ready running

Thanks,
V.
Comment 7 Maor 2017-07-30 04:08:20 EDT
(In reply to Vinícius Ferrão from comment #6)
> Maor, this works. I'm not sure if the multipath is in fact load balancing
> and if it will failover in case of a link failure. I should pull the cable
> to see if a VM keeps running. I can do this on the following days, since I
> don't have physical access to the datacenter this moment.

That is basically what Linux multipath should support (regardless of the oVirt configuration).

Can you please share an update regarding the test results?

> 
> But there's a problem. What's the point of iSCSI Multipath tab on the hosted
> engine?
> If I remove everything on this tab the output of "multipath -ll" and
> "iscsiadm -m session -P3" are exactly the same.

Hosted Engine with an iSCSI bond is not officially supported yet; you can track the open bug here: https://bugzilla.redhat.com/1193961

The current design of the iSCSI multipath is that the engine does not disconnect from network interfaces while the storage domain is active. (see https://bugzilla.redhat.com/show_bug.cgi?id=1094144#c2)

It should disconnect the network interfaces when the storage domain/host moves to maintenance (though there is an issue I encountered while verifying your scenario; see https://bugzilla.redhat.com/show_bug.cgi?id=1476030).

> 
> So I really don't get it. What I'm missing? Why the iSCSI Multipath tab
> exists if the results are the same?

See my previous answer: the behavior should be to deactivate the storage domain/host again, although, as I mentioned before, there is an open issue on that (see BZ 1476030).

> 
> [root@ovirt3 ~]# iscsiadm -m session
> tcp: [1] 192.168.12.14:3260,1 iqn.2017-07.br.ufrj.if.cc.storage.ctl:ovirt-he
> (non-flash)
> tcp: [2] 192.168.12.14:3260,1
> iqn.2017-07.br.ufrj.if.cc.storage.ctl:ovirt-mpio (non-flash)
> tcp: [3] 192.168.11.14:3260,1
> iqn.2017-07.br.ufrj.if.cc.storage.ctl:ovirt-mpio (non-flash)
> 
> [root@ovirt3 ~]# multipath -ll
> 36589cfc0000003400be2813b01be08c3 dm-26 FreeNAS ,iSCSI Disk      
> size=10T features='0' hwhandler='0' wp=rw
> |-+- policy='service-time 0' prio=1 status=active
> | `- 9:0:0:0  sdc 8:32 active ready running
> `-+- policy='service-time 0' prio=1 status=enabled
>   `- 10:0:0:0 sdd 8:48 active ready running
> 36589cfc0000007e77a9df827b65176f2 dm-12 FreeNAS ,iSCSI Disk      
> size=200G features='0' hwhandler='0' wp=rw
> `-+- policy='service-time 0' prio=1 status=active
>   `- 8:0:0:0  sdb 8:16 active ready running
> 
> Thanks,
> V.
Comment 8 Uwe Laverenz 2017-08-01 05:01:42 EDT
Hi Maor,

I guess there is some kind of difference or misunderstanding between what we (the users/admins) expect from "iSCSI Bonding" and what you (the developers) intended.

If you add an iSCSI storage to OVirt (2 networks, 2 targets), you already get both paths enabled (active/passive) even without "iSCSI Bonding".

What I expected from creating an iSCSI Bond was to be able to control what kind of load balancing or failover policy (fixed, least recently used, round robin) should be used for the storage domain.

What actually seems to happen is that an iSCSI Bond is something VDSM uses to monitor the storage paths?! While creating an iSCSI Bond, the system tries to change the network and/or multipathd configuration in a way that leads to a system failure, at least when you use separate storage networks.

However, the question is whether we misunderstand the concept of iSCSI Bonding in oVirt or whether the described behaviour is a bug. :)

thanks,
Uwe
Comment 9 Maor 2017-08-01 08:21:14 EDT
(In reply to Uwe Laverenz from comment #8)
> Hi Maor,
> 
> I guess there is some kind of difference or misunderstanding of what we (the
> users/admins) expect from "iSCSI Bonding" and what you (the developers)
> intended.
> 
> If you add an iSCSI storage to OVirt (2 networks, 2 targets), you already
> get both paths enabled (active/passive) even without "iSCSI Bonding".


Are you referring to 2 network interfaces on the storage server, or 2 network interfaces on the host?
If you do not declare an iSCSI bond in oVirt, the engine will not pass VDSM the non-required networks to connect through, and iscsiadm will not connect through the new network interface of the host.

> 
> What I expected from creating an iSCSI Bond was to be able to control what
> kind of load balancing or failover policy (fixed, least recently used, round
> robin) should be used for the storage domain.


That should be done before you create the iSCSI bond in the engine, through the iSCSI multipath configuration on the Linux host (see https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html-single/DM_Multipath/index.html#mpio_configfile )

> 
> What actually seems to happen is that an iSCSI Bond is something the vdsm
> uses to monitor the storage paths?! While creating an iSCSI Bond the system
> tries to change the network and/or multipathd configuration in a way that
> leads to a system failure, at least when you use separated storage networks.

An iSCSI bond in the engine configures the non-required networks that should be connected to the iSCSI storage domain.
You can have 10 non-required networks, but if you configure only 2 of them in the iSCSI bond, then iscsiadm will only connect through those 2 which you configured.

VDSM does monitor those networks, and if all of them fail to connect to the iSCSI storage domain, it should go to a non-active state.

The system should not change the multipathd configuration; it should only use the connect command to connect through those network interfaces. multipathd should run on the host regardless of the iSCSI bond.

> 
> However, the question is whether we don't understand the concept of iSCSI
> Bonding in OVirt or the described behaviour is a bug. :)
> 
> thanks,
> Uwe

Please let me know if anything is still unclear
Comment 10 Vinícius Ferrão 2017-08-04 16:17:22 EDT
Hello Maor, I answered through email but the messages weren't attached to the issue. Re-answering:

About the tests: I was able to test the architecture today. It failed over successfully. I’m not sure whether MPIO is increasing bandwidth; I was not able to generate enough traffic to see if it would exceed gigabit speeds over the two paths. But this is almost good.

On the hosted-engine issue: perhaps I explained the question in a confusing way. Sorry.

I’m not running the Self-Hosted Engine with multipath because, as you said, it’s unsupported. So you can see that I have three iSCSI connections: one is the Self-Hosted Engine and the remaining two are the multipath connections to a generic LUN, just for testing.

What I’m talking about is the necessity of configuring the iSCSI Multipath tab in the oVirt web interface. I removed everything from this tab and the behavior was the same as with the configuration you asked me to do, with the iSCSI1 network selected only for the first path and the iSCSI2 network only for the second path.

So the question again is: what is the purpose of this tab? The documentation clearly says that this should be configured for multipath to work, but that does not appear to be 100% accurate, since the result was the same whether I configured it (in the way you said) or not at all.

Thanks,
V.
Comment 11 Uwe Laverenz 2017-08-07 09:21:12 EDT
Hello Maor,

(In reply to Maor from comment #9)
> Are you referring to 2 network interfaces on the storage server? or 2
> network interfaces on the host?

I use both: 2 separate network interfaces on the storage server and 2 separate network interfaces on the host. The interfaces are connected to 2 separate networks/VLANs, one network card per network. These networks are dedicated storage networks and therefore aren't routed or otherwise reachable from other networks.

When I connect my host to the iSCSI storage server, I first connect to the portal on the first network and then to the portal on the second network. The host connects to both portals and uses both network interfaces. The host automatically detects that there are 2 paths for each LUN. So you already have "multipath" at this point, configured as active/passive. The only missing detail is setting the round robin policy, which can easily be done via multipath.conf.


> If you do not declare an iSCSI bond in oVirt, the engine will not pass VDSM
> the non-required network interfaces to connect to it, and iscsiadm will not
> be connected with the new network interface of the host.

With the setup described above, all network interfaces for iSCSI are already connected and iscsiadm uses them; there aren't any unconnected "non-required" interfaces left.

> That should be done before you create the iSCSI bond in the engine, thorough
> the iSCSI multipath in the linux host (see
> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/
> html-single/DM_Multipath/index.html#mpio_configfile )

This is configured in the "defaults" section of "/etc/multipath.conf". The problem is that VDSM overwrites this file unless you put a "# VDSM PRIVATE" marker into it. So you can't get a "round robin" policy without keeping VDSM out.
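
For reference, the part I mean looks roughly like this (a sketch only; multibus and "round-robin 0" are the standard dm-multipath settings for this, and the rest of the VDSM-generated file is assumed to stay in place):

    defaults {
        # put all paths of a LUN into one path group and spread I/O across them
        path_grouping_policy    multibus
        path_selector           "round-robin 0"
    }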


> iSCSI Bond in the engine configure the non-required network interfaces that
> should be connected to the iSCSI storage domain.
> You can have 10 non-required networks, but if you will configure only 2 of
> them in the iSCSI bond then the iscsiadm will only connect with those 2
> which you configured.

What would be the correct order of configuration? To use iSCSI Bonding, would I only connect the interface in the first network, declare the second network "non-required", and, instead of connecting it in the storage domain dialog, use it in an iSCSI bond?

> VDSM does monitor those networks and if all of those networks fail to
> connect with the iSCSI storage domain, then it should go to non-active state.

So the standard setup without an iSCSI bond is not monitored by VDSM as far as network connectivity is concerned? But I guess the availability of the storage device is monitored (I/O errors)?

> The system should not change the multipathd configuration, it should only
> use the connect command to connect to those network interfaces, multipathd
> should run on the host regardless of iSCSI bond.

VDSM overwrites the multipathd config file unless you tell it not to. So it does change the multipathd configuration.

> Please let me know if anything is still unclear

Sorry, I still don't understand what problem iSCSI Bonding would solve in my setup. The only thing might be the monitoring of network connectivity, but IMHO the control of failover policy is the job of multipathd.

IMHO a settings dialog for VDSM's creation of the multipath configuration file would be more useful than iSCSI Bonding.

Thank you,
Uwe
Comment 12 Maor 2017-08-10 05:10:45 EDT
(In reply to Uwe Laverenz from comment #11)
> Hello Maor,
> 
> (In reply to Maor from comment #9)
> > Are you referring to 2 network interfaces on the storage server? or 2
> > network interfaces on the host?
> 
> I use both: 2 separate network interfaces on the storage server and 2
> separate network interfaces on the host. The interfaces are connected to 2
> separate networks/VLANs, one network card per network. These networks are
> dedicated storage networks and therefor aren't routed or otherwise reachable
> from other networks.
> 
> When I connect my host to the iSCSI storage server I first connect to the
> portal on the first network and then to the portal on the second network.
> The host connects to both portals and uses both network interfaces. The host
> detects automatically that there are 2 paths for each LUN. So you already
> have "multipath" at this point, configured as active/passive. The only
> missing detail is the setting of round robin policy which can be easily be
> done via multipath.conf.
> 
> 
> > If you do not declare an iSCSI bond in oVirt, the engine will not pass VDSM
> > the non-required network interfaces to connect to it, and iscsiadm will not
> > be connected with the new network interface of the host.
> 
> With the described setup above, all network interfaces for iSCSI are
> connected already and iscsiadm uses them, there aren't any unconnected
> "non-required" interfaces left.

Can you please try this with a new DC after the host logs out from all the networks using iscsiadm?
It could be that those networks were still connected (BZ 1476030).

> 
> > That should be done before you create the iSCSI bond in the engine, thorough
> > the iSCSI multipath in the linux host (see
> > https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/
> > html-single/DM_Multipath/index.html#mpio_configfile )
> 
> This is configured in the "defaults" section in "/etc/multipath.conf". The
> problem is, that VDSM overwrites this file unless you put a "# VDSM PRIVATE"
> into it. So you can't get "round robin" policy without keeping VDSM out.

Can you please open an RFE for that issue?

> 
> 
> > iSCSI Bond in the engine configure the non-required network interfaces that
> > should be connected to the iSCSI storage domain.
> > You can have 10 non-required networks, but if you will configure only 2 of
> > them in the iSCSI bond then the iscsiadm will only connect with those 2
> > which you configured.
> 
> What would be the correct order of configuration? In order to use iSCSI
> Bonding I would only connect the interface in the first network, declare the
> second network interface "non-required" and instead of connecting it in the
> storage domain dialog I use it in an iSCSI bond?

Basically yes, non-required networks should be declared in the iSCSI bond.
Here is an example of how to do that:
1. Add an iSCSI storage domain in maintenance mode
2. Configure an iSCSI bond with the non-required networks
3. Activate the iSCSI storage domain

> 
> > VDSM does monitor those networks and if all of those networks fail to
> > connect with the iSCSI storage domain, then it should go to non-active state.
> 
> The standard setup without iSCSI Bond is not being monitored by VDSM as far
> as network connectivity is concerned? But I guess the availability of the
> storage device ist monitored (I/O errors)?

The standard setup monitors the default network connectivity with the storage domains; if you also want to use non-required networks for an iSCSI storage domain, you should declare those as part of the iSCSI bond.


> 
> > The system should not change the multipathd configuration, it should only
> > use the connect command to connect to those network interfaces, multipathd
> > should run on the host regardless of iSCSI bond.
> 
> VDSM overwrites the multipathd config file unless you tell him not to. So it
> does change the multipathd configuration.
> 
> > Please let me know if anything is still unclear
> 
> Sorry, I still don't understand what problem iSCSI Bonding would solve in my
> setup. The only thing might be the monitoring of the network connectivity
> but IMHO the control of failover policy is the job of multipathd.
> 
> IMHO a settings dialog for VDSM's creation of the multipath configuration
> file would be more useful than iSCSI Bonding.
> 
> Thank you,
> Uwe
Comment 13 Maor 2017-08-28 09:23:03 EDT
Hi,

Is there anything else which is still unclear?
Comment 14 Vinícius Ferrão 2017-08-28 12:24:42 EDT
Hello Maor, I received the message about the phone call, but I was traveling and just forgot to answer it. Sorry!

The points explained by Uwe still persist, about the design decisions of multipath handling and how it would/should behave.

I'm not sure if Uwe contacted you for more details or not.
Comment 15 Maor 2017-08-29 06:26:52 EDT
You can say that the iSCSI bond is mainly for monitoring, and for setting the storage domain as non-operational once the host cannot connect to the storage using the non-required networks.
Regarding multipath.conf, IIRC there is a comment you can add to the file so that VDSM will not overwrite your configuration.
Comment 16 Maor 2017-08-29 06:44:25 EDT
(In reply to Maor from comment #15)
> You can say that the iSCSI bond is mainly for monitoring and set the storage
> domain as non-operational once the the Host can not connect to the storage
> using the non-required networks.
> Regarding multipathd.conf IIRC there should be some comment which you can
> add in the file that VDSM will not run over your configuration

See this comment in multipath.py on VDSM:
# The second line of multipath.conf may contain PRIVATE_TAG. This means
# vdsm-tool should never change the conf file even when using the --force flag.

IIUC that means that if you add this private tag (# VDSM PRIVATE) at the top of your configuration, VDSM will not overwrite it.
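
In other words, the top of /etc/multipath.conf would look something like this (the revision line shown is just an example of what VDSM writes there; the "# VDSM PRIVATE" tag on the second line is what matters):

    # VDSM REVISION 1.3
    # VDSM PRIVATE

    # ... rest of the configuration, which vdsm-tool will now leave alone ...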
Comment 17 Maor 2017-09-03 09:04:44 EDT
It seems that the configuration of the iSCSI bond in the engine was resolved. There are still unclear issues from the user's point of view, but those can be discussed on the users mailing list or in a phone meeting.

Closing the bug for now; let's move the discussion of the other unclear issues back to the mailing list.
