Red Hat Bugzilla – Bug 980139
[Need to add deps on kernel] vdsm iscsi failover taking too long during controller maintenance
Last modified: 2016-03-09 14:17:59 EST
Description of problem:
iSCSI multipath failover delays result in performance issues and guest being marked as 'Down'. Path failover is taking too long during controller failure or maintenance.
Version-Release number of selected component (if applicable):
Red Hat Enterprise Virtualization Hypervisor release 6.4 (20130501.0.el6_4)
Steps to Reproduce:
1. Present iSCSI storage from ALUA configured storage
2. Perform storage controller reboot
3. multipath failover is delayed and results in guest stability issues
36005076802808538a00000000000009b dm-10 IBM,2145
size=2.0T features='0' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| `- 1:0:0:3 sde 8:64 active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
  `- 2:0:0:3 sdm 8:192 active ready running
Storage domain <domain> experienced a high latency of 28.7959890366 seconds from host <rhev-host>. This may cause performance and functional issues. Please consult your Storage Administrator.
Expected results:
Fast failover time with no latency messages logged and no impact to guests.
- Reduce the replacement_timeout since all iSCSI storage is default candidate for multipath:
node.session.timeo.replacement_timeout = 120
Default is 2 minutes.
replacement_timeout will control how long to wait for session re-establishment
before failing pending SCSI commands and commands that are being operated on by
the SCSI layer's error handler up to a higher level like multipath or to
an application if multipath is not being used.
The current workaround is to manually reduce this value in the iSCSI target record database.
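As an illustration of the workaround, here is a minimal sketch that rewrites the replacement_timeout line in the text of a node record. It works only on a string and does not touch /var/lib/iscsi. The function name set_replacement_timeout and the sample record in the usage note are hypothetical, and the "key = value" layout is assumed from the setting quoted above.

```python
import re

KEY = "node.session.timeo.replacement_timeout"


def set_replacement_timeout(record_text: str, seconds: int) -> str:
    """Return record_text with the replacement_timeout entry set to `seconds`.

    Assumes one "key = value" setting per line, as in the setting quoted
    in this report. If the key is missing, it is appended at the end.
    """
    pattern = re.compile(
        r"^([ \t]*" + re.escape(KEY) + r"[ \t]*=[ \t]*).*$", re.MULTILINE
    )
    new_text, count = pattern.subn(
        lambda m: m.group(1) + str(seconds), record_text
    )
    if count == 0:
        # Key not present: append it, keeping the text newline-terminated.
        if new_text and not new_text.endswith("\n"):
            new_text += "\n"
        new_text += f"{KEY} = {seconds}\n"
    return new_text
```

For example, set_replacement_timeout("node.session.timeo.replacement_timeout = 120\n", 5) returns the same line with 5 as the value. On a real host the record is normally changed through iscsiadm rather than by editing files directly.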
Is this something vdsm should handle, or something to send to dmm?
This looks like a generic iscsi/dmm issue to me (the default seems wrong when using dmm). If this is wrong, please recommend what the value should be for the virt use case and move back.
In RHEL6, multipath doesn't override iscsi's default parameters. The instructions for setting up iscsi devices for multipath are at /usr/share/doc/iscsi-initiator-utils-<version>/README: Chapter 8.1
In RHEL7, multipath will automatically set the recovery_tmo iscsi sysfs parameter to the value defined for fast_io_fail_tmo in /etc/multipath.conf. The recovery_tmo parameter is normally set by the node.session.timeo.replacement_timeout configuration parameter, so this is equivalent to modifying the node.session.timeo.replacement_timeout for just the multipath devices.
It's possible to backport this, but it will most likely have to wait for RHEL-6.6.
If I understand comment 6 correctly, then this problem can probably be solved when we change the defaults at build-time.
Because of this I am moving this back to RHEV-H.
I don't think this problem is limited to node only. It's a generic rhel/vdsm issue (changing component/whiteboard).
I verified that with device-mapper-multipath-0.4.9-78.el6.x86_64 the recovery timeout is now properly configured:
# iscsiadm -m session -P 2
- Recovery Timeout: 120
+ Recovery Timeout: 5
In the iscsi database (/var/lib/iscsi) we still have the default value:
node.session.timeo.replacement_timeout = 120
but then it's properly configured once multipath takes over.
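To check this on more than one session, the "Recovery Timeout" lines can be pulled out of saved "iscsiadm -m session -P 2" output. This is a minimal sketch that works on text already captured; the function name recovery_timeouts is hypothetical, and the line format is assumed from the output quoted in this report.

```python
import re
from typing import List


def recovery_timeouts(iscsiadm_output: str) -> List[int]:
    """Return every 'Recovery Timeout' value found in the given text.

    Only lines of the form 'Recovery Timeout: <seconds>' are matched, so
    'Target Reset Timeout' and 'Abort Timeout' lines are ignored.
    """
    pattern = re.compile(
        r"^[ \t]*Recovery Timeout:[ \t]*(-?\d+)[ \t]*$", re.MULTILINE
    )
    return [int(m.group(1)) for m in pattern.finditer(iscsiadm_output)]


def sessions_above(iscsiadm_output: str, limit: int) -> int:
    """Count sessions whose recovery timeout is still above `limit` seconds."""
    return sum(1 for t in recovery_timeouts(iscsiadm_output) if t > limit)
```

A result such as sessions_above(text, 5) > 0 would suggest that some session is still using a longer timeout, for example the default of 120.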
We'll keep this bz on vdsm to track the spec requirement changes, e.g.:
Requires: device-mapper-multipath >= 0.4.9-78
(In reply to Federico Simoncelli from comment #18)
> Requires: device-mapper-multipath >= 0.4.9-78
This will only be available in RHEL 6.6 - let's wait for that in order to consume it.
(In reply to Allon Mureinik from comment #19)
> (In reply to Federico Simoncelli from comment #18)
> > Requires: device-mapper-multipath >= 0.4.9-78
> This will only be available in RHEL 6.6 - let's wait for that in order to
> consume it.
Pushing out to RHEV 3.6.0 based on this.
We can always pull it back in if device-mapper-multipath becomes available sooner.
Testing the fix for bug 880738 shows that we must fix this first.
I blocked access to the storage server using iptables. This is probably the worst case: a removed device should cause the storage server to return some error, but this test just drops all packets sent to the server.
With the default node.session.timeo.replacement_timeout = 120, "multipath -ll" blocks for 2 minutes before it fails.
With node.session.timeo.replacement_timeout = 10, "multipath -ll" fails after 15-25 seconds.
Ben, can we have a backport for 6.5/7.0?
If not, how about adding a udev rule to modify the timeout? The rule can do this:
echo 5 > /sys/class/iscsi_session/xxx/recovery_tmo
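The same write can be sketched for every session at once, with "xxx" above standing for each session directory. This is only an illustration of what such a rule or helper would do, not the attached patch. The function name set_recovery_tmo and the sysfs_root parameter are hypothetical; the parameter exists so the sketch can be tried against a scratch directory.

```python
import glob
import os
from typing import List


def set_recovery_tmo(seconds: int,
                     sysfs_root: str = "/sys/class/iscsi_session") -> List[str]:
    """Write `seconds` to the recovery_tmo file of every session directory.

    Returns the list of files that were written. Writing under the real
    sysfs path needs root and at least one logged-in iSCSI session.
    """
    updated = []
    pattern = os.path.join(sysfs_root, "*", "recovery_tmo")
    for path in sorted(glob.glob(pattern)):
        with open(path, "w") as f:
            f.write(f"{seconds}\n")
        updated.append(path)
    return updated
```

Calling set_recovery_tmo(5) would be the equivalent of running the echo command once per session. A value written this way applies only to the running session; the stored node record keeps its own replacement_timeout.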
The attached patch handles only iSCSI devices, which can be handled when adding iSCSI nodes in /var/lib/iscsi/nodes/. This keeps the change localized to devices used by vdsm.
FC devices should be handled differently, probably using a udev rule.
Moving to NEW since we only have a partial fix and we are waiting for platform to fully correct this.
Nir, Allon, Yaniv D. - the bug made it to 7.2 (BZ 1139038) - anything blocking this bug?
(In reply to Yaniv Kaul from comment #35)
> Nir, Allon, Yaniv D. - the bug made it to 7.2 (BZ 1139038) - anything
> blocking this bug?
We are waiting until this kernel is released. Vdsm will need to require
this kernel version.
Nir - we are good to go - seems like the blocking bugs were resolved a long time ago. Please continue with this one.
(In reply to Yaniv Kaul from comment #37)
> Nir - we are good to go - seems like the blocking bugs were resolved a long
> time ago. Please continue with this one.
From the bugs:
F21: 4.1.6
F22: 4.1.6
F23: 4.2.0-rc8
rawhide: 4.2.0-rc8
(In reply to Yaniv Kaul from comment #38)
> (In reply to Yaniv Kaul from comment #37)
> > Nir - we are good to go - seems like the blocking bugs were resolved a long
> > time ago. Please continue with this one.
> From the bugs:
> F21: 4.1.6
> F22: 4.1.6
We depend on these versions now (see bug 1253790).
> F23: 4.2.0-rc8
> rawhide: 4.2.0-rc8
Not supported yet.
We cannot depend on this in ovirt-3.6, which must run on el 7.1. We will depend on this kernel when el 7.2 is released.
I updated the vdsm patch to depend on the 7.1.z version:
But this kernel is not available yet on CentOS; we can see here
that the latest kernel is kernel-3.10.0-229.14.1.el7.x86_64.rpm,
which does not include this fix.
Currently we separate RHEL and CentOS dependencies, so we cannot
depend on a package which is not available on CentOS yet.
We have different requirements now for RHEL and CentOS. This is now fixed for
RHEL because the package is available, and not fixed on CentOS, because
nobody has published the fixed package there yet.
We will require the fixed kernel on CentOS when it is available.
Nir, can you please add some doctext on the impact this has on the customer?
Target Milestone/Release -
Can you please set it right?
Recovery timeout is now 15 seconds:
[root@green-vdsb ~]# iscsiadm -m session -P 2
Recovery Timeout: 15
Target Reset Timeout: 30
LUN Reset Timeout: 30
Abort Timeout: 15
[root@green-vdsb ~]# cat /sys/class/iscsi_session/session1/recovery_tmo
vdsm requires kernel >= 3.10.0-229.17.1.el7:
[root@green-vdsb ~]# repoquery --requires vdsm |grep kernel
kernel >= 3.10.0-229.17.1.el7
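The requirement can be illustrated with a rough version check. This sketch compares only the numeric fields of a version-release string and is not the RPM comparison algorithm, so it should be treated as an approximation for kernel strings like the ones in this report. The function names numeric_key and satisfies are hypothetical.

```python
import re
from typing import Tuple


def numeric_key(version_release: str) -> Tuple[int, ...]:
    """Reduce a string like '3.10.0-229.17.1.el7' to its numeric fields."""
    return tuple(int(n) for n in re.findall(r"\d+", version_release))


def satisfies(installed: str, required: str) -> bool:
    """Return True if `installed` is at least `required`.

    Simplified: fields are compared left to right as integers, and
    non-numeric parts such as 'el' are ignored.
    """
    return numeric_key(installed) >= numeric_key(required)
```

With the versions mentioned in this report, satisfies("3.10.0-229.14.1.el7", "3.10.0-229.17.1.el7") is False, which matches the comment that the 229.14.1 kernel does not include the fix.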
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.