Red Hat Bugzilla – Bug 1272083
Consume fix for "iscsi_session recovery_tmo revert back to default when a path becomes active"
Last modified: 2017-11-19 05:40:15 EST
+++ This bug was initially created as a clone of Bug #1253790 +++
Description of problem:
The iSCSI default replacement_timeout is 120 seconds, which makes iSCSI
failover too slow in a multipath setup. In vdsm, this may block
multiple unrelated vdsm threads for many minutes when an lvm, multipath,
or scsi scan operation blocks.
This issue was resolved in multipath (bug 1099932) by setting the iscsi
session recovery_tmo sysfs attribute to the multipath fast_io_fail_tmo
value (5 seconds in the vdsm configuration). However, this setting
was reverted to the default 120 seconds after a device went down and up
again, or after a restart of the iscsid daemon (bug 1139038).
This issue was fixed in kernel-3.10.0-295.el7. In this version, setting
the session recovery_tmo using sysfs overrides the default value defined in
the iscsid configuration file.
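To check whether a session has reverted to the default, the recovery_tmo
attribute can be read back from sysfs. The following is a minimal sketch,
assuming the sessions are exposed under /sys/class/iscsi_session/session*/
and that 5 seconds is the expected value, as in the vdsm configuration
mentioned above. The function names are hypothetical and are not part of vdsm.

```python
import glob
import os


def read_recovery_tmo(sysfs_root="/sys/class/iscsi_session"):
    """Return {session_name: recovery_tmo_seconds} for every iSCSI session."""
    result = {}
    pattern = os.path.join(sysfs_root, "session*", "recovery_tmo")
    for path in sorted(glob.glob(pattern)):
        session = os.path.basename(os.path.dirname(path))
        with open(path) as f:
            result[session] = int(f.read().strip())
    return result


def sessions_with_unexpected_tmo(expected=5, sysfs_root="/sys/class/iscsi_session"):
    """List sessions whose recovery_tmo differs from the expected value."""
    values = read_recovery_tmo(sysfs_root)
    return [name for name, tmo in values.items() if tmo != expected]


if __name__ == "__main__":
    for name, tmo in read_recovery_tmo().items():
        print(name, tmo)
```

Running this before and after a path goes down and comes back up would show
whether the value stays at 5 or returns to 120.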
Vdsm needs to require a kernel containing the fix for this issue on Fedora
versions that include it.
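The requirement itself belongs in the package dependencies. Purely to
illustrate the version threshold, here is a sketch of a runtime comparison
against kernel-3.10.0-295.el7. It assumes an EL7-style release string such as
the one returned by uname -r, and it rejects other release strings because
this bug does not state the Fedora kernel version that carries the fix. The
helper names are hypothetical.

```python
import platform
import re

# First EL7 kernel build named in this bug as containing the fix.
FIXED_EL7 = (3, 10, 0, 295)


def parse_el7_release(release):
    """Parse a release such as '3.10.0-295.el7.x86_64' into a tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if match is None or ".el7" not in release:
        raise ValueError("not an EL7 kernel release: %r" % release)
    return tuple(int(group) for group in match.groups())


def has_recovery_tmo_fix(release):
    """True if an EL7 kernel release is at least 3.10.0-295."""
    return parse_el7_release(release) >= FIXED_EL7


if __name__ == "__main__":
    print(has_recovery_tmo_fix(platform.release()))
```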
--- Additional comment from Eyal Edri on 2015-10-14 18:38:14 IDT ---
Shouldn't this bug have the 3.5.z? flag set if the target milestone is set to 3.5.6?
I'm trying to understand how clone candidates are treated in the new classification.
--- Additional comment from Allon Mureinik on 2015-10-15 10:45:06 IDT ---
(In reply to Eyal Edri from comment #1)
> Shouldn't this bug have the 3.5.z? flag set if the target milestone is set to
> 3.5.6?
> I'm trying to understand how clone candidates are treated in the new
> classification.
Yeah, probably so.
This is waiting for qa-ack+ so we can clone it.
Gil - can you assist?
Manually cloning, as the job refuses to clone vdsm bugs for some obscure reason.
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.
Aren't this bug and the one below the same?
https://bugzilla.redhat.com/show_bug.cgi?id=1273421
(In reply to Aharon Canan from comment #2)
> Aren't this bug and the one below the same?
> https://bugzilla.redhat.com/show_bug.cgi?id=1273421
No. Bug 1273421 is about the multipath fix, which has been in place for some
time but was not enough. Multipath was configuring devices properly, but once
a device became faulty (e.g. a network issue) and then active again, iscsid
overrode the device configuration with the default (120 seconds). This
was fixed recently in the kernel, and now we require that kernel.
But this is exactly what I checked in the other bug, that we require the kernel (please see comment #4 on https://bugzilla.redhat.com/show_bug.cgi?id=1273421).
Anyway, just to be sure about both and not to miss anything,
can you please approve and add verification steps?
(In reply to Aharon Canan from comment #4)
> But this is exactly what I checked in the other bug, that we require the kernel
> (please see comment #4 on
> https://bugzilla.redhat.com/show_bug.cgi?id=1273421).
> Anyway, just to be sure about both and not to miss anything,
> can you please approve and add verification steps?
You are correct, it is the same bug, but for different products. This is an
oVirt bug, and bug 1273421 is a RHEV bug.
The fix is the same: requiring the right kernel for RHEL/Fedora.
Verified using vt18.2
oVirt 3.5.6 has been released and the bz verified, moving to closed current release.