Bug 1254292

Summary: Multipath is not correctly identifying iscsi devices, and misconfiguring them.
Product: Red Hat Enterprise Linux 7 Reporter: Ben Marzinski <bmarzins>
Component: device-mapper-multipath Assignee: Ben Marzinski <bmarzins>
Status: CLOSED ERRATA QA Contact: Lin Li <lilin>
Severity: unspecified Docs Contact:
Priority: high    
Version: 7.1 CC: agk, bmarzins, bmcclain, gklein, hannsj_uhl, heinzm, lilin, mhoyer, msnitzer, nsoffer, prajnoha, snagar, ylavi, zkabelac
Target Milestone: rc Keywords: Triaged, ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: device-mapper-multipath-0.4.9-82.el7 Doc Type: Bug Fix
Doc Text:
Cause: To determine whether a device was an iscsi device, multipath checked the "tgtname" sysfs file. That file has since been renamed to "targetname", so multipath did not identify iscsi devices as such and did not use the iscsi-specific configuration functions.
Consequence: Multipath did not correctly set fast_io_fail_tmo for iscsi devices.
Fix: Multipath now checks for both "tgtname" and "targetname".
Result: Multipath correctly identifies iscsi devices as such, and configures them correctly.
Story Points: ---
Clone Of:
: 1256074 1267131 Environment:
Last Closed: 2015-11-19 12:57:35 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1154205, 1255173, 1256074, 1261141, 1267131    

Description Ben Marzinski 2015-08-17 15:55:49 UTC
Description of problem:
Multipath's code for setting fast_io_fail_tmo, dev_loss_tmo, and the device target_name is specific to the type of device that it is running on.  In this case, multipath isn't correctly identifying an iscsi device as such, and so these values aren't getting set.


Version-Release number of selected component (if applicable):
device-mapper-multipath-0.4.9-77.el7 

How reproducible:
Always on affected setups.  Not sure exactly what's required for a reproducing setup yet.

Steps to Reproduce:
1. configure multipath with

fast_io_fail_tmo 5

2. create some multipath devices
3. check /sys/class/iscsi_session/session<X>/recovery_tmo

to verify that multipath did update the recovery timeout to 5 for the appropriate sessions
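
The check in step 3 can be scripted. The following is a minimal sketch, not part of multipath: the helper name mismatched_sessions and its sysfs_root parameter are made up for illustration, and it looks at every iscsi session on the host rather than only the sessions backing multipath paths.

```python
import glob
import os


def mismatched_sessions(expected, sysfs_root="/sys/class/iscsi_session"):
    """Return {session_name: recovery_tmo} for sessions whose value differs from expected."""
    mismatches = {}
    # One recovery_tmo file per session directory, e.g. session1/recovery_tmo.
    pattern = os.path.join(sysfs_root, "session*", "recovery_tmo")
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            value = int(f.read().strip())
        if value != expected:
            session = os.path.basename(os.path.dirname(path))
            mismatches[session] = value
    return mismatches
```

Calling mismatched_sessions(5) on an affected host would be expected to list the sessions still at their default, and to return an empty dict once multipath has applied fast_io_fail_tmo.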

Actual results:
value still at default

Expected results:
value updated to multipath.conf configured value

Additional info:
Looking at the verbose output, you can see that the tgt_node_name never got set.

Aug 14 00:48:45 | Discover device /sys/devices/platform/host4/session2/target4:0:0/4:0:0:1/block/sda
Aug 14 00:48:45 | sda: not found in pathvec  
Aug 14 00:48:45 | sda: mask = 0x3f
Aug 14 00:48:45 | sda: dev_t = 8:0
Aug 14 00:48:45 | open '/sys/devices/platform/host4/session2/target4:0:0/4:0:0:1/block/sda/size'
Aug 14 00:48:45 | sda: size = 104857600
Aug 14 00:48:45 | sda: vendor = Red Cat
Aug 14 00:48:45 | sda: product = VIRTUAL-DISK
Aug 14 00:48:45 | sda: rev = 0001
Aug 14 00:48:45 | sda: h:b:t:l = 4:0:0:1
Aug 14 00:48:45 | open '/sys/devices/platform/host4/session2/target4:0:0/4:0:0:1/state'
Aug 14 00:48:45 | sda: path state = running  

Aug 14 00:48:45 | sda: 51200 cyl, 64 heads, 32 sectors/track, start at 0
Aug 14 00:48:45 | sda: serial =                               beaf71
Aug 14 00:48:45 | sda: get_state
Aug 14 00:48:45 | sda: path checker = directio (internal default)
Aug 14 00:48:45 | sda: checker timeout = 30000 ms (sysfs setting)
Aug 14 00:48:45 | directio: called for 800   
Aug 14 00:48:45 | directio: called in synchronous mode
Aug 14 00:48:45 | directio: starting new request
Aug 14 00:48:45 | directio: io finished 4096/0
Aug 14 00:48:45 | sda: state = up
Aug 14 00:48:45 | sda: uid_attribute = ID_SERIAL (internal default)
Aug 14 00:48:45 | sda: uid = 1IET_00070001 (udev)
Aug 14 00:48:45 | sda: detect_prio = 1 (config file default)
Aug 14 00:48:45 | sda: prio = const (internal default)
Aug 14 00:48:45 | sda: prio =  (internal default)
Aug 14 00:48:45 | sda: const prio = 1


Right after

Aug 14 00:48:45 | sda: h:b:t:l = 4:0:0:1

there should be a line with

tgt_node_name = <name>

if multipath identified the device as a known type (which iscsi is).
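
The log check described above can be automated when there are many paths. This is only a sketch: the function name is hypothetical, and it assumes the tgt_node_name line carries the same "<dev>: " prefix as the other lines in the verbose output, which this report does not show.

```python
import re

# Patterns follow the "timestamp | dev: key = value" shape of the verbose output.
HBTL = re.compile(r"\| (\w+): h:b:t:l = ")
TGT = re.compile(r"\| (\w+): tgt_node_name = ")  # assumed line format


def paths_missing_tgt_node_name(lines):
    """Return devices whose h:b:t:l line is not directly followed by tgt_node_name."""
    missing = []
    previous = None  # device named on the preceding h:b:t:l line, if any
    for line in lines:
        if previous is not None:
            m = TGT.search(line)
            if not (m and m.group(1) == previous):
                missing.append(previous)
            previous = None
        m = HBTL.search(line)
        if m:
            previous = m.group(1)
    if previous is not None:
        # The log ended right after an h:b:t:l line.
        missing.append(previous)
    return missing
```

Fed the lines of the output quoted above, this would report sda, since the line after h:b:t:l is the open of the state file.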

Comment 2 Nir Soffer 2015-08-17 17:23:45 UTC
Reproduced on RHEL 7.1:
device-mapper-multipath-0.4.9-77.el7_1.1.x86_64

# mpathconf
multipath is enabled
find_multipaths is disabled
user_friendly_names is disabled
dm_multipath module is loaded
multipathd is running

# cat /etc/multipath.conf
defaults {
    polling_interval            5
    no_path_retry               fail
    user_friendly_names         no
    flush_on_last_del           yes
    fast_io_fail_tmo            5
    dev_loss_tmo                30
    max_fds                     4096
}

devices {
    device {
        all_devs                yes
        no_path_retry           fail
    }
}

# cat /sys/class/iscsi_session/session*/recovery_tmo
120
120

In multipath -ll I also get a strange error (this host does not have FC hardware):

# multipath -ll
Aug 17 20:18:37 | vda: No fc_host device for 'host-1'
Aug 17 20:18:37 | vdb: No fc_host device for 'host-1'
Aug 17 20:18:37 | vda: No fc_host device for 'host-1'
Aug 17 20:18:37 | vdb: No fc_host device for 'host-1'
Aug 17 20:18:37 | vda: No fc_remote_port device for 'rport--1:-1-0'
Aug 17 20:18:37 | vdb: No fc_remote_port device for 'rport--1:-1-0'
1IET_00070001 dm-3 Red Cat ,VIRTUAL-DISK    
size=50G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  `- 7:0:0:1  sda 8:0    active ready running
1IET_0006000a dm-93 Red Cat ,VIRTUAL-DISK    
size=20G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  `- 8:0:0:10 sdt 65:48  active ready running

The storage "server" is running rhel 6.6 with tgt.

This machine is a VM. vda and vdb are virtual disks using the virtio driver,
running under virt-manager on a Fedora 21 host.

Comment 3 Nir Soffer 2015-08-17 17:30:36 UTC
Reproduced again on RHEL 7.2:
device-mapper-multipath-0.4.9-81.el7.x86_64

Using same configuration as in comment 2

Comment 4 Ben Marzinski 2015-08-17 19:01:12 UTC
It seems the sysfs parameter for the iscsi device target name has changed from "tgtname" to "targetname".  I'm going to change the code to check for both; that fixes the issue for me.
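
For illustration only, the "check for both" idea looks roughly like this in Python. This is not the actual patch to multipath (which is written in C); the function name and the session_dir parameter are hypothetical, and only the two attribute names come from this report.

```python
import os

# Attribute names from the comment above: the newer name first, then the old one.
TARGET_NAME_ATTRS = ("targetname", "tgtname")


def read_iscsi_target_name(session_dir):
    """Return the target name from the first attribute file that can be read, else None."""
    for attr in TARGET_NAME_ATTRS:
        path = os.path.join(session_dir, attr)
        try:
            with open(path) as f:
                return f.read().strip()
        except OSError:
            # Attribute missing or unreadable under this name; try the next one.
            continue
    return None
```

A None result corresponds to the failure seen here: with only one name checked, the device is not recognized as iscsi and the iscsi-specific timeout setup is skipped.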

Comment 5 Yaniv Lavi 2015-08-18 11:43:07 UTC
Bronce, can we mark this one as blocker?
It breaks RHEV ISCSI storage.

Comment 7 Nir Soffer 2015-08-18 14:52:13 UTC
Bronce, can we have this fix for 7.1.z?

Comment 9 Ben Marzinski 2015-08-19 18:26:43 UTC
fixed.

Comment 12 Lin Li 2015-09-16 09:59:16 UTC
Reproduced on RHEL-7.1; device-mapper-multipath-0.4.9-77.el7
steps:
1.#yum -y install device-mapper device-mapper-multipath
2.#rpm -qa | grep multipath
  device-mapper-multipath-libs-0.4.9-77.el7.x86_64
  device-mapper-multipath-0.4.9-77.el7.x86_64
3.#mpathconf --enable
4.#service multipathd restart
5.#configure multipath 
6.# mpathconf 
  multipath is enabled
  find_multipaths is disabled
  user_friendly_names is disabled
  dm_multipath module is loaded
  multipathd is running
7.# cat /etc/multipath.conf 
defaults {
    polling_interval            5
    no_path_retry               fail
    user_friendly_names         no
    flush_on_last_del           yes
    fast_io_fail_tmo            5
    dev_loss_tmo                30
    max_fds                     4096
}
devices {
    device {
        all_devs                yes
        no_path_retry           fail
    }
}
8.service multipathd reload
9.cat /sys/class/iscsi_session/session1/recovery_tmo 
120


Verified on RHEL-7.2; device-mapper-multipath-0.4.9-82.el7
steps:
1.#yum -y install device-mapper device-mapper-multipath
2.#rpm -qa | grep multipath
device-mapper-multipath-libs-0.4.9-82.el7.x86_64
device-mapper-multipath-0.4.9-82.el7.x86_64
3.#mpathconf --enable
4.#service multipathd restart
5.#configure multipath 
6.# mpathconf 
  multipath is enabled
  find_multipaths is disabled
  user_friendly_names is disabled
  dm_multipath module is loaded
  multipathd is running
7.# cat /etc/multipath.conf 
defaults {
    polling_interval            5
    no_path_retry               fail
    user_friendly_names         no
    flush_on_last_del           yes
    fast_io_fail_tmo            5
    dev_loss_tmo                30
    max_fds                     4096
}
devices {
    device {
        all_devs                yes
        no_path_retry           fail
    }
}
8.service multipathd reload
9.cat /sys/class/iscsi_session/session2/recovery_tmo 
   5
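
To avoid hard-coding the expected value when repeating the verification above, it can be read from the configuration. A rough sketch, assuming a simple file like the one in step 7: the function name is hypothetical, and the parser ignores comments, quoted values, and non-numeric settings.

```python
import re


def configured_fast_io_fail_tmo(conf_text):
    """Return the integer fast_io_fail_tmo from the defaults section, or None."""
    # Grab the body of the first "defaults { ... }" block (no nested braces expected).
    section = re.search(r"defaults\s*\{(.*?)\}", conf_text, re.DOTALL)
    if section is None:
        return None
    value = re.search(
        r"^\s*fast_io_fail_tmo\s+(\d+)\s*$", section.group(1), re.MULTILINE
    )
    return int(value.group(1)) if value else None
```

The returned value is what recovery_tmo in step 9 should show on a fixed build.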

Comment 13 Lin Li 2015-09-16 10:08:39 UTC
verified on hp-dl385g7-05.rhts.eng.nay.redhat.com.
change to verified.

Comment 16 errata-xmlrpc 2015-11-19 12:57:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2132.html