Bug 1469944

Summary: [RHEL7.4][DM-Multipath] 'user_friendly_names' does not work
Product: Red Hat Enterprise Linux 7
Reporter: xhe <xhe>
Component: device-mapper-multipath
Assignee: LVM and device-mapper development team <lvm-team>
Status: CLOSED NOTABUG
QA Contact: Storage QE <storage-qe>
Severity: medium
Docs Contact:
Priority: medium
Version: 7.5
CC: agk, bmarzins, heinzm, lilin, msnitzer, prajnoha, yizhan
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Last Closed: 2017-07-13 01:10:34 UTC
Type: Bug

Description xhe@redhat.com 2017-07-12 06:34:37 UTC
Description of problem:
No mpath alias appears when user_friendly_names is set to yes.

Version-Release number of selected component (if applicable):
device-mapper-multipath-0.4.9-111.el7.x86_64
kernel-3.10.0-693.el7.x86_64

How reproducible:
often

Steps to Reproduce:
1. Start the multipath service.
2. Set user_friendly_names to yes and check the running configuration:
# multipathd show config
defaults {
        verbosity 2
        polling_interval 5
        max_polling_interval 20
        reassign_maps "yes"
        multipath_dir "/lib64/multipath"
        path_selector "service-time 0"
        path_grouping_policy "failover"
        uid_attribute "ID_SERIAL"
        prio "const"
        prio_args ""
        features "0"
        path_checker "directio"
        alias_prefix "mpath" ---> set alias
        user_friendly_names "yes" ---> set 'yes'
        failback "manual"
        rr_min_io 1000
        rr_min_io_rq 1
        max_fds 1048576
        rr_weight "uniform"
        queue_without_daemon "no"
        flush_on_last_del "no"
...
3. Reload the multipath configuration:
# multipath -r

Actual results:
# multipath -r
Jul 12 02:17:24 | mpatha: ignoring map
reload: 360a9800042566643352b476d67496d30 undef NETAPP  ,LUN             
size=30G features='3 queue_if_no_path pg_init_retries 50' hwhandler='0' wp=undef
|-+- policy='service-time 0' prio=50 status=undef
| |- 3:0:0:0 sdb 8:16  active ready running
| `- 4:0:1:0 sdn 8:208 active ready running
`-+- policy='service-time 0' prio=10 status=undef
  |- 3:0:1:0 sdf 8:80  active ready running
  `- 4:0:0:0 sdj 8:144 active ready running
reload: 360a9800042566643352b476d67496e52 undef NETAPP  ,LUN             
size=2.0G features='3 queue_if_no_path pg_init_retries 50' hwhandler='0' wp=undef
|-+- policy='service-time 0' prio=50 status=undef
| |- 3:0:0:1 sdc 8:32  active ready running
| `- 4:0:1:1 sdo 8:224 active ready running
`-+- policy='service-time 0' prio=10 status=undef
  |- 3:0:1:1 sdg 8:96  active ready running
  `- 4:0:0:1 sdk 8:160 active ready running
reload: 360a9800042566643352b476d67496e54 undef NETAPP  ,LUN             
size=2.0G features='3 queue_if_no_path pg_init_retries 50' hwhandler='0' wp=undef
|-+- policy='service-time 0' prio=50 status=undef
| |- 3:0:0:2 sdd 8:48  active ready running
| `- 4:0:1:2 sdp 8:240 active ready running
`-+- policy='service-time 0' prio=10 status=undef
  |- 3:0:1:2 sdh 8:112 active ready running
  `- 4:0:0:2 sdl 8:176 active ready running
reload: 360a9800042566643352b476d67496e56 undef NETAPP  ,LUN             
size=2.0G features='3 queue_if_no_path pg_init_retries 50' hwhandler='0' wp=undef
|-+- policy='service-time 0' prio=50 status=undef
| |- 3:0:0:3 sde 8:64  active ready running
| `- 4:0:1:3 sdq 65:0  active ready running
`-+- policy='service-time 0' prio=10 status=undef
  |- 3:0:1:3 sdi 8:128 active ready running
  `- 4:0:0:3 sdm 8:192 active ready running
[root@storageqe-53 hba]# multipath -l
360a9800042566643352b476d67496e56 dm-6 NETAPP  ,LUN             
size=2.0G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=0 status=enabled
| |- 3:0:0:3 sde 8:64  active undef unknown
| `- 4:0:1:3 sdq 65:0  active undef unknown
`-+- policy='service-time 0' prio=0 status=enabled
  |- 3:0:1:3 sdi 8:128 active undef unknown
  `- 4:0:0:3 sdm 8:192 active undef unknown
360a9800042566643352b476d67496e54 dm-5 NETAPP  ,LUN             
size=2.0G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=0 status=enabled
| |- 3:0:0:2 sdd 8:48  active undef unknown
| `- 4:0:1:2 sdp 8:240 active undef unknown
`-+- policy='service-time 0' prio=0 status=enabled
  |- 3:0:1:2 sdh 8:112 active undef unknown
  `- 4:0:0:2 sdl 8:176 active undef unknown
360a9800042566643352b476d67496e52 dm-4 NETAPP  ,LUN             
size=2.0G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=0 status=enabled
| |- 3:0:0:1 sdc 8:32  active undef unknown
| `- 4:0:1:1 sdo 8:224 active undef unknown
`-+- policy='service-time 0' prio=0 status=enabled
  |- 3:0:1:1 sdg 8:96  active undef unknown
  `- 4:0:0:1 sdk 8:160 active undef unknown
360a9800042566643352b476d67496d30 dm-3 NETAPP  ,LUN             
size=30G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=0 status=enabled
| |- 3:0:0:0 sdb 8:16  active undef unknown
| `- 4:0:1:0 sdn 8:208 active undef unknown
`-+- policy='service-time 0' prio=0 status=enabled
  |- 3:0:1:0 sdf 8:80  active undef unknown
  `- 4:0:0:0 sdj 8:144 active undef unknown

************ snip - dmesg *******************
[81887.899554] device-mapper: table: 253:7: multipath: error getting device
[81887.907040] device-mapper: ioctl: error adding target to table
[81887.936143] sd 3:0:0:0: alua: port group 00 state A non-preferred supports TolUsNA
[81887.936475] sd 4:0:1:0: alua: port group 00 state A non-preferred supports TolUsNA
[81887.966108] sd 3:0:0:1: alua: port group 00 state A non-preferred supports TolUsNA
[81887.966459] sd 4:0:1:1: alua: port group 00 state A non-preferred supports TolUsNA
[81887.995112] sd 3:0:0:2: alua: port group 00 state A non-preferred supports TolUsNA
[81887.995374] sd 4:0:1:2: alua: port group 00 state A non-preferred supports TolUsNA
[81888.018896] sd 3:0:0:3: alua: port group 00 state A non-preferred supports TolUsNA
[81888.019196] sd 4:0:1:3: alua: port group 00 state A non-preferred supports TolUsNA
[81965.387196] sd 3:0:0:0: alua: port group 00 state A non-preferred supports TolUsNA
[81965.387523] sd 4:0:1:0: alua: port group 00 state A non-preferred supports TolUsNA
************ snip *******************

Expected results:

The aliases mpatha, mpathb, ..., mpathX should appear.
# multipath -l
mpatha 360a9800042566643352b476d67496e56 dm-6 NETAPP  ,LUN             
size=2.0G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=0 status=enabled
| |- 3:0:0:3 sde 8:64  active undef unknown
| `- 4:0:1:3 sdq 65:0  active undef unknown
`-+- policy='service-time 0' prio=0 status=enabled
  |- 3:0:1:3 sdi 8:128 active undef unknown
  `- 4:0:0:3 sdm 8:192 active undef unknown
mpathb 360a9800042566643352b476d67496e54 dm-5 NETAPP  ,LUN             
size=2.0G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=0 status=enabled
| |- 3:0:0:2 sdd 8:48  active undef unknown
| `- 4:0:1:2 sdp 8:240 active undef unknown
...


Additional info:

Comment 1 Zhang Yi 2017-07-12 07:39:59 UTC
That's because the built-in device configuration for NETAPP sets user_friendly_names to no [2]. The definition can be found in [1].
[1]
/root/rpmbuild/BUILD/multipath-tools-130222/libmultipath/hwtable.c
        /*
         * NETAPP controller family
         *
         * Maintainer : Dave Wysochanski
         * Mail : davidw
         */
        {
                .vendor        = "NETAPP",
                .product       = "LUN.*",
                .features      = "3 queue_if_no_path pg_init_retries 50",
                .hwhandler     = DEFAULT_HWHANDLER,
                .pgpolicy      = GROUP_BY_PRIO,
                .pgfailback    = -FAILBACK_IMMEDIATE,
                .flush_on_last_del = FLUSH_ENABLED,
                .rr_weight     = RR_WEIGHT_NONE,
                .no_path_retry = NO_PATH_RETRY_UNDEF,
                .minio         = 128,
                .dev_loss      = MAX_DEV_LOSS_TMO,
                .checker_name  = TUR,
                .prio_name     = PRIO_ONTAP,
                .prio_args     = NULL,
                .retain_hwhandler = RETAIN_HWHANDLER_ON,
                .user_friendly_names = USER_FRIENDLY_NAMES_OFF,
                .detect_prio   = DETECT_PRIO_ON,
        },

[2]
#multipathd show config | grep -15b \"NETAPP\"
13090-	device {
13100:		vendor "NETAPP"
13118-		product "LUN.*"
13136-		path_grouping_policy "group_by_prio"
13175-		path_checker "tur"
13196-		features "3 queue_if_no_path pg_init_retries 50"
13247-		hardware_handler "0"
13270-		prio "ontap"
13285-		failback immediate
13306-		rr_weight "uniform"
13328-		rr_min_io 128
13344-		flush_on_last_del "yes"
13370-		dev_loss_tmo "infinity"
13396-		user_friendly_names no
13421-		retain_attached_hw_handler yes
13454-		detect_prio yes
13472-	}

You can append the following to /etc/multipath.conf:
devices {
	device {
		vendor "NETAPP"
		product "LUN.*"
		path_grouping_policy "group_by_prio"
		path_checker "tur"
		features "3 queue_if_no_path pg_init_retries 50"
		hardware_handler "0"
		prio "ontap"
		failback immediate
		rr_weight "uniform"
		rr_min_io 128
		flush_on_last_del "yes"
		dev_loss_tmo "infinity"
		user_friendly_names yes 
		retain_attached_hw_handler yes
		detect_prio yes
	}
}

After making the change above and restarting the service, the value changes to yes.

#multipathd show config | grep -15b \"NETAPP\"
19343-	device {
19353:		vendor "NETAPP"
19371-		product "LUN.*"
19389-		path_grouping_policy "group_by_prio"
19428-		path_checker "tur"
19449-		features "3 queue_if_no_path pg_init_retries 50"
19500-		hardware_handler "0"
19523-		prio "ontap"
19538-		failback immediate
19559-		rr_weight "uniform"
19581-		rr_min_io 128
19597-		flush_on_last_del "yes"
19623-		dev_loss_tmo "infinity"
19649-		user_friendly_names yes        ----> changed to yes
19675-		retain_attached_hw_handler yes
19708-		detect_prio yes
19726-	}

With that change, "user_friendly_names yes" works as expected:
# multipath -ll
mpathe (360a9800042566643352b476d67496e56) dm-6 NETAPP  ,LUN             
size=2.0G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=enabled
| |- 3:0:0:3 sde 8:64  active ready running
| `- 4:0:1:3 sdq 65:0  active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 3:0:1:3 sdi 8:128 active ready running
  `- 4:0:0:3 sdm 8:192 active ready running
mpathd (360a9800042566643352b476d67496e54) dm-5 NETAPP  ,LUN             
size=2.0G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=enabled
| |- 3:0:0:2 sdd 8:48  active ready running
| `- 4:0:1:2 sdp 8:240 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 3:0:1:2 sdh 8:112 active ready running
  `- 4:0:0:2 sdl 8:176 active ready running
mpathc (360a9800042566643352b476d67496e52) dm-4 NETAPP  ,LUN             
size=2.0G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=enabled
| |- 3:0:0:1 sdc 8:32  active ready running
| `- 4:0:1:1 sdo 8:224 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 3:0:1:1 sdg 8:96  active ready running
  `- 4:0:0:1 sdk 8:160 active ready running
mpathb (360a9800042566643352b476d67496d30) dm-3 NETAPP  ,LUN             
size=30G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=enabled
| |- 3:0:0:0 sdb 8:16  active ready running
| `- 4:0:1:0 sdn 8:208 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 3:0:1:0 sdf 8:80  active ready running
  `- 4:0:0:0 sdj 8:144 active ready running

Comment 2 xhe@redhat.com 2017-07-12 09:12:16 UTC
That's true. This option is defined as OFF (.user_friendly_names = USER_FRIENDLY_NAMES_OFF) in the source of device-mapper-multipath-0.4.9-111, so users who need aliases have to set it to 'yes' for the NETAPP device in multipath.conf. Otherwise the setting in the defaults{} section of multipath.conf has no effect for these devices.

However, I am curious why this option is OFF for NETAPP in the source code. In my opinion, the general logic should be: first read the option (e.g. user_friendly_names) from the NETAPP device{} section; if it is not found there, read the defaults{} section; if it is still not found, treat the option as OFF. But the current process only reads it from the NETAPP device{} section and does not check any other section.

Please correct me if I made a mistake!

Thanks, Xiaonan

Comment 3 Ben Marzinski 2017-07-12 20:23:07 UTC
Multipath processes options in this order: it first looks at the multipaths section to see if the option is set; if not, it looks at the devices section; if not, it looks at the defaults section; and if the option is still not set, it uses the compiled-in default (if applicable).

What you are seeing is multipath working exactly as described. It found "user_friendly_names no" in the NetApp config in the devices section. That has a higher priority than the defaults section, so it stops there. The reason this is set is that we let the device vendor have the final say in how their device is configured (although we do offer advice). NetApp doesn't want its devices to use user-friendly names. They made this decision so long ago that I don't recall the reasoning anymore.
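
For illustration, the lookup order described in this comment can be sketched in a few lines of Python. This is only a sketch under stated assumptions, not multipath-tools code: the function name resolve_option, the plain dicts standing in for parsed sections, and the compiled-in value "no" are all hypothetical and chosen for the example.

```python
# Illustrative sketch only; this is not multipath-tools code.
# Plain dicts stand in for parsed multipath.conf sections (hypothetical model).

# Assumed compiled-in default, used only for this illustration.
COMPILED_IN = {"user_friendly_names": "no"}


def resolve_option(name, multipaths, devices, defaults, compiled_in=COMPILED_IN):
    """Return (value, source) from the first section that sets the option."""
    lookup_order = (
        ("multipaths", multipaths),
        ("devices", devices),
        ("defaults", defaults),
        ("compiled-in", compiled_in),
    )
    for source, section in lookup_order:
        if name in section:
            # The first match wins; lower-priority sections are never consulted.
            return section[name], source
    return None, None


# The situation in this bug: the built-in NETAPP device entry says "no",
# while the defaults section says "yes".
netapp_builtin = {"user_friendly_names": "no"}
user_defaults = {"user_friendly_names": "yes"}
print(resolve_option("user_friendly_names", {}, netapp_builtin, user_defaults))
# prints ('no', 'devices')

# After overriding the NETAPP device entry in /etc/multipath.conf:
netapp_override = {"user_friendly_names": "yes"}
print(resolve_option("user_friendly_names", {}, netapp_override, user_defaults))
# prints ('yes', 'devices')
```

In both calls the value comes from the devices level, which is why changing only the defaults section had no visible effect on the NETAPP LUNs.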

Unless you have another issue, this looks like it should be closed NOTABUG.