Bug 1809748

Summary: Blacklist exception entry property "(SCSI_IDENT_|ID_WWN)" causes path states to be incorrectly displayed as active faulty
Product: Red Hat Enterprise Linux 8
Reporter: Amaresh <amaresh.thamizhakaran>
Component: device-mapper-multipath
Assignee: Ben Marzinski <bmarzins>
Status: CLOSED DUPLICATE
QA Contact: Lin Li <lilin>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 8.1
CC: agk, amaresh.thamizhakaran, bmarzins, heinzm, jkachuck, lilin, mknutson, msnitzer, phinchman, prajnoha, renikuttan.babu, zkabelac
Target Milestone: rc
Target Release: 8.0
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-06-03 14:24:59 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
sos report from the host (flags: none)

Description Amaresh 2020-03-03 19:37:42 UTC
Description of problem:
On RHEL 8.x (8.0 and 8.1) systems with LUNs mapped from Nimble Storage controllers, the blacklist_exceptions entry property "(SCSI_IDENT_|ID_WWN)" in multipath.conf causes the paths to be displayed as faulty in the multipath -ll output.

See the output below:
[root@rtp-hpe-ops07 ~]# multipath -ll
mpatha (204e7317153404e246c9ce900d54f5505) dm-0 ##,##
size=250G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=0 status=active
| |- #:#:#:# sdb 8:16 active faulty running
| |- #:#:#:# sdc 8:32 active faulty running
| |- #:#:#:# sdg 8:96 active faulty running
| `- #:#:#:# sdh 8:112 active faulty running
`-+- policy='service-time 0' prio=0 status=enabled
  |- #:#:#:# sdd 8:48 active faulty running
  |- #:#:#:# sde 8:64 active faulty running
  |- #:#:#:# sdf 8:80 active faulty running
  `- #:#:#:# sdi 8:128 active faulty running

The multipath.conf file is as follows:
defaults {
    user_friendly_names yes
}
blacklist {
}
blacklist_exceptions {
        property "(SCSI_IDENT_|ID_WWN)"
}
devices {
    device {
        product              "Server"
        failback             immediate
        path_grouping_policy group_by_prio
        no_path_retry        30
        dev_loss_tmo         infinity
        hardware_handler     "1 alua"
        fast_io_fail_tmo     5
        rr_min_io_rq         1
        vendor               "Nimble"
        rr_weight            uniform
        path_checker         tur
        prio                 "alua"
        path_selector        "service-time 0"
    }
}
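
For context on the blacklist_exceptions entry above: multipath matches the property regular expression against the names of a device's udev properties, and a device with no matching property is treated as blacklisted. The following is a minimal Python sketch of that check, not multipath's own code. The function name has_whitelisted_property and both sample property sets are hypothetical and were not taken from the affected host, and Python's re module stands in for the POSIX regex matching that multipath uses.

```python
import re

# Regex copied from the blacklist_exceptions section of multipath.conf above.
PROPERTY_REGEX = re.compile(r"(SCSI_IDENT_|ID_WWN)")


def has_whitelisted_property(udev_properties):
    """Return True if any udev property name matches the exception regex."""
    return any(PROPERTY_REGEX.search(name) for name in udev_properties)


# Hypothetical property sets, for illustration only.
with_ident = {"DEVNAME": "/dev/sdb", "SCSI_IDENT_LUN_NAA_REG": "example"}
without_ident = {"DEVNAME": "/dev/sdb", "ID_BUS": "scsi"}

print(has_whitelisted_property(with_ident))
print(has_whitelisted_property(without_ident))
```

On a live host, the property names for a path can be listed with udevadm info --query=property --name=/dev/sdb and compared against the regex in the same way.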

Once the blacklist exception entry is commented out, the multipath -ll output reports the path states correctly:

[root@rtp-hpe-ops07 ~]# multipath -ll
mpatha (204e7317153404e246c9ce900d54f5505) dm-0 Nimble,Server
size=250G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 3:0:0:0 sdb 8:16  active ready running
| |- 3:0:1:0 sdc 8:32  active ready running
| |- 5:0:1:0 sdg 8:96  active ready running
| `- 5:0:2:0 sdh 8:112 active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  |- 3:0:2:0 sdd 8:48  active ghost running
  |- 3:0:3:0 sde 8:64  active ghost running
  |- 5:0:0:0 sdf 8:80  active ghost running
  `- 5:0:3:0 sdi 8:128 active ghost running
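
To compare listings like the ones in this report without reading them by eye, the per-path status column can be tallied. The sketch below is a rough Python illustration that assumes the path-line layout shown in this report (H:C:T:L, device node, major:minor, then three status words). The name count_path_states and the abbreviated sample are hypothetical; the sample lines are copied from the listings above.

```python
import re
from collections import Counter

# Matches a path line such as "| |- 3:0:0:0 sdb 8:16  active ready running"
# and the degraded form "| |- #:#:#:# sdb 8:16 active faulty running".
# Path-group lines ("|-+- policy=...") do not match.
PATH_LINE = re.compile(
    r"[|`]-\s+(?P<hctl>\S+)\s+(?P<dev>\S+)\s+(?P<majmin>\d+:\d+)\s+"
    r"(?P<dm_status>\S+)\s+(?P<path_status>\S+)\s+(?P<online_status>\S+)\s*$"
)


def count_path_states(multipath_ll_output):
    """Count paths per path status (ready, ghost, faulty, ...)."""
    states = Counter()
    for line in multipath_ll_output.splitlines():
        match = PATH_LINE.search(line)
        if match:
            states[match.group("path_status")] += 1
    return states


# Abbreviated sample built from lines in this report.
sample = """\
|-+- policy='service-time 0' prio=50 status=active
| |- 3:0:0:0 sdb 8:16  active ready running
| `- 5:0:2:0 sdh 8:112 active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  |- 3:0:2:0 sdd 8:48  active ghost running
  `- 5:0:3:0 sdi 8:128 active ghost running
"""

print(dict(count_path_states(sample)))
```

A non-zero "faulty" count in the result would flag the symptom described in this bug.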


Version-Release number of selected component (if applicable):
device-mapper-event-1.02.163-5.el8.x86_64
device-mapper-libs-1.02.163-5.el8.x86_64
device-mapper-persistent-data-0.8.5-2.el8.x86_64
device-mapper-event-libs-1.02.163-5.el8.x86_64
device-mapper-multipath-libs-0.8.0-5.el8.x86_64
device-mapper-1.02.163-5.el8.x86_64
device-mapper-multipath-0.8.0-5.el8.x86_64


How reproducible:


Steps to Reproduce:
1. Install RHEL 8.x and enable multipath.
2. Create a few LUNs on a Nimble Storage controller and map them to the server.
3. Add the following entry to the multipath.conf file:
blacklist_exceptions {
        property "(SCSI_IDENT_|ID_WWN)"
}
4. Reload multipath.
5. Run multipath -ll.

Actual results:
All paths are listed as faulty, and the vendor/product and H:C:T:L fields are shown as "#" placeholders:

[root@rtp-hpe-ops07 ~]# multipath -ll
mpatha (204e7317153404e246c9ce900d54f5505) dm-0 ##,##
size=250G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=0 status=active
| |- #:#:#:# sdb 8:16 active faulty running
| |- #:#:#:# sdc 8:32 active faulty running
| |- #:#:#:# sdg 8:96 active faulty running
| `- #:#:#:# sdh 8:112 active faulty running
`-+- policy='service-time 0' prio=0 status=enabled
  |- #:#:#:# sdd 8:48 active faulty running
  |- #:#:#:# sde 8:64 active faulty running
  |- #:#:#:# sdf 8:80 active faulty running
  `- #:#:#:# sdi 8:128 active faulty running

Expected results:
Paths should be displayed as follows:
[root@rtp-hpe-ops07 ~]# multipath -ll
mpatha (204e7317153404e246c9ce900d54f5505) dm-0 Nimble,Server
size=250G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 3:0:0:0 sdb 8:16  active ready running
| |- 3:0:1:0 sdc 8:32  active ready running
| |- 5:0:1:0 sdg 8:96  active ready running
| `- 5:0:2:0 sdh 8:112 active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  |- 3:0:2:0 sdd 8:48  active ghost running
  |- 3:0:3:0 sde 8:64  active ghost running
  |- 5:0:0:0 sdf 8:80  active ghost running
  `- 5:0:3:0 sdi 8:128 active ghost running


Additional info:

Comment 1 Amaresh 2020-03-03 19:46:45 UTC
Created attachment 1667291
sos report from the host

Comment 2 Ben Marzinski 2020-06-02 22:18:16 UTC
Do you know if this continues to happen in rhel-8.2? The fixes for the sg3_utils bzs #1746414 and #1785062 added udev rules that should set SCSI_IDENT_*, so this will hopefully work correctly now.

Comment 3 Amaresh 2020-06-03 01:29:58 UTC
I don't see this on RHEL-8.2.

[root@rtp-hpe-ops07 ~]# cat /etc/redhat-release
Red Hat Enterprise Linux release 8.2 (Ootpa)
[root@rtp-hpe-ops07 ~]# cat /etc/multipath.conf
defaults {
    user_friendly_names yes
}
blacklist {
}
blacklist_exceptions {
        property "(SCSI_IDENT_|ID_WWN)"
}
devices {
    device {
        product              "Server"
        failback             immediate
        path_grouping_policy group_by_prio
        no_path_retry        30
        dev_loss_tmo         infinity
        hardware_handler     "1 alua"
        fast_io_fail_tmo     5
        rr_min_io_rq         1
        vendor               "Nimble"
        rr_weight            uniform
        path_checker         tur
        prio                 "alua"
        path_selector        "service-time 0"
    }
}

[root@rtp-hpe-ops07 ~]# multipath -ll
mpathe (23841f8cda17bd2d36c9ce9003c1acdc6) dm-8 Nimble,Server
size=650G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
  |- 6:0:0:11 sdaa 65:160 active ready running
  |- 6:0:1:11 sdac 65:192 active ready running
  |- 6:0:2:11 sdae 65:224 active ready running
  |- 6:0:3:11 sdag 66:0   active ready running
  |- 7:0:0:11 sdai 66:32  active ready running
  |- 7:0:1:11 sdak 66:64  active ready running
  |- 7:0:2:11 sdam 66:96  active ready running
  `- 7:0:3:11 sdao 66:128 active ready running


Thanks,
Amaresh

Comment 4 Ben Marzinski 2020-06-03 14:24:59 UTC
Great. I'm closing this as a duplicate of 1785062, which adds /usr/lib/udev/rules.d/61-scsi-sg3_id.rules, and fixes this problem.

*** This bug has been marked as a duplicate of bug 1785062 ***
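
Since the resolution depends on the udev rules file named in comment 4, a host can be checked for it before relying on the property exception. This is a small Python sketch under that assumption; the function name sg3_id_rule_installed is hypothetical, and the presence of the file alone does not prove that the SCSI_IDENT_* properties are set on every path.

```python
from pathlib import Path

# File name and default directory as given in comment 4.
RULE_NAME = "61-scsi-sg3_id.rules"


def sg3_id_rule_installed(rules_dir="/usr/lib/udev/rules.d"):
    """Return True if the sg3_utils ID rules file exists in rules_dir."""
    return (Path(rules_dir) / RULE_NAME).is_file()


print(sg3_id_rule_installed())
```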