Bug 1669235 - Unable to access drive using multipath with the latest update of Fedora 29 [NEEDINFO]
Status: NEW
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 29
Hardware: x86_64
OS: Linux
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Duplicates: 1668751
Blocks: PPCTracker
Reported: 2019-01-24 17:07 UTC by Miguel Nunes
Modified: 2019-04-09 20:46 UTC (History)
nsoffer: needinfo? (extras-qa)
labbott: needinfo? (miguel.nunes)


Attachments:
journalctl --no-hostname -b | grep -v sshd > dmesg.txt (127.58 KB, application/octet-stream)
2019-02-24 17:00 UTC, IBM Bug Proxy
qemu log from FAH 20190205 (12.68 KB, application/octet-stream)
2019-02-24 17:00 UTC, IBM Bug Proxy

Description Miguel Nunes 2019-01-24 17:07:53 UTC
Description of problem:
The multipath device is not recognized at boot after the latest Fedora 29 update.


Version-Release number of selected component (if applicable):
device-mapper-multipath-0.7.7-6.gitef6d98b.fc29.x86_64


How reproducible:


Steps to Reproduce:
1. Start the computer
2. Wait for boot to finish
3. The device is not recognized and is unavailable

Actual results:
Not able to access the drive using multipath

Expected results:
Access the drive with multipath

Additional info:
dmesg gives this output:

[    7.250383] device-mapper: multipath service-time: version 0.3.0 loaded
[    7.250552] device-mapper: table: 253:0: multipath: error getting device
[    7.250553] device-mapper: ioctl: error adding target to table
[    7.252010] device-mapper: table: table load rejected: not all devices are blk-mq request-stackable
[    7.252010] device-mapper: table: unable to determine table type

Invoking the multipath command gives this error:

 miguel-nunes  …  kd-11  rpcs3  build  sudo multipath -v3
[sudo] password for miguel-nunes: 
Jan 24 15:06:41 | /etc/multipath.conf line 110, invalid keyword: all_devs
Jan 24 15:06:41 | device config in /etc/multipath.conf missing vendor or product parameter
Jan 24 15:06:41 | loading /lib64/multipath/libchecktur.so checker
Jan 24 15:06:41 | loading /lib64/multipath/libprioconst.so prioritizer
Jan 24 15:06:41 | foreign library "nvme" loaded successfully
Jan 24 15:06:41 | sda: mask = 0x1f
Jan 24 15:06:41 | sda: dev_t = 8:0
Jan 24 15:06:41 | sda: size = 2000409264
Jan 24 15:06:41 | sda: vendor = ATA
Jan 24 15:06:41 | sda: product = SanDisk SD8SN8U1
Jan 24 15:06:41 | sda: rev = 0000
Jan 24 15:06:41 | sda: h:b:t:l = 0:0:0:0
Jan 24 15:06:41 | sda: tgt_node_name = ata-1.00
Jan 24 15:06:41 | sda: path state = running
Jan 24 15:06:41 | sda: 58983 cyl, 255 heads, 63 sectors/track, start at 0
Jan 24 15:06:41 | sda: serial = 165040420446
Jan 24 15:06:41 | sda: get_state
Jan 24 15:06:41 | sda: detect_checker = yes (setting: multipath internal)
Jan 24 15:06:41 | failed to issue vpd inquiry for pgc9
Jan 24 15:06:41 | sda: path_checker = tur (setting: multipath internal)
Jan 24 15:06:41 | sda: checker timeout = 30 s (setting: kernel sysfs)
Jan 24 15:06:41 | sda: tur state = up
Jan 24 15:06:41 | sda: uid_attribute = ID_SERIAL (setting: multipath internal)
Jan 24 15:06:41 | sda: uid = SanDisk_SD8SN8U1T001122_165040420446 (udev)
Jan 24 15:06:41 | sda: detect_prio = yes (setting: multipath internal)
Jan 24 15:06:41 | sda: prio = const (setting: multipath internal)
Jan 24 15:06:41 | sda: prio args = "" (setting: multipath internal)
Jan 24 15:06:41 | sda: const prio = 1
Jan 24 15:06:41 | sdb: mask = 0x1f
Jan 24 15:06:41 | sdb: dev_t = 8:16
Jan 24 15:06:41 | sdb: size = 1953525168
Jan 24 15:06:41 | sdb: vendor = ATA
Jan 24 15:06:41 | sdb: product = HGST HTS721010A9
Jan 24 15:06:41 | sdb: rev = A3J0
Jan 24 15:06:41 | sdb: h:b:t:l = 2:0:0:0
Jan 24 15:06:41 | sdb: tgt_node_name = ata-3.00
Jan 24 15:06:41 | sdb: path state = running
Jan 24 15:06:41 | sdb: 56065 cyl, 255 heads, 63 sectors/track, start at 0
Jan 24 15:06:41 | sdb: serial =       JR10204M3V1MRE
Jan 24 15:06:41 | sdb: get_state
Jan 24 15:06:41 | sdb: detect_checker = yes (setting: multipath internal)
Jan 24 15:06:41 | failed to issue vpd inquiry for pgc9
Jan 24 15:06:41 | sdb: path_checker = tur (setting: multipath internal)
Jan 24 15:06:41 | sdb: checker timeout = 30 s (setting: kernel sysfs)
Jan 24 15:06:41 | sdb: tur state = up
Jan 24 15:06:41 | sdb: uid_attribute = ID_SERIAL (setting: multipath internal)
Jan 24 15:06:41 | sdb: uid = HGST_HTS721010A9E630_JR10204M3V1MRE (udev)
Jan 24 15:06:41 | sdb: detect_prio = yes (setting: multipath internal)
Jan 24 15:06:41 | sdb: prio = const (setting: multipath internal)
Jan 24 15:06:41 | sdb: prio args = "" (setting: multipath internal)
Jan 24 15:06:41 | sdb: const prio = 1
Jan 24 15:06:41 | dm-0: device node name blacklisted
===== paths list =====
uuid                                 hcil    dev dev_t pri dm_st chk_st vend/p
SanDisk_SD8SN8U1T001122_165040420446 0:0:0:0 sda 8:0   1   undef undef  ATA,Sa
HGST_HTS721010A9E630_JR10204M3V1MRE  2:0:0:0 sdb 8:16  1   undef undef  ATA,HG
Jan 24 15:06:41 | libdevmapper version 1.02.154 (2018-12-07)
Jan 24 15:06:41 | DM multipath kernel driver v1.13.0
Jan 24 15:06:41 | SanDisk_SD8SN8U1T001122_165040420446: user_friendly_names = no (setting: multipath.conf defaults/devices section)
Jan 24 15:06:41 | SanDisk_SD8SN8U1T001122_165040420446: alias = SanDisk_SD8SN8U1T001122_165040420446 (setting: default to WWID)
Jan 24 15:06:41 | sda: ownership set to SanDisk_SD8SN8U1T001122_165040420446
Jan 24 15:06:41 | sda: mask = 0xc
Jan 24 15:06:41 | sda: path state = running
Jan 24 15:06:41 | sda: get_state
Jan 24 15:06:41 | sda: tur state = up
Jan 24 15:06:41 | sda: const prio = 1
Jan 24 15:06:41 | SanDisk_SD8SN8U1T001122_165040420446: failback = "manual" (setting: multipath internal)
Jan 24 15:06:41 | SanDisk_SD8SN8U1T001122_165040420446: path_grouping_policy = failover (setting: multipath internal)
Jan 24 15:06:41 | SanDisk_SD8SN8U1T001122_165040420446: path_selector = "service-time 0" (setting: multipath internal)
Jan 24 15:06:41 | SanDisk_SD8SN8U1T001122_165040420446: no_path_retry = 4 (setting: multipath.conf defaults/devices section)
Jan 24 15:06:41 | SanDisk_SD8SN8U1T001122_165040420446: retain_attached_hw_handler = yes (setting: implied in kernel >= 4.3.0)
Jan 24 15:06:41 | SanDisk_SD8SN8U1T001122_165040420446: features = "0" (setting: multipath internal)
Jan 24 15:06:41 | SanDisk_SD8SN8U1T001122_165040420446: hardware_handler = "0" (setting: multipath internal)
Jan 24 15:06:41 | SanDisk_SD8SN8U1T001122_165040420446: rr_weight = "uniform" (setting: multipath internal)
Jan 24 15:06:41 | SanDisk_SD8SN8U1T001122_165040420446: minio = 1 (setting: multipath internal)
Jan 24 15:06:41 | SanDisk_SD8SN8U1T001122_165040420446: fast_io_fail_tmo = 5 (setting: multipath.conf defaults/devices section)
Jan 24 15:06:41 | SanDisk_SD8SN8U1T001122_165040420446: dev_loss_tmo = 30 (setting: multipath.conf defaults/devices section)
Jan 24 15:06:41 | SanDisk_SD8SN8U1T001122_165040420446: deferred_remove = no (setting: multipath internal)
Jan 24 15:06:41 | SanDisk_SD8SN8U1T001122_165040420446: delay_watch_checks = "no" (setting: multipath internal)
Jan 24 15:06:41 | SanDisk_SD8SN8U1T001122_165040420446: delay_wait_checks = "no" (setting: multipath internal)
Jan 24 15:06:41 | SanDisk_SD8SN8U1T001122_165040420446: marginal_path_err_sample_time = "no" (setting: multipath internal)
Jan 24 15:06:41 | SanDisk_SD8SN8U1T001122_165040420446: marginal_path_err_rate_threshold = "no" (setting: multipath internal)
Jan 24 15:06:41 | SanDisk_SD8SN8U1T001122_165040420446: marginal_path_err_recheck_gap_time = "no" (setting: multipath internal)
Jan 24 15:06:41 | SanDisk_SD8SN8U1T001122_165040420446: marginal_path_double_failed_time = "no" (setting: multipath internal)
Jan 24 15:06:41 | SanDisk_SD8SN8U1T001122_165040420446: skip_kpartx = no (setting: multipath internal)
Jan 24 15:06:41 | SanDisk_SD8SN8U1T001122_165040420446: ghost_delay = "no" (setting: multipath.conf defaults/devices section)
Jan 24 15:06:41 | SanDisk_SD8SN8U1T001122_165040420446: flush_on_last_del = yes (setting: multipath.conf defaults/devices section)
Jan 24 15:06:41 | SanDisk_SD8SN8U1T001122_165040420446: update dev_loss_tmo to 30
Jan 24 15:06:41 | SanDisk_SD8SN8U1T001122_165040420446: assembled map [1 queue_if_no_path 0 1 1 service-time 0 1 1 8:0 1]
Jan 24 15:06:41 | SanDisk_SD8SN8U1T001122_165040420446: set ACT_CREATE (map does not exist)
Jan 24 15:06:41 | SanDisk_SD8SN8U1T001122_165040420446: failed to load map, error 16
Jan 24 15:06:41 | Initialized new file [/dev/shm/multipath/failed_wwids/.lock]
Jan 24 15:06:41 | SanDisk_SD8SN8U1T001122_165040420446: domap (0) failure for create/reload map
Jan 24 15:06:41 | SanDisk_SD8SN8U1T001122_165040420446: ignoring map
Jan 24 15:06:41 | sda: orphan path, map flushed
Jan 24 15:06:41 | const prioritizer refcount 2
Jan 24 15:06:41 | tur checker refcount 2
Jan 24 15:06:41 | HGST_HTS721010A9E630_JR10204M3V1MRE: user_friendly_names = no (setting: multipath.conf defaults/devices section)
Jan 24 15:06:41 | HGST_HTS721010A9E630_JR10204M3V1MRE: alias = HGST_HTS721010A9E630_JR10204M3V1MRE (setting: default to WWID)
Jan 24 15:06:41 | sdb: ownership set to HGST_HTS721010A9E630_JR10204M3V1MRE
Jan 24 15:06:41 | sdb: mask = 0xc
Jan 24 15:06:41 | sdb: path state = running
Jan 24 15:06:41 | sdb: get_state
Jan 24 15:06:41 | sdb: tur state = up
Jan 24 15:06:41 | sdb: const prio = 1
Jan 24 15:06:41 | HGST_HTS721010A9E630_JR10204M3V1MRE: failback = "manual" (setting: multipath internal)
Jan 24 15:06:41 | HGST_HTS721010A9E630_JR10204M3V1MRE: path_grouping_policy = failover (setting: multipath internal)
Jan 24 15:06:41 | HGST_HTS721010A9E630_JR10204M3V1MRE: path_selector = "service-time 0" (setting: multipath internal)
Jan 24 15:06:41 | HGST_HTS721010A9E630_JR10204M3V1MRE: no_path_retry = 4 (setting: multipath.conf defaults/devices section)
Jan 24 15:06:41 | HGST_HTS721010A9E630_JR10204M3V1MRE: retain_attached_hw_handler = yes (setting: implied in kernel >= 4.3.0)
Jan 24 15:06:41 | HGST_HTS721010A9E630_JR10204M3V1MRE: features = "0" (setting: multipath internal)
Jan 24 15:06:41 | HGST_HTS721010A9E630_JR10204M3V1MRE: hardware_handler = "0" (setting: multipath internal)
Jan 24 15:06:41 | HGST_HTS721010A9E630_JR10204M3V1MRE: rr_weight = "uniform" (setting: multipath internal)
Jan 24 15:06:41 | HGST_HTS721010A9E630_JR10204M3V1MRE: minio = 1 (setting: multipath internal)
Jan 24 15:06:41 | HGST_HTS721010A9E630_JR10204M3V1MRE: fast_io_fail_tmo = 5 (setting: multipath.conf defaults/devices section)
Jan 24 15:06:41 | HGST_HTS721010A9E630_JR10204M3V1MRE: dev_loss_tmo = 30 (setting: multipath.conf defaults/devices section)
Jan 24 15:06:41 | HGST_HTS721010A9E630_JR10204M3V1MRE: deferred_remove = no (setting: multipath internal)
Jan 24 15:06:41 | HGST_HTS721010A9E630_JR10204M3V1MRE: delay_watch_checks = "no" (setting: multipath internal)
Jan 24 15:06:41 | HGST_HTS721010A9E630_JR10204M3V1MRE: delay_wait_checks = "no" (setting: multipath internal)
Jan 24 15:06:41 | HGST_HTS721010A9E630_JR10204M3V1MRE: marginal_path_err_sample_time = "no" (setting: multipath internal)
Jan 24 15:06:41 | HGST_HTS721010A9E630_JR10204M3V1MRE: marginal_path_err_rate_threshold = "no" (setting: multipath internal)
Jan 24 15:06:41 | HGST_HTS721010A9E630_JR10204M3V1MRE: marginal_path_err_recheck_gap_time = "no" (setting: multipath internal)
Jan 24 15:06:41 | HGST_HTS721010A9E630_JR10204M3V1MRE: marginal_path_double_failed_time = "no" (setting: multipath internal)
Jan 24 15:06:41 | HGST_HTS721010A9E630_JR10204M3V1MRE: skip_kpartx = no (setting: multipath internal)
Jan 24 15:06:41 | HGST_HTS721010A9E630_JR10204M3V1MRE: ghost_delay = "no" (setting: multipath.conf defaults/devices section)
Jan 24 15:06:41 | HGST_HTS721010A9E630_JR10204M3V1MRE: flush_on_last_del = yes (setting: multipath.conf defaults/devices section)
Jan 24 15:06:41 | HGST_HTS721010A9E630_JR10204M3V1MRE: update dev_loss_tmo to 30
Jan 24 15:06:41 | HGST_HTS721010A9E630_JR10204M3V1MRE: assembled map [1 queue_if_no_path 0 1 1 service-time 0 1 1 8:16 1]
Jan 24 15:06:41 | HGST_HTS721010A9E630_JR10204M3V1MRE: set ACT_CREATE (map does not exist)
Jan 24 15:06:41 | HGST_HTS721010A9E630_JR10204M3V1MRE: failed to load map, error 22
Jan 24 15:06:41 | Initialized new file [/dev/shm/multipath/failed_wwids/.lock]
Jan 24 15:06:41 | HGST_HTS721010A9E630_JR10204M3V1MRE: domap (0) failure for create/reload map
Jan 24 15:06:41 | HGST_HTS721010A9E630_JR10204M3V1MRE: ignoring map
Jan 24 15:06:41 | sdb: orphan path, map flushed
Jan 24 15:06:41 | const prioritizer refcount 1
Jan 24 15:06:41 | tur checker refcount 1
Jan 24 15:06:41 | unloading const prioritizer
Jan 24 15:06:41 | unloading tur checker
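
The two "failed to load map" lines above report different numeric errors (16 for the SanDisk map, 22 for the HGST map). Assuming these are ordinary errno values, which the log itself does not state, 16 is EBUSY and 22 is EINVAL on Linux. A minimal Python sketch for pulling them out of a saved `multipath -v3` log follows; the function name and regular expression are illustrative and not part of multipath-tools.

```python
import errno
import re

# Matches lines such as:
#   "Jan 24 15:06:41 | <map name>: failed to load map, error 16"
LOAD_ERROR = re.compile(r"\| (?P<map>\S+): failed to load map, error (?P<code>\d+)")


def decode_load_errors(log_text):
    """Map each failing multipath map name to the symbolic name of its error number."""
    results = {}
    for match in LOAD_ERROR.finditer(log_text):
        code = int(match.group("code"))
        # Assumption: the number printed by multipath is a plain errno value.
        results[match.group("map")] = errno.errorcode.get(code, "errno %d" % code)
    return results
```

Feeding the log above through `decode_load_errors` should pair the SanDisk WWID with `EBUSY` and the HGST WWID with `EINVAL`, which may help when comparing reports from different machines.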

Comment 1 Ben Marzinski 2019-01-30 05:49:46 UTC
This issue is due to a kernel change.  Kernel commit cef6f55a9fb4f6d6f9df0f772aa64cf159997466 is responsible for this error message:

device-mapper: table: table load rejected: not all devices are blk-mq request-stackable

but it is just the last of a number of commits that remove the old dm request-based code. Request-based dm devices (which is what multipath is) must now be stacked on top of block multiqueue (blk-mq) devices.  However, for some reason, the Fedora kernel isn't defaulting the SCSI devices to use the blk-mq drivers.

The easiest way to fix this is to add

scsi_mod.use_blk_mq=y

to the kernel command line.  But I don't understand why the Fedora kernel wasn't simply compiled with CONFIG_SCSI_MQ_DEFAULT, since it already includes patches that strip out support for the old request-queue devices from device-mapper.
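
To check whether the scsi_mod.use_blk_mq workaround is actually in effect, one option is to inspect the boot command line and the module parameter in sysfs. The sketch below is a hedged illustration, not something from the original comment: the helper names are hypothetical, the boolean parsing is a simplified version of what the kernel accepts, and it assumes a kernel that still exposes `/sys/module/scsi_mod/parameters/use_blk_mq`.

```python
from pathlib import Path
from typing import Optional


def cmdline_enables_blk_mq(cmdline: str) -> bool:
    """Return True if a kernel command line turns scsi_mod.use_blk_mq on."""
    enabled = False
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key != "scsi_mod.use_blk_mq":
            continue
        # Simplified bool parsing: a bare name, or a value starting with
        # y/Y/1, or "on", counts as true. The last occurrence wins.
        enabled = (not sep) or value[:1] in ("y", "Y", "1") or value.lower() == "on"
    return enabled


def sysfs_reports_blk_mq(sysfs_root: str = "/sys") -> Optional[bool]:
    """Read the live parameter value; None means the parameter file is absent."""
    param = Path(sysfs_root) / "module/scsi_mod/parameters/use_blk_mq"
    try:
        return param.read_text().strip() in ("Y", "1")
    except OSError:
        return None
```

On an affected machine, `cmdline_enables_blk_mq(Path("/proc/cmdline").read_text())` shows whether the option was passed at boot, and `sysfs_reports_blk_mq()` shows what the running kernel reports.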

Comment 2 Nir Soffer 2019-01-30 20:42:19 UTC
The same issue exists in Fedora 28 with kernel 4.20.4-100.fc28.x86_64.

Should we clone this bug to Fedora 28?

Comment 3 Ben Marzinski 2019-01-30 22:40:24 UTC
Bug #1670966 is a Fedora 28 version of this.

Comment 4 Dan Horák 2019-02-22 09:55:20 UTC
*** Bug 1668751 has been marked as a duplicate of this bug. ***

Comment 5 Dan Horák 2019-02-22 10:21:53 UTC
If I read this right, CONFIG_SCSI_MQ_DEFAULT and scsi_mod.use_blk_mq are going away completely with 5.0+ kernels. So we need something for the lifetime of 4.20 in the Fedora stable branches.
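
Since the workaround only makes sense on kernels that still have the parameter, a script applying it could gate on the kernel release string. The following Python sketch encodes the reading in this comment (parameter present before 5.0, gone from 5.0 onward); the function name is hypothetical and the cutoff is taken from the comment rather than checked against the kernel source.

```python
import re


def use_blk_mq_param_expected(release: str) -> bool:
    """Return True for kernel releases older than 5.0, e.g. '4.20.4-100.fc28.x86_64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    major, minor = int(match.group(1)), int(match.group(2))
    # Assumption from this thread: scsi_mod.use_blk_mq is removed in 5.0+.
    return (major, minor) < (5, 0)
```

The release string can be taken from `platform.release()` or `uname -r` on the machine being checked.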

Comment 6 Dan Horák 2019-02-22 10:55:39 UTC
Opened https://src.fedoraproject.org/rpms/kernel/pull-request/27 to update the stable branches.

Comment 7 IBM Bug Proxy 2019-02-24 17:00:26 UTC
Created attachment 1538180 [details]
journalctl --no-hostname -b | grep -v sshd > dmesg.txt

Comment 8 IBM Bug Proxy 2019-02-24 17:00:28 UTC
Created attachment 1538181 [details]
qemu log from FAH 20190205

Comment 9 Laura Abbott 2019-04-09 20:46:12 UTC
We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 29 kernel bugs.
 
Fedora 29 has now been rebased to 5.0.6.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.
 
If you have moved on to Fedora 30, and are still experiencing this issue, please change the version to Fedora 30.
 
If you experience different issues, please open a new bug report for those.

