Bug 917538 - device mapper multipath fails to create 1024 mpaths on s390x
Summary: device mapper multipath fails to create 1024 mpaths on s390x
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: device-mapper-multipath
Version: 7.0
Hardware: s390x
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Peter Rajnoha
QA Contact: Bruno Goncalves
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2013-03-04 10:10 UTC by Bruno Goncalves
Modified: 2023-03-08 07:25 UTC (History)
8 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-06-13 13:21:48 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)

Description Bruno Goncalves 2013-03-04 10:10:24 UTC
Description of problem:
Trying to login to 1024 LUNs causes the following messages:
Mar  4 05:06:04 ibm-z10-32 systemd-udevd[2315]: timeout: killing 'scsi_id --export --whitelisted -d /dev/sdbfp' [8027]
Mar  4 05:06:06 ibm-z10-32 systemd-udevd[1949]: timeout: killing 'scsi_id --export --whitelisted -d /dev/sdbfm' [8017]
Mar  4 05:06:06 ibm-z10-32 systemd-udevd[1970]: timeout: killing 'scsi_id --export --whitelisted -d /dev/sdbgb' [8029]
Mar  4 05:06:07 ibm-z10-32 systemd-udevd[2164]: timeout: killing 'scsi_id --export --whitelisted -d /dev/sdbgg' [8016]
Mar  4 05:06:07 ibm-z10-32 systemd-udevd[1903]: timeout: killing 'scsi_id --export --whitelisted -d /dev/sdbft' [8024]
Mar  4 05:06:07 ibm-z10-32 systemd-udevd[1885]: timeout: killing 'scsi_id --export --whitelisted -d /dev/sdbex' [8018]
Mar  4 05:06:07 ibm-z10-32 systemd-udevd[2315]: timeout: killing 'scsi_id --export --whitelisted -d /dev/sdbfp' [8027]
Mar  4 05:06:07 ibm-z10-32 systemd-udevd[1887]: timeout: killing 'scsi_id --export --whitelisted -d /dev/sdbef' [8019]
Mar  4 05:06:07 ibm-z10-32 systemd-udevd[1970]: timeout: killing 'scsi_id --export --whitelisted -d /dev/sdbgb' [8029]
Mar  4 05:06:07 ibm-z10-32 systemd-udevd[1903]: timeout: killing 'scsi_id --export --whitelisted -d /dev/sdbft' [8024]
Mar  4 05:06:07 ibm-z10-32 systemd-udevd[1970]: timeout: killing 'scsi_id --export --whitelisted -d /dev/sdbgb' [8029]
Mar  4 05:06:08 ibm-z10-32 systemd-udevd[1933]: timeout: killing '/sbin/multipath -c /dev/sdbdw' [8023]
Mar  4 05:06:10 ibm-z10-32 systemd-udevd[1933]: timeout: killing '/sbin/multipath -c /dev/sdbdw' [8023]
Mar  4 05:06:12 ibm-z10-32 systemd-udevd[2315]: timeout: killing 'scsi_id --export --whitelisted -d /dev/sdbfp' [8027]
Mar  4 05:06:13 ibm-z10-32 systemd-udevd[1903]: timeout: killing 'scsi_id --export --whitelisted -d /dev/sdbft' [8024]
Mar  4 05:06:15 ibm-z10-32 systemd-udevd[1970]: timeout: killing 'scsi_id --export --whitelisted -d /dev/sdbgb' [8029]
Mar  4 05:06:17 ibm-z10-32 systemd-udevd[1971]: timeout: killing '/sbin/multipath -c /dev/sdbea' [8011]
Mar  4 05:06:18 ibm-z10-32 systemd-udevd[1885]: timeout: killing 'scsi_id --export --whitelisted -d /dev/sdbex' [8018]
Mar  4 05:06:21 ibm-z10-32 systemd-udevd[1949]: timeout: killing 'scsi_id --export --whitelisted -d /dev/sdbfm' [8017]
Mar  4 05:06:23 ibm-z10-32 systemd-udevd[1887]: timeout: killing 'scsi_id --export --whitelisted -d /dev/sdbef' [8019]

The system seems to get very busy, and no other command responds.

When device-mapper-multipath is removed all the LUNs login properly.
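As a side note (not part of the original report): one quick way to triage a flood of messages like the above is to tally the "timeout: killing" events by the command being killed, which shows whether the timeouts are concentrated in scsi_id, multipath, or both. A minimal sketch in Python; the helper name and sample lines are illustrative only:

```python
import re
from collections import Counter

# Matches systemd-udevd timeout lines such as:
#   Mar  4 05:06:06 host systemd-udevd[1949]: timeout: killing 'scsi_id ... -d /dev/sdbfm' [8017]
TIMEOUT_RE = re.compile(r"systemd-udevd\[\d+\]: timeout: killing '(\S+)")

def count_timeout_kills(log_lines):
    """Tally udev timeout kills by the command that was killed."""
    counts = Counter()
    for line in log_lines:
        m = TIMEOUT_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts

if __name__ == "__main__":
    sample = [
        "Mar  4 05:06:06 ibm-z10-32 systemd-udevd[1949]: timeout: killing 'scsi_id --export --whitelisted -d /dev/sdbfm' [8017]",
        "Mar  4 05:06:08 ibm-z10-32 systemd-udevd[1933]: timeout: killing '/sbin/multipath -c /dev/sdbdw' [8023]",
    ]
    print(count_timeout_kills(sample))
```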

Version-Release number of selected component (if applicable):
device-mapper-multipath-0.4.9-42.el7.s390x

How reproducible:
100%

Steps to Reproduce:
1. Install RHEL 7
2. Enable iscsid to support 1024 LUNs
service iscsid stop
Stopping iscsid (via systemctl):  [  OK  ]

modprobe iscsi_tcp max_lun=1024

echo 1024 > /sys/module/scsi_mod/parameters/max_report_luns

3. Discover the target with 1024 LUNs
iscsiadm -m discovery -I default -p <target portal> -t st
  
4. Log in to the target
iscsiadm -m node -l

Logging in to [iface: default, target: iqn.1992-08.com.netapp:sn.151753773, portal: 10.16.41.222,3260] (multiple)
Logging in to [iface: default, target: iqn.1992-08.com.netapp:sn.151753773, portal: 10.16.43.127,3260] (multiple)
Login to [iface: default, target: iqn.1992-08.com.netapp:sn.151753773, portal: 10.16.41.222,3260] successful.
Login to [iface: default, target: iqn.1992-08.com.netapp:sn.151753773, portal: 10.16.43.127,3260] successful.

Actual results:
systemd seems to try to remove the devices due to the timeouts, and the server is unable to run any other command.

Expected results:
1024 mpath devices, with 2 paths in each.

Additional info:
The following messages appear when installing device-mapper-multipath:
Mar  4 04:51:03 ibm-z10-32 systemd[1]: [/usr/lib/systemd/system/beah-srv.service:4] Failed to add dependency on beah-beaker-backend, ignoring: Invalid argument
Mar  4 04:51:03 ibm-z10-32 systemd[1]: [/usr/lib/systemd/system/beah-srv.service:4] Failed to add dependency on beah-fwd-backend, ignoring: Invalid argument
Mar  4 04:51:03 ibm-z10-32 yum[15464]: Installed: device-mapper-multipath-0.4.9-42.el7.s390x

Comment 2 Bruno Goncalves 2013-03-04 10:37:48 UTC
This call trace also appeared from time to time.

Mar  4 05:40:33 ibm-z10-32 systemd-udevd[1922]: timeout: killing 'scsi_id --export --whitelisted -d /dev/sdbfh' [8026]
Mar  4 05:40:33 ibm-z10-32 kernel: [ 2880.685839] INFO: task kworker/0:5:2077 blocked for more than 120 seconds.
Mar  4 05:40:33 ibm-z10-32 kernel: [ 2880.685867] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar  4 05:40:33 ibm-z10-32 kernel: [ 2880.685880] kworker/0:5     D 00000000005f5d32     0  2077      2 0x00000200
Mar  4 05:40:33 ibm-z10-32 kernel: [ 2880.685923]        0000000002f28500 0000000037f79880 0000000002f28570 0000000037f79880 
Mar  4 05:40:33 ibm-z10-32 kernel: [ 2880.685923]        0000000000174b5a 000000001c47f930 000000001c47f958 0000000037f79880 
Mar  4 05:40:33 ibm-z10-32 kernel: [ 2880.685923]        0000000002f28570 000000000096a500 000000000096a500 000000000096a500 
Mar  4 05:40:33 ibm-z10-32 kernel: [ 2880.685923]        000000001c7c48a8 00000000008b9e80 0000000002f28500 0000000037f79838 
Mar  4 05:40:33 ibm-z10-32 kernel: [ 2880.685923]        00000000006058b8 00000000005f7a56 000000001c47f998 000000001c47faf8 
Mar  4 05:40:33 ibm-z10-32 kernel: [ 2880.686051] Call Trace:
Mar  4 05:40:33 ibm-z10-32 kernel: [ 2880.686057] ([<00000000005f7a56>] __schedule+0x56a/0xab8)
Mar  4 05:40:33 ibm-z10-32 kernel: [ 2880.686074]  [<00000000005f5d32>] schedule_timeout+0x22a/0x2ac
Mar  4 05:40:33 ibm-z10-32 kernel: [ 2880.686084]  [<00000000005f7204>] wait_for_common+0x114/0x190
Mar  4 05:40:33 ibm-z10-32 kernel: [ 2880.686095]  [<0000000000158dde>] kthread_create_on_node+0xb2/0x14c
Mar  4 05:40:33 ibm-z10-32 kernel: [ 2880.686111]  [<000000000014d170>] create_worker+0x12c/0x288
Mar  4 05:40:33 ibm-z10-32 kernel: [ 2880.686124]  [<000000000014fcd4>] manage_workers+0x1c4/0x358
Mar  4 05:40:33 ibm-z10-32 kernel: [ 2880.686219]  [<0000000000150c86>] worker_thread+0x41e/0x460
Mar  4 05:40:33 ibm-z10-32 kernel: [ 2880.686221]  [<0000000000158b46>] kthread+0xda/0xe4
Mar  4 05:40:33 ibm-z10-32 kernel: [ 2880.686224]  [<00000000005f98ce>] kernel_thread_starter+0x6/0xc
Mar  4 05:40:33 ibm-z10-32 kernel: [ 2880.686227]  [<00000000005f98c8>] kernel_thread_starter+0x0/0xc
Mar  4 05:40:34 ibm-z10-32 systemd-udevd[1971]: timeout: killing '/sbin/multipath -c /dev/sdbea' [8011]
Mar  4 05:40:35 ibm-z10-32 systemd-udevd[1970]: timeout: killing 'scsi_id --export --whitelisted -d /dev/sdbgb' [8029]

Comment 3 Peter Rajnoha 2013-03-04 10:42:12 UTC
This looks like another instance of bug #885978, but with scsi_id instead of blkid as it's seen in the other bug report...

Comment 4 Harald Hoyer 2013-03-22 09:34:50 UTC
(In reply to comment #3)
> This looks like another instance of bug #885978, but with scsi_id instead of
> blkid as it's seen in the other bug report...

well, seems like device-mapper-multipath is the culprit

(In reply to comment #0)
> When device-mapper-multipath is removed all the LUNs login properly.

Comment 5 Harald Hoyer 2013-03-25 16:55:38 UTC
This patch might help in systemd-udevd > 198
http://cgit.freedesktop.org/systemd/systemd/commit/?id=8cc3f8c0bcd23bb68166cb197a4c541d7621b19c

Comment 6 Peter Rajnoha 2013-05-06 07:49:21 UTC
Is this still reproducible with systemd > 198?

Comment 7 Bruno Goncalves 2013-05-07 11:35:52 UTC
It seems multipathd is not working properly with the latest version:

May  7 11:34:31 ibm-z10-24 systemd[1]: Stopping Device-Mapper Multipath Device Controller...
May  7 11:34:31 ibm-z10-24 multipathd: --------shut down-------
May  7 11:34:31 ibm-z10-24 systemd[1]: Starting Device-Mapper Multipath Device Controller...
May  7 11:34:31 ibm-z10-24 systemd[1]: PID file /var/run/multipathd.pid not readable (yet?) after start.
May  7 11:34:31 ibm-z10-24 systemd[1]: Started Device-Mapper Multipath Device Controller.
May  7 11:34:31 ibm-z10-24 multipathd: DM multipath kernel driver not loaded
May  7 11:34:31 ibm-z10-24 multipathd: path checkers start up

[root@ibm-z10-24 ~]# multipath -l
May 07 11:34:45 | DM multipath kernel driver not loaded
May 07 11:34:45 | DM multipath kernel driver not loaded

[root@ibm-z10-24 ~]# cat /var/run/multipathd.pid
2102

[root@ibm-z10-24 ~]# ps -ef | grep 2102
root      2102     1  0 11:34 ?        00:00:00 /sbin/multipathd


rpm -q device-mapper-multipath
device-mapper-multipath-0.4.9-49.el7.s390x

rpm -q systemd
systemd-202-3.el7.s390x

Comment 8 Bruno Goncalves 2013-05-07 11:46:33 UTC
Loading the kernel module manually solves this problem.

modprobe dm-multipath

Comment 9 Bruno Goncalves 2013-05-07 11:52:36 UTC
The original issue is no longer reproducible with:


rpm -q device-mapper-multipath
device-mapper-multipath-0.4.9-49.el7.s390x

rpm -q systemd
systemd-202-3.el7.s390x

However, should I open a new BZ for the kernel module not being loaded automatically?

Comment 10 Ben Marzinski 2013-05-08 18:59:02 UTC
(In reply to comment #9)
> The original issue is not reproduced any more on
> 
> 
> rpm -q device-mapper-multipath
> device-mapper-multipath-0.4.9-49.el7.s390x
> 
> rpm -q systemd
> systemd-202-3.el7.s390x
> 
> Although, should I open a new BZ for the kernel module not being loaded
> automatically?

Sure. The module issue is multipath's fault.  It checks the version and fails if it's not loaded.  However, the kernel module does autoload when you try to create a multipath device. Or, it should.

With the dm-multipath module unloaded, can you try

# service multipathd start
# multipath -l

multipathd doesn't fail out if the driver isn't loaded, and as soon as it tries to create a multipath device, the module should get loaded correctly.

If that doesn't work, then there's a kernel issue. Otherwise, multipath just needs to load the kernel module when it's run.
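Aside (not part of the thread): the check comment 10 describes — whether the DM multipath driver is already present — amounts to looking the module up in /proc/modules, where the kernel lists dm-multipath as dm_multipath (dashes normalized to underscores). A hypothetical helper sketching that check; this is not multipath's actual code:

```python
def module_loaded(name, proc_modules_text):
    """Return True if the module appears in /proc/modules content.

    /proc/modules lists one module per line, name first; the kernel
    normalizes dashes in module names to underscores, so dm-multipath
    shows up as dm_multipath.
    """
    wanted = name.replace("-", "_")
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == wanted:
            return True
    return False

def dm_multipath_loaded():
    """Check the running kernel (Linux only)."""
    with open("/proc/modules") as f:
        return module_loaded("dm-multipath", f.read())
```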

Comment 11 Bruno Goncalves 2013-05-09 06:44:05 UTC
(In reply to comment #10)

> With the dm-multipath module unloaded, can you try
> 
> # service multipathd start
> # multipath -l
> 
> multipathd doesn't fail out if the driver isn't loaded, and as soon as it
> tries to create a multipath device, the module should get loaded correctly.
> 
> If that doesn't work, then there's a kernel issue. Otherwise, multipath just
> needs to load the kernel module when it's run.

That was the problem: I had run multipath -l after "service multipathd restart", since the server was configured to start the multipathd service on boot.

So it seems it is a kernel issue.

Comment 12 Bruno Goncalves 2013-05-09 07:13:15 UTC
I think this BZ can be closed, as the original issue has been fixed.

I've just opened a new BZ#961218 to address the dm-multipath module issue.

Comment 13 Ludek Smid 2014-06-13 13:21:48 UTC
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.

