Bug 917538
Summary: | device mapper multipath fails to create 1024 mpaths on s390x | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Bruno Goncalves <bgoncalv> |
Component: | device-mapper-multipath | Assignee: | Peter Rajnoha <prajnoha> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Bruno Goncalves <bgoncalv> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 7.0 | CC: | agk, bmarzins, harald, heinzm, msnitzer, prajnoha, sauchter, zkabelac |
Target Milestone: | rc | Keywords: | Triaged |
Target Release: | --- | ||
Hardware: | s390x | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2014-06-13 13:21:48 UTC | Type: | Bug |
Description
Bruno Goncalves
2013-03-04 10:10:24 UTC
This call trace also happened from time to time.

Mar 4 05:40:33 ibm-z10-32 systemd-udevd[1922]: timeout: killing 'scsi_id --export --whitelisted -d /dev/sdbfh' [8026]
Mar 4 05:40:33 ibm-z10-32 kernel: [ 2880.685839] INFO: task kworker/0:5:2077 blocked for more than 120 seconds.
Mar 4 05:40:33 ibm-z10-32 kernel: [ 2880.685867] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 4 05:40:33 ibm-z10-32 kernel: [ 2880.685880] kworker/0:5 D 00000000005f5d32 0 2077 2 0x00000200
Mar 4 05:40:33 ibm-z10-32 kernel: [ 2880.685923] 0000000002f28500 0000000037f79880 0000000002f28570 0000000037f79880
Mar 4 05:40:33 ibm-z10-32 kernel: [ 2880.685923] 0000000000174b5a 000000001c47f930 000000001c47f958 0000000037f79880
Mar 4 05:40:33 ibm-z10-32 kernel: [ 2880.685923] 0000000002f28570 000000000096a500 000000000096a500 000000000096a500
Mar 4 05:40:33 ibm-z10-32 kernel: [ 2880.685923] 000000001c7c48a8 00000000008b9e80 0000000002f28500 0000000037f79838
Mar 4 05:40:33 ibm-z10-32 kernel: [ 2880.685923] 00000000006058b8 00000000005f7a56 000000001c47f998 000000001c47faf8
Mar 4 05:40:33 ibm-z10-32 kernel: [ 2880.686051] Call Trace:
Mar 4 05:40:33 ibm-z10-32 kernel: [ 2880.686057] ([<00000000005f7a56>] __schedule+0x56a/0xab8)
Mar 4 05:40:33 ibm-z10-32 kernel: [ 2880.686074] [<00000000005f5d32>] schedule_timeout+0x22a/0x2ac
Mar 4 05:40:33 ibm-z10-32 kernel: [ 2880.686084] [<00000000005f7204>] wait_for_common+0x114/0x190
Mar 4 05:40:33 ibm-z10-32 kernel: [ 2880.686095] [<0000000000158dde>] kthread_create_on_node+0xb2/0x14c
Mar 4 05:40:33 ibm-z10-32 kernel: [ 2880.686111] [<000000000014d170>] create_worker+0x12c/0x288
Mar 4 05:40:33 ibm-z10-32 kernel: [ 2880.686124] [<000000000014fcd4>] manage_workers+0x1c4/0x358
Mar 4 05:40:33 ibm-z10-32 kernel: [ 2880.686219] [<0000000000150c86>] worker_thread+0x41e/0x460
Mar 4 05:40:33 ibm-z10-32 kernel: [ 2880.686221] [<0000000000158b46>] kthread+0xda/0xe4
Mar 4 05:40:33 ibm-z10-32 kernel: [ 2880.686224] [<00000000005f98ce>] kernel_thread_starter+0x6/0xc
Mar 4 05:40:33 ibm-z10-32 kernel: [ 2880.686227] [<00000000005f98c8>] kernel_thread_starter+0x0/0xc
Mar 4 05:40:34 ibm-z10-32 systemd-udevd[1971]: timeout: killing '/sbin/multipath -c /dev/sdbea' [8011]
Mar 4 05:40:35 ibm-z10-32 systemd-udevd[1970]: timeout: killing 'scsi_id --export --whitelisted -d /dev/sdbgb' [8029]

This looks like another instance of bug #885978, but with scsi_id instead of blkid as seen in the other bug report...

(In reply to comment #3)
> This looks like another instance of bug #885978, but with scsi_id instead of
> blkid as it's seen in the other bug report...

Well, it seems that device-mapper-multipath is the culprit.

(In reply to comment #0)
> When device-mapper-multipath is removed all the LUNs login properly.

This patch might help in systemd-udevd > 198:
http://cgit.freedesktop.org/systemd/systemd/commit/?id=8cc3f8c0bcd23bb68166cb197a4c541d7621b19c

Is this still reproducible with systemd > 198?

It seems multipathd is not working properly with the latest version:

May 7 11:34:31 ibm-z10-24 systemd[1]: Stopping Device-Mapper Multipath Device Controller...
May 7 11:34:31 ibm-z10-24 multipathd: --------shut down-------
May 7 11:34:31 ibm-z10-24 systemd[1]: Starting Device-Mapper Multipath Device Controller...
May 7 11:34:31 ibm-z10-24 systemd[1]: PID file /var/run/multipathd.pid not readable (yet?) after start.
May 7 11:34:31 ibm-z10-24 systemd[1]: Started Device-Mapper Multipath Device Controller.
May 7 11:34:31 ibm-z10-24 multipathd: DM multipath kernel driver not loaded
May 7 11:34:31 ibm-z10-24 multipathd: path checkers start up

[root@ibm-z10-24 ~]# multipath -l
May 07 11:34:45 | DM multipath kernel driver not loaded
May 07 11:34:45 | DM multipath kernel driver not loaded
[root@ibm-z10-24 ~]# cat /var/run/multipathd.pid
2102
[root@ibm-z10-24 ~]# ps -ef | grep 2102
root 2102 1 0 11:34 ? 00:00:00 /sbin/multipathd

rpm -q device-mapper-multipath
device-mapper-multipath-0.4.9-49.el7.s390x

rpm -q systemd
systemd-202-3.el7.s390x

Loading the kernel module manually solves this problem:

modprobe dm-multipath

The original issue is no longer reproducible on:

rpm -q device-mapper-multipath
device-mapper-multipath-0.4.9-49.el7.s390x

rpm -q systemd
systemd-202-3.el7.s390x

However, should I open a new BZ for the kernel module not being loaded automatically?

(In reply to comment #9)
> The original issue is not reproduced any more on
>
>
> rpm -q device-mapper-multipath
> device-mapper-multipath-0.4.9-49.el7.s390x
>
> rpm -q systemd
> systemd-202-3.el7.s390x
>
> Although, should I open a new BZ for the kernel module not being loaded
> automatically?

Sure. The module issue is multipath's fault. It checks the version and fails if the module is not loaded. However, the kernel module does autoload when you try to create a multipath device. Or, it should.

With the dm-multipath module unloaded, can you try

# service multipathd start
# multipath -l

multipathd doesn't fail out if the driver isn't loaded, and as soon as it tries to create a multipath device, the module should get loaded correctly.

If that doesn't work, then there's a kernel issue. Otherwise, multipath just needs to load the kernel module when it's run.

(In reply to comment #10)
> With the dm-multipath module unloaded, can you try
>
> # service multipathd start
> # multipath -l
>
> multipathd doesn't fail out if the driver isn't loaded, and as soon as it
> tries to create a multipath device, the module should get loaded correctly.
>
> If that doesn't work, then there's a kernel issue. Otherwise, multipath just
> needs to load the kernel module when it's run.

That was the problem: I ran multipath -l after "service multipathd restart", as the server was configured to start the multipathd service on boot. So it seems there is a kernel issue there.

I think this BZ can be closed, as the original issue has been fixed. I've just opened a new BZ#961218 to address the dm-multipath module issue.

This request was resolved in Red Hat Enterprise Linux 7.0. Contact your manager or support representative in case you have further questions about the request.
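
The thread above reports that multipath -l prints "DM multipath kernel driver not loaded" when the dm-multipath module is absent, and that running modprobe dm-multipath by hand works around it. A minimal sketch of a pre-check along those lines follows. It assumes the module is listed in /proc/modules under the name dm_multipath (module names appear there with underscores); the function name dm_multipath_loaded and its optional file argument are hypothetical and exist only for illustration, not as part of device-mapper-multipath.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if dm_multipath appears in a
# /proc/modules-style listing, non-zero otherwise.
# The listing path defaults to /proc/modules; the argument exists only
# so the check can be exercised against a sample file.
dm_multipath_loaded() {
    modules_file="${1:-/proc/modules}"
    # The first field of each /proc/modules line is the module name,
    # so match on that field only (later fields list dependent modules).
    awk '$1 == "dm_multipath" { found = 1 } END { exit !found }' "$modules_file"
}

# Possible use, as root, mirroring the manual workaround in the thread:
#   dm_multipath_loaded || modprobe dm-multipath
#   multipath -l
```

Matching on the first field avoids a false positive when dm_multipath only shows up in another module's dependency column.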