Description of problem:
If we execute multipath -r right after modprobe lpfc, multipath hangs. strace output:
====================
open("/dev/sdr", O_RDONLY) = 4
ioctl(4, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[6]=[12, 01, 80, 00, ff, 00], mx_sb_len=32, iovec_count=0, dxfer_len=255, timeout=300000, flags=0
====================

Version-Release number of selected component (if applicable):
kernel-2.6.18-262.el5
device-mapper-multipath-0.4.7-46.el5

How reproducible:
100%

Steps to Reproduce:
1. multipath -F and make sure multipathd is running
2. modprobe -r lpfc
3. modprobe lpfc
4. multipath -r

Actual results:
The multipath command hangs and cannot be terminated.

Expected results:
The multipath table is reloaded correctly.

Additional info:
I think this is the same problem I am seeing. My issue is with an OFED 1.5.2 built SRP initiator, not the RHEL-provided one.

Version Info:
1 - A RHEL-modified Lustre kernel 2.6.18-194.17.1.el5 with a DDN-modified device-mapper RPM
2 - A RHEL original 2.6.18-194.26.1.el5 kernel with device-mapper-multipath-0.4.7-32.el5_5.6

Steps to Reproduce:
Boot the system with the openibd and multipathd services started, or:
1. service multipathd start
2. service openibd start

Actual results:
multipathd does not finish setting up all device maps; it hangs in kpartx on different devices each time I've seen it. If "multipath -ll" is run to verify paths before it completes, it also hangs.

Expected results:
All device maps should be updated by multipathd and "multipath -ll" should report the proper status of all paths.

Additional info:
I have a few hung kpartx tasks as a result of multipathd running:

Jun 1 15:01:01 test kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 1 15:01:01 test kernel: kpartx        D ffffffff80150d56     0 12087  12068                     (NOTLB)
Jun 1 15:01:01 test kernel: ffff811223f41c08 0000000000000086 0000000000000001 ffffffff800e27ef
Jun 1 15:01:01 test kernel: ffff811224974c08 0000000000000005 ffff81123fcea100 ffff81127f337820
Jun 1 15:01:01 test kernel: 00000035154def40 0000000000002100 ffff81123fcea2e8 0000000900000003
Jun 1 15:01:01 test kernel: Call Trace:
Jun 1 15:01:01 test kernel: [<ffffffff800e27ef>] block_read_full_page+0x259/0x276
Jun 1 15:01:01 test kernel: [<ffffffff8006e1d7>] do_gettimeofday+0x40/0x90
Jun 1 15:01:01 test kernel: [<ffffffff80028b03>] sync_page+0x0/0x42
Jun 1 15:01:01 test kernel: [<ffffffff800637ea>] io_schedule+0x3f/0x67
Jun 1 15:01:01 test kernel: [<ffffffff80028b41>] sync_page+0x3e/0x42
Jun 1 15:01:01 test kernel: [<ffffffff8006392e>] __wait_on_bit_lock+0x36/0x66
Jun 1 15:01:01 test kernel: [<ffffffff8003fc4c>] __lock_page+0x5e/0x64
Jun 1 15:01:01 test kernel: [<ffffffff800a09f8>] wake_bit_function+0x0/0x23
Jun 1 15:01:01 test kernel: [<ffffffff8000c373>] do_generic_mapping_read+0x1df/0x359
Jun 1 15:01:01 test kernel: [<ffffffff8000d18c>] file_read_actor+0x0/0x159
Jun 1 15:01:01 test kernel: [<ffffffff8000c639>] __generic_file_aio_read+0x14c/0x198
Jun 1 15:01:01 test kernel: [<ffffffff800c6852>] generic_file_read+0xac/0xc5
Jun 1 15:01:01 test kernel: [<ffffffff800a09ca>] autoremove_wake_function+0x0/0x2e
Jun 1 15:01:01 test kernel: [<ffffffff800e4c3b>] block_ioctl+0x1b/0x1f
Jun 1 15:01:01 test kernel: [<ffffffff8004211a>] do_ioctl+0x21/0x6b
Jun 1 15:01:01 test kernel: [<ffffffff800301f2>] vfs_ioctl+0x457/0x4b9
Jun 1 15:01:01 test kernel: [<ffffffff80063b05>] mutex_lock+0xd/0x1d
Jun 1 15:01:01 test kernel: [<ffffffff8000b729>] vfs_read+0xcb/0x171
Jun 1 15:01:01 test kernel: [<ffffffff80011c14>] sys_read+0x45/0x6e
Jun 1 15:01:01 test kernel: [<ffffffff8005d116>] system_call+0x7e/0x83

[root@test log]# ps -efl | grep kpartx
0 S root 10809     1  0  76  -2 -  2699 wait   Jun01 ?        00:00:00 /bin/bash -c /sbin/mpath_wait /dev/mapper/test-OST0025; /sbin/kpartx -a -p p /dev/mapper/test-OST0025
4 D root 10814 10809  0  79  -2 -  3132 sync_p Jun01 ?        00:00:00 /sbin/kpartx -a -p p /dev/mapper/test-OST0025
0 S root 12068     1  0  75  -2 -  2699 wait   Jun01 ?        00:00:00 /bin/bash -c /sbin/mpath_wait /dev/mapper/test-OST0005; /sbin/kpartx -a -p p /dev/mapper/test-OST0005
4 D root 12087 12068  0  75  -2 -  3132 sync_p Jun01 ?        00:00:00 /sbin/kpartx -a -p p /dev/mapper/test-OST0005
0 S root 21151 17219  0  77   0 - 15290 pipe_w 10:32 pts/1    00:00:00 grep kpartx

[root@test log]# ps -eo pid,args,wchan | grep kpartx
10814 /sbin/kpartx -a -p p /dev/m sync_page
12087 /sbin/kpartx -a -p p /dev/m sync_page
21156 grep kpartx                 pipe_wait
This might be https://bugzilla.redhat.com/show_bug.cgi?id=674932. We had thought it was related to using the scsi_dh_* modules, but if in comment #1 you are using SRP then we might have been on the wrong track.

Jeremy, the SRP target you are using does not use scsi_dh_emc, scsi_dh_alua, or scsi_dh_rdac, does it? You can tell by looking in /var/log/messages for "attached" messages mentioning emc, alua, or rdac, or by doing an lsmod and checking whether one of those modules is loaded and has a refcount greater than 1.
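The lsmod check described above can be scripted. A minimal sketch, run against a fabricated lsmod snapshot (the module sizes and refcounts below are invented for illustration, not taken from the reporter's system):

```shell
# Fabricated lsmod output: module name, size, refcount (plus users).
cat <<'EOF' > /tmp/lsmod.sample
scsi_dh_alua           40021  3
scsi_dh_rdac           41105  0
scsi_dh                41665  2 scsi_dh_alua,scsi_dh_rdac
lpfc                  748334  4
EOF

# Per the comment above, a scsi_dh_* handler module that is loaded
# with a refcount greater than 1 suggests it is attached to devices.
awk '$1 ~ /^scsi_dh_/ && $3 > 1 { print $1 }' /tmp/lsmod.sample
```

On the sample data this prints only scsi_dh_alua; on a live system you would pipe real `lsmod` output through the same awk filter.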
Jeremy, I also wanted to confirm that at the time you got the hung kpartx there were no transport errors/issues, right? When kpartx hangs, can you do a dd directly to the /dev/sdX devices that make up the paths for the multipath device, or do the dds fail to some paths?
Unfortunately I'm not authorized to see bug 674932.

This system does use ALUA. I have two different versions of multipath that I tested; one is from the storage vendor DDN. I think the primary difference is that the DDN-modified multipath interprets the target port groups from VPD page 0x83 slightly differently, so it sees 4 different path priorities while the regular dm-multipath sees just 2. The problem was verified on both, though. I don't have remote access to the system and it's on a stand-alone network, but IIRC my multipath config looks something like:

I'm pretty sure there were no transport errors. I've had no issues with leaving multipathd stopped, starting the IB service (which discovers and adds the storage), and manually running "multipath -v1" a little after the storage is all discovered. I've mounted the file systems and tested I/O to them like that. Obviously, once kpartx is hanging, all I/O also hangs.

From what I looked at today with strace and systemtap, it looks like one thread of multipathd is in the dm_suspend code path in drivers/md/dm.c, in this loop:

1442         while (1) {
1443                 set_current_state(TASK_INTERRUPTIBLE);
1444
1445                 if (!atomic_read(&md->pending) || signal_pending(current))
1446                         break;
1447
1448                 io_schedule();
1449         }

I'm hoping to take more of a look at the other modules later this week to figure out why md->pending is never decremented.
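Since the config itself didn't make it into the comment above, here is an illustrative RHEL 5-era ALUA stanza for /etc/multipath.conf. This is a generic sketch, not the reporter's actual configuration; the vendor/product strings are placeholders:

```
devices {
        device {
                # Placeholder vendor/product strings -- substitute the
                # values reported by the array.
                vendor                  "EXAMPLE"
                product                 "EXAMPLE-LUN"
                path_grouping_policy    group_by_prio
                # RHEL 5 multipath (0.4.7) used an external prio callout
                # for ALUA priorities.
                prio_callout            "/sbin/mpath_prio_alua /dev/%n"
                failback                immediate
                no_path_retry           12
        }
}
```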
One other comment probably worth noting, since you mentioned you suspect the scsi_dh modules: when I took a look with systemtap at how many times some of the functions were called, activate_path() was called 59 times for my 59 LUNs, but pg_init_done() was only called 57 times the last time I looked, with 2 devices hung in kpartx. pg_init_done() calls scsi_dh_activate().
Jeremy,

In your /var/log/messages do you see messages like:

alua: Detached

?

Could you try this kernel: http://people.redhat.com/jwilson/el5/265.el5/ ?
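A quick way to check for the detach messages is a grep over /var/log/messages. A sketch against a fabricated log excerpt (the surrounding log lines are invented for illustration; only the "alua: Detached" string comes from this thread):

```shell
# Fabricated /var/log/messages excerpt for illustration.
cat <<'EOF' > /tmp/messages.sample
Jun  1 15:00:40 test kernel: sd 5:0:0:3: alua: port group 01 state A
Jun  1 15:00:41 test kernel: sd 5:0:0:3: alua: Detached
Jun  1 15:00:41 test kernel: sd 5:0:0:4: alua: Detached
EOF

# Count handler detachments; on a live system run this against
# /var/log/messages instead of the sample file.
grep -c 'alua: Detached' /tmp/messages.sample
```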
Created attachment 503786 [details]
patches needed for scsi_dh issue in 2.6.18-194 kernel

I confirmed today that we are seeing the alua detachments in /var/log/messages, after the messages printed from functions in alua_initialize(). Also, I tested the kernel you requested today and it seemed to clear the problem. Unfortunately I need a fix for 5.5 today, and soon for 5.6. After looking through the diffs from the RHEL patch in the src RPMs, I think I extracted what is needed in the attached patch. I built and tested the new kernel and it worked for the SCSI device handler issue, but I didn't test anything else. Can you verify that the patch looks fine and that there isn't anything I'm overlooking?
(In reply to comment #7)
> problem. Unfortunately I need a fix for 5.5 today and soon 5.6.

So are you going to request that we make z-stream kernels?

I think the patch looks OK. I cc'd the engineer who made the patches for our kernel to double-check.
> So are you going to request that we make z stream kernels?

No, I'll handle the kernel myself; I'm just looking to get a head nod from someone a little more familiar with this code than myself as to whether the patch looks good or not.
This request was evaluated by Red Hat Product Management for inclusion in Red Hat Enterprise Linux 5.7 and Red Hat does not plan to fix this issue in the currently developed update.

Contact your manager or support representative in case you need to escalate this bug.
(In reply to comment #11)
> This request was evaluated by Red Hat Product Management for inclusion in
> Red Hat Enterprise Linux 5.7 and Red Hat does not plan to fix this issue in
> the currently developed update.
>
> Contact your manager or support representative in case you need to escalate
> this bug.

So this will be included in 5.7 but not added to older releases (5.6 and 5.5)?
(In reply to comment #12)
> (In reply to comment #11)
> > This request was evaluated by Red Hat Product Management for inclusion in
> > Red Hat Enterprise Linux 5.7 and Red Hat does not plan to fix this issue
> > in the currently developed update.
> >
> > Contact your manager or support representative in case you need to
> > escalate this bug.
>
> So this will be included in 5.7 but not added to older releases (5.6 and
> 5.5)?

Yes. I think if you want it in a 5.6 or 5.5 kernel you can request some sort of special port like a z-stream kernel release. I think to do that you follow what comment 11 requests, where you have to have some sort of support rep do it. I do not have the proper bugzilla permissions to request it.
If you need info from me in the future _please_ set the needinfo flag accordingly. I somehow missed this bug until now.

The patch that was provided in comment#7 rolls up changes that were introduced to address various scsi_dh* bugs in RHEL5:

from Mike Christie (for 5.7):
bug#666304
  [scsi] scsi_dh: allow scsi_dh_detach to detach when attached

from Mike Snitzer (for 5.7):
bug#645343
  [scsi] device_handler: propagate SCSI device deletion
  [scsi] device_handler: fix ref counting in scsi_dh_activate error path
bug#619361
  [scsi] scsi_dh_alua: handle transitioning state correctly
bug#667660
  [scsi] scsi_dh_alua: add scalable ONTAP lun to dev list

from Michal Schmidt (for 5.6):
bug#556476
  [misc] add round_jiffies_up and related routines

And some rdac changes for IBM and Dell device support from Rob Evers.

Anyway, closing this bug as a duplicate of bug#645343, as that seems like the most relevant change for the issue discussed in this BZ -- but it could be that we need a combination of 645343 and 666304. All of these changes are in 5.7; any z-stream release needs to be formally requested.

*** This bug has been marked as a duplicate of bug 645343 ***