Description of problem:

Test environment
----------------
RH6.8 - 2.6.32-642.el6.x86_64
device-mapper-multipath-0.4.9-96.el6.x86_64 (bugzilla 1365710)
also tested with: device-mapper-multipath-0.4.9-93.el6.x86_64

defaults {
        polling_interval 15
        no_path_retry 12
        path_checker directio
        user_friendly_names no
        path_selector "round-robin 0"
        path_grouping_policy multibus
        prio const
}

SMC946 90 bay enclosure fully populated with a variety of SAS (no SATA) devices

02:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS3008 PCI-Express Fusion-MPT SAS-3 (rev 02)
03:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS3008 PCI-Express Fusion-MPT SAS-3 (rev 02)

[root@perf3094 mdr]# cat /sys/class/scsi_host/host[01]/version_fw
10.00.00.00
10.00.00.00

Problem description
-------------------
Fault insertion reveals that multipath takes too long to begin adding paths back to maps after both cables have been disconnected and a single cable has been restored. This causes multipath to fail queued I/O even though a path to the device is available.

Test
----
For each /dev/mapper partition device in the enclosure (90 devices):
    dd if=/dev/mapper/part_mapname of=/dev/null bs=512k iflag=direct &

Pull cable connecting host0 to the enclosure
 - verify I/O continues using just the remaining path
Pull cable connecting host1 to the enclosure
 - observe syslog messages indicating that "recovery mode" was entered
Reconnect either host0 or host1 SAS cable
 - observe SCSI infrastructure bringing paths back online
Wait, wait, wait for multipath to add paths to maps and resume I/O
 - observe multipath failing I/O after the recovery mode timeout
 - also observe multipath adding paths back to maps at the same time
 - observe all "dd" processes failing with EIO
 - roughly 180 seconds from recovery mode entered to I/O failures

180 second timeout (12*15):
10:40:00 disable hba1
10:40:18 multipathd removed paths from maps
10:43:00 disable hba2
10:43:14 multipathd removed paths from maps, retrying
10:44:00 reenabled hba2 - multipathd does not restore paths to maps, I/O remains stopped
10:46:09 I/O terminated
10:46:20 paths added back to maps, too late

If I set the timeout to 10 minutes, it takes multipath approximately 5-6 minutes to begin adding paths back and resuming I/O.

Analysis
--------
As near as I can tell, the problem stems from udev rule 13-dm-disk.rules inappropriately running "blkid" and/or 40-multipath.rules running "kpartx" when the last path to the mapper device has been failed. These programs hang because the map has no valid path. Commenting out the execution of these programs results in multipath discovering the newly restored paths in a timely manner.

40-multipath.rules has a line intended to suppress execution of kpartx:

    ENV{DM_ACTION}=="PATH_FAILED", GOTO="end_mpath"

but it does not appear to work. udevadm monitor shows the DM_ACTION value properly set to PATH_FAILED, but....

KERNEL[1480449937.733499] change /devices/virtual/block/dm-78 (block)
UDEV_LOG=3
ACTION=change
DEVPATH=/devices/virtual/block/dm-78
SUBSYSTEM=block
DM_TARGET=multipath
DM_ACTION=PATH_FAILED
DM_SEQNUM=14
DM_PATH=67:192
DM_NR_VALID_PATHS=0
DM_NAME=35000c5002615610b
DM_UUID=mpath-35000c5002615610b
DEVNAME=dm-78
DEVTYPE=disk
SEQNUM=165461
MAJOR=253
MINOR=78

Workaround
----------
Increase the timeout to between 6 and 10 minutes.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
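The test I/O load described above was started along these lines (a sketch only; it assumes kpartx-style "p1" partition names under /dev/mapper, which depends on the local naming scheme):

    # start one direct-I/O reader per multipath partition device (sketch)
    for part in /dev/mapper/*p1; do
        dd if="$part" of=/dev/null bs=512k iflag=direct &
    done
    wait    # readers exit with EIO once multipath fails the queued I/O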
Note: ZWS - Zero Watt Storage. Previously known as JBFS.
Are you sure that kpartx is hanging on the change event you posted in the problem description? It would be easier for me to debug this if I could see the output of

# udevadm monitor --udev --property

and the contents of /var/log/messages from while you are running your test.

Also, if you set

fast_io_fail_tmo 5
dev_loss_tmo 600

in your multipath.conf defaults section, does that work around the issue? This will make your SCSI devices fail I/Os to down devices after 5 seconds, but not remove failed devices from the system for 10 minutes. This will let multipath respond quickly to errors on the devices (the fast_io_fail_tmo parameter is set to 5 seconds by default in RHEL7), but it won't remove the devices right away. This means that multipathd doesn't need to rely on udev to restore access to the device.
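For reference, in multipath.conf these are just two extra lines in the existing defaults section, roughly like this (illustrative only; all other settings stay as they are):

    defaults {
            # ...existing settings unchanged...
            fast_io_fail_tmo 5
            dev_loss_tmo 600
    }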
The change event posted was an "example" from udevadm monitor that was showing DM_ACTION=PATH_FAILED was being generated. I doubt that the specific event I posted was directly tied to a specific kpartx. And, in general, it's blkid that is hanging. Occasionally a kpartx is also observed to be hanging.

I will capture the requested syslog and udevadm monitor output for you and post to this bug. First, trying the suggested change to multipath.conf.

Keep in mind, the storage targets in question are connected to an LSI SAS adapter, not Fibre Channel. The multipath.conf.8 documentation suggests that the parameters you've requested be set are limited in scope to FC.

After running the test, I can report that the requested changes to multipath.conf do not work around the issue. They appear to have no effect. Below are the settings used for the test.

defaults {
        path_checker directio
#       checker_timeout 10
#       force_sync yes
        polling_interval 15
#       no_path_retry 40
        no_path_retry 12
#       no_path_retry queue
        fast_io_fail_tmo 5
        dev_loss_tmo 600
        user_friendly_names no
        path_selector "round-robin 0"
        path_grouping_policy multibus
        prio const
#       verbosity 3
}
Created attachment 1229224 [details]
Requested data for comment #4.

Contains following files:
udevadm_monitor.1402099.log
messages.1402099.log
blkid.1402099.log -- ps -ef|grep blkid # after loss of 2nd path
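A simple way to watch how many blkid/kpartx helpers are stuck at any point during the test is something along these lines (a sketch, not the exact commands used for the attached logs):

    # periodically count blkid/kpartx helper processes while the test runs
    while :; do
        echo "$(date +%T) blkid=$(pgrep -x blkid | wc -l) kpartx=$(pgrep -x kpartx | wc -l)"
        sleep 5
    done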
Created attachment 1230109 [details]
udev rules file to disable blkid on multipath devices with no valid paths

If you copy this file to /lib/udev/rules.d/, it should take care of the blkid problem. This won't solve the kpartx problem. I'll send you a fix for that next.
Created attachment 1230110 [details]
patch for /lib/udev/rules.d/40-multipath.rules to fix kpartx hang

If you apply this patch to /lib/udev/rules.d/40-multipath.rules, it should fix the kpartx hang as well. All it's doing is skipping kpartx when DM_NOSCAN is set. The 11-dm-mpath.rules file from the previous attachment made sure that this was set whenever there are no valid paths. Let me know if these two changes fix the issue for you.
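In rough terms, the two pieces combine like this (a sketch of the idea only, not the exact contents of the attachments):

    # 11-dm-mpath.rules (sketch): flag multipath maps that currently have no valid paths
    ENV{DM_UUID}=="mpath-?*", ENV{DM_NR_VALID_PATHS}=="0", ENV{DM_NOSCAN}="1"

    # 40-multipath.rules addition (the attached patch): skip kpartx when that flag is set
    ENV{DM_NOSCAN}=="1", GOTO="end_mpath"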
Comment on attachment 1230109 [details]
udev rules file to disable blkid on multipath devices with no valid paths

Adding 11-dm-mpath.rules to /lib/udev/rules.d results in these syslog messages.

[root@perf3094 rules.d]# cp /root/mdr/11-dm-mpath.rules .
Dec 9 15:43:02 perf3094 udevd[2719]: GOTO 'mpath_end' has no matching label in: '/lib/udev/rules.d/11-dm-mpath.rules'
Dec 9 15:43:02 perf3094 udevd[2719]: GOTO 'mpath_end' has no matching label in: '/lib/udev/rules.d/11-dm-mpath.rules'
Dec 9 15:43:02 perf3094 udevd[2719]: GOTO 'mpath_end' has no matching label in: '/lib/udev/rules.d/11-dm-mpath.rules'
Dec 9 15:43:02 perf3094 udevd[2719]: GOTO 'mpath_end' has no matching label in: '/lib/udev/rules.d/11-dm-mpath.rules'
Dec 9 15:43:02 perf3094 udevd[2719]: GOTO 'mpath_end' has no matching label in: '/lib/udev/rules.d/11-dm-mpath.rules'
Dec 9 15:43:02 perf3094 udevd[2719]: GOTO 'mpath_end' has no matching label in: '/lib/udev/rules.d/11-dm-mpath.rules'

The patch applied cleanly to 40-multipath.rules.

[root@perf3094 mdr]# diff save_40-multipath.rules 40-multipath.rules
23a24
> ENV{DM_NOSCAN}=="1", GOTO="end_mpath"

I made an edit to 11-dm-mpath.rules, adding the appropriate LABEL at the end of the file. This eliminated the syslog messages. If this is not the right place to have placed it, please advise.

I've captured the udevadm monitor output per the previous request, as well as the associated syslog. The syslog contains a periodic count of the number of blkid and kpartx processes executing. I will attach the tarball shortly.
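For completeness, the label added at the end of 11-dm-mpath.rules to match the GOTOs was along the lines of:

    LABEL="mpath_end"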
Created attachment 1230166 [details]
Attachment associated with comment #9

[root@perf3094 logs]# tar tvfz 1402099.2.tar.gz
-rw-r--r-- root/root 2635375 2016-12-09 16:01 udevadm_monitor.1402099.2.log
-rw-r--r-- root/root  349682 2016-12-09 16:01 messages.2.1402099.log
-rw-r--r-- root/root    3425 2016-12-09 15:57 blkid.2.1402099.log
Getting ready for OnBoarding first week of January.
Hello,
Due to where we are in the RHEL 6.9 release, this will not make RHEL 6.9. This is now requested for RHEL 6.10.

Thank You
Joe Kachuck
Hello,
RHEL 6 has entered Phase 3. In Phase 3, only Critical impact Security Advisories and selected Urgent Priority Bug Fix Advisories will be accepted.
https://access.redhat.com/support/policy/updates/errata

At present this BZ does not meet these requirements, so I am closing it as WONTFIX. Please reopen if this fix is required for RHEL 6; if so, please also provide a justification for the fix.

Thank You
Joe Kachuck
Hi,
We were hoping Red Hat would reconsider this issue for 6.10. We feel that multipath failing queued I/O while a connection is still available is a serious issue, and the original report makes a strong case for fixing it.

Thank you.
This is certainly fixable. There was a similar issue in RHEL7, Bug 1239173. The problem with my previous solution is that it didn't handle all the cases where a multipath device got an event that made it lose its last path: if the last valid path was removed when the device was reloaded, instead of failing first and then being removed later, the test patches that I sent above wouldn't catch it. Unfortunately, fixing this correctly is a significantly bigger change, but it's still just a backport.
Ben,
I hope your comment ultimately leads to this BZ being reopened and to a resolution. Thank you for considering this request!

I cannot view 1239173.

Isn't the core issue that the udev rule isn't correct? And the attempt to correct it didn't work? (Comment #8.)

From comment #1: "As near as I can tell, the problem stems from udev rule 13-dm-disk.rules inappropriately running "blkid" and/or 40-multipath.rules running "kpartx" when the last path to the mapper device has been failed."
(In reply to Michael Reed from comment #17)
> Isn't the core issue that the udev rule isn't correct? And the attempt
> to correct it didn't work? (Comment #8.)

Yes, but the issue is that the uevents don't have enough information to determine what to do in all cases. To deal with that, multipath needs to add some flags to some of its device-mapper callouts so that it can check these flags in the udev rules and do the right thing. Otherwise there just isn't enough information for the udev rules to know the right thing to do.
Thank you for the clarification!
Hello Frank,
To reopen for RHEL 6.10: this is currently a High priority issue. What would make this issue Urgent? Does HPE/SGI have clients seeing this issue in the field?

From the summary, there is a delay in the drives reappearing if both cables are pulled. How long is this delay?

Thank You
Joe Kachuck
hi Joe,
This is a BZ that I took over from George, so I'm not 100% sure on this. But I believe it's 5 or 6 minutes to recover, and that this is customer reported.
-frank
Fortunately, this is not customer reported! It was discovered during fault insertion testing prior to FCS. It does take 5-6 minutes to recover after the cable is plugged back in. The sd devices are present and ready to go.
Michael, thanks for the clarification.