Bug 526249
Summary: | Multipath remove/add race condition | ||||||
---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 5 | Reporter: | Oren Held <oren> | ||||
Component: | device-mapper-multipath | Assignee: | Ben Marzinski <bmarzins> | ||||
Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Lin Li <lilin> | ||||
Severity: | medium | Docs Contact: | |||||
Priority: | low | ||||||
Version: | 5.4 | CC: | agk, bdonahue, bmarzins, bmr, christophe.varoqui, dwysocha, edamato, egoggin, heinzm, iannis, jbrassow, junichi.nomura, kueda, lilin, lmb, orenhe, prockai, tranlan | ||||
Target Milestone: | rc | Flags: | oren: needinfo- | ||||
Target Release: | --- | ||||||
Hardware: | All | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2014-02-06 17:17:39 UTC | Type: | --- | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Maybe it's related to #518575.

There was definitely a race between multipath and multipathd that could cause all sorts of problems (bz #506715). It was fixed in device-mapper-multipath-0.4.7-30.el5, which you say you tried. However, with that package applied, you shouldn't have a line in 40-multipath.rules that calls multipath; it was commented out. It is also possible that 518575 has something to do with this, but I can't see what at the moment.

When you tried this with device-mapper-multipath-0.4.7-30.el5, was the following line commented out in 40-multipath.rules?

# KERNEL!="dm-[0-9]*", ACTION=="add", PROGRAM=="/bin/bash -c '/sbin/lsmod | /bin/grep ^dm_multipath'", RUN+="/sbin/multipath -v0 %M:%m"

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux release for currently deployed products. This request is not yet committed for inclusion in a release.
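The question in this thread about whether the udev rule that calls multipath is still active can be checked mechanically. The following is only a sketch: the function name has_active_multipath_rule is made up for illustration, and it assumes the rule invokes /sbin/multipath through a RUN+= key exactly as in the line quoted in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of device-mapper-multipath).
# Reports whether a udev rules file still contains an active,
# i.e. uncommented, rule that runs /sbin/multipath.
# Exit status: 0 = active rule found, 1 = none found.
has_active_multipath_rule() {
    rules_file=$1
    # Drop lines whose first non-blank character is '#', then look for
    # a RUN key that invokes /sbin/multipath. The trailing space in the
    # pattern keeps /sbin/multipathd from matching.
    grep -v '^[[:space:]]*#' "$rules_file" 2>/dev/null |
        grep -qF 'RUN+="/sbin/multipath '
}
```

It would be called with the path of the rules file on the affected system, for example has_active_multipath_rule /etc/udev/rules.d/40-multipath.rules (the directory is an assumption about the installation). An exit status of 0 means the rule is still active.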
Created attachment 363007 [details]
My /var/log/messages when the problem occurs; see the wait_for_file() debug messages towards the end.

Description of problem:
When a new SCSI disk is presented to the system, udev (rules.d/40-multipath) calls 'multipath <major>:<minor>' to add a mapping. If multipath maps are flushed (-F) and recreated at more or less the same time, this sometimes leads to a very bad situation: multipathd gets the "add map" uevent on dm-X, while /sys/block/dm-X/dev is simply never created (so the recently added wait_for_file() patch does not solve it). This leads to losing multipath mappings which should have been created. Manually adding them with the multipath command doesn't fix the situation in many cases.

Version-Release number of selected component (if applicable):
Tested on RHEL 5.1 -> 5.3 with the native kernel/device-mapper-multipath versions, and with the latest device-mapper-multipath-0.4.7-30, kernel-2.6.18-164.el5.

Steps to Reproduce:
0. (Optional) Add the following line to libmultipath/discovery.c inside wait_for_file()'s while loop:
   condlog(0, "wait_for_file %d %s", loop, filename);
1. Run the following evil one-liner:
   while [ "1" ]; do multipath -F; multipath; sleep 1; done
2. Add a SCSI device to the system (I use add-single-device to /proc/scsi/scsi).
3. Stop the while loop after udev rule 40 has finished calling the multipath command.
4. multipath -l

Actual results:
In cases of failure:
- Some multipath mappings are missing.
- If step 0 was taken, syslog repeatedly prints "multipathd: wait_for_file xxx /sys/block/dm-X/dev" lines until xxx == 0 (/sys/block/dm-X does not exist even after a long wait).

Expected results:
All multipath maps should be present.

Additional info:
I would gladly provide more info.
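The failure described above (an "add map" uevent for dm-X with no /sys/block/dm-X/dev behind it) can be spotted after the fact with a small check. This is a minimal sketch, not part of the original report: the function name missing_dm_dev is hypothetical, and the sysfs root is passed as a parameter only so the check can be pointed at a test tree.

```shell
#!/bin/sh
# Hypothetical helper (not part of device-mapper-multipath).
# Usage: missing_dm_dev SYSFS_ROOT dm-N [dm-N ...]
# Prints each named device whose SYSFS_ROOT/block/dm-N/dev file is
# missing, which also covers the case where the dm-N directory itself
# was never created.
# Exit status: 0 = every dev file exists, 1 = at least one is missing.
missing_dm_dev() {
    sys_root=$1
    shift
    rc=0
    for name in "$@"; do
        if [ ! -e "$sys_root/block/$name/dev" ]; then
            echo "$name"
            rc=1
        fi
    done
    return "$rc"
}
```

On a real system the first argument would be /sys, followed by the dm-X names taken from the multipathd syslog lines, for example missing_dm_dev /sys dm-3 dm-4.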