Red Hat Bugzilla – Bug 1313324
Individual errors during multipath discovery invalidate the entire discovery.
Last modified: 2016-11-04 04:18:35 EDT
Description of problem:

Under heavy load (creating and deleting many devices simultaneously), multipath calls (for example "multipath -l $device") will fail if, during the discovery phase, there is a problem retrieving the udev information for an individual device (because the device was deleted after the devices were listed). The call should not fail in this case.

In the code we get the list of devices at [A]. If we then fail at [B], we end up returning a positive number at [C], and the caller fails the operation.

    udev_enumerate_add_match_subsystem(udev_iter, "block");
    udev_enumerate_scan_devices(udev_iter);                           <--- A

    udev_list_entry_foreach(entry,
                            udev_enumerate_get_list_entry(udev_iter)) {
        const char *devtype;
        devpath = udev_list_entry_get_name(entry);
        condlog(4, "Discover device %s", devpath);
        udevice = udev_device_new_from_syspath(conf->udev, devpath);  <--- B
        if (!udevice) {
            condlog(4, "%s: no udev information", devpath);
            r++;
            continue;
        }
        devtype = udev_device_get_devtype(udevice);
        if(devtype && !strncmp(devtype, "disk", 4))
            r += path_discover(pathvec, conf, udevice, flag);
        udev_device_unref(udevice);
    }
    condlog(4, "Discovery status %d", r);
    return r;                                                         <--- C
    }

I believe upstream commit 646e754853b123a075b4cede7d9ccf540e8c9b0c should fix this.
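To illustrate the difference between the failing behaviour and a tolerant one, here is a minimal, self-contained sketch. It is not the upstream patch and does not use libudev: `fake_device`, `fake_lookup`, `fake_path_discover`, `discover` and the device table are all hypothetical stand-ins invented for this example. The only point it makes is that a device which has vanished between enumeration and lookup can be skipped without adding to the return value.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a udev device; this is NOT the libudev type. */
struct fake_device {
    const char *syspath;
    const char *devtype;
};

/* Hypothetical table of devices that still exist at lookup time. */
static const struct fake_device present_devices[] = {
    { "/sys/block/sda",      "disk" },
    { "/sys/block/sda/sda1", "partition" },
    { "/sys/block/sdc",      "disk" },
};

/* Stand-in for udev_device_new_from_syspath(): NULL if the device is gone. */
static const struct fake_device *fake_lookup(const char *syspath)
{
    size_t i;
    size_t n = sizeof(present_devices) / sizeof(present_devices[0]);

    for (i = 0; i < n; i++)
        if (strcmp(present_devices[i].syspath, syspath) == 0)
            return &present_devices[i];
    return NULL;
}

/* Stand-in for path_discover(): counts the path and always succeeds. */
static int fake_path_discover(const struct fake_device *dev, int *found)
{
    (void)dev;
    (*found)++;
    return 0;
}

/*
 * Walk a previously enumerated list of syspaths.
 * strict != 0: a missing device adds to the return value (the behaviour
 *              reported in this bug).
 * strict == 0: a missing device is skipped and does not affect the result.
 */
int discover(const char *const *syspaths, size_t n, int strict, int *found)
{
    int r = 0;
    size_t i;

    *found = 0;
    for (i = 0; i < n; i++) {
        const struct fake_device *dev = fake_lookup(syspaths[i]);

        if (!dev) {
            if (strict)
                r++;    /* one vanished device marks the whole run as failed */
            continue;   /* either way, move on to the next device */
        }
        if (dev->devtype && !strncmp(dev->devtype, "disk", 4))
            r += fake_path_discover(dev, found);
    }
    return r;
}
```

Called with a list that contains one path missing from the table, the strict mode returns a non-zero status even though every remaining disk was discovered, while the tolerant mode returns 0 with the same set of discovered disks.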
I've backported the upstream fix.
Ben, yes, the customer can use device-mapper-multipath-0.4.9-88.el7. I will remove the zstream flag. Thanks.
What is the ETA for this fix for both Red Hat 6.8 and 7.2?
The rhel-7.2 zstream for this bug is Bug 1328515, which has already been released. There is currently no rhel-6 version of this bug. I can create one, but rhel-6.8 has been out for a while, and rhel-6.9 is not yet in the planning phase. You'd need to talk to a support person about getting a 6.8 zstream.
The rhel-6 bug is Bug 1343747
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2016-2536.html