Bug 1402426 - systemd-udevd inotify_add_watch7 failed for multipath partitions
Summary: systemd-udevd inotify_add_watch7 failed for multipath partitions
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: device-mapper-multipath
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: 7.2
Assignee: Ben Marzinski
QA Contact: Lin Li
URL:
Whiteboard:
Depends On:
Blocks: 1420851 1473733
 
Reported: 2016-12-07 14:28 UTC by Jonathan Edwards
Modified: 2024-06-13 20:45 UTC
CC List: 31 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-10-15 15:16:13 UTC
Target Upstream Version:
Embargoed:


Attachments
inotify_add_watch journalctl with udev debug (1.64 MB, text/x-vhdl)
2016-12-07 17:34 UTC, John Pittman
inotify_add_watch journalctl with udev and multipathd debug (1.59 MB, text/x-vhdl)
2016-12-07 17:45 UTC, John Pittman
patch to stop multipath from deleting path device partitions. (1.62 KB, patch)
2017-04-27 18:07 UTC, Ben Marzinski


Links
Red Hat Knowledge Base (Solution) 2890091, last updated 2017-03-17 06:38:36 UTC

Description Jonathan Edwards 2016-12-07 14:28:04 UTC
This appears to be a race condition when multipath gets set up: spurious messages from inotify in udev appear against all the partitions of a disk as they are being deleted by partx.

systemd-udevd[558]: inotify_add_watch(7, /dev/sdc1, 10) failed: No such file

A longer trace from systemd-udevd shows it occurring after /sbin/multipath -c /dev/sdc and /sbin/partx -d --nr 1-1024 /dev/sdc:

Dec 06 17:09:38 localhost.localdomain systemd-udevd[575]: starting 'scsi_id --export --whitelisted -d /dev/sda'
Dec 06 17:09:38 localhost.localdomain systemd-udevd[578]: starting 'scsi_id --export --whitelisted -d /dev/sdc'
Dec 06 17:09:38 localhost.localdomain systemd-udevd[582]: starting '/sbin/multipath -c /dev/sr0'
Dec 06 17:09:38 localhost.localdomain systemd-udevd[579]: starting 'scsi_id --export --whitelisted -d /dev/sdb'
Dec 06 17:09:38 localhost.localdomain systemd[1]: Found device QEMU_HARDDISK 1.
Dec 06 17:09:38 localhost.localdomain systemd[1]: Created slice system-lvm2\x2dpvscan.slice.
Dec 06 17:09:38 localhost.localdomain systemd[1]: Starting system-lvm2\x2dpvscan.slice.
Dec 06 17:09:38 localhost.localdomain systemd[1]: Starting LVM2 PV scan on device 8:50...
Dec 06 17:09:38 localhost.localdomain systemd-udevd[585]: starting '/sbin/multipath -c /dev/sdc'
Dec 06 17:09:38 localhost.localdomain systemd-udevd[586]: starting '/sbin/multipath -c /dev/sda'
Dec 06 17:09:38 localhost.localdomain systemd-udevd[589]: starting '/sbin/partx -d --nr 1-1024 /dev/sdc'
Dec 06 17:09:38 localhost.localdomain systemd-udevd[588]: starting '/sbin/multipath -c /dev/sdb'
Dec 06 17:09:38 localhost.localdomain systemd-udevd[593]: starting '/sbin/partx -d --nr 1-1024 /dev/sdb'
Dec 06 17:09:38 localhost.localdomain kernel: AES CTR mode by8 optimization enabled
Dec 06 17:09:38 localhost.localdomain systemd-udevd[595]: starting '/usr/bin/vmmouse_detect'
Dec 06 17:09:38 localhost.localdomain systemd-udevd[558]: inotify_add_watch(7, /dev/sdc1, 10) failed: No such fi
Dec 06 17:09:38 localhost.localdomain systemd-udevd[592]: starting '/sbin/partx -d --nr 1-1024 /dev/sda'
Dec 06 17:09:38 localhost.localdomain systemd-udevd[594]: starting '/bin/systemctl try-restart kdump.service'
Dec 06 17:09:38 localhost.localdomain systemd-udevd[552]: inotify_add_watch(7, /dev/sdb1, 10) failed: No such fi
Dec 06 17:09:38 localhost.localdomain kernel: alg: No test for __gcm-aes-aesni (__driver-gcm-aes-aesni)
Dec 06 17:09:38 localhost.localdomain systemd-udevd[596]: starting '/bin/systemctl try-restart kdump.service'
Dec 06 17:09:38 localhost.localdomain kernel: alg: No test for crc32 (crc32-pclmul)
Dec 06 17:09:38 localhost.localdomain kernel: intel_rapl: no valid rapl domains found in package 0
Dec 06 17:09:38 localhost.localdomain kernel: intel_powerclamp: No package C-state available
Dec 06 17:09:38 localhost.localdomain kernel: intel_powerclamp: No package C-state available
Dec 06 17:09:38 localhost.localdomain systemd-udevd[555]: inotify_add_watch(7, /dev/sda1, 10) failed: No such fi
Dec 06 17:09:38 localhost.localdomain lvm[497]: WARNING: Failed to get udev device handler for device /dev/sda1.
Dec 06 17:09:38 localhost.localdomain lvm[497]: /dev/sda1: stat failed: No such file or directory
Dec 06 17:09:38 localhost.localdomain lvm[497]: Path /dev/sda1 no longer valid for device(8,1)
Dec 06 17:09:38 localhost.localdomain systemd-udevd[604]: starting '/bin/sh -c 'ethtool -i $1 | sed -n s/^dri

Comment 2 John Pittman 2016-12-07 17:34:38 UTC
Created attachment 1229168 [details]
inotify_add_watch journalctl with udev debug

Comment 3 John Pittman 2016-12-07 17:45:23 UTC
Created attachment 1229183 [details]
inotify_add_watch journalctl with udev and multipathd debug

Comment 4 John Pittman 2016-12-07 17:47:34 UTC
Issue is only reproducible with device-mapper-multipath-0.4.9-99.el7.x86_64.  Downgrading the package to the latest 7.2 level (device-mapper-multipath-0.4.9-85.el7_2.6.x86_64) seems to work around the issue.  As a test I used the old 62-multipath.rules file with the new package and the issue persisted.  I also tried the 7.3 kpartx package with the 7.2 device-mapper-multipath package and the issue persisted as well.  Attached are two 7.3 journalctl files, one with udev debug enabled and one with udev & multipathd debug enabled.
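
(For reference, one way to run the downgrade test described above; the exact set of subpackages that has to move together is an assumption and may differ per system:)

yum downgrade device-mapper-multipath-0.4.9-85.el7_2.6 \
              device-mapper-multipath-libs-0.4.9-85.el7_2.6
# kpartx from the same build may also need to be listed if yum reports dependency problems
reboot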

Comment 5 John Pittman 2016-12-07 21:01:19 UTC
Attempting to isolate this further through a loose bisection of the patches, I was able to narrow things down to a few patches.

0166-RHBZ-1323429-dont-allow-new-wwid.patch
0167-RHBZ-1335176-fix-show-cmds.patch
0168-RHBZ-1347769-shared-lock.patch

Package was built with edited spec file, new rpms were force installed (minus the debug package), and the system was rebooted.
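
(For anyone who wants to repeat this, a rough sketch of one way to do such a rebuild; the exact spec edits, build dependencies, and source NVR are assumptions and not necessarily what was done here:)

yumdownloader --source device-mapper-multipath          # fetch the src.rpm (requires yum-utils)
rpm -ivh device-mapper-multipath-0.4.9-99.el7.src.rpm
yum-builddep ~/rpmbuild/SPECS/device-mapper-multipath.spec
# comment out the Patch/%patch lines being bisected in the spec file, then:
rpmbuild -ba ~/rpmbuild/SPECS/device-mapper-multipath.spec
rpm -Uvh --force ~/rpmbuild/RPMS/x86_64/device-mapper-multipath*.rpm   # skip the debuginfo package
reboot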

With patches 166 - 188 commented out, I was not able to reproduce the issue.

uncomment 166 - issue does not occur

uncomment 167 - issue does occur once on 1st reboot
  1st reboot repro
  2nd reboot no repro
  3rd reboot no repro
  4th reboot no repro
  5th reboot no repro

uncomment 168 - issue occurs every time

Comment 17 Peter Rajnoha 2017-04-04 11:26:58 UTC
Based on the logs, the problem here is that we're processing a uevent for a device and the processing is still not finished while the device is removed in parallel (by another udevd worker here, as seen in this report, but I don't think that's important; the device can be removed at any time by anything running in parallel, not only another udevd worker: it can be removed from any script or directly by the user).

When the udevd worker finishes processing, it tries to update the udev database and also set an inotify watch for the device, but the device was removed during processing, so we end up with the error/warning messages.

We can never assume the device is still there when we finish rule processing for that device. The only thing we can do is decrease the level of the warning/error messages produced in this situation. The system is dynamic: devices can appear and disappear at any time, even during udevd processing.

This problem can be easily simulated:

1) add /etc/udev/rules.d/50-delay.rules with this content:

  RUN+="/usr/bin/sleep 10"

2) udevadm control --reload

3) echo add > /sys/block/sda/uevent (or any other device)

4) echo 1 > /sys/block/sda/device/delete (do this within the sleep time, which is 10 s in our example here)

5) wait for the sleep to finish (10 s here)

6) look at the logs; you'll see something like: Apr 04 13:18:12 rhel7-a systemd-udevd[1375]: inotify_add_watch(7, /dev/sda, 10) failed: No such file or directory
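
(Putting the steps above into one place, a rough script form of the same simulation; /dev/sda and the 10 s sleep are just the example values from the steps, and deleting the device is destructive to anything using it, so only try this against a disposable test disk:)

cat > /etc/udev/rules.d/50-delay.rules <<'EOF'
RUN+="/usr/bin/sleep 10"
EOF
udevadm control --reload

echo add > /sys/block/sda/uevent         # trigger an "add" uevent for the disk
sleep 2                                  # give the udevd worker time to start processing
echo 1 > /sys/block/sda/device/delete    # remove the device while the event is still being processed
sleep 15                                 # wait out the artificial 10 s delay

journalctl -b | grep inotify_add_watch   # the warning should show up here
rm -f /etc/udev/rules.d/50-delay.rules; udevadm control --reload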



(The errors from lvm2 we see in the report from comment #1:

Dec 06 17:09:38 localhost.localdomain lvm[497]: WARNING: Failed to get udev device handler for device /dev/sda1.

...are just a consequence, because lvm2 reads the udev db and the udev db simply does not reflect the current state where the device has already been removed. The udev db still contains the device when lvm2 reads it; it is updated only after the REMOVE event is processed.)

Comment 18 Peter Rajnoha 2017-04-04 11:33:32 UTC
This is not a bug in device-mapper-multipath. The rules for removing the partitions (partx -d ...) just made the behavior described in comment #17 more visible, because the device/partition is now removed automatically while another udevd worker can still be in the middle of processing that device/partition.

Normally, the device is either removed manually, in which case the chance of removing it while another event for it is being processed is very small, or it is removed because it gets unplugged; again, the chances of running into the situation described in comment #17 are small.

So this problem was always here; it is just made more visible now by the addition of the partx -d call.

Comment 19 Jonathan Earl Brassow 2017-04-07 21:42:32 UTC
Is it just a scary message then, or does something need to be done?  It's sounding to me like the former...

Comment 20 Sang Woon Kim 2017-04-10 01:55:36 UTC
Hi, I understand what you explained in comments 17 and 18.
But here (at the customer site) there are critical issues with Oracle CRS and the partition table being removed when adding or removing a SAN path.

If Oracle CRS checks a partition table that sits on SAN paths managed by multipathd/lvm2 and fails to read some partition tables while health checking during a SAN path add or remove, CRS forces a reboot of the system by itself, because CRS treats this as an emergency such as a lost path.

So we need to find the best way to keep the partition table from being lost, to prevent Oracle CRS from rebooting the system whenever a SAN path is added or removed.

Are there any good ideas?

Thanks and Best Regard,
Cloud(SangWoon),Kim
Red Hat Platform Consult

Comment 21 Peter Rajnoha 2017-04-10 11:06:34 UTC
(In reply to Jonathan Earl Brassow from comment #19)
> Is it just a scary message then, or does something need to be done?  It's
> sounding to me like the former...

As for the warning messages: yes, it's just a message. We can't avoid it completely, because a device can be removed at any time, even right while udevd is processing the event and the udev rules are being applied. So from this point of view, I'd probably suggest lowering the message severity level to "debug" only, instead of "warning" or "error", if possible.

Comment 22 Peter Rajnoha 2017-04-10 11:24:37 UTC
(In reply to Sang Woon Kim from comment #20)
> Hi, I understand what you explained in comments 17 and 18.
> But here (at the customer site) there are critical issues with Oracle CRS
> and the partition table being removed when adding or removing a SAN path.
> 
> If Oracle CRS checks a partition table that sits on SAN paths managed by
> multipathd/lvm2 and fails to read some partition tables while health
> checking during a SAN path add or remove, CRS forces a reboot of the
> system by itself, because CRS treats this as an emergency such as a lost
> path.
> 
> So we need to find the best way to keep the partition table from being
> lost, to prevent Oracle CRS from rebooting the system whenever a SAN path
> is added or removed.
> 

This sounds like an improper assumption by Oracle CRS. I think those partitions are removed because we want users to work with the partitions on top of the top-level mpath device instead of the path components, but I'm not 100% sure; asking Ben to confirm (...or whether there's any other reason the partx -d is called).

Comment 23 Ben Marzinski 2017-04-10 14:33:02 UTC
The only reason why we do this is to make sure users work with the kpartx-created partition devices, and not the scsi partitions.  People kept getting this wrong, and so it seemed easier if these devices just weren't around.  That's why the rules will only delete the partitions one time.  If the user recreates the partitions, it leaves them alone.

It might be better to actually remove the partitions as they come up: instead of removing all the partitions during udev processing for the whole scsi device, I could remove them individually with a RUN rule during processing of the partition itself. That way the partition couldn't get removed while it was being processed.

It's not crucial that we remove the partitions. We could simply mark them as not ready in the udev rules, and that should take care of most automated things that might use them. But changing it now would confuse customers who are not expecting to see them on their multipathed scsi devices, and it does seem to help avoid user error with using multipath devices.

Comment 24 Jonathan Edwards 2017-04-10 19:01:38 UTC
confused about the logging level comment (comment 21) to strip the complaint in systemd-udevd .. wouldn't debug be a higher severity logging level than warn or error?

Comment 25 Sang Woon Kim 2017-04-11 00:56:38 UTC
How can we explain to customers why SAN block I/O hangs, with inotify_add_watch messages, when SAN paths are added back by powering one of the four SAN switches off and on?

The customer uses the multipath options below.

device {
               vendor "HITACHI"
               product "OPEN-.*"
               path_grouping_policy multibus
               path_selector "round-robin 0"
               path_checker tur
               "1 queue_if_no_path"
               hardware_handler "0"
               prio const
               rr_weight uniform
               no_path_retry 6 
               rr_min_io 1000
               rr_min_io_rq 1
       }

Thanks and Best Regard,
Cloud(SangWoon),Kim
Red Hat Platform Consult

Comment 28 Ben Marzinski 2017-04-27 17:02:02 UTC
I don't think that any of these issues are directly related to the patches in comment 5. Those simply change the timing of things, which can change the outcome of races.  I'm also not sure how removing partitions on a path device could cause IO to a multipath device to hang.  Fixing the issue by clearing the page cache also points away from a block layer issue, since the block layer sits below the page cache. Once a request hits device-mapper, it has already made it past the page cache, and I don't see how the page cache could affect the request anymore.

But just to make sure that removing the partition devices isn't causing an issue, I'll post a patch that you can try, which will stop multipath from removing partitions from its path devices, and will instead set SYSTEMD_READY to 0 for them, so that they are ignored in favor of the kpartx devices.
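
(For illustration only, and not the contents of the attached patch: the general shape of such a change in a udev rules file is to replace the partition-deleting RUN command with a SYSTEMD_READY override. The match keys below, including DM_MULTIPATH_DEVICE_PATH, are assumptions about how the shipped rules identify path devices:)

# before (roughly): the whole-disk rule deletes the scsi partitions of a multipath path device
#   ENV{DM_MULTIPATH_DEVICE_PATH}=="1", RUN+="/sbin/partx -d --nr 1-1024 $env{DEVNAME}"
# after: leave the partitions in place but mark them not ready so other consumers ignore them
ENV{DEVTYPE}=="partition", IMPORT{parent}="DM_MULTIPATH_DEVICE_PATH", ENV{DM_MULTIPATH_DEVICE_PATH}=="1", ENV{SYSTEMD_READY}="0"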

Comment 29 Ben Marzinski 2017-04-27 18:07:34 UTC
Created attachment 1274753 [details]
patch to stop multipath from deleting path device partitions.

You can apply this patch with

# patch -d /lib/udev/rules.d/ -p1 < multipath-rules.patch

It will stop /lib/udev/rules.d/62-multipath.rules from removing the partition device nodes for the paths of multipath devices. With this patch applied, you should no longer see the inotify udev messages.  Do you still see IO hangs when you run your tests with this applied?
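
(If it helps to check what the changed rules will actually do before rebooting, something along these lines should work; sdc and sdc1 are just example device names from this report:)

udevadm control --reload
udevadm test /sys/class/block/sdc 2>&1 | grep -i partx            # the partx -d RUN entry should no longer appear
udevadm test /sys/class/block/sdc1 2>&1 | grep -i systemd_ready   # the partition should now carry SYSTEMD_READY=0
# note: udevadm test only simulates the event and does not execute RUN programs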

Comment 43 Ben Marzinski 2017-09-27 14:17:01 UTC
Has anyone run the test I mentioned in comment 29? This bug isn't going to make forward progress without more information.

Comment 58 Rakesh 2017-12-30 17:29:13 UTC
On a server with multipath partitions, I had the same issue.

After applying the patch proposed by Ben Marzinski, the systemd-udevd inotify_add_watch messages disappeared.

Comment 60 Ben Marzinski 2018-02-27 15:45:58 UTC
I'm moving this bug out from rhel-7.5. As has been mentioned above, these inotify_add_watch messages should not affect anything. On the other hand, it should be completely safe to apply my patch to remove them, as long as you realize that these partition devices should never be used directly. Instead, you should use the kpartx-created partition devices.

If you are worried that these messages are correlated with an actual problem, you should apply my patch to remove them, and see if your problem is resolved. If that is the case, please let me know.

Comment 70 Ben Marzinski 2018-10-15 15:16:13 UTC
I am closing this bug. The messages about paths being removed do not represent a problem; the removal is intentional. Further, no one who has tried my udev rules patch has reported any change other than the scsi partitions no longer being removed.  But we specifically remove these partitions because leaving them in has caused many issues with customers using them instead of the partition devices on top of multipath.

If someone can come up with a good reason why this bug should not be closed, please re-open it with an explanation of an actual problem caused by removing the scsi partitions.

