This bug has been copied from bug #1691277 and has been proposed to be backported to 7.6 z-stream (EUS).
I've not been able to reproduce this with or without lvmetad. I suspect it could be a side effect of the udev problems, since the EMFILE errors first appear immediately after those udev issues. Could you try setting obtain_device_list_from_udev=0 in lvm.conf and see if this still happens?
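For reference, a minimal sketch of what that change could look like, assuming the setting sits in the devices section of /etc/lvm/lvm.conf as it does in the stock configuration (the path and surrounding layout are assumptions about the test system, not taken from this report):

```
# /etc/lvm/lvm.conf (fragment; other settings left unchanged)
devices {
    # 0 = build the device list by scanning /dev directly
    #     instead of asking udev for it
    obtain_device_list_from_udev = 0
}
```

Running `lvmconfig devices/obtain_device_list_from_udev` afterwards should show whether the new value is in effect.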
There are other commits in the stable branch that fix this issue (so stable and 7.7 do not have this problem.) So, it looks like your testing has validated the original fix for bug 1691277 (there's no more problem when not using lvmetad), but has also uncovered other fixes that would be needed to make lvmetad work with this many devices.

One or more of the following commits from stable would need to be backported to 7.6.z, but it's not clear how many of them can be cherry-picked directly. Some may depend on other unrelated changes. This may become more backporting than is appropriate for zstream.

commit 9799c8da07b77844451c64bcbbce0d9d43ce2552
Author: David Teigland <teigland>
Date:   Tue Nov 6 16:03:17 2018 -0600

    devices: reuse bcache fd when getting block size

    This avoids an unnecessary open() on the device.

commit f7ffba204e06ae432ae2c7943cb41eec5b8e8bb1
Author: David Teigland <teigland>
Date:   Tue Jun 26 12:05:39 2018 -0500

    devs: use bcache fd for read ahead ioctl

    to avoid an unnecessary open of the device
    in most cases.

commit 73578e36faa78c616716617a83083cc3a31ba03f
Author: David Teigland <teigland>
Date:   Fri May 11 14:28:46 2018 -0500

    dev_cache: remove the lvmcache check when closing fd

    This is no longer used since devices are not held
    open in dev_cache.

commit 3e3cb22f2a115f71f883a75c7840ab271bd83454
Author: David Teigland <teigland>
Date:   Fri May 11 14:25:08 2018 -0500

    dev_cache: fix close in utility functions

    All these functions are now used as utilities,
    e.g. for ioctl (not for io), and need to
    open/close the device each time they are called.
    (Many of the opens can probably be eliminated by
    just using the bcache fd for the ioctl.)

commit ccab54677c9f92cf1bd11895251799c043a57602
Author: David Teigland <teigland>
Date:   Fri May 11 13:53:19 2018 -0500

    dev_cache: fix close in dev_get_block_size
To summarize, when there are many PVs in the system:

- async io, supposed to speed things up, cannot be used because of Bug 1656498,
- and lvmetad, supposed to speed things up, cannot be used because of this bug.

So sync io without lvmetad is the only option. Isn't that hurting performance badly?
(In reply to Marian Csontos from comment #9)
> To summarize, when there are many PVs in the system:
>
> - async io, supposed to speed things up, cannot be used because of Bug 1656498,

That shouldn't be related to the number of devices. It's caused by other unknown software that's using all the aio contexts. A user can simply increase the number of aio contexts on the system if they want lvm to use aio instead of falling back to sync io.

> - and lvmetad, supposed to speed things up, cannot be used because of this bug.

Just to clarify, this specific bug appears to be fixed, but there are other issues mentioned in comment 8 that will cause similar problems at around 1000 devices. That issue can also be avoided by simply increasing the open fd limit. If this is a problem we could open a new bug to do zstream backports of some of the other commits in comment 8.

> So sync io without lvmetad is the only option. Isn't that hurting performance badly?
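As a rough illustration of the two limits discussed in this comment, here is a small Python sketch that inspects the per-process open fd limit and the system-wide aio context counters. It assumes a Linux system where the kernel exposes /proc/sys/fs/aio-nr and /proc/sys/fs/aio-max-nr; the function names are made up for this example and are not part of lvm.

```python
import resource
from pathlib import Path
from typing import Optional, Tuple


def open_fd_limits() -> Tuple[int, int]:
    """Return the (soft, hard) RLIMIT_NOFILE values for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def raise_soft_fd_limit(target: int) -> int:
    """Raise the soft open-fd limit toward target, capped at the hard limit.

    Returns the soft limit in effect afterwards. Never lowers the limit.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    if soft == resource.RLIM_INFINITY or target <= soft:
        return soft
    # May still fail if target exceeds a kernel-wide ceiling.
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return target


def read_proc_int(path: str) -> Optional[int]:
    """Read a single integer from a /proc file, or None if unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def aio_usage() -> Tuple[Optional[int], Optional[int]]:
    """Return (in use, maximum) for system-wide aio, as reported by the kernel."""
    return (
        read_proc_int("/proc/sys/fs/aio-nr"),
        read_proc_int("/proc/sys/fs/aio-max-nr"),
    )


if __name__ == "__main__":
    soft, hard = open_fd_limits()
    print(f"open fd limit: soft={soft} hard={hard}")
    used, maximum = aio_usage()
    print(f"aio: in use={used} max={maximum}")
```

If aio-nr is close to aio-max-nr, some other software is likely consuming the aio contexts, and an administrator could raise fs.aio-max-nr with sysctl. Likewise, a soft fd limit near 1024 would be consistent with problems appearing at around 1000 devices.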
Martin, are you happy with the above explanation?
David, could you provide a doc string, please?
Hello, when are we going to release the patch for this?
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0814