Description of problem:
This is a regression scenario for bug 1691277. Basically create >1024 lvm thin vols and create a PV on top of each. I'll try and grab a valid core and bt.

[...]
1010  lvcreate -y -k n -s /dev/snapper_thinp/origin -n many_1010
1011  lvcreate -y -k n -s /dev/snapper_thinp/origin -n many_1011
1012  lvcreate -y -k n -s /dev/snapper_thinp/origin -n many_1012
1013  lvcreate -y -k n -s /dev/snapper_thinp/origin -n many_1013

Although the snap create passed, errors were found in its output:
  WARNING: You have not turned on protection against thin pools running out of space.
  WARNING: Set activation/thin_pool_autoextend_threshold below 100 to trigger automatic extension of thin pools before they get full.
  WARNING: Sum of all thin volume sizes (1.99 TiB) exceeds the size of thin pool snapper_thinp/POOL (2.00 GiB).
  /run/dmeventd-client: open failed: Too many open files
  WARNING: Failed to monitor snapper_thinp/POOL.
  Logical volume "many_1013" created.

[root@hayes-01 ~]# pvcreate --config devices/scan_lvs=1 /dev/snapper_thinp/many_1013
  SELinux context reset: setfscreatecon failed: Too many open files
  Physical volume "/dev/snapper_thinp/many_1013" successfully created.

[root@hayes-01 ~]# ulimit -n 2048 && pvcreate --config devices/scan_lvs=1 /dev/snapper_thinp/many_1013
  Physical volume "/dev/snapper_thinp/many_1013" successfully created.

[root@hayes-01 ~]# lvcreate --config devices/scan_lvs=1 -k n -s /dev/snapper_thinp/origin -n many_1014
  WARNING: Sum of all thin volume sizes (1.99 TiB) exceeds the size of thin pool snapper_thinp/POOL (2.00 GiB).
  WARNING: You have not turned on protection against thin pools running out of space.
  WARNING: Set activation/thin_pool_autoextend_threshold below 100 to trigger automatic extension of thin pools before they get full.
*** buffer overflow detected ***: lvcreate terminated
Aborted (core dumped)

Jul 16 18:06:24 hayes-01 systemd[1]: Started Process Core Dump (PID 23044/UID 0).
Jul 16 18:06:24 hayes-01 systemd-coredump[23045]: Process 23043 (lvcreate) of user 0 dumped core.

Stack trace of thread 23043:
#0  0x00007f67f23698af raise (libc.so.6)
#1  0x00007f67f2353cc5 abort (libc.so.6)
#2  0x00007f67f23acbe7 __libc_message (libc.so.6)
#3  0x00007f67f243fe95 __GI___fortify_fail_abort (libc.so.6)
#4  0x00007f67f243fec7 __fortify_fail (libc.so.6)
#5  0x00007f67f243de86 __chk_fail (libc.so.6)
#6  0x00007f67f243fdbb __fdelt_chk (libc.so.6)
#7  0x00007f67f3d1e608 _daemon_write.isra.0 (libdevmapper-event.so.1.02)
#8  0x00007f67f3d1ee68 daemon_talk (libdevmapper-event.so.1.02)
#9  0x00007f67f3d1f7b3 _do_event (libdevmapper-event.so.1.02)
#10 0x00007f67f3d1fd64 dm_event_get_registered_device (libdevmapper-event.so.1.02)
#11 0x000055fb3e0f29b2 monitor_dev_for_events (lvm)
#12 0x000055fb3e0f28d8 monitor_dev_for_events (lvm)
#13 0x000055fb3e0f3d8c lv_suspend_if_active (lvm)
#14 0x000055fb3e1402ea _lv_create_an_lv (lvm)
#15 0x000055fb3e1408d4 lv_create_single (lvm)
#16 0x000055fb3e0b9f28 _lvcreate_single (lvm)
#17 0x000055fb3e0dcb2a process_each_vg (lvm)
#18 0x000055fb3e0bba33 lvcreate (lvm)
#19 0x000055fb3e0c418f lvm_run_command (lvm)
#20 0x000055fb3e0c54c3 lvm2_main (lvm)
#21 0x00007f67f2355843 __libc_start_main (libc.so.6)
#22 0x000055fb3e0a1bae _start (lvm)

Version-Release number of selected component (if applicable):
kernel-4.18.0-114.el8                        BUILT: Wed Jul 10 10:15:20 CDT 2019
lvm2-2.03.05-1.el8                           BUILT: Mon Jun 17 05:59:47 CDT 2019
lvm2-libs-2.03.05-1.el8                      BUILT: Mon Jun 17 05:59:47 CDT 2019
lvm2-dbusd-2.03.05-1.el8                     BUILT: Mon Jun 17 06:01:56 CDT 2019
lvm2-lockd-2.03.05-1.el8                     BUILT: Mon Jun 17 05:59:47 CDT 2019
device-mapper-1.02.163-1.el8                 BUILT: Mon Jun 17 05:59:47 CDT 2019
device-mapper-libs-1.02.163-1.el8            BUILT: Mon Jun 17 05:59:47 CDT 2019
device-mapper-event-1.02.163-1.el8           BUILT: Mon Jun 17 05:59:47 CDT 2019
device-mapper-event-libs-1.02.163-1.el8      BUILT: Mon Jun 17 05:59:47 CDT 2019
device-mapper-persistent-data-0.8.5-2.el8    BUILT: Wed Jun 5 10:28:04 CDT 2019
Jul 16 18:23:20 hayes-01 systemd[1]: Started ABRT Automated Bug Reporting Tool.
Jul 16 18:23:32 hayes-01 systemd[1]: Started Process Core Dump (PID 24123/UID 0).
Jul 16 18:23:32 hayes-01 systemd-coredump[24124]: Resource limits disable core dumping for process 24121 (lvcreate).
Jul 16 18:23:32 hayes-01 systemd-coredump[24124]: Process 24121 (lvcreate) of user 0 dumped core.

Might be a catch-22 here. I can't trip the bug without going over the fd limit, and I can't get a valid core unless the fd limit isn't hit.
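For reference, the failing scenario above boils down to a loop along these lines (a sketch only; it assumes the snapper_thinp VG with its POOL thin pool and origin thin volume already exist, as in the transcript, and the count is just "more than the fd limit"):

  # Create >1024 thin snapshots of the origin and put a PV on each one.
  # devices/scan_lvs=1 lets lvm scan LVs as potential PVs.
  for i in `seq 1025`; do
      lvcreate -y -k n -s /dev/snapper_thinp/origin -n many_$i
      pvcreate --config devices/scan_lvs=1 /dev/snapper_thinp/many_$i
  done

Once the command's open fd count crosses the soft limit (1024 by default), the "Too many open files" warnings and eventually the buffer overflow abort show up.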
*** Bug 1739108 has been marked as a duplicate of this bug. ***
Any caller of open() should clearly handle EMFILE errors, although it'll probably be a bit of whack-a-mole for a while (it looks like this one is in the libdevmapper library). Even in cases where open failures are handled, we'll be testing error paths that may not have been tested before. In this sense it's a useful exercise. (We can also test this by lowering the ulimit to a very artificially low value and trying to use lvm with fewer devices.)

In practice, we never want to get to a place where we hit EMFILE errors. We're currently hitting EMFILE between 900 and 1000 PVs (given the 1024 fd limit), and that seems a little low for some customers. I've addressed this with the upstream commit below, in which lvm raises its soft open file limit (1024) up to the hard limit (4096). That means lvm will by default handle around 4000 PVs without the user needing to configure new ulimits.

https://sourceware.org/git/?p=lvm2.git;a=commitdiff;h=ecefcc9ca89982dbc5cc6b210da7ca9d0fef655b
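A quick way to see the two limits involved from the shell (this just illustrates the soft/hard distinction; it is not how the lvm fix itself is implemented):

  # Soft limit: what a process gets by default (historically 1024).
  ulimit -Sn
  # Hard limit: the ceiling a process may raise itself to (typically 4096).
  ulimit -Hn
  # On builds without the fix, a session-level workaround is to raise the
  # soft limit up to the hard limit before running lvm commands:
  ulimit -Sn `ulimit -Hn`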
This is fixed in 8.3. The -vvvv output will show what lvm estimates it needs for the limit and what it can actually get.

This can't be completely fixed, because there will always be some limit beyond which it can be forced to fail. At that point this bug no longer applies, and a new set of bugs would be needed to address all the places in the code where EMFILE is possible. That's a grey area of fixability, approaching ENOMEM-style fixes.
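One way to capture that verbose output for inspection (the LV name and log path here are just placeholders, reusing the devices from the transcript above):

  # Re-run the failing command with -vvvv; the debug trace, including the
  # fd-limit estimation messages, goes to stderr.
  lvcreate -vvvv --config devices/scan_lvs=1 -k n -s /dev/snapper_thinp/origin \
      -n many_test 2> /tmp/lvcreate-vvvv.log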
Created attachment 1723782 [details] verbose lvcreate that ended up failing
This is not a general problem related to ulimit and many PVs; it must be a specific issue with the way thin is being used, in which extra fds are being created. This bz should be considered fixed; let's open a new bz showing the complete sequence of steps that reproduces the issue in comment 6. Then we can track down where the code is unexpectedly consuming fds.
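When chasing that down, a simple way to watch a command's fd consumption from another terminal (a sketch; it assumes a single lvcreate is running):

  # Count and list the fds held by the running lvcreate to see what is
  # accumulating (device nodes, fifos, sockets, ...).
  ls /proc/$(pgrep -n lvcreate)/fd | wc -l
  lsof -p $(pgrep -n lvcreate)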
The current design of lvm is to keep an fd open for each LVM PV [1]. This means the user must ensure that the hard ulimit for max fds is greater, by some margin, than the number of PVs being used.

It's not clear from the comments above what the exact combination of number of PVs, ulimit -n setting, and command failure is. I'm guessing that the ulimit setting is in the vicinity of the number of PVs? If that's not correct I'll need to see the full sequence of commands to diagnose further. When I set the ulimit well above the number of PVs I cannot reproduce an error like this.

[1] We intend to change this at some point, which will make this harder to hit.
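A rough sanity check of that margin might look like this (illustrative only; the margin needed for misc fds is not an exact number):

  # Compare the number of PVs lvm will hold open against the per-process fd limits.
  pvs --noheadings | wc -l    # number of PVs
  ulimit -Sn                  # soft fd limit for this shell
  ulimit -Hn                  # hard fd limit lvm can raise itself to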
If this is not the same bug that was reported here, then I'm not sure there is a way to definitively verify this fix, and this would have to be marked verified (SanityOnly).

kernel-4.18.0-271.el8                           BUILT: Fri Jan 8 03:32:43 CST 2021
lvm2-2.03.11-0.4.20201222gitb84a992.el8         BUILT: Tue Dec 22 06:33:49 CST 2020
lvm2-libs-2.03.11-0.4.20201222gitb84a992.el8    BUILT: Tue Dec 22 06:33:49 CST 2020

# With ulimit unlimited, I shouldn't run into a max, correct?
[root@hayes-03 ~]# ulimit unlimited

[root@hayes-03 ~]# ulimit -n 2048 && lvcreate --yes --config devices/scan_lvs=1 -k n -s /dev/snapper_thinp/origin -n many_1006
  WARNING: Sum of all thin volume sizes (<1.98 TiB) exceeds the size of thin pool snapper_thinp/POOL (2.00 GiB).
  WARNING: You have not turned on protection against thin pools running out of space.
  WARNING: Set activation/thin_pool_autoextend_threshold below 100 to trigger automatic extension of thin pools before they get full.
*** buffer overflow detected ***: lvcreate terminated
Aborted (core dumped)

[root@hayes-03 ~]# ulimit -n 2048 && lvcreate --yes --config devices/scan_lvs=1 -k n -s /dev/snapper_thinp/origin -n many_1007
  WARNING: Sum of all thin volume sizes (<1.98 TiB) exceeds the size of thin pool snapper_thinp/POOL (2.00 GiB).
  WARNING: You have not turned on protection against thin pools running out of space.
  WARNING: Set activation/thin_pool_autoextend_threshold below 100 to trigger automatic extension of thin pools before they get full.
*** buffer overflow detected ***: lvcreate terminated
Aborted (core dumped)

[root@hayes-03 ~]# lsof | wc -l
10210

# Setting a much higher ulimit
[root@hayes-03 ~]# ulimit -n 10240 && lvcreate --yes --config devices/scan_lvs=1 -k n -s /dev/snapper_thinp/origin -n many_1008
  WARNING: Sum of all thin volume sizes (1.98 TiB) exceeds the size of thin pool snapper_thinp/POOL (2.00 GiB).
  WARNING: You have not turned on protection against thin pools running out of space.
  WARNING: Set activation/thin_pool_autoextend_threshold below 100 to trigger automatic extension of thin pools before they get full.
*** buffer overflow detected ***: lvcreate terminated
Aborted (core dumped)

# Setting a much higher ulimit
[root@hayes-03 ~]# ulimit -n 11264 && lvcreate --yes --config devices/scan_lvs=1 -k n -s /dev/snapper_thinp/origin -n many_1009
  WARNING: Sum of all thin volume sizes (1.98 TiB) exceeds the size of thin pool snapper_thinp/POOL (2.00 GiB).
  WARNING: You have not turned on protection against thin pools running out of space.
  WARNING: Set activation/thin_pool_autoextend_threshold below 100 to trigger automatic extension of thin pools before they get full.
*** buffer overflow detected ***: lvcreate terminated
Aborted (core dumped)

Core was generated by `lvcreate --yes --config devices/scan_lvs=1 -k n -s /dev/snapper_thinp/origin -n'.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007fab87248db5 in __GI_abort () at abort.c:79
#2  0x00007fab872a14e7 in __libc_message (action=<optimized out>, fmt=fmt@entry=0x7fab873b07fc "*** %s ***: %s terminated\n") at ../sysdeps/posix/libc_fatal.c:181
#3  0x00007fab87334355 in __GI___fortify_fail_abort (need_backtrace=need_backtrace@entry=true, msg=msg@entry=0x7fab873b0779 "buffer overflow detected") at fortify_fail.c:28
#4  0x00007fab87334387 in __GI___fortify_fail (msg=msg@entry=0x7fab873b0779 "buffer overflow detected") at fortify_fail.c:44
#5  0x00007fab87332346 in __GI___chk_fail () at chk_fail.c:28
#6  0x00007fab8733427b in __fdelt_chk (d=<optimized out>) at fdelt_chk.c:25
#7  0x00007fab88d26638 in _daemon_write (msg=msg@entry=0x7ffd6b903e50, fifos=<optimized out>, fifos=<optimized out>) at libdevmapper-event.c:329
#8  0x00007fab88d26ed8 in daemon_talk (fifos=fifos@entry=0x7ffd6b903d10, msg=msg@entry=0x7ffd6b903e50, cmd=cmd@entry=8, dso_name=dso_name@entry=0x0, dev_name=dev_name@entry=0x0, evmask=evmask@entry=0, timeout=0) at libdevmapper-event.c:381
#9  0x00007fab88d27823 in _do_event (cmd=cmd@entry=4, dmeventd_path=<optimized out>, msg=msg@entry=0x7ffd6b903e50, dso_name=0x564f926d81f0 "libdevmapper-event-lvm2thin.so", dev_name=0x564f92754c80 "LVM-9ytrNmdwX3ERdBlpP2BOJ7wQ2Zb1YmM92WzHJLhyMJmE2bWrW8Uzo1ZXcNO1hmMV-tpool", evmask=DM_EVENT_ERROR_MASK, timeout=0) at libdevmapper-event.c:630
#10 0x00007fab88d27dd4 in dm_event_get_registered_device (dmevh=dmevh@entry=0x564f92162ea0, next=next@entry=0) at libdevmapper-event.c:768
#11 0x0000564f8f90a771 in target_registered_with_dmeventd (cmd=0x564f91516a10, dso=<optimized out>, lv=<optimized out>, pending=0x7ffd6b903f58, monitored=0x7ffd6b903f5c) at activate/activate.c:1662
#12 0x0000564f8f90af0d in monitor_dev_for_events (cmd=<optimized out>, lv=0x564f92a45450, laopts=<optimized out>, monitor=<optimized out>) at activate/activate.c:1882
#13 0x0000564f8f90ac88 in monitor_dev_for_events (cmd=cmd@entry=0x564f91516a10, lv=lv@entry=0x564f92a455b8, laopts=<optimized out>, laopts@entry=0x7ffd6b904190, monitor=monitor@entry=1) at activate/activate.c:1848
#14 0x0000564f8f90b7c7 in _lv_resume (cmd=cmd@entry=0x564f91516a10, laopts=laopts@entry=0x7ffd6b904190, error_if_not_active=error_if_not_active@entry=0, lv=lv@entry=0x564f92a455b8, lvid_s=<optimized out>) at activate/activate.c:2333
#15 0x0000564f8f90ca31 in _lv_resume (lvid_s=0x0, lv=0x564f92a455b8, error_if_not_active=0, laopts=0x7ffd6b904190, cmd=0x564f91516a10) at activate/activate.c:2361
#16 lv_resume_if_active (cmd=cmd@entry=0x564f91516a10, lvid_s=lvid_s@entry=0x0, origin_only=origin_only@entry=1, exclusive=exclusive@entry=0, revert=revert@entry=0, lv=0x564f92a455b8) at activate/activate.c:2361
#17 0x0000564f8f90d91c in resume_lv_origin (cmd=cmd@entry=0x564f91516a10, lv=lv@entry=0x564f9268c1e8) at activate/activate.c:2932
#18 0x0000564f8f965ec0 in _lv_create_an_lv (vg=vg@entry=0x564f9268a720, lp=lp@entry=0x7ffd6b904700, new_lv_name=<optimized out>) at metadata/lv_manip.c:8504
#19 0x0000564f8f96670c in lv_create_single (vg=vg@entry=0x564f9268a720, lp=lp@entry=0x7ffd6b904700) at metadata/lv_manip.c:8798
#20 0x0000564f8f8cec08 in _lvcreate_single (cmd=cmd@entry=0x564f91516a10, vg_name=vg_name@entry=0x564f9265c410 "snapper_thinp", vg=vg@entry=0x564f9268a720, handle=handle@entry=0x564f91528288) at lvcreate.c:1752
#21 0x0000564f8f8f3d4a in _process_vgnameid_list (process_single_vg=0x564f8f8cde40 <_lvcreate_single>, handle=0x564f91528288, arg_tags=0x7ffd6b904570, arg_vgnames=0x7ffd6b904580, vgnameids_to_process=0x7ffd6b9045a0, read_flags=1048576, cmd=0x564f91516a10) at toollib.c:1933
#22 process_each_vg (cmd=cmd@entry=0x564f91516a10, argc=argc@entry=0, argv=argv@entry=0x0, one_vgname=<optimized out>, use_vgnames=use_vgnames@entry=0x0, read_flags=read_flags@entry=1048576, include_internal=<optimized out>, handle=0x564f91528288, process_single_vg=<optimized out>) at toollib.c:2238
#23 0x0000564f8f8cfd7e in lvcreate (cmd=0x564f91516a10, argc=<optimized out>, argv=<optimized out>) at lvcreate.c:1797
#24 0x0000564f8f8d6e3d in lvm_run_command (cmd=cmd@entry=0x564f91516a10, argc=<optimized out>, argc@entry=10, argv=<optimized out>, argv@entry=0x7ffd6b904bf8) at lvmcmdline.c:3135
#25 0x0000564f8f8d8153 in lvm2_main (argc=10, argv=0x7ffd6b904bf8) at lvmcmdline.c:3667
#26 0x00007fab8724a493 in __libc_start_main (main=0x564f8f8b1ae0 <main>, argc=10, argv=0x7ffd6b904bf8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffd6b904be8) at ../csu/libc-start.c:314
#27 0x0000564f8f8b1bae in _start () at lvm.c:22
[root@hayes-03]# grep reserved_ /etc/lvm/lvm.conf
	# Configuration option activation/reserved_stack.
	# reserved_stack = 64
	# Configuration option activation/reserved_memory.
	reserved_memory = 131072   # edited by QA test script qe_lvmconf (on Wed Jan 13 16:02:13 CST 2021)!

[root@hayes-03]# cat /proc/meminfo
MemTotal:        263631084 kB
MemFree:         260599272 kB
MemAvailable:    260372304 kB
Buffers:         3280 kB
Cached:          1105320 kB
SwapCached:      0 kB
Active:          364008 kB
Inactive:        941336 kB
Active(anon):    213732 kB
Inactive(anon):  24708 kB
Active(file):    150276 kB
Inactive(file):  916628 kB
Unevictable:     23224 kB
Mlocked:         23224 kB
SwapTotal:       4194300 kB
SwapFree:        4194300 kB
Dirty:           0 kB
Writeback:       0 kB
AnonPages:       212144 kB
Mapped:          112076 kB
Shmem:           31168 kB
KReclaimable:    400484 kB
Slab:            889016 kB
SReclaimable:    400484 kB
SUnreclaim:      488532 kB
KernelStack:     25088 kB
PageTables:      9644 kB
NFS_Unstable:    0 kB
Bounce:          0 kB
WritebackTmp:    0 kB
CommitLimit:     136009840 kB
Committed_AS:    902072 kB
VmallocTotal:    34359738367 kB
VmallocUsed:     0 kB
VmallocChunk:    0 kB
Percpu:          404736 kB
HardwareCorrupted: 0 kB
AnonHugePages:   81920 kB
ShmemHugePages:  0 kB
ShmemPmdMapped:  0 kB
HugePages_Total: 0
HugePages_Free:  0
HugePages_Rsvd:  0
HugePages_Surp:  0
Hugepagesize:    2048 kB
Hugetlb:         0 kB
DirectMap4k:     537300 kB
DirectMap2M:     23347200 kB
DirectMap1G:     246415360 kB
I'm attempting to replicate but I'm not sure if my test is going to be equivalent. The original description says to create >1024 thin LVs, and create a PV on each, so I'm thinking of trying a loop like:

  ulimit -n 1200
  for i in `seq 1100`; do
      lvcreate --type thin -n thinpv$i --thinpool pool -V1G vg
      pvcreate /dev/vg/thinpv$i
      pvs | grep thinpv | wc -l
  done

I'd expect a test with 1100 PVs to work with a ulimit of about 1200, since we need to account for misc fds that are needed for other things. After that loop, I'd try doing a vgcreate on each of those thinpv devs, since we want to ensure that we can handle that number of PVs which actually have metadata on them. From the last comment I'm seeing snapshots and no pvcreate step, so I'm not sure if my test is going to be the same.
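The vgcreate follow-up mentioned above might look something like this (a sketch only; the VG names are arbitrary placeholders, and devices/scan_lvs=1 may be needed here, as it is in the other commands in this bug, so that the thin LVs are scanned as PVs):

  for i in `seq 1100`; do
      vgcreate --config devices/scan_lvs=1 testvg$i /dev/vg/thinpv$i
  done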
PV creates are happening as a part of this scenario, up until the overflow. After that I didn't include them in my bug comment. Here's the test output of what was happening prior to the failure:

[...]
999   ulimit -n 3072 && lvcreate --yes --config devices/scan_lvs=1 -k n -s /dev/snapper_thinp/origin -n many_999
        WARNING: Sum of all thin volume sizes (1.96 TiB) exceeds the size of thin pool snapper_thinp/POOL (2.00 GiB).
      ulimit -n 3072 && lvchange --config --addtag 'OVIRT_VOL_INITIALIZING_999' /dev/snapper_thinp/many_999
      ulimit -n 3072 && pvcreate --config devices/scan_lvs=1 /dev/snapper_thinp/many_999
        WARNING: Heartbeat did not respond at start of command. Check btimed service on remote host
1000  ulimit -n 3072 && lvcreate --yes --config devices/scan_lvs=1 -k n -s /dev/snapper_thinp/origin -n many_1000
        WARNING: Sum of all thin volume sizes (1.96 TiB) exceeds the size of thin pool snapper_thinp/POOL (2.00 GiB).
      ulimit -n 3072 && lvchange --config --addtag 'OVIRT_VOL_INITIALIZING_1000' /dev/snapper_thinp/many_1000
      ulimit -n 3072 && pvcreate --config devices/scan_lvs=1 /dev/snapper_thinp/many_1000
1001  ulimit -n 3072 && lvcreate --yes --config devices/scan_lvs=1 -k n -s /dev/snapper_thinp/origin -n many_1001
        WARNING: Sum of all thin volume sizes (<1.97 TiB) exceeds the size of thin pool snapper_thinp/POOL (2.00 GiB).
      ulimit -n 3072 && lvchange --config --addtag 'OVIRT_VOL_INITIALIZING_1001' /dev/snapper_thinp/many_1001
      ulimit -n 3072 && pvcreate --config devices/scan_lvs=1 /dev/snapper_thinp/many_1001
1002  ulimit -n 3072 && lvcreate --yes --config devices/scan_lvs=1 -k n -s /dev/snapper_thinp/origin -n many_1002
        WARNING: Sum of all thin volume sizes (<1.97 TiB) exceeds the size of thin pool snapper_thinp/POOL (2.00 GiB).
      ulimit -n 3072 && lvchange --config --addtag 'OVIRT_VOL_INITIALIZING_1002' /dev/snapper_thinp/many_1002
      ulimit -n 3072 && pvcreate --config devices/scan_lvs=1 /dev/snapper_thinp/many_1002
1003  ulimit -n 3072 && lvcreate --yes --config devices/scan_lvs=1 -k n -s /dev/snapper_thinp/origin -n many_1003
        WARNING: Sum of all thin volume sizes (1.97 TiB) exceeds the size of thin pool snapper_thinp/POOL (2.00 GiB).
*** buffer overflow detected ***: lvcreate terminated
couldn't create snapshot many_1003
The test I had started in comment 18 finished without any problems (and I followed that with 100 vgcreates). It looks like the problem is not the number of PVs, but rather the thin snapshots and the monitoring that comes along with them.

It seems the issue is probably the monitoring and not the snapshots. In the VG with 1200 PVs created above, I tried to create a single thin snapshot and it immediately hung with the same backtrace.

(gdb) bt
#0  0x00007f309bbe5805 in read () from /lib64/libc.so.6
#1  0x00007f309d2ea7e1 in _daemon_write (fifos=0x7ffe8f4fa240, msg=0x7ffe8f4fa2c0) at libdevmapper-event.c:316
#2  0x00007f309d2eab21 in daemon_talk (fifos=0x7ffe8f4fa240, msg=0x7ffe8f4fa2c0, cmd=8, dso_name=0x0, dev_name=0x0, evmask=0, timeout=0) at libdevmapper-event.c:381
#3  0x00007f309d2ebc3b in _do_event (cmd=4, dmeventd_path=0x564b170ac1c0 "/usr/sbin/dmeventd", msg=0x7ffe8f4fa2c0, dso_name=0x564b16ffb5c0 "libdevmapper-event-lvm2thin.so", dev_name=0x564b172f5360 "LVM-KEp4Bt24bKCEWlJc1uEInbIJu6KgP8Pkimr2l5DoXkXQlBpipRtNlEBoMaepHCC9-tpool", evmask=DM_EVENT_ERROR_MASK, timeout=0) at libdevmapper-event.c:630
#4  0x00007f309d2ec2d0 in dm_event_get_registered_device (dmevh=0x564b1702c9b0, next=0) at libdevmapper-event.c:768
#5  0x0000564b155564c3 in target_registered_with_dmeventd (cmd=0x564b165005d0, dso=0x564b165299f0 "libdevmapper-event-lvm2thin.so", lv=0x564b173ee568, pending=0x7ffe8f4fa444, monitored=0x7ffe8f4fa440) at activate/activate.c:1662
#6  0x0000564b1561f99d in _target_registered (seg=0x564b1744a980, pending=0x7ffe8f4fa444, monitored=0x7ffe8f4fa440) at thin/thin.c:421
#7  0x0000564b15557027 in monitor_dev_for_events (cmd=0x564b165005d0, lv=0x564b173ee568, laopts=0x564b156b8040 <zlaopts>, monitor=1) at activate/activate.c:1882
#8  0x0000564b15556dbe in monitor_dev_for_events (cmd=0x564b165005d0, lv=0x564b1744a2f0, laopts=0x7ffe8f4fa650, monitor=1) at activate/activate.c:1849
#9  0x0000564b15558b17 in _lv_resume (cmd=0x564b165005d0, lvid_s=0x0, laopts=0x7ffe8f4fa650, error_if_not_active=0, lv=0x564b1744a2f0) at activate/activate.c:2333
#10 0x0000564b15558bd7 in lv_resume_if_active (cmd=0x564b165005d0, lvid_s=0x0, origin_only=1, exclusive=0, revert=0, lv=0x564b1744a2f0) at activate/activate.c:2361
#11 0x0000564b1555a575 in resume_lv_origin (cmd=0x564b165005d0, lv=0x564b17498330) at activate/activate.c:2932
#12 0x0000564b155c8b4e in _lv_create_an_lv (vg=0x564b172407c0, lp=0x7ffe8f4faac0, new_lv_name=0x7ffe8f4fc4e4 "snappv1") at metadata/lv_manip.c:8508
#13 0x0000564b155c9cca in lv_create_single (vg=0x564b172407c0, lp=0x7ffe8f4faac0) at metadata/lv_manip.c:8802
#14 0x0000564b1550e8bb in _lvcreate_single (cmd=0x564b165005d0, vg_name=0x564b17016d50 "test", vg=0x564b172407c0, handle=0x564b165ebe38) at lvcreate.c:1752
#15 0x0000564b15539a6a in _process_vgnameid_list (cmd=0x564b165005d0, read_flags=1048576, vgnameids_to_process=0x7ffe8f4fa9b0, arg_vgnames=0x7ffe8f4fa9d0, arg_tags=0x7ffe8f4fa9e0, handle=0x564b165ebe38, process_single_vg=0x564b1550e289 <_lvcreate_single>) at toollib.c:1933
#16 0x0000564b1553a63e in process_each_vg (cmd=0x564b165005d0, argc=0, argv=0x0, one_vgname=0x564b165ebe30 "test", use_vgnames=0x0, read_flags=1048576, include_internal=0, handle=0x564b165ebe38, process_single_vg=0x564b1550e289 <_lvcreate_single>) at toollib.c:2238
#17 0x0000564b1550eb4d in lvcreate (cmd=0x564b165005d0, argc=1, argv=0x7ffe8f4faed0) at lvcreate.c:1797
#18 0x0000564b1551803b in lvm_run_command (cmd=0x564b165005d0, argc=1, argv=0x7ffe8f4faed0) at lvmcmdline.c:3135
#19 0x0000564b155196d7 in lvm2_main (argc=10, argv=0x7ffe8f4fae88) at lvmcmdline.c:3667
#20 0x0000564b155516b7 in main (argc=10, argv=0x7ffe8f4fae88) at lvm.c:22

This bug was opened to address problems validating bug 1691277 (too many open files with a large number of PVs). The fix for that was:
https://sourceware.org/git/?p=lvm2.git;a=commitdiff;h=ecefcc9ca89982dbc5cc6b210da7ca9d0fef655b

It still looks like that bug is properly fixed by the commit. What we're seeing now is a new bug related to thin/monitoring, exposed by the unconventional way the test creates many PVs using thin snapshots.
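For what it's worth, when isolating the thin/monitoring side, the dmeventd monitoring state can be inspected and disabled per command using standard lvm options (a sketch; the VG/LV names are placeholders taken from earlier in this bug, not from a transcript of this exact test):

  # Report whether each LV in the VG is currently monitored by dmeventd.
  lvs -o name,seg_monitor snapper_thinp
  # Re-run the failing snapshot creation with monitoring disabled for this
  # one command, to see whether the hang/abort in the dmeventd client code goes away.
  lvcreate --config 'activation/monitoring=0' -k n -s /dev/snapper_thinp/origin -n many_test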
I'm moving the issue that I've been seeing to rhel8.4 bug 1916919. Let's use the results that dct provided in comments #18 and #20 to mark this bug verified.