Bug 1730503 - lvcreate core dumped after 1024 max open fd was reached and exceeded
Keywords:
Status: NEW
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: lvm2
Version: 8.1
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: medium
Target Milestone: rc
Target Release: 8.1
Assignee: David Teigland
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Duplicates: 1739108
Depends On:
Blocks:
Reported: 2019-07-16 23:17 UTC by Corey Marthaler
Modified: 2020-02-07 16:59 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Bug
Target Upstream Version:


Description Corey Marthaler 2019-07-16 23:17:13 UTC
Description of problem:
This is a regression scenario for bug 1691277: create more than 1024 LVM thin volumes and create a PV on top of each. I'll try to grab a valid core and backtrace.


[...]
1010 lvcreate  -y -k n -s /dev/snapper_thinp/origin -n many_1010
1011 lvcreate  -y -k n -s /dev/snapper_thinp/origin -n many_1011
1012 lvcreate  -y -k n -s /dev/snapper_thinp/origin -n many_1012
1013 lvcreate  -y -k n -s /dev/snapper_thinp/origin -n many_1013
Although the snap create passed, errors were found in it's output
  WARNING: You have not turned on protection against thin pools running out of space.
  WARNING: Set activation/thin_pool_autoextend_threshold below 100 to trigger automatic extension of thin pools before they get full.
  WARNING: Sum of all thin volume sizes (1.99 TiB) exceeds the size of thin pool snapper_thinp/POOL (2.00 GiB).
  /run/dmeventd-client: open failed: Too many open files
  WARNING: Failed to monitor snapper_thinp/POOL.
  Logical volume "many_1013" created.



[root@hayes-01 ~]# pvcreate --config devices/scan_lvs=1 /dev/snapper_thinp/many_1013
  SELinux context reset: setfscreatecon failed: Too many open files
  Physical volume "/dev/snapper_thinp/many_1013" successfully created.

[root@hayes-01 ~]# ulimit -n 2048 && pvcreate --config devices/scan_lvs=1 /dev/snapper_thinp/many_1013
  Physical volume "/dev/snapper_thinp/many_1013" successfully created.

[root@hayes-01 ~]# lvcreate --config devices/scan_lvs=1 -k n -s /dev/snapper_thinp/origin -n many_1014
  WARNING: Sum of all thin volume sizes (1.99 TiB) exceeds the size of thin pool snapper_thinp/POOL (2.00 GiB).
  WARNING: You have not turned on protection against thin pools running out of space.
  WARNING: Set activation/thin_pool_autoextend_threshold below 100 to trigger automatic extension of thin pools before they get full.
*** buffer overflow detected ***: lvcreate terminated
Aborted (core dumped)


Jul 16 18:06:24 hayes-01 systemd[1]: Started Process Core Dump (PID 23044/UID 0).
Jul 16 18:06:24 hayes-01 systemd-coredump[23045]: Process 23043 (lvcreate) of user 0 dumped core.

Stack trace of thread 23043:
#0  0x00007f67f23698af raise (libc.so.6)
#1  0x00007f67f2353cc5 abort (libc.so.6)
#2  0x00007f67f23acbe7 __libc_message (libc.so.6)
#3  0x00007f67f243fe95 __GI___fortify_fail_abort (libc.so.6)
#4  0x00007f67f243fec7 __fortify_fail (libc.so.6)
#5  0x00007f67f243de86 __chk_fail (libc.so.6)
#6  0x00007f67f243fdbb __fdelt_chk (libc.so.6)
#7  0x00007f67f3d1e608 _daemon_write.isra.0 (libdevmapper-event.so.1.02)
#8  0x00007f67f3d1ee68 daemon_talk (libdevmapper-event.so.1.02)
#9  0x00007f67f3d1f7b3 _do_event (libdevmapper-event.so.1.02)
#10 0x00007f67f3d1fd64 dm_event_get_registered_device (libdevmapper-event.so.1.02)
#11 0x000055fb3e0f29b2 monitor_dev_for_events (lvm)
#12 0x000055fb3e0f28d8 monitor_dev_for_events (lvm)
#13 0x000055fb3e0f3d8c lv_suspend_if_active (lvm)
#14 0x000055fb3e1402ea _lv_create_an_lv (lvm)
#15 0x000055fb3e1408d4 lv_create_single (lvm)
#16 0x000055fb3e0b9f28 _lvcreate_single (lvm)
#17 0x000055fb3e0dcb2a process_each_vg (lvm)
#18 0x000055fb3e0bba33 lvcreate (lvm)
#19 0x000055fb3e0c418f lvm_run_command (lvm)
#20 0x000055fb3e0c54c3 lvm2_main (lvm)
#21 0x00007f67f2355843 __libc_start_main (libc.so.6)
#22 0x000055fb3e0a1bae _start (lvm)




Version-Release number of selected component (if applicable):
kernel-4.18.0-114.el8    BUILT: Wed Jul 10 10:15:20 CDT 2019
lvm2-2.03.05-1.el8    BUILT: Mon Jun 17 05:59:47 CDT 2019
lvm2-libs-2.03.05-1.el8    BUILT: Mon Jun 17 05:59:47 CDT 2019
lvm2-dbusd-2.03.05-1.el8    BUILT: Mon Jun 17 06:01:56 CDT 2019
lvm2-lockd-2.03.05-1.el8    BUILT: Mon Jun 17 05:59:47 CDT 2019
device-mapper-1.02.163-1.el8    BUILT: Mon Jun 17 05:59:47 CDT 2019
device-mapper-libs-1.02.163-1.el8    BUILT: Mon Jun 17 05:59:47 CDT 2019
device-mapper-event-1.02.163-1.el8    BUILT: Mon Jun 17 05:59:47 CDT 2019
device-mapper-event-libs-1.02.163-1.el8    BUILT: Mon Jun 17 05:59:47 CDT 2019
device-mapper-persistent-data-0.8.5-2.el8    BUILT: Wed Jun  5 10:28:04 CDT 2019

Comment 1 Corey Marthaler 2019-07-16 23:36:21 UTC
Jul 16 18:23:20 hayes-01 systemd[1]: Started ABRT Automated Bug Reporting Tool.
Jul 16 18:23:32 hayes-01 systemd[1]: Started Process Core Dump (PID 24123/UID 0).
Jul 16 18:23:32 hayes-01 systemd-coredump[24124]: Resource limits disable core dumping for process 24121 (lvcreate).
Jul 16 18:23:32 hayes-01 systemd-coredump[24124]: Process 24121 (lvcreate) of user 0 dumped core.

This might be a catch-22: I can't trigger the bug without going over the fd limit, and I can't get a valid core once the fd limit has been hit.

Comment 2 Corey Marthaler 2019-08-13 16:36:16 UTC
*** Bug 1739108 has been marked as a duplicate of this bug. ***

Comment 3 David Teigland 2019-08-14 14:37:22 UTC
Any caller of open() should clearly handle EMFILE errors, although it'll probably be a bit of whack-a-mole for a while (it looks like this one is in the libdevmapper library).  Even in cases where open failures are handled, we'll be testing error paths that may not have been tested before.  In this sense it's a useful process.  (We can also test this by lowering the ulimit to a very artificially low value and trying to use lvm with fewer devices.)
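
As a rough illustration of the testing idea above (lower the fd limit, then check that open failures come back as EMFILE and are handled), here is a minimal Python sketch. It is not lvm code; the helper name open_until_emfile and the choice of os.devnull as the file to open are assumptions made only for this example.

```python
import errno
import os
import resource


def open_until_emfile(soft_limit=64, path=os.devnull):
    """Temporarily lower the soft fd limit, open `path` until open() fails,
    and return (errno of the failure or None, number of fds opened).

    Hypothetical helper for illustration; not part of lvm2.
    """
    old_soft, old_hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Never raise the limit here; only lower it (or keep it).
    if old_soft == resource.RLIM_INFINITY:
        new_soft = soft_limit
    else:
        new_soft = min(soft_limit, old_soft)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, old_hard))

    fds = []
    failure = None
    try:
        # Bounded loop: at most new_soft + 1 attempts are ever needed.
        while len(fds) <= new_soft:
            try:
                fds.append(os.open(path, os.O_RDONLY))
            except OSError as exc:
                # The caller sees the error instead of crashing.
                failure = exc.errno
                break
    finally:
        # Release everything and restore the original limits.
        for fd in fds:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, old_hard))
    return failure, len(fds)


if __name__ == "__main__":
    err, opened = open_until_emfile(32)
    print("failure:", errno.errorcode.get(err, err), "after", opened, "opens")
```

Running the same kind of check against lvm itself would mean lowering the limit in the shell with ulimit -n before invoking the command, as in the transcript in the description.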

In practice, we never want to get to a place where we hit EMFILE errors.  We're currently hitting EMFILE between 900 and 1000 PVs (given the 1024 fd limit), and it seems this is a little low for some customers.  I've addressed this with the upstream commit below, in which lvm raises its soft open file limit (1024) up to the hard limit (4096).  That means lvm will by default handle around 4000 PVs without the user needing to configure new ulimits.

https://sourceware.org/git/?p=lvm2.git;a=commitdiff;h=ecefcc9ca89982dbc5cc6b210da7ca9d0fef655b
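
For readers unfamiliar with the soft/hard limit distinction, the general technique of raising the soft open file limit to the hard limit can be sketched as follows. This is a Python illustration of the idea only, not the code from the commit above (lvm2 is written in C); the function name raise_nofile_soft_limit is an assumption for this example.

```python
import resource


def raise_nofile_soft_limit():
    """Raise the soft RLIMIT_NOFILE to the hard limit.

    Returns (old_soft, new_soft). Hypothetical helper for illustration.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == hard:
        # Nothing to do: the soft limit is already at the ceiling.
        return soft, soft
    try:
        # An unprivileged process may raise its soft limit up to the hard limit.
        resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
    except (ValueError, OSError):
        # Some platforms reject certain values; keep the old limit then.
        return soft, soft
    return soft, resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    old, new = raise_nofile_soft_limit()
    print("soft open file limit:", old, "->", new)
```

With the default limits mentioned above (soft 1024, hard 4096), a call like this would move the soft limit from 1024 to 4096 for the calling process only; it does not change the limits of the parent shell.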

