Bug 698393 - clvmd crashes when attempting to create thousands of LVs
Summary: clvmd crashes when attempting to create thousands of LVs
Keywords:
Status: CLOSED DUPLICATE of bug 730289
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: lvm2
Version: 6.1
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Milan Broz
QA Contact: Corey Marthaler
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-04-20 19:31 UTC by Corey Marthaler
Modified: 2013-03-01 04:10 UTC
CC List: 10 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-10-11 11:19:17 UTC
Target Upstream Version:
Embargoed:


Attachments
coredump from taft-02 (4.63 MB, application/x-gzip)
2011-04-20 19:41 UTC, Corey Marthaler

Description Corey Marthaler 2011-04-20 19:31:32 UTC
Description of problem:
This may just be a dup of bug 697945.

I ran this on all the nodes in the cluster:

# For each of 10 VGs (b1..b10), start 75 lvcreate calls in the background.
for i in $(seq 1 10)
do
  for j in $(seq 1 75)
  do
    lvcreate -n b_$j -L 12M b$i &
  done
done

clvmd[7062]: segfault at 7fff8d42b550 ip 000000000041140f sp 00007fff7d909d80 error 4 in clvmd[4000]
Apr 20 14:04:57 taft-02 kernel: clvmd[7062]: segfault at 7fff8d42b550 ip 000000000041140f sp 00007f]
Apr 20 14:05:13 taft-02 abrt[9866]: saved core dump of pid 7062 (/usr/sbin/clvmd) to /var/spool/abr)
Apr 20 14:05:13 taft-02 abrtd: Directory 'ccpp-1303326297-7062' creation detected
Apr 20 14:05:14 taft-02 abrt[9866]: size of '/var/spool/abrt' >= 1250 MB, deleting 'ccpp-1303233863'
Apr 20 14:05:14 taft-02 abrtd: Size of '/var/spool/abrt' >= 1000 MB, deleting 'ccpp-1303233863-1487'
Apr 20 14:05:14 taft-02 abrtd: New crash /var/spool/abrt/ccpp-1303326297-7062, processing


Core was generated by `clvmd -T30'.
Program terminated with signal 11, Segmentation fault.
#0  0x000000000041140f in main_loop (local_sock=<value optimized out>, cmd_timeout=60) at clvmd.c:860
860                                     if (FD_ISSET(thisfd->fd, &in)) {
Missing separate debuginfos, use: debuginfo-install clusterlib-3.0.12-41.el6.x86_64 corosynclib-1.2.3-36.el6.x86_64 glibc-2.12-1.25.el6.x86_64 libgcc-4.4.5-6.el6.x86_64 libselinux-2.0.94-5.el6.x86_64 libsepol-2.0.41-3.el6.x86_64 libudev-147-2.35.el6.x86_64
(gdb) bt
#0  0x000000000041140f in main_loop (local_sock=<value optimized out>, cmd_timeout=60) at clvmd.c:860
#1  0x0000000000412f61 in main (argc=<value optimized out>, argv=0x7fff7d90aa38) at clvmd.c:596
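
The faulting line above is an FD_ISSET() test inside the select() loop. As a general illustration only (the report does not establish what value thisfd->fd held), here is a minimal sketch of a defensive check. It relies on the documented rule that FD_ISSET() is undefined for a descriptor that is negative or not below FD_SETSIZE. The helper name fd_is_ready is hypothetical and is not part of clvmd.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/select.h>

/*
 * Hypothetical helper, not clvmd code: test a descriptor against an fd_set
 * only when the descriptor lies inside the range an fd_set can represent.
 * Calling FD_ISSET() with fd < 0 or fd >= FD_SETSIZE is undefined behavior
 * and can read outside the set.
 */
static bool fd_is_ready(int fd, fd_set *set)
{
    assert(set != NULL);

    if (fd < 0 || fd >= FD_SETSIZE)
        return false;   /* out of range: treat as "not ready" */

    return FD_ISSET(fd, set) != 0;
}
```

A caller would use fd_is_ready(thisfd->fd, &in) in place of the bare macro. This only guards the range of the descriptor; it does not protect against a stale or freed thisfd pointer.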


Version-Release number of selected component (if applicable):
2.6.32-131.0.1.el6.x86_64

lvm2-2.02.83-3.el6    BUILT: Fri Mar 18 09:31:10 CDT 2011
lvm2-libs-2.02.83-3.el6    BUILT: Fri Mar 18 09:31:10 CDT 2011
lvm2-cluster-2.02.83-3.el6    BUILT: Fri Mar 18 09:31:10 CDT 2011
udev-147-2.35.el6    BUILT: Wed Mar 30 07:32:05 CDT 2011
device-mapper-1.02.62-3.el6    BUILT: Fri Mar 18 09:31:10 CDT 2011
device-mapper-libs-1.02.62-3.el6    BUILT: Fri Mar 18 09:31:10 CDT 2011
device-mapper-event-1.02.62-3.el6    BUILT: Fri Mar 18 09:31:10 CDT 2011
device-mapper-event-libs-1.02.62-3.el6    BUILT: Fri Mar 18 09:31:10 CDT 2011
cmirror-2.02.83-3.el6    BUILT: Fri Mar 18 09:31:10 CDT 2011

Comment 1 Corey Marthaler 2011-04-20 19:41:22 UTC
Created attachment 493600
coredump from taft-02

Comment 2 Corey Marthaler 2011-04-20 20:34:15 UTC
This is easily reproducible. In fact I just hit it again on all four nodes in my cluster. These are the two different stacks I saw.


Core was generated by `clvmd -T30'.
Program terminated with signal 11, Segmentation fault.
#0  0x000000000041a9d2 in lvmcache_label_scan ()
Missing separate debuginfos, use: debuginfo-install lvm2-cluster-2.02.83-3.el6.x86_64
(gdb) bt
#0  0x000000000041a9d2 in lvmcache_label_scan ()
#1  0x00000000004473d3 in lv_from_lvid ()
#2  0x00000000004173c5 in lv_activation_filter ()
#3  0x0000000000414bc3 in ?? ()
#4  0x000000000041500f in do_lock_lv ()
#5  0x00000000004100c6 in do_command ()
#6  0x000000000041372b in ?? ()
#7  0x0000000000413adc in ?? ()
#8  0x00000033054077e1 in start_thread () from /lib64/libpthread.so.0
#9  0x00000033050e68ed in clone () from /lib64/libc.so.6



Program terminated with signal 11, Segmentation fault.
#0  0x000000000041a9d2 in lvmcache_label_scan (cmd=0x7f9f6c0008c0, full_scan=2) at cache/lvmcache.c:589
589             if (full_scan == 2 && !cmd->filter->use_count && !refresh_filters(cmd)) {
Missing separate debuginfos, use: debuginfo-install clusterlib-3.0.12-41.el6.x86_64 corosynclib-1.2.3-36.el6.x86_64 glibc-2.12-1.25.el6.x86_64 libgcc-4.4.5-6.el6.x86_64 libselinux-2.0.94-5.el6.x86_64 libsepol-2.0.41-3.el6.x86_64 libudev-147-2.35.el6.x86_64
(gdb) bt
#0  0x000000000041a9d2 in lvmcache_label_scan (cmd=0x7f9f6c0008c0, full_scan=2) at cache/lvmcache.c:589
#1  0x00000000004473d3 in _vg_read_by_vgid (cmd=0x7f9f6c0008c0,
    lvid_s=0x22cd403 "bIn4Wt2DKNoo9nlg5g9ro4vqgvjBgK5i5BnRbu7QvgHWV5ty69jeLPa4SWXt31o7", precommitted=0) at metadata/metadata.c:3223
#2  lv_from_lvid (cmd=0x7f9f6c0008c0, lvid_s=0x22cd403 "bIn4Wt2DKNoo9nlg5g9ro4vqgvjBgK5i5BnRbu7QvgHWV5ty69jeLPa4SWXt31o7",
    precommitted=0) at metadata/metadata.c:3262
#3  0x00000000004173c5 in lv_activation_filter (cmd=0x7f9f6c0008c0,
    lvid_s=0x22cd403 "bIn4Wt2DKNoo9nlg5g9ro4vqgvjBgK5i5BnRbu7QvgHWV5ty69jeLPa4SWXt31o7", activate_lv=0x7f9f7145ca6c)
    at activate/activate.c:1332
#4  0x0000000000414bc3 in do_activate_lv (resource=0x22cd403 "bIn4Wt2DKNoo9nlg5g9ro4vqgvjBgK5i5BnRbu7QvgHWV5ty69jeLPa4SWXt31o7",
    lock_flags=132 '\204', mode=1) at lvm-functions.c:343
#5  0x000000000041500f in do_lock_lv (command=25 '\031', lock_flags=132 '\204',
    resource=0x22cd403 "bIn4Wt2DKNoo9nlg5g9ro4vqgvjBgK5i5BnRbu7QvgHWV5ty69jeLPa4SWXt31o7") at lvm-functions.c:532
#6  0x00000000004100c6 in do_command (client=0x227aa30, msg=0x22cd3f0, msglen=85, buf=0x7f9f7145cdd0, buflen=1481, retlen=0x7f9f7145cddc)
    at clvmd-command.c:120
#7  0x0000000000413bd1 in process_local_command (arg=<value optimized out>) at clvmd.c:1677
#8  process_work_item (arg=<value optimized out>) at clvmd.c:1910
#9  lvm_thread_fn (arg=<value optimized out>) at clvmd.c:1959
#10 0x00000033200077e1 in start_thread () from /lib64/libpthread.so.0
#11 0x000000331fce68ed in clone () from /lib64/libc.so.6

Comment 3 Milan Broz 2011-08-11 13:02:10 UTC
I hope this is fixed by properly returning an error when clvmd has no more file descriptors; the fix should be part of upstream 2.02.87.

(I was not able to reproduce the clvmd crash, at least with the patch applied, but the backtraces differ.)

Comment 5 Corey Marthaler 2011-09-08 22:18:18 UTC
This issue still exists in the latest rpms.

Sep  8 17:11:48 taft-04 kernel: clvmd[6236]: segfault at 10 ip 000000000041cb92 sp 00007f5714374690 error 4 in clvmd[400000+9a000]

2.6.32-193.el6.x86_64

lvm2-2.02.87-1.el6    BUILT: Fri Aug 12 06:11:57 CDT 2011
lvm2-libs-2.02.87-1.el6    BUILT: Fri Aug 12 06:11:57 CDT 2011
lvm2-cluster-2.02.87-1.el6    BUILT: Fri Aug 12 06:11:57 CDT 2011
udev-147-2.37.el6    BUILT: Wed Aug 10 07:48:15 CDT 2011
device-mapper-1.02.66-1.el6    BUILT: Fri Aug 12 06:11:57 CDT 2011
device-mapper-libs-1.02.66-1.el6    BUILT: Fri Aug 12 06:11:57 CDT 2011
device-mapper-event-1.02.66-1.el6    BUILT: Fri Aug 12 06:11:57 CDT 2011
device-mapper-event-libs-1.02.66-1.el6    BUILT: Fri Aug 12 06:11:57 CDT 2011
cmirror-2.02.87-1.el6    BUILT: Fri Aug 12 06:11:57 CDT 2011

Comment 7 Corey Marthaler 2011-09-13 20:05:05 UTC
FWIW, the bt in comment #5 looks to be similar to the one in the original report:

Core was generated by `clvmd -T30'.
Program terminated with signal 11, Segmentation fault.
#0  0x000000000041cb92 in lvmcache_label_scan ()
Missing separate debuginfos, use: debuginfo-install lvm2-cluster-2.02.87-2.el6.x86_64
(gdb) bt
#0  0x000000000041cb92 in lvmcache_label_scan ()
#1  0x000000000044eea3 in lv_from_lvid ()
#2  0x0000000000418b85 in lv_activation_filter ()
#3  0x0000000000416343 in ?? ()
#4  0x000000000041679f in do_lock_lv ()
#5  0x00000000004117b7 in do_command ()
#6  0x000000000041503b in ?? ()
#7  0x00000000004153fc in ?? ()
#8  0x00000039a2c077e1 in ?? ()
#9  0x00007f5714375700 in ?? ()
#10 0x0000000000000000 in ?? ()

Comment 9 Zdenek Kabelac 2011-10-11 11:19:17 UTC
Filter refreshing was not handled well when clvmd ran out of free file descriptors. This is also addressed within the memory consumption patch set.

*** This bug has been marked as a duplicate of bug 730289 ***
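
To illustrate the class of failure discussed in this thread (running out of file descriptors and not reporting it), here is a minimal sketch of returning an error to the caller when open() fails. The function names open_checked and scan_one are hypothetical stand-ins and this is not the lvm2 patch; the only assumption is the standard POSIX behavior that open() returns -1 with errno set to EMFILE or ENFILE when descriptors are exhausted.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper, not lvm2 code: open a path read-only and report
 * descriptor exhaustion explicitly. Returns the fd, or -1 on failure. */
static int open_checked(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        int saved = errno;  /* keep errno before any other library call */
        if (saved == EMFILE || saved == ENFILE)
            fprintf(stderr, "out of file descriptors opening %s\n", path);
        else
            fprintf(stderr, "cannot open %s: %s\n", path, strerror(saved));
        return -1;
    }
    return fd;
}

/* Hypothetical scan step: returns 1 on success, 0 on failure, so the
 * caller can stop instead of continuing with state that was never set up. */
static int scan_one(const char *path)
{
    int fd = open_checked(path);
    if (fd < 0)
        return 0;

    /* ... real work on fd would go here ... */
    close(fd);
    return 1;
}
```

The point of the sketch is the control flow: every failure path returns a status that the caller must check, rather than leaving a pointer or structure half-initialized for later code to dereference.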

