Bug 1411668 - rgmanager general protection fault in malloc_consolidate
Summary: rgmanager general protection fault in malloc_consolidate
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: rgmanager
Version: 5.11
Hardware: Unspecified
OS: Unspecified
Target Milestone: rc
Assignee: Ryan McCabe
QA Contact: cluster-qe@redhat.com
Depends On:
Reported: 2017-01-10 09:19 UTC by Josef Zimek
Modified: 2020-09-10 10:06 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Last Closed: 2017-04-04 20:28:51 UTC
Target Upstream Version:

Attachments
rgmanager core dump (461.22 KB, application/x-gzip)
2017-01-10 09:19 UTC, Josef Zimek

Description Josef Zimek 2017-01-10 09:19:23 UTC
Created attachment 1239017
rgmanager core dump

Description of problem:

rgmanager segfaults in malloc_consolidate. This was supposed to be fixed in version 2.0.52-47.el5 (https://access.redhat.com/solutions/118963); however, the customer is hitting this issue with rgmanager-2.0.52-54.el5.x86_64.

Jan  3 12:56:43 node1 kernel: clurgmgrd[19093] general protection rip:3eec670454 rsp:4e721ac0 error:0

Core was generated by `clurgmgrd -w'.
Program terminated with signal 11, Segmentation fault.

#0  0x0000003eec670454 in malloc_consolidate () from /lib64/libc.so.6
(gdb) bt
#0  0x0000003eec670454 in malloc_consolidate () from /lib64/libc.so.6
#1  0x0000003eec672a6c in _int_malloc () from /lib64/libc.so.6
#2  0x0000003eec674cde in malloc () from /lib64/libc.so.6
#3  0x000000000042847b in member_list_dup (orig=0x12fba450) at members.c:577
#4  0x00000000004277f0 in member_list () at members.c:210
#5  0x000000000041d6c9 in get_rg_state (name=0x4e721ee0 "service:srv_concours", svcblk=0x4e721e30)
    at rg_state.c:368
#6  0x000000000041f06b in svc_status (svcName=0x4e721ee0 "service:srv_concours") at rg_state.c:1229
#7  0x000000000040a017 in resgroup_thread_main (arg=0x419a7fb0) at rg_thread.c:464
#8  0x000000000042f1f5 in setup_thread (thread_arg=0x419a7fb0) at tmgr.c:71
#9  0x0000003eece0673d in start_thread () from /lib64/libpthread.so.0
#10 0x0000003eec6d3d1d in clone () from /lib64/libc.so.6

Version-Release number of selected component (if applicable):
The customer runs RHEL 5.5 with rgmanager-2.0.52-54.el5.x86_64 installed.

How reproducible:
random event

Actual results:
rgmanager segfaults

Expected results:
rgmanager remains operational without segfaulting

Comment 2 Jan Pokorný [poki] 2017-01-26 15:11:01 UTC
I'd bet we are facing some kind of global heap corruption rather than
a glibc bug, which is the only other possible explanation if we set
aside the possibility of faulty hardware (in the case of a malloc call,
the resulting address space etc. is self-managed by glibc, and malloc
should be thread-safe on its own).

Being multithreaded, rgmanager is prone to certain race conditions
(due to improper/missing locking) that might corrupt the internal
glibc structures used for heap management, which would manifest
as such random crashes in malloc.
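
To illustrate the kind of locking discipline referred to above, here is a minimal sketch in C. It is not rgmanager code: the type member_list_t and the functions members_update(), members_dup() and run_demo() are hypothetical names made up for this example, and nothing here claims to reproduce what members.c actually does. The idea shown is only that a shared list must be copied and replaced under the same mutex, so that no thread reads from (or writes past) memory another thread is freeing.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_NODES 16

/* Hypothetical stand-in for a membership list; not rgmanager's real type. */
typedef struct {
    int count;
    int node_ids[MAX_NODES];
} member_list_t;

static member_list_t *current_members = NULL;  /* shared between threads */
static pthread_mutex_t members_lock = PTHREAD_MUTEX_INITIALIZER;

/* Publish a new list. The pointer swap happens under the lock; the old
 * list is freed afterwards, which is safe because readers only touch
 * the shared list while holding the lock and never keep the pointer. */
int members_update(const int *ids, int count)
{
    assert(ids != NULL);
    if (count < 0 || count > MAX_NODES)
        return -1;
    member_list_t *fresh = calloc(1, sizeof(*fresh));
    if (fresh == NULL)
        return -1;
    fresh->count = count;
    memcpy(fresh->node_ids, ids, (size_t)count * sizeof(int));

    pthread_mutex_lock(&members_lock);
    member_list_t *old = current_members;
    current_members = fresh;
    pthread_mutex_unlock(&members_lock);

    free(old);
    return 0;
}

/* Return a private copy of the shared list (caller frees), or NULL. */
member_list_t *members_dup(void)
{
    member_list_t *copy = malloc(sizeof(*copy));
    if (copy == NULL)
        return NULL;
    pthread_mutex_lock(&members_lock);
    if (current_members == NULL) {
        pthread_mutex_unlock(&members_lock);
        free(copy);
        return NULL;
    }
    *copy = *current_members;  /* copied while the source cannot be freed */
    pthread_mutex_unlock(&members_lock);
    return copy;
}

/* Writer: every published list holds n entries, each equal to n. */
static void *writer(void *arg)
{
    (void)arg;
    for (int i = 0; i < 10000; i++) {
        int n = (i % MAX_NODES) + 1;
        int ids[MAX_NODES];
        for (int j = 0; j < n; j++)
            ids[j] = n;
        members_update(ids, n);
    }
    return NULL;
}

/* Reader: counts entries that contradict the writer's invariant. */
static void *reader(void *arg)
{
    (void)arg;
    intptr_t bad = 0;
    for (int i = 0; i < 10000; i++) {
        member_list_t *m = members_dup();
        if (m == NULL)
            continue;
        for (int j = 0; j < m->count; j++)
            if (m->node_ids[j] != m->count)
                bad++;
        free(m);
    }
    return (void *)bad;
}

/* Run one writer against three readers; returns the number of
 * inconsistent entries seen, or -1 if a thread could not be started. */
long run_demo(void)
{
    pthread_t w, r[3];
    long bad = 0;
    int started = 0;
    void *res;

    if (pthread_create(&w, NULL, writer, NULL) != 0)
        return -1;
    for (; started < 3; started++)
        if (pthread_create(&r[started], NULL, reader, NULL) != 0)
            break;
    pthread_join(w, NULL);
    for (int i = 0; i < started; i++) {
        pthread_join(r[i], &res);
        bad += (long)(intptr_t)res;
    }
    return started == 3 ? bad : -1;
}
```

To try it, call run_demo() from a main() and build with something like cc -pthread. Dropping the lock around the copy in members_dup() turns this into a use-after-free race of the general kind suspected here, though whether it visibly crashes in malloc on any given run is a matter of timing.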

Comment 4 Chris Williams 2017-04-04 20:28:51 UTC
Red Hat Enterprise Linux 5 shipped its last minor release, 5.11, on September 14th, 2014. On March 31st, 2017, RHEL 5 exits Production Phase 3 and enters Extended Life Phase. For RHEL releases in the Extended Life Phase, Red Hat will provide limited ongoing technical support. No bug fixes, security fixes, hardware enablement or root-cause analysis will be available during this phase, and support will be provided on existing installations only. If the customer purchases the Extended Life-cycle Support (ELS), certain critical-impact security fixes and selected urgent priority bug fixes for the last minor release will be provided. For more details, please consult the Red Hat Enterprise Linux Life Cycle page.


This BZ does not appear to meet ELS criteria, so it is being closed WONTFIX. If this BZ is critical for your environment and you have an Extended Life-cycle Support Add-on entitlement, please open a case in the Red Hat Customer Portal, https://access.redhat.com, provide a thorough business justification, and ask that the BZ be re-opened for consideration of an errata. Please note that only certain critical-impact security fixes and selected urgent priority bug fixes for the last minor release can be considered.
