Bug 619476 - clurgmgrd segfaults with error 6
Status: CLOSED DUPLICATE of bug 572695
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: rgmanager
Version: 4
Hardware: All Linux
Priority: low
Severity: medium
Assigned To: Lon Hohberger
QA Contact: Cluster QE
Duplicates: 637263
Reported: 2010-07-29 11:56 EDT by Shane Bradley
Modified: 2010-10-22 10:33 EDT

Doc Type: Bug Fix
Last Closed: 2010-10-22 10:32:44 EDT
Description Shane Bradley 2010-07-29 11:56:32 EDT
Description of problem:
rgmanager is generating segfaults on service state change:


Jul  7 14:10:04 nodeA clurgmgrd[500]: <notice> Service Q01STRS_TH started 
....
Jul  8 05:39:27 nodeA clurgmgrd[500]: <notice> Service Q01STRS_MY started
Jul  8 05:44:45 nodeA clurgmgrd[500]: <err> #48: Unable to obtain cluster lock: Unknown error 65539
Jul  8 05:44:45 nodeA clurgmgrd[500]: <notice> Stopping service Q01STRS_AU
Jul  8 05:44:55 nodeA clurgmgrd[500]: <notice> Service Q01STRS_AU is recovering
Jul  8 05:44:55 nodeA clurgmgrd[500]: <notice> Recovering failed service Q01STRS_AU
Jul  8 05:45:16 nodeA clurgmgrd[500]: <notice> Service Q01STRS_AU started
Jul  8 05:56:06 nodeA kernel: clurgmgrd[15838]: segfault at 000000c000000010 rip 0000003000269b40 rsp 000000007204e900 error 6
Jul  8 05:56:06 nodeA clurgmgrd[499]: <crit> Watchdog: Daemon died, rebooting...
Jul  8 05:56:06 nodeA kernel: md: stopping all md devices.
Jul  8 05:56:06 nodeA kernel: md: md0 switched to read-only mode.
Jul  8 05:59:25 nodeA syslogd 1.4.1: restart (remote reception).
....
Jul  8 06:01:12 nodeA clurgmgrd[506]: <notice> Starting stopped service Q01STRS_TH 
Jul  8 06:01:33 nodeA clurgmgrd[506]: <notice> Service Q01STRS_TH started 
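For triage, the kernel "segfault" line in the log above can be picked apart mechanically. The following is a minimal sketch, not part of rgmanager or any Red Hat tool: the names `parse_segfault` and `decode_fault_error` are made up for illustration, and the regular expression assumes the exact x86_64 message layout shown in this log (address, rip, rsp, then an error code printed in hex). The decoding uses the low three bits of the x86 page-fault error code, under which "error 6" reads as a user-mode write to an address with no page mapped.

```python
import re

# Matches the kernel segfault line format seen in this report (assumed layout).
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"rip (?P<rip>[0-9a-f]+) rsp (?P<rsp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)


def decode_fault_error(code):
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "page_present": bool(code & 1),  # False: no page was mapped at the address
        "write": bool(code & 2),         # True: the faulting access was a write
        "user_mode": bool(code & 4),     # True: the fault happened in user mode
    }


def parse_segfault(line):
    """Return the fields of a kernel segfault log line, or None if no match."""
    match = SEGFAULT_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["pid"] = int(info["pid"])
    # Addresses and the error code are treated as hexadecimal.
    for key in ("addr", "rip", "rsp", "err"):
        info[key] = int(info[key], 16)
    return info


if __name__ == "__main__":
    sample = ("Jul  8 05:56:06 nodeA kernel: clurgmgrd[15838]: segfault at "
              "000000c000000010 rip 0000003000269b40 rsp 000000007204e900 error 6")
    parsed = parse_segfault(sample)
    print(parsed["comm"], parsed["pid"], hex(parsed["addr"]))
    print(decode_fault_error(parsed["err"]))
```

The same function can be run over a whole messages file to list every daemon crash together with its faulting address.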

The segfault backtrace looks like:
Program terminated with signal 11, Segmentation fault.
#0  _int_malloc (av=0x3000434640, bytes=) at malloc.c:4181
4181            bck->fd = bin;

Thread 1 (process 15838):
#0  _int_malloc (av=0x3000434640, bytes=) at malloc.c:4181
#1  0x000000300026b6d2 in *__GI___libc_malloc (bytes=32) at malloc.c:3346
#2  0x0000000000425028 in clist_insert ()
#3  0x00000000004216bf in msg_open ()
#4  0x000000000041efc6 in vf_write (membership=0x657850, flags=2, keyid=0x7204ec60 "usrm::rg=\"Q01STRS_TH\"", data=0x7204ef20, datalen=104) at vft.c:1315
#5  0x000000000040b515 in set_rg_state (rgname=0x7204efd8 "Q01STRS_TH", svcblk=0x7204ef20) at rg_state.c:306
#6  0x000000000040b595 in init_rg (name=0x7204efd8 "Q01STRS_TH", svcblk=0x7204ef20) at rg_state.c:323
#7  0x000000000040b688 in get_rg_state (rgname=0x7204efd0 "service:Q01STRS_TH", svcblk=0x7204ef20) at rg_state.c:353
#8  0x000000000040c3c7 in svc_status (svcName=0x7204efd0 "service:Q01STRS_TH") at rg_state.c:877
#9  0x0000000000404f10 in resgroup_thread_main (arg=0x414620c0) at rg_thread.c:384
#10 0x0000003527d06137 in start_thread (arg=) at pthread_create.c:274
#11 0x00000030002c9883 in ?? () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112 from /lib64/tls/libc.so.6
Current language:  auto; currently c
#1  0x000000300026b6d2 in *__GI___libc_malloc (bytes=32) at malloc.c:3346
3346      victim = _int_malloc(ar_ptr, bytes);
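The faulting statement `bck->fd = bin` and the fault address from the kernel log can be related with a little arithmetic. The sketch below is an assumption-laden illustration, not an analysis from the original report: it models the classic glibc `malloc_chunk` header on x86_64 as four 8-byte fields (`prev_size`, `size`, `fd`, `bk`), and the helper name `implied_chunk_pointer` is hypothetical. Under that layout, a write to `bck->fd` lands 16 bytes past `bck`, so the fault address implies the value `bck` held at the time of the crash.

```python
import ctypes


class MallocChunk(ctypes.Structure):
    """Assumed x86_64 layout of the start of glibc's struct malloc_chunk."""
    _fields_ = [
        ("prev_size", ctypes.c_uint64),
        ("size", ctypes.c_uint64),
        ("fd", ctypes.c_uint64),   # forward pointer, modelled as a 64-bit integer
        ("bk", ctypes.c_uint64),   # backward pointer, modelled as a 64-bit integer
    ]


# Fault address taken from the kernel segfault line in this report.
FAULT_ADDR = 0x000000C000000010


def implied_chunk_pointer(fault_addr, field):
    """Chunk pointer implied by a faulting access to the given chunk field."""
    return fault_addr - getattr(MallocChunk, field).offset


if __name__ == "__main__":
    bck = implied_chunk_pointer(FAULT_ADDR, "fd")
    print(hex(bck))  # 0xc000000000 under the assumed layout
```

If the layout assumption holds, `bck` was 0xc000000000, which resembles neither the arena address (0x3000434640) nor the stack addresses in the backtrace. That would be consistent with the allocator's free lists having been damaged before this call, so `_int_malloc` may be where the damage surfaced rather than where it was caused.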

Version-Release number of selected component (if applicable):
rgmanager-1.9.87-1.el4_8.1-x86_64 


How reproducible:
Not easily; it has only happened a couple of times.

Steps to Reproduce:
1. Appears to happen when a service is changing state.
  
Actual results:
clurgmgrd segfaults with error 6

Expected results:
No segfault

Additional info:
Comment 3 Lon Hohberger 2010-09-28 12:10:01 EDT
*** Bug 637263 has been marked as a duplicate of this bug. ***
Comment 5 Lon Hohberger 2010-10-22 10:32:44 EDT
This was fixed some time ago by bug 572695.

Furthermore, it was copied into the z-stream (EUS) as bug 572792.

https://rhn.redhat.com/errata/RHBA-2010-0404.html

*** This bug has been marked as a duplicate of bug 572695 ***
