Bug 188020 - Process clurgmgrd dies
Summary: Process clurgmgrd dies
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: rgmanager
Version: 4
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Lon Hohberger
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2006-04-05 13:00 UTC by Synedra Support
Modified: 2009-04-16 20:20 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-09-19 20:50:48 UTC
Embargoed:



Description Synedra Support 2006-04-05 13:00:59 UTC
Description of problem:

clurgmgrd dies very often.

clustat on one node shows only the following:

[root@aimtest2 ~]# clustat
Member Status: Quorate

  Member Name                              Status
  ------ ----                              ------
  aimtest1                                 Online, rgmanager
  aimtest2                                 Online, Local, rgmanager

When getting status of rgmanager:

[root@aimtest2 ~]# service rgmanager status
clurgmgrd dead but pid file exists


Version-Release number of selected component (if applicable):


How reproducible:
It happens almost every day, but is not manually reproducible.

Actual results:

  Member Name                              Status
  ------ ----                              ------
  aimtest1                                 Online, rgmanager
  aimtest2                                 Online, Local, rgmanager

Expected results:

[root@aimtest1 ~]# clustat
Member Status: Quorate

  Member Name                              Status
  ------ ----                              ------
  aimtest1                                 Online, Local, rgmanager
  aimtest2                                 Online, rgmanager

  Service Name         Owner (Last)                   State
  ------- ----         ----- ------                   -----
  aim-oracle           aimtest1                       started
  aim-web              aimtest1                       started
  aim-namingservice    aimtest1                       started
  aim-core             aimtest2                       started
  aim-interface        aimtest2                       started
  aim-data01           aimtest2                       started
  aim-data02           aimtest2                       started
  aim-datalta          aimtest2                       started
  aim-dicom            aimtest2                       started

Additional info:

When rgmanager is restarted or clurgmgrd is started manually, rgmanager relocates
all services and everything works fine for a while.
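
Until the cause is found, the restart workaround described above could be automated. The following is a minimal sketch, not a fix: it only recognizes the "dead but pid file exists" message quoted in this report and restarts the service in that case. The function names, the logger tag, and the "is running" / "is stopped" patterns are assumptions (the latter two are based on the usual Red Hat init-script status messages, not on output shown here).

```shell
# Map the text printed by `service rgmanager status` to a short state name.
# Only the "dead but pid file exists" message appears in this report; the
# "is running" and "is stopped" patterns are assumed.
rgmanager_state() {
    case "$1" in
        *"dead but pid file exists"*) echo crashed ;;
        *"is running"*)               echo running ;;
        *"is stopped"*)               echo stopped ;;
        *)                            echo unknown ;;
    esac
}

# Hypothetical watchdog step: restart rgmanager only when clurgmgrd has died.
# Intended to be called periodically, for example from cron.
restart_if_crashed() {
    state=$(rgmanager_state "$(service rgmanager status 2>&1)")
    if [ "$state" = crashed ]; then
        logger -t rgmanager-watchdog "clurgmgrd dead, restarting rgmanager"
        service rgmanager restart
    fi
}
```

Restarting automatically hides how often the daemon dies, so the logged messages should be kept for the bug report.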

Comment 1 Synedra Support 2006-04-05 13:08:26 UTC
How reproducible:
It happens almost every day, but is not manually reproducible.

Comment 2 Lon Hohberger 2006-04-06 20:19:48 UTC
What version of rgmanager?  Try stopping rgmanager, then running:
 
# ulimit -c unlimited
# clurgmgrd

This should allow rgmanager to generate a core file, which I can use to help
debug the problem.

Comment 3 Synedra Support 2006-04-07 08:45:26 UTC
We use rgmanager-1.9.43-0.

OK, where can I find the core file?

Comment 4 Lon Hohberger 2006-04-07 14:15:06 UTC
I believe the core file will show up in the root directory.  However, this
sounds like the following bug we fixed in 1.9.46 (U3):

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=166109

Here's the errata package:

https://rhn.redhat.com/network/software/packages/details.pxt?pid=340158

I'm confident that 1.9.46 will fix your problem; sorry for the confusion.
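
If a core file is still wanted, a small helper can look for it. This is a sketch under assumptions: the function name is hypothetical, and it assumes the kernel writes core files with a name beginning with "core" into the directory given (the root directory by default, as suggested above).

```shell
# List candidate core files in a directory (default: /), newest first.
# Assumes core file names start with "core", e.g. "core" or "core.1234".
list_cores() {
    dir="${1:-/}"
    ls -td "$dir"/core* 2>/dev/null
}

# Example usage (commented out so that loading this file has no side effects):
#   list_cores                           # look in the root directory
#   cat /proc/sys/kernel/core_pattern    # shows the kernel's core file naming
```

If nothing is listed, the core size limit set with ulimit may not have applied to the shell that started clurgmgrd.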

Comment 5 Synedra Support 2006-04-10 09:15:02 UTC
OK, I will try 1.9.46.

Thanks.

