Description of problem:
clurgmgrd dies very often. clustat on one node shows only the following:

[root@aimtest2 ~]# clustat
Member Status: Quorate

  Member Name                  Status
  ------ ----                  ------
  aimtest1                     Online, rgmanager
  aimtest2                     Online, Local, rgmanager

When getting the status of rgmanager:

[root@aimtest2 ~]# service rgmanager status
clurgmgrd dead but pid file exists

Version-Release number of selected component (if applicable):

How reproducible:
It happens almost every day, but is not manually reproducible.

Actual results:

  Member Name                  Status
  ------ ----                  ------
  aimtest1                     Online, rgmanager
  aimtest2                     Online, Local, rgmanager

Expected results:

[root@aimtest1 ~]# clustat
Member Status: Quorate

  Member Name                  Status
  ------ ----                  ------
  aimtest1                     Online, Local, rgmanager
  aimtest2                     Online, rgmanager

  Service Name         Owner (Last)         State
  ------- ----         ----- ------         -----
  aim-oracle           aimtest1             started
  aim-web              aimtest1             started
  aim-namingservice    aimtest1             started
  aim-core             aimtest2             started
  aim-interface        aimtest2             started
  aim-data01           aimtest2             started
  aim-data02           aimtest2             started
  aim-datalta          aimtest2             started
  aim-dicom            aimtest2             started

Additional info:
When rgmanager is restarted or clurgmgrd is started manually, rgmanager relocates all services and everything works fine for a while.
What version of rgmanager? Try stopping rgmanager, then running:

# ulimit -c unlimited
# clurgmgrd

This should allow rgmanager to generate a core file, which I can use to help debug the problem.
We use rgmanager-1.9.43-0. OK, where can I find the core file?
I believe the core file will show up in the root directory. However, this sounds like the following bug, which we fixed in 1.9.46 (U3):

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=166109

Here's the errata package:

https://rhn.redhat.com/network/software/packages/details.pxt?pid=340158

I'm confident that 1.9.46 will fix your problem; sorry for the confusion.
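If the core file does not turn up in the root directory, a rough way to check where the kernel would put it is sketched below. This is only an illustration, not part of rgmanager: the helper name core_location is made up for this example, and it assumes the kernel exposes /proc/sys/kernel/core_pattern (if it does not, the script falls back to the plain name "core"). It also assumes the daemon's working directory is /, which is typical for daemons but not confirmed here for clurgmgrd.

```shell
#!/bin/sh
# Hypothetical helper (not part of rgmanager): given a core pattern and the
# working directory of the crashing process, print where the core would land.
core_location() {
    pattern=$1
    cwd=$2
    case "$pattern" in
        "|"*) echo "piped to handler: ${pattern#|}" ;;  # core is handed to a program
        /*)   echo "$pattern" ;;                        # absolute path pattern
        *)    echo "${cwd%/}/$pattern" ;;               # relative to the process cwd
    esac
}

# Read the kernel's pattern if available; otherwise assume the default "core".
pattern=$(cat /proc/sys/kernel/core_pattern 2>/dev/null || echo core)

# Assumption: the daemon runs with / as its working directory.
core_location "$pattern" /
```

Any %-specifiers in the pattern (such as %p for the PID) are printed as-is rather than expanded. Remember that "ulimit -c unlimited" has to be run in the same shell that starts clurgmgrd, otherwise no core is written at all.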
ok, i will try 1.9.46. thanks