Description of problem:
(a) SIGHUP isn't properly handled, and (b) if two signals come in quickly, cluquorumd can deadlock because it calls syslog (and other functions that are not async-signal-safe) in the signal handler for SIGHUP.

Version-Release number of selected component (if applicable):
1.2.28

How reproducible:
?
Created attachment 120904 [details] Fixes quorumd signal handling
A new *TEST* package build with several bugfixes (fixes bugzillas: 171637, 172735, 172893, 172894) is available. Gulm-bridge support has been disabled in this release to prevent having to install with the "--nodeps" option:

http://people.redhat.com/lhh/clumanager-1.2.28.6-0.1nogfs.i386.rpm
http://people.redhat.com/lhh/clumanager-1.2.28.6-0.1nogfs.x86_64.rpm
http://people.redhat.com/lhh/clumanager-1.2.28.6-0.1nogfs.src.rpm

Let us know if this works for you.
QA: This is a pre-emptive strike against a potential problem, and needs no testing. Verify that the cluster still operates under normal constraints.
Actually scratch that; writing a test case.
1. Set cluquorumd to debug log level. Send cluquorumd SIGHUP signals very quickly by running this for a few seconds (press ^C to stop it):

   while [ 0 ]; do killall -HUP cluquorumd; done

2. Run clustat. On an affected version, it should hang.
3. On 1.2.30, this behavior should not exist.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2006-0196.html