Created attachment 410562
Patch to send SIGQUIT to disconnect the clients

Description of problem:
If there are active connections to the postgres database, a SIGTERM will not shut the server down while those clients remain connected, so the resource fails to stop. This can prevent the service/resource from failing over. A solution is to send a SIGQUIT if the SIGTERM fails. The relevant postgres documentation is quoted below.

Postgres documentation on stopping the server process:
To terminate the postgres server normally, the signals SIGTERM, SIGINT, or SIGQUIT can be used. The first will wait for all clients to terminate before quitting, the second will forcefully disconnect all clients, and the third will quit immediately without proper shutdown, resulting in a recovery run during restart.

Sample error messages:
Apr 28 13:23:46 rh5node-single clurgmgrd[3094]: <notice> Stopping service service:psql
Apr 28 13:23:47 rh5node-single clurgmgrd: [3094]: <debug> Verifying Configuration Of postgres-8:postgresql
Apr 28 13:23:47 rh5node-single clurgmgrd: [3094]: <debug> Verifying Configuration Of postgres-8:postgresql > Succeed
Apr 28 13:23:47 rh5node-single clurgmgrd: [3094]: <info> Stopping Service postgres-8:postgresql
Apr 28 13:24:49 rh5node-single clurgmgrd: [3094]: <err> Stopping Service postgres-8:postgresql > Failed - Application Is Still Running
Apr 28 13:24:49 rh5node-single clurgmgrd: [3094]: <err> Stopping Service postgres-8:postgresql > Failed
Apr 28 13:24:49 rh5node-single clurgmgrd[3094]: <notice> stop on postgres-8 "postgresql" returned 1 (generic error)
Apr 28 13:24:49 rh5node-single clurgmgrd: [3094]: <info> Removing IPv4 address 192.168.1.56/24 from eth1
Apr 28 13:24:59 rh5node-single clurgmgrd[3094]: <crit> #12: RG service:psql failed to stop; intervention required
Apr 28 13:24:59 rh5node-single clurgmgrd[3094]: <notice> Service service:psql is failed

Version-Release number of selected component (if applicable):
rgmanager-2.0.52-6.el5

How reproducible:
Every time

Steps to Reproduce:
1. Start up a postgres-8.sh resource contained in a service
2. Open an active connection to the database
3. Stop the postgres-8.sh resource contained in a service

Actual results:
The postgres-8.sh resource fails to stop gracefully, which can lead to the service not failing over correctly.

Expected results:
The postgres-8.sh service stops gracefully and allows failover to occur.

Additional info:
A patch is attached that fixes the issue by sending SIGQUIT to the postgres pid if the SIGTERM kill fails.
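
For illustration, the escalation described above (SIGTERM first, SIGQUIT only if the process is still running) might be sketched roughly as follows. This is not the code from the attached or the upstream patch; the function names `wait_for_exit` and `stop_with_escalation`, the 20-second default timeout, and the optional signal arguments are all assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; names and defaults are assumptions.
# Usage: stop_with_escalation PID [TIMEOUT [FIRST_SIGNAL [SECOND_SIGNAL]]]
# Returns 0 once the process is gone, 1 if it survives both signals.

# Poll once per second until PID disappears or TIMEOUT seconds have elapsed.
wait_for_exit() {
    pid=$1
    remaining=$2
    while kill -0 "$pid" 2>/dev/null; do
        [ "$remaining" -le 0 ] && return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

stop_with_escalation() {
    pid=$1
    timeout=${2:-20}     # assumed default, in seconds
    first=${3:-TERM}     # per the quoted docs: waits for clients to disconnect
    second=${4:-QUIT}    # per the quoted docs: immediate quit, recovery on restart

    # Nothing to do if the process is already gone.
    kill -0 "$pid" 2>/dev/null || return 0

    kill -s "$first" "$pid" 2>/dev/null
    wait_for_exit "$pid" "$timeout" && return 0

    # Still running after the first signal: escalate.
    kill -s "$second" "$pid" 2>/dev/null
    wait_for_exit "$pid" "$timeout"
}
```

A caller would pass the postmaster pid, for example `stop_with_escalation "$pid" 20`. Going by the documentation quoted above, SIGINT could also be passed as the second signal to disconnect clients without forcing a recovery run on restart. When trying the sketch against a background job started from a non-interactive shell, note that such jobs ignore SIGINT and SIGQUIT, so a different second signal is needed for that kind of experiment.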
I was having the same problem and tried your patch. I can confirm it works, thanks.
Hi Shane, thanks for the patch. I have created a new one which is very similar but implements the fix in a way that lets other resource agents use this functionality too. The patch is now available upstream: http://git.fedorahosted.org/git/?p=cluster.git;a=commit;h=87863e2c83b8cc3f6d7cb6b657202e04bd2df40f
An advisory has been issued which should help with the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2011-0134.html