Description of problem:
When a service is recovered after the node it was running on crashes, the last_owner reported by the 'clustat -l' utility is wrong. last_owner is updated only by the 'service stop' logic; nothing updates it in the failover case.

Version-Release number of selected component (if applicable):
rgmanager-2.0.52-6.el5.centos

How reproducible:

Steps to Reproduce:
1. Create service SRV1 on node N1 with relocation rules to node N2 on failover
2. Start SRV1 on N1
3. Fence node N1
4. After the service has started on N2, check the last_owner for SRV1 with 'clustat -l'

Additional info:
Temporary fix for the problem.

[root@dim-ws rgmanager-2.0.52]# diff -u src/daemons/rg_state.c src/daemons/rg_state.c.orig
--- src/daemons/rg_state.c 2010-07-02 12:19:07.000000000 +0400
+++ src/daemons/rg_state.c.orig 2010-07-02 12:18:58.000000000 +0400
@@ -681,6 +681,7 @@
 /*
  * Service is running but owner is down -> RG_EFAILOVER
  */
+ svcStatus->rs_last_owner = svcStatus->rs_owner;
 clulog(LOG_NOTICE, "Taking over service %s from down member %s\n", svcName, memb_id_to_name(membership,
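
To make the intent of the one-line patch above easier to follow, here is a minimal, self-contained sketch of the failover bookkeeping. It is not rgmanager code: struct svc_state, svc_failover() and NODE_NONE are hypothetical names invented for illustration, and node IDs are modelled as plain integers. Only the field names rs_owner and rs_last_owner come from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define NODE_NONE 0 /* hypothetical ID meaning "no node" */

/* Hypothetical, simplified stand-in for the service state record.
 * Only the two fields touched by the patch are modelled. */
struct svc_state {
    int rs_owner;      /* node currently running the service */
    int rs_last_owner; /* node that ran the service before rs_owner */
};

/* Take over a service whose owner has gone down.
 * The first assignment mirrors the line added by the patch: the
 * failed owner is remembered before ownership changes hands.
 * Without it, rs_last_owner keeps whatever value the last orderly
 * stop left behind. */
void svc_failover(struct svc_state *svc, int new_owner)
{
    assert(svc != NULL);
    svc->rs_last_owner = svc->rs_owner;
    svc->rs_owner = new_owner;
}
```

For example, a service owned by node 2 with no previous owner, taken over by node 3, should end up with rs_owner == 3 and rs_last_owner == 2.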
http://git.fedorahosted.org/git?p=cluster.git;a=commit;h=221c56a50dc2451eadffcdf11ebee8b5542377ea
This request was evaluated by Red Hat Product Management for inclusion in the current release of Red Hat Enterprise Linux. Because the affected component is not scheduled to be updated in the current release, Red Hat is unfortunately unable to address this request at this time. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.
This request was erroneously denied for the current release of Red Hat Enterprise Linux. The error has been fixed and this request has been re-proposed for the current release.
Verified in version rgmanager-2.0.52-19.el5, kernel 2.6.18-265.el5

(03:15:36) [root@a1:~]$ clustat -l
Cluster Status for a_cluster @ Fri Jun 10 03:15:50 2011
Member Status: Quorate

 Member Name   ID   Status
 ------ ----   ---- ------
 a1            1    Online, Local, rgmanager
 a2            2    Online, rgmanager
 a3            3    Online, rgmanager

Service Information
------- -----------

Service Name      : service:nfsservice
  Current State   : started (112)
  Flags           : none (0)
  Owner           : a2
  Last Owner      : none
  Last Transition : Fri Jun 10 03:15:41 2011

(03:15:50) [root@a1:~]$ fence_node a2
(03:16:21) [root@a1:~]$ clustat -l
Cluster Status for a_cluster @ Fri Jun 10 03:17:09 2011
Member Status: Quorate

 Member Name   ID   Status
 ------ ----   ---- ------
 a1            1    Online, Local, rgmanager
 a2            2    Offline
 a3            3    Online, rgmanager

Service Information
------- -----------

Service Name      : service:nfsservice
  Current State   : started (112)
  Flags           : none (0)
  Owner           : a3
  Last Owner      : a2
  Last Transition : Fri Jun 10 03:16:52 2011

(03:17:09) [root@a1:~]$
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-1000.html