Description of problem:

From: Simone Gotti
Subject: [Linux-cluster] 2 missing patches in HEAD and RHEL5 branch. (rg_state.c and ip.sh)
Date: Fri, 12 Jan 2007 14:59:53 +0100 (08:59 EST)

Hi all,

On a 2 node openais cman cluster, I failed a network interface and noticed that the service didn't fail over to the other node.

Looking at the rgmanager-2.0.16 code I noticed that handle_relocate_req is called with preferred_target = -1, but inside this function there are 2 checks to see whether preferred_target is set. The check is 'if (preferred_target != 0)', so the function thinks that a preferred target has been chosen. Then, inside the loop, the only target that really exists is "me" (as -1 isn't a real target), and there is a "goto exausted:". The service is then restarted only on the local node, where it fails again, and so it is stopped.

Changing these checks to "> 0" worked.

Before writing a patch I noticed that the RHEL4 CVS tag uses NODE_ID_NONE instead of the numeric values, so the problem (not tested) probably doesn't happen there.

Is it perhaps a forgotten patch on HEAD and RHEL5?
- (other ref omitted; in a separate bugzilla)

Patch:

Index: rgmanager/src/daemons/rg_state.c
===================================================================
RCS file: /cvs/cluster/cluster/rgmanager/src/daemons/rg_state.c,v
retrieving revision 1.24.4.2
diff -u -r1.24.4.2 rg_state.c
--- rgmanager/src/daemons/rg_state.c	14 Dec 2006 22:17:21 -0000	1.24.4.2
+++ rgmanager/src/daemons/rg_state.c	12 Jan 2007 20:57:51 -0000
@@ -1292,7 +1292,7 @@
 		      int *new_owner)
 {
 	cluster_member_list_t *allowed_nodes, *backup = NULL;
-	uint32_t target = preferred_target, me = my_id();
+	int target = preferred_target, me = my_id();
 	int ret, x;
 
 	/*
@@ -1308,7 +1308,7 @@
 		return RG_EFORWARD;
 	}
 
-	if (preferred_target != 0) {
+	if (preferred_target >= 0) {
 		allowed_nodes = member_list();
 
 		/*
@@ -1380,7 +1380,7 @@
 		//count_resource_groups(allowed_nodes);
 	}
 
-	if (preferred_target != 0)
+	if (preferred_target >= 0)
 		memb_mark_down(allowed_nodes, preferred_target);
 
 	memb_mark_down(allowed_nodes, me);
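For illustration only, the effect of the two checks on the -1 value described in the report can be sketched in isolation. This is not rgmanager code: the helper names and the NO_PREFERRED_TARGET macro below are hypothetical, and only the comparisons themselves are taken from the patch. The sketch shows that "!= 0" treats -1 as a chosen target while ">= 0" does not, and that storing -1 in a uint32_t (as the pre-patch declaration did) turns it into a large positive value. Note that the ">= 0" form in the patch and the "> 0" form mentioned in the email differ only for the value 0.

```c
#include <stdint.h>

/* Hypothetical name for the "no preferred target" value; the report passes -1. */
#define NO_PREFERRED_TARGET (-1)

/* Pre-patch check: any non-zero value, including -1, counts as a target. */
static inline int has_target_old(int preferred_target)
{
    return preferred_target != 0;
}

/* Post-patch check: only non-negative values count as a target. */
static inline int has_target_new(int preferred_target)
{
    return preferred_target >= 0;
}

/* Pre-patch storage: converting -1 to uint32_t wraps to UINT32_MAX. */
static inline uint32_t as_unsigned_target(int preferred_target)
{
    return (uint32_t)preferred_target;
}
```

With these helpers, has_target_old(NO_PREFERRED_TARGET) is true and has_target_new(NO_PREFERRED_TARGET) is false, which matches the behaviour change the patch is after.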
This would block http://testify.test.redhat.com/plancases.cgi?op=view&id=3967 in a 2 node config.
Devel ACK - Regression from RHEL4, blocks QE test for RHEL5 and fix is available. Requires a rebuild of the rgmanager package.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux major release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Major release. This request is not yet committed for inclusion.
This bugzilla has Keywords: Regression. Since no regressions are allowed between releases, it is also being marked as a blocker for this release. Please resolve ASAP.
Patch in cvs
A package has been built which should help the problem described in this bug report. This report is therefore being closed with a resolution of CURRENTRELEASE. You may reopen this bug report if the solution does not work for you.
Moving all RHCS ver 5 bugs to RHEL 5 so we can remove RHCS v5 which never existed.