Bug 192117 - service only starts if added on the node for which it is default
Status: CLOSED ERRATA
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: rgmanager
Version: 4
Hardware: All  OS: Linux
Priority: medium  Severity: medium
Assigned To: Lon Hohberger
QA Contact: Cluster QE
Depends On:
Blocks:
Reported: 2006-05-17 13:57 EDT by Lenny Maiorani
Modified: 2009-04-16 16:20 EDT (History)
1 user

See Also:
Fixed In Version: RHBA-2007-0149
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2007-05-10 17:16:43 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---


Attachments
Fix (as real patch) (892 bytes, patch)
2006-11-27 17:21 EST, Lon Hohberger

Description Lenny Maiorani 2006-05-17 13:57:25 EDT
Description of problem:
When a new VIP is added to /etc/cluster/cluster.conf and activated by running
'ccs_tool update' and 'cman_tool version', the VIP is only started if this was
done on the node named in the VIP's failover domain.

Example: 
Add this line to the /etc/cluster/cluster.conf resources section:
<ip address="10.250.1.93/16" monitor_link="1"/>

Add this service:
<service autostart="1" domain="node2" name="10.250.1.93">
        <ip ref="10.250.1.93/16"/>
</service>

And update the configuration version number.

If all this is done on node1, the service will not be started. However, if it is
done on node2 it will be started.
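The update step above can be sketched against a scratch copy of the
configuration. Everything below is illustrative (the cluster name, file path,
and config_version values are made up); the actual propagation commands,
ccs_tool and cman_tool, only work on a live cluster and are shown as comments:

```shell
# Build a minimal cluster.conf containing the new VIP resource and service.
cat > /tmp/cluster.conf <<'EOF'
<cluster name="demo" config_version="1">
  <rm>
    <resources>
      <ip address="10.250.1.93/16" monitor_link="1"/>
    </resources>
    <service autostart="1" domain="node2" name="10.250.1.93">
      <ip ref="10.250.1.93/16"/>
    </service>
  </rm>
</cluster>
EOF

# Bump config_version so the cluster will accept the new configuration.
sed -i 's/config_version="1"/config_version="2"/' /tmp/cluster.conf

# On a real cluster you would then propagate and activate it, e.g.:
#   ccs_tool update /etc/cluster/cluster.conf
#   cman_tool version -r 2
grep config_version /tmp/cluster.conf
```

Per the report, whether the new service actually starts then depends on which
node these steps were run on.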


Version-Release number of selected component (if applicable):
1.9.46-1.3speed (patch listed on bug 182454)

How reproducible:
Always


Actual results:
Service doesn't get started

Expected results:
Service starts
Comment 1 Lenny Maiorani 2006-05-22 19:53:16 EDT
I have found some other problems with this version of rgmanager. These are
much more serious, but they look related.

When a node panics, the VIPs on that node do not fail-over to other nodes and
they say the node they are on is "unknown". 

During a graceful shutdown, the VIPs appear to stay on the node which went down
but they are in Stopped state and have parens around the node name. 

After a graceful shutdown, when the node is brought back online it only
activates the VIPs it owns by default.

Should the changes around ownership of VIPs be taken out of this patch? Is there
a bug in one of those checks?
Comment 2 Lon Hohberger 2006-07-20 13:14:41 EDT
This is the one we worked through, right?

Does this happen on the 1.4.2x or the current beta bits in the RHN channel?
Comment 3 Lenny Maiorani 2006-07-20 15:09:20 EDT
I saw some additional situations:
When VIPs are first added to /etc/cluster/cluster.conf while the cluster service
is up, some (often many) VIPs are not started initially. But the weird thing is
that they will start if you remove them from the file and add them back in
exactly the same way.
Stranger still, you can get them all started by removing just ONE of them and
adding it back.

In short, the bug is still valid in rgmanager-1.9.46-U4pre1 you gave me.
Comment 4 dex chen 2006-07-20 15:38:16 EDT
When I took a closer look at this issue, I found that the "ip.sh" script is not
invoked when I run 'ccs_tool update /etc/cluster/cluster.conf' to push the newly
added VIP services. The end result is that the VIPs are not assigned to any
physical interface.
Comment 5 Lon Hohberger 2006-09-06 14:04:42 EDT
I wonder if this has been fixed in U4?
Comment 7 Lon Hohberger 2006-10-24 14:26:38 EDT
I think I know what this is, and no, it's not fixed in U4.
Comment 8 Lon Hohberger 2006-11-17 10:49:45 EST
This fix from CVS head should fix it:

diff -u -r1.24 -r1.25
--- cluster/rgmanager/src/daemons/groups.c	2006/10/06 21:22:27	1.24
+++ cluster/rgmanager/src/daemons/groups.c	2006/10/23 22:47:01	1.25
@@ -1090,8 +1093,20 @@
 		if (curr->rn_resource->r_flags & RF_NEEDSTART)
 			need_init = 1;
 
-		if (get_rg_state_local(rg, &svcblk) < 0)
-			continue;
+		if (!need_init) {
+			if (get_rg_state_local(rg, &svcblk) < 0)
+				continue;
+		} else {
+			if (rg_lock(rg, &lockp) != 0)
+				continue;
+
+			if (get_rg_state(rg, &svcblk) < 0) {
+				rg_unlock(&lockp);
+				continue;
+			}
+
+			rg_unlock(&lockp);
+		}
 
 		if (!need_init && svcblk.rs_owner != my_id())
 			continue;
Comment 9 Lenny Maiorani 2006-11-17 12:24:40 EST
Yes, this has fixed my problems. I have changed it slightly to retro-fit RHEL4U4...

diff -u -r1.24 -r1.25
--- cluster/rgmanager/src/daemons/groups.c	2006/10/06 21:22:27	1.24
+++ cluster/rgmanager/src/daemons/groups.c	2006/10/23 22:47:01	1.25
@@ -1090,8 +1093,20 @@
 		if (curr->rn_resource->r_flags & RF_NEEDSTART)
 			need_init = 1;
 
-		if (get_rg_state_local(name, &svcblk) < 0)
-			continue;
+		if (!need_init) {
+			if (get_rg_state_local(name, &svcblk) < 0)
+				continue;
+		} else {
+			if (rg_lock(name, &lockp) != 0)
+				continue;
+
+			if (get_rg_state(name, &svcblk) < 0) {
+				rg_unlock(name, lockp);
+				continue;
+			}
+
+			rg_unlock(name, lockp);
+		}
 
 		if (!need_init && svcblk.rs_owner != my_id())
 			continue;
Comment 10 Lon Hohberger 2006-11-27 17:21:53 EST
Created attachment 142234 [details]
Fix (as real patch)
Comment 13 Red Hat Bugzilla 2007-05-10 17:16:43 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0149.html
