Bug 118886 - service's device mounted on both cluster nodes
Summary: service's device mounted on both cluster nodes
Alias: None
Product: Red Hat Enterprise Linux 2.1
Classification: Red Hat
Component: clumanager
Version: 2.1
Hardware: i686
OS: Linux
Target Milestone: ---
Assignee: Lon Hohberger
QA Contact:
Depends On:
Blocks: 116726
Reported: 2004-03-22 12:21 UTC by Radek Bohunsky
Modified: 2007-11-30 22:06 UTC

Clone Of:
Last Closed: 2004-08-18 15:52:55 UTC

Attachments
Patch to prevent start during node-event after a failback (2.37 KB, patch)
2004-04-06 14:45 UTC, Lon Hohberger
Correct patch. (3.05 KB, patch)
2004-04-06 14:48 UTC, Lon Hohberger
log (500 bytes, text/plain)
2004-05-11 12:56 UTC, Radek Bohunsky
Previous patch plus fix for no-start on relocate (3.60 KB, patch)
2004-05-11 13:36 UTC, Lon Hohberger
Patch addressing both issues. (4.25 KB, patch)
2004-05-12 14:37 UTC, Lon Hohberger

External Trackers
Tracker: Red Hat Product Errata
ID: RHBA-2004:223
Priority: normal
Status: SHIPPED_LIVE
Summary: Updated clumanager package fixes bugs
Last Updated: 2004-08-18 04:00:00 UTC

Description Radek Bohunsky 2004-03-22 12:21:43 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030624

Description of problem:
If you try to disable a started service and the user script returns a
nonzero status, the device associated with the service isn't unmounted
(and the alias IP address isn't released), but the service state is set
to disabled. This may lead to data corruption.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. create a user script which returns nonzero in stop - for example
case $1 in
        start)  exit 0 ;;
        stop)   exit 1 ;;
        status) exit 0 ;;
esac
2. create relocatable cluster service with this user script and some
disk device - for example:
cluadmin> service show config test
name: test
preferred node: testcluster1
relocate: yes
user script: /tmp/test.initd
monitor interval: 10
device 0: /dev/sda10
  mount point, device 0: /cluster/test
  mount fstype, device 0: ext3
  force unmount, device 0: yes
  samba share, device 0: None
3. start service on preferred node
4. switch the preferred node (testcluster1) off and on again
(the service is relocated to testcluster2 after the switch-off)
5. after testcluster1 has fully started, the service is relocated back
to testcluster1 (the preferred node)

Actual Results:  device /dev/sda10 is mounted on both cluster nodes
(and application may be running on both nodes too)
(The same with ip address)

Expected Results:  the device is unmounted (regardless of the user script's
exit status), or the service isn't started on the other node if the user
script returns nonzero

Additional info:

I think the bug is platform-independent, but I don't have the hardware to
test it.

Maybe a sufficient fix is to swap the order of setServiceStatus and
exec_service_script in svc_stop (src/daemons/svcmgrd.c), but I have no
idea what impact this change would have on the whole cluster manager.
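
The reordering suggested here can be sketched roughly as follows. This is only an illustration, not the clumanager source: stop_then_mark, struct service, the ST_* states and the two sample callbacks are hypothetical stand-ins for svc_stop, setServiceStatus and exec_service_script, whose real signatures are not shown in this report.

```c
/* Hypothetical service states; the real clumanager names may differ. */
enum svc_state { ST_STARTED, ST_DISABLED };

struct service {
    const char *name;
    enum svc_state state;
};

/* Stand-in for running the user script's "stop" action: 0 means success. */
typedef int (*stop_fn)(const struct service *svc);

/* Sample callbacks used only for illustration. */
int stop_ok(const struct service *svc)    { (void)svc; return 0; }
int stop_fails(const struct service *svc) { (void)svc; return 1; }

/*
 * Run the stop action first and record "disabled" only if it succeeded.
 * Returns 0 on a clean stop, -1 if the stop action failed (state unchanged).
 */
int stop_then_mark(struct service *svc, stop_fn run_stop)
{
    if (run_stop(svc) != 0)
        return -1;              /* device/IP may still be in use: keep old state */
    svc->state = ST_DISABLED;
    return 0;
}
```

With this ordering a failing stop script leaves the recorded state untouched, so the other node has no "disabled" or "stopped" state to act on. Whether that alone is safe for the rest of the cluster manager is exactly the open question raised above.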

Comment 2 Lon Hohberger 2004-03-22 14:05:38 UTC
Needs further evaluation.

Comment 3 Lon Hohberger 2004-03-22 14:08:21 UTC
Note: This is not a problem in RHEL3's Cluster Suite; services which
fail to stop (cleanly) are placed into a separate state.

Comment 4 Radek Bohunsky 2004-04-06 12:10:10 UTC
Will this separate state be implemented in the AS2.1 clumanager? Or will
there be any other solution in the foreseeable future?

Comment 5 Lon Hohberger 2004-04-06 14:27:32 UTC
Maybe.  In any case, there should be something to prevent inadvertent
starting of failed services.

In RHEL3, whenever a service fails to stop, we mark it as 'failed'. 
This is done because there is no reliable way to determine why
it failed to stop; it could have failed for *any* reason.  When a
service enters this state, it is not possible to start or relocate the
service, and member transitions have no effect on it.  The only way to
fix a service which enters this state is by disabling it first (manually).
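
As a rough sketch of the behavior described for the 'failed' state (all names below are hypothetical and not taken from the RHEL3 Cluster Suite source; only the rules themselves come from this comment):

```c
/* Hypothetical names; not taken from the RHEL3 Cluster Suite source. */
enum svc_state   { ST_STOPPED, ST_STARTED, ST_DISABLED, ST_FAILED };
enum svc_request { REQ_START, REQ_RELOCATE, REQ_DISABLE, REQ_MEMBER_CHANGE };

/* State to record after a stop attempt; stop_rc is the script's exit status. */
enum svc_state state_after_stop(int stop_rc)
{
    return stop_rc == 0 ? ST_STOPPED : ST_FAILED;
}

/* Returns 1 if the request may act on a service in the given state. */
int request_allowed(enum svc_state state, enum svc_request req)
{
    if (state == ST_FAILED)
        return req == REQ_DISABLE;  /* only a manual disable clears it */
    return 1;                       /* other states are not modelled here */
}
```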

Comment 6 Lon Hohberger 2004-04-06 14:45:39 UTC
Created attachment 99145 [details]
Patch to prevent start during node-event after a failback

This patch should fix the problem.  When a quorum daemon notifies the local
service manager that it is quorate, the service manager handles several things
- one is service initialization, the other is sending a "failback" request. 
The "failback" request instructs the remote service manager to relocate any
running services back to the preferred node.

What I think was happening was:

Local node:                      Remote node:
(1) Spawn service handlers.
(2) Send failback request.
                                 (3) Handle failback request (as a relocate req).
                                 (4) Exec stop; mark service "stopped".
(5) Service handler sees
    service "stopped". Attempt
    to start.
                                 (6) Stop failed. Attempt to restart.
                                 (7) Exec start; mark service "started".
(8) Exec start; mark service
    "started".

(bad things...)

The patch changes it in this way:

Instead of using the "stopped" state for relocation requests, we now use the
"pending" state.  When a service is marked as "pending", only remote starts
from other service managers (e.g. a relocate request) or a cluadmin "disable"
request will cause the service to change state.

This patch is untested.
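
A toy model of the suspected race, assuming the stop script fails on the remote node. Nothing here is clumanager code; the state names and both functions are hypothetical and only mirror the sequence described in this comment.

```c
/* Hypothetical model of the failback race; not clumanager code. */
enum svc_state { ST_STARTED, ST_STOPPED, ST_PENDING };

/* Local service handler: it starts any service it sees as "stopped". */
int local_handler_starts(enum svc_state seen)
{
    return seen == ST_STOPPED;
}

/*
 * Count the nodes holding the service's resources after a failback whose
 * stop script failed on the remote node. `marker` is the state the remote
 * node records while relocating: ST_STOPPED before the patch,
 * ST_PENDING with the patch.
 */
int nodes_holding_resources(enum svc_state marker)
{
    int holders = 1;                /* remote node: stop failed, service restarted */
    if (local_handler_starts(marker))
        holders++;                  /* local node also ran "start" */
    return holders;
}
```

In this model nodes_holding_resources(ST_STOPPED) gives two holders (the bug), while nodes_holding_resources(ST_PENDING) gives one.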

Comment 7 Lon Hohberger 2004-04-06 14:48:27 UTC
Created attachment 99146 [details]
Correct patch.

Attached the wrong patch.

This is the corrected patch.

Comment 8 Radek Bohunsky 2004-05-10 14:31:43 UTC
I have tried this patch (in the new clumanager-1.0.26-2 version). When I
stop the cluster manager (rc-script), services are relocated, but when I
start it again, the services which belong to this node are not started
and stay in the pending state.
Looking at the logs, I can see that before starting any service on this
node, the cluster manager tries to stop all defined services. Could that
be the reason for this behaviour?
What now? :-)

Comment 9 Lon Hohberger 2004-05-10 14:47:09 UTC
Services are always stopped during service manager boot in order to
clean up any potentially allocated resources.

Is your script still configured as stated above?

Comment 10 Radek Bohunsky 2004-05-10 15:50:25 UTC
Yes, maybe, but their stop scripts obviously end with an error (the
services are not running and we try to stop them), which (with your
patch) leaves the services in the pending state, and the cluster manager
doesn't try to start them again.

I tried it on a real system. I could configure this dummy service again,
but the problem is elsewhere.

Comment 11 Lon Hohberger 2004-05-10 16:40:47 UTC
Being stuck in the wrong state ("pending" instead of "disabled") is a
bug, but it is different from having devices mounted on both nodes.

Service scripts returning "error" (nonzero) in the stop phase should
only ever indicate an unrecoverable situation; the fact that the
cluster does not recover from it is expected behavior.

Comment 12 Radek Bohunsky 2004-05-11 12:56:41 UTC
Created attachment 100146 [details]
log

Comment 13 Radek Bohunsky 2004-05-11 13:05:12 UTC
(Hm, scripts that must return OK in some "not so bad" error situations
do not look very clean ;-) Sometimes it's hard to distinguish critical
errors from the others. And I don't think it's good design to run stop
scripts without a good reason, just to be sure.)

I have tried changing all exit statuses to zero for testing purposes,
but the service ends up in the pending state again (after it was stopped
correctly).

The attached part of the log shows the last messages of the relocate
request after the service was stopped successfully.

Comment 14 Lon Hohberger 2004-05-11 13:36:02 UTC
Created attachment 100147 [details]
Previous patch plus fix for no-start on relocate

The delta of the two patches is basically this change to line 1203 in

-	((req != SVC_START_PENDING) && (req != SVC_DISABLE))) {
+	((req != SVC_START_RELOCATE) && (req != SVC_DISABLE))) {

This should fix it.  START_PENDING is an unnecessary request; this reference
was missed.  Basically, the service manager needs to drop all enable, start,
and stop requests while a service is being relocated by a node.
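
The filter can be read roughly like this. SVC_START_RELOCATE and SVC_DISABLE are the names from the diff; their numeric values, the EX_* request names, the is_pending flag and the function itself are assumptions made for illustration, since the surrounding code is not shown here.

```c
enum svc_request {
    SVC_START_RELOCATE,             /* named in the diff; value assumed */
    SVC_DISABLE,                    /* named in the diff; value assumed */
    EX_ENABLE, EX_START, EX_STOP    /* hypothetical names for the dropped requests */
};

/* Returns 1 if a request should be dropped for a service being relocated. */
int drop_while_relocating(int is_pending, enum svc_request req)
{
    return is_pending &&
           (req != SVC_START_RELOCATE) && (req != SVC_DISABLE);
}
```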

Comment 15 Lon Hohberger 2004-05-11 14:21:39 UTC
Just for a bit of info:

The scripts are System-V style: either it works or it doesn't; if it
doesn't work, there's no way to tell what went wrong.

The simplicity of the model is a double-edged sword:  It is
simultaneously the greatest strength and greatest weakness.  It is
very easy to write plugin scripts, but their return codes are limited
(e.g. 0, 1 for stop/start, 0, 1, 3 for status).
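
To illustrate how little a caller learns from such a script, here is a minimal sketch using the standard system() call and the wait-status macros. The two function names are hypothetical and this is not how clumanager itself invokes scripts.

```c
#include <stdlib.h>
#include <sys/wait.h>

/*
 * Run a shell command and return its exit status (0-255),
 * or -1 if it could not be run or did not exit normally.
 */
int script_exit_status(const char *command)
{
    int ws = system(command);
    if (ws == -1 || !WIFEXITED(ws))
        return -1;
    return WEXITSTATUS(ws);
}

/* The caller can only tell "worked" from "did not work". */
int action_succeeded(const char *command)
{
    return script_exit_status(command) == 0;
}
```

For example, action_succeeded("/tmp/test.initd stop") would report failure for the test script from the original report, without saying why the stop failed.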

Other models are being evaluated for future releases of Red Hat
Enterprise Linux.

Comment 16 Radek Bohunsky 2004-05-11 18:13:29 UTC
The new patch looks better: the service stays in the pending state if it
starts to migrate to the preferred node and the stop script returns nonzero.
All other standard actions look right too.
Unfortunately, a similar problem exists when the whole clumanager is
stopped correctly on one node (/etc/rc.d/init.d/cluster stop) and the
stop script of a service that migrates to the running node returns
nonzero: resources are allocated on both nodes.

Comment 17 Lon Hohberger 2004-05-12 14:37:00 UTC
Created attachment 100182 [details]
Patch addressing both issues.

This patch has been unit-tested, and addresses the following two situations:

Scenario #1: Non-preferred node owns service; preferred node down, "relocate on
preferred node boot" is set for the service.  Preferred node joins the cluster.

Incorrect behavior: Service is relocated to preferred node even though the
"stop" phase fails, ending up with resources allocated on both cluster members.

Correct behavior: The service should not relocate to the preferred node, and
instead should be restarted on the non-preferred node to minimize down time. 
In the event that the service can not be restarted, it should be disabled.  If
it becomes disabled, manual intervention is required to clean up the service
prior to re-enabling it.

Scenario #2: Service running on member 'A'.  'A' leaves cluster (via "service
cluster stop").

Incorrect behavior:  The stop phase fails for the service, and the only thing
that happens is a message: "Not all service stopped cleanly, reboot needed". 
The other cluster member, 'B', starts the service, resulting in resources
allocated on both members.

Correct behavior:  The service should be disabled if the 'stop' phase fails. 
In the event that a service becomes disabled from this, manual intervention is
required to clean up the service prior to re-enabling it.
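
The intended behavior for the two scenarios can be summarized as a small decision sketch. The outcome names and both functions are hypothetical and are not taken from the patch; the arguments are the exit statuses of the service script's stop and start phases.

```c
/* Hypothetical outcome names; not taken from the patch. */
enum outcome { OUT_RELOCATED, OUT_RESTARTED_IN_PLACE, OUT_DISABLED, OUT_STOPPED };

/* Scenario #1: failback to the preferred node. */
enum outcome failback_outcome(int stop_rc, int restart_rc)
{
    if (stop_rc == 0)
        return OUT_RELOCATED;           /* clean stop: safe to move */
    if (restart_rc == 0)
        return OUT_RESTARTED_IN_PLACE;  /* stay on the non-preferred node */
    return OUT_DISABLED;                /* needs manual cleanup */
}

/* Scenario #2: the owning member leaves the cluster. */
enum outcome member_leave_outcome(int stop_rc)
{
    return stop_rc == 0 ? OUT_STOPPED : OUT_DISABLED;
}
```

In both cases a failed stop never ends with the service being started on the other member.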

Comment 18 Radek Bohunsky 2004-05-14 15:41:57 UTC
This patch looks good. I have tried almost all the situations that
occurred to me, and none of them ends in a dangerous state.

Will this patch be adopted in the next version of clumanager for AS 2.1?

Comment 19 Lon Hohberger 2004-05-14 16:04:15 UTC
If I have any say in the matter, it certainly will be.

Thank you for your diligent testing; my apologies for not addressing
it in a more efficient manner.

Comment 20 Derek Anderson 2004-06-16 14:27:56 UTC
Verified that both test scenarios work as expected.  Setting to 

Comment 21 John Flanagan 2004-08-18 15:52:55 UTC
An errata has been issued which should help the problem 
described in this bug report. This report is therefore being 
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, 
please follow the link below. You may reopen this bug report 
if the solution does not work for you.

