Bug 214477 - Multiple "exclusive" services are started on the same node
Status: CLOSED ERRATA
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: rgmanager
Version: 4
Hardware: All   OS: Linux
Priority: medium   Severity: high
Assigned To: Lon Hohberger
QA Contact: Cluster QE
Reported: 2006-11-07 14:50 EST by Jiho Hahm
Modified: 2010-10-22 02:52 EDT

Fixed In Version: RHBA-2007-1000
Doc Type: Bug Fix
Last Closed: 2007-11-21 16:53:04 EST

Attachments
patch (1.93 KB, patch) - 2007-04-05 02:10 EDT, Andrey Mirkin
diff-rhel4-rgmanager-1.9.54-1 (5.63 KB, patch) - 2007-04-12 04:38 EDT, Andrey Mirkin
diff-rhel5-rgmanager-2.0.23-1 (5.42 KB, patch) - 2007-04-12 04:41 EDT, Andrey Mirkin

Description Jiho Hahm 2006-11-07 14:50:18 EST
Description of problem:

A node should run at most one service marked as exclusive at any moment, but
the exclusivity seems to be honored only during failover, initial startup, and
relocation without specifying a node.  It is possible to run multiple exclusive
services on a single node with "clusvcadm -e <service>" (without -m/-n), with
"clusvcadm -r <service> -m <node>", and in some cases during initial daemon startup.

This can lead to serious problems, depending on the reason exclusivity was
required by the application under cluster control.  In my case, logical data
corruption can result when one application instance can potentially modify a mix
of data files belonging to two services.  I call the corruption logical because
it is not filesystem-level corruption, but it can nonetheless result in
unrecoverable damage.  This is bordering on urgent severity.


Version-Release number of selected component (if applicable):
rgmanager-1.9.54-1.i386.rpm
(same result with rgmanager-1.9.34-1.i386.rpm)


How reproducible: always


Steps to Reproduce:
1. Create a simple cluster with 3 nodes (N1, N2, N3) and 2 services (S1 and S2).
Each service can be a simple service with one IP address.  Set exclusive="1" in
both services.

2. Start the daemons.  Sometimes, both S1 and S2 end up getting started on the
same node.  This problem doesn't happen all the time.

3a. Start S1 on N1, S2 on N2.  Log in to N1.
3b. clusvcadm -d S2; clusvcadm -e S2
3c. Both S1 and S2 are started on N1.

4a. Start S1 on N1, S2 on N2.  Log in to N1.
4b. clusvcadm -r S2 -m N1; clustat
4c. Both S1 and S2 are started on N1.

5a. Start S1 on N1, S2 on N2.  Log in to N1.
5b. clusvcadm -r S2 : S2 moves to N3.
5c. clusvcadm -r S2 (again) : S2 moves back to N2.
5d. Repeat.  S2 moves back and forth between N2 and N3.  So relocation works
correctly as long as the target node is not specified.
  
Actual results: Both exclusive services are started on the same node.


Expected results: An exclusive service should refuse to start on a node that is
already running another exclusive service.


Additional info: I looked at a somewhat old version of the code and it's clear
the exclusivity check is being performed only in some cases.  It's easy to see
that the -e and -r/-m actions do not perform the exclusivity check, but I don't
know why the initial startup case also fails some of the time.  In my test I
started rgmanager on all nodes sequentially, like:

# for n in N1 N2 N3 ; do ssh root@$n 'service rgmanager start' ; done

after ccsd/cman/fenced were started everywhere and quorum was established. 
Could the timing have something to do with it?
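
For illustration only, here is a minimal sketch of the guard being asked for:
before an exclusive service is started on (or relocated to) a node, the request
is refused if that node already owns another exclusive service.  The names and
data layout below are hypothetical and are not the actual rgmanager code.

#include <string.h>

/* Hypothetical per-service record; rgmanager keeps this information in its
 * own rg_state_t structures, not in this simplified form. */
struct svc_info {
    const char *name;     /* service name, e.g. "S1" */
    const char *owner;    /* node currently owning the service, or NULL */
    const char *state;    /* "started", "stopped", ... */
    int exclusive;        /* exclusive="1" in cluster.conf */
};

/* Return nonzero if starting 'svc' on 'node' would violate exclusivity,
 * i.e. 'node' already owns a different exclusive service. */
int exclusive_conflict(const char *node, const char *svc,
                       const struct svc_info *svcs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(svcs[i].name, svc) == 0)
            continue;                       /* skip the service being started */
        if (svcs[i].exclusive && svcs[i].owner &&
            strcmp(svcs[i].owner, node) == 0 &&
            strcmp(svcs[i].state, "started") == 0)
            return 1;                       /* conflict: refuse this node */
    }
    return 0;
}

Such a check would have to run on every request path (-e, -r -m, and autostart),
not only during failover and unqualified relocation.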
Comment 1 Lon Hohberger 2006-11-08 13:06:01 EST
Explicit specification of where a service should run (using -e/-r X -n X)
overrides the "exclusive" flag.  It also overrides failover domain ordering (but
not restriction; though maybe it should).

However, the startup case is *definitely* a bug - rgmanager should not colocate
services even on startup.  I think you're right - it sounds like a timing issue.
 It should not be hard to fix.

Can you post your <rm> tag and children?
Comment 2 Jiho Hahm 2006-11-08 14:05:30 EST
Config:

<rm>
  <failoverdomains/>
  <resources>
    <ip address="10.10.130.11" monitor_link="1"/>
    <ip address="10.10.130.12" monitor_link="1"/>
  </resources>
  <service autostart="1" exclusive="1" name="S1">
    <ip ref="10.10.130.11"/>
  </service>
  <service autostart="1" exclusive="1" name="S2">
    <ip ref="10.10.130.12"/>
  </service>
</rm>

Please reconsider the expected behavior when the target node is explicitly
specified.  If the user asks to colocate exclusive services, that's a user
error!  The software should prevent it rather than trusting that the user knows
exactly what he or she is doing.

What makes it particularly error-prone is the behavior of the ENABLE ("-e")
command without any other option.  With that command, the target node is assumed
to be the localhost, so it's very easy to trigger an explicit colocation.

By the way, I don't know if this is already happening, but the exclusivity check
should consider not only the "started" status but also other status values such
as "starting", "recovering", "failed", etc.  When choosing the eligible nodes
for bringing up an exclusive service, every node that has an exclusive service
started, or in a state in which some or all of the service's resources may still
be active, must be considered ineligible.
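
As a rough illustration of this point (a hypothetical helper, not actual
rgmanager code; the exact set of state names is an assumption based on the ones
listed above), the exclusivity check would treat any state that may still hold
resources as blocking, rather than comparing against "started" alone:

#include <string.h>

/* Hypothetical: states in which a service may still hold some or all of its
 * resources on its owner node.  Only fully stopped/disabled services would
 * be safe to ignore. */
static int state_may_hold_resources(const char *state)
{
    static const char *unsafe[] = {
        "started", "starting", "stopping", "recovering", "failed"
    };
    for (size_t i = 0; i < sizeof(unsafe) / sizeof(unsafe[0]); i++)
        if (strcmp(state, unsafe[i]) == 0)
            return 1;
    return 0;
}

A node would then be ineligible for an exclusive service whenever it owns any
other exclusive service for which this predicate returns true.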
Comment 3 Lon Hohberger 2006-11-08 17:37:34 EST
Ok, sounds fine.  Give me a couple of days.
Comment 4 Lon Hohberger 2006-12-12 15:35:50 EST
Ok, this is taking longer than I thought, but ... the good news is that the fix
I've been working on fixes an entire class of issues like this, not just this
particular issue.
Comment 5 Andrey Mirkin 2007-04-05 02:10:31 EDT
Created attachment 151735 [details]
patch
Comment 6 Andrey Mirkin 2007-04-05 02:14:23 EDT
Hello,

I have prepared a patch which I hope fixes this problem.
Could you please take a look at it?
Comment 7 Lon Hohberger 2007-04-09 11:52:01 EDT
Hi Andrey,

Your patch looks like it would fix the problem.  Good work.

Comments:

- count_resource_groups() is a very expensive operation (because
rg_lock()/get_rg_state()/rg_unlock() is very expensive!).  If we could make a
local-only copy of it which uses get_rg_state_local() instead of get_rg_state()
during handle_start_req, this would improve performance by an order of magnitude
(a sketch of this idea follows after this comment).

- handle_start_remote_req() might need to have similar changes, except rather
than flipping to relocate, it would just return failure.

What do you think?
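
For what it's worth, here is a rough sketch of the two points above, using
invented names and simplified data; the real count_resource_groups(),
get_rg_state(), get_rg_state_local(), handle_start_req() and
handle_start_remote_req() have different interfaces.  The idea is that the
conflict count is taken from locally cached group state, and only the local
start path falls back to relocation while the remote path reports failure.

#include <string.h>

/* Hypothetical local view of a resource group; the real code would obtain
 * this via get_rg_state_local() rather than the expensive
 * rg_lock()/get_rg_state()/rg_unlock() sequence per group. */
struct local_rg_view {
    const char *name;
    const char *owner;
    int exclusive;
};

/* Count exclusive services currently owned by 'node', reading only the
 * locally cached view of each group. */
static int count_exclusive_local(const char *node,
                                 const struct local_rg_view *v, size_t n)
{
    int count = 0;
    for (size_t i = 0; i < n; i++)
        if (v[i].exclusive && v[i].owner && strcmp(v[i].owner, node) == 0)
            count++;
    return count;
}

/* What to do with a start request for an exclusive service on 'node'.
 * Local requests may fall back to relocation; remote start requests
 * (the handle_start_remote_req() case) simply report failure. */
enum start_decision { START_HERE, RELOCATE, FAIL };

static enum start_decision decide_start(int remote_request, const char *node,
                                        const struct local_rg_view *v, size_t n)
{
    if (count_exclusive_local(node, v, n) == 0)
        return START_HERE;
    return remote_request ? FAIL : RELOCATE;
}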
Comment 8 Andrey Mirkin 2007-04-12 04:38:21 EDT
Created attachment 152372 [details]
diff-rhel4-rgmanager-1.9.54-1

Hi,

I have addressed all of your comments.  A new version of the patch is attached.
This patch is for rgmanager-1.9.54-1 from RHEL4.
Comment 9 Andrey Mirkin 2007-04-12 04:41:03 EDT
Created attachment 152373 [details]
diff-rhel5-rgmanager-2.0.23-1

This patch is for rgmanager-2.0.23-1 from RHEL5.
Comment 11 Lon Hohberger 2007-05-03 10:50:02 EDT
Hi Andrey,

I applied the RHEL5 patch to the RHEL5 branch and the head of CVS on 4/19, and
I'm applying the RHEL4 patch today.  Sorry for the wait.
Comment 17 errata-xmlrpc 2007-11-21 16:53:04 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-1000.html
