Bug 1166589 - ccs should trigger config activation/propagation across the nodes no more than once
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: ricci
Version: 6.5
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Chris Feist
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On: 1157951
Blocks:
Reported: 2014-11-21 10:05 UTC by Jan Pokorný [poki]
Modified: 2015-07-22 07:34 UTC (History)

Fixed In Version: ccs-0.16.2-75.el6
Doc Type: Bug Fix
Doc Text:
Cause: ccs did not have logic to prevent multiple syncs/activations in one ccs command.
Consequence: It was possible to issue a command using multiple options that would cause multiple syncs and activations.
Fix: Only allow one sync/activation per command.
Result: ccs no longer issues multiple sync/activation commands.
Clone Of: 1157951
Environment:
Last Closed: 2015-07-22 07:34:07 UTC


Attachments
Proposed patch (4.02 KB, patch)
2014-11-21 15:14 UTC, Jan Pokorný [poki]


Links
System: Red Hat Product Errata
ID: RHBA-2015:1405
Priority: normal
Status: SHIPPED_LIVE
Summary: ricci bug fix and enhancement update
Last Updated: 2015-07-20 18:07:08 UTC

Description Jan Pokorný [poki] 2014-11-21 10:05:50 UTC
Fixing this on the ccs side alone is apparently not enough for flawless
cluster stack operation, but it will at least lower the probability of
the cluster stack components being asked to reload the configuration
again before the previous reload has finished (cf. the likely race
condition in rgmanager in the original [bug 1157951]).


+++ This bug was initially created as a clone of Bug #1157951 +++

--- Additional comment from Jan Pokorný on 2014-11-20 23:52:52 CET ---

[...]

0. assumption:
   you originally used the same (or equivalent) command as later on, i.e.:

>  ccs -h localhost --activate --sync --password "secret" --rmvm iRed2

--


1. "Updating cluster.conf" followed by symptoms of cluster.conf indeed
   being propagated, twice in quick succession on nr-c03n01, seemed
   unnatural and suspicious

->

2. indeed there is a bug in ccs causing the following sequence:

   - if (removevm): remove_vm(name)
     -> set_cluster_conf (while "activate" holds ~ --activate,
                          only against localhost)

     <spoiler-alert>
         "activate" should be temporarily masked if "sync" is set
         to prevent "double activate", just as the method below does
     </spoiler-alert>

   - if (sync): sync_cluster_conf()
     -> set_cluster_conf (with "activate" masked,
                          against all nodes via cluster.conf hostnames)
     -> set_cluster_conf (with "activate" unmasked, hence true as above,
                         only against the last enumerated node)
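
To illustrate the sequence above, here is a minimal Python sketch. It is a
hypothetical model, not the actual ccs source: the run() helper, the
recording of calls, and the node names are assumptions made for
illustration only. It counts how many set_cluster_conf calls carry
"activate":

```python
# Hypothetical model of the flow described above; NOT the real ccs code.
calls = []  # (target, activate) for each simulated set_cluster_conf call


def set_cluster_conf(target, activate):
    calls.append((target, activate))


def remove_vm(activate):
    # Bug: "activate" is not masked here even though --sync follows.
    set_cluster_conf("localhost", activate)


def sync_cluster_conf(nodes, activate):
    # Push to every node with "activate" masked...
    for node in nodes:
        set_cluster_conf(node, False)
    # ...then once more to the last enumerated node with it unmasked.
    set_cluster_conf(nodes[-1], activate)


def run(nodes, activate, sync):
    """Simulate `ccs [--activate] [--sync] --rmvm NAME`; return the number
    of calls that would trigger an activation."""
    calls.clear()
    remove_vm(activate)
    if sync:
        sync_cluster_conf(nodes, activate)
    return sum(1 for _, act in calls if act)


# Node names are made up for the example.
print(run(["node1", "node2"], activate=True, sync=True))  # prints 2
```

With both --activate and --sync, the model triggers two activations: one
from the modifier and one at the end of the sync.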

--

Bottom line: there is still a bug in rgmanager, which in some
circumstances cannot deal with 2+ consecutive configuration updates
within a very short time frame (likely a race condition)

Good news: the buggy ccs (working, in a sense, but less efficiently than
it should) helped to discover this bug :)

Comment 1 Jan Pokorný [poki] 2014-11-21 15:14:20 UTC
Created attachment 959802
Proposed patch

The solution should be easy: temporarily mask the "activate" flag and
unmask it just before "sync", which is intentionally the last triggerable
modifier in the ccs invocation.

NOTE:

> This variant of the patch tries to preserve the original behavior that
> a standalone --activate (without --sync as suggested per help message)
> will also activate (the rule of "no more than once" is respected).
> 
> If that is not suitable, replace "not(sync) and activate" with "False".
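
The masking idea can be sketched in Python as follows. This is only a
hypothetical model of the approach described in this comment, not the
attached patch: run(), the activations list, and the node names are
assumptions made for illustration.

```python
# Hypothetical sketch of the masking approach; NOT the attached patch.
activations = []  # targets for which an activation would be triggered


def set_cluster_conf(target, activate):
    if activate:
        activations.append(target)


def run(nodes, activate, sync):
    """Simulate `ccs [--activate] [--sync] --rmvm NAME` with "activate"
    masked until the final sync step; return the number of activations."""
    activations.clear()
    # Modifier step (e.g. --rmvm): mask "activate" when --sync follows,
    # but keep it for a standalone --activate.
    set_cluster_conf("localhost", not sync and activate)
    if sync:
        # Sync is the last step: push to all nodes with "activate"
        # masked, then unmask it for the final call only.
        for node in nodes:
            set_cluster_conf(node, False)
        set_cluster_conf(nodes[-1], activate)
    return len(activations)


nodes = ["node1", "node2"]  # made-up node names
print(run(nodes, activate=True, sync=True))   # prints 1
print(run(nodes, activate=True, sync=False))  # prints 1
```

Replacing "not sync and activate" with False in this model makes a
standalone --activate a no-op, which corresponds to the alternative
mentioned in the note.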

Comment 4 Chris Feist 2015-03-03 23:49:48 UTC
Before Fix (2 propagate commands sent):
[root@ask-03 ~]# rpm -q ccs
ccs-0.16.2-75.el6.x86_64
[root@ask-03 ~]# rm -f /etc/cluster/cluster.conf 
[root@ask-03 ~]# ccs --createcluster test_cluster
[root@ask-03 ~]# ccs --addnode localhost
Node localhost added.
[root@ask-03 ~]# ccs --addvm my_vm
[root@ask-03 ~]# ccs --sync --activate --debug  --rmvm my_vm | grep propagate | wc
      2      34     678



After Fix (1 propagate command sent):
[root@ask-02 ccs]# rpm -q ccs
ccs-0.16.2-77.el6.x86_64
[root@ask-02 ccs]# rm -f /etc/cluster/cluster.conf 
[root@ask-02 ccs]# ccs --createcluster test_cluster
[root@ask-02 ccs]# ccs --addnode localhost
Node localhost added.
[root@ask-02 ccs]# ccs --addvm my_vm
[root@ask-02 ccs]# ccs --sync --activate --debug  --rmvm my_vm | grep propagate | wc
      1      17     340

Comment 8 errata-xmlrpc 2015-07-22 07:34:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-1405.html

