Bug 1166589

Summary: ccs should trigger config activation/propagation across the nodes no more than once
Product: Red Hat Enterprise Linux 6 Reporter: Jan Pokorný [poki] <jpokorny>
Component: ricci Assignee: Chris Feist <cfeist>
Status: CLOSED ERRATA QA Contact: cluster-qe <cluster-qe>
Severity: medium Docs Contact:
Priority: low    
Version: 6.5 CC: cluster-maint, cluster-qe, fdinitto, jpokorny, mkelly, rmccabe, rsteiger
Target Milestone: rc Keywords: EasyFix, Patch
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ccs-0.16.2-75.el6 Doc Type: Bug Fix
Doc Text:
Cause: ccs did not have logic to prevent multiple syncs/activations in one ccs command.
Consequence: It was possible to issue a command using multiple options that would cause multiple syncs and activations.
Fix: Only allow one sync/activation per command.
Result: ccs no longer issues multiple sync/activation commands.
Story Points: ---
Clone Of: 1157951 Environment:
Last Closed: 2015-07-22 07:34:07 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1157951    
Bug Blocks:    
Attachments:
Description Flags
Proposed patch none

Description Jan Pokorný [poki] 2014-11-21 10:05:50 UTC
Fixing this on the ccs side alone is apparently not enough for flawless
cluster stack operation, but it will at least lower the probability of
a configuration reload being triggered again within the cluster stack
components before the previous reload has finished completely
(cf. the likely race condition in rgmanager in the original
[bug 1157951]).


+++ This bug was initially created as a clone of Bug #1157951 +++

--- Additional comment from Jan Pokorný on 2014-11-20 23:52:52 CET ---

[...]

0. assumption:
   you originally used the same (or equivalent) command as later on, i.e.:

>  ccs -h localhost --activate --sync --password "secret" --rmvm iRed2

--


1. "Updating cluster.conf" followed by symptoms of cluster.conf indeed
   being propagated, twice in a row in quick succession on nr-c03n01,
   seemed unnatural and suspicious

->

2. indeed there is a bug in ccs causing the following sequence:

   - if (removevm): remove_vm(name)
     -> set_cluster_conf (while "activate" holds ~ --activate,
                          only against localhost)

     <spoiler-alert>
         "activate" should be temporarily masked if "sync" is set
         to prevent "double activate", just as the method below does
     </spoiler-alert>

   - if (sync): sync_cluster_conf()
     -> set_cluster_conf (with "activate" masked,
                          against all nodes via cluster.conf hostnames)
     -> set_cluster_conf (with "activate" unmasked, hence true as above,
                         only against the last enumerated node)
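
The sequence above can be modelled with a small toy script. This is only
a sketch under assumptions, not the actual ccs source: the class name,
the "calls" log and the node names are hypothetical, and only the order
of set_cluster_conf calls and the masking of "activate" follow the
description above.

```python
# Toy model of the described call sequence; NOT the real ccs code.
# "ToyCcs", "calls" and the node names are hypothetical illustrations.

class ToyCcs:
    def __init__(self, host, nodes, activate, sync):
        self.host = host          # target given via -h
        self.nodes = nodes        # hostnames as listed in cluster.conf
        self.activate = activate  # ~ --activate
        self.sync = sync          # ~ --sync
        self.calls = []           # (target, activated) per set_cluster_conf

    def set_cluster_conf(self, target):
        # Record where the config went and whether it was activated.
        self.calls.append((target, self.activate))

    def remove_vm(self, name):
        # The modifier pushes to the -h host with "activate" NOT masked.
        self.set_cluster_conf(self.host)

    def sync_cluster_conf(self):
        saved = self.activate
        self.activate = False              # masked: plain propagation
        for node in self.nodes:
            self.set_cluster_conf(node)
        self.activate = saved              # unmasked again
        self.set_cluster_conf(self.nodes[-1])

    def run(self, rmvm=None):
        if rmvm:
            self.remove_vm(rmvm)
        if self.sync:
            self.sync_cluster_conf()
        # Number of pushes that also activated the configuration.
        return sum(1 for _, activated in self.calls if activated)


ccs = ToyCcs("localhost", ["node1", "node2"], activate=True, sync=True)
activations = ccs.run(rmvm="iRed2")
print(activations)
```

In this model the --rmvm step activates once against localhost and the
sync step activates once more against the last node, which matches the
"double activate" symptom.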

--

Bottom-line: there is still a bug in rgmanager in not being able, in some
circumstances, to deal with 2+ subsequent configuration updates in a very
very very short time frame (likely a race condition)

Good news: buggy ccs (in a sense, working, but less efficiently than
appropriate) helped to discover this bug :)

Comment 1 Jan Pokorný [poki] 2014-11-21 15:14:20 UTC
Created attachment 959802 [details]
Proposed patch

Solution should be easy: temporarily mask the "activate" flag and
unmask it just before "sync", which is intentionally the last triggerable
modifier in the ccs invocation.

NOTE:

> This variant of the patch tries to preserve original behavior that
> standalone --activate (without --sync as suggested per help message)
> will also activate (rule of "no more than once" is respected).
> 
> If not suitable, replace "not(sync) and activate" with "False".
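
A minimal sketch of the masking idea, assuming a single modifier followed
by an optional sync. The function name, its parameters and the node names
are hypothetical; only the expression "not(sync) and activate" is taken
from the note above, and the real patch is the attachment, not this code.

```python
# Sketch of the masking approach; NOT the attached patch itself.
# "plan_pushes" and its arguments are hypothetical names.

def plan_pushes(activate, sync, nodes, host="localhost"):
    """Return (target, activated) pairs for one modifier plus optional sync."""
    requested = activate
    # Mask "activate" for the modifier whenever --sync will follow;
    # a standalone --activate still activates here.
    activate = (not sync) and requested
    pushes = [(host, activate)]            # the modifier, e.g. --rmvm

    if sync:
        activate = requested               # unmask just before sync
        # sync itself propagates to all nodes without activation ...
        pushes += [(node, False) for node in nodes]
        # ... and activates once, against the last enumerated node.
        pushes.append((nodes[-1], activate))
    return pushes


def count_activations(pushes):
    return sum(1 for _, activated in pushes if activated)


both = count_activations(plan_pushes(True, True, ["node1", "node2"]))
only_activate = count_activations(plan_pushes(True, False, ["node1", "node2"]))
only_sync = count_activations(plan_pushes(False, True, ["node1", "node2"]))
print(both, only_activate, only_sync)
```

Replacing the masked expression with False, as suggested in the note,
would make a standalone --activate (without --sync) activate nothing in
this model.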

Comment 4 Chris Feist 2015-03-03 23:49:48 UTC
Before Fix (2 propagate commands sent):
[root@ask-03 ~]# rpm -q ccs
ccs-0.16.2-75.el6.x86_64
[root@ask-03 ~]# rm -f /etc/cluster/cluster.conf 
[root@ask-03 ~]# ccs --createcluster test_cluster
[root@ask-03 ~]# ccs --addnode localhost
Node localhost added.
[root@ask-03 ~]# ccs --addvm my_vm
[root@ask-03 ~]# ccs --sync --activate --debug  --rmvm my_vm | grep propagate | wc
      2      34     678



After Fix (1 propagate command sent):
[root@ask-02 ccs]# rpm -q ccs
ccs-0.16.2-77.el6.x86_64
[root@ask-02 ccs]# rm -f /etc/cluster/cluster.conf 
[root@ask-02 ccs]# ccs --createcluster test_cluster
[root@ask-02 ccs]# ccs --addnode localhost
Node localhost added.
[root@ask-02 ccs]# ccs --addvm my_vm
[root@ask-02 ccs]# ccs --sync --activate --debug  --rmvm my_vm | grep propagate | wc
      1      17     340

Comment 8 errata-xmlrpc 2015-07-22 07:34:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-1405.html