1166589 – ccs should trigger config activation/propagation across the nodes no more than once


Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 6
Classification:	Red Hat
Component:	ricci
Sub Component:
Version:	6.5
Hardware:	Unspecified
OS:	Unspecified
Priority:	low
Severity:	medium
Target Milestone:	rc
Target Release:	---
Assignee:	Chris Feist
QA Contact:	cluster-qe@redhat.com
Docs Contact:
URL:
Whiteboard:
Depends On:	1157951
Blocks:

Reported:	2014-11-21 10:05 UTC by Jan Pokorný [poki]
Modified:	2015-07-22 07:34 UTC (History)
CC List:	7 users
Fixed In Version:	ccs-0.16.2-75.el6
Doc Type:	Bug Fix
Doc Text:	Cause: ccs did not have logic to prevent multiple syncs/activations in one ccs command. Consequence: It was possible to issue a command using multiple options that would cause multiple syncs and activations. Fix: Only allow one sync/activation per command. Result: ccs no longer issues multiple sync/activation commands.
Clone Of:	1157951
Environment:
Last Closed:	2015-07-22 07:34:07 UTC
Target Upstream Version:
Embargoed:

Attachments
Proposed patch (4.02 KB, patch) 2014-11-21 15:14 UTC, Jan Pokorný [poki]	no flags

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2015:1405	0	normal	SHIPPED_LIVE	ricci bug fix and enhancement update	2015-07-20 18:07:08 UTC

Description Jan Pokorný [poki] 2014-11-21 10:05:50 UTC

Fixing this on the ccs side is apparently not enough for flawless
cluster stack operation, but it will at least lower the probability of
hitting issues where a configuration reload is triggered anew within
the cluster stack components before the previous reload has completely
finished (cf. the likely race condition in rgmanager in the original
[bug 1157951]).


+++ This bug was initially created as a clone of Bug #1157951 +++

--- Additional comment from Jan Pokorný on 2014-11-20 23:52:52 CET ---

[...]

0. assumption:
   you originally used the same (or equivalent) command as later on, i.e.:

>  ccs -h localhost --activate --sync --password "secret" --rmvm iRed2

--


1. "Updating cluster.conf" followed by symptoms of cluster.conf indeed
   being propagated, twice in a row within a short time on nr-c03n01,
   seemed unnatural and suspicious

->

2. indeed, there is a bug in ccs causing the following sequence:

   - if (removevm): remove_vm(name)
     -> set_cluster_conf (while "activate" holds ~ --activate,
                          only against localhost)

     <spoiler-alert>
         "activate" should be temporarily masked if "sync" is set
         to prevent "double activate", just as the method below does
     </spoiler-alert>

   - if (sync): sync_cluster_conf()
     -> set_cluster_conf (with "activate" masked,
                          against all nodes via cluster.conf hostnames)
     -> set_cluster_conf (with "activate" unmasked, hence true as above,
                         only against the last enumerated node)
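
The sequence above can be modeled with a small Python sketch. This is a
toy model under stated assumptions, not the actual ccs source:
`buggy_sequence` and `count_activations` are hypothetical names, and each
recorded tuple stands for one set_cluster_conf call as (target, activate).

```python
def buggy_sequence(nodes, removevm=True, sync=True, activate=True):
    """Record the set_cluster_conf calls described in the comment above.

    Hypothetical model: each entry is (target, activate_flag).
    """
    calls = []
    if removevm:
        # remove_vm(): config is pushed to localhost while "activate"
        # still holds the value given by --activate.
        calls.append(("localhost", activate))
    if sync:
        # sync_cluster_conf(): first push to every node with "activate" masked ...
        for node in nodes:
            calls.append((node, False))
        # ... then once more to the last enumerated node, "activate" unmasked.
        calls.append((nodes[-1], activate))
    return calls


def count_activations(calls):
    """Count how many recorded calls carried a true "activate" flag."""
    return sum(1 for _target, act in calls if act)


if __name__ == "__main__":
    calls = buggy_sequence(["node1", "node2"])
    print(count_activations(calls))  # 2: one from remove_vm, one from sync
```

With both --sync and --activate given, the model yields two activating
calls, which matches the "double activate" described above.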

--

Bottom line: there is still a bug in rgmanager: in some circumstances it
cannot deal with 2+ subsequent configuration updates within a very short
time frame (likely a race condition)

Good news: the buggy ccs (working, in a sense, but less efficiently than
appropriate) helped to discover this bug :)

Comment 1 Jan Pokorný [poki] 2014-11-21 15:14:20 UTC

Created attachment 959802 [details]
Proposed patch

The solution should be easy: temporarily mask the "activate" flag and
unmask it just before "sync", which is intentionally the last triggerable
modifier in the ccs invocation.

NOTE:

> This variant of the patch tries to preserve original behavior that
> standalone --activate (without --sync as suggested per help message)
> will also activate (rule of "no more than once" is respected).
> 
> If not suitable, replace "not(sync) and activate" with "False".
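
A minimal Python sketch of the masking idea follows. It is an illustration
under assumptions, not the attached patch or the upstream commit:
`patched_sequence` and `count_activations` are hypothetical names, and each
recorded tuple stands for one set_cluster_conf call as (target, activate).
The expression `(not sync) and activate` is the one quoted in the note.

```python
def patched_sequence(nodes, removevm=True, sync=True, activate=True):
    """Record set_cluster_conf calls with "activate" masked until sync.

    Hypothetical model: each entry is (target, activate_flag).
    """
    calls = []
    # While modifiers run, activate only when no sync will follow;
    # this keeps a standalone --activate working.
    masked_activate = (not sync) and activate
    if removevm:
        calls.append(("localhost", masked_activate))
    if sync:
        # Push to every node without activating ...
        for node in nodes:
            calls.append((node, False))
        # ... and activate exactly once, with the flag unmasked,
        # in the final call of the sync step.
        calls.append((nodes[-1], activate))
    return calls


def count_activations(calls):
    """Count how many recorded calls carried a true "activate" flag."""
    return sum(1 for _target, act in calls if act)


if __name__ == "__main__":
    nodes = ["node1", "node2"]
    print(count_activations(patched_sequence(nodes)))              # 1
    print(count_activations(patched_sequence(nodes, sync=False)))  # 1
```

In this model, at most one call activates whether or not --sync is given.
Replacing `(not sync) and activate` with `False`, as the note suggests,
would make a standalone --activate trigger no activation at all.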

Comment 2 Chris Feist 2015-03-03 21:42:41 UTC

Fixed upstream: 
https://github.com/feist/ccs/commit/4a296076308b2a8ea9399a9f2579c34ffb74a00a

Comment 4 Chris Feist 2015-03-03 23:49:48 UTC

Before Fix (2 propagate commands sent):
[root@ask-03 ~]# rpm -q ccs
ccs-0.16.2-75.el6.x86_64
[root@ask-03 ~]# rm -f /etc/cluster/cluster.conf 
[root@ask-03 ~]# ccs --createcluster test_cluster
[root@ask-03 ~]# ccs --addnode localhost
Node localhost added.
[root@ask-03 ~]# ccs --addvm my_vm
[root@ask-03 ~]# ccs --sync --activate --debug  --rmvm my_vm | grep propagate | wc
      2      34     678



After Fix (1 propagate command sent):
[root@ask-02 ccs]# rpm -q ccs
ccs-0.16.2-77.el6.x86_64
[root@ask-02 ccs]# rm -f /etc/cluster/cluster.conf 
[root@ask-02 ccs]# ccs --createcluster test_cluster
[root@ask-02 ccs]# ccs --addnode localhost
Node localhost added.
[root@ask-02 ccs]# ccs --addvm my_vm
[root@ask-02 ccs]# ccs --sync --activate --debug  --rmvm my_vm | grep propagate | wc
      1      17     340

Comment 8 errata-xmlrpc 2015-07-22 07:34:07 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-1405.html
