Bug 1212435

Summary: [RFE] Add semi-automatic rolling-update helper to pcs
Product: Red Hat Enterprise Linux 7
Component: pcs
Version: 7.2
Status: CLOSED WONTFIX
Severity: unspecified
Priority: unspecified
Reporter: Jan Pokorný [poki] <jpokorny>
Assignee: Tomas Jelinek <tojeline>
QA Contact: cluster-qe <cluster-qe>
CC: cfeist, cluster-maint, kgaillot, plambri, tojeline
Keywords: FutureFeature
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Doc Type: Enhancement
Type: Bug
Last Closed: 2020-12-15 07:34:05 UTC
Bug Blocks: 1388827

Description Jan Pokorný [poki] 2015-04-16 12:02:58 UTC
It would be nice if pcs could assist with a process of node-by-node
rolling update of (cluster or whole environment) software.


One possible analogy to the proposed feature is "git rebase
--interactive" with the "edit" keyword used for each commit.
Git then:

1. some preparation work
2. process each queued commit in turn:
   2a. pick the current queued commit and apply it
   2b. USER HAS FREE HANDS TO MODIFY THE COMMIT NOW
       (here in terms of git commit --amend, for instance)
   2c. wait for "git rebase --continue" so as to move on to the next
       commit in the queue and apply it in 2a. (finish if there
       is no commit left in the queue)
   2d. or wait for "git rebase --abort", which rolls the whole
       rebase operation back
   2e. or wait for "git rebase --skip", which, IIUIC, undoes 2a.
       and continues with the next commit at 2a.
3. possibly some "transaction finished" finalization


In a similar vein, pcs could assist with an iteration over all the nodes
in the cluster, mainly for the purpose of a per-node rolling update
(a rough sketch of such a loop follows the outline below):

1. some preparation work
   - figure out the nodes and whether the cluster is eligible for
     the operation at all (all nodes online and healthy)
   - internally note the operation-in-progress
2. process each node in the queue in turn:
   2a. pick the node and put it in standby mode
   2b. USER HAS FREE HANDS TO DO WHATEVER IS NEEDED ON THAT NODE
       (software updates, other maintenance)
   2c. wait for something like "pcs maint --continue" so as to move on
       to the next node in the queue; the current node is first
       contacted and brought back from standby mode, then the next
       node is picked again in 2a. (finish if there is no node left
       in the queue)
   2d. or wait for something like "pcs maint --abort" that just
       tries contacting the current node and bringing it back from
       standby mode
   2e. or wait for something like "pcs maint --skip" that performs
       2d. but continues with the next node in the queue
3. "transaction finished" finalization
   - remove internal tracking of the operation-in-progress
   - possibly check if cluster is fully healthy (plus the software
     versions match, etc.)
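
A very rough sketch of what step 2 of such a loop might look like, wrapping
the existing "pcs cluster standby"/"pcs cluster unstandby" commands and using
an interactive prompt in place of the proposed (non-existent) "pcs maint
--continue/--abort/--skip" subcommands; the step-1 eligibility checks and
step-3 finalization are left out:

import subprocess
import sys

def pcs(*args):
    """Run a pcs subcommand, raising on failure."""
    subprocess.check_call(["pcs"] + list(args))

def rolling_maintenance(nodes):
    for node in nodes:
        pcs("cluster", "standby", node)          # 2a: take the node out of service
        print("Node %s is in standby; perform the updates now." % node)  # 2b
        answer = input("[c]ontinue / [s]kip / [a]bort? ").strip().lower()
        pcs("cluster", "unstandby", node)        # bring the node back (2c/2d/2e)
        if answer.startswith("a"):               # 2d: stop the whole operation
            print("Aborted; remaining nodes left untouched.")
            return 1
        # "continue" and "skip" both move on to the next node (2c/2e)
    print("All nodes processed.")                # step 3 would go here
    return 0

if __name__ == "__main__":
    sys.exit(rolling_maintenance(sys.argv[1:]))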


This could make such a complicated administrative step as a rolling update
across the cluster a breeze.  Thanks for considering.

This feature should perhaps not be exposed in the GUI, as it expects
some level of expertise and is nothing for a novice user.


References:
http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/_rolling_node_by_node.html

Comment 5 Jan Pokorný [poki] 2016-10-24 07:43:25 UTC
Considering the not-so-straightforward procedures in some cases (e.g.
pacemaker_remoted 1.1.15 won't talk to pacemaker 1.1.14 [the source
of the issue is apparent in the logs on the pacemaker_remoted side]),
it would be really useful to have this feature implemented, with
rather flexible ways to express the logic for particular scenarios.

It would be good to have constructs like these to express the steps
(a rough sketch of possible wrappers follows the list):

- get set of cluster nodes/remote nodes

- perform an action on all/all but one node from the node set

- make resources avoid particular nodes + cancel that
  (maintenance mode or ban/unban?)

- open/close upgrade window in which user is supposed to
  perform the upgrade
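
A rough, non-authoritative sketch of how such constructs might be wrapped
around existing pcs subcommands; the exact subcommands and the "pcs status
nodes" output format depend on the pcs version, so the parsing below is an
assumption rather than a stable interface:

import subprocess

def pcs_output(*args):
    return subprocess.check_output(["pcs"] + list(args),
                                   universal_newlines=True)

def cluster_nodes():
    """'Get set of cluster nodes': parse the Online/Standby/Offline lines
    of 'pcs status nodes' (output format assumed here)."""
    nodes = set()
    for line in pcs_output("status", "nodes").splitlines():
        label, _, names = line.partition(":")
        if label.strip() in ("Online", "Standby", "Offline"):
            nodes.update(names.split())
    return nodes

def avoid_node(resource, node):
    """'Make resources avoid particular nodes' via a ban constraint."""
    subprocess.check_call(["pcs", "resource", "ban", resource, node])

def allow_node(resource, node):
    """Cancel the avoidance again (remove the ban constraint)."""
    subprocess.check_call(["pcs", "resource", "clear", resource, node])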

Comment 7 Ken Gaillot 2016-10-24 14:46:33 UTC
One twist is that rolling upgrades require attention to certain internal protocol versioning, which currently includes the crm feature set and the lrmd protocol version.

The crm feature set is easily obtainable from <cib crm_feature_set="..."> in the CIB, but there is no way currently for an external program to check the lrmd protocol version. Probably some pacemaker CLI tool should be able to provide all versions (pacemaker, crm feature set, lrmd protocol) on request. Note that each of these is per-node, not per-cluster.
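
For illustration, a minimal sketch of reading the feature set from the CIB
as described above, assuming cibadmin is available and the caller is allowed
to query the CIB:

import subprocess
import xml.etree.ElementTree as ET

def local_crm_feature_set():
    """Return the crm_feature_set attribute of the local <cib> element."""
    cib_xml = subprocess.check_output(["cibadmin", "--query", "--local"],
                                      universal_newlines=True)
    return ET.fromstring(cib_xml).get("crm_feature_set")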

The pacemaker version itself is not of any concern for rolling upgrades.

The crm feature set applies to the full cluster nodes in a cluster. The crm feature set is currently a triplet (e.g. "3.0.11"), although in the distant past (through pacemaker 1.0.1 in 2008) it was just a major-minor (e.g. "2.1"). If the major version (the first number) changes, a rolling upgrade is not possible. If the minor version (the second number) changes, a rolling upgrade is possible, but any node that leaves the cluster may be unable to return unless it is upgraded, so it is important that the upgrade be completed in a reasonable window. If the minor-minor changes, currently it is treated the same as the minor, but there are plans to change that so it is irrelevant to rolling upgrades (used only to provide information to resource agents). If the crm feature set does not change, rolling upgrades are possible with no limitation.
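
The crm feature set rules above could be condensed into something like the
following sketch (the function name and return labels are made up purely for
illustration):

def rolling_upgrade_feasibility(old_set, new_set):
    old = (old_set.split(".") + ["0", "0"])[:3]
    new = (new_set.split(".") + ["0", "0"])[:3]
    if old[0] != new[0]:
        return "impossible"            # major change: no rolling upgrade
    if old[1] != new[1]:
        return "possible-with-caveat"  # minor change: nodes that leave may not
                                       # rejoin until upgraded, finish in a window
    if old[2] != new[2]:
        return "possible-with-caveat"  # minor-minor: currently treated like minor
    return "unrestricted"              # same feature set: no limitation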

The lrmd protocol version applies to the connection between remote/guest nodes and cluster nodes. It is currently a major-minor (e.g. "1.1"). There are no explicit semantics for major-version changes, but presumably they should be interpreted as making rolling upgrades impossible. If the minor version is different between a remote node and the cluster node hosting its connection, pacemaker through 1.1.14 (lrmd protocol version 1.0) will not allow the connection to proceed; pacemaker 1.1.15 and later (lrmd protocol version >= 1.1) will allow the connection to proceed only if the cluster node's version is newer. If the lrmd protocol version does not change, rolling upgrades are possible with no limitation.
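
Similarly, a sketch of the lrmd protocol rule (which side actually enforces
the check is glossed over here, and the helper name is made up):

def lrmd_connection_allowed(cluster_version, remote_version):
    cluster = tuple(int(x) for x in cluster_version.split("."))
    remote = tuple(int(x) for x in remote_version.split("."))
    if cluster == remote:
        return True       # same protocol version: no limitation
    if cluster[0] != remote[0]:
        return False      # major change: presumably no rolling upgrade
    if cluster >= (1, 1):
        return cluster > remote   # 1.1+: only if the cluster node is newer
    return False          # 1.0 (pacemaker <= 1.1.14): any mismatch is refused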

It's complicated, but predictable. Let me know if anything is unclear.

Comment 8 Tomas Jelinek 2016-11-01 08:33:33 UTC
Upgrading process description in Pacemaker Explained documentation:
http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html/Pacemaker_Explained/_upgrading.html

Comment 11 RHEL Program Management 2020-12-15 07:34:05 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.