Bug 1212435

Summary: [RFE] Add semi-automatic rolling-update helper to pcs
Product: Red Hat Enterprise Linux 7
Component: pcs
Version: 7.2
Status: CLOSED WONTFIX
Severity: unspecified
Priority: unspecified
Reporter: Jan Pokorný [poki] <jpokorny>
Assignee: Tomas Jelinek <tojeline>
QA Contact: cluster-qe <cluster-qe>
CC: cfeist, cluster-maint, kgaillot, plambri, tojeline
Keywords: FutureFeature
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Doc Type: Enhancement
Type: Bug
Last Closed: 2020-12-15 07:34:05 UTC
Bug Blocks: 1388827

Description Jan Pokorný [poki] 2015-04-16 12:02:58 UTC
It would be nice if pcs could assist with a process of node-by-node
rolling update of (cluster or whole environment) software.


One possible analogy to the proposed feature is "git rebase
--interactive" with the "edit" keyword used for each commit.
Git then:

1. some preparation work
2. process each queued commit in turn:
   2a. pick the current queued commit and apply it
   2b. USER HAS FREE HANDS TO MODIFY THE COMMIT NOW
       (here in terms of git commit --amend, for instance)
   2c. wait for "git rebase --continue" so as to move on to the next
       commit in the queue and apply it in 2a. (finish if there
       is no commit left in the queue)
   2d. or wait for "git rebase --abort", which rolls the whole
       rebase operation back
   2e. or wait for "git rebase --skip", which, IIUIC, undoes 2a.
       and continues with the next commit at 2a.
3. possibly some "transaction finished" finalization


In a similar vein, pcs could assist with an iteration over all the nodes
in the cluster, mainly for the purpose of a per-node rolling update
(a rough sketch of such a loop follows the outline below):

1. some preparation work
   - figure out the nodes and whether the cluster is eligible for
     the operation at all (all nodes online and healthy)
   - internally note the operation-in-progress
2. process each node in the queue in turn:
   2a. pick the node and put it in standby mode
   2b. USER HAS FREE HANDS TO DO WHATEVER IS NEEDED ON THAT NODE
       (software updates, other maintenance)
   2c. wait for something like "pcs maint --continue" so as to move on
       to the next node in the queue; the current node is first
       contacted and brought back from standby mode, then the next
       node is picked again in 2a. (finish if there is no node left
       in the queue)
   2d. or wait for something like "pcs maint --abort" that just
       tries contacting the current node and bringing it back from
       standby mode
   2e. or wait for something like "pcs maint --skip" that performs
       2d. but continues with the next node in the queue
3. "transaction finished" finalization
   - remove internal tracking of the operation-in-progress
   - possibly check if cluster is fully healthy (plus the software
     versions match, etc.)
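
A very rough sketch of what step 2 of such a loop might look like, wrapping
the existing "pcs cluster standby"/"pcs cluster unstandby" commands and using
an interactive prompt in place of the proposed (non-existent) "pcs maint
--continue/--abort/--skip" subcommands; the step-1 eligibility checks and
step-3 finalization are left out:

import subprocess
import sys

def pcs(*args):
    """Run a pcs subcommand, raising on failure."""
    subprocess.check_call(["pcs"] + list(args))

def rolling_maintenance(nodes):
    for node in nodes:
        pcs("cluster", "standby", node)          # 2a: take the node out of service
        print("Node %s is in standby; perform the updates now." % node)  # 2b
        answer = input("[c]ontinue / [s]kip / [a]bort? ").strip().lower()
        pcs("cluster", "unstandby", node)        # bring the node back (2c/2d/2e)
        if answer.startswith("a"):               # 2d: stop the whole operation
            print("Aborted; remaining nodes left untouched.")
            return 1
        # "continue" and "skip" both move on to the next node (2c/2e)
    print("All nodes processed.")                # step 3 would go here
    return 0

if __name__ == "__main__":
    sys.exit(rolling_maintenance(sys.argv[1:]))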


This could make such a complicated administrative step as a rolling update
across the cluster a breeze.  Thanks for considering.

This feature should perhaps not be exposed in the GUI, as it expects
some level of expertise and is nothing for a novice user.


References:
http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/_rolling_node_by_node.html

Comment 5 Jan Pokorný [poki] 2016-10-24 07:43:25 UTC
Considering the not-so-straightforward procedures in some cases (e.g.
pacemaker_remoted 1.1.15 won't talk to pacemaker 1.1.14 [the source
of the issue is apparent in the logs on the pacemaker_remoted side]),
it would be really useful to have this feature implemented, with
rather flexible ways to express the logic for particular scenarios.

It would be good to have constructs like these to express the steps
(a rough sketch of possible wrappers follows the list):

- get set of cluster nodes/remote nodes

- perform an action on all/all but one node from the node set

- make resources avoid particular nodes + cancel that
  (maintenance mode or ban/unban?)

- open/close upgrade window in which user is supposed to
  perform the upgrade
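
A rough, non-authoritative sketch of how such constructs might be wrapped
around existing pcs subcommands; the exact subcommands and the "pcs status
nodes" output format depend on the pcs version, so the parsing below is an
assumption rather than a stable interface:

import subprocess

def pcs_output(*args):
    return subprocess.check_output(["pcs"] + list(args),
                                   universal_newlines=True)

def cluster_nodes():
    """'Get set of cluster nodes': parse the Online/Standby/Offline lines
    of 'pcs status nodes' (output format assumed here)."""
    nodes = set()
    for line in pcs_output("status", "nodes").splitlines():
        label, _, names = line.partition(":")
        if label.strip() in ("Online", "Standby", "Offline"):
            nodes.update(names.split())
    return nodes

def avoid_node(resource, node):
    """'Make resources avoid particular nodes' via a ban constraint."""
    subprocess.check_call(["pcs", "resource", "ban", resource, node])

def allow_node(resource, node):
    """Cancel the avoidance again (remove the ban constraint)."""
    subprocess.check_call(["pcs", "resource", "clear", resource, node])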

Comment 7 Ken Gaillot 2016-10-24 14:46:33 UTC
One twist is that rolling upgrades require attention to certain internal protocol versioning, which currently includes the crm feature set and the lrmd protocol version.

The crm feature set is easily obtainable from <cib crm_feature_set="..."> in the CIB, but there is no way currently for an external program to check the lrmd protocol version. Probably some pacemaker CLI tool should be able to provide all versions (pacemaker, crm feature set, lrmd protocol) on request. Note that each of these is per-node, not per-cluster.
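
For illustration, a minimal sketch of reading the feature set from the CIB
as described above, assuming cibadmin is available and the caller is allowed
to query the CIB:

import subprocess
import xml.etree.ElementTree as ET

def local_crm_feature_set():
    """Return the crm_feature_set attribute of the local <cib> element."""
    cib_xml = subprocess.check_output(["cibadmin", "--query", "--local"],
                                      universal_newlines=True)
    return ET.fromstring(cib_xml).get("crm_feature_set")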

The pacemaker version itself is not of any concern for rolling upgrades.

The crm feature set applies to the full cluster nodes in a cluster. The crm feature set is currently a triplet (e.g. "3.0.11"), although in the distant past (through pacemaker 1.0.1 in 2008) it was just a major-minor (e.g. "2.1"). If the major version (the first number) changes, a rolling upgrade is not possible. If the minor version (the second number) changes, a rolling upgrade is possible, but any node that leaves the cluster may be unable to return unless it is upgraded, so it is important that the upgrade be completed in a reasonable window. If the minor-minor changes, currently it is treated the same as the minor, but there are plans to change that so it is irrelevant to rolling upgrades (used only to provide information to resource agents). If the crm feature set does not change, rolling upgrades are possible with no limitation.
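
The crm feature set rules above could be condensed into something like the
following sketch (the function name and return labels are made up purely for
illustration):

def rolling_upgrade_feasibility(old_set, new_set):
    old = (old_set.split(".") + ["0", "0"])[:3]
    new = (new_set.split(".") + ["0", "0"])[:3]
    if old[0] != new[0]:
        return "impossible"            # major change: no rolling upgrade
    if old[1] != new[1]:
        return "possible-with-caveat"  # minor change: nodes that leave may not
                                       # rejoin until upgraded, finish in a window
    if old[2] != new[2]:
        return "possible-with-caveat"  # minor-minor: currently treated like minor
    return "unrestricted"              # same feature set: no limitation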

The lrmd protocol version applies to the connection between remote/guest nodes and cluster nodes. It is currently a major-minor (e.g. "1.1"). There are no explicit semantics for major-version changes, but presumably they should be interpreted as making rolling upgrades impossible. If the minor version is different between a remote node and the cluster node hosting its connection, pacemaker through 1.1.14 (lrmd protocol version 1.0) will not allow the connection to proceed; pacemaker 1.1.15 and later (lrmd protocol version >= 1.1) will allow the connection to proceed only if the cluster node's version is newer. If the lrmd protocol version does not change, rolling upgrades are possible with no limitation.
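
Similarly, a sketch of the lrmd protocol rule (which side actually enforces
the check is glossed over here, and the helper name is made up):

def lrmd_connection_allowed(cluster_version, remote_version):
    cluster = tuple(int(x) for x in cluster_version.split("."))
    remote = tuple(int(x) for x in remote_version.split("."))
    if cluster == remote:
        return True       # same protocol version: no limitation
    if cluster[0] != remote[0]:
        return False      # major change: presumably no rolling upgrade
    if cluster >= (1, 1):
        return cluster > remote   # 1.1+: only if the cluster node is newer
    return False          # 1.0 (pacemaker <= 1.1.14): any mismatch is refused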

It's complicated, but predictable. Let me know if anything is unclear.

Comment 8 Tomas Jelinek 2016-11-01 08:33:33 UTC
Upgrading process description in Pacemaker Explained documentation:
http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html/Pacemaker_Explained/_upgrading.html

Comment 11 RHEL Program Management 2020-12-15 07:34:05 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.