Bug 1389028

Summary: Repair rolling upgrades from 6.8 -> 6.9
Product: Red Hat Enterprise Linux 6
Component: pacemaker
Reporter: Ken Gaillot <kgaillot>
Assignee: Ken Gaillot <kgaillot>
QA Contact: Patrik Hagara <phagara>
Status: CLOSED ERRATA
Severity: urgent
Priority: urgent
Version: 6.9
CC: abeekhof, cfeist, cluster-maint, cluster-qe, fdinitto, jkortus, jpokorny, kgaillot, mjuricek, mkrcmari, mnovacek, plambri, snagar, tlavigne, tojeline
Target Milestone: rc
Keywords: Regression, ZStream
Target Release: 6.9
Hardware: All
OS: All
Fixed In Version: pacemaker-1.1.15-3.el6
Doc Type: No Doc Update
Clone Of: 1388827
Last Closed: 2017-03-21 09:52:37 UTC
Type: Bug

Description Ken Gaillot 2016-10-26 17:03:31 UTC
+++ This bug was initially created as a clone of Bug #1388827 +++

(below trimmed for relevance)

--- Additional comment from Jan Pokorný on 2016-10-24 03:43:25 EDT ---

pacemaker_remoted 1.1.15 won't talk to pacemaker 1.1.14 [the source
of the issue is apparent in the logs on the pacemaker_remoted side]

--- Additional comment from Andrew Beekhof on 2016-10-26 05:23:04 EDT ---

Here's what we need to do:

2. Set LRMD_PROTOCOL_VERSION back to 1.0 in 7.3 builds

5. Add a "review lrmd changes that might require bumping
LRMD_PROTOCOL_VERSION" item to the upstream release checklist
6. Ensure that new lrm features are designed with handling of "old"
remotes in mind
7. Update the docs to indicate that the software version on remote
nodes must be less than or equal to the lowest version in the main cluster
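
The version rule in step 7 could be expressed roughly as below. This is only an illustrative sketch, not anything from pacemaker itself: the function names are hypothetical, and it assumes plain dotted numeric version strings such as "1.1.15" with no distribution suffix.

```python
def parse_version(text):
    """Turn a dotted version such as '1.1.15' into (1, 1, 15).

    Assumes purely numeric components; suffixes like '-3.el6' are not handled.
    """
    return tuple(int(part) for part in text.split("."))


def remote_version_ok(remote_version, cluster_versions):
    """Return True if the remote node's version is less than or equal to
    the lowest version found among the main cluster nodes."""
    lowest = min(parse_version(v) for v in cluster_versions)
    return parse_version(remote_version) <= lowest


if __name__ == "__main__":
    # Hypothetical mixed-version cluster during a rolling upgrade.
    cluster = ["1.1.14", "1.1.15"]
    print(remote_version_ok("1.1.14", cluster))
    print(remote_version_ok("1.1.15", cluster))
```

Comparing tuples of integers rather than raw strings avoids treating "1.1.9" as newer than "1.1.10".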

--- Additional comment from Ken Gaillot on 2016-10-26 12:03:53 EDT ---

Update: it has been determined that the protocol version change was not strictly necessary in any case, so Comment 1's step 2 is sufficient to fix the issue (steps 5-7 are still worthwhile for the future).

--- Additional comment from Ken Gaillot on 2016-10-26 12:42:11 EDT ---

QA: Test procedure is to set up a pacemaker cluster that includes remote nodes, using the 7.2 packages, then do a rolling upgrade to the current packages (i.e. one node at a time: pcs cluster stop, yum update, pcs cluster start). There should be no failures, unexpected stops, or fencing, regardless of the order in which nodes are upgraded (in particular, this covers both a newer remote node connecting to an older cluster node and an older remote node connecting to a newer cluster node).
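
As a rough sketch of the procedure above, the following lays out the per-node command sequence and enumerates the possible upgrade orders. It only builds and prints a plan and does not run anything. The node names and function names are hypothetical; the three commands are quoted as written in the comment, and a real run would likely need extra options (for example, non-interactive yum).

```python
from itertools import permutations

# Commands quoted from the QA comment; applied once per node, in this order.
UPGRADE_COMMANDS = ("pcs cluster stop", "yum update", "pcs cluster start")


def rolling_upgrade_plan(nodes):
    """Return ordered (node, command) steps, finishing one node before the next."""
    return [(node, cmd) for node in nodes for cmd in UPGRADE_COMMANDS]


def all_upgrade_orders(nodes):
    """Return every possible node ordering, so that both the newer-remote and
    older-remote combinations occur at some point during testing."""
    return [list(order) for order in permutations(nodes)]


if __name__ == "__main__":
    # Hypothetical node names, for illustration only.
    nodes = ["cluster1", "cluster2", "remote1"]
    for node, cmd in rolling_upgrade_plan(nodes):
        print(f"{node}: {cmd}")
    print(len(all_upgrade_orders(nodes)), "possible upgrade orders")
```

Exhaustive orderings grow factorially with node count, so for larger clusters it may be enough to test one order where a remote node is upgraded first and one where it is upgraded last.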

Comment 5 Patrik Hagara 2017-01-18 16:48:50 UTC
Cluster of 3 nodes + 2 pacemaker_remote nodes successfully passed automated rolling upgrade test from RHEL-6.8 to 6.9 using the pacemaker-1.1.15-3.el6 packages. See full log at http://pastebin.test.redhat.com/447152 -- marking verified.

Comment 7 errata-xmlrpc 2017-03-21 09:52:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2017-0629.html