Bug 1262129

Summary: PA content RHEL6->RHEL7 - "Cluster & HA" - revisit and suggest clufter, config migration utility or its output directly
Product: Red Hat Enterprise Linux 6
Reporter: Jan Pokorný [poki] <jpokorny>
Component: preupgrade-assistant-el6toel7
Assignee: pstodulk
Status: NEW ---
QA Contact: Alois Mahdal <amahdal>
Severity: medium
Docs Contact:
Priority: high
Version: 6.7
CC: borgan, briang, fdinitto, fkluknav, jpokorny, mjuricek, mspqa-list, ovasik, phracek, pstodulk, pvn, ttomecek
Target Milestone: rc
Keywords: Extras
Target Release: ---
Flags: phracek: needinfo? (jpokorny)
Hardware: All   
OS: Linux   
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1056996
: 1262271
Environment:
Bug Depends On: 1056996, 1182358    
Bug Blocks: 1262271, 1278535    

Description Jan Pokorný [poki] 2015-09-10 21:41:23 UTC
Cloning the original bug so as to have the new clufter-based migration considered.
Comments in-line:

+++ This bug was initially created as a clone of Bug #1056996 +++

> On 08/20/2014 07:10 PM, Paul Novarese wrote:
>> On 08/20/2014 05:42 AM, Fabio M. Di Nitto wrote:
>>> We don't support rolling upgrades from 6 to 7 with RHEL-HA.
>>> There is no plan to support it, in fact it cannot be supported because
>>> the corosync onwire protocol is completely different.
>> But what about a scenario where we do an in-place upgrade from 6->7 of
>> the individual nodes, without the requirement that the cluster remain
>> available during the entire process?  So, an in-place upgrade but NOT
>> rolling?
> That means you are simply taking down the whole cluster, upgrade every
> single node, reconfigure the cluster and then start it.

Sure, an in-place upgrade is the only thing to consider.

> It doesn't help anything from a downtime perspective. It actually makes
> it worse.

Sure, uptime has to be sacrificed, to the point that "it may take weeks" to
finish the migration.  But this is no worse than offering no upgrade at all.

> That is assuming that:
> - you are already running pacemaker on rhel6.

Not necessarily for the clufter use case (see below).

> - we develop a tool to convert cluster.conf to corosync.conf

^ this one has actually become reality (more inclusively: cluster.conf
  to corosync.conf, plus possibly rgmanager's configuration to pacemaker's):

There is now (since 6.7) a tool called clufter [bug 1182358] that
facilitates the upgrade of the cluster stack configuration, in a way
that still requires(!) human supervision, adjustments and (importantly)
testing of the suggested cluster deployment.

I am not sure about the optimal extent of the clufter-PA integration.
We definitely should not try to enforce any result coming from clufter,
but we could suggest one of:

1. just statically mentioning that clufter can be used for the task

2. running the sequence of management CLI commands for the new
   cluster stack, as generated by clufter into some specified
   file -- these commands shall then be run on a single cluster
   node only, and all the nodes would be required to have pcs
   installed and pcsd.service enabled and started (hence the recipe
   would be: run one set of commands on every node, run the other
   set of commands on an arbitrary single node)

3. using the contents of corosync.conf and the pacemaker configuration
   directly -- the files should either be generated separately on each
   node (provided the same input, the output should also be
   deterministically the same)
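
To make the recipe from option 2 more tangible, here is a minimal sketch of
how the two command sets could be laid out per node.  The function name
build_recipe, the node names and the file name pcs-commands.sh are all
hypothetical, and it is an assumption that the clufter-generated file can
be run as a plain shell script; only the pcs/pcsd.service prerequisites
come from the description above.

```python
def build_recipe(nodes, pcs_commands_file):
    """Map each node to the shell commands it should run (option 2 sketch)."""
    if not nodes:
        raise ValueError("at least one cluster node is required")

    # Set of commands for every node: pcs installed, pcsd.service
    # enabled and started.
    every_node = [
        "yum install -y pcs",
        "systemctl enable pcsd.service",
        "systemctl start pcsd.service",
    ]
    plan = {node: list(every_node) for node in nodes}

    # Set of commands for an arbitrary single node (here simply the first
    # one): the file generated by clufter.  Assumption: it is runnable
    # with a plain shell.
    plan[nodes[0]].append("sh " + pcs_commands_file)
    return plan


if __name__ == "__main__":
    recipe = build_recipe(["node1", "node2"], "pcs-commands.sh")
    for node, commands in recipe.items():
        print(node)
        for command in commands:
            print("  " + command)
```

In practice the per-node part has to be finished on all nodes before the
single-node part is started, which the sketch does not enforce.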

My suggestion is to stick with 2. when converting from a CMAN/rgmanager
configuration (non-empty /cluster/rm tree in cluster.conf and rgmanager
not disabled with a /cluster/rm/@disabled[?]), and with 3. otherwise, with
the proviso that only the corosync.conf output is offered.
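
The decision above could be sketched roughly as follows.  This is only an
illustration, not what PA or clufter actually does: the function name
suggest_option is hypothetical, "non-empty" is taken to mean that
/cluster/rm has at least one child element, and the set of values treated
as "disabled" is an assumption (the exact semantics of
/cluster/rm/@disabled is the open question marked with [?] above).

```python
import xml.etree.ElementTree as ET

# Assumption: values of /cluster/rm/@disabled that mean "rgmanager disabled".
DISABLED_VALUES = ("1", "yes", "true", "on")


def suggest_option(cluster_conf_xml):
    """Return 2 for a CMAN/rgmanager configuration, 3 otherwise."""
    root = ET.fromstring(cluster_conf_xml)
    # Look for /cluster/rm; anything else means there is no rgmanager part.
    rm = root.find("rm") if root.tag == "cluster" else None
    if rm is None:
        return 3
    # "Non-empty" is interpreted as: rm has at least one child element.
    has_content = len(rm) > 0
    disabled = rm.get("disabled", "0").strip().lower() in DISABLED_VALUES
    return 2 if has_content and not disabled else 3


if __name__ == "__main__":
    with open("/etc/cluster/cluster.conf") as conf:
        print("suggested option:", suggest_option(conf.read()))
```

With option 3 selected this way, only the corosync.conf output would be
offered, as there is no rgmanager configuration to carry over.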

> - _EVERY_ application in the cluster (Oracle? DB2? etc.) can upgrade too.

True, there is a lot of fragility arising from complex dependencies.