Bug 1150191

Summary: Inhibit migrations RHEL7.0 -> RHEL 6.5 (or equivalent: CentOS)
Product: [Retired] oVirt
Reporter: Francesco Romani <fromani>
Component: ovirt-engine-core
Assignee: Tomas Jelinek <tjelinek>
Status: CLOSED CURRENTRELEASE
QA Contact: Pavel Stehlik <pstehlik>
Severity: high
Docs Contact:
Priority: high
Version: 3.5
CC: ecohen, gklein, iheim, lsurette, michal.skrivanek, nicolas, rbalakri, s.kieske, yeylon
Target Milestone: ---
Target Release: 3.5.1
Hardware: Unspecified
OS: Unspecified
Whiteboard: virt
Fixed In Version: ovirt-3.5.1_rc1
Doc Type: Known Issue
Doc Text:
Migrations from RHEL 7.x to RHEL 6.x are not supported. Therefore we don't allow mixed RHEL versions inside the same cluster. For manual migration from RHEL 6.x to 7.0 (which is supported), there is a new advanced option in the Migrate dialog that allows migrating to a different cluster within the same data center. However, the suitability of the destination cluster is left up to the user, so extra caution is needed.
Story Points: ---
Clone Of:
: 1154631
Environment:
Last Closed: 2015-01-21 16:03:58 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Virt
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1147536, 1154631, 1164308, 1164311

Description Francesco Romani 2014-10-07 15:49:05 UTC
Description of problem:
Backward migration from RHEL/CentOS 7.0 to RHEL/CentOS 6.5 is not supported
(https://bugzilla.redhat.com/show_bug.cgi?id=1150163), so the engine should prohibit it.

Either migration should not be allowed on mixed clusters, or RHEL 7.0 should not be able to join RHEL 6.5 clusters.

Version-Release number of selected component (if applicable):
ovirt-engine-3.5.0-0.0.master.20140923231850.git42065cc.fc19.noarch


How reproducible:
100%

Steps to Reproduce:
1. have a cluster with RHEL 6.5 hosts
2. add one RHEL 7.0 host to cluster
3. start a VM on the RHEL 7.0 host
4. migrate the VM to any RHEL 6.5 host

Actual results:
A RHEL 7.0 host can join a RHEL 6.5 cluster, and a VM can be migrated from a 7.0 host to a 6.5 host. After that, random misbehaviour may occur.

Expected results:
Either migration should not be allowed on mixed clusters, or RHEL 7.0 should not be able to join RHEL 6.5 clusters.

Comment 1 Francesco Romani 2014-10-08 07:15:15 UTC
Just to share a thought, I believe Engine should never allow to migrate to a QEMU with lower version, except maybe for micro version.

Comment 2 Sven Kieske 2014-10-08 11:11:07 UTC
(In reply to Francesco Romani from comment #1)
> Just to share a thought, I believe Engine should never allow to migrate to a
> QEMU with lower version, except maybe for micro version.

Then you should really check the QEMU version and not the RHEL version.
In general: features should be tested, not version numbers ;)
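
To make the two positions above concrete, here is a minimal sketch in Python (not engine code). It assumes plain dotted version strings and uses hypothetical helper names; the version numbers and machine type names in the usage note are illustrative values only, not data taken from this bug.

```python
from typing import Iterable, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse a plain dotted version such as '1.5.3' into (major, minor, micro).

    Assumption: the string contains only dot-separated integers.
    Missing components default to 0; extra components are ignored.
    """
    parts = [int(p) for p in text.split(".")[:3]]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def version_allows_migration(src_qemu: str, dst_qemu: str) -> bool:
    """Version-based rule from comment 1 (hypothetical helper).

    The destination must not have a lower major.minor than the source;
    a difference in the micro component alone is tolerated.
    """
    return parse_version(dst_qemu)[:2] >= parse_version(src_qemu)[:2]


def features_allow_migration(vm_machine_type: str,
                             dst_machine_types: Iterable[str]) -> bool:
    """Feature-based rule from comment 2 (hypothetical helper).

    Instead of comparing version numbers, check that the destination
    actually offers the machine type the VM is running with.
    """
    return vm_machine_type in set(dst_machine_types)
```

With these helpers, `version_allows_migration("1.5.3", "0.12.1")` rejects the backward move while `version_allows_migration("1.5.3", "1.5.1")` accepts a micro-only downgrade. The feature check ignores versions entirely and only asks whether the destination advertises the VM's machine type.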

Comment 3 Michal Skrivanek 2014-10-20 11:25:34 UTC
Proposed implementation changes for separating RHEL 6 and RHEL 7 clusters, plus a special cross-cluster migration:

1) Scheduler changes - even when a specific host is selected we still go through the scheduler (IMHO we should not, even in the common case). We can make a small change that tricks the scheduler into running against the target cluster instead of the one where the VM is, but that may uncover hidden bugs in places where we pull information from the VM instead of from the cluster passed in. Still, probably not that difficult.
Another option is to bypass the scheduler as a special case. That's a fairly contained change and I would favor it.

2) When a host comes up, we check whether the newly joining host has the same set of machine types. If it differs, we move it to non-operational with a message. This should be a simple enough change that does not require any DB fields. The code was recently cleaned up/reviewed by Roy for similar processing of PPC vs x86 archs, so he should still have it in mind, and it is a relatively low-risk change.
The behavior would be:
- if it's the first host, set the supported emulated machines (current behavior)
- if it's the 2nd or later host, check whether it runs the same RHEL version as the hosts already in the cluster (as already reported in VDS caps). We can limit this to the specific EL 6 and 7 case
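
A minimal sketch of the host-join behavior described in item 2, written in Python for illustration only. The `Host` and `Cluster` classes, the `on_host_up` function, and the status strings are hypothetical stand-ins, not the engine's actual types; the EL major version is assumed to be available from the host's reported capabilities.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Optional, Tuple


@dataclass
class Host:
    name: str
    el_major: int                    # assumed to come from the host's reported caps
    machine_types: FrozenSet[str]    # emulated machine types the host offers


@dataclass
class Cluster:
    hosts: List[Host] = field(default_factory=list)
    emulated_machines: Optional[FrozenSet[str]] = None


def on_host_up(cluster: Cluster, host: Host) -> Tuple[str, str]:
    """Decide the status of a host that is coming up in a cluster.

    Returns a (status, message) pair. The status names are illustrative.
    """
    if not cluster.hosts:
        # First host: it defines the cluster's supported emulated machines.
        cluster.emulated_machines = host.machine_types
        cluster.hosts.append(host)
        return "Up", "first host, emulated machines set"

    # 2nd or later host: limited to the specific EL 6 vs EL 7 case.
    if host.el_major in (6, 7):
        for other in cluster.hosts:
            if other.el_major in (6, 7) and other.el_major != host.el_major:
                return ("NonOperational",
                        f"{host.name} is EL{host.el_major} but cluster "
                        f"already contains EL{other.el_major} hosts")

    cluster.hosts.append(host)
    return "Up", "host matches the cluster"
```

In this sketch a rejected host is not added to the cluster's host list, so a later host with a matching EL version can still join normally.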

3) In the UI, in the manual migration dialog, we would add an "Advanced option" (like the extra NFS options) to specify the target cluster. We would simply show all clusters and leave it to the user to select the right one. Simple.

The upgrade workflow:
- create a second cluster for migration
- move hosts to maintenance, install them as RHEL 7 hosts into the second cluster, and bring them up while manually special-migrating the VMs
- you end up with an empty old cluster
The downside is that the cluster settings need to be the same, but there aren't that many things to care about.

Comment 4 Sandro Bonazzola 2015-01-15 14:15:36 UTC
This is an automated message: 
This bug should be fixed in oVirt 3.5.1 RC1, moving to QA

Comment 5 Sandro Bonazzola 2015-01-21 16:03:58 UTC
oVirt 3.5.1 has been released. If problems still persist, please make note of it in this bug report.

Comment 6 Nicolas Ecarnot 2015-04-30 14:27:59 UTC
Just tried the following on a 3.5.1:

- first cluster is made of CentOS 6.6 hosts
- created a second cluster with one 7.0 host
- start a VM in first 6.6 cluster
- migrate it to 7.0 cluster -> OK
- migrate it back to 6.6 cluster -> failing (losing ping, console, network, nothing helpful)

- I have to force-kill the VM; after that it starts with no issue, whichever cluster it is in.

Conclusion: backward migration still seems broken.
My opinion is that the Red Hat team should not spend too much time on this, and maybe just block this backward migration.

Comment 7 Michal Skrivanek 2015-04-30 14:30:46 UTC
You had to use the explicit cluster override in the migration dialog, right?
If so, then indeed it's not going to work backwards. We deliberately do not check or warn about specific issues there; the option is left without any checks. That was the intent, and that's why the option is a bit hidden and we explicitly warn about it in the docs.

Comment 8 Nicolas Ecarnot 2015-04-30 14:45:13 UTC
Indeed, I used the hidden option, explicitly specifying the destination cluster.
And doing so, upward migration is working; downward migration is failing.

And as I wrote, this is not a big deal. (I'm already really happy to be able to plan a smooth host upgrade path with the two-cluster method - the oVirt team once again made my day - really)