Bug 1341023

Summary: Cluster level can be changed while there are running VMs
Product: Red Hat Enterprise Virtualization Manager
Component: ovirt-engine
Version: 3.6.5
Hardware: x86_64
OS: Unspecified
Status: CLOSED ERRATA
Severity: high
Priority: high
Reporter: rhev-integ
Assignee: Marek Libra <mlibra>
QA Contact: Shira Maximov <mshira>
CC: adevolder, agkesos, amarchuk, aperotti, bgraveno, c.handel, didi, emahoney, germano, gveitmic, inetkach, lsurette, mavital, mgoldboi, michal.skrivanek, mkalinin, pstehlik, rbalakri, redhat, rgolan, Rhev-m-bugs, sites-redhat, srevivo, tjelinek, ycui, ykaul
Keywords: Reopened, ZStream
Target Milestone: ovirt-3.6.7
Target Release: 3.6.7
oVirt Team: Virt
Doc Type: Bug Fix
Doc Text:
Previously the cluster level could be changed while virtual machines were running, which caused non-deterministic issues. This update ensures that no virtual machines are running when changing the cluster level.
Clone Of: 1336527
Bug Depends On: 1336527, 1340345
Bug Blocks:
Last Closed: 2016-06-29 16:20:24 UTC

Comment 2 Shira Maximov 2016-06-07 07:33:44 UTC
Verified on:
Red Hat Enterprise Virtualization Manager Version: 3.6.7.2-0.1.el6

Verification steps:
1. Create a DC with 3.5 compatibility version
2. Add a host with 3.5 compatibility version
3. Create and run a VM
4. Try to edit the cluster's compatibility version from 3.5 to 3.6

Result:
The change is blocked with the message 'shut down all vm before changing the cluster level'.

Comment 3 Carl Thompson 2016-06-19 14:10:21 UTC
Does this fix break the Hosted Engine case? I have a 3.6 cluster I've upgraded to 4.0 RC2 and I can't figure out how to change the Compatibility Version of the cluster to 4.0 now. When I attempt to change it by editing the cluster properties I get the message "Shut down all VMs before changing the cluster version." The only VM running is the Hosted Engine.

I tried simply creating a new cluster at version 4.0 but there doesn't seem to be any way of moving the Hosted Engine to the new cluster nor does there seem to be a way to migrate my 100+ VMs to the new cluster without having to manually edit the properties of each one. Is there a way to do this?

I'm new to oVirt so perhaps I'm missing something...

Thank you,
Carl Thompson

Comment 4 Michal Skrivanek 2016-06-20 13:39:40 UTC
Didi, can you update on the right way to execute the upgrade patch?

Comment 5 Michal Skrivanek 2016-06-20 14:03:08 UTC
patch->path. Sorry.

The point is there needs to be an HE shutdown somewhere, and then it should start up in the right 4.0 cluster.

Comment 6 Carl Thompson 2016-06-21 03:23:00 UTC
Hello, Michal. Thanks for responding.

I don't understand what you're suggesting; I've restarted the HE and it comes back up in the old cluster for which I can't change the Compatibility Version. Are you saying there's a way to restart it so that it comes up in a different cluster? Can you elaborate on how that's done?

Thanks,
Carl

Comment 7 Michal Skrivanek 2016-06-21 07:04:53 UTC
(In reply to Carl Thompson from comment #6)
> Hello, Michal. Thanks for responding.
> 
> I don't understand what you're suggesting; I've restarted the HE and it
> comes back up in the old cluster for which I can't change the Compatibility
> Version. Are you saying there's a way to restart it so that it comes up in a
> different cluster? Can you elaborate on how that's done?
> 
> Thanks,
> Carl

Hi Carl,
this should be solved by bug 1341145 in the 4.0.0.5 build, so please update to the latest available build and try to change the cluster again.

Comment 8 Michal Skrivanek 2016-06-21 08:46:52 UTC
(In reply to Carl Thompson from comment #3)
 
> I tried simply creating a new cluster at version 4.0 but there doesn't seem
> to be any way of moving the Hosted Engine to the new cluster nor does there
> seem to be a way to migrate my 100+ VMs to the new cluster without having to
> manually edit the properties of each one. Is there a way to do this?

There might be a real issue with HE, still investigating, but to answer this specific point about 100 VMs - they need to be shut down if you want to move the cluster to a newer cluster level, or they need to have the per-VM cluster level override set.
This should be improved; I guess it needs to be done soon.
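
For illustration only, a minimal, untested sketch of setting that per-VM override through the v4 Python SDK might look like the following; the engine URL, credentials, and VM name are placeholders, and the custom_compatibility_version attribute is assumed to be the field backing the override:

    # Untested sketch: pin a single VM to the 3.6 cluster level via the oVirt v4 Python SDK.
    import ovirtsdk4 as sdk
    import ovirtsdk4.types as types

    connection = sdk.Connection(
        url='https://engine.example.com/ovirt-engine/api',  # placeholder engine URL
        username='admin@internal',                          # placeholder credentials
        password='password',
        insecure=True,  # for brevity; pass ca_file=... in real use
    )

    vms_service = connection.system_service().vms_service()
    vm = vms_service.list(search='name=myvm')[0]  # placeholder VM name

    # Assumed field: the per-VM compatibility override added in 4.0.
    vms_service.vm_service(vm.id).update(
        types.Vm(custom_compatibility_version=types.Version(major=3, minor=6))
    )
    connection.close()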

Comment 9 Carl Thompson 2016-06-21 14:16:09 UTC
(In reply to Michal Skrivanek from comment #7)

> Hi Carl,
> this should be solved by bug 1341145 in 4.0.0.5 build, so please try to
> update to the latest available and try to change the cluster again

This did not fix the problem for me. Still can't change the Compatibility Version and get the error message "Shut down all VMs before changing the cluster version." Bug 1341145 does not seem to be related to this at all... Are you sure about that bug number?

[root@2sexi ~]# rpm -qa | grep ovirt-engine
ovirt-engine-setup-base-4.0.0.6-1.el7.centos.noarch
ovirt-engine-setup-plugin-websocket-proxy-4.0.0.6-1.el7.centos.noarch
ovirt-engine-vmconsole-proxy-helper-4.0.0.6-1.el7.centos.noarch
ovirt-engine-restapi-4.0.0.6-1.el7.centos.noarch
ovirt-engine-tools-4.0.0.6-1.el7.centos.noarch
ovirt-engine-dwh-4.0.0-2.gita09d329.el7.centos.noarch
ovirt-engine-extension-aaa-jdbc-1.1.0-1.el7.noarch
python-ovirt-engine-sdk4-4.0.0-0.3.a3.el7.centos.x86_64
ovirt-engine-dashboard-1.0.0-0.2.20160610git5d210ea.el7.centos.noarch
ovirt-engine-lib-4.0.0.6-1.el7.centos.noarch
ovirt-engine-setup-plugin-ovirt-engine-common-4.0.0.6-1.el7.centos.noarch
ovirt-engine-setup-plugin-vmconsole-proxy-helper-4.0.0.6-1.el7.centos.noarch
ovirt-engine-extension-aaa-ldap-1.2.0-1.el7.noarch
ovirt-engine-websocket-proxy-4.0.0.6-1.el7.centos.noarch
ovirt-engine-setup-4.0.0.6-1.el7.centos.noarch
ovirt-engine-sdk-python-3.6.7.0-1.el7.centos.noarch
ovirt-engine-tools-backup-4.0.0.6-1.el7.centos.noarch
ovirt-engine-dbscripts-4.0.0.6-1.el7.centos.noarch
ovirt-engine-backend-4.0.0.6-1.el7.centos.noarch
ovirt-engine-4.0.0.6-1.el7.centos.noarch
ovirt-engine-cli-3.6.2.0-1.el7.centos.noarch
ovirt-engine-dwh-setup-4.0.0-2.gita09d329.el7.centos.noarch
ovirt-engine-wildfly-overlay-10.0.0-1.el7.noarch
ovirt-engine-wildfly-10.0.0-1.el7.x86_64
ovirt-engine-setup-plugin-ovirt-engine-4.0.0.6-1.el7.centos.noarch
ovirt-engine-extension-aaa-ldap-setup-1.2.0-1.el7.noarch
ovirt-engine-extensions-api-impl-4.0.0.6-1.el7.centos.noarch
ovirt-engine-webadmin-portal-4.0.0.6-1.el7.centos.noarch
ovirt-engine-userportal-4.0.0.6-1.el7.centos.noarch

Comment 10 Michal Skrivanek 2016-06-21 14:28:08 UTC
Yeah, the number is correct, but there were others still preventing this. We looked at that in more detail today and AFAICT it's not possible to do it even in the latest version. We're working on a fix.
As a workaround for now you'd need to shut down the engine and make a direct database modification, bumping the version number in the table.

Comment 11 Carl Thompson 2016-06-21 14:33:58 UTC
(In reply to Michal Skrivanek from comment #8)

> There might be a real issue with HE, still investigating, but to answer this
> specific point about 100 VMs - they need to be shut down if you want to move
> the cluster to newer cluster level or they need to have the per-VM cluster
> level override set.
> This should be improved. I guess it needs to be done soon

Thanks. However the issue I was alluding to there is that there doesn't seem to be a way to move VMs _en masse_ from one cluster to another. So I have these 100+ VMs (which are turned off currently) and the only way to move them from cluster 1 to cluster 2 is to click into the settings of each one individually. There doesn't seem to be a way to select them all and switch them to a different cluster at once. (I suppose I could try exporting them to the export domain and re-importing them into the new cluster but that seems like a really long workaround to solve a simple problem.)

BTW, this is a test cluster to evaluate whether we want to use oVirt in production so I can make changes and test as necessary to help resolve these issues.

Thanks,
Carl

Comment 12 Carl Thompson 2016-06-21 14:34:14 UTC
(In reply to Michal Skrivanek from comment #8)

> There might be a real issue with HE, still investigating, but to answer this
> specific point about 100 VMs - they need to be shut down if you want to move
> the cluster to newer cluster level or they need to have the per-VM cluster
> level override set.
> This should be improved. I guess it needs to be done soon

Thanks. However the issue I was alluding to there is that there doesn't seem to be a way to move VMs _en masse_ from one cluster to another. So I have these 100+ VMs (which are turned off currently) and the only way to move them from cluster 1 to cluster 2 is to click into the settings of each one individually. There doesn't seem to be a way to select them all and switch them to a different cluster at once. (I suppose I could try exporting them to the export domain and re-importing them into the new cluster but that seems like a really long workaround to solve a simple problem.)

BTW, this is a test cluster to evaluate whether we want to use oVirt in production so I can make changes and test as necessary to help resolve these issues.

Thanks,
Carl

Comment 13 Yedidyah Bar David 2016-06-21 14:42:20 UTC
(In reply to Carl Thompson from comment #12)
> Thanks. However the issue I was alluding to there is that there doesn't seem
> to be a way to move VMs _en masse_ from one cluster to another.

If indeed impossible from the UI, you can open an RFE to allow this.

Should probably be not-too-difficult with the python-sdk (or even ovirt-shell), but I never tried this so can't provide details.
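
For illustration only, an untested sketch of that python-sdk route using the v4 SDK might look like this; the engine URL, credentials, and cluster names are placeholders, and it assumes the VMs are already shut down:

    # Untested sketch: reassign all stopped VMs from one cluster to another via the oVirt v4 Python SDK.
    import ovirtsdk4 as sdk
    import ovirtsdk4.types as types

    connection = sdk.Connection(
        url='https://engine.example.com/ovirt-engine/api',  # placeholder engine URL
        username='admin@internal',                          # placeholder credentials
        password='password',
        insecure=True,
    )

    system_service = connection.system_service()
    vms_service = system_service.vms_service()
    clusters_service = system_service.clusters_service()

    # Placeholder cluster names; the search strings use the admin portal search syntax.
    new_cluster = clusters_service.list(search='name=NewCluster')[0]
    for vm in vms_service.list(search='cluster=OldCluster'):
        if vm.status == types.VmStatus.DOWN:  # only touch VMs that are powered off
            vms_service.vm_service(vm.id).update(
                types.Vm(cluster=types.Cluster(id=new_cluster.id))
            )
    connection.close()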

Comment 14 Michal Skrivanek 2016-06-21 14:46:48 UTC
(In reply to Yedidyah Bar David from comment #13)
> (In reply to Carl Thompson from comment #12)
> > Thanks. However the issue I was alluding to there is that there doesn't seem
> > to be a way to move VMs _en masse_ from one cluster to another.
> 
> If indeed impossible from the UI, you can open an RFE to allow this.
> 
> Should probably be not-too-difficult with the python-sdk (or even
> ovirt-shell), but I never tried this so can't provide details.

We have a bug on mass-editing VMs somewhere around... it's not going to happen any time soon though, not within the current UI infra. :/

Comment 15 Michal Skrivanek 2016-06-21 14:48:44 UTC
(In reply to Carl Thompson from comment #12)
> Thanks. However the issue I was alluding to there is that there doesn't seem
> to be a way to move VMs _en masse_ from one cluster to another. So I have
> these 100+ VMs (which are turned off currently) and the only way to move
> them from cluster 1 to cluster 2 is to click into the settings of each one
> individually. There doesn't seem to be a way to select them all and switch
> them to a different cluster at once. (I suppose I could try exporting them
> to the export domain and re-importing them into the new cluster but that
> seems like a really long workaround to solve a simple problem.)


Sure. Currently, if the only offending VM is the HE VM, you can perhaps move that one using cross-cluster migration to a newly created 3.6 cluster, to clear the one with 100 VMs so it lets you update it. There should not be many drawbacks to running a 4.0 engine in an HE VM that runs in a 3.6 cluster - it's still a 4.0 engine.

Comment 16 Carl Thompson 2016-06-21 16:05:14 UTC
(In reply to Michal Skrivanek from comment #15)

> sure. Currently if the only offending VM is the HE VM you can perhaps move
> that one using the cross-cluster migration to a newly created 3.6 cluster to
> clear the one with 100 VMs so it lets you update it. There should be not so
> many drawbacks running a 4.0 engine in a HE VM running in 3.6 cluster - it's
> still a 4.0 engine.

It does not work to cross-cluster migrate the HE VM. The VM migrates fine to the new host in a different cluster, but the HE VM is still associated with the previous cluster. In other words, the VM is in one cluster but ends up running on a host that's in a different cluster. That itself might be considered a bug. Because the HE VM is still associated with the first cluster, oVirt won't let me update the Compatibility Version of the first cluster (even though the HE VM is actually running on a host in a different cluster)!

Thanks,
Carl

Comment 17 Carl Thompson 2016-06-21 16:09:48 UTC
(In reply to Michal Skrivanek from comment #10)

> As a workaround for now you'd need to shut down the engine and make a direct
> database modification upgrading the number in the table

Is this all I need to do:

    engine=# UPDATE cluster SET compatibility_version='4.0' WHERE name='ClusterOne';

Thanks,
Carl

Comment 18 Michal Skrivanek 2016-06-21 16:43:38 UTC
(In reply to Carl Thompson from comment #17)
> (In reply to Michal Skrivanek from comment #10)
> 
> > As a workaround for now you'd need to shut down the engine and make a direct
> > database modification upgrading the number in the table
> 
> Is this all I need to do:
> 
>     engine=# UPDATE cluster SET compatibility_version='4.0' WHERE
> name='ClusterOne';
> 
> Thanks,
> Carl

Yes, that sounds about right. Should work. 

Didi, can you look at comment #16 re the HE behavior? Maybe it needs to be changed in the OVF or whatever is used for the HE VM...

Comment 19 Carl Thompson 2016-06-21 16:47:50 UTC
(In reply to Michal Skrivanek from comment #18)
> (In reply to Carl Thompson from comment #17)
> > Is this all I need to do:
> > 
> >     engine=# UPDATE cluster SET compatibility_version='4.0' WHERE
> > name='ClusterOne';
> > 
> > Thanks,
> > Carl
> 
> Yes, that sounds about right. Should work. 

This worked, thanks. Obviously not ideal but my cluster's Compatibility Version is updated.

Comment 20 Carl Thompson 2016-06-21 17:25:29 UTC
Out of curiosity, why is being able to change the Compatibility Version while VMs are running a bug? I can't imagine that changing the version would affect running VMs until they're restarted...

Comment 21 Yedidyah Bar David 2016-06-21 19:33:03 UTC
(In reply to Michal Skrivanek from comment #18)
> Didi, can you see comment #16 re HE behavior, maybe it needs to be changed
> in ovf or whatever is used for HE VM...

No idea about the engine's behavior re the hosted-engine vm. Roy - can you please have a look?

I'd just like to emphasize that there are two different, mostly-unrelated meanings, to the word "cluster" here:

1. A cluster in the engine (in its database)
2. A hosted-engine cluster - that is, all the hosts on which 'hosted-engine --deploy' was run against a single hosted-storage.

Generally speaking, they should be identical in practice. But the code handling migrations inside the latter (2.), ovirt-hosted-engine-ha, does not check how the host it's running on is configured in the engine - all it needs is access to the hosted-storage. 'hosted-engine --deploy' should put them all in the same (1.)-cluster, but I am not sure if anything prevents moving hosts later between (1.)-clusters. If that's allowed (intentionally or not), the (2.)-cluster is not affected.

Comment 22 Carl Thompson 2016-06-21 21:04:36 UTC
For comment #16 and my other comments I was speaking only of clusters as they are configured in the engine's database. I was not speaking about HE clusters at all. The HE itself appears to be special-cased in the engine's database, which causes these issues.

Comment 23 Michal Skrivanek 2016-06-22 09:45:27 UTC
I would really appreciate it if someone could clarify the upgrade procedure and whether it does or can work as of today (looking rgolan's way ;)

Regardless, for the issue with "100 VMs need to be shut down" I opened a separate bug 1348907 to allow for incremental upgrade while VMs are running.

Comment 25 errata-xmlrpc 2016-06-29 16:20:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1364

Comment 26 Marina Kalinin 2016-07-13 16:15:55 UTC
I opened a documentation bug for this:
https://bugzilla.redhat.com/show_bug.cgi?id=1356198

And here is the d/s bug to allow this change with running VMs:
https://bugzilla.redhat.com/show_bug.cgi?id=1356194