Bug 1723578 - [downstream clone - 4.3.5] Increasing the ClusterCompatibilityVersion of a cluster that has virtual machines with outstanding configuration changes reverts those changes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.2.0
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ovirt-4.3.5
Target Release: 4.3.5
Assignee: Shmuel Melamud
QA Contact: Liran Rotenberg
URL:
Whiteboard:
Depends On: 1650505
Blocks:
 
Reported: 2019-06-24 20:36 UTC by RHV bug bot
Modified: 2022-03-13 17:05 UTC (History)
4 users

Fixed In Version: ovirt-engine-4.3.5.4
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1650505
Environment:
Last Closed: 2019-08-12 11:53:28 UTC
oVirt Team: Virt
Target Upstream Version:
Embargoed:
lsvaty: testing_plan_complete-




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2019:2431 0 None None None 2019-08-12 11:53:40 UTC
oVirt gerrit 97224 0 'None' MERGED core: Block cluster upgrade if there are VMs with NEXT_RUN 2020-12-22 15:25:21 UTC
oVirt gerrit 97688 0 'None' MERGED core: Take NEXT_RUN into account on cluster upgrade 2020-12-22 15:24:50 UTC

Description RHV bug bot 2019-06-24 20:36:00 UTC
+++ This bug is a downstream clone. The original bug is: +++
+++   bug 1650505 +++
======================================================================

Description of problem:
If a virtual machine has outstanding configuration changes that require a reboot to take effect, and the cluster's CompatibilityVersion is then upgraded to a newer version, the upgrade overwrites the pending changes and uses the currently active configuration instead.


Version-Release number of selected component (if applicable):
ovirt-engine-4.2.6.4-0.1.el7ev.noarch

How reproducible:
100%

Steps to Reproduce:
1. Make a change to a virtual machine (e.g. increase iothreads through an API call)
2. Increase the cluster version


Actual results:
The change is lost when the cluster CompatibilityVersion is upgraded.

Expected results:
Besides receiving the new ClusterCompatibilityVersion, the pending change should also be included in the next_run configuration.
If that is not possible, customers should be made aware that increasing the cluster version while virtual machines are running will lose any configuration change that has not yet been applied.

Additional info:
[root@rhhi ansible]# /usr/share/ovirt-engine/dbscripts/engine-psql.sh -c "select num_of_io_threads,custom_compatibility_version from vm_static where vm_name='faye-pulanco.crazy.lab'"
 num_of_io_threads | custom_compatibility_version
-------------------+------------------------------
                 0 |
(1 row)


[1]: make the iothreads change
======================
[root@rhhi ansible]# ansible localhost -m ovirt_vms -e @cred.yml --args='auth={{ ovirt_auth }} name=faye-pulanco.crazy.lab io_threads=1'


[2]: verify the change
================
[root@rhhi ansible]# /usr/share/ovirt-engine/dbscripts/engine-psql.sh -t -c "select vm_configuration from snapshots where snapshot_type='NEXT_RUN' and vm_id=(select vm_guid from vm_static where vm_name='faye-pulanco.crazy.lab');"| cut -c 2- | xmllint --format - | grep -E "CompatibilityVersion|NumOfIoThreads"
    <NumOfIoThreads>1</NumOfIoThreads>
    <ClusterCompatibilityVersion>4.1</ClusterCompatibilityVersion>


[3]: increase the ClusterCompatibilityVersion using the UI and verify the change

[root@rhhi ansible]# /usr/share/ovirt-engine/dbscripts/engine-psql.sh -t -c "select vm_configuration from snapshots where snapshot_type='NEXT_RUN' and vm_id=(select vm_guid from vm_static where vm_name='faye-pulanco.crazy.lab');"| cut -c 2- | xmllint --format - | grep -E "CompatibilityVersion|NumOfIoThreads"
    <NumOfIoThreads>0</NumOfIoThreads>
    <ClusterCompatibilityVersion>4.2</ClusterCompatibilityVersion>


==> the prior change is reverted.

(Originally by Steffen Froemer)

Comment 1 RHV bug bot 2019-06-24 20:36:03 UTC
I would suggest documenting the behaviour, i.e. telling the customer to update the cluster compatibility level only once all pending changes have been applied, as otherwise the changes will be lost.

@Michal: Thoughts?

(Originally by Martin Tessun)

Comment 2 RHV bug bot 2019-06-24 20:36:04 UTC
(In reply to Martin Tessun from comment #1)
> I would suggest documenting the behaviour, i.e. telling the customer to
> update the cluster compatibility level only once all pending changes

in admin guide we say "After you update the cluster’s compatibility version, you must update the cluster compatibility version of all running or suspended virtual machines by restarting them from within the Manager"

I agree it could be more verbose and discuss the implications for configuration changes, suggest doing it at the earliest convenient maintenance window, etc.


It's also fixable in code, but updating the NEXT_RUN instead of active configuration should still be discouraged as it increases the likelihood of failing to apply it later.

(Originally by michal.skrivanek)
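The in-code fix mentioned above was later merged upstream as gerrit 97688 ("core: Take NEXT_RUN into account on cluster upgrade"): on a cluster bump, the new compatibility version is applied to the pending NEXT_RUN configuration as well, instead of discarding it. A minimal sketch of that idea; the dict-based configs and the function name are illustrative stand-ins, not the engine's actual classes:

```python
# Sketch only: bump the compatibility version inside the pending NEXT_RUN
# configuration instead of discarding it. Field names mirror the XML keys
# seen in the snapshots table; everything else is illustrative.
def upgrade_cluster_version(active_config, next_run_config, new_version):
    """Return (active, next_run) after a cluster bump. Pending fields in
    next_run (e.g. NumOfIoThreads) survive; only the version moves."""
    active = dict(active_config, ClusterCompatibilityVersion=new_version)
    if next_run_config is None:
        return active, None
    next_run = dict(next_run_config, ClusterCompatibilityVersion=new_version)
    return active, next_run

active, next_run = upgrade_cluster_version(
    {"NumOfIoThreads": 0, "ClusterCompatibilityVersion": "4.2"},
    {"NumOfIoThreads": 1, "ClusterCompatibilityVersion": "4.2"},
    "4.3",
)
print(next_run)  # {'NumOfIoThreads': 1, 'ClusterCompatibilityVersion': '4.3'}
```

This matches the behavior verified in comment 11: after the upgrade, the NEXT_RUN configuration keeps NumOfIoThreads=1 while its ClusterCompatibilityVersion moves to 4.3.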

Comment 3 RHV bug bot 2019-06-24 20:36:06 UTC
My suggestion would be to disallow increasing the cluster's compatibility version while there are outstanding configuration changes on running virtual machines.

This also needs to be documented in the upgrade guide: the upgrade cannot be performed while there are VMs left that require a restart.

That would protect customers from this unwanted behavior.

(Originally by Steffen Froemer)
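The suggestion above matches the validation later merged upstream as gerrit 97224 ("core: Block cluster upgrade if there are VMs with NEXT_RUN"). A minimal sketch of such a pre-upgrade check, assuming a hypothetical `Vm` record with a pending-changes flag (not the engine's actual classes):

```python
# Sketch only: refuse the compatibility-version bump while any VM still
# has an unapplied NEXT_RUN configuration. Names are illustrative.
from dataclasses import dataclass

@dataclass
class Vm:
    name: str
    has_next_run_config: bool  # pending changes waiting for a restart

def validate_cluster_upgrade(vms):
    """Return (ok, blockers): ok is False if any VM would lose a
    pending NEXT_RUN configuration during the upgrade."""
    blockers = [vm.name for vm in vms if vm.has_next_run_config]
    return (not blockers, blockers)

ok, blockers = validate_cluster_upgrade([
    Vm("faye-pulanco.crazy.lab", has_next_run_config=True),
    Vm("other-vm", has_next_run_config=False),
])
print(ok, blockers)  # False ['faye-pulanco.crazy.lab']
```

With such a check in place, the upgrade would fail fast and name the blocking VMs, rather than silently reverting their pending changes.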

Comment 4 RHV bug bot 2019-06-24 20:36:07 UTC
(In reply to Steffen Froemer from comment #3)
> My suggestion would be to disallow increasing the cluster's
> compatibility version while there are outstanding configuration changes
> on running virtual machines.
> 
> This also needs to be documented in the upgrade guide: the upgrade
> cannot be performed while there are VMs left that require a restart.
> 
> That would protect customers from this unwanted behavior.

Ack. Sounds good to me.

(Originally by Martin Tessun)

Comment 5 RHV bug bot 2019-06-24 20:36:09 UTC
It does sound a bit too restrictive to me, though.

(Originally by michal.skrivanek)

Comment 6 RHV bug bot 2019-06-24 20:36:11 UTC
Re-targeting to 4.3.1 since it is missing a patch, an acked blocker flag, or both

(Originally by Ryan Barry)

Comment 11 Liran Rotenberg 2019-07-14 11:27:17 UTC
Verified on:
ovirt-engine-4.3.5.4-0.1.el7.noarch

Steps:
1. Create a 4.2 cluster.
2. Create a VM with iothreads=0.
3. Run the VM.
4. Verify iothreads:
# /usr/share/ovirt-engine/dbscripts/engine-psql.sh -c "select num_of_io_threads,custom_compatibility_version from vm_static where vm_name='<VM_NAME>'"

5. Edit the VM to iothreads=1.
6. Verify next run change:
# /usr/share/ovirt-engine/dbscripts/engine-psql.sh -t -c "select vm_configuration from snapshots where snapshot_type='NEXT_RUN' and vm_id=(select vm_guid from vm_static where vm_name='<VM_NAME>');"| cut -c 2- | xmllint --format - | grep -E "CompatibilityVersion|NumOfIoThreads"

7. Upgrade the cluster to 4.3; the VM's compatibility version moves to 4.3.
8. Check next run again with the command of step 6.
9. Reboot VM.
10. Verify iothreads with the command of step 4.

Results:
In step 4 we have io threads disabled:
 num_of_io_threads | custom_compatibility_version 
-------------------+------------------------------
                 0 | 
(1 row)

After changing it, the next-run configuration has (step 6):
    <NumOfIoThreads>1</NumOfIoThreads>
    <ClusterCompatibilityVersion>4.2</ClusterCompatibilityVersion>

After the cluster changed to 4.3 (step 8):
    <NumOfIoThreads>1</NumOfIoThreads>
    <ClusterCompatibilityVersion>4.3</ClusterCompatibilityVersion>

After the reboot the VM had:
 num_of_io_threads | custom_compatibility_version 
-------------------+------------------------------
                 1 | 
(1 row)

Comment 13 errata-xmlrpc 2019-08-12 11:53:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:2431

Comment 14 Daniel Gur 2019-08-28 13:13:44 UTC
sync2jira

Comment 15 Daniel Gur 2019-08-28 13:17:57 UTC
sync2jira

