Bug 1723578

Summary: [downstream clone - 4.3.5] Increase of ClusterCompatibilityVersion to Cluster with virtual machines with outstanding configuration changes, those changes will be reverted
Product: Red Hat Enterprise Virtualization Manager Reporter: RHV bug bot <rhv-bugzilla-bot>
Component: ovirt-engine Assignee: Shmuel Melamud <smelamud>
Status: CLOSED ERRATA QA Contact: Liran Rotenberg <lrotenbe>
Severity: high Docs Contact:
Priority: high    
Version: 4.2.0 CC: michal.skrivanek, mtessun, rbarry, Rhev-m-bugs
Target Milestone: ovirt-4.3.5 Keywords: ZStream
Target Release: 4.3.5 Flags: lsvaty: testing_plan_complete-
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: ovirt-engine-4.3.5.4 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1650505 Environment:
Last Closed: 2019-08-12 11:53:28 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Virt RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1650505    
Bug Blocks:    

Description RHV bug bot 2019-06-24 20:36:00 UTC
+++ This bug is a downstream clone. The original bug is: +++
+++   bug 1650505 +++
======================================================================

Description of problem:
If a cluster contains virtual machines with outstanding configuration changes that require a reboot, and the cluster compatibility version is upgraded to a newer version, the upgrade overwrites the pending changes and the current active configuration is used instead.


Version-Release number of selected component (if applicable):
ovirt-engine-4.2.6.4-0.1.el7ev.noarch

How reproducible:
100%

Steps to Reproduce:
1. Make a change to a virtual machine (e.g. increase iothreads through an API call)
2. Increase the cluster compatibility version


Actual results:
The change is lost when the cluster compatibility version is upgraded.

Expected results:
Besides the new cluster compatibility version, the change should also be included in the next_run configuration.
If that is not possible, customers should be made aware that when the cluster version is increased while virtual machines are running, any configuration change that has not yet been applied will be lost.

Additional info:
[root@rhhi ansible]# /usr/share/ovirt-engine/dbscripts/engine-psql.sh -c "select num_of_io_threads,custom_compatibility_version from vm_static where vm_name='faye-pulanco.crazy.lab'"
 num_of_io_threads | custom_compatibility_version
-------------------+------------------------------
                 0 |
(1 row)


[1]: made change of iothreads
======================
[root@rhhi ansible]# ansible localhost -m ovirt_vms -e @cred.yml --args='auth={{ ovirt_auth }} name=faye-pulanco.crazy.lab io_threads=1'


[2]: verify the change
================
[root@rhhi ansible]# /usr/share/ovirt-engine/dbscripts/engine-psql.sh -t -c "select vm_configuration from snapshots where snapshot_type='NEXT_RUN' and vm_id=(select vm_guid from vm_static where vm_name='faye-pulanco.crazy.lab');"| cut -c 2- | xmllint --format - | grep -E "CompatibilityVersion|NumOfIoThreads"
    <NumOfIoThreads>1</NumOfIoThreads>
    <ClusterCompatibilityVersion>4.1</ClusterCompatibilityVersion>


[3]: increase ClusterCompatVersion using UI and verify the Change!

[root@rhhi ansible]# /usr/share/ovirt-engine/dbscripts/engine-psql.sh -t -c "select vm_configuration from snapshots where snapshot_type='NEXT_RUN' and vm_id=(select vm_guid from vm_static where vm_name='faye-pulanco.crazy.lab');"| cut -c 2- | xmllint --format - | grep -E "CompatibilityVersion|NumOfIoThreads"
    <NumOfIoThreads>0</NumOfIoThreads>
    <ClusterCompatibilityVersion>4.2</ClusterCompatibilityVersion>


==> the prior change is reverted.
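
The xmllint/grep check above can also be scripted. The following is a minimal sketch, assuming the NEXT_RUN configuration has already been fetched as a string; the sample document is a hypothetical, heavily simplified stand-in for the real vm_configuration content, and the helper names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified stand-in for the vm_configuration column.
# A real NEXT_RUN snapshot holds a much larger document.
SAMPLE_NEXT_RUN = """
<Envelope>
  <Content>
    <NumOfIoThreads>0</NumOfIoThreads>
    <ClusterCompatibilityVersion>4.2</ClusterCompatibilityVersion>
  </Content>
</Envelope>
"""

FIELDS = ("NumOfIoThreads", "ClusterCompatibilityVersion")


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def extract_fields(xml_text, fields=FIELDS):
    """Return {field: text} for the first occurrence of each requested element."""
    # strip() also removes the leading space that `psql -t` prints,
    # which the shell pipeline handles with `cut -c 2-`.
    root = ET.fromstring(xml_text.strip())
    found = {}
    for element in root.iter():
        name = local_name(element.tag)
        if name in fields and name not in found:
            found[name] = (element.text or "").strip()
    return found


if __name__ == "__main__":
    print(extract_fields(SAMPLE_NEXT_RUN))
```

Running it before and after the cluster upgrade and comparing the two dictionaries shows whether NumOfIoThreads was reset.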

(Originally by Steffen Froemer)

Comment 1 RHV bug bot 2019-06-24 20:36:03 UTC
I would suggest documenting the behaviour, i.e. telling the customer to update the cluster compatibility level once all pending changes have been applied, as otherwise the changes will be lost.

@Michal: Thoughts?

(Originally by Martin Tessun)

Comment 2 RHV bug bot 2019-06-24 20:36:04 UTC
(In reply to Martin Tessun from comment #1)
> I would suggest documenting the behaviour, i.e. telling the customer to
> update the cluster compatibility level once all pending changes

In the admin guide we say: "After you update the cluster’s compatibility version, you must update the cluster compatibility version of all running or suspended virtual machines by restarting them from within the Manager"

I agree it could be more verbose: it could discuss the implications for pending configuration changes, suggest doing it at the earliest convenient maintenance window, etc.


It's also fixable in code, but updating the NEXT_RUN configuration instead of the active configuration should still be discouraged, as it increases the likelihood of failing to apply it later.
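
To make the two options concrete, here is a toy model of the difference. It is a sketch under stated assumptions and not the engine's code: the VmConfig class, its two fields, and both function names are hypothetical, and the values are the ones from the description (iothreads 0 -> 1, cluster 4.1 -> 4.2).

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class VmConfig:
    """Toy subset of a VM configuration; field names are illustrative."""
    num_of_io_threads: int
    cluster_compatibility_version: str


def next_run_from_active(active: VmConfig,
                         pending: Optional[VmConfig],
                         new_version: str) -> VmConfig:
    """Reported behaviour: NEXT_RUN is rebuilt from the active configuration,
    so whatever was pending is dropped."""
    return replace(active, cluster_compatibility_version=new_version)


def next_run_preserving_pending(active: VmConfig,
                                pending: Optional[VmConfig],
                                new_version: str) -> VmConfig:
    """Expected behaviour: start from the pending configuration if there is
    one and change only the compatibility version."""
    base = pending if pending is not None else active
    return replace(base, cluster_compatibility_version=new_version)


if __name__ == "__main__":
    active = VmConfig(num_of_io_threads=0, cluster_compatibility_version="4.1")
    pending = VmConfig(num_of_io_threads=1, cluster_compatibility_version="4.1")
    print(next_run_from_active(active, pending, "4.2"))
    print(next_run_preserving_pending(active, pending, "4.2"))
```

The first function reproduces the reverted iothreads value from the description; the second keeps the pending value while still carrying the new version.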

(Originally by michal.skrivanek)

Comment 3 RHV bug bot 2019-06-24 20:36:06 UTC
My suggestion would be to not allow increasing the cluster’s compatibility version if there are outstanding configuration changes on running virtual machines.

This also needs to be documented in the update guide: it is impossible to perform the upgrade if there are VMs left which require a restart.

That would protect customers from unwanted behavior.
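
A pre-check of this kind could look roughly like the sketch below. It is only an illustration of the proposal: the Vm class, the has_next_run_config flag (standing in for "a NEXT_RUN snapshot exists for this VM"), and the exception name are all hypothetical and do not come from the engine.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Vm:
    """Toy VM record; the fields are illustrative only."""
    name: str
    running: bool
    has_next_run_config: bool  # stands in for "a NEXT_RUN snapshot exists"


class PendingChangesError(Exception):
    """Raised when running VMs still have unapplied configuration changes."""


def vms_blocking_upgrade(vms: Iterable[Vm]) -> List[str]:
    """Names of running VMs that still carry a pending next-run configuration."""
    return [vm.name for vm in vms if vm.running and vm.has_next_run_config]


def check_cluster_upgrade_allowed(vms: Iterable[Vm]) -> None:
    """Refuse the compatibility version upgrade while any VM is blocking it."""
    blocking = vms_blocking_upgrade(vms)
    if blocking:
        raise PendingChangesError(
            "Restart these VMs before upgrading the cluster: " + ", ".join(blocking)
        )


if __name__ == "__main__":
    cluster = [
        Vm("vm-a", running=True, has_next_run_config=True),
        Vm("vm-b", running=True, has_next_run_config=False),
    ]
    try:
        check_cluster_upgrade_allowed(cluster)
    except PendingChangesError as err:
        print(err)
```

Only running VMs are considered here, on the assumption that a VM which is down picks up its configuration on the next start anyway.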

(Originally by Steffen Froemer)

Comment 4 RHV bug bot 2019-06-24 20:36:07 UTC
(In reply to Steffen Froemer from comment #3)
> My suggestion would be to not allow increasing the cluster’s
> compatibility version if there are outstanding configuration changes on
> running virtual machines.
> 
> This also needs to be documented in the update guide: it is impossible to
> perform the upgrade if there are VMs left which require a restart.
> 
> That would protect customers from unwanted behavior.

Ack. Sounds good to me.

(Originally by Martin Tessun)

Comment 5 RHV bug bot 2019-06-24 20:36:09 UTC
It does sound too restrictive to me, though.

(Originally by michal.skrivanek)

Comment 6 RHV bug bot 2019-06-24 20:36:11 UTC
Re-targeting to 4.3.1 since it is missing a patch, an acked blocker flag, or both

(Originally by Ryan Barry)

Comment 11 Liran Rotenberg 2019-07-14 11:27:17 UTC
Verified on:
ovirt-engine-4.3.5.4-0.1.el7.noarch

Steps:
1. Create a 4.2 cluster.
2. Create a VM with iothreads=0.
3. Run the VM.
4. Verify iothreads:
# /usr/share/ovirt-engine/dbscripts/engine-psql.sh -c "select num_of_io_threads,custom_compatibility_version from vm_static where vm_name='<VM_NAME>'"

5. Edit the VM to iothreads=1.
6. Verify next run change:
# /usr/share/ovirt-engine/dbscripts/engine-psql.sh -t -c "select vm_configuration from snapshots where snapshot_type='NEXT_RUN' and vm_id=(select vm_guid from vm_static where vm_name='<VM_NAME>');"| cut -c 2- | xmllint --format - | grep -E "CompatibilityVersion|NumOfIoThreads"

7. Upgrade the cluster to 4.3; the compatibility version of the VM moves to 4.3.
8. Check the next-run configuration again with the command from step 6.
9. Reboot the VM.
10. Verify iothreads with the command from step 4.

Results:
In step 4 we have io threads disabled:
 num_of_io_threads | custom_compatibility_version 
-------------------+------------------------------
                 0 | 
(1 row)

After changing it, the next-run configuration shows (step 6):
    <NumOfIoThreads>1</NumOfIoThreads>
    <ClusterCompatibilityVersion>4.2</ClusterCompatibilityVersion>

After the cluster was upgraded to 4.3 (step 8):
    <NumOfIoThreads>1</NumOfIoThreads>
    <ClusterCompatibilityVersion>4.3</ClusterCompatibilityVersion>

After the reboot the VM had:
 num_of_io_threads | custom_compatibility_version 
-------------------+------------------------------
                 1 | 
(1 row)

Comment 13 errata-xmlrpc 2019-08-12 11:53:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:2431

Comment 14 Daniel Gur 2019-08-28 13:13:44 UTC
sync2jira