Bug 1557272

Summary: If VMs are paused while the cluster compatibility level is upgraded, they may receive 0 as maxmemory and fail to start
Product: [oVirt] ovirt-engine Reporter: Jiri Belka <jbelka>
Component: BLL.Virt Assignee: Michal Skrivanek <michal.skrivanek>
Status: CLOSED WORKSFORME QA Contact: meital avital <mavital>
Severity: high Docs Contact:
Priority: unspecified    
Version: 4.2.2 CC: bugs, lleistne, lsvaty, mavital, ratamir, rbarry
Target Milestone: ovirt-4.4.0 Keywords: Regression
Target Release: --- Flags: rule-engine: ovirt-4.4+
pm-rhel: blocker+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-03-20 14:40:10 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Virt RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Jiri Belka 2018-03-16 11:03:49 UTC
Description of problem:

By some miracle a few Windows VMs got this error during start:

~~~
VM brq-w2k8r2 is down with error. Exit message: XML error: both maximum memory size and memory slot count must be specified.
~~~

engine=# select vm_name,mem_size_mb,max_memory_size_mb from vms where vm_name like 'brq-w2k%';
   vm_name   | mem_size_mb | max_memory_size_mb 
-------------+-------------+--------------------
 brq-w2k12r2 |        4096 |              16384
 brq-w2k8r2  |        2048 |            4194304
(2 rows)

The memory issue on 'brq-w2k12r2' was corrected by a colleague; the VM that is still broken, brq-w2k8r2, got my attention, hence this bug report.

No idea what happened. Our env started behaving oddly after a huge overload. We recently updated the cluster compat level to 4.2 and the VMs probably were not power-cycled. The colleague says those VMs are in a paused state.

Version-Release number of selected component (if applicable):
ovirt-engine-4.2.2.1-0.1.el7.noarch

How reproducible:
1 in 1

Steps to Reproduce:
1.
2.
3.

Actual results:
VMs got paused, got an odd max memory size, and cannot be started

Expected results:
working vms

Additional info:
- attachment upload in progress...

Comment 2 Michal Skrivanek 2018-04-16 12:29:17 UTC
I think it used to work; in the case of 4TB, hot plug is supposed to be disabled
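
The 4TB figure corresponds to max_memory_size_mb = 4194304 in the query output from the description. As a quick arithmetic check (a sketch only, not a confirmed root cause): that value expressed in KiB is exactly 2^32, so it would read as 0 if it ever passed through a 32-bit integer, which would match the "0 as maxmemory" in the summary. Whether any component actually performs such a conversion is an assumption not established in this report, and the helper names below are hypothetical.

```python
# Values copied from the query output in the description (MB, as stored by the engine).
MAX_MEMORY_MB = {"brq-w2k12r2": 16384, "brq-w2k8r2": 4194304}


def mb_to_kib(mb: int) -> int:
    """Convert a size in MB (MiB) to KiB."""
    return mb * 1024


def wrap_32bit(value: int) -> int:
    """Keep only the low 32 bits of a value (illustrative assumption, see above)."""
    return value & 0xFFFFFFFF


for name, mb in MAX_MEMORY_MB.items():
    kib = mb_to_kib(mb)
    # Columns: VM name, size in MB, size in KiB, size in KiB after 32-bit truncation
    print(name, mb, kib, wrap_32bit(kib))
```

For brq-w2k8r2 the KiB value is 4294967296 (2^32) and the truncated value is 0; for brq-w2k12r2 the value is unchanged by truncation.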

Comment 3 Michal Skrivanek 2018-04-16 13:00:37 UTC
*** Bug 1565627 has been marked as a duplicate of this bug. ***

Comment 4 Michal Skrivanek 2018-04-19 07:48:01 UTC
  <memory>2097152</memory>
  <currentMemory>2097152</currentMemory>
  <maxMemory slots="16">0</maxMemory>
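
The fragment above has a non-zero slot count but a maximum size of 0, which is consistent with the libvirt error quoted in the description. A minimal sketch of a check for that condition follows, using only the Python standard library. The `<domain>` wrapper element is added here as an assumption, because the comment shows only the three memory elements, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# The three elements from the comment, wrapped in an assumed <domain> root.
FRAGMENT = """
<domain>
  <memory>2097152</memory>
  <currentMemory>2097152</currentMemory>
  <maxMemory slots="16">0</maxMemory>
</domain>
"""


def max_memory_problems(domain_xml: str) -> list[str]:
    """Return a list of problems found in the <maxMemory> element, if any."""
    root = ET.fromstring(domain_xml.strip())
    node = root.find("maxMemory")
    if node is None:
        return []  # no memory hot plug configured, nothing to check

    problems = []
    size = int((node.text or "").strip() or "0")
    slots = int(node.get("slots", "0"))
    if size <= 0:
        problems.append("maxMemory size is 0 or missing")
    if slots <= 0:
        problems.append("maxMemory slot count is 0 or missing")
    return problems


print(max_memory_problems(FRAGMENT))
```

Run against the fragment from this comment, the check reports only the size problem, since slots="16" is present.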

Comment 5 Michal Skrivanek 2018-04-19 09:41:06 UTC
can you reproduce with debug logs enabled - just the VM run attempt?

Comment 7 Ryan Barry 2018-08-15 21:25:05 UTC
Deferring due to capacity and a lack of customer tickets so far

Comment 8 Red Hat Bugzilla Rules Engine 2018-08-15 21:25:13 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 9 Red Hat Bugzilla Rules Engine 2018-08-22 14:20:49 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 10 Ryan Barry 2019-01-21 13:34:21 UTC
Re-targeting, because these bugs either do not have blocker+, or do not have a patch posted

Comment 13 Michal Skrivanek 2019-03-14 13:22:09 UTC
right, I propose to close it. WDYS?

Comment 14 RHEL Program Management 2019-03-14 13:22:13 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 15 Ryan Barry 2019-03-19 10:14:44 UTC
Lukas, still reproducible?

Comment 16 Lucie Leistnerova 2019-03-20 14:40:10 UTC
I've tested cluster upgrade in 4.1.11 -> 4.2.8 -> 4.3.2 with a paused VM and it always ran again without any problem. So this BZ can be closed, and reopened if the problem appears again.