Bug 1436247 - Can't turn stateless check box back for stateless VMs after making changes to their configuration and making a snapshot from them.
Summary: Can't turn stateless check box back for stateless VMs after making changes to...
Keywords:
Status: CLOSED DUPLICATE of bug 1438188
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Virt
Version: 4.1.1.6
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Michal Skrivanek
QA Contact: Nikolai Sednev
URL:
Whiteboard:
Depends On: 1430009
Blocks:
 
Reported: 2017-03-27 13:49 UTC by Nikolai Sednev
Modified: 2017-04-05 07:14 UTC (History)

Fixed In Version:
Clone Of: 1430009
Environment:
Last Closed: 2017-04-04 06:18:19 UTC
oVirt Team: Virt
Embargoed:
nsednev: planning_ack?
nsednev: devel_ack?
nsednev: testing_ack?


Attachments
Output from ovirt-shell showing Blank template and VM (4.47 KB, text/plain)
2017-03-27 17:13 UTC, Marcus Sundberg
no flags
Log trying to update VM first unsuccessfully, and then successfully (25.01 KB, text/plain)
2017-03-28 09:22 UTC, Marcus Sundberg
no flags


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 73802 0 None None None 2017-03-27 13:49:26 UTC
oVirt gerrit 73803 0 None None None 2017-03-27 13:49:26 UTC
oVirt gerrit 74251 0 None None None 2017-03-27 13:49:26 UTC
oVirt gerrit 74266 0 None None None 2017-03-27 13:49:26 UTC
oVirt gerrit 74338 0 None None None 2017-03-27 13:49:26 UTC

Description Nikolai Sednev 2017-03-27 13:49:27 UTC
+++ This bug was initially created as a clone of Bug #1430009 +++

It is possible to create a new VM based on an existing template, and "admin" is able to change any parameter of the new VM. If admin changes "Video type" from QXL to VGA, the created VM fails to start.

ovirt-engine-4.1.1.3-0.1.el7.noarch

How reproducible: always

Steps to Reproduce:

1. Login to Admin portal.
2. Create a template. Video type must be QXL. Guest OS doesn't matter (Linux/Windows).
3. Start creation of a new VM.
4. Change template from "Blank" to template created at step #2.
5. Go to "Console" tab. Change "Video type" == "VGA".
6. Confirm creation of new VM. (OK)
7. Start created VM.

Actual results: Created VM fails to start.

--- Additional comment from Martin Betak on 2017-03-08 08:21:57 EST ---

@Andrei, can you please provide more complete engine log - and not just the immediate snippet? Thank you.

--- Additional comment from Andrei Stepanov on 2017-03-08 08:25 EST ---



--- Additional comment from Martin Betak on 2017-03-09 06:31:09 EST ---

Note for reproduction it is critical that the original template has not only QXL video type but also the USB Support set to ENABLED, since this was a bug related to treatment of USB controllers when changing the video type.

--- Additional comment from  on 2017-03-13 09:16:38 EDT ---



--- Additional comment from Red Hat Bugzilla Rules Engine on 2017-03-13 09:30:23 EDT ---

This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

--- Additional comment from Nikolai Sednev on 2017-03-15 10:12:27 EDT ---

Is it possible that https://bugzilla.redhat.com/show_bug.cgi?id=1427104#c20 is also related to this bug? I was not able to start the VM even without changing its video type.

--- Additional comment from Michal Skrivanek on 2017-03-17 12:04:12 EDT ---

the fix seems to fix other upgrade-related problems in this area, the fix is low risk, better backport it to 4.1.1

--- Additional comment from  on 2017-03-17 15:20 EDT ---

It would be nice to also test the following scenario:

1. Change Blank template "USB Support" to Enabled (default is Disabled)
2. Create a VM using REST api and customize usb support to enable it.
   POST /vms
   <vm>
       <name>cute-vm</name>
       <template>
           <name>Blank</name>
       </template>
       <cluster>
           <name>Default</name>
       </cluster>
       <usb>
           <enabled>false</enabled>
           <type>native</type>
       </usb>
       <!-- add bootable device customization -->
   </vm>
3. Run the vm

Actual result:
It fails in step 2 with the stack trace in the attachment.

Expected result:
VM boots
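
For anyone scripting this scenario, the request body from step 2 can be generated rather than typed by hand. The following is only a sketch: the element names and values are copied from the XML above, while the helper name build_vm_request and its parameters are made up for illustration. It builds the body only; sending it to the engine (URL, authentication) is left out.

```python
import xml.etree.ElementTree as ET


def build_vm_request(name, template, cluster, usb_enabled, usb_type="native"):
    """Return the XML body for POST /vms as a string.

    Hypothetical helper; element names mirror the request in the comment above.
    """
    vm = ET.Element("vm")
    ET.SubElement(vm, "name").text = name
    ET.SubElement(ET.SubElement(vm, "template"), "name").text = template
    ET.SubElement(ET.SubElement(vm, "cluster"), "name").text = cluster
    usb = ET.SubElement(vm, "usb")
    # The API expects lowercase booleans in XML.
    ET.SubElement(usb, "enabled").text = "true" if usb_enabled else "false"
    ET.SubElement(usb, "type").text = usb_type
    return ET.tostring(vm, encoding="unicode")


# Same values as in the scenario above.
body = build_vm_request("cute-vm", "Blank", "Default", usb_enabled=False)
print(body)
```

Flipping usb_enabled between runs makes it easy to cover both the case where the VM matches the template setting and the case where it overrides it.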

--- Additional comment from Yaniv Kaul on 2017-03-19 04:31:57 EDT ---

(In reply to Michal Skrivanek from comment #7)
> the fix seems to fix other upgrade-related problems in this area, the fix is
> low risk, better backport it to 4.1.1

It seems to have missed 4.1.1, setting to 4.1.2 for the time being. If it can make it to 4.1.1-1, even better.

--- Additional comment from Michal Skrivanek on 2017-03-20 06:49:51 EDT ---

Jakub, what about http://gerrit.ovirt.org/74266?

--- Additional comment from  on 2017-03-20 11:09:53 EDT ---

http://gerrit.ovirt.org/74266 and its 4.1 backport http://gerrit.ovirt.org/74338 will remain connected to this bug.

--- Additional comment from Nikolai Sednev on 2017-03-26 12:22:12 EDT ---

Works for me on these components on hosts:
ovirt-imageio-daemon-1.0.0-0.el7ev.noarch
ovirt-setup-lib-1.1.0-1.el7ev.noarch
ovirt-imageio-common-1.0.0-0.el7ev.noarch
sanlock-3.4.0-1.el7.x86_64
ovirt-vmconsole-1.0.4-1.el7ev.noarch
vdsm-4.19.10-1.el7ev.x86_64
ovirt-hosted-engine-ha-2.1.0.5-1.el7ev.noarch
ovirt-host-deploy-1.6.3-1.el7ev.noarch
qemu-kvm-rhev-2.6.0-28.el7_3.8.x86_64
ovirt-vmconsole-host-1.0.4-1.el7ev.noarch
ovirt-hosted-engine-setup-2.1.0.5-1.el7ev.noarch
libvirt-client-2.0.0-10.el7_3.5.x86_64
mom-0.5.9-1.el7ev.noarch
ovirt-engine-sdk-python-3.6.9.1-1.el7ev.noarch
Linux version 3.10.0-514.16.1.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Fri Mar 10 13:12:32 EST 2017
Linux 3.10.0-514.16.1.el7.x86_64 #1 SMP Fri Mar 10 13:12:32 EST 2017 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.3 (Maipo)

On engine:
rhevm-doc-4.1.0-2.el7ev.noarch
rhev-guest-tools-iso-4.1-4.el7ev.noarch
rhevm-4.1.1.6-0.1.el7.noarch
rhevm-branding-rhev-4.1.0-1.el7ev.noarch
rhevm-setup-plugins-4.1.1-1.el7ev.noarch
rhevm-dependencies-4.1.1-1.el7ev.noarch
Linux version 3.10.0-514.6.1.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Sat Dec 10 11:15:38 EST 2016
Linux 3.10.0-514.6.1.el7.x86_64 #1 SMP Sat Dec 10 11:15:38 EST 2016 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.3 (Maipo)

--- Additional comment from Marcus Sundberg on 2017-03-27 09:17 EDT ---

I get the 'At most one USB controller expected' error even with ovirt-engine
4.1.1.6.

I have a number of stateless VMs for testing. They are started and stopped
several times a day, and have been working fine even after upgrading to
oVirt 4.1.1.

Today however I needed to update the contents of the VMs, so I did the
following:

* Uncheck the Stateless box and save - ok.
* Start VMs - ok.
* Shutdown VMs - ok.
* Create snapshot - ok.
* Check the Stateless box and save - FAIL with attached log.

Now even if I do not change anything when editing the VM in the engine
GUI I get the same internal error when clicking Ok, and just doing:
update vm vmname
in ovirt-shell produces the same.

I have not changed the "Blank" template that the VMs are based on
for several months, but it has been changed in the past.

--- Additional comment from Marcus Sundberg on 2017-03-27 09:25:35 EDT ---

Trying to clone the VM from the newly created snapshot fails with the same
error. Cloning from a snapshot taken back in October works fine.

--- Additional comment from Nikolai Sednev on 2017-03-27 09:41:14 EDT ---

I have not worked with stateless VMs.
I think this is a different issue, as the reproduction steps differ from those originally described.
Cloning this bug to a new one.

Comment 1 Red Hat Bugzilla Rules Engine 2017-03-27 13:49:39 UTC
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 2 Red Hat Bugzilla Rules Engine 2017-03-27 13:49:39 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 3 Nikolai Sednev 2017-03-27 13:54:51 UTC
https://bugzilla.redhat.com/attachment.cgi?id=1266651

Comment 4 Martin Betak 2017-03-27 16:13:13 UTC
Hi Marcus,

I tried to reproduce your issue, following your steps with changing the stateless flag after snapshot but didn't find any issues - I was able to snapshot, edit, run the VM, even create a new one from the snapshot and run that one.

Could you please provide logs or any more details about the VM (cluster version, ...)?

Thank you very much in advance,

Martin

Comment 5 Marcus Sundberg 2017-03-27 17:13:05 UTC
Created attachment 1266720 [details]
Output from ovirt-shell showing Blank template and VM

This seems very related to bug #1430009. I noted that the "usb-enabled"
setting differed between the template and the VM, so on a chance I
switched usb-enabled from False to True in the VM, and that worked!

After doing that operation I can switch usb-enabled back to False,
or do any other changes in the VM including enabling stateless mode
and then disable stateless mode.

I would thus guess that something in the past brought the USB state
of these VMs to some weird state.

show vm vmname --all_content
looks exactly the same before and after "fixing" the VM as above,
so I'm not really sure where the state that makes things break
is stored? Let me know if you want some data dumped from postgresql,
as I still have a couple of VMs in the broken state.

Comment 6 Nikolai Sednev 2017-03-28 09:10:40 UTC
Can you provide more details on reproduction please?
From which version did you upgrade, and to which version?
When did you make the initial template?

I tried to reproduce this scenario on a clean 4.1.1.6 environment and could not reproduce it.
I've followed these steps:
1. Login to Admin portal.
2. Create a template. Video type must be QXL. Guest OS doesn't matter (Linux/Windows).
3. Start creation of a new VM.
4. Change template from "Blank" to template created at step #2.
5. Go to "Console" tab. Change "Video type" == "VGA".
6. Confirm creation of new VM. (OK)
7. Start created VM.

Step 7 was successful.

Comment 7 Marcus Sundberg 2017-03-28 09:18:29 UTC
The system has been continuously upgraded from 3.5 to 3.6 to 4.0 to 4.1
over a couple of years. The template is the "Blank" template. I do not
have a changelog for it, but like I said it has not been changed for
several months. The problematic VMs were created in October, which means
we were then running 4.0.4.

Comment 8 Marcus Sundberg 2017-03-28 09:22:51 UTC
Created attachment 1266889 [details]
Log trying to update VM first unsuccessfully, and then successfully

Attaching log showing first the failing:
update vm vmname
and then the successful:
update vm vmname --usb-enabled True
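
For several affected VMs, the workaround from comment 5 (set usb-enabled to True, then back to False) can be written out as a command list and pasted into ovirt-shell. This is a sketch only: the "update vm" command and the --usb-enabled option are the ones shown above, the function names are hypothetical, and it is an assumption that ovirt-shell accepts shell-style quoting for VM names that contain spaces.

```python
import shlex


def update_vm_command(vm_name, usb_enabled=None):
    """Build one ovirt-shell 'update vm' line (hypothetical helper)."""
    parts = ["update", "vm", vm_name]
    if usb_enabled is not None:
        parts += ["--usb-enabled", "True" if usb_enabled else "False"]
    return shlex.join(parts)


def workaround_commands(vm_name):
    """Enable USB to unblock the VM, then restore the original False setting."""
    return [
        update_vm_command(vm_name, usb_enabled=True),
        update_vm_command(vm_name, usb_enabled=False),
    ]


for line in workaround_commands("vmname"):
    print(line)
```

The second command is only needed if the VM should end up with USB disabled again, as it was before the workaround.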

Comment 9 Michal Skrivanek 2017-03-28 11:23:49 UTC
Hi Marcus,
A DB dump, or just a screenshot of the VM Devices subtab of that VM, would be great.
Also, is there any chance you ever tried a PCI hotplug of a USB device, even once or a long time ago?
Thanks.

Comment 12 Marcus Sundberg 2017-03-29 09:58:31 UTC
Too late for a screenshot as we needed the VMs to be brought up and fixed
(using the fix described in comment #8). I do have the DB dump from
when the VMs were still broken, but I'm reluctant to make it publicly
available, as the complete list of VM names gives away internal information.
Could you provide an SQL query that I can run that would give you enough
information about the affected VMs without needing the entire DB dump?

Comment 13 Tomas Jelinek 2017-03-29 12:28:13 UTC
This should give us all the devices of the VM which failed from comment 8:

select d.* from vm_static s inner join vm_device d on s.vm_guid = d.vm_id where s.vm_guid = '5ccc7003-7a17-4d06-8f59-d1ffadd04646'

or, you can do: 
select d.* from vm_static s inner join vm_device d on s.vm_guid = d.vm_id where s.vm_name = 'THE VM NAME'

to filter by vm name.
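
A narrower variant of the queries above would list only VMs that carry more than one USB controller row, which is what the 'At most one USB controller expected' error hints at. The sketch below runs against an in-memory SQLite toy database instead of the engine's PostgreSQL, with invented sample rows. The table names and the vm_guid, vm_id and vm_name columns come from the queries above; the type and device columns and the values 'controller' and 'usb' are assumptions about the vm_device schema and should be checked against the real database first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Toy schema and data; only a stand-in for the engine database.
conn.executescript("""
CREATE TABLE vm_static (vm_guid TEXT PRIMARY KEY, vm_name TEXT);
CREATE TABLE vm_device (device_id TEXT PRIMARY KEY, vm_id TEXT,
                        type TEXT, device TEXT);
INSERT INTO vm_static VALUES ('vm-1', 'healthy-vm'), ('vm-2', 'broken-vm');
INSERT INTO vm_device VALUES
  ('d1', 'vm-1', 'controller', 'usb'),
  ('d2', 'vm-1', 'disk', 'disk'),
  ('d3', 'vm-2', 'controller', 'usb'),
  ('d4', 'vm-2', 'controller', 'usb');
""")


def vms_with_multiple_usb_controllers(conn):
    """Return (vm_name, count) for VMs with more than one USB controller row."""
    return conn.execute("""
        SELECT s.vm_name, COUNT(*) AS usb_controllers
        FROM vm_static s
        INNER JOIN vm_device d ON s.vm_guid = d.vm_id
        WHERE d.type = 'controller' AND d.device = 'usb'
        GROUP BY s.vm_name
        HAVING COUNT(*) > 1
        ORDER BY s.vm_name
    """).fetchall()


suspects = vms_with_multiple_usb_controllers(conn)
print(suspects)
```

Because the query returns only the suspect VMs, it would expose fewer VM names than a full dump of vm_device.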

Comment 14 Tomas Jelinek 2017-04-04 06:18:19 UTC
We hit the same issue on a different setup. The exact flow of what happens is described in https://bugzilla.redhat.com/show_bug.cgi?id=1438188#c3

So, to avoid tracking the same issue in two places, I am closing this bug as a duplicate.

*** This bug has been marked as a duplicate of bug 1438188 ***

Comment 15 Nikolai Sednev 2017-04-04 09:17:37 UTC
My bug is the original: it was reported on 2017-03-27 09:49 EDT by Nikolai Sednev, earlier than 1438188, which was opened on 2017-04-01 14:59 EDT by Israel Pinto.
Please close 1438188 as duplicate of 1436247 and not vice versa.

Comment 16 Tomas Jelinek 2017-04-05 07:14:21 UTC
The order in which the bugs were opened is not relevant. https://bugzilla.redhat.com/show_bug.cgi?id=1438188 is shorter and easier to navigate, so I am leaving that one open.

