Description of problem:
In RHV 4.0, VM restarts are handled differently than in earlier versions of RHEV. VMs now perform a so-called 'cold reboot', which means a VM can restart on a different host than the one it was running on before the reboot. This includes RunOnce VMs (stateless as well as stateful) that the user set to RunOnce on a specific hypervisor.

The change in RHV 4.0 is clearly meant to address the 'kickstart' use case, where the RunOnce boot order and/or temporarily attached installation media were preserved across reboots initiated from within the VM, so the VM kept booting from the installation media or PXE instead of booting from the HD.

However, forcing the VM to do a 'cold reboot' causes problems for the 'VDS' use case, where the customer wants to keep the ability to start VMs on any hypervisor of their choice and keep them running there, without pinning VMs to hosts just to guarantee that a VM restarts on the same hypervisor.

Version-Release number of selected component (if applicable):
RHV 4.0.0 and above

Actual results:
In a VDS scenario, a VM that was started on one hypervisor reboots on a different machine.

Expected results:
Either introduce a way to force a VM to do a 'warm reboot' as in pre-RHV 4.0 versions, or provide a different way to force a VM to reboot on the same host where it was started.
Hi Igor,

as far as I know, a "reboot" of a VM does not kill or terminate the qemu process; the reboot happens within this process. As such, there are no changes in placement, because for RHV-M the VM is still "up and running" (unless the guest agent tells RHV-M otherwise).

So I just tested this on my 4.0 setup.

Scenario 1:
- Start a RHEL7 VM with guest-agent installed and working.
- Check the qemu process-ID and start time
- Reboot the VM (from inside the guest)
- Check the qemu process-ID and start time
- Reboot the VM from RHV-M
- Check the qemu process-ID and start time

This whole procedure showed that the qemu process ID (as well as the start time of the process) did not change.

So I tried another scenario with no guest-agent installed:

Scenario 2:
- Start a RHEL7 VM with no guest-agent running.
- Check the qemu process-ID and start time
- Reboot the VM (from inside the guest)
- Check the qemu process-ID and start time
- Reboot the VM from RHV-M
- Check the qemu process-ID and start time

This test behaved the same way as the previous one (no change of the qemu PID, as the process was never recycled).

Could you please let us know where you experienced the changed behaviour you are describing? I am not able to reproduce it with the test scenarios above.

Thanks!
Martin
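The "check the qemu process-ID and start time" steps above can be sketched as a small comparison. This is only an illustration: `QemuSnapshot`, `read_start_time` and `classify_reboot` are hypothetical names, the qemu PID of the VM is assumed to be known already, and the helper assumes a Linux host with procps `ps`.

```python
import subprocess
from dataclasses import dataclass


@dataclass(frozen=True)
class QemuSnapshot:
    """PID and start time of the qemu process backing a VM (hypothetical type)."""
    pid: int
    start_time: str


def read_start_time(pid: int) -> str:
    """Ask ps for the start time of a process (assumes procps ps on Linux)."""
    result = subprocess.run(
        ["ps", "-o", "lstart=", "-p", str(pid)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def classify_reboot(before: QemuSnapshot, after: QemuSnapshot) -> str:
    """Same PID and same start time: the qemu process survived (warm reboot).
    Anything else means the process was replaced (cold reboot)."""
    return "warm" if before == after else "cold"
```

Take one snapshot before the reboot and one after it, then pass both to `classify_reboot`.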
Hi Igor,

just did some follow-up tests after discussing this:

Scenario 3:
- Start a RHEL7 VM RunOnce pinned to a specific host.
- Check the qemu process-ID and start time
- Reboot the VM (from inside the guest)
- Check the qemu process-ID and start time
- Reboot the VM from RHV-M
- Check the qemu process-ID and start time

This one showed that rebooting from within the VM still did not change any settings, but when rebooting from RHV-M itself, the VM was shut down and the "RunOnce" was therefore no longer valid (and the VM was started with its default configuration).

So I believe (after re-reading your scenario) that the Manager-initiated reboot of "RunOnce" VMs is the one you are talking about. This change was made on purpose to avoid, for example, installation or reboot loops, as you already mentioned.

@Michal: Would it be possible and feasible for "RunOnce" VMs to just "clear" things like cloud-init disks (so mainly Boot Options, Linux Boot Options and Initial Run) but keep the other values intact (so things like System, Host, Console and Custom Properties)?

Another approach would be to have a switch for the reboot that either keeps the RunOnce settings or drops them with the reboot.

After reading the BZs that Igor mentioned, a simple eject of the attached floppy or CDROM would also do for these issues, or am I missing something?

Personally, I believe a reboot should still be a reboot, and if something else is wanted, one should do a "powercycle". So adding this as an alternative to the reboot would also be an option (so you don't need to issue a shutdown and a start separately).

Any further thoughts on this one?
(In reply to Martin Tessun from comment #4)
> Hi Igor,
>
> just did some follow up tests after discussing this:
>
> Scenario 3:
> - Start a RHEL7 VM RunOnce pinned to a specific host.
> - Check the qemu process-ID and start time
> - Reboot the VM (from inside the guest)
> - Check the qemu process-ID and start time
> - Reboot the VM from RHV-M
> - Check the qemu process-ID and start time
>
> This one showed that rebooting from within the VM still did not change any
> settings, but booting from RHV-M itself, the VM was shutdown and the
> "RunOnce" was therefore no longer valid (and the VM was started with default
> configuration).

How is that different from the previous tests? In the previous scenarios, did the same reboot from RHV-M not perform a cold reboot?
(In reply to Michal Skrivanek from comment #9)
> (In reply to Martin Tessun from comment #4)
> > Hi Igor,
> >
> > just did some follow up tests after discussing this:
> >
> > Scenario 3:
> > - Start a RHEL7 VM RunOnce pinned to a specific host.
> > - Check the qemu process-ID and start time
> > - Reboot the VM (from inside the guest)
> > - Check the qemu process-ID and start time
> > - Reboot the VM from RHV-M
> > - Check the qemu process-ID and start time
> >
> > This one showed that rebooting from within the VM still did not change any
> > settings, but booting from RHV-M itself, the VM was shutdown and the
> > "RunOnce" was therefore no longer valid (and the VM was started with default
> > configuration).
>
> How is it different from previous test? In previous Scenarios the same
> reboot from RHV-M did not perform cold reboot?

Correct. Previously (RHV 3.x), if you selected "Reboot" in the WebUI, it did a warm reboot, so no machine settings changed.

So having a differentiator for warm vs. cold reboots would be great to mimic the old behaviour again (and would help customers migrate their automated setups to RHV 4.x more easily).
(In reply to Martin Tessun from comment #10)
> (In reply to Michal Skrivanek from comment #9)
> > (In reply to Martin Tessun from comment #4)
> > > Hi Igor,
> > >
> > > just did some follow up tests after discussing this:
> > >
> > > Scenario 3:
> > > - Start a RHEL7 VM RunOnce pinned to a specific host.
> > > - Check the qemu process-ID and start time
> > > - Reboot the VM (from inside the guest)
> > > - Check the qemu process-ID and start time
> > > - Reboot the VM from RHV-M
> > > - Check the qemu process-ID and start time
> > >
> > > This one showed that rebooting from within the VM still did not change any
> > > settings, but booting from RHV-M itself, the VM was shutdown and the
> > > "RunOnce" was therefore no longer valid (and the VM was started with default
> > > configuration).
> >
> > How is it different from previous test? In previous Scenarios the same
> > reboot from RHV-M did not perform cold reboot?
>
> Correct. Previously (RHV 3.x), if you selected "Reboot" in the WebUI it did
> a warm reboot, so no machine settings did change.

Are you sure? It still does a warm reboot; only in 4.0, it performs a cold reboot when the warm one fails or times out. This was implemented in bug 751854 / bug 1054070.

> So having a differentiator for warm vs. cold reboots would be great to mimic
> the old behaviour again (and helping customers to easier migrate their
> automated setups to RHV 4.x)

There are a couple of abandoned patches in the bugs above, but it didn't happen back then due to a lack of consensus on the behavior.
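The "warm reboot first, cold reboot if the warm one fails or times out" behaviour described in the comment above could be sketched roughly as follows. This is not the engine's code: `reboot_with_fallback`, the three callbacks, and the timeout and polling values are all hypothetical and exist only to illustrate the control flow.

```python
import time


def reboot_with_fallback(request_warm_reboot, guest_rebooted, cold_reboot,
                         timeout_s=60.0, poll_s=1.0):
    """Try a warm reboot; fall back to a cold reboot on timeout.

    All three callbacks are hypothetical stand-ins:
      request_warm_reboot() asks the guest to reboot (ACPI or guest agent),
      guest_rebooted() returns True once the guest has rebooted,
      cold_reboot() shuts the VM down and starts it again.
    """
    request_warm_reboot()
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if guest_rebooted():
            return "warm"
        time.sleep(poll_s)
    # The guest never reacted within the timeout, so power-cycle the VM.
    cold_reboot()
    return "cold"
```

A guest without ACPI support or a working guest agent would never report a reboot here, so it would always end up on the cold path.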
(In reply to Michal Skrivanek from comment #11)
> (In reply to Martin Tessun from comment #10)
> > (In reply to Michal Skrivanek from comment #9)
> > > (In reply to Martin Tessun from comment #4)
> > > > Hi Igor,
> > > >
> > > > just did some follow up tests after discussing this:
> > > >
> > > > Scenario 3:
> > > > - Start a RHEL7 VM RunOnce pinned to a specific host.
> > > > - Check the qemu process-ID and start time
> > > > - Reboot the VM (from inside the guest)
> > > > - Check the qemu process-ID and start time
> > > > - Reboot the VM from RHV-M
> > > > - Check the qemu process-ID and start time
> > > >
> > > > This one showed that rebooting from within the VM still did not change any
> > > > settings, but booting from RHV-M itself, the VM was shutdown and the
> > > > "RunOnce" was therefore no longer valid (and the VM was started with default
> > > > configuration).
> > >
> > > How is it different from previous test? In previous Scenarios the same
> > > reboot from RHV-M did not perform cold reboot?
> >
> > Correct. Previously (RHV 3.x), if you selected "Reboot" in the WebUI it did
> > a warm reboot, so no machine settings did change.
>
> are you sure? It still does warm reboot, only in 4.0 it performs a cold
> reboot when the warm one fails/times out. This was implemented in bug 751854
> / bug 1054070.

In 4.1 it does work this way, so I would suggest closing this as CURRENTRELEASE. Please reopen if it doesn't work this way for you.

>
> > So having a differentiator for warm vs. cold reboots would be great to mimic
> > the old behaviour again (and helping customers to easier migrate their
> > automated setups to RHV 4.x)
>
> There are couple of abandoned patches in the bugs above, but it didn't
> happen back then due to lack of consensus on behavior.
Hi,

I have tested this with

rhevm-4.1.4.2-0.1.el7.noarch
vdsm-4.19.24-1.el7ev.x86_64

My results are:

- Start a RHEL7 VM RunOnce pinned to a specific host.
- Check the qemu process-ID and start time
- Reboot the VM (from inside the guest)
- Check the qemu process-ID and start time

RESULT - warm reboot, VM process PID remains, "run once" mode remains

- Reboot the VM from RHV-M
- Check the qemu process-ID and start time

RESULT - cold reboot - VM was restarted on the other node without "run once" mode

Thus reopening the RFE.
(In reply to Marian Jankular from comment #13)
> Hi,
>
> i have tested this in
>
> rhevm-4.1.4.2-0.1.el7.noarch
> vdsm-4.19.24-1.el7ev.x86_64
>
> my results are
>
> - Start a RHEL7 VM RunOnce pinned to a specific host.
> - Check the qemu process-ID and start time
> - Reboot the VM (from inside the guest)
> - Check the qemu process-ID and start time
>
> RESULT - warm reboot, vm process pid remains, "run once" mode remains
>
>
> - Reboot the VM from RHV-M
> - Check the qemu process-ID and start time
>
> RESULT - cold reboot - vm was restarted on the other node without "run once
> mode"
>
> Thus reopening the RFE

Works for me. Please add logs and more details, then. Check specifically that the guest OS reacts to the request correctly (ACPI enabled and/or ovirt-guest-agent).
ping
Created attachment 1324870 [details]
engine, vdsm, qemu logs, ovirt agent logs

               vm_guid                |  vm_name
--------------------------------------+-----------
 20cbb3fc-b6e4-4601-9784-3df77789d41e | rhel-test

I did a new test with RHEL 7.4 (last time it was CentOS, so no agents were installed).

Current results:

- Start a RHEL7 VM RunOnce pinned to a specific host.
- Check the qemu process-ID and start time
- Reboot the VM (from inside the guest)
- Check the qemu process-ID and start time

RESULT - warm reboot, VM process PID remains, "run once" mode remains

- Reboot the VM from RHV-M
- Check the qemu process-ID and start time

RESULT - cold reboot - VM was restarted on the same node without "run once" mode

Attaching the logs.
Thanks. Next time, please narrow down the occurrence (VM name, time frame).

We may want to improve logging, but for RunOnce the behavior is to perform a cold reboot. If it doesn't reproduce with regular runs, then this is not a bug.
vm: rhel-test
1st reboot: 2017-09-12 13:13:06 - 2017-09-12 13:14:09
2nd reboot: 2017-09-12 14:32:10 - 2017-09-12 14:32:49
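To narrow the attached logs down to these two windows, a filter along the following lines may help. It is only a sketch: the two windows are copied from the comment above, while `in_reboot_window`, `window_seconds` and the timestamp format are assumptions that would need adjusting to the real engine/vdsm log formats.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Reboot windows for VM rhel-test, as reported in the comment above.
REBOOT_WINDOWS = {
    "1st reboot": ("2017-09-12 13:13:06", "2017-09-12 13:14:09"),
    "2nd reboot": ("2017-09-12 14:32:10", "2017-09-12 14:32:49"),
}


def window_seconds(name):
    """Length of a reboot window in whole seconds."""
    start, end = (datetime.strptime(t, FMT) for t in REBOOT_WINDOWS[name])
    return int((end - start).total_seconds())


def in_reboot_window(timestamp):
    """Return the window name a timestamp falls into, or None."""
    ts = datetime.strptime(timestamp, FMT)
    for name, (start, end) in REBOOT_WINDOWS.items():
        if datetime.strptime(start, FMT) <= ts <= datetime.strptime(end, FMT):
            return name
    return None
```

Log lines whose timestamp maps to `None` can be skipped when looking for the reboot handling.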
I'm sorry, I had forgotten about a feature called "volatile run". What it does is this:
- in the run once dialog there is a new option called "Trap guest reboots"
- in the API it is called "volatile"
- by default it is false (both in the API and in the UI)
- if it is false, a guest reboot will not trigger a cold reboot
- if it is true, it will

So basically the only non-configurable option here is the reboot from the webadmin, which is always cold.

I would propose persisting this "volatile run" option in the DB so that it applies to both warm and cold reboots.
To summarize this BZ and an offline discussion with Martin:
- Everything here is about VMs running as run once.
- in 4.1:
  - the reboot from Web-UI/REST is always cold
  - the reboot from inside the guest is always warm
- in 4.2.alpha:
  - the reboot from Web-UI/REST is always cold
  - the reboot from inside the guest is configurable in the run once dialog/REST. The option is called "Trap guest reboots".
- the proposed patch (https://gerrit.ovirt.org/#/c/82652/) renames the option to "Preserve this configuration during reboots" and makes both the reboots from inside the guest and the reboots from Web-UI/REST configurable by the same option.

The patch is easy to merge to master, but it depends on some code from 4.2, which makes it not so easy to backport.

Martin has proposed merging https://gerrit.ovirt.org/#/c/82652/ to master (getting it into 4.2) and not backporting it. Is that acceptable?
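The 4.1 and 4.2.alpha rows of this summary can be written down as a small decision function. This is only a model of the summary, not engine code: `reboot_kind` and its parameter names are hypothetical. The proposed patch is deliberately left out, because this thread does not spell out which value of the renamed option maps to which reboot type.

```python
def reboot_kind(version, source, trap_guest_reboots=False):
    """Return "warm" or "cold" for a reboot of a run-once VM.

    version: "4.1" or "4.2.alpha"
    source:  "guest" (reboot from inside the guest) or
             "manager" (reboot from Web-UI/REST)
    trap_guest_reboots: the run once option, only honoured in 4.2.alpha
    """
    if version not in ("4.1", "4.2.alpha"):
        raise ValueError(f"unmodelled version: {version}")
    if source not in ("guest", "manager"):
        raise ValueError(f"unknown reboot source: {source}")
    if source == "manager":
        return "cold"   # always cold in both versions
    if version == "4.2.alpha" and trap_guest_reboots:
        return "cold"   # a trapped guest reboot becomes a cold reboot
    return "warm"       # 4.1, or 4.2.alpha with the option left at false
```

In this model the option has no effect in 4.1, which matches the "always warm" row for guest reboots.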
Verified with:
Version 4.2.0-0.5.master.el7

Steps:
Polarion test case RHEVM3/workitem?id=RHEVM-24361
Polarion test case RHEVM3/workitem?id=RHEVM-24251
Polarion test case RHEVM3/workitem?id=RHEVM-23495

Results:
PASS
*** Bug 1519708 has been marked as a duplicate of this bug. ***
RUN: https://polarion.engineering.redhat.com/polarion/#/project/RHEVM3/testrun?id=12121&tab=records&result=passed
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2018:1488
BZ<2>Jira Resync