Description of problem:
When a change is made to a VM that cannot be applied at runtime, rebooting the VM does not trigger application of the pending change.

Version-Release number of selected component (if applicable):
RHV 4.2.6

How reproducible:
Always

Steps to Reproduce:
1. Start a VM with the guest agent installed.
2. Make a change to the VM configuration that requires a shutdown to be applied and produces the pending-changes (delta) marker, e.g. renaming the VM or switching the graphics subsystem from SPICE to VNC.
3a. Reboot the VM from within the guest.
3b. Reboot the VM from the RHV Manager.

Actual results:
The pending change (delta) persists.

Expected results:
RHV detects that a reboot is in progress and transforms the reboot into a shutdown followed by a power-on, so the pending change is applied.

Additional info:
This feature was declared present in 4.2 but has not been demonstrated to be functional. The behavior is needed to reduce the maintenance effort of large clusters operated by more than one person, by taking advantage of guest reboots to apply pending changes on VMs.
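For reference, whether a VM carries such a pending ("next run") configuration can be checked through the RHV REST API; to my understanding the VM entity exposes a next_run_configuration_exists flag. This is a minimal sketch that extracts the flag from an illustrative saved API response rather than querying a live engine (on a real setup you would fetch the VM entity with curl and valid credentials first):

```shell
# Illustrative sample of a VM entity as returned by
#   GET https://<engine>/ovirt-engine/api/vms/<vm-id>
# (saved to a local file; a live setup would use curl with credentials).
cat > /tmp/vm.xml <<'EOF'
<vm id="3371c2fe-d473-44dd-a1ff-5e3dada567fe">
  <name>test</name>
  <next_run_configuration_exists>true</next_run_configuration_exists>
</vm>
EOF

# Extract the flag: "true" means the VM has a pending configuration
# that only takes effect after a cold (stop + start) boot cycle.
pending=$(sed -n 's:.*<next_run_configuration_exists>\(.*\)</next_run_configuration_exists>.*:\1:p' /tmp/vm.xml)
echo "pending next-run configuration: $pending"
```

This is only a sketch of the check under the stated API assumption, not a definitive RHV procedure.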
Works for me correctly (VM without agent running, changed memory with "apply later", simulated Ctrl+Alt+Del in the guest). Please add more details/exact steps.

It did not work for me when the VM was not in the Up state and I clicked Reboot in the GUI (the VM stayed down instead) - that's probably a bug:

2018-09-20 12:33:36,988+02 INFO  [org.ovirt.engine.core.bll.RebootVmCommand] (EE-ManagedThreadFactory-engine-Thread-102967) [8c643518-1762-4a21-8f84-84c3848a162d] Running command: RebootVmCommand internal: false. Entities affected : ID: 3371c2fe-d473-44dd-a1ff-5e3dada567fe Type: VMAction group REBOOT_VM with role type USER
2018-09-20 12:33:36,994+02 INFO  [org.ovirt.engine.core.bll.RebootVmCommand] (EE-ManagedThreadFactory-engine-Thread-102967) [8c643518-1762-4a21-8f84-84c3848a162d] VM 'test' is performing cold reboot; run once: 'false', running as volatile: 'false', has next run configuration: 'true'
2018-09-20 12:33:37,134+02 INFO  [org.ovirt.engine.core.bll.ShutdownVmCommand] (EE-ManagedThreadFactory-engine-Thread-102967) [4ff4cb7e] Running command: ShutdownVmCommand internal: true. Entities affected : ID: 3371c2fe-d473-44dd-a1ff-5e3dada567fe Type: VMAction group SHUT_DOWN_VM with role type USER
2018-09-20 12:33:37,138+02 INFO  [org.ovirt.engine.core.bll.ShutdownVmCommand] (EE-ManagedThreadFactory-engine-Thread-102967) [4ff4cb7e] Entered (VM 'test').
2018-09-20 12:33:37,139+02 INFO  [org.ovirt.engine.core.bll.ShutdownVmCommand] (EE-ManagedThreadFactory-engine-Thread-102967) [4ff4cb7e] Cannot shutdown VM 'test', status is not up. Stopping instead.
2018-09-20 12:33:37,261+02 INFO  [org.ovirt.engine.core.bll.StopVmCommand] (EE-ManagedThreadFactory-engine-Thread-102967) [329cef07] Running command: StopVmCommand internal: true. Entities affected : ID: 3371c2fe-d473-44dd-a1ff-5e3dada567fe Type: VMAction group STOP_VM with role type USER
2018-09-20 12:33:37,305+02 INFO  [org.ovirt.engine.core.vdsbroker.DestroyVmVDSCommand] (EE-ManagedThreadFactory-engine-Thread-102967) [329cef07] START, DestroyVmVDSCommand( DestroyVmVDSCommandParameters:{hostId='ebe13310-0fef-45ce-af53-6488b6120959', vmId='3371c2fe-d473-44dd-a1ff-5e3dada567fe', secondsToWait='0', gracefully='false', reason='', ignoreNoVm='false'}), log id: 26b37f32
2018-09-20 12:33:37,308+02 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (EE-ManagedThreadFactory-engine-Thread-102967) [329cef07] START, DestroyVDSCommand(HostName = dev-25.rhev.lab.eng.brq.redhat.com, DestroyVmVDSCommandParameters:{hostId='ebe13310-0fef-45ce-af53-6488b6120959', vmId='3371c2fe-d473-44dd-a1ff-5e3dada567fe', secondsToWait='0', gracefully='false', reason='', ignoreNoVm='false'}), log id: 3a4579e
2018-09-20 12:33:38,534+02 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (EE-ManagedThreadFactory-engine-Thread-102967) [329cef07] FINISH, DestroyVDSCommand, return: , log id: 3a4579e
2018-09-20 12:33:38,534+02 INFO  [org.ovirt.engine.core.vdsbroker.DestroyVmVDSCommand] (EE-ManagedThreadFactory-engine-Thread-102967) [329cef07] FINISH, DestroyVmVDSCommand, return: , log id: 26b37f32
2018-09-20 12:33:38,538+02 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-2) [] VM '3371c2fe-d473-44dd-a1ff-5e3dada567fe' was reported as Down on VDS 'ebe13310-0fef-45ce-af53-6488b6120959'(dev-25.rhev.lab.eng.brq.redhat.com)
2018-09-20 12:33:38,539+02 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-2) [] START, DestroyVDSCommand(HostName = dev-25.rhev.lab.eng.brq.redhat.com, DestroyVmVDSCommandParameters:{hostId='ebe13310-0fef-45ce-af53-6488b6120959', vmId='3371c2fe-d473-44dd-a1ff-5e3dada567fe', secondsToWait='0', gracefully='false', reason='', ignoreNoVm='true'}), log id: 4ab995bb
2018-09-20 12:33:38,542+02 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-2) [] Failed to destroy VM '3371c2fe-d473-44dd-a1ff-5e3dada567fe' because VM does not exist, ignoring
2018-09-20 12:33:38,542+02 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-2) [] FINISH, DestroyVDSCommand, return: , log id: 4ab995bb
2018-09-20 12:33:38,542+02 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-2) [] VM '3371c2fe-d473-44dd-a1ff-5e3dada567fe'(test) moved from 'PoweringUp' --> 'Down'
2018-09-20 12:33:38,553+02 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-102967) [329cef07] EVENT_ID: USER_STOPPED_VM_INSTEAD_OF_SHUTDOWN(76), VM test was powered off ungracefully by admin@internal-authz (Host: dev-25.rhev.lab.eng.brq.redhat.com).
2018-09-20 12:33:38,573+02 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-2) [] EVENT_ID: VM_DOWN(61), VM test is down.
2018-09-20 12:33:38,608+02 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-102967) [329cef07] EVENT_ID: USER_REBOOT_VM(157), User admin@internal-authz initiated reboot of VM test.
2018-09-20 12:33:38,614+02 INFO  [org.ovirt.engine.core.bll.ProcessDownVmCommand] (EE-ManagedThreadFactory-engine-Thread-102968) [55bbcd2e] Running command: ProcessDownVmCommand internal: true.
Two tests attempted:
a) Change of the compatibility version of the cluster and the data center. After this change, all the VMs had the delta, and under the Snapshots tab there is a snapshot with the description "Next Run configuration snapshot".
b) Rename of the VM.

In both cases the reboot initiated from the guest did not trigger the resolution of the delta. Is it possible that only certain types of pending changes (deltas) are addressed by the reboot being transformed into stop + start? Please let me know if you could benefit from logs as well.
Hi Michal, I confirm that for your scenario the delta has been addressed correctly. The expectation is to have this behavior fully functional for every kind of delta, especially for compatibility version updates.
(In reply to Andrea Perotti from comment #4)
> Hi Michal, I confirm that for your scenario the delta has been addressed
> correctly.
>
> The expectation is to have this behavior fully functional for every kind of
> delta, especially for compatibility version updates.

So is there anything not working for you?

The name was changed recently to not require a restart, by popular demand: bug 1601514.
(In reply to Michal Skrivanek from comment #5)
> The name was changed recently not to require a restart by popular demand, bug 1601514.

And that was a great improvement. At the same time, a /!\ appears next to the VM when you change its name, as a reminder that while the name is changed immediately, the underlying log and process names are still the old ones, and a shutdown/startup is required for complete coherence (name of the VM + name in the logs + name of the qemu process).

Another scenario not working as expected is changing the compatibility version of the cluster and the data center. After the change, all the VMs had the /!\ and under the Snapshots tab there is a snapshot with the description "Next Run configuration snapshot".

In both situations, a reboot of the VM started from the guest or from the Manager did not address the pending changes.

Hope this better clarifies the current issue.
(In reply to Andrea Perotti from comment #6)
> (In reply to Michal Skrivanek from comment #5)
> > The name was changed recently not to require a restart by popular demand, bug 1601514.
>
> And that was a great improvement, at the same time, a /!\ appears next to
> the VM when you change name, to remember that while the name is immediately
> changed, underlying log and process names are still the old ones, and a
> shutdown/startup is required to have complete coherence (name of VM + name
> in the logs + name in the qemu process).

I can see arguments for both behaviors... it was highly requested to not require a restart, so it's not done automatically.

> Another scenario not working as expected is when you change of the
> compatibility version of the cluster and the datacenter. After the change,
> all the vm had the /!\ and under the tab snapshot there is a snapshot with
> the description: Next Run configuration snapshot .

OK, I see. Yeah, changes in the Cluster are "fake" from the VM perspective; they don't change anything in the VM itself. It should work, so if it doesn't, it is certainly a bug.

> In both situation, the reboot of a VM started from guest or from the Manager
> did not addressed the pending changes.
>
> Hoped this better clarify the current issue.

Yes, thanks.
Note the Cluster level upgrade is the only thing which should trigger that behavior - is that the change you've tried? And then, is it a reboot from the guest, from the webadmin UI, or both that doesn't trigger the cold reboot?
(In reply to Michal Skrivanek from comment #8)
> ntoe the Cluster level upgrade is the only thing which should trigger that
> behavior - is it the change you've tried?

Yes, that is the situation where this has been experienced.

> And then is it reboot from guest or webadmin UI or both that doesn't trigger the cold reboot?

From the guest for sure; I cannot guarantee whether the webadmin UI had the problem as well or not.
OK, then logs would be handy. If you change a cluster level, then a guest-initiated reboot should trigger a cold reboot, i.e. shut down & power on instead. We set that property on the VM so it can be checked easily in its XML: take a look at the running VM with "virsh -r dumpxml <vm>", before and after the cluster upgrade. You should see <ovirt-vm:destroy_on_reboot type="bool">False</ovirt-vm:destroy_on_reboot> changing into <ovirt-vm:destroy_on_reboot type="bool">True</ovirt-vm:destroy_on_reboot>. And if that is set and you initiate a guest reboot, it should go through a cold reboot.
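The check described above can be scripted on the host. A minimal sketch, assuming read-only libvirt access; since this is not run against a live host, it greps an illustrative fragment of the domain XML instead of actual virsh output (the grep target is exactly the property named above):

```shell
# Sketch: check whether a running VM is flagged for cold reboot.
# On a real host you would run:
#   virsh -r dumpxml "$vm" | grep destroy_on_reboot
# Here we write an illustrative fragment of the domain XML and grep that.
cat > /tmp/domain-fragment.xml <<'EOF'
<ovirt-vm:vm xmlns:ovirt-vm="http://ovirt.org/vm/1.0">
  <ovirt-vm:destroy_on_reboot type="bool">True</ovirt-vm:destroy_on_reboot>
</ovirt-vm:vm>
EOF

if grep -q '<ovirt-vm:destroy_on_reboot type="bool">True<' /tmp/domain-fragment.xml; then
    echo "cold reboot armed: guest-initiated reboot will become stop + start"
else
    echo "destroy_on_reboot not set: guest reboot stays a warm reboot"
fi
```

The file name and messages are illustrative; only the destroy_on_reboot element comes from the comment above.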
(In reply to Michal Skrivanek from comment #10)
> If you change a cluster level then a
> guest-initiated reboot should trigger a cold reboot and shut down&power on
> instead.
> We set that property on the VM so it can be checked easily in its xml (take
> a look at that running VM and use virsh -r dumpxml <vm>", before and after
> cluster upgrade.

From exactly which release is this supposed to be present and fully working? I have done some tests on a RHV 4.1 environment, upgraded in hosts and Manager to RHV 4.2, with some VMs running on it.

> you should see <ovirt-vm:destroy_on_reboot
> type="bool">False</ovirt-vm:destroy_on_reboot> changing into
> <ovirt-vm:destroy_on_reboot type="bool">True</ovirt-vm:destroy_on_reboot>.
> And if that is set and you initiate guest reboot it should go thorugh a cold
> reboot.

We have seen 'destroy_on_reboot' set to True only when doing actions on VMs created when the cluster was already in 4.2 compatibility, so the VMs were 4.2 type as well.

With VMs whose delta was triggered by the migration to cluster 4.2, so effectively VMs still in 4.1, 'destroy_on_reboot' is always False, even when doing a RAM increase with the "apply later" flag.

Michal, can you please confirm whether this feature is supposed to work from 4.2, so *only* with the Manager in 4.2, hypervisors in 4.2, the cluster in 4.2 compatibility mode, and VMs started in 4.2 mode (not with a delta)? If you could confirm it, this would explain the experienced behaviour.

Thanks
(In reply to Andrea Perotti from comment #11)
> We have seen 'destroy_on_reboot' set to True only when doing actions on VMs
> created when the cluster was already in 4.2 compatibility and so VMs were
> 4.2 type as well.

Yes, it's a 4.2 feature: bug 1519708.

> With VMs with the delta triggered by the migration to cluster 4.2, so
> effectively VMs still in 4.1, 'destroy_on_reboot' is always False, even when
> doing the RAM increase with "apply later" flag.
>
> Michal, can you please confirm if this feature is supposed to work from 4.2
> so *only* with manager in 4.2, hypervisors in 4.2, cluster in compatibility
> mode 4.2 and VMs started in 4.2 mode (not with delta). If you could confirm
> it, this would explain the experienced behaviour.

Yes, everything needs to be 4.2.
This solves the mystery, then. Thanks Michal for your support.