1. Proposed title of this feature request

Allow virtual machines to stop automatically with the host.

This feature can be found in VMware:
https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=850
https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1008182

3. What is the nature and description of the request?

When a host is shut down or rebooted internally, all of its virtual machines are killed. This is not desirable: RHEV is an enterprise solution, and the reliability of the virtual machines can be affected by ungraceful shutdowns (i.e. being killed).

4. Why does the customer need this?

In an emergency, non-RHEV admins may be required to shut down the hosts to preserve the data center. These people may not have the technical capabilities or the credentials to log in to RHEV-M and start this process gracefully. Also, when backup power is installed in the hosts, it may initiate the shutdown process, perhaps via an ACPI signal.

Currently it is impossible to achieve a good solution with systemd services and API calls (see "Other information" below). In short, this is an area where RHEV is lacking and we must start working on it; later we can add more features such as integrating nicely with power solutions. But there must be a starting point, and it must be easy so people actually use it (via the RHEV-M GUI).

5. How would the customer like to achieve this?

In the RHEV-M GUI, have options to manage shutdown (and maybe start) behaviour for hosts:
-> Gracefully shut down all running VMs, updating the RHEV-M DB
-> Switch to maintenance mode, updating the RHEV-M DB
-> Perhaps have a list of VMs the host will try to start when it comes up?

7. Is there already an existing RFE upstream or in Red Hat Bugzilla?

Mostly about starting the VMs automatically. The RFE proposed here is mostly about shutdown and its problems.
BZ 1108678 - [RFE] Automatic VM start after a power outage

10. List any affected packages or components.

Filed under engine, but I can see libvirt, vdsm and ovirt-engine changes.

Other information:

I tried to achieve this using API calls from the host to RHEV-M. The idea was:
A) A systemd service that on shutdown would call the RHEV API to get the list of VMs running on the host
B) Issue API calls to shut down all these VMs
C) Put the host into maintenance mode
D) Continue the shutdown process

Problems:

1) Scopes

My systemd service was always called AFTER the VMs were destroyed, even when it was ordered 'After' vdsmd, libvirtd and tons of targets. This is due to libvirt: when it creates a VM, it tells systemd to assign a scope to it. To make the destruction of that scope depend on another service, libvirt needs to tell systemd explicitly, and the current implementation hard-codes 'libvirt-guests.service'. So in RHEV these scopes do not depend on anything, and systemd shuts them down quite early in the shutdown process, killing all the VMs.

[  175.376302] systemd-logind[489]: Power key pressed.
[  175.376312] systemd-logind[489]: Powering Off...
...
[  175.388845] systemd[1]: systemd-machined.service changed running -> stop-sigterm
[  175.388854] systemd[1]: Stopping Virtual Machine and Container Registration Service...
[  175.389604] systemd[1]: machine-qemu\x2d2\x2dCentOS7.scope changed running -> stop-sigterm
[  175.389613] systemd[1]: Stopping Virtual Machine qemu-2-CentOS7.

Looking at the libvirt code, these scopes are hard-coded to depend only on libvirt-guests.service:
int virSystemdCreateMachine(const char *name, ...
{
    ...
    if (virDBusCallMethod(conn, NULL, &error,
                          "org.freedesktop.machine1",
                          "/org/freedesktop/machine1",
                          "org.freedesktop.machine1.Manager",
                          "CreateMachineWithNetwork",
                          "sayssusa&ia(sv)",
                          name,
                          16,
                          uuid[0], uuid[1], uuid[2], uuid[3],
                          uuid[4], uuid[5], uuid[6], uuid[7],
                          uuid[8], uuid[9], uuid[10], uuid[11],
                          uuid[12], uuid[13], uuid[14], uuid[15],
                          creatorname,
                          iscontainer ? "container" : "vm",
                          (unsigned int)pidleader,
                          rootdir ? rootdir : "",
                          nnicindexes, nicindexes,
                          3,
                          "Slice", "s", slicename,
                          "After", "as", 1, "libvirtd.service",
                          "Before", "as", 1, "libvirt-guests.service") < 0)
    ...
}

So systemd won't care about my service depending on vdsm, because the VMs are assigned to these scopes and not to the vdsm service. Result: only libvirt-guests.service is able to do something here.

2) VDSM

To the point: vdsm does not like libvirt-guests.service.

[Unit]
Description=Virtual Desktop Server Manager
...
Conflicts=libvirt-guests.service
...

So from the systemd shutdown-dependency side, the only thing that could help make the shutdown graceful here is modifying libvirt-guests.service to issue a bunch of API calls. But it conflicts with vdsm. In my experience systemd then stops libvirt-guests.service during runtime (not at shutdown/reboot), which triggers the API calls, shuts down all VMs and puts the host into maintenance mode while the host is not actually shutting down or rebooting. Kind of a chicken-and-egg problem.

So, why not have a proper feature where all of this can be done from the RHEV-M GUI, without recompiling libvirt, changing vdsm, messing with systemd services or writing API scripts? Similar to what VMware has in its interface (see both links at the top of the RFE).
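For the record, here is a minimal sketch of the approach from steps A) to D). The unit name, script path, engine URL and credentials are placeholders, and as problem (1) shows, the machine scopes are torn down before ExecStop= ever runs, so this cannot currently work:

# /etc/systemd/system/rhev-graceful-shutdown.service (hypothetical name)
[Unit]
Description=Gracefully shut down RHEV VMs before host shutdown
After=libvirtd.service vdsmd.service

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/true
# ExecStop= runs when the unit is stopped at shutdown, but the VM scopes
# have their own (hard-coded) ordering and die first, see problem (1)
ExecStop=/usr/local/sbin/rhev-shutdown-vms.sh

[Install]
WantedBy=multi-user.target

#!/bin/bash
# /usr/local/sbin/rhev-shutdown-vms.sh -- engine URL and credentials are placeholders
ENGINE=https://rhevm.example.com/ovirt-engine/api
AUTH="admin@internal:password"

# A) list the VMs currently running on this host
ids=$(curl -s -k -u "$AUTH" "$ENGINE/vms?search=host%3D$(hostname -s)" \
      | grep -oP '<vm href="[^"]*" id="\K[^"]+')

# B) ask the engine to shut each of them down gracefully
for id in $ids; do
    curl -s -k -u "$AUTH" -H 'Content-Type: application/xml' \
         -d '<action/>' "$ENGINE/vms/$id/shutdown"
done

# C) put the host into maintenance (host id lookup omitted for brevity)
#curl -s -k -u "$AUTH" -H 'Content-Type: application/xml' \
#     -d '<action/>' "$ENGINE/hosts/<host-id>/deactivate"
# D) the normal shutdown process then continues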
Maybe a good first step would be to get this hard-coded string

    "Before", "as", 1, "libvirt-guests.service") < 0)  <-----------

out of libvirt and make it customizable. Or add another 'Before' line for a new 'rhev-guests.service'.
(In reply to Germano Veit Michel from comment #2)
> Maybe a good first step would be to get this hard-coded string
>
>     "Before", "as", 1, "libvirt-guests.service") < 0)  <-----------
>
> out of libvirt and make it customizable. Or add another 'Before' line for a
> new 'rhev-guests.service'.

The above sounds like a libvirt issue, then?
(In reply to Yaniv Kaul from comment #3)
> The above sounds like a libvirt issue, then?

Yes, correct. I even considered opening a separate BZ for that.

As this RFE will likely spawn a few child RFEs (libvirt, vdsm, ...), I would prefer not to get ahead of myself and to let engineering properly decide what needs to be done in each component. So I created just this one, as a feature request with the ultimate goal of having a nice, usable feature, and also describing the problems I found to provide some help/context.

We could go ahead and open the libvirt-specific BZ, but first it would be good to have someone else assess what exactly the best implementation on libvirt's side would be to help this RFE.
Michal? Isn't this a Virt area?
Thinking about it a bit with Moran and msivak, it seems to me that a feasible approach would be:

- put a message in /etc/motd that this host is part of an xyz environment and should be put into maintenance before any actual system maintenance
- drop in a shell alias to echo a similar message when admin/root types poweroff, shutdown or reboot (see the sketch below)

Should be fairly simple to add to host deploy and NGN, what do you think, Sandro?

Other alternatives that actually block it from happening are way more complicated to do. The above should be good enough for most cases.
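A minimal sketch of the alias part, assuming a drop-in file under /etc/profile.d (file name and message text are illustrative):

# /etc/profile.d/rhv-host-warning.sh (illustrative name)
_rhv_warn() {
    echo "WARNING: this host is part of a RHV environment." >&2
    echo "Please move it to maintenance in the engine before system maintenance." >&2
}
alias poweroff='_rhv_warn; /usr/sbin/poweroff'
alias reboot='_rhv_warn; /usr/sbin/reboot'
alias shutdown='_rhv_warn; /usr/sbin/shutdown'

Note that aliases only apply to interactive shells, so scripted or ACPI-initiated shutdowns would still bypass the warning, hence "good enough for most cases".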
(In reply to Michal Skrivanek from comment #6)
> Thinking about it a bit with Moran and msivak, it seems to me that a feasible
> approach would be:
>
> - put a message in /etc/motd that this host is part of an xyz environment and
> should be put into maintenance before any actual system maintenance
> - drop in a shell alias to echo a similar message when admin/root types
> poweroff, shutdown or reboot.
>
> Should be fairly simple to add to host deploy and NGN, what do you think,
> Sandro?
>
> Other alternatives that actually block it from happening are way more
> complicated to do. The above should be good enough for most cases.

What about telling systemd to inhibit poweroff and shutdown [1] and re-enabling them when the host is moved to maintenance?

[1] https://www.freedesktop.org/software/systemd/man/systemd-inhibit.html

This should probably be handled by VDSM/infra in that case.
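For illustration, an inhibitor lock as suggested in [1] could look like the following; whether vdsm could take and release such a lock on maintenance transitions is exactly the open question (the invocation is a sketch, the pid file path is made up):

# take a blocking shutdown inhibitor while the host carries VMs
systemd-inhibit --what=shutdown --who=vdsm \
    --why="Running VMs; move the host to maintenance first" \
    --mode=block sleep infinity &
echo $! > /run/vdsm-shutdown-inhibitor.pid

# on the host entering maintenance, release the lock
kill "$(cat /run/vdsm-shutdown-inhibitor.pid)"

Block-mode inhibitors stop logind-initiated shutdowns (e.g. the power key), but root can still bypass them, so this would be a guard rail rather than a hard block.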
Hm, that would be great, but... it seems to me it assumes a command to execute, and it would require manipulation from vdsm. Note that vdsm is currently not aware of its own maintenance status.

Sounds too complex to me...
Just to summarize what I understood so far, and to have a discussion starting point here.

vdsm disables the libvirt-guests.service. This leads to systemd killing the VMs. The default behaviour of the guests service is to suspend the VMs on shutdown (which is in any case not wanted by vdsm/RHV). According to the service definition this can be changed in /etc/sysconfig/libvirt-guests:

ON_SHUTDOWN=shutdown

The parallel shutdown count could also be increased (PARALLEL_SHUTDOWN=<n>).

If I understood this correctly, using a properly configured libvirt-guests service could easily solve this issue, as a reboot would then do the following:

1. Shut down the VMs via the libvirt-guests service
2. Shut down libvirt and vdsm

---> here fencing eventually comes into play, but the VMs should already have been stopped.

For the nicer approach we can add some UI elements that would change the config of libvirt-guests (disable/enable it). So the implementation could be:

- vdsm-config configures the libvirt-guests service to shut down VMs instead of suspending them.
- Enabling/disabling the service can be achieved via the UI by selecting a checkbox like "Gracefully shutdown VMs in case of external Poweroff". If this is checked, the libvirt-guests service is enabled; otherwise it is disabled.

Thoughts?
Forgot to mention: ON_BOOT=ignore also needs to be set, so that the VMs are not restarted by libvirt during reboot, as this could lead to VMs running more than once.
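Taken together with comment 10, the host-side configuration sketch would be (the numeric values are illustrative):

# /etc/sysconfig/libvirt-guests
ON_BOOT=ignore          # never auto-start guests; the engine decides that
ON_SHUTDOWN=shutdown    # shut guests down gracefully instead of suspending them
PARALLEL_SHUTDOWN=10    # shut down up to 10 guests in parallel
SHUTDOWN_TIMEOUT=300    # seconds to wait for a guest before giving up

The proposed UI checkbox would then map to "systemctl enable/disable libvirt-guests.service" on the host.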
I see no use in keeping the current behavior; a shutdown should be the default. The suspend is useless anyway, as it won't work when initiated by libvirt. A more complicated enhancement would be to trigger host maintenance instead... but the shutdown seems like a good enough starting point.
Please make sure to also verify the case explained in https://bugzilla.redhat.com/show_bug.cgi?id=1485967
*** Bug 1485967 has been marked as a duplicate of this bug. ***
Depends on https://github.com/oVirt/ovirt-ansible/pull/62. Cannot be tested before this gets merged.
Hey Petr,

I'm reviewing the Doc Text that you wrote and, where required, opening documentation bugs to cover RFEs. In the doc text, don't you think we should mention that the solution can be enabled/disabled via the UI? As discussed here:
https://bugzilla.redhat.com/show_bug.cgi?id=1334982#c10
Another question, Petr. Is the 5 second configurable delay that you mentioned a new feature? I don't really understand the connection between what you wrote and what Martin wrote here:
https://bugzilla.redhat.com/show_bug.cgi?id=1334982#c10

Can you please clarify exactly what changes will be made? Thanks!
Hi Emma, thanks for the questions.

> In the doc text, don't you think we should mention that the solution can be enabled/disabled via the UI?

Currently there is no GUI to enable or disable this functionality. The feature is on by default.

> Is the 5 second configurable delay that you mentioned a new feature?

It is not a feature in itself, although it is a tweakable parameter; I am not expecting it to change much. The 5 second delay exists only to allow the engine to finish all communication before the host machine shuts down.
I am testing on version 4.2.0-0.5.master.el7 (build 7).

I set host_deploy_shutdown_timeout to 6000.

I ran the first case with 2 VMs on the same host and ran "poweroff -f" on the host (see https://polarion.engineering.redhat.com/polarion/#/project/RHEVM3/wiki/Compute/_RFE_%20Gracefully%20shutdown%20Virtual%20Machines%20on%20Host%20reboot_shutdown_).

I see that the VMs were not stopped by vdsm before the host powered off. In the engine log I see that the VMs (test_vm, test_vm_2) went down with an error:

2017-12-04 10:48:30,206+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-58) [] EVENT_ID: VM_DOWN_ERROR(119), VM test_vm_2 is down with error. Exit message: VM terminated with error.
2017-12-04 10:48:31,011+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-58) [] VM '635a2291-f060-48a0-9fce-c4c05d6722d8'(test_vm) moved from 'Unknown' --> 'Down'
2017-12-04 10:48:31,039+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-58) [] EVENT_ID: VM_DOWN_ERROR(119), VM test_vm is down with error. Exit message: VM terminated with error.

From my script (sampling 3 times per second) I see the VMs running until the host and vdsm went down:

check #225 10:44:07
 Id    Name         State
----------------------------------------------------
 1     test_vm_2    running
 2     test_vm      running
vdsm:    Active: active (running) since Mon 2017-12-04 10:40:08 IST; 3min 59s ago
================================================
check #226 10:44:07
 Id    Name         State
----------------------------------------------------
 1     test_vm_2    running
 2     test_vm      running
vdsm:    Active: active (running) since Mon 2017-12-04 10:40:08 IST; 3min 59s ago
================================================
check #227 10:44:07
 Id    Name         State
----------------------------------------------------
 1     test_vm_2    running
 2     test_vm      running
vdsm:    Active: active (running) since Mon 2017-12-04 10:40:08 IST; 4min 0s ago
================================================
check #228 10:44:08
 Id    Name         State
----------------------------------------------------
 1     test_vm_2    running
 2     test_vm      running
vdsm:    Active: active (running) since Mon 2017-12-04 10:40:08 IST; 4min 0s ago
================================================
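(For reference, sampling output like the above can be produced with a loop along these lines; this is a reconstruction, not the actual test script:)

#!/bin/bash
# poll VM and vdsm state roughly 3 times per second until the host goes down
i=0
while true; do
    i=$((i + 1))
    echo "check #$i $(date +%H:%M:%S)"
    virsh -r list                                # read-only list of running domains
    echo -n "vdsm: "; systemctl status vdsmd | grep 'Active:'
    echo "================================================"
    sleep 0.3
done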
(In reply to Israel Pinto from comment #22)

Please add logs, namely the journal from the host, vdsm.log, and the qemu log for one of those VMs.
Created attachment 1365319 [details] logs(engine,vdsm, vms-qemu, script log)
We still need the journal and the exact time you ran poweroff.
Created attachment 1365484 [details] logs(engine,vdsm, vms-qemu, script log, journalctl)
That doesn't seem to capture the time you did the test. Please note down when you did things and double-check the timestamps in the attached logs.
1. poweroff command date:

[root@puma42 home]# date
Sun Dec 10 08:58:16 IST 2017
[root@puma42 home]# poweroff -f
Powering off.

Michal, I see the jump in time also in /var/log/messages. I did the poweroff at 08:58:16 but I don't see this time in the journal or in /var/log/messages. If you need, I will re-run it on a different host.
Update:

Engine: Software Version: 4.2.1.6-0.1.el7
Host:
OS Version: RHEL - 7.4 - 18.el7
Kernel Version: 3.10.0 - 693.17.1.el7.x86_64
KVM Version: 2.9.0 - 16.el7_4.14
LIBVIRT Version: libvirt-3.9.0-8.el7
VDSM Version: vdsm-4.20.17-1.el7ev

Steps:
1. Create a VM pool with 10 VMs
2. Shut down the host

The VMs went down with an error.

[root@compute-ge-6 ovirt-engine]# grep -ir test-2
engine.log:2018-02-07 10:07:07,563+02 INFO [org.ovirt.engine.core.bll.AddVmCommand] (default task-7) [283f0642-f07b-47ed-9e78-8326598ed7c5] Lock Acquired to object 'EngineLock:{exclusiveLocks='[test-2=VM_NAME]', sharedLocks='[ad6dc22c-9274-4664-a6e6-e1a23e57de3d=VM_POOL, c6080af6-8fe5-48d0-a8c4-99c33a644f27=TEMPLATE, e778d44a-a7a9-4eec-8481-0fad5f008f13=DISK]'}'
engine.log:2018-02-07 10:07:09,667+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-7) [22859810] EVENT_ID: USER_ADD_VM_STARTED(37), VM test-2 creation was initiated by admin@internal-authz.
engine.log:2018-02-07 10:07:29,123+02 INFO [org.ovirt.engine.core.bll.AddVmCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-55) [] Lock freed to object 'EngineLock:{exclusiveLocks='[test-2=VM_NAME]', sharedLocks='[ad6dc22c-9274-4664-a6e6-e1a23e57de3d=VM_POOL, c6080af6-8fe5-48d0-a8c4-99c33a644f27=TEMPLATE, e778d44a-a7a9-4eec-8481-0fad5f008f13=DISK]'}'
engine.log:2018-02-07 10:07:29,154+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-55) [] EVENT_ID: USER_ADD_VM_FINISHED_SUCCESS(53), VM test-2 creation has been completed.
engine.log:2018-02-07 11:02:22,614+02 INFO [org.ovirt.engine.core.vdsbroker.CreateVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1399) [20dd9e65-48d0-4f59-a52c-71857bd0134d] START, CreateVDSCommand( CreateVDSCommandParameters:{hostId='4a284ac7-8075-430b-98ee-c7c45d1cc04a', vmId='7dbb3560-8299-44fc-a97d-d669ac7dd31e', vm='VM [test-2]'}), log id: 69c52db4
engine.log:2018-02-07 11:02:22,617+02 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateBrokerVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1399) [20dd9e65-48d0-4f59-a52c-71857bd0134d] START, CreateBrokerVDSCommand(HostName = host_mixed_1, CreateVDSCommandParameters:{hostId='4a284ac7-8075-430b-98ee-c7c45d1cc04a', vmId='7dbb3560-8299-44fc-a97d-d669ac7dd31e', vm='VM [test-2]'}), log id: 1b4977b
engine.log: <name>test-2</name>
engine.log:2018-02-07 11:02:22,803+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-1399) [20dd9e65-48d0-4f59-a52c-71857bd0134d] EVENT_ID: USER_STARTED_VM(153), VM test-2 was started by admin@internal-authz (Host: host_mixed_1).
engine.log:2018-02-07 11:02:40,180+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-61) [] VM '7dbb3560-8299-44fc-a97d-d669ac7dd31e'(test-2) moved from 'WaitForLaunch' --> 'PoweringUp'
engine.log:2018-02-07 11:02:56,199+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-84) [] VM '7dbb3560-8299-44fc-a97d-d669ac7dd31e'(test-2) moved from 'PoweringUp' --> 'Up'
engine.log:2018-02-07 11:02:56,341+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-84) [] EVENT_ID: USER_RUN_VM(32), VM test-2 started on Host host_mixed_1
engine.log:2018-02-07 11:36:16,228+02 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-1784) [] EVENT_ID: VM_SET_TO_UNKNOWN_STATUS(142), VM test-2 was set to the Unknown status.
engine.log:2018-02-07 11:42:35,335+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-51) [] VM '7dbb3560-8299-44fc-a97d-d669ac7dd31e'(test-2) moved from 'Unknown' --> 'Down'
engine.log:2018-02-07 11:42:35,359+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-51) [] EVENT_ID: VM_DOWN_ERROR(119), VM test-2 is down with error. Exit message: VM terminated with error.
engine.log:2018-02-07 11:42:39,963+02 WARN [org.ovirt.engine.core.bll.RestoreStatelessVmCommand] (EE-ManagedThreadFactory-engine-Thread-1904) [64f3541a] Couldn't update VM 'test-2' (7dbb3560-8299-44fc-a97d-d669ac7dd31e) version from it's template, continue with restoring stateless snapshot.

logs: https://drive.google.com/open?id=1vzLUwlP3oB6Z2PFWbq_Y8ahlLPfnpI1E
Failing on VM pools: a VM is down with an error. Failed QA. See comment 31.
Hi,

I can see there is no libvirt-guests entry in the journal log. Is the journal from the host or from the guest?

Moreover, on which RHEL were the tests done? There is a known bug in 7.4 for VM shutdown, which is fixed in 7.5. If so, can you please retest on RHEL 7.5?

Next, can you please redo the test with 10 VMs which are not in the pool? This will help eliminate the pool as the source of the issue.

Many thanks,
Petr
let's wait for confirmation
I retested with:

Engine version: 4.2.2-0.1.el7
Host:
OS Version: RHEL - 7.5 - 6.el7
Kernel Version: 3.10.0 - 851.el7.x86_64
KVM Version: 2.10.0 - 20.el7
LIBVIRT Version: libvirt-3.9.0-13.el7
VDSM Version: vdsm-4.20.18-1.el7ev

Cases:
1. Create a VM pool with 15 VMs, run all VMs, shut down the host
2. Create 12 VMs, run all VMs, shut down the host

Results:
Case 1: 2 VMs stayed in Unknown status
Case 2: 1 VM stayed in Unknown status

See logs at: https://drive.google.com/open?id=1YdeA3msXn6w8fNMLlXVmO3qO3RPe1bab

Case 1, pool with 15 VMs: pool_test_-7 and pool_test_-5 were set to "Unknown status".

shutdown command:
[root@puma42 ~]# shutdown
Shutdown scheduled for Wed 2018-02-21 09:21:03 IST, use 'shutdown -c' to cancel.
[root@puma42 ~]#
Broadcast message from root.lab.tlv.redhat.com (Wed 2018-02-21 09:20:03 IST):
The system is going down for power-off at Wed 2018-02-21 09:21:03 IST!

Engine log:
2018-02-21 09:21:10,962+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-65) [] VM '9206a475-bcc5-4e24-a125-13085a2b02be'(pool_test_-1) moved from 'Up' --> 'PoweringDown'
2018-02-21 09:21:10,971+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-65) [] VM '5c0b4c2f-d1f0-4e48-93c9-131c7ef61ef1'(pool_test_-3) moved from 'Up' --> 'Down'
2018-02-21 09:21:11,112+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-65) [] EVENT_ID: VM_DOWN(61), VM pool_test_-3 is down. Exit message: User shut down from within the guest
2018-02-21 09:21:11,475+02 WARN [org.ovirt.engine.core.bll.RestoreStatelessVmCommand] (EE-ManagedThreadFactory-engine-Thread-21509) [27f2ac26] Couldn't update VM 'pool_test_-3' (5c0b4c2f-d1f0-4e48-93c9-131c7ef61ef1) version from it's template, continue with restoring stateless snapshot.
2018-02-21 09:21:26,176+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-45) [] VM '2ddcad0a-e94e-4f17-9268-91624d9bd9da'(pool_test_-2) moved from 'Up' --> 'Down'
2018-02-21 09:21:26,297+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-45) [] EVENT_ID: VM_DOWN(61), VM pool_test_-2 is down. Exit message: User shut down from within the guest
2018-02-21 09:21:26,306+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-45) [] VM '9206a475-bcc5-4e24-a125-13085a2b02be'(pool_test_-1) moved from 'PoweringDown' --> 'Down'
2018-02-21 09:21:26,341+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-45) [] EVENT_ID: VM_DOWN(61), VM pool_test_-1 is down. Exit message: User shut down from within the guest
2018-02-21 09:21:26,422+02 WARN [org.ovirt.engine.core.bll.RestoreStatelessVmCommand] (EE-ManagedThreadFactory-engine-Thread-21514) [3fbc9284] Couldn't update VM 'pool_test_-1' (9206a475-bcc5-4e24-a125-13085a2b02be) version from it's template, continue with restoring stateless snapshot.
2018-02-21 09:21:26,516+02 WARN [org.ovirt.engine.core.bll.RestoreStatelessVmCommand] (EE-ManagedThreadFactory-engine-Thread-21514) [4b3e995b] Couldn't update VM 'pool_test_-2' (2ddcad0a-e94e-4f17-9268-91624d9bd9da) version from it's template, continue with restoring stateless snapshot.
2018-02-21 09:21:41,395+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-25) [] VM 'b5af689a-6f2e-411f-89bf-dfca090a495e'(pool_test_-4) moved from 'Up' --> 'Down'
2018-02-21 09:21:41,536+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-25) [] EVENT_ID: VM_DOWN(61), VM pool_test_-4 is down. Exit message: User shut down from within the guest
2018-02-21 09:21:41,549+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-25) [] VM '3d95f01e-1e92-4ad7-bbcd-7d00167929a1'(pool_test_-14) moved from 'Up' --> 'Down'
2018-02-21 09:21:41,570+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-25) [] EVENT_ID: VM_DOWN(61), VM pool_test_-14 is down. Exit message: User shut down from within the guest
2018-02-21 09:21:41,661+02 WARN [org.ovirt.engine.core.bll.RestoreStatelessVmCommand] (EE-ManagedThreadFactory-engine-Thread-21519) [bc90f5a] Couldn't update VM 'pool_test_-4' (b5af689a-6f2e-411f-89bf-dfca090a495e) version from it's template, continue with restoring stateless snapshot.
2018-02-21 09:21:41,764+02 WARN [org.ovirt.engine.core.bll.RestoreStatelessVmCommand] (EE-ManagedThreadFactory-engine-Thread-21519) [2d7151cd] Couldn't update VM 'pool_test_-14' (3d95f01e-1e92-4ad7-bbcd-7d00167929a1) version from it's template, continue with restoring stateless snapshot.
2018-02-21 09:21:56,621+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-40) [] VM '2dd11809-dd6d-4041-ae01-b4d402f99579'(pool_test_-6) moved from 'Up' --> 'Down'
2018-02-21 09:21:56,646+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-40) [] EVENT_ID: VM_DOWN(61), VM pool_test_-6 is down. Exit message: User shut down from within the guest
2018-02-21 09:21:56,728+02 WARN [org.ovirt.engine.core.bll.RestoreStatelessVmCommand] (EE-ManagedThreadFactory-engine-Thread-21524) [ac7ac7b] Couldn't update VM 'pool_test_-6' (2dd11809-dd6d-4041-ae01-b4d402f99579) version from it's template, continue with restoring stateless snapshot.
2018-02-21 09:22:11,697+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-55) [] VM '52fd7858-cdc1-47fb-9603-34fb874c34fc'(pool_test_-8) moved from 'Up' --> 'Down'
2018-02-21 09:22:11,721+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-55) [] EVENT_ID: VM_DOWN(61), VM pool_test_-8 is down. Exit message: User shut down from within the guest
2018-02-21 09:22:11,802+02 WARN [org.ovirt.engine.core.bll.RestoreStatelessVmCommand] (EE-ManagedThreadFactory-engine-Thread-21529) [2bb28719] Couldn't update VM 'pool_test_-8' (52fd7858-cdc1-47fb-9603-34fb874c34fc) version from it's template, continue with restoring stateless snapshot.
2018-02-21 09:22:26,765+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-34) [] VM 'd6d9f51c-323e-46a2-82da-f43da155425f'(pool_test_-11) moved from 'Up' --> 'Down'
2018-02-21 09:22:26,786+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-34) [] EVENT_ID: VM_DOWN(61), VM pool_test_-11 is down. Exit message: User shut down from within the guest
2018-02-21 09:22:26,794+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-34) [] VM 'd21fd87b-95e9-435f-9155-96ec25ccb272'(pool_test_-15) moved from 'Up' --> 'Down'
2018-02-21 09:22:26,810+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-34) [] EVENT_ID: VM_DOWN(61), VM pool_test_-15 is down. Exit message: User shut down from within the guest
2018-02-21 09:22:26,880+02 WARN [org.ovirt.engine.core.bll.RestoreStatelessVmCommand] (EE-ManagedThreadFactory-engine-Thread-21534) [127a20ab] Couldn't update VM 'pool_test_-15' (d21fd87b-95e9-435f-9155-96ec25ccb272) version from it's template, continue with restoring stateless snapshot.
2018-02-21 09:22:26,963+02 WARN [org.ovirt.engine.core.bll.RestoreStatelessVmCommand] (EE-ManagedThreadFactory-engine-Thread-21534) [52917cf7] Couldn't update VM 'pool_test_-11' (d6d9f51c-323e-46a2-82da-f43da155425f) version from it's template, continue with restoring stateless snapshot.
2018-02-21 09:22:41,857+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-7) [] VM '8f441bc7-148f-44c0-8890-a7f8db9eca57'(pool_test_-9) moved from 'Up' --> 'Down'
2018-02-21 09:22:41,899+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-7) [] EVENT_ID: VM_DOWN(61), VM pool_test_-9 is down. Exit message: User shut down from within the guest
2018-02-21 09:22:41,981+02 WARN [org.ovirt.engine.core.bll.RestoreStatelessVmCommand] (EE-ManagedThreadFactory-engine-Thread-21537) [760ea1af] Couldn't update VM 'pool_test_-9' (8f441bc7-148f-44c0-8890-a7f8db9eca57) version from it's template, continue with restoring stateless snapshot.
2018-02-21 09:22:56,938+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-87) [] VM '389f1314-cd45-4e55-ba70-c62ff1e6e715'(pool_test_-10) moved from 'Up' --> 'Down'
2018-02-21 09:22:56,972+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-87) [] EVENT_ID: VM_DOWN(61), VM pool_test_-10 is down. Exit message: User shut down from within the guest
2018-02-21 09:22:57,041+02 WARN [org.ovirt.engine.core.bll.RestoreStatelessVmCommand] (EE-ManagedThreadFactory-engine-Thread-21542) [4aedf5e2] Couldn't update VM 'pool_test_-10' (389f1314-cd45-4e55-ba70-c62ff1e6e715) version from it's template, continue with restoring stateless snapshot.
2018-02-21 09:23:12,012+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-56) [] VM '52ed58bf-7a49-4776-b51e-58eabceb5c60'(pool_test_-12) moved from 'Up' --> 'Down'
2018-02-21 09:23:12,186+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-56) [] EVENT_ID: VM_DOWN(61), VM pool_test_-12 is down. Exit message: User shut down from within the guest
2018-02-21 09:23:12,432+02 WARN [org.ovirt.engine.core.bll.RestoreStatelessVmCommand] (EE-ManagedThreadFactory-engine-Thread-21547) [63fb6556] Couldn't update VM 'pool_test_-12' (52ed58bf-7a49-4776-b51e-58eabceb5c60) version from it's template, continue with restoring stateless snapshot.
2018-02-21 09:23:27,222+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-75) [] VM '7bbd15b8-6bdc-4df0-956b-8d3dd7409485'(pool_test_-13) moved from 'Up' --> 'Down'
2018-02-21 09:23:27,247+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-75) [] EVENT_ID: VM_DOWN(61), VM pool_test_-13 is down. Exit message: User shut down from within the guest
2018-02-21 09:23:27,321+02 WARN [org.ovirt.engine.core.bll.RestoreStatelessVmCommand] (EE-ManagedThreadFactory-engine-Thread-21550) [3069f554] Couldn't update VM 'pool_test_-13' (7bbd15b8-6bdc-4df0-956b-8d3dd7409485) version from it's template, continue with restoring stateless snapshot.
2018-02-21 09:25:02,768+02 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-21578) [] EVENT_ID: VM_SET_TO_UNKNOWN_STATUS(142), VM pool_test_-7 was set to the Unknown status.
2018-02-21 09:25:02,778+02 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-21578) [] EVENT_ID: VM_SET_TO_UNKNOWN_STATUS(142), VM pool_test_-5 was set to the Unknown status.

-------------------------------------------------------------------------------

Case 2, 12 VMs: vm_2 stayed in "Unknown status".

shutdown command:
[root@puma42 ~]# shutdown
Shutdown scheduled for Wed 2018-02-21 10:30:00 IST, use 'shutdown -c' to cancel.
[root@puma42 ~]#
Broadcast message from root.lab.tlv.redhat.com (Wed 2018-02-21 10:29:00 IST):
The system is going down for power-off at Wed 2018-02-21 10:30:00 IST!

Engine log, vm_2 was set to the Unknown status:
2018-02-21 10:30:28,324+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-55) [] VM '1ba7d522-17cf-4c68-a829-271ec4c8288f'(vm_1) moved from 'Up' --> 'Down'
2018-02-21 10:30:28,478+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-55) [] EVENT_ID: VM_DOWN(61), VM vm_1 is down. Exit message: User shut down from within the guest
2018-02-21 10:30:28,491+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-55) [] VM '49adc5c8-aa6a-4576-93ba-734750c713fb'(vm_11) moved from 'Up' --> 'Down'
2018-02-21 10:30:28,611+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-55) [] EVENT_ID: VM_DOWN(61), VM vm_11 is down. Exit message: User shut down from within the guest
2018-02-21 10:30:28,620+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-55) [] VM 'b36b411b-864a-4cce-a6df-143ecb345538'(vm_5) moved from 'Up' --> 'Down'
2018-02-21 10:30:28,744+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-55) [] EVENT_ID: VM_DOWN(61), VM vm_5 is down. Exit message: User shut down from within the guest
2018-02-21 10:30:43,804+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-6) [] VM 'c215a1be-7a75-41d8-b470-bea29d0ccd32'(vm_7) moved from 'Up' --> 'Down'
2018-02-21 10:30:43,926+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-6) [] EVENT_ID: VM_DOWN(61), VM vm_7 is down. Exit message: User shut down from within the guest
2018-02-21 10:30:58,975+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-10) [] VM '80349a03-8886-4cf0-a577-a39f7221875d'(vm_12) moved from 'Up' --> 'Down'
2018-02-21 10:30:59,115+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-10) [] EVENT_ID: VM_DOWN(61), VM vm_12 is down. Exit message: User shut down from within the guest
2018-02-21 10:30:59,134+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-10) [] VM '8625c439-792b-4a46-9c71-f85d0c1cdd08'(vm_8) moved from 'Up' --> 'Down'
2018-02-21 10:30:59,159+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-10) [] EVENT_ID: VM_DOWN(61), VM vm_8 is down. Exit message: User shut down from within the guest
2018-02-21 10:31:14,202+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-86) [] VM 'a7822cf9-6086-45e7-9ffc-b501457709a0'(vm_6) moved from 'Up' --> 'Down'
2018-02-21 10:31:14,223+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-86) [] EVENT_ID: VM_DOWN(61), VM vm_6 is down. Exit message: User shut down from within the guest
2018-02-21 10:31:14,233+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-86) [] VM '710c361d-e66a-49c6-a4fc-4aed3a5ca2d6'(vm_3) moved from 'Up' --> 'Down'
2018-02-21 10:31:14,253+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-86) [] EVENT_ID: VM_DOWN(61), VM vm_3 is down. Exit message: User shut down from within the guest
2018-02-21 10:31:29,292+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-4) [] VM 'e46b3232-4cbb-47fb-8f24-8c1b72486482'(vm_4) moved from 'Up' --> 'Down'
2018-02-21 10:31:29,425+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-4) [] EVENT_ID: VM_DOWN(61), VM vm_4 is down. Exit message: User shut down from within the guest
2018-02-21 10:31:29,433+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-4) [] VM '3fd29d04-0e93-49a4-b1ed-464aa2b4b73c'(vm_9) moved from 'Up' --> 'Down'
2018-02-21 10:31:29,451+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-4) [] EVENT_ID: VM_DOWN(61), VM vm_9 is down. Exit message: User shut down from within the guest
2018-02-21 10:31:44,487+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-53) [] VM 'e61ac7f6-6026-42e9-bc8b-f42113a89815'(vm_10) moved from 'Up' --> 'Down'
2018-02-21 10:31:44,511+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-53) [] EVENT_ID: VM_DOWN(61), VM vm_10 is down. Exit message: User shut down from within the guest
2018-02-21 10:31:59,541+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-6) [] VM 'b8da67a1-7c10-4170-a9e6-e87cf6d397e1'(vm_13) moved from 'Up' --> 'Down'
2018-02-21 10:31:59,563+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-6) [] EVENT_ID: VM_DOWN(61), VM vm_13 is down. Exit message: User shut down from within the guest
2018-02-21 10:33:33,656+02 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-22791) [] EVENT_ID: VM_SET_TO_UNKNOWN_STATUS(142), VM vm_2 was set to the Unknown status.
2018-02-21 10:33:33,656+02 INFO [org.ovirt.engine.core.vdsbroker.VdsManager] (EE-ManagedThreadFactory-engine-Thread-22791) [] Server failed to respond, vds_id='4a284ac7-8075-430b-98ee-c7c45d1cc04a', vds_name='host_mixed_1', vm_count=2, spm_status='SPM', non-responsive_timeout (seconds)=81, error: java.net.NoRouteToHostException: No route to host
Hi,

I have examined the logs and I cannot find a conclusive result. Please repeat the test, this time with debug logs switched on. This issue cannot be solved without them.

Thank you
Engine: Software Version: 4.2.2.1-0.1.el7
Host:
OS Version: RHEL - 7.5 - 6.el7
Kernel Version: 3.10.0 - 855.el7.x86_64
KVM Version: 2.10.0 - 21.el7
LIBVIRT Version: libvirt-3.9.0-13.el7
VDSM Version: vdsm-4.20.19-1.el7ev

I reran with pools and VMs, with the logs set to debug. See logs at:
https://drive.google.com/open?id=1QktaTFoo_fdq9k61-v0HvF4iRqmniKnM

Same results: we have VMs in Unknown state after shutting down the host. Failed QA.
We've sent the event during shutdown as expected:

2018-02-27 13:36:38,841+0200 DEBUG (libvirt/events) [jsonrpc.Notification] Sending event {"params": {"7c2a20fc-12a9-44c5-95e7-1bfb3282d056": {"status": "Down", "exitMessage": "User shut down from within the guest", "vmId": "7c2a20fc-12a9-44c5-95e7-1bfb3282d056", "exitReason": 7, "exitCode": 0}, "notify_time": 4364298540}, "jsonrpc": "2.0", "method": "|virt|VM_status|7c2a20fc-12a9-44c5-95e7-1bfb3282d056"} (__init__:181)
2018-02-27 13:36:38,842+0200 DEBUG (libvirt/events) [vds] Sending notification |virt|VM_status|7c2a20fc-12a9-44c5-95e7-1bfb3282d056 with params {'7c2a20fc-12a9-44c5-95e7-1bfb3282d056': {'status': 'Down', 'exitMessage': 'User shut down from within the guest', 'vmId': '7c2a20fc-12a9-44c5-95e7-1bfb3282d056', 'exitReason': 7, 'exitCode': 0}, 'notify_time': 4364298540} (clientIF:212)
2018-02-27 13:36:43,121+0200 DEBUG (MainThread) [vds.MultiProtocolAcceptor] Stopping Acceptor (protocoldetector:214)
2018-02-27 13:36:43,122+0200 INFO (MainThread) [jsonrpc.JsonRpcServer] Stopping JsonRPC Server (__init__:703)

But there is only the following error on the engine side, and nothing else:

2018-02-27 13:36:29,930+02 INFO [org.ovirt.engine.core.bll.RestoreStatelessVmCommand] (EE-ManagedThreadFactory-engine-Thread-30299) [cc26175] Lock freed to object 'EngineLock:{exclusiveLocks='[b46f4e62-d5eb-4f81-ad75-00163341ac65=VM]', sharedLocks=''}'
2018-02-27 13:36:46,956+02 ERROR [org.ovirt.vdsm.jsonrpc.client.reactors.Reactor] (SSL Stomp Reactor) [] Unable to process messages Broken pipe

Israel, the debug log level was NOT enabled on the engine side, so we cannot see whether the message was received or not. Please repeat with debug logs on all sides.
Retested with:

Engine: Software Version: 4.2.2.2-0.1.el7
Host:
OS Version: RHEL - 7.5 - 8.el7
Kernel Version: 3.10.0 - 858.el7.x86_64
KVM Version: 2.10.0 - 21.el7
LIBVIRT Version: libvirt-3.9.0-13.el7
VDSM Version: vdsm-4.20.19-1.el7ev

See logs: https://drive.google.com/open?id=1wkEIyXsmSpTywZvFfr_0k8ucFHSePu4G

This time I did a graceful shutdown with power management. From /var/log/messages:

Mar 5 15:39:08 puma42 systemd: Starting Power-Off...
Mar 5 15:39:08 puma42 systemd: Stopping Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling...
Mar 5 15:39:08 puma42 systemd: Shutting down.
Mar 5 15:39:08 puma42 systemd-shutdown[1]: Sending SIGTERM to remaining processes...
Mar 5 15:39:08 puma42 journal: Journal stopped

Running again:

Mar 5 17:41:16 puma42 journal: Runtime journal is using 8.0M (max allowed 1.5G, trying to leave 2.3G free of 15.5G available → current limit 1.5G).
Mar 5 17:41:16 puma42 kernel: microcode: microcode updated early to revision 0x710, date = 2013-06-17

vm_16 was set to the Unknown state:

engine.log:2018-03-05 15:37:29,261+02 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-709) [] EVENT_ID: VM_SET_TO_UNKNOWN_STATUS(142), VM vm_16 was set to the Unknown status.
It seems like two issues, really:
1) All VMs are shut down, but a few of the last VMs (when there are more than just a few) do not seem to send their update correctly, and the engine declares them Unknown (the event is lost).
2) All VMs are shut down as "user shut down from within the guest" due to mis-identification of the shutdown reason.
Hi Derek,

I think with this behaviour and HA VMs (with and without leases), we need a documentation section that explains the flow(s) so that the customer understands the impact.

When an HA VM is shut down cleanly (be it due to this BZ, from the UI, or from within the VM), the HA VM isn't restarted. While this might be expected for the UI and in-VM use cases, it is probably not 100% clear for the host shutdown use case. As such, I would like to have that clearly documented in the HA-VM section.

Do you think this is feasible?

Thanks!
Martin
In a word, yes. Can we meet on BlueJeans after next week (travelling to the OPS QBR next week) to go through the detailed planning discussion for docs?
(In reply to Michal Skrivanek from comment #41)
> It seems like two issues, really:
> 1) All VMs are shut down, but a few of the last VMs (when there are more than
> just a few) do not seem to send their update correctly, and the engine
> declares them Unknown (the event is lost).
> 2) All VMs are shut down as "user shut down from within the guest" due to
> mis-identification of the shutdown reason.

Who's looking at this?
I have retested the setup mentioned by QE. All of the VMs terminated gracefully and without errors. It seems that the issue was due to an unrelated storage error.

As for the second part, I have started working on a patch that sets the right status on VMs terminating during a system shutdown.
Verified with:

Engine: 4.2.3.3-0.1.el7
Host:
OS Version: RHEL - 7.5 - 8.el7
Kernel Version: 3.10.0 - 862.el7.x86_64
KVM Version: 2.10.0 - 21.el7_5.2
LIBVIRT Version: libvirt-3.9.0-14.el7_5.2
VDSM Version: vdsm-4.20.27-1.el7ev
SPICE Version: 0.14.0 - 2.el7

Test run: https://polarion.engineering.redhat.com/polarion/#/project/RHEVM3/testrun?id=42-4&tab=records&result=passed
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2018:1488