Previously, in an emergency, users had to shut down the hosts to preserve the data center. This caused running virtual machines to be killed by the systemd process without a graceful shutdown. As a result, the virtual machine's state became undefined, which led to problematic scenarios for virtual machines running databases such as Oracle and SAP. In this release, virtual machines can be gracefully shut down by delaying the systemd process. Only after the virtual machines are shut down does the systemd process take control and continue the shutdown. VDSM is shut down only after the virtual machines have been gracefully shut down, and after it has passed this information to the Manager and waited 5 seconds for the Manager to acknowledge that the virtual machines have been shut down. Remove the note in 7.5.6 that discusses damage caused by ungraceful shutdown; this should no longer happen. Check for other places in the documentation and the VM Guide that may also discuss this behavior. Note that the default 5 seconds can be changed in vdsm.con
Accepting into GA program and assigning to Tahlia for review. The related eng RFE is on MODIFIED, but it appears the main functionality has been shipped already.
The option in vdsm.conf is timeout_engine_clear_vms. Derek, did anything come of the conversation in https://bugzilla.redhat.com/show_bug.cgi?id=1334982#c42 ?
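For anyone checking a host's effective value, here is a minimal Python sketch of how the option could be read. The option name and the 5-second default come from this bug. The `[vars]` section name, the sample file contents, and the helper name `engine_clear_timeout` are assumptions for illustration only; this is not how VDSM itself loads its configuration.

```python
import configparser

# Default wait, in seconds, as stated in this bug.
DEFAULT_TIMEOUT_SECONDS = 5


def engine_clear_timeout(conf_text, section="vars"):
    """Return timeout_engine_clear_vms from INI-style text, or the default.

    The section name "vars" is an assumption; adjust it to match the
    actual layout of vdsm.conf on the host.
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # The fallback covers both a missing section and a missing option.
    return parser.getint(
        section, "timeout_engine_clear_vms", fallback=DEFAULT_TIMEOUT_SECONDS
    )


# Hypothetical file contents raising the wait to 30 seconds.
sample = "[vars]\ntimeout_engine_clear_vms = 30\n"
print(engine_clear_timeout(sample))
```

If the option is absent, the helper falls back to the 5-second default described in the doc text.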
Just tested this now (version 4.2.3-0.1.el7), and the VMs went into Unknown state (did not shut down gracefully). Could be due to an unrelated issue. I can still make the required docs changes, but I'll keep the merge request as a WIP until BZ#1334982 is Verified.
Removed the note in Configuring Host Power Management Settings and made maintenance mode a step in the procedure instead. The HA section of the VMM Guide says this: "High availability means that a virtual machine will be automatically restarted if its process is interrupted. This happens if the virtual machine is terminated by methods other than powering off from within the guest, powering off the host by the administrator, or sending the shutdown command from the Manager. When these events occur, the highly available virtual machine is automatically restarted, either on its original host or another host in the cluster." What this is trying to say is that the VM is NOT restarted if it is powered off from within the guest, powered off through the Manager, or if the host is powered off (the use case of this BZ). However, the phrasing (particularly the last sentence) is ambiguous and could be interpreted the opposite way if you're not reading closely enough. So, before I move this bug along, I'll improve this paragraph. Stay tuned.
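To make the intended reading unambiguous while the paragraph is being reworked, the rule can be sketched as a small Python decision function. All names here (`StopReason`, `should_restart`) are hypothetical and only illustrate the logic quoted above; they do not correspond to Manager or VDSM code.

```python
from enum import Enum


class StopReason(Enum):
    """Ways a virtual machine's process can stop (illustrative only)."""
    GUEST_POWER_OFF = "powered off from within the guest"
    MANAGER_SHUTDOWN = "shutdown command sent from the Manager"
    HOST_POWER_OFF_BY_ADMIN = "host powered off by the administrator"
    UNEXPECTED = "process interrupted in any other way"


# Deliberate stops: high availability does NOT restart the VM after these.
DELIBERATE_STOPS = frozenset({
    StopReason.GUEST_POWER_OFF,
    StopReason.MANAGER_SHUTDOWN,
    StopReason.HOST_POWER_OFF_BY_ADMIN,
})


def should_restart(highly_available, reason):
    """Return True only for a highly available VM that stopped unexpectedly."""
    return highly_available and reason not in DELIBERATE_STOPS


for reason in StopReason:
    print(reason.name, should_restart(True, reason))
```

Under this reading, only the `UNEXPECTED` case triggers an automatic restart, on the original host or another host in the cluster.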
Now published at https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.2/html/virtual_machine_management_guide/sect-improving_uptime_with_virtual_machine_high_availability#What_is_high_availability