Bug 1341106

Summary: HA VMs do not start after successful power management.
Product: Red Hat Enterprise Virtualization Manager
Reporter: Ulhas Surse <usurse>
Component: ovirt-engine
Assignee: Vinzenz Feenstra [evilissimo] <vfeenstr>
Status: CLOSED ERRATA
QA Contact: Artyom <alukiano>
Severity: medium
Docs Contact:
Priority: urgent
Version: 3.6.5
CC: ahadas, alukiano, bgraveno, deepak.jagtap, dmoessne, emarquez, fsun, lsurette, mgoldboi, michal.skrivanek, mtessun, pmatyas, pstehlik, rbalakri, Rhev-m-bugs, sacpatil, srevivo, tjelinek, vfeenstr, ykaul
Target Milestone: ovirt-4.1.1
Keywords: Triaged, ZStream
Target Release: ---
Flags: deepak.jagtap: needinfo?
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, when virtual machines were stopped during a host shutdown or reboot, the shutdown appeared to VDSM to have been performed gracefully from within the guest operating system. The Red Hat Virtualization Manager therefore did not start highly available virtual machines on a different host, because it considered the stopping of the virtual machine to be user initiated. With this update, VDSM detects that a virtual machine was shut down by the host rather than from within the guest, differentiates this unplanned shutdown, and reports this information. As a result, virtual machines stopped by a host shutdown are now restarted on a different host.
Story Points: ---
Clone Of:
Clones: 1389332, 1404623, 1406033 (view as bug list)
Environment:
Last Closed: 2017-04-25 00:51:50 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Virt
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1389332, 1404623, 1406033
Attachments:
engine log (no flags)

Description Ulhas Surse 2016-05-31 09:39:32 UTC
Description of problem:
With power management configured and working, VMs marked as highly available should start on another host, but instead they remain shut down.

Things tried so far:

A) Configure power management for the hosts.
B) Mark the VM as highly available (priority HIGH)

 1] Click the Power Management drop-down menu and select Restart - [VMs remain down with Exit message: User shut down from within the guest]
 2] Run reboot / init 6 / init 0 on the host - [VMs remain down with Exit message: User shut down from within the guest]
 3] From the hypervisor console, power off the host - [VMs remain down with Exit message: User shut down from within the guest]
 4] Abrupt shutdown - [VMs restarted on another host once the host was fenced]
 5] ifdown the interface - [VMs go to Unknown and start once the host is up]

Version-Release number of selected component (if applicable):
RHEVM 3.6.5

How reproducible:
Always

Steps to Reproduce:
1. Configure power management for the host.
2. Mark the VM as Highly Available 
3. Gracefully shut down the host, or choose "Host --> Power Management --> Restart".

The host fence is successful, but the VM stays down with: Exit message: User shut down from within the guest

Actual results:
HA VMs do not restart on another (or the same) host.

Expected results:
HA VMs should restart on another host.

Additional info:
Also tried: echo c > /proc/sysrq-trigger (forces a kernel crash).
With this, the VM was restarted on another host.

Comment 4 Martin Tessun 2016-06-01 09:45:38 UTC
Just some further findings:

It looks like the IMM2 board from IBM/Lenovo always sends ACPI signals to the OS.

This is why systemd jumps in and kills the VMs.
So we need to either
a) get systemd to ignore the ACPI signals (and thus not kill the VMs), or
b) get IBM not to send ACPI signals to the OS on an "Immediate Power Off"
   (resp. "power off" without "-s"), as it obviously does in that case.


Taken from the logs prior to a Poweroff-event from the IMM
(still waiting for some further logs for final confirmation):

qemu: terminating on signal 15 from pid 1
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@2016-05-26 05:48:42.924+0000: starting up libvirt version: 1.2.17, packa
ge: 13.el7_2.4 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2016-03-02-11:10:27, x86-034.build.eng.bos.redhat.com), qemu version: 2.3.0 (qemu-kvm-rhev-
2.3.0-31.el7_2.10)

Comment 7 Martin Tessun 2016-06-01 10:49:03 UTC
Checking the logs from my previous tests, I found the following in the messages:

2016-06-01T08:19:37.172092Z qemu-kvm: warning: All CPU(s) up to maxcpus should be described in NUMA config
main_channel_link: add main channel client
main_channel_handle_parsed: net test: latency 85.839000 ms, bitrate 1721352 bps (1.641609 Mbps) LOW BANDWIDTH
inputs_connect: inputs channel client create
red_dispatcher_set_cursor_peer: 
===> qemu: terminating on signal 15 from pid 1 <=== 

This shows the shutdown is triggered by systemd before the system is powered off.

Even better evidence can be found in the messages:
Jun  1 08:06:08 IDCRHLV01 root: PowerOff Test started
Jun  1 08:07:16 IDCRHLV01 systemd-logind: Power key pressed.
Jun  1 08:07:16 IDCRHLV01 systemd-logind: Powering Off...
Jun  1 08:07:16 IDCRHLV01 systemd-logind: System is powering down.
Jun  1 08:07:16 IDCRHLV01 systemd: Unmounting RPC Pipe File System...
Jun  1 08:07:16 IDCRHLV01 systemd: Stopped Dump dmesg to /var/log/dmesg.
Jun  1 08:07:16 IDCRHLV01 systemd: Stopping Dump dmesg to /var/log/dmesg...
Jun  1 08:07:16 IDCRHLV01 systemd: Stopped target Timers.
Jun  1 08:07:16 IDCRHLV01 systemd: Stopping Timers.
Jun  1 08:07:16 IDCRHLV01 systemd: Stopping LVM2 PV scan on device 8:144...
[...]

Comment 8 Martin Tessun 2016-06-01 10:50:20 UTC
Sorry, submitted too early.
So maybe we should disable ACPI power-management handling on hypervisors by default.

E.g.:

1. Shutdown and disable acpid
   # systemctl disable acpid
   # systemctl stop acpid

2. Change the ACPI Actions of systemd to "IGNORE":
   # mkdir -m 755 /etc/systemd/logind.conf.d
   # cat > /etc/systemd/logind.conf.d/acpi.conf <<EOF
[Login]
HandlePowerKey=ignore
HandleSuspendKey=ignore
HandleHibernateKey=ignore
HandleLidSwitch=ignore
HandleLidSwitchDocked=ignore
EOF

3. Restart systemd-logind
   # systemctl restart systemd-logind
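
To verify the drop-in took effect (same paths as in the example above; the exact journal wording may vary between systemd versions):

   # grep -H . /etc/systemd/logind.conf.d/acpi.conf
   # journalctl -u systemd-logind -n 20

After pressing the power button, the journal should still show "Power key pressed." but no longer "Powering Off..." (compare the messages excerpt in comment 7).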

Comment 9 Michal Skrivanek 2016-06-01 18:03:19 UTC
(In reply to Martin Tessun from comment #7)
> ===> qemu: terminating on signal 15 from pid 1 <=== 
> 
> This shows the shutdown is triggered by systemd before the system is powered
> off.
> 
this would be OK, I guess; we should handle that and still identify it as an ungraceful shutdown. Was there perhaps any ACPI event inside the guest? What about libvirt, did it get SIGTERM too?

Generally, the attempt to terminate gracefully is desired behavior, but then we have to rethink how HA behaves: maybe restart an HA VM regardless of what the guest does and allow shutting down an HA VM only from the UI... which might be annoying.

Comment 10 Yaniv Kaul 2016-06-01 18:05:48 UTC
(In reply to Martin Tessun from comment #8)
> Sorry submitted too early.
> So maybe we should disable powermanagement for Hypervisors by default.

No, we shouldn't. In case of a disaster, I expect IT to go into the server room and shut down using the ON/OFF button, expecting a graceful shutdown.
This change would be unexpected.
I'm quite sure there's a way with IBM, via the BMC or whatnot, to ungracefully kill the host. We should look into it in the fence-agents code.
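
For reference, a BMC typically exposes both behaviours through plain ipmitool, which is the same interface fence_ipmilan drives; a minimal sketch, with address and credentials as placeholders:

   # ipmitool -I lanplus -H <bmc-address> -U <user> -P <password> chassis power soft
   # ipmitool -I lanplus -H <bmc-address> -U <user> -P <password> chassis power off

"power soft" asks the OS for an ACPI soft shutdown (the graceful path that triggers systemd here), while "power off" cuts power immediately without raising any ACPI event in the OS.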

Comment 11 Martin Tessun 2016-06-02 07:17:49 UTC
(In reply to Yaniv Kaul from comment #10)
> (In reply to Martin Tessun from comment #8)
> > Sorry submitted too early.
> > So maybe we should disable powermanagement for Hypervisors by default.
> 
> No, we shouldn't. In case of a disaster, I expect IT to go into the server
> room and shutdown using the ON/OFF button, expecting a graceful shutdown. 
> This change is unexpected.

Well, in case of a disaster I don't expect anyone to go to the server room; it is a disaster, so there may well be risks in going there.

In my 20 years of administration, I have never used the power-off button to gracefully shut down a server. Either I have a serial console I can reach, or I do a hard power-off (maybe even NMI-triggered, to get a crash dump), but practically never a graceful one, as that usually does not work in these cases.

Anyway, I can accept this point of view; my proposal would of course break the current behaviour, which might lead to other cases requesting the exact opposite.

> I'm quite sure there's a way in IBM, via BMC or whatnot, to ungracefully
> kill the host. We should look into it, in the fence-agents code.

Sure. From my point of view the IBM IMM2/BMC card has a firmware issue, as there is already an option to gracefully shut down the server (power off -s).

Still, I agree with Michal that we should somehow handle this sort of issue (the case where the VM is killed by systemd), at least for HA VMs.

Comment 16 Ahmed El-Rayess 2016-10-29 08:17:43 UTC
I have tried a couple more test cases related to this; maybe we can consider them in the same BZ.

When the admin logs on to the hypervisor and issues a shutdown or reboot, all the VMs running on the host exit with the same message, "User shut down from within the guest", which means the guests will never start up again automatically and the admin has to start all these VMs manually.

The solution to this could be either:
1- to enable maintenance mode on the host as part of the shutdown sequence, which would gracefully move all VMs from that host to another functional one in the cluster (a sketch follows below), or
2- to forcibly kill the guest VM processes instead of attempting a shutdown; this would then be picked up by RHV-M, which would automatically start the VMs on another host.
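
A minimal sketch of option 1 as a pre-shutdown hook, using the standard RHV-M REST API "deactivate" action (engine address, credentials and host id below are placeholders); the host would still have to wait for the migrations to finish before powering off:

   # curl -s -k -u 'admin@internal:<password>' -H 'Content-Type: application/xml' \
         -d '<action/>' 'https://<engine-fqdn>/ovirt-engine/api/hosts/<host-id>/deactivate'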

Comment 17 Vinzenz Feenstra [evilissimo] 2016-10-29 18:08:17 UTC
@aelrayes:

The workaround applied here will work for those scenarios as well (a hypervisor shut down by an administrator is not a user shutdown).

1) is the correct way to do this for an administrator

2) should be avoided if not necessary

Comment 23 Artyom 2017-01-26 08:41:44 UTC
Created attachment 1244637 [details]
engine log

Checked on:
Guest OS
============================
Red Hat Enterprise Linux Server release 7.3 (Maipo)
ovirt-guest-agent-common-1.0.13-3.el7ev.noarch

Host
============================
vdsm-4.19.2-2.el7ev.x86_64

Engine
============================
rhevm-4.1.0.2-0.2.el7.noarch

Checked scenario:
1) Power off the HA VM from the engine - the engine does not start the VM (PASS)
2) Power off the HA VM from the guest OS - the engine does not start the VM (PASS)
3) Power off the host where the HA VM runs via the engine power management action - the engine does not start the VM (FAILED)
4) Power off the host where HA VM runs via host OS - the engine does not start the VM (FAILED)

I do not know the reason why the engine does not restart the HA VM in cases 3 and 4. It worked fine in 4.0, so it looks like we have an additional regression in the 4.1 code.

You can start to look at the engine log from the line:
2017-01-26 03:35:09,730-05 INFO  [org.ovirt.engine.core.bll.pm.RestartVdsCommand] (org.ovirt.thread.pool-6-thread-49) [0a550b8c-1c30-4a65-8aac-fb5216ebc233] Running command: RestartVdsCommand internal: false. Entities affected :  ID: 44daf5ed-4e89-4472-9733-4376a055efcd Type: VDSAction group MANIPULATE_HOST with role type ADMIN

Comment 24 Moran Goldboim 2017-02-05 11:13:15 UTC
(In reply to Artyom from comment #23)
> [...]
> I do not know the reason why the engine does not restart the HA VM in cases
> 3 and 4. It worked fine in 4.0, so it looks like we have an additional
> regression in the 4.1 code.

Per Artyom's comment - raising priority here.

Comment 25 Vinzenz Feenstra [evilissimo] 2017-02-07 07:48:13 UTC
@Arik:

I have just tested it, and it really is working with VDSM 4.1 on a 4.0 engine,
so this must be a regression in the engine logic.
VDSM always returned success there; I am not sure where the problem is, but in 4.1 the engine seems to have changed its behavior.

Comment 29 Michal Skrivanek 2017-02-15 10:08:39 UTC
*** Bug 1421975 has been marked as a duplicate of this bug. ***

Comment 31 Artyom 2017-02-20 08:05:29 UTC
Verified

Guest OS
============================
Red Hat Enterprise Linux Server release 7.3 (Maipo)
ovirt-guest-agent-common-1.0.13-3.el7ev.noarch

Host
============================
vdsm-4.19.6-1.el7ev.x86_64

Engine
============================
rhevm-4.1.1.2-0.1.el7.noarch

Checked scenario:
1) Power off the HA VM from the engine - the engine does not start the VM (PASS)
2) Power off the HA VM from the guest OS - the engine does not start the VM (PASS)
3) Power off the host where the HA VM runs via the engine power management action - the engine does not start the VM (PASS)
4) Power off the host where HA VM runs via host OS - the engine does not start the VM (PASS)

Comment 32 Petr Matyáš 2017-02-20 08:49:05 UTC
(In reply to Artyom from comment #31)
> 3) Power off the host where the HA VM runs via the engine power management
> action - the engine does not start the VM (PASS)

not? it should

> 4) Power off the host where HA VM runs via host OS - the engine does not
> start the VM (PASS)

not? it should

Comment 33 Artyom 2017-02-20 09:36:33 UTC
Copy-paste mistake; the corrected results:
Checked scenario:
1) Power off the HA VM from the engine - the engine does not start the VM (PASS)
2) Power off the HA VM from the guest OS - the engine does not start the VM (PASS)
3) Power off the host where the HA VM runs via the engine power management action - the engine starts the VM (PASS)
4) Power off the host where HA VM runs via host OS - the engine starts the VM (PASS)

Comment 34 Petr Matyáš 2017-02-20 16:38:53 UTC
Actually, I tested this just now, and when the VM didn't have the guest agent it wasn't restarted on a different host after its host was fenced.

Comment 35 Vinzenz Feenstra [evilissimo] 2017-02-21 06:40:12 UTC
(In reply to Petr Matyáš from comment #34)
> Actually I tested this right now and when the VM didn't have guest agent it
> wasn't restarted on different host after fencing of it's host.

Nothing written there says it was tested without an agent. For this feature to work, the agent is now required. Without it, the shutdown looks to VDSM like a normal VM shutdown initiated from within the VM, i.e. a 'User Shutdown', which in this case we deliberately handle by not restarting the HA VM. (This is how we did it in the past and we have to keep this behavior; I actually argued against that, but well.)

Anyway, the behavior you (Petr) tested is exactly how it is expected to be now.
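
For reference, on a RHEL 7 guest the agent from the builds listed in comment 23 is installed and enabled like this (package and service names as shipped there; other distributions such as Ubuntu use different package names):

   # yum install -y ovirt-guest-agent-common
   # systemctl enable ovirt-guest-agent
   # systemctl start ovirt-guest-agent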

Comment 36 Yaniv Kaul 2017-02-21 07:13:11 UTC
Do we have any integration with pvpanic?

Comment 37 Vinzenz Feenstra [evilissimo] 2017-02-21 07:15:56 UTC
Well this is the job of libvirt, not ours:

"pvpanic device is a simulated ISA device, through which a guest panic
event is sent to qemu, and a QMP event is generated. This allows
management apps (e.g. libvirt) to be notified and respond to the event.

The management app has the option of waiting for GUEST_PANICKED events,
and/or polling for guest-panicked RunState, to learn when the pvpanic
device has fired a panic event."

So I will boldly say yes, and if it doesn't work it's a regression.
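
For what it's worth, a guest panic can be watched from the host with plain virsh (a sketch; it assumes the domain has a <panic> device defined, and the exact event wording depends on the libvirt version):

   # virsh event <vm-name> --event lifecycle --loop
   # virsh domstate --reason <vm-name>

On a guest panic, the first command should print a lifecycle 'Crashed Panicked' event, and the second should report 'crashed (panicked)'.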

Comment 38 Yaniv Kaul 2017-02-21 08:16:09 UTC
(In reply to Vinzenz Feenstra [evilissimo] from comment #37)
> Well this is the job of libvirt, not ours:
> 
> "pvpanic device is a simulated ISA device, through which a guest panic
> event is sent to qemu, and a QMP event is generated. This allows
> management apps (e.g. libvirt) to be notified and respond to the event.
> 
> The management app has the option of waiting for GUEST_PANICKED events,
> and/or polling for guest-panicked RunState, to learn when the pvpanic
> device has fired a panic event."

> We (VDSM) are the management application. Where do we act on such an event?
> 
> So I will boldly say yes and if it doesn't work its a regression.

Comment 39 Vinzenz Feenstra [evilissimo] 2017-02-21 08:30:58 UTC
(In reply to Yaniv Kaul from comment #38)
> (In reply to Vinzenz Feenstra [evilissimo] from comment #37)
> > Well this is the job of libvirt, not ours:
> > 
> > "pvpanic device is a simulated ISA device, through which a guest panic
> > event is sent to qemu, and a QMP event is generated. This allows
> > management apps (e.g. libvirt) to be notified and respond to the event.
> > 
> > The management app has the option of waiting for GUEST_PANICKED events,
> > and/or polling for guest-panicked RunState, to learn when the pvpanic
> > device has fired a panic event."
> 
> We (VDSM) are the management application. Where do we act on such event?
> > 
> > So I will boldly say yes and if it doesn't work its a regression.

Well, there's no explicit handling of VIR_DOMAIN_EVENT_CRASHED.

However, this is off-topic for this particular BZ; if you have questions about the support for this, please take them to devel. Thanks.

Comment 40 Michal Skrivanek 2017-02-24 08:14:41 UTC
(In reply to Vinzenz Feenstra [evilissimo] from comment #39)

> However this is off topic for this particular BZ if you have questions about
> the support for this please move it to devel thanks.

pvpanic is not related here; this is a host-side termination signal sent to qemu, which qemu and libvirt do not report in a way that lets us distinguish a clean from an unclean shutdown (bug 1384007 and bug 1418927).
The other alternative is to redefine the HA feature to always restart the VM, no matter the reason.

Vinzenz, please also look into possibly adjusting the default OS configuration to stop systemd from acting up.

Comment 42 deepak 2017-09-12 01:01:23 UTC
Hey guys,

I am also observing the same issue on RHEVM 4.1.5. Is this fix included in 4.1.5?
I have Ubuntu guest VMs; does the guest agent need to be installed on all guest VMs for HA to work?

Thanks & Regards,
Deepak