Bug 1049860
Summary: | Guest agent command hangs after restoring the guest from a save file
---|---
Product: | Red Hat Enterprise Linux 7
Component: | seabios
Version: | 7.0
Hardware: | x86_64
OS: | Unspecified
Status: | CLOSED CURRENTRELEASE
Severity: | medium
Priority: | unspecified
Reporter: | zhenfeng wang <zhwang>
Assignee: | Laszlo Ersek <lersek>
QA Contact: | Virtualization Bugs <virt-bugs>
CC: | acathrow, ajia, areis, dyuan, flang, gsun, hhuang, jdenemar, jiahu, juzhang, juzhou, marcel, mazhang, mst, mzhan, qzhang, sluo, virt-maint, xfu, ydu
Target Milestone: | rc
Keywords: | Reopened
Fixed In Version: | seabios-1.7.2.2-11.el7
Doc Type: | Bug Fix
Clone Of: | 1049858
Type: | Bug
Last Closed: | 2014-06-13 09:48:06 UTC
Description
zhenfeng wang
2014-01-08 11:37:16 UTC
Created attachment 847524 [details]: libvirtd.log copied from bug 1049858

This seems to be a guest or guest-agent issue. Let's close this clone. See https://bugzilla.redhat.com/show_bug.cgi?id=1049858#c3 for more details.

*** This bug has been marked as a duplicate of bug 1049858 ***

As noted in bug 1049858, this issue can be reproduced with both RHEL-6 and RHEL-7 guests, so I'm reopening this bug (and moving it to qemu-kvm for further investigation). I expect the issue to be similar in both cases, but I feel it's best to track it in both products separately, as the resolution may differ. Relevant comments copied from bug 1049858:

bug 1049858#c3: According to the libvirt logs, qemu-agent responded to the "guest-sync" command, and libvirt is waiting for the "guest-suspend-ram" command to either return an error or result in a suspended domain. This is a known issue with the qemu-agent design and our interaction with it, which is covered by bug 1028927. The question is why qemu-agent does not report any error while still failing to actually suspend the guest. I'm moving this bug to qemu-kvm for further investigation. BTW, what OS runs in the guest? And does changing it (as in RHEL-6 vs. RHEL-7) make any difference?

bug 1049858#c3: My guest OS is RHEL-6, and I can also reproduce this issue in RHEL-7 with a RHEL-7 guest. I will attach the RHEL-7 libvirtd log later.

So my first question here would be whether the actual guest command (i.e. suspending the domain from the inside) has any relevance. The steps in comment 0 say:

(1) S3 suspend/resume from the inside [qga]
(2) dump/restore
(3) S3 suspend/resume from the inside [qga]

What happens if (1) and (3) are replaced with another qga command?

Second, if S3 is indeed needed to reproduce the bug, then for another test we should just execute (1) and (2), then log into the guest via the normal graphical console and run pm-suspend manually, and see how that works. If it fails, then we might immediately have a qemu bug related to dump/restore.

Third, I'll have to run gdb in the guest, likely...

Narrowing it down a little bit: I configured the domain XML so that libvirt sets up an agent channel for me but doesn't use it (name='org.qemu.guest_agent.1'). I connected to it with socat:

socat unix-connect:/var/lib/libvirt/qemu/seabios.rhel6.agent readline

Then I changed the test to:

1a. Send the guest to sleep:

{"execute":"guest-ping"}
{"return": {}}
{"execute":"guest-suspend-ram"}
<no answer, guest is suspended>

1b. Wake up the guest (from a separate root shell):

virsh qemu-monitor-command seabios.rhel6 --hmp system_wakeup

2a. Save the guest as before:

virsh save seabios.rhel6 /tmp/seabios.rhel6.save --verbose

At this point the qemu process exits, and so does the socat process (seeing EOF on the unix domain socket).

2b. Restore the guest as before:

virsh restore /tmp/seabios.rhel6.save ; rm -f /tmp/seabios.rhel6.save

3a. Reconnect:

socat unix-connect:/var/lib/libvirt/qemu/seabios.rhel6.agent readline

3b. Try to send the guest to sleep again:

{"execute":"guest-ping"}
{"return": {}}
{"execute":"guest-suspend-ram"}
<no answer, guest continues to run>

That is, the virtio-serial line is alive, and the guest agent is running and can communicate; "only" the guest-suspend-ram command fails (without any answer).

4. Try to suspend the guest from an in-guest root shell: this guest doesn't have the "pm-suspend" command installed (pm-utils package), and the "pm-is-supported" utility is also not available. So the question becomes: why and how did the suspend work in step 1a at all?

OK, so the guest agent reads /sys/power/state for the supported suspend (Sx) states, and if "mem" is supported (and pm-utils is absent), then qga facilitates S3 by writing "mem" back into /sys/power/state.
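That sysfs fallback boils down to a single write. A minimal C sketch of the mechanism, for illustration only (qga's real implementation, in qga/commands-posix.c, additionally detects pm-utils, forks, and reports errors over the channel):

#include <stdio.h>
#include <string.h>

/* Trigger S3 the way qga's fallback does: check that the kernel
 * advertises "mem" in /sys/power/state, then write "mem" back.
 * Equivalent to: echo -n mem > /sys/power/state */
static int sysfs_suspend_ram(void)
{
    char states[256] = "";
    FILE *f = fopen("/sys/power/state", "r");

    if (!f)
        return -1;
    if (!fgets(states, sizeof(states), f)) {
        fclose(f);
        return -1;
    }
    fclose(f);

    if (!strstr(states, "mem"))    /* S3 not supported by this kernel */
        return -1;

    f = fopen("/sys/power/state", "w");
    if (!f)
        return -1;
    fputs("mem", f);
    return fclose(f) ? -1 : 0;     /* the write takes effect on flush/close */
}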
So, in step 4 above, I tried to do just that, manually, in the guest, with

echo -n mem >/sys/power/state

And what happens is, the guest goes to sleep, but it also wakes up *immediately*. (This is visible in dmesg and /var/log/messages.) All the while qemu doesn't seem to emit any events -- at least libvirt doesn't log any. I don't think that the guest kernel is the culprit; I suspect the ACPI emulation code in qemu more. Something likely goes wrong during save/restore.

In addition, if in this state of the guest I initiate a guest shutdown by issuing "shutdown -h now" at the guest root prompt, then the services are brought down correctly, but the *final* ACPI act of powering off the VM doesn't succeed: the qemu process stays running.

In the following tests,

"suspend" always means:  guest# echo -n mem >/sys/power/state
"resume" always means:   host# virsh qemu-monitor-command seabios.rhel6 --hmp system_wakeup
"save" always means:     host# virsh save seabios.rhel6 /tmp/seabios.rhel6.save --verbose
"restore" always means:  host# virsh restore /tmp/seabios.rhel6.save ; rm -f /tmp/seabios.rhel6.save

Tests (full shutdown between each of the three):

suspend, resume, suspend, resume: PASS
save, restore, suspend, resume: PASS
suspend, resume, save, restore, suspend, XXXXXX: FAIL

The second suspend in the third test fails to suspend the VM.

In the failing state (i.e. after suspend/resume/save/restore), the ACPI PM1a control block simply doesn't exist: writes to it don't trap to the correct handler function (i.e. acpi_pm_cnt_write()). Diffing the output of "info mtree" between "right after startup" and "in the failing state":

--- when-started 2014-01-24 16:27:38.381024937 +0100
+++ when-broken 2014-01-24 16:26:30.418657946 +0100
@@ -5,7 +5,6 @@
   00000000000c0000-00000000000c3fff (prio 1, R-): alias pam-rom @pc.ram 00000000000c0000-00000000000c3fff
   00000000000c4000-00000000000c7fff (prio 1, R-): alias pam-rom @pc.ram 00000000000c4000-00000000000c7fff
   00000000000c8000-00000000000cbfff (prio 1, R-): alias pam-rom @pc.ram 00000000000c8000-00000000000cbfff
-  00000000000ca000-00000000000ccfff (prio 1000, RW): alias kvmvapic-rom @pc.ram 00000000000ca000-00000000000ccfff
   00000000000cc000-00000000000cffff (prio 1, R-): alias pam-rom @pc.ram 00000000000cc000-00000000000cffff
   00000000000d0000-00000000000d3fff (prio 1, RW): alias pam-ram @pc.ram 00000000000d0000-00000000000d3fff
   00000000000d4000-00000000000d7fff (prio 1, RW): alias pam-ram @pc.ram 00000000000d4000-00000000000d7fff
@@ -61,10 +60,6 @@
   000000000000ae00-000000000000ae0e (prio 0, RW): apci-pci-hotplug
   000000000000af00-000000000000af1f (prio 0, RW): apci-cpu-hotplug
   000000000000afe0-000000000000afe3 (prio 0, RW): apci-gpe0
-  000000000000b000-000000000000b03f (prio 0, RW): piix4-pm
-    000000000000b000-000000000000b003 (prio 0, RW): acpi-evt
-    000000000000b004-000000000000b005 (prio 0, RW): acpi-cnt
-    000000000000b008-000000000000b00b (prio 0, RW): acpi-tmr
   000000000000b100-000000000000b13f (prio 0, RW): pm-smbus
   000000000000c000-000000000000c03f (prio 1, RW): virtio-pci
   000000000000c040-000000000000c05f (prio 1, RW): uhci

No idea why "kvmvapic-rom" is gone, and it's probably not important for now. However, the entire "piix4-pm" block is gone; it is otherwise configured by the piix4_pm_initfn() function [hw/acpi/piix4.c] and the functions it calls.
It looks like after the first suspend/resume, either savevm does something wrong (it doesn't dump the piix4_pm vmstate), or it is dumped in such a form that loadvm can't restore it. Comparing the relevant parts of the vmstate files (they start at different offsets, but we care about piix4_pm only):

--- after-start 2014-01-24 17:16:08.508147474 +0100
+++ after-suspend-resume 2014-01-24 17:16:08.020144606 +0100
@@ -1,23 +1,23 @@
 00 1f 15 30 30 30 30 3a 30 30 3a 30 31 2e 33 2f |...0000:00:01.3/|
 70 69 69 78 34 5f 70 6d 00 00 00 00 00 00 00 03 |piix4_pm........|
 00 00 00 02 86 80 13 71 03 01 80 02 03 00 80 06 |.......q........|
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
 f4 1a 00 11 00 00 00 00 00 00 00 00 00 00 00 00 |ô...............|
 09 01 00 00 01 b0 00 00 00 00 00 00 00 00 00 00 |.....°..........|
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 02 |................|
 00 00 00 10 00 00 00 60 00 00 00 08 00 00 00 00 |.......`........|
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
-00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 |................|
+00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
 00 00 00 00 01 b1 00 00 00 00 00 00 00 00 00 00 |.....ą..........|
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
 00 00 00 00 00 00 09 00 00 00 00 00 00 00 00 00 |................|
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
-00 00 00 00 00 01 01 20 00 01 f1 00 ff ff ff ff |....... ..ń.˙˙˙˙|
-ff ff ff ff 00 00 00 00 00 00 00 00 00 00 ff ff |˙˙˙˙..........˙˙|
+00 00 00 00 00 01 01 20 04 01 f1 00 ff ff ff ff |....... ..ń.˙˙˙˙|
+ff ff ff ff 00 00 00 00 05 80 00 00 00 00 ff ff |˙˙˙˙..........˙˙|
 00 00 00 f8 00 00 00 00 04 00 00 00 20 11 30 30 |...ř........ .00|
 30 30 3a 30 30 3a 30 31 2e 32 2f 75 68 63 69 00 |00:00:01.2/uhci.|

Theory:

vmstate_acpi_post_load()
  pm_io_space_update()
    memory_region_set_enabled(&s->io, s->dev.config[0x80] & 1);

and in the diff above, the first difference is a 01 vs. 00 byte. This could be explained by the following:

(1) The initial suspend/resume pair includes a system reset (this is how resume starts). At this point:

piix4_reset()
  pci_conf[0x80] = 0;

Now, in RHEL-7 we don't yet have Michael's upstream commit

commit c046e8c4a26c902ca1b4f5bdf668a2da6bc75f54
Author: Michael S. Tsirkin <mst>
Date:   Wed Sep 11 13:33:31 2013 +0300

    piix4: disable io on reset

because this commit would *immediately* kill off the PM1a control block (by calling pm_io_space_update()). So that control block remains enabled in the guest, which is a bug in itself, but anyway, this is what happens in RHEL-7 now. Then,

(2) When the guest is saved to a file, this pci_conf[0x80] byte is saved (with contents 0, due to the reset in (1)).

(3) When the guest is reloaded from the file, the pci_conf[0x80] byte is loaded too (with contents 0), and at this time we *do* call pm_io_space_update(), from vmstate_acpi_post_load(). Hence the PM1a control block disappears.
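Put together, the qemu-side sequence described in (1)-(3) looks roughly like this (a condensed sketch built from the hw/acpi/piix4.c fragments quoted above; not verbatim qemu code):

/* PMREGMISC (PCI config offset 0x80), bit 0 = PM I/O space enable. */
static void pm_io_space_update(PIIX4PMState *s)
{
    /* When bit 0 is clear, the 64-byte "piix4-pm" region (containing
     * acpi-evt, acpi-cnt, acpi-tmr) drops out of the I/O address space,
     * so guest writes to PM1a_CNT never reach acpi_pm_cnt_write(). */
    memory_region_set_enabled(&s->io, s->dev.config[0x80] & 1);
}

static void piix4_reset(void *opaque)
{
    PIIX4PMState *s = opaque;

    /* Cleared by the reset that starts every resume. Without commit
     * c046e8c4, pm_io_space_update() is NOT called here, so the region
     * stays mapped while the config byte already says "disabled"... */
    s->dev.config[0x80] = 0;
}

static int vmstate_acpi_post_load(void *opaque, int version_id)
{
    PIIX4PMState *s = opaque;

    /* ...but savevm records config[0x80] == 0, and on loadvm this
     * post-load hook honors the stale value and unmaps the region:
     * the PM1a control block disappears. */
    pm_io_space_update(s);
    return 0;
}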
So, the situation is as follows (to be verified, of course):

- We need to backport Michael's upstream commit c046e8c4. At first this will only make things worse, because even after the first suspend/resume pair (i.e. step (1)), the PM1a control block will be absent, and even directly subsequent suspend/resume attempts won't work.

- We need to *unbreak* the whole thing in SeaBIOS (--> new BZ), by backporting Marcel's upstream SeaBIOS patch

commit 40d020f56226aee7c75a6c29f471c4b866765732
Author: Marcel Apfelbaum <marcel.a>
Date:   Wed Jan 15 14:20:06 2014 +0200

    resume: restore piix pm config registers after resume

I'm going to test this theory now.

It suffices to backport Marcel's SeaBIOS commit 40d020f5. Michael's qemu commit c046e8c4 *depends* on the former, and it improves qemu's correctness, but we don't need it to fix this BZ.
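The idea of the SeaBIOS fix is to re-program the PIIX4 PM config registers on the S3 resume path, undoing the piix4_reset() clearing described above, so that the next savevm records the PM I/O space as enabled again. Roughly along these lines (a paraphrased sketch, assuming SeaBIOS's pci_config_* helpers and the PIIX4 register layout, PMBA at config offset 0x40 and PMREGMISC at 0x80; not the verbatim commit):

/* Called on the S3 resume path for the PIIX4 PM function (00:01.3):
 * qemu's reset has just cleared these registers, so restore them.
 * 0xb000 matches the "piix4-pm" I/O block seen in "info mtree". */
static void piix4_pm_config_setup(u16 bdf)
{
    /* PMBA: PM I/O base address (bit 0 is hardwired to 1) */
    pci_config_writel(bdf, 0x40, 0xb000 | 1);
    /* PMREGMISC: set bit 0 (PMIOSE) to re-enable the PM I/O space,
     * bringing back acpi-evt/acpi-cnt/acpi-tmr at 0xb000 */
    pci_config_writeb(bdf, 0x80, 0x01);
}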
Fix included in seabios-1.7.2.2-11.el7

Reproducing this bug:

Host:
qemu-kvm-1.5.3-45.el7.x86_64
kernel-3.10.0-84.el7.x86_64
seabios-1.7.2.2-10.el7.x86_64

Guest:
kernel-3.10.0-64.el7.x86_64

Steps:

1. Start the VM with the following domain XML:

<domain type='kvm'>
  <name>vm1</name>
  <uuid>ce397040-fbe3-40e8-9301-30d8d8d9c387</uuid>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
  <vcpu placement='static'>4</vcpu>
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.0.0'>hvm</type>
    <loader>/usr/share/seabios/bios.bin</loader>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <pm>
    <suspend-to-mem enabled='yes'/>
    <suspend-to-disk enabled='yes'/>
  </pm>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source file='/home/rhel7-64.raw'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </disk>
    <controller type='usb' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'/>
    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:bc:2f:12'/>
      <source bridge='switch'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target port='0'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/vm1.agent'/>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='tablet' bus='usb'/>
    <input type='mouse' bus='ps2'/>
    <graphics type='spice' autoport='yes' listen='0.0.0.0'>
      <listen type='address' address='0.0.0.0'/>
    </graphics>
    <video>
      <model type='qxl' ram='65536' vram='65536' heads='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </memballoon>
  </devices>
</domain>

2. Install qemu-ga inside the guest and start the service:

# systemctl start qemu-guest-agent.service

3. Run the following commands:

virsh # start vm1
Domain vm1 started

virsh # list
 Id    Name                           State
----------------------------------------------------
 8     vm1                            running

virsh # dompmsuspend vm1 --target mem
Domain vm1 successfully suspended

virsh # dompmwakeup vm1
Domain vm1 successfully woken up

virsh # save vm1 /tmp/vm1.save
Domain vm1 saved to /tmp/vm1.save

virsh # restore /tmp/vm1.save
Domain restored from /tmp/vm1.save

virsh # dompmsuspend vm1 --target mem
^C

Result: the virsh command line and the guest hang when suspending the VM for the second time.

Updated to the latest seabios package and re-tested this problem:

Host:
qemu-kvm-1.5.3-45.el7.x86_64
kernel-3.10.0-84.el7.x86_64
seabios-1.7.2.2-11.el7.x86_64

Guest:
kernel-3.10.0-64.el7.x86_64

Result:

virsh # start vm1
Domain vm1 started

virsh # dompmsuspend vm1 --target mem
Domain vm1 successfully suspended

virsh # dompmwakeup vm1
Domain vm1 successfully woken up

virsh # save vm1 /tmp/vm1.save
Domain vm1 saved to /tmp/vm1.save

virsh # restore /tmp/vm1.save
Domain restored from /tmp/vm1.save

virsh # dompmsuspend vm1 --target mem
Domain vm1 successfully suspended

virsh # dompmwakeup vm1
Domain vm1 successfully woken up

The virsh commands complete normally and the guest works well, so this bug has been fixed.

This request was resolved in Red Hat Enterprise Linux 7.0. Contact your manager or support representative in case you have further questions about the request.