Bug 1049860
Summary: | Guest agent command hangs after restoring the guest from a save file
---|---
Product: | Red Hat Enterprise Linux 7
Component: | seabios
Version: | 7.0
Hardware: | x86_64
OS: | Unspecified
Status: | CLOSED CURRENTRELEASE
Severity: | medium
Priority: | unspecified
Reporter: | zhenfeng wang <zhwang>
Assignee: | Laszlo Ersek <lersek>
QA Contact: | Virtualization Bugs <virt-bugs>
CC: | acathrow, ajia, areis, dyuan, flang, gsun, hhuang, jdenemar, jiahu, juzhang, juzhou, marcel, mazhang, mst, mzhan, qzhang, sluo, virt-maint, xfu, ydu
Target Milestone: | rc
Keywords: | Reopened
Fixed In Version: | seabios-1.7.2.2-11.el7
Doc Type: | Bug Fix
Clone Of: | 1049858
Type: | Bug
Last Closed: | 2014-06-13 09:48:06 UTC
Description
zhenfeng wang
2014-01-08 11:37:16 UTC
Created attachment 847524 [details]: libvirtd.log copied from bug 1049858

This seems to be a guest or guest-agent issue. Let's close this clone. See https://bugzilla.redhat.com/show_bug.cgi?id=1049858#c3 for more details.

*** This bug has been marked as a duplicate of bug 1049858 ***

As noted in bug 1049858, this issue can be reproduced with both RHEL-6 and RHEL-7 guests, so I'm reopening this bug (and moving it to qemu-kvm for further investigation). I expect the issue to be similar in both cases, but I feel it's best to track it in both products separately, as the resolution may differ. Relevant comments copied from bug 1049858:

bug 1049858#c3: According to the libvirt logs, qemu-agent responded to the "guest-sync" command, and libvirt is waiting for the "guest-suspend-ram" command to either return an error or result in a suspended domain. This is a known issue with the qemu-agent design and our interaction with it, which is covered by bug 1028927. The question is why qemu-agent does not report any error while still failing to actually suspend the guest. I'm moving this bug to qemu-kvm for further investigation. BTW, what OS runs in the guest? And does changing it (as in RHEL-6 vs. RHEL-7) make any difference?

bug 1049858#c3: My guest OS is RHEL-6, and I can also reproduce this issue in RHEL-7 with a RHEL-7 guest. I will attach the RHEL-7 libvirtd log later.

So my first question here would be whether the actual guest command (i.e. suspending the domain from the inside) has any relevance. The steps in comment 0 say:

(1) S3 suspend/resume from the inside [qga]
(2) dump/restore
(3) S3 suspend/resume from the inside [qga]

What happens if (1) and (3) are replaced with another qga command?

Second, if S3 is indeed needed to reproduce the bug, then for another test we should just execute (1) and (2), then log into the guest via the normal graphical console and run pm-suspend manually, and see how that works. If it fails, then we might immediately have a qemu bug related to dump/restore.

Third, I'll have to run gdb in the guest, likely...

Narrowing it down a little bit: I configured the domain XML so that libvirt sets up an agent channel for me but doesn't use it (name='org.qemu.guest_agent.1'). I connected to it with socat:

socat unix-connect:/var/lib/libvirt/qemu/seabios.rhel6.agent readline

Then I changed the test to:

1a. Send the guest to sleep:

{"execute":"guest-ping"}
{"return": {}}
{"execute":"guest-suspend-ram"}
<no answer, guest is suspended>

1b. Wake up the guest (from a separate root shell):

virsh qemu-monitor-command seabios.rhel6 --hmp system_wakeup

2a. Save the guest as before:

virsh save seabios.rhel6 /tmp/seabios.rhel6.save --verbose

At this point the qemu process exits, and so does the socat process (seeing EOF on the unix domain socket).

2b. Restore the guest as before:

virsh restore /tmp/seabios.rhel6.save ; rm -f /tmp/seabios.rhel6.save

3a. Reconnect:

socat unix-connect:/var/lib/libvirt/qemu/seabios.rhel6.agent readline

3b. Try to send the guest to sleep again:

{"execute":"guest-ping"}
{"return": {}}
{"execute":"guest-suspend-ram"}
<no answer, guest continues to run>

That is, the virtio-serial line is alive, and the guest agent is running and can communicate; "only" the guest-suspend-ram command fails (without any answer).

4. Try to suspend the guest from an in-guest root shell: this guest doesn't have the "pm-suspend" command installed (pm-utils package), and the "pm-is-supported" utility is also not available. So the question becomes: why and how did the suspend work in step 1a at all?

OK, so the guest agent reads /sys/power/state for the supported suspend (Sx) states, and if "mem" is supported (and pm-utils is absent), then qga facilitates S3 by writing "mem" back into /sys/power/state.
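That sysfs fallback boils down to a single write. A minimal C sketch of the mechanism, for illustration only (qga's real implementation, in qga/commands-posix.c, additionally detects pm-utils, forks, and reports errors over the channel):

#include <stdio.h>
#include <string.h>

/* Trigger S3 the way qga's fallback does: check that the kernel
 * advertises "mem" in /sys/power/state, then write "mem" back.
 * Equivalent to: echo -n mem > /sys/power/state */
static int sysfs_suspend_ram(void)
{
    char states[256] = "";
    FILE *f = fopen("/sys/power/state", "r");

    if (!f)
        return -1;
    if (!fgets(states, sizeof(states), f)) {
        fclose(f);
        return -1;
    }
    fclose(f);

    if (!strstr(states, "mem"))    /* S3 not supported by this kernel */
        return -1;

    f = fopen("/sys/power/state", "w");
    if (!f)
        return -1;
    fputs("mem", f);
    return fclose(f) ? -1 : 0;     /* the write takes effect on flush/close */
}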
So, in step 4 above, I tried to do just that, manually, in the guest, with

echo -n mem >/sys/power/state

And what happens is, the guest goes to sleep, but it also wakes up *immediately*. (This is visible in dmesg and /var/log/messages.) All the while qemu doesn't seem to emit any events -- at least libvirt doesn't log any. I don't think that the guest kernel is the culprit; I suspect the ACPI emulation code in qemu more. Something likely goes wrong during save/restore.

In addition, if in this state of the guest I initiate a guest shutdown by issuing "shutdown -h now" at the guest root prompt, then the services are brought down correctly, but the *final* ACPI act of powering off the VM doesn't succeed: the qemu process stays running.

In the following tests,

"suspend" always means:  guest# echo -n mem >/sys/power/state
"resume" always means:   host# virsh qemu-monitor-command seabios.rhel6 --hmp system_wakeup
"save" always means:     host# virsh save seabios.rhel6 /tmp/seabios.rhel6.save --verbose
"restore" always means:  host# virsh restore /tmp/seabios.rhel6.save ; rm -f /tmp/seabios.rhel6.save

Tests (full shutdown between each of the three):

suspend, resume, suspend, resume: PASS
save, restore, suspend, resume: PASS
suspend, resume, save, restore, suspend, XXXXXX: FAIL

The second suspend in the third test fails to suspend the VM.

In the failing state (i.e. after suspend/resume/save/restore), the ACPI PM1a control block simply doesn't exist: writes to it don't trap to the correct handler function (i.e. acpi_pm_cnt_write()). Diffing the output of "info mtree" between "right after startup" and "in the failing state":

--- when-started 2014-01-24 16:27:38.381024937 +0100
+++ when-broken 2014-01-24 16:26:30.418657946 +0100
@@ -5,7 +5,6 @@
   00000000000c0000-00000000000c3fff (prio 1, R-): alias pam-rom @pc.ram 00000000000c0000-00000000000c3fff
   00000000000c4000-00000000000c7fff (prio 1, R-): alias pam-rom @pc.ram 00000000000c4000-00000000000c7fff
   00000000000c8000-00000000000cbfff (prio 1, R-): alias pam-rom @pc.ram 00000000000c8000-00000000000cbfff
-  00000000000ca000-00000000000ccfff (prio 1000, RW): alias kvmvapic-rom @pc.ram 00000000000ca000-00000000000ccfff
   00000000000cc000-00000000000cffff (prio 1, R-): alias pam-rom @pc.ram 00000000000cc000-00000000000cffff
   00000000000d0000-00000000000d3fff (prio 1, RW): alias pam-ram @pc.ram 00000000000d0000-00000000000d3fff
   00000000000d4000-00000000000d7fff (prio 1, RW): alias pam-ram @pc.ram 00000000000d4000-00000000000d7fff
@@ -61,10 +60,6 @@
   000000000000ae00-000000000000ae0e (prio 0, RW): apci-pci-hotplug
   000000000000af00-000000000000af1f (prio 0, RW): apci-cpu-hotplug
   000000000000afe0-000000000000afe3 (prio 0, RW): apci-gpe0
-  000000000000b000-000000000000b03f (prio 0, RW): piix4-pm
-    000000000000b000-000000000000b003 (prio 0, RW): acpi-evt
-    000000000000b004-000000000000b005 (prio 0, RW): acpi-cnt
-    000000000000b008-000000000000b00b (prio 0, RW): acpi-tmr
   000000000000b100-000000000000b13f (prio 0, RW): pm-smbus
   000000000000c000-000000000000c03f (prio 1, RW): virtio-pci
   000000000000c040-000000000000c05f (prio 1, RW): uhci

No idea why "kvmvapic-rom" is gone, and it's probably not important for now. However, the entire "piix4-pm" block is gone; it is otherwise configured by the piix4_pm_initfn() function [hw/acpi/piix4.c] and the functions it calls.
It looks like after the first suspend/resume, either savevm does something wrong (it doesn't dump the piix4_pm vmstate), or it is dumped in such a form that loadvm can't restore it. Comparing the relevant parts of the vmstate files (they start at different offsets, but we care about piix4_pm only):

--- after-start 2014-01-24 17:16:08.508147474 +0100
+++ after-suspend-resume 2014-01-24 17:16:08.020144606 +0100
@@ -1,23 +1,23 @@
 00 1f 15 30 30 30 30 3a 30 30 3a 30 31 2e 33 2f |...0000:00:01.3/|
 70 69 69 78 34 5f 70 6d 00 00 00 00 00 00 00 03 |piix4_pm........|
 00 00 00 02 86 80 13 71 03 01 80 02 03 00 80 06 |.......q........|
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
 f4 1a 00 11 00 00 00 00 00 00 00 00 00 00 00 00 |ô...............|
 09 01 00 00 01 b0 00 00 00 00 00 00 00 00 00 00 |.....°..........|
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 02 |................|
 00 00 00 10 00 00 00 60 00 00 00 08 00 00 00 00 |.......`........|
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
-00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 |................|
+00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
 00 00 00 00 01 b1 00 00 00 00 00 00 00 00 00 00 |.....ą..........|
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
 00 00 00 00 00 00 09 00 00 00 00 00 00 00 00 00 |................|
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
-00 00 00 00 00 01 01 20 00 01 f1 00 ff ff ff ff |....... ..ń.˙˙˙˙|
-ff ff ff ff 00 00 00 00 00 00 00 00 00 00 ff ff |˙˙˙˙..........˙˙|
+00 00 00 00 00 01 01 20 04 01 f1 00 ff ff ff ff |....... ..ń.˙˙˙˙|
+ff ff ff ff 00 00 00 00 05 80 00 00 00 00 ff ff |˙˙˙˙..........˙˙|
 00 00 00 f8 00 00 00 00 04 00 00 00 20 11 30 30 |...ř........ .00|
 30 30 3a 30 30 3a 30 31 2e 32 2f 75 68 63 69 00 |00:00:01.2/uhci.|

Theory:

vmstate_acpi_post_load()
  pm_io_space_update()
    memory_region_set_enabled(&s->io, s->dev.config[0x80] & 1);

and in the diff above, the first difference is a 01 vs. 00 byte. This could be explained by the following:

(1) The initial suspend/resume pair includes a system reset (this is how resume starts). At this point:

piix4_reset()
  pci_conf[0x80] = 0;

Now, in RHEL-7 we don't yet have Michael's upstream commit

commit c046e8c4a26c902ca1b4f5bdf668a2da6bc75f54
Author: Michael S. Tsirkin <mst>
Date:   Wed Sep 11 13:33:31 2013 +0300

    piix4: disable io on reset

because this commit would *immediately* kill off the PM1a control block (by calling pm_io_space_update()). So that control block remains enabled in the guest, which is a bug in itself, but anyway, this is what happens in RHEL-7 now. Then,

(2) When the guest is saved to a file, this pci_conf[0x80] byte is saved (with contents 0, due to the reset in (1)).

(3) When the guest is reloaded from the file, the pci_conf[0x80] byte is loaded too (with contents 0), and at this time we *do* call pm_io_space_update(), from vmstate_acpi_post_load(). Hence the PM1a control block disappears.
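Put together, the qemu-side sequence described in (1)-(3) looks roughly like this (a condensed sketch built from the hw/acpi/piix4.c fragments quoted above; not verbatim qemu code):

/* PMREGMISC (PCI config offset 0x80), bit 0 = PM I/O space enable. */
static void pm_io_space_update(PIIX4PMState *s)
{
    /* When bit 0 is clear, the 64-byte "piix4-pm" region (containing
     * acpi-evt, acpi-cnt, acpi-tmr) drops out of the I/O address space,
     * so guest writes to PM1a_CNT never reach acpi_pm_cnt_write(). */
    memory_region_set_enabled(&s->io, s->dev.config[0x80] & 1);
}

static void piix4_reset(void *opaque)
{
    PIIX4PMState *s = opaque;

    /* Cleared by the reset that starts every resume. Without commit
     * c046e8c4, pm_io_space_update() is NOT called here, so the region
     * stays mapped while the config byte already says "disabled"... */
    s->dev.config[0x80] = 0;
}

static int vmstate_acpi_post_load(void *opaque, int version_id)
{
    PIIX4PMState *s = opaque;

    /* ...but savevm records config[0x80] == 0, and on loadvm this
     * post-load hook honors the stale value and unmaps the region:
     * the PM1a control block disappears. */
    pm_io_space_update(s);
    return 0;
}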
So, the situation is as follows (to be verified, of course):

- We need to backport Michael's upstream commit c046e8c4. At first this will only make things worse, because even after the first suspend/resume pair (i.e. step (1)), the PM1a control block will be absent, and even directly subsequent suspend/resume attempts won't work.

- We need to *unbreak* the whole thing in SeaBIOS (--> new BZ), by backporting Marcel's upstream SeaBIOS patch

commit 40d020f56226aee7c75a6c29f471c4b866765732
Author: Marcel Apfelbaum <marcel.a>
Date:   Wed Jan 15 14:20:06 2014 +0200

    resume: restore piix pm config registers after resume

I'm going to test this theory now.

It suffices to backport Marcel's SeaBIOS commit 40d020f5. Michael's qemu commit c046e8c4 *depends* on the former, and it improves qemu's correctness, but we don't need it to fix this BZ.
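The idea of the SeaBIOS fix is to re-program the PIIX4 PM config registers on the S3 resume path, undoing the piix4_reset() clearing described above, so that the next savevm records the PM I/O space as enabled again. Roughly along these lines (a paraphrased sketch, assuming SeaBIOS's pci_config_* helpers and the PIIX4 register layout, PMBA at config offset 0x40 and PMREGMISC at 0x80; not the verbatim commit):

/* Called on the S3 resume path for the PIIX4 PM function (00:01.3):
 * qemu's reset has just cleared these registers, so restore them.
 * 0xb000 matches the "piix4-pm" I/O block seen in "info mtree". */
static void piix4_pm_config_setup(u16 bdf)
{
    /* PMBA: PM I/O base address (bit 0 is hardwired to 1) */
    pci_config_writel(bdf, 0x40, 0xb000 | 1);
    /* PMREGMISC: set bit 0 (PMIOSE) to re-enable the PM I/O space,
     * bringing back acpi-evt/acpi-cnt/acpi-tmr at 0xb000 */
    pci_config_writeb(bdf, 0x80, 0x01);
}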
Fix included in seabios-1.7.2.2-11.el7

Reproducing this bug:

Host:
qemu-kvm-1.5.3-45.el7.x86_64
kernel-3.10.0-84.el7.x86_64
seabios-1.7.2.2-10.el7.x86_64

Guest:
kernel-3.10.0-64.el7.x86_64

Steps:

1. Start the VM with the following domain XML:

<domain type='kvm'>
  <name>vm1</name>
  <uuid>ce397040-fbe3-40e8-9301-30d8d8d9c387</uuid>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
  <vcpu placement='static'>4</vcpu>
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.0.0'>hvm</type>
    <loader>/usr/share/seabios/bios.bin</loader>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <pm>
    <suspend-to-mem enabled='yes'/>
    <suspend-to-disk enabled='yes'/>
  </pm>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source file='/home/rhel7-64.raw'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </disk>
    <controller type='usb' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'/>
    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:bc:2f:12'/>
      <source bridge='switch'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target port='0'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/vm1.agent'/>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='tablet' bus='usb'/>
    <input type='mouse' bus='ps2'/>
    <graphics type='spice' autoport='yes' listen='0.0.0.0'>
      <listen type='address' address='0.0.0.0'/>
    </graphics>
    <video>
      <model type='qxl' ram='65536' vram='65536' heads='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </memballoon>
  </devices>
</domain>

2. Install qemu-ga inside the guest and start the service:

# systemctl start qemu-guest-agent.service

3. Run the following commands:

virsh # start vm1
Domain vm1 started

virsh # list
 Id    Name                           State
----------------------------------------------------
 8     vm1                            running

virsh # dompmsuspend vm1 --target mem
Domain vm1 successfully suspended

virsh # dompmwakeup vm1
Domain vm1 successfully woken up

virsh # save vm1 /tmp/vm1.save
Domain vm1 saved to /tmp/vm1.save

virsh # restore /tmp/vm1.save
Domain restored from /tmp/vm1.save

virsh # dompmsuspend vm1 --target mem
^C

Result: the virsh command line and the guest hang when suspending the VM for the second time.

Updated to the latest seabios package and re-tested this problem:

Host:
qemu-kvm-1.5.3-45.el7.x86_64
kernel-3.10.0-84.el7.x86_64
seabios-1.7.2.2-11.el7.x86_64

Guest:
kernel-3.10.0-64.el7.x86_64

Result:

virsh # start vm1
Domain vm1 started

virsh # dompmsuspend vm1 --target mem
Domain vm1 successfully suspended

virsh # dompmwakeup vm1
Domain vm1 successfully woken up

virsh # save vm1 /tmp/vm1.save
Domain vm1 saved to /tmp/vm1.save

virsh # restore /tmp/vm1.save
Domain restored from /tmp/vm1.save

virsh # dompmsuspend vm1 --target mem
Domain vm1 successfully suspended

virsh # dompmwakeup vm1
Domain vm1 successfully woken up

The virsh commands complete normally and the guest works well, so this bug has been fixed.

This request was resolved in Red Hat Enterprise Linux 7.0. Contact your manager or support representative in case you have further questions about the request.