Bug 1049858 - Guest agent command hangs after restoring the guest from the save file
Summary: Guest agent command hangs after restoring the guest from the save file
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
Version: 6.5
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Ademar Reis
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: 912287
 
Reported: 2014-01-08 11:33 UTC by zhenfeng wang
Modified: 2014-06-09 14:59 UTC
CC: 19 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1049860
Environment:
Last Closed: 2014-06-05 22:16:45 UTC
Target Upstream Version:
Embargoed:


Attachments
Attachment 847502: The libvirtd log while the guest agent hangs (3.90 MB, text/plain), 2014-01-09 07:16 UTC, zhenfeng wang
Attachment 847551: The libvirtd log while the guest agent hangs on a RHEL 7 host (4.43 MB, text/plain), 2014-01-09 10:18 UTC, zhenfeng wang

Description zhenfeng wang 2014-01-08 11:33:58 UTC
Description of problem:
Guest agent command hangs after restoring the guest from the save file.
Version:
virtio-win-1.6.7-2.el6.noarch
qemu-kvm-rhev-0.12.1.2-2.415.el6_5.3.x86_64
kernel-2.6.32-432.el6.x86_64
libvirt-0.10.2-29.el6_5.2.x86_64

1. # getenforce
Enforcing

2. Prepare a guest with a qemu-ga environment and add the configuration below to the domain XML:
...
<pm>
    <suspend-to-mem enabled='yes'/>
    <suspend-to-disk enabled='yes'/>
</pm>
...
<channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/r6.agent'/>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
</channel>
...

# virsh list --all
 Id    Name                           State
----------------------------------------------------
 30    r6                             running
3. Run the following commands; the guest agent command hangs:
# virsh dompmsuspend r6 --target mem
Domain r6 successfully suspended
# virsh dompmwakeup r6
Domain r6 successfully woken up
# virsh save r6 /tmp/r6.save

Domain r6 saved to /tmp/r6.save

# virsh restore /tmp/r6.save 
Domain restored from /tmp/r6.save

# virsh dompmsuspend r6 --target mem
^C                                                                <======hung here.
# virsh save r6 /tmp/r6.save
error: Failed to save domain r6 to /tmp/r6.save
error: Timed out during operation: cannot acquire state change lock
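
A quick way to double-check at this point that the agent channel itself is unresponsive (a rough sketch; the 10-second timeout is an arbitrary choice):
# virsh qemu-agent-command r6 --timeout 10 '{"execute":"guest-ping"}'
With a healthy agent this returns {"return":{}} almost immediately; here it would presumably time out.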

Comment 1 Jiri Denemark 2014-01-08 12:23:14 UTC
Can you provide libvirtd logs?

Comment 2 zhenfeng wang 2014-01-09 07:16:38 UTC
Created attachment 847502 [details]
The libvirtd log while the guest agent hangs

Comment 3 Jiri Denemark 2014-01-09 08:26:00 UTC
According to the libvirtd logs, qemu-agent responded to the "guest-sync" command and libvirt is now waiting for the "guest-suspend-ram" command to either return an error or result in a suspended domain. This is a known issue with the qemu-agent design and our interaction with it, which is covered by bug 1028927. The question is why qemu-agent does not report any error while still failing to actually suspend the guest. I'm moving this bug to qemu-kvm for further investigation.
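
For reference, the exchange on the agent channel looks roughly like this (a sketch; the sync token is an arbitrary example value, not taken from the attached logs):

{"execute":"guest-sync", "arguments":{"id":1234567890}}
{"return": 1234567890}
{"execute":"guest-suspend-ram"}

guest-suspend-ram does not send a reply on success, so libvirt can only wait for an error or for the domain to actually suspend; a guest that silently fails to suspend therefore leaves the caller hanging.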

BTW, what OS runs in the guest? And does changing it (as in RHEL6 vs. RHEL7) make any difference?

Comment 4 Jiri Denemark 2014-01-09 08:28:05 UTC
*** Bug 1049860 has been marked as a duplicate of this bug. ***

Comment 5 zhenfeng wang 2014-01-09 09:18:01 UTC
Hi Jiri,
My guest OS is RHEL 6, and I can also reproduce this issue on a RHEL 7 host with a RHEL 7 guest; I will attach the RHEL 7 libvirtd log shortly. BTW, bug 1028927 was cloned from bug 890648, and as Peter said in comment 12 of bug 890648, the issue in this bug is separate from bug 890648. So I think bug 1028927 cannot cover bug 1049860; we should still regard 1049860 as a clone of this bug and continue using it to track this issue on RHEL 7, right? Please help recheck, thanks.

Comment 6 Jiri Denemark 2014-01-09 09:36:09 UTC
Ah, thanks. In that case, I'll reopen 1049860 as this issue affects both RHEL-6 and RHEL-7 guests.

Comment 7 zhenfeng wang 2014-01-09 10:18:20 UTC
Created attachment 847551 [details]
The libvirtd log while the guest agent hangs on a RHEL 7 host

Comment 8 Sibiao Luo 2014-04-16 06:07:06 UTC
Tried this scenario using savevm/loadvm with the guest agent; the guest agent commands did not hang after restoring the guest from the snapshot.

host info:
# uname -r && rpm -q qemu-kvm-rhev
2.6.32-448.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.424.el6.x86_64
guest info:
2.6.32-448.el6.x86_64
qemu-guest-agent-0.12.1.2-2.424.el6.x86_64

Steps:
1. Launch a QEMU guest:
# /usr/libexec/qemu-kvm -M pc -S -cpu SandyBridge -enable-kvm -m 2048 -smp 2,sockets=2,cores=1,threads=1 -no-kvm-pit-reinjection -usb -device usb-tablet,id=input0 -name sluo -uuid 990ea161-6b67-47b2-b803-19fb01d30d30 -rtc base=localtime,clock=host,driftfix=slew -device virtio-serial-pci,id=virtio-serial0,max_ports=16,vectors=0,bus=pci.0,addr=0x3 -chardev socket,path=/tmp/qga.sock,server,nowait,id=qga0 -device virtserialport,chardev=qga0,name=org.qemu.guest_agent.0 -drive file=/home/RHEL6.5-20131019.1_Server_x86_64.qcow2,if=none,id=drive-virtio-disk,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop -device virtio-blk-pci,vectors=0,bus=pci.0,addr=0x4,scsi=off,drive=drive-virtio-disk,id=virtio-disk,bootindex=1 -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=00:01:02:B6:40:21,bus=pci.0,addr=0x5 -device virtio-balloon-pci,id=ballooning,bus=pci.0,addr=0x6 -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -k en-us -boot menu=on -qmp tcp:0:4444,server,nowait -serial unix:/tmp/ttyS0,server,nowait -vnc :2 -spice disable-ticketing,port=5932 -monitor stdio
2. Install the guest agent RPM and start the qemu-ga service in the guest:
# service qemu-ga restart
Stopping qemu-ga:                                          [  OK  ]
Starting qemu-ga:                                          [  OK  ]
3. Check whether the guest agent works:
# nc -U /tmp/qga.sock
{"execute":"guest-ping"}
{"return": {}}
4. Save a snapshot with savevm:
(qemu) savevm snap1
(qemu) info status 
VM status: running
5. Check whether the guest agent still works:
{"execute":"guest-ping"}
{"return": {}}
6. Load the snapshot with loadvm:
(qemu) loadvm snap1
inputs_detach_tablet: 
(qemu) info status 
VM status: running
7. Check whether the guest agent still works:
{"execute":"guest-ping"}
{"return": {}}
{"execute":"guest-shutdown"}
       <----- the guest shuts down correctly.

Best Regards,
sluo

Comment 9 Sibiao Luo 2014-04-16 06:55:35 UTC
Also tried a Windows guest (Win7 64-bit); the result is the same as in comment #8.

areis, is virsh save/restore equivalent to QEMU savevm/loadvm? I cannot reproduce this issue with the QEMU command line.

host info:
# uname -r && rpm -q qemu-kvm-rhev
2.6.32-448.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.424.el6.x86_64
guest info:
win7 64bit
qemu-ga-win-7.0-8
virtio-win-prewhql-0.1-79


Best Regards,
sluo

Comment 10 Ademar Reis 2014-04-16 14:41:48 UTC
(In reply to Sibiao Luo from comment #9)
> Also tried a Windows guest (Win7 64-bit); the result is the same as in comment #8.
> 
> areis, is virsh save/restore equivalent to QEMU savevm/loadvm? I cannot
> reproduce this issue with the QEMU command line.

Jiri, can you please tell us what's the difference between the direct save/restore procedure (by hand) and the save/restore feature from virsh?

> 
> host info:
> # uname -r && rpm -q qemu-kvm-rhev
> 2.6.32-448.el6.x86_64
> qemu-kvm-rhev-0.12.1.2-2.424.el6.x86_64
> guest info:
> win7 64bit
> qemu-ga-win-7.0-8
> virtio-win-prewhql-0.1-79
>

Comment 11 Jiri Denemark 2014-04-18 10:54:26 UTC
savevm/loadvm is used by libvirt for snapshots, which is a bit different. The virsh save/restore commands use migration: save migrates the domain to a file, and restore takes the file and feeds it to a new qemu process via the -incoming option.
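
Roughly, and simplified (the fd name and number here are illustrative; libvirt also stores its own header and the domain XML at the start of the save file):

save:    libvirt opens the target file, passes the fd to QEMU, and issues
         {"execute":"getfd", "arguments":{"fdname":"migrate"}}
         {"execute":"migrate", "arguments":{"uri":"fd:migrate"}}
restore: libvirt starts a fresh QEMU with the saved stream on an inherited fd, e.g.
         /usr/libexec/qemu-kvm <original command line> -incoming fd:24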

Comment 12 Sibiao Luo 2014-04-24 09:27:44 UTC
(In reply to Jiri Denemark from comment #11)
> savevm/loadvm is used by libvirt for doing snapshots, which is a bit
> different.
Yes, savevm/loadvm is just for snapshots.
> The save/restore virsh commands make use of migration. That is,
> save will migrate the domain to a file and restore will take the file and
> feed it to a new qemu process with -incoming option.

I also tried offline migration, saving the VM state to an external file and loading it back (see the steps below), with both RHEL 6 and Win7 64-bit guests, and the guest agent still works after the offline migration. So I don't think this is a qemu-kvm issue; maybe zhwang used the wrong qemu-ga-win package.

host info:
# uname -r && rpm -q qemu-kvm-rhev
2.6.32-448.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.424.el6.x86_64
guest info:
rhel6: kernel-2.6.32-448.el6.x86_64
win7 64bit
qemu-ga-win-7.0-8
virtio-win-prewhql-0.1-79

1. Boot up a KVM guest with the guest agent:
e.g.:...-device virtio-serial-pci,id=virtio-serial0,max_ports=16,vectors=0,bus=pci.0,addr=0x3 -chardev socket,path=/tmp/qga.sock,server,nowait,id=qga0 -device virtserialport,chardev=qga0,name=org.qemu.guest_agent.0
2. Check whether the guest agent works:
# nc -U /tmp/qga.sock
{"execute":"guest-ping"}
3. Save the VM state into a compressed file:
(qemu) stop
(qemu) info status 
VM status: paused
(qemu) migrate "exec:gzip -c > /home/sluo.gz"                        
4. Load the VM state from the compressed file:
<qemu-command-line> -incoming "exec: gzip -c -d /home/sluo.gz"
(qemu) info status 
VM status: running
5. Check whether the guest agent works:
# nc -U /tmp/qga.sock
{"execute":"guest-ping"}
{"execute":"guest-shutdown"}

Results:
After step 2, the guest agent works correctly:
# nc -U /tmp/qga.sock
{"execute":"guest-ping"}
{"return": {}}
After step 5, the guest agent still works correctly:
# nc -U /tmp/qga.sock
{"execute":"guest-ping"}
{"return": {}}
{"execute":"guest-shutdown"}
            <------- the guest shuts down successfully.

Best Regards,
sluo

Comment 13 Sibiao Luo 2014-04-24 09:35:08 UTC
(In reply to zhenfeng wang from comment #0)
> Version:
> virtio-win-1.6.7-2.el6.noarch
Maybe the qemu-ga-win package (virtio-win-1.6.7-2.el6) you used was too old; there have been a lot of changes to qemu-ga-win recently, so you should use the latest qemu-ga-win package for your regular libvirt testing, e.g.:
https://brewweb.devel.redhat.com/packageinfo?packageID=44209
Please refer to comment #12, and could you check whether it works well with the latest qemu-ga-win package above? Thanks.
> qemu-kvm-rhev-0.12.1.2-2.415.el6_5.3.x86_64
> kernel-2.6.32-432.el6.x86_64
> libvirt-0.10.2-29.el6_5.2.x86_64

Best Regards,
sluo

Comment 14 zhenfeng wang 2014-04-25 03:28:09 UTC
Hi Sibiao,
Maybe this bug is not related to the virtio-win package, since I originally hit this issue with a RHEL guest, and I can also reproduce it on the libvirt side with a RHEL 6 guest on a RHEL 6 host. The following are my reproduction steps.

pkg info
libvirt-0.10.2-29.el6_5.7.x86_64
kernel-2.6.32-431.14.1.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.424.el6.x86_64
qemu-guest-agent-0.12.1.2-2.424.el6.x86_64

steps
1. Prepare a guest with a qemu-ga environment and start the guest:
# virsh start rhel6n

# ps aux|grep rhel6n
qemu     25257  1.9  0.8 1729292 288924 ?      Sl   11:21   0:05 /usr/libexec/qemu-kvm -name rhel6n -S -M rhel6.5.0 -enable-kvm -m 1024 -realtime mlock=off -smp 2,maxcpus=3,sockets=3,cores=1,threads=1 -uuid ce3a1930-0c66-9229-8cf8-6aad9da6ade1 -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/rhel6n.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=/var/lib/libvirt/images/rhel6n.img,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=26,id=hostnet0,vhost=on,vhostfd=27 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:8a:31:05,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -chardev socket,id=charchannel1,path=/var/lib/libvirt/qemu/rhel6n.agent,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0 -device usb-tablet,id=input0 -spice port=5900,addr=127.0.0.1,disable-ticketing,seamless-migration=on -vga qxl -global qxl-vga.ram_size=67108864 -global qxl-vga.vram_size=67108864 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -incoming fd:24 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7



2. Run the following commands; the guest agent hangs after restoring the guest from the save file:

# virsh dompmsuspend rhel6n --target mem
Domain rhel6n successfully suspended
# virsh list
 Id    Name                           State
----------------------------------------------------
 5     rhel6n                         pmsuspended

# virsh dompmwakeup rhel6n
Domain rhel6n successfully woken up

# virsh list
 Id    Name                           State
----------------------------------------------------
 5     rhel6n                         running

# virsh save rhel6n /tmp/rhel6n123.save

Domain rhel6n saved to /tmp/rhel6n123.save

# virsh restore /tmp/rhel6n123.save
Domain restored from /tmp/rhel6n123.save

# virsh dompmsuspend rhel6n --target mem
^C


3. Check the guest's status in another terminal:
# virsh list
 Id    Name                           State
----------------------------------------------------
 4     rhel6n                         running

4. Log in to the guest and trigger S3 from inside it; the guest wakes up automatically, immediately after the S3 operation:
guest# pm-suspend
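
Two quick in-guest checks that might help narrow down the immediate wakeup (a sketch, not output captured for this bug):
guest# cat /sys/power/state       <--- should list "mem" if S3 is exposed to the guest
guest# cat /proc/acpi/wakeup      <--- lists devices enabled as ACPI wakeup sources
A device left enabled as a wakeup source could explain the guest resuming right after pm-suspend.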

Comment 15 Sibiao Luo 2014-04-28 08:49:56 UTC
Hi Amit,
 
   Could you help check whether this bug is an ACPI problem rather than a qemu-ga issue? Thanks in advance.

Best Regards,
sluo

Comment 16 Amit Shah 2014-05-20 06:51:23 UTC
I'm sorry I haven't been able to get to this; I hope Marcel can provide more info.

Comment 17 Marcel Apfelbaum 2014-06-05 07:30:48 UTC
I am going to try to reproduce this without the qemu agent in the next few days.

Comment 18 Ademar Reis 2014-06-05 22:16:45 UTC
S3/S4 support is tech-preview in RHEL6 and it'll be promoted to fully supported
at some point, but only in RHEL7.

Therefore we're closing all S3/S4 related bugs in RHEL6. New bugs will be
considered only if they're regressions or break some important use-case or
certification.

RHEL7 is being tested more extensively, and effort from QE is underway to
certify that this particular bug is not present there.

Please reopen with a justification if you believe this bug should not be
closed. We'll consider such requests on a case-by-case basis, following a
best-effort approach.


Thank you.

Comment 19 Marcel Apfelbaum 2014-06-09 14:59:40 UTC
Since the bug is closed, I stopped looking into it.

