Bug 816893 - qemu-ga: commands may fail before a 'guest-ping'
Summary: qemu-ga: commands may fail before a 'guest-ping'
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
Version: 6.3
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Luiz Capitulino
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Duplicates: 826696
Depends On:
Blocks: 804141 820481 822062 831387
 
Reported: 2012-04-27 09:10 UTC by Qunfang Zhang
Modified: 2013-02-21 07:51 UTC (History)

Fixed In Version: qemu-kvm-0.12.1.2-2.297.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-02-21 07:34:30 UTC
Target Upstream Version:
Embargoed:


Attachments
when second time do S4 (24.63 KB, image/png)
2012-10-10 02:52 UTC, langfang


Links
System: Red Hat Product Errata
ID: RHBA-2013:0527
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: qemu-kvm bug fix and enhancement update
Last Updated: 2013-02-20 21:51:08 UTC

Description Qunfang Zhang 2012-04-27 09:10:16 UTC
Description of problem:
Boot a guest and start the "qemu-ga" service inside it. Then send some commands to the guest, for example {"execute":"guest-suspend-ram"} or {"execute":"guest-suspend-disk"}. Instead of suspending the guest and returning {"return": {}}, the agent returns an "Unsupported" error.
If I then send {"execute":"guest-ping"} and re-send {"execute":"guest-suspend-ram"}, it works.
This issue happens when qemu-ga is started as a service inside the guest.
If I run "#qemu-ga -m virtio-serial -p /dev/virtio-ports/org.qemu.guest_agent.0" by hand instead of starting the qemu-ga service, the problem does not occur.

Version-Release number of selected component (if applicable):
Guest: 
qemu-guest-agent-0.12.1.2-2.285.el6.x86_64
kernel-2.6.32-262.el6.x86_64

Host:
kernel-2.6.32-262.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.285.el6.x86_64
seabios-0.6.1.2-19.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot a guest with a virtio-serial port:
 /usr/libexec/qemu-kvm -M rhel6.3.0 -cpu Conroe -enable-kvm -uuid d782bf5c-e817-411b-a9cf-545ae7c0f101 -rtc base=localtime,driftfix=slew -m 8G -smp 2,sockets=1,cores=2,threads=1 -name rhel6.3-64 -drive file=/home/rhel6.3-64-virtio.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,serial=zhang,cache=none,werror=stop,rerror=stop,aio=native -device virtio-blk-pci,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,scsi=off -drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:42:0b:00,bus=pci.0,addr=0x3 -monitor stdio -boot c -qmp tcp:0:5555,server,nowait -vnc :10 -device sga -device virtio-balloon-pci,id=balloon0,bus=pci.0,id=0x6 -bios /usr/share/seabios/bios-pm.bin  -chardev socket,path=/tmp/qga.sock,server,nowait,id=qga0 -device  virtio-serial -device virtserialport,chardev=qga0,name=org.qemu.guest_agent.0


2. Install qemu-guest-agent-0.12.1.2-2.285.el6.x86_64 inside guest

3. Inside guest: #service qemu-ga start

4. On host: 
#nc -U /tmp/qga.sock
{"execute":"guest-suspend-ram"}

5. On host:
{"execute":"guest-ping"}
{"execute":"guest-suspend-ram"}
  
Actual results:
After step 4: 
{"execute":"guest-suspend-ram"}
{"error": {"class": "Unsupported", "data": {}}}

After step 5:
{"execute":"guest-ping"}
{"return": {}}
{"execute":"guest-suspend-ram"}
{"return": {}}

The guest suspends to memory correctly.

Expected results:
After step 4, guest can suspend to mem successfully.

Additional info:
If step 3 is replaced with:
#qemu-ga -m virtio-serial -p /dev/virtio-ports/org.qemu.guest_agent.0
(instead of "service qemu-ga start")
the problem goes away.
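
The host-side steps above (typing JSON into "nc -U /tmp/qga.sock") could be scripted roughly as follows. This is only a sketch: the helper names (encode_command, read_reply, send_command, connect) are hypothetical, and it assumes the agent terminates each reply with a newline, which the transcripts in this report suggest but do not state.

```python
import json
import socket


def encode_command(name, arguments=None):
    """Serialize one guest-agent request as a newline-terminated JSON line."""
    request = {"execute": name}
    if arguments is not None:
        request["arguments"] = arguments
    return (json.dumps(request) + "\n").encode("utf-8")


def read_reply(sock):
    """Read until the first newline and decode the JSON object before it.

    Assumption: one reply per line. Bytes after the first newline are dropped.
    """
    buf = b""
    while b"\n" not in buf:
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("socket closed before a full reply arrived")
        buf += chunk
    line, _, _ = buf.partition(b"\n")
    return json.loads(line.decode("utf-8"))


def send_command(sock, name, arguments=None):
    """Send one command and return the decoded reply."""
    sock.sendall(encode_command(name, arguments))
    return read_reply(sock)


def connect(path="/tmp/qga.sock", timeout=5.0):
    """Connect to the chardev socket used in the qemu-kvm command line above."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    sock.connect(path)
    return sock
```

With a running guest, step 4 would then be send_command(connect(), "guest-suspend-ram"). A suspend that succeeds may never send a reply, in which case the call ends with a socket timeout rather than a return value.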

Comment 1 Luiz Capitulino 2012-04-27 12:56:07 UTC
Is this 100% reproducible or is it difficult to reproduce? If it's difficult to reproduce, then this is likely to be bug 805533.

Comment 2 Luiz Capitulino 2012-04-27 12:57:21 UTC
By the way, the bug description says "some commands", have you tested this against other commands or does it only happen with guest-suspend-ram?

Comment 3 Alex Jia 2012-04-27 15:35:20 UTC
(In reply to comment #1)
> Is this 100% reproducible or is it difficult to reproduce? If it's difficult to
> reproduce, then this is likely to be bug 805533.

Hi Luiz,
I originally found the bug on the libvirt side (bug 766958, see comment 48). If I start qemu-ga as a service, e.g. 'service qemu-ga start', I can reproduce the issue 100% of the time.

Regards,
Alex

Comment 4 Luiz Capitulino 2012-04-27 18:20:22 UTC
Ok, I'll investigate this soon.

Comment 5 Qunfang Zhang 2012-04-28 02:56:54 UTC
(In reply to comment #1)
> Is this 100% reproducible or is it difficult to reproduce? If it's difficult to
> reproduce, then this is likely to be bug 805533.
Yes, as Alex replied, it is 100% reproducible when using the qemu-ga service.

(In reply to comment #2)
> By the way, the bug description says "some commands", have you tested this
> against other commands or does it only happen with guest-suspend-ram?

The following commands will not work before a 'guest-ping':
{"execute":"guest-suspend-ram"}
{"error": {"class": "Unsupported", "data": {}}}

{"execute":"guest-suspend-disk"}
{"error": {"class": "Unsupported", "data": {}}}

{"execute":"guest-suspend-hybrid"}
{"error": {"class": "Unsupported", "data": {}}}

{"execute":"guest-sync"}
{"error": {"class": "InvalidParameterType", "data": {"name": "id", "expected": "integer"}}}

Other supported commands work before a 'guest-ping'.
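
Note that the guest-sync reply in the list above complains about a missing integer "id", and comment 22 shows the same command succeeding once an id is supplied, so that entry looks like a malformed request rather than the "Unsupported" failure. A small sketch of building a well-formed guest-sync request and sorting replies into successes and errors follows; the function names are hypothetical and the reply shapes are taken only from the transcripts in this report.

```python
import random


def build_guest_sync(sync_id=None):
    """Build a guest-sync request carrying the integer "id" argument."""
    if sync_id is None:
        sync_id = random.randint(1, 2**31 - 1)
    return {"execute": "guest-sync", "arguments": {"id": sync_id}}


def classify_reply(reply):
    """Return ("ok", value) for a "return" reply, ("error", class) for an error."""
    if "return" in reply:
        return ("ok", reply["return"])
    if "error" in reply:
        return ("error", reply["error"].get("class"))
    raise ValueError("reply has neither 'return' nor 'error'")


def sync_matches(request, reply):
    """True if the reply echoes back the id sent in the guest-sync request."""
    return classify_reply(reply) == ("ok", request["arguments"]["id"])
```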

Comment 6 Luiz Capitulino 2012-05-02 16:45:17 UTC
Thanks a lot Qunfang for the clarification.

Haven't started looking at this yet, is this urgent?

Comment 7 Daniel Veillard 2012-05-08 07:51:03 UTC
Cf. comment 63 on bug 766958: this is annoying for QE testing of the
S4 feature, at least!

Daniel

Comment 8 Luiz Capitulino 2012-05-09 20:22:12 UTC
I've started looking at this today, so let me update you about my progress.

First, yes, it's a qemu-ga bug and I can reproduce it on RHEL6.3 and on upstream. It took me a long time to debug this because most debugging code I added made the bug go away. Even enabling logging in an upstream qemu-ga setup makes the bug go away.

One important piece of information is that the bug only happens when --daemon is passed to qemu-ga. That's probably why it works if you run it by hand. Another is that I already know what causes the Unsupported error to be returned, although I don't know why it happens nor why --daemon is related. Lastly, a call to g_logv() seems to serialize things and that's why guest-ping makes things work.

We have three options here:

1. keep investigating the real cause and propose a fix
2. (test and if it works) backport the "make guest-shutdown and guest-suspend-* synchronous" patches (not posted upstream yet though)
3. Add a hack to libvirt to issue guest-ping before calling the suspend functions

As I'm almost sure this is a race or stupid bug in bios_supports_mode() and as the series mentioned in item 2 re-works that function entirely, I really think that that's going to be the upstream fix.

In any case, I'll debug this a bit more tomorrow and could do item 1 for RHEL6.3 if it turns out to be simple.

Item 3 should be our last resort.
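
For illustration only, the item 3 workaround (ping first, then suspend) would look roughly like this on the client side. This is not libvirt code: suspend_with_ping and the send callable are hypothetical, with send standing for any function that takes a request dict and returns the agent's decoded reply.

```python
SUSPEND_MODES = ("ram", "disk", "hybrid")


def suspend_with_ping(send, mode="ram"):
    """Issue guest-ping before guest-suspend-<mode>, as in workaround item 3."""
    if mode not in SUSPEND_MODES:
        raise ValueError("unknown suspend mode: %r" % (mode,))
    ping = send({"execute": "guest-ping"})
    if "error" in ping:
        # Without a successful ping the workaround cannot help.
        raise RuntimeError("guest-ping failed: %r" % (ping["error"],))
    return send({"execute": "guest-suspend-" + mode})
```

This only hides the symptom on the client side; the agent bug itself remains.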

Comment 9 Luiz Capitulino 2012-05-09 20:32:42 UTC
Oh, knew it could be something "stupid". Looks like we're messing with fds in qemu-ga. Will only have time to fully confirm this tomorrow though.

Comment 10 Luiz Capitulino 2012-05-10 19:56:56 UTC
Yes, this is caused by a bug in how qemu-ga handles its fds. I've posted the fix upstream (will post the link here when it appears in the archive).

Now, we're past snapshot3 for RHEL6.3 and from now on only blockers will be accepted. This obviously isn't a blocker, but will certainly impact suspend testing. So I'll check if it's possible to get this into a z-stream, otherwise this will have to be moved to 6.4.
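
The thread does not spell out the fd bug, so the following is general background and not a description of the actual qemu-ga fix. One common daemon pitfall that would fit the --daemon observation is closing a descriptor outright: the kernel hands the lowest free number to the next open(), so later code can end up with a low-numbered fd it does not expect. The sketch below demonstrates that reuse and the usual alternative of pointing the descriptor at /dev/null; both function names are hypothetical.

```python
import os


def lowest_free_fd_is_reused():
    """Close one pipe end, then open /dev/null: the freed number comes back."""
    r, w = os.pipe()
    os.close(r)
    reused = os.open(os.devnull, os.O_RDONLY)
    try:
        return reused == r
    finally:
        os.close(reused)
        os.close(w)


def redirect_to_devnull(fd):
    """Point fd at /dev/null so its number stays occupied instead of free."""
    devnull = os.open(os.devnull, os.O_RDWR)
    try:
        os.dup2(devnull, fd)
    finally:
        if devnull != fd:
            os.close(devnull)
```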

Comment 12 Luiz Capitulino 2012-05-10 21:08:50 UTC
Upstream fix:

http://lists.gnu.org/archive/html/qemu-devel/2012-05/msg01507.html

Comment 14 Luiz Capitulino 2012-05-14 18:25:48 UTC
As this one has missed the deadline, I'll get it fixed for 6.4 first and then will propose it for 6.3.z.

This is not urgent, but I think it can hurt qemu-ga testing.

Comment 15 Luiz Capitulino 2012-05-31 14:22:19 UTC
*** Bug 826696 has been marked as a duplicate of this bug. ***

Comment 20 langfang 2012-10-09 10:45:14 UTC
Tested this bug with the following versions:
host:
# uname -r
2.6.32-315.el6.x86_64
rpm -q qemu-kvm
qemu-kvm-0.12.1.2-2.320.el6.x86_64
guest:
# uname -r
2.6.32-325.el6.x86_64

steps:
1.boot guest:
/usr/libexec/qemu-kvm -M rhel6.3.0 -cpu Penryn -enable-kvm -uuid `uuidgen` -rtc base=localtime,driftfix=slew -m 8G -smp 2,sockets=1,cores=2,threads=1 -name rhel6.3-64 -drive file=/home/RHEL-Server-6.3-64-virtio.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,serial=zhang,cache=none,werror=stop,rerror=stop,aio=native -device virtio-blk-pci,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,scsi=off -drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=net0,mac=04:1a:4a:42:0b:00,bus=pci.0,addr=0x3 -monitor stdio -boot c -qmp tcp:0:5555,server,nowait -vnc :10 -device sga -device virtio-balloon-pci,id=balloon0,bus=pci.0,id=0x6  -chardev socket,path=/tmp/qga.sock,server,nowait,id=qga0 -device  virtio-serial -device virtserialport,chardev=qga0,name=org.qemu.guest_agent.0  -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0
2.Inside guest: #service qemu-ga start

3. On host:
nc -U /tmp/qga.sock
{"execute":"guest-suspend-ram"}
{"error": {"class": "Unsupported", "data": {}}}
{"execute":"guest-suspend-hybrid"}
{"error": {"class": "Unsupported", "data": {}}}
{"execute":"guest-ping"}
{"return": {}}
{"execute":"guest-suspend-ram"}--->do S3
{"return": {}}
{"execute":"guest-suspend-disk"}--->do S4
{"return": {}}

4.boot guest with same CLI (guest resume from S4)
5.Inside guest: 
#service qemu-ga status
qemu-ga (pid  2227) is running ...
6.on host
# nc -U /tmp/qga.sock
{"execute":"guest-suspend-hybrid"}
{"error": {"class": "Unsupported", "data": {}}}
{"execute":"guest-suspend-ram"}
{"return": {}}
{"execute":"guest-suspend-hybrid"}
{"error": {"class": "Unsupported", "data": {}}}
{"execute":"guest-suspend-hybrid"}
{"error": {"class": "Unsupported", "data": {}}}
{"execute":"guest-suspend-ram"}--->do S3
{"return": {}}



From the above test, the issue still exists, so I am reassigning this bug.

Comment 21 Luiz Capitulino 2012-10-09 12:49:33 UTC
Which version of qemu-ga did you install in the *guest*? You should install qemu-guest-agent .297 or later.

Comment 22 langfang 2012-10-10 02:49:53 UTC
Luiz, thank you very much for the reminder.

Verified this bug again with the following versions:
host:
# uname -r
2.6.32-315.el6.x86_64
rpm -q qemu-kvm
qemu-kvm-0.12.1.2-2.320.el6.x86_64
guest:
# uname -r
2.6.32-325.el6.x86_64
qemu-guest-agent-0.12.1.2-2.321.el6.x86_64


steps:
1.boot guest 
/usr/libexec/qemu-kvm -M rhel6.3.0 -cpu Penryn -enable-kvm -uuid `uuidgen` -rtc base=localtime,driftfix=slew -m 8G -smp 2,sockets=1,cores=2,threads=1 -name rhel6.3-64 -drive file=/home/RHEL-Server-6.3-64-virtio.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,serial=zhang,cache=none,werror=stop,rerror=stop,aio=native -device virtio-blk-pci,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,scsi=off -drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=net0,mac=04:1a:4a:42:0b:00,bus=pci.0,addr=0x3 -monitor stdio -boot c -qmp tcp:0:5555,server,nowait -vnc :10 -device sga -device virtio-balloon-pci,id=balloon0,bus=pci.0,id=0x6  -chardev socket,path=/tmp/qga.sock,server,nowait,id=qga0 -device  virtio-serial -device virtserialport,chardev=qga0,name=org.qemu.guest_agent.0  -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0
2.Inside guest: #service qemu-ga start

3. On host:
# nc -U /tmp/qga.sock
{"execute":"guest-suspend-hybrid"}
{"error": {"class": "Unsupported", "data": {}}}
{"execute":"guest-sync"}
{"error": {"class": "InvalidParameterType", "data": {"name": "id", "expected": "integer"}}}
{"execute":"guest-suspend-ram"}--->do S3
{"execute":"guest-suspend-disk"}---->do S4
4.boot guest with same CLI
inside guest:
#service qemu-ga status
qemu-ga (pid  2227) is running ...

on host:
# nc -U /tmp/qga.sock
{"execute":"guest-sync","arguments":{"id":1234}}
{"return": 1234}
{"execute":"guest-suspend-hybrid"}
{"error": {"class": "Unsupported", "data": {}}}
{"execute":"guest-suspend-ram"}-->second time do S3
{"execute":"guest-suspend-disk"}--->second time do S4, guest hangs, see the attachment for what is shown on the guest


In the above test, after "service qemu-ga start" the guest suspends to mem/disk successfully the first time, but after resume, doing S4 a second time makes the guest hang.

Comment 23 langfang 2012-10-10 02:52:15 UTC
Created attachment 624476 [details]
when second time do S4

Comment 24 langfang 2012-10-10 03:11:06 UTC
When the guest hangs:

#top
Tasks: 161 total,   1 running, 160 sleeping,   0 stopped,   0 zombie
Cpu(s): 25.1%us,  0.0%sy,  0.0%ni, 74.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   7509328k total,  5932360k used,  1576968k free,    48968k buffers
Swap: 58720240k total,      628k used, 58719612k free,  3930996k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                          
 1968 root      20   0 8764m 1.5g 4408 S 101.4 21.6   2:29.89 qemu-kvm  


Additional info:
I tried replacing step 2 of comment 22 with
#qemu-ga -m virtio-serial -p /dev/virtio-ports/org.qemu.guest_agent.0
(instead of "service qemu-ga start")
and the problem also occurs.



Note: when the guest hangs, wait about 2 minutes and the guest will show a Call Trace.

Do you think we need to report another bug to track this problem?

Comment 25 langfang 2012-10-10 03:21:32 UTC
Based on the above test, I will change this bug to verified. As for the problem I hit, I will check whether there is an existing bug or whether it is a new issue. Thanks.

Comment 26 Luiz Capitulino 2012-10-10 13:36:07 UTC
Yes, the issue you found is unrelated to qemu-ga and this bug. It's either a qemu issue or a guest kernel issue.

Please open a new bz for it. I also recommend reproducing the problem without qemu-ga (i.e. doing echo mem > /sys/power/state directly), as this will make it easier to investigate the problem.
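
The direct-suspend suggestion above can also be scripted inside the guest, which may help when repeating S3/S4 cycles. A minimal sketch, assuming the sysfs file lists the supported states separated by whitespace when read; the request_suspend name and its path parameter are hypothetical, the latter existing mainly so the function can be tried against an ordinary file.

```python
def request_suspend(state="mem", path="/sys/power/state"):
    """Write a sleep state to the power-state file after checking it is listed.

    Returns the list of states the file advertised before the write.
    On a real guest this needs root, and the write blocks until resume.
    """
    with open(path) as f:
        supported = f.read().split()
    if state not in supported:
        raise ValueError("state %r not in supported states %r" % (state, supported))
    with open(path, "w") as f:
        f.write(state)
    return supported
```

Calling request_suspend("mem") corresponds to the echo command in the comment above; request_suspend("disk") would exercise S4 in the same way.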

Comment 27 langfang 2012-10-12 05:05:52 UTC
(In reply to comment #26)
> Yes, the issue you found is unrelated to qemu-ga and this bug. It's either,
> a qemu issue or a guest kernel issue.
> 
> Please, open a new bz for it. I also recommend reproducing the problem
> without qemu-ga (ie. doing echo mem > /sys/power/state directly), as this
> will make it easier to investigate the problem.

Hi Luiz, thanks very much for your suggestion. I have opened a new bug for this issue: https://bugzilla.redhat.com/show_bug.cgi?id=864780.

Comment 29 errata-xmlrpc 2013-02-21 07:34:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0527.html

