Bug 970161

Summary: qemu-guest-agent-win32 crashes on "guest-suspend-disk"
Product: Red Hat Enterprise Linux 6 Reporter: Peter Krempa <pkrempa>
Component: qemu-kvm Assignee: Gal Hammer <ghammer>
Status: CLOSED NEXTRELEASE QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: urgent    
Version: 6.4 CC: acathrow, areis, bsarathy, chayang, cwei, dyuan, gsun, juzhang, lersek, michen, mkenneth, mprivozn, mzhan, pkrempa, qzhang, sluo, virt-bugs, virt-maint, zhwang
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Windows   
Whiteboard:
Fixed In Version: qemu-ga-win-6.5-4 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 890648 Environment:
Last Closed: 2013-12-24 13:37:01 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 896690, 890648, 1080376    

Description Peter Krempa 2013-06-03 15:00:35 UTC
Description of problem:
The Windows guest agent crashes when the "guest-suspend-disk" command is issued. Libvirt then blocks because it expects a reply.

Version-Release number of selected component (if applicable):
qemu-guest-agent-win32-0.12.1.2-2.371

How reproducible:
100%

Steps to reproduce:
1.) Run a Windows guest with the guest agent channel configured
2.) Start the guest agent (ideally in verbose mode)
3.) Issue the guest-suspend-disk command, which is also possible via virsh:
virsh qemu-agent-command guest '{"execute":"guest-suspend-disk"}' --timeout 5
(there is a native suspend api, but the libvirtd daemon will hang on the command that did not return)
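
The reproduction step above can also be scripted without virsh. The following is a minimal Python sketch, assuming the agent channel is exposed to the host as a Unix stream socket and that the agent terminates each reply with a newline. The helper names `build_qga_command` and `send_qga_command` are hypothetical and are not part of libvirt or QEMU. The sender returns None when no reply arrives within the timeout, which is how a crashed agent would appear from the host side.

```python
import json
import socket


def build_qga_command(name, arguments=None):
    """Serialize a guest agent command as one newline-terminated JSON line."""
    message = {"execute": name}
    if arguments is not None:
        message["arguments"] = arguments
    return (json.dumps(message) + "\n").encode("utf-8")


def send_qga_command(sock, name, arguments=None, timeout=5.0):
    """Send a command on a connected socket; return the reply or None on timeout."""
    sock.settimeout(timeout)
    sock.sendall(build_qga_command(name, arguments))
    buf = b""
    try:
        # Assumption: one reply per command, terminated by a newline.
        while b"\n" not in buf:
            chunk = sock.recv(4096)
            if not chunk:  # peer closed the channel without replying
                return None
            buf += chunk
    except socket.timeout:
        return None
    return json.loads(buf.split(b"\n", 1)[0].decode("utf-8"))
```

To use it against a real guest, pass a `socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)` that has been connected to the path of the agent channel. A None result only shows that no reply came back in time; it does not by itself distinguish a crash from a guest that suspended before answering.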

Additional info:
I can try to catch a memory dump of the process in windows if required.

+++ This bug was initially created as a clone of Bug #890648 +++

Description of problem:
Fail to execute S3/S4 operations on a Windows guest that is running the guest agent service
Version-Release number of selected component (if applicable):
libvirt-0.10.2-13.el6.x86_64
kernel-2.6.32-348.el6.x86_64  
qemu-kvm-0.12.1.2-2.346.el6.x86_64

How reproducible:
100%
1 Install the virtio-win-1.5.4-1.el6.noarch pkg to get the virtio-serial and spice+qxl drivers
The virtio-serial driver was in
# ls /usr/share/virtio-win/virtio-win-1.5.4.iso
/usr/share/virtio-win/virtio-win-1.5.4.iso

The spice+qxl driver was in
# ls /usr/share/virtio-win/   -----you need to make an ISO file from this directory
# mkisofs -o /var/lib/libvirt/images/virtiowin.iso /usr/share/virtio-win/

2 Prepare a windows guest with the virtio-serial and spice+qxl driver installed
# virsh dumpxml win7x86
  <domain type='kvm'>
  <name>win7-32</name>
  <uuid>ad61420e-b3c6-b50e-16ab-73009cbf9b6d</uuid>
  <memory unit='KiB'>1048576</memory>
  <currentMemory unit='KiB'>1048576</currentMemory>
  <vcpu placement='static'>1</vcpu>
  <os>
    <type arch='i686' machine='rhel6.4.0'>hvm</type>
    <loader>/usr/share/seabios/bios.bin</loader>
    <boot dev='hd'/>
  </os>

---
  <pm>
    <suspend-to-mem enabled='yes'/>
    <suspend-to-disk enabled='yes'/>
  </pm>
---
  <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/win7-32.agent'/>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <channel type='spicevmc'>
      <target type='virtio' name='com.redhat.spice.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='2'/>
    </channel>
    <graphics type='spice' autoport='yes'/>
    <video>
      <model type='qxl' vram='65536' heads='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>

----

3 Install the qemu-guest-agent-win32-0.12.1.2-2.346.el6.x86_64 on a rhel host and get the executable
# ll /usr/share/qemu-kvm/qemu-ga-win32/
total 464
-rwxr-xr-x. 1 root root 467160 Dec 14 12:23 qemu-ga.exe
-r--r--r--. 1 root root   1155 Dec 14 12:16 README.txt

4 Install the qemu-ga service in guest
Create a folder named qemu-ga in the Windows guest, then copy qemu-ga.exe and the three DLL files listed in README.txt into that folder
#c:\qemu-ga> dir
qemu-ga.exe
iconv.dll
libglib-2.0.0.dll
libintl-8.dll
README.txt

#c:\qemu-ga\qemu-ga.exe --service install

5 Check the qemu-ga service status with the command services.msc
#c:\services.msc
6 Perform the S3/S4 operations from the host
# time virsh dompmsuspend win7-32 --target mem
Domain win7-32 successfully suspended

real    10m17.578s
user    0m0.030s
sys    0m0.041s

# virsh domstate --reason win7-32
shut off (shutdown)

libvirtd did not hang, but the domain S3 failed

# time virsh dompmsuspend win7-32 --target disk
Domain win7-32 successfully suspended

real    10m20.538s
user    0m0.036s
sys    0m0.043s

# virsh domstate --reason win7-32
shut off (shutdown)

libvirtd did not hang, but the domain S4 failed

So libvirt support for S3/S4 on Windows still fails.

Try with qemu qmp command
# virsh start win7-32
Domain win7-32 started

# ps aux|grep qemu
qemu     20399 45.8  0.0 1603380 39544 ?       Sl   06:17   0:02
/usr/libexec/qemu-kvm -name win7-32 -S -M rhel6.4.0 -enable-kvm -m 1024
-smp 1,sockets=1,cores=1,threads=1 -uuid
752d19d2-6b22-d848-52c8-66c0a0b7891a -nodefconfig -nodefaults -chardev
socket,id=charmonitor,path=/var/lib/libvirt/qemu/win7-32.monitor,server,nowait
-mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc
-no-shutdown -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0
-device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device
virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive
file=/var/lib/libvirt/images/win7-32.img,if=none,id=drive-ide0-0-0,format=raw,cache=none
-device
ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1
-drive
file=/usr/share/virtio-win/virtio-win-1.5.4.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw
-device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0
-netdev tap,fd=29,id=hostnet0 -device
rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:01:57:87,bus=pci.0,addr=0x3
-chardev pty,id=charserial0 -device
isa-serial,chardev=charserial0,id=serial0 -chardev
spicevmc,id=charchannel0,name=vdagent -device
virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0
-chardev
socket,id=charchannel1,path=/var/lib/libvirt/qemu/win7-32.agent,server,nowait
-device
virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0
-spice port=5900,addr=127.0.0.1,disable-ticketing,seamless-migration=on
-vga qxl -global qxl-vga.vram_size=67108864 -device
intel-hda,id=sound0,bus=pci.0,addr=0x4 -device
hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device
virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6

# nc -U /var/lib/libvirt/qemu/win7-32.agent
{ "execute": "guest-ping"}

No response here.

So this could be a problem with the libvirt XML settings.

But starting the same image directly with a qemu command line succeeds:
/usr/libexec/qemu-kvm -M rhel6.4.0 -enable-kvm -m 2048 -smp 2,sockets=2,cores=1,threads=1 -enable-kvm -name win7-32 -uuid 61b6c504-5a8b-4fe1-8347-6c929b750dde -k en-us -rtc base=localtime,clock=host,driftfix=slew -no-kvm-pit-reinjection -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device usb-tablet,id=input0 -drive file=/var/lib/libvirt/images/win7-32.img,if=none,id=disk0,format=raw,werror=stop,rerror=stop,aio=native -device ide-drive,bus=ide.0,unit=1,drive=disk0,id=disk0  -monitor stdio -qmp tcp:0:6666,server,nowait -chardev socket,path=/tmp/isa-serial,server,nowait,id=isa1 -device isa-serial,chardev=isa1,id=isa-serial1 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x8 -chardev socket,id=charchannel0,path=/tmp/serial-socket,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm -chardev socket,path=/tmp/foo,server,nowait,id=foo -device virtconsole,chardev=foo,id=console0 -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x9 -spice port=5930,disable-ticketing -vga qxl -global qxl-vga.vram_size=67108864 -k en-us -boot c -chardev socket,path=/tmp/qga.sock,server,nowait,id=qga0 -device virtserialport,bus=virtio-serial0.0,chardev=qga0,name=org.qemu.guest_agent.0  -global  PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0

# nc -U /tmp/qga.sock
 { "execute": "guest-ping"}
 {"return": {}}
 { "execute": "guest-sync-delimited", "arguments": { "id": 123456 } }
{"return": 123456}
 { "execute": "guest-suspend-ram" } or { "execute": "guest-suspend-disk" }
 The guest resumes successfully from S3 with (qemu) system_wakeup, and also resumes successfully from S4.

Actual results:
The S3/S4 virsh commands hang while the qemu-ga service is running in the Windows guest

Expected results:
S3/S4 operations should succeed for a Windows guest that is running the guest agent service
Additional info:
The S3/S4 operations work fine with a RHEL guest; the Windows guest's XML uses the same configuration as the RHEL guest.

Comment 4 Ademar Reis 2013-06-07 16:30:15 UTC
Please retest once the fix for bug 962669 gets in.

Comment 5 Qunfang Zhang 2013-06-08 07:27:24 UTC
(In reply to Ademar de Souza Reis Jr. from comment #4)
> Please retest once the fix for bug 962669 gets in.

Ok, that is also QE's plan :). By the way, Ademar, since this bug cannot be reproduced with a direct qemu command line (even the original reporter could not reproduce it that way), why was it cloned to qemu-kvm, and what do we plan to do about it?

Thanks,
Qunfang

Comment 6 Ademar Reis 2013-06-10 13:50:07 UTC
Peter: we can't reproduce it using qemu directly. Can you post your full qemu command line used to start the guest? Is it different from the original bug (before the clone?)

BTW, in the original bug, the reporter tested the problem with nc, sending the commands manually. Can you do the same? Thanks.

Comment 7 Laszlo Ersek 2013-07-15 16:29:45 UTC
Per comment 4, bug 962669 is in MODIFIED status, please retest. Thanks.

Comment 8 Sibiao Luo 2013-08-30 02:55:42 UTC
(In reply to Laszlo Ersek from comment #7)
> Per comment 4, bug 962669 is in MODIFIED status, please retest. Thanks.
Tried qemu-kvm-0.12.1.2-2.381.el6.x86_64 and could reproduce this issue: after requesting S4, the guest fails to hibernate and the "guest-suspend-disk" command hangs. S3 and resume work successfully.

host info:
# uname -r && rpm -q qemu-kvm
2.6.32-414.el6.x86_64
qemu-kvm-0.12.1.2-2.381.el6.x86_64
guest info:
win7 64bit

# /usr/libexec/qemu-kvm -M rhel6.4.0 -enable-kvm -m 4096 -smp 4,sockets=2,cores=2,threads=1 -no-kvm-pit-reinjection -uuid 350e716b-5f98-4bf0-9a2a-c8e423295244 -usb -device usb-tablet,id=input0 -rtc base=localtime,clock=host,driftfix=slew -drive file=/home/win7-64.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup -device e1000,netdev=hostnet0,id=virtio-net-pci0,mac=08:2E:5F:0A:1D:B1,bus=pci.0,addr=0x5 -device virtio-balloon-pci,id=ballooning,bus=pci.0,addr=0x6 -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -k en-us -boot menu=on -qmp tcp:0:4444,server,nowait -serial unix:/tmp/ttyS0,server,nowait -spice port=5930,disable-ticketing -vga qxl -global qxl-vga.vram_size=67108864 -monitor stdio -device virtio-serial-pci,id=virtio-serial0,max_ports=16,vectors=0,bus=pci.0,addr=0x3 -chardev socket,path=/tmp/qga.sock,server,nowait,id=qga0 -device virtserialport,chardev=qga0,name=org.qemu.guest_agent.0

- in guest side:
C:\qemu-ga>dir
 Volume in drive C is OS_Install
 Volume Serial Number is B43D-B05C

 Directory of C:\qemu-ga

08/29/2013  10:26 PM    <DIR>          .
08/29/2013  10:26 PM    <DIR>          ..
08/29/2013  07:16 PM             1,023 a.txt.txt
06/05/2013  08:37 PM           640,512 iconv.dll
06/05/2013  08:37 PM           288,856 libgcc_s_sjlj-1.dll
08/29/2013  02:45 AM           896,628 libglib-2.0-0.dll
08/29/2013  02:45 AM           923,354 libiconv-2.dll
08/29/2013  02:45 AM           278,271 libintl-8.dll
08/29/2013  02:44 AM            55,060 libssp-0.dll
08/29/2013  07:26 PM           592,424 qemu-ga.exe
06/05/2013  08:37 PM             1,155 README.txt
               9 File(s)      3,677,283 bytes
               2 Dir(s)  15,443,693,568 bytes free

C:\qemu-ga>qemu-ga.exe --service uninstall
Service was deleted successfully.

C:\qemu-ga>qemu-ga.exe --service install
** (qemu-ga.exe:2184): DEBUG: service's cmdline: "C:\qemu-ga\qemu-ga.exe" -d
Service was installed successfully.

C:\qemu-ga>net start qemu-ga

The QEMU Guest Agent service was started successfully.

- in host side:
# nc -U /tmp/qga.sock
{"execute": "guest-ping"}
{"return": {}}
{"execute": "guest-info"}
{"return": {"version": "0.12.1", "supported_commands": [{"enabled": true, "name": "guest-set-vcpus"}, {"enabled": true, "name": "guest-get-vcpus"}, {"enabled": true, "name": "guest-network-get-interfaces"}, {"enabled": true, "name": "guest-suspend-hybrid"}, {"enabled": true, "name": "guest-suspend-ram"}, {"enabled": true, "name": "guest-suspend-disk"}, {"enabled": true, "name": "guest-fstrim"}, {"enabled": true, "name": "guest-fsfreeze-thaw"}, {"enabled": true, "name": "guest-fsfreeze-freeze"}, {"enabled": true, "name": "guest-fsfreeze-status"}, {"enabled": true, "name": "guest-file-flush"}, {"enabled": true, "name": "guest-file-seek"}, {"enabled": true, "name": "guest-file-write"}, {"enabled": true, "name": "guest-file-read"}, {"enabled": true, "name": "guest-file-close"}, {"enabled": true, "name": "guest-file-open"}, {"enabled": true, "name": "guest-shutdown"}, {"enabled": true, "name": "guest-info"}, {"enabled": true, "name": "guest-set-time"}, {"enabled": true, "name": "guest-get-time"}, {"enabled": true, "name": "guest-ping"}, {"enabled": true, "name": "guest-sync"}, {"enabled": true, "name": "guest-sync-delimited"}]}}

{"execute": "guest-sync-delimited", "arguments": {"id": 123456}}
�{"return": 123456}

{"execute": "guest-suspend-ram"}
             <----------can do S3 and can system_wakeup successfully.
{"execute": "guest-suspend-disk"}
             <----------fail to do S4, and command will hang.
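
The unprintable byte shown before {"return": 123456} in the session above is most likely the 0xFF sentinel that the agent prepends to a guest-sync-delimited reply so the host can discard stale data left in the channel. The following is a minimal Python sketch of how a client might resynchronize on that byte; the function name `parse_sync_delimited` is hypothetical, and the sketch assumes the whole reply has already been read into one bytes object.

```python
import json

SENTINEL = b"\xff"  # byte the agent emits before a guest-sync-delimited reply


def parse_sync_delimited(raw, expected_id):
    """Drop everything up to the last 0xFF byte, then verify the echoed id."""
    _, found, tail = raw.rpartition(SENTINEL)
    if not found:
        raise ValueError("no 0xFF sentinel in reply")
    reply = json.loads(tail.decode("utf-8"))
    if reply.get("return") != expected_id:
        raise ValueError("stale reply: %r" % (reply,))
    return reply


# Leftover bytes from an earlier exchange are ignored.
print(parse_sync_delimited(b'junk\xff{"return": 123456}\n', 123456))
```

Synchronizing this way before sending guest-suspend-disk helps confirm that a later missing reply belongs to the suspend command and not to leftover channel data.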

BTW, if tried to do S3/S4 in guest directly(Press Sleep/Hibernate), it can do it and resume it successfully.

Best Regards,
sluo

Comment 9 Sibiao Luo 2013-08-30 02:59:15 UTC
FYI, Bug 888694 - Windows guest agent service has to be restarted to make it work again after resume from S3/S4

Maybe not the same, but we could check.

Comment 10 Gal Hammer 2013-10-01 14:17:24 UTC
I'm not sure if the virtio-serial driver that was included in virtio-win-1.5.4-1.el6 fully supported the S3/S4 states. Did you try to reproduce with a newer driver?

Comment 11 Peter Krempa 2013-10-03 14:09:42 UTC
Well, as I reported originally, the guest agent was crashing for me before even entering the S3/S4 states, so this really doesn't have anything to do with the virtio-win driver, which was performing well at that stage.

Comment 12 RHEL Program Management 2013-10-14 03:26:10 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 13 Gal Hammer 2013-12-22 13:20:54 UTC
I'm unable to reproduce with qemu-ga-win-6.5-4. Can you please confirm? Thanks.

Comment 14 Qunfang Zhang 2013-12-23 02:25:43 UTC
Hi, Sibiao

Could you help reply comment 13?  Thanks!

Comment 15 Sibiao Luo 2013-12-24 02:42:45 UTC
(In reply to Gal Hammer from comment #13)
> I'm unable to reproduce with qemu-ga-win-6.5-4. Can you please confirm?
> Thanks.
Retried with the latest qemu-ga-win-6.5-5 using the same steps as comment #8. Both guest-suspend-ram and guest-suspend-disk work and the guest resumes successfully, so this issue is gone.

e.g:/usr/libexec/qemu-kvm...-device virtio-serial-pci,id=virtio-serial0,max_ports=16,vectors=0,bus=pci.0,addr=0x3 -chardev socket,path=/tmp/qga.sock,server,nowait,id=qga0 -device virtserialport,chardev=qga0,name=org.qemu.guest_agent.0

host info:
# uname -r && rpm -q qemu-kvm-rhev
2.6.32-425.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.415.el6.x86_64
guest info:
win7-64bit
virtio-win-prewhql-0.1-74 (virtio-serial)
qemu-ga-win-6.5-5         (qemu-ga)

# nc -U /tmp/qga.sock
{"execute": "guest-ping"}
{"return": {}}
{"execute": "guest-info"}
{"return": {"version": "0.12.1", "supported_commands": [{"enabled": true, "name": "guest-set-vcpus"}, {"enabled": true, "name": "guest-get-vcpus"}, {"enabled": true, "name": "guest-network-get-interfaces"}, {"enabled": true, "name": "guest-suspend-hybrid"}, {"enabled": true, "name": "guest-suspend-ram"}, {"enabled": true, "name": "guest-suspend-disk"}, {"enabled": true, "name": "guest-fstrim"}, {"enabled": true, "name": "guest-fsfreeze-thaw"}, {"enabled": true, "name": "guest-fsfreeze-freeze"}, {"enabled": true, "name": "guest-fsfreeze-status"}, {"enabled": true, "name": "guest-file-flush"}, {"enabled": true, "name": "guest-file-seek"}, {"enabled": true, "name": "guest-file-write"}, {"enabled": true, "name": "guest-file-read"}, {"enabled": true, "name": "guest-file-close"}, {"enabled": true, "name": "guest-file-open"}, {"enabled": true, "name": "guest-shutdown"}, {"enabled": true, "name": "guest-info"}, {"enabled": true, "name": "guest-set-time"}, {"enabled": true, "name": "guest-get-time"}, {"enabled": true, "name": "guest-ping"}, {"enabled": true, "name": "guest-sync"}, {"enabled": true, "name": "guest-sync-delimited"}]}}
{"execute": "guest-sync-delimited", "arguments": {"id": 123456}}
�{"return": 123456}

{"execute": "guest-suspend-ram"}
             <----------can do S3 and can be resumed successfully.
{"execute": "guest-suspend-disk"}
             <----------can do S4 and can be resumed successfully.
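
The guest-info reply above lists each command with an "enabled" flag. The following is a minimal Python sketch of checking that list before issuing a suspend request; the function name `enabled_commands` is hypothetical, and the sample reply is trimmed from the transcript in this report.

```python
import json


def enabled_commands(guest_info_reply):
    """Return the set of command names the agent reports as enabled."""
    commands = guest_info_reply["return"]["supported_commands"]
    return {c["name"] for c in commands if c.get("enabled")}


# Sample reply, shortened from the guest-info output in this report.
sample = json.loads(
    '{"return": {"version": "0.12.1", "supported_commands": ['
    '{"enabled": true, "name": "guest-suspend-ram"}, '
    '{"enabled": true, "name": "guest-suspend-disk"}]}}'
)
print("guest-suspend-disk" in enabled_commands(sample))
```

The failing build in comment 8 also reported guest-suspend-disk as enabled, so this check only shows that the command is advertised, not that it works.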

Based on the above, I suggest we close this as CURRENTRELEASE. Please correct me if I made any mistake, thanks.

Best Regards,
sluo