Bug 1191796

Summary: After S4 succeeds, starting the guest to resume it fails when VNC+QXL is configured
Product: Red Hat Enterprise Linux 7
Reporter: vivian zhang <vivianzhang>
Component: qemu-kvm-rhev
Assignee: Gerd Hoffmann <kraxel>
Status: CLOSED WONTFIX
QA Contact: Virtualization Bugs <virt-bugs>
Severity: low
Docs Contact:
Priority: low
Version: 7.1
CC: areis, dyuan, hhuang, huding, jdenemar, juzhang, mazhang, mzhan, rbalakri, virt-maint, xfu, zhwang
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-04-15 19:35:21 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 923626
Attachments:
guest xml (none)
qemu log (none)
libvirtd.log (none)
/var/log/messages (none)
guest console log (none)
qemu log with time stamps matching libvirtd.log (none)

Description vivian zhang 2015-02-12 02:06:11 UTC
Description:
Configure VNC+QXL for a guest. After S4 completes successfully, starting the guest to resume it from S4 fails.

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.1.2-23.el7.x86_64
3.10.0-229.el7.x86_64
libvirt-1.2.8-16.el7.x86_64
qemu-guest-agent-2.1.0-4.el7.x86_64


How reproducible:
100%

Steps:
1. Prepare a healthy guest with S3/S4 enabled:
# virsh dumpxml rl
...
<pm>
    <suspend-to-mem enabled='yes'/>
    <suspend-to-disk enabled='yes'/>
  </pm>
...
<channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channel/target/rl.org.qemu.guest_agent.0'/>
      <target type='virtio' name='org.qemu.guest_agent.0' state='connected'/>
      <alias name='channel1'/>
      <address type='virtio-serial' controller='0' bus='0' port='2'/>
    </channel>
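
Since virsh dompmsuspend goes through the guest agent, it is worth confirming that the channel above is actually live first. A minimal check, assuming qemu-guest-agent is running inside the guest (the empty return below is the expected success output):

# virsh qemu-agent-command rl '{"execute":"guest-ping"}'
{"return":{}}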


2. Configure VNC+QXL for the guest:
# virsh dumpxml rl
...
<graphics type='vnc' port='5901' autoport='yes' listen='127.0.0.1'>
      <listen type='address' address='127.0.0.1'/>
    </graphics>
    <sound model='ich6'>
      <alias name='sound0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </sound>
    <video>
      <model type='qxl' ram='65536' vram='65536' vgamem='16384' heads='1'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
...    
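
To double-check that the VNC+QXL combination took effect, the relevant elements can be filtered out of the live XML (the grep pattern and context size are just a convenience, not a requirement):

# virsh dumpxml rl | grep -E -A 2 '<(graphics|video)'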

3. Start the guest:
# virsh list
 Id    Name                           State
----------------------------------------------------
 20    rl                             running

4. Do S4 for this guest; it succeeds and the guest shuts off:
# virsh dompmsuspend rl --target disk
Domain rl successfully suspended

# virsh list --all
 Id    Name                           State
----------------------------------------------------
 -     rl                             shut off

5. Resume the guest from S4: virsh start reports success, but after a while the guest is still shut off (see the event-watching sketch below):
# virsh start rl
Domain rl started

# virsh list --all
 Id    Name                           State
----------------------------------------------------
 -     rl                             shut off
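
To see how the domain ends up shut off even though virsh start reports success, the lifecycle events can be watched from a second terminal. A sketch, assuming this libvirt version already provides virsh event; the exact event text may differ:

# virsh event rl --event lifecycle --loop
event 'lifecycle' for domain rl: Started Booted
event 'lifecycle' for domain rl: Stopped Failed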

6. Check the guest status:
# virsh domstate rl --reason
shut off (crashed)

7. Start the guest again; this time it works well.

8. The issue only hits with an IDE disk; the same scenario works well with a virtio disk (see the sketch below).
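
A quick way to check which bus the disk uses (the hda/ide line below is what the failing configuration looks like; the working configuration would show something like dev='vda' bus='virtio' instead, which also requires a virtio driver in the guest):

# virsh dumpxml rl | grep '<target dev'
      <target dev='hda' bus='ide'/>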

Actual result:
With VNC+QXL configured for the guest, S4 succeeds, but starting the guest afterwards to resume it fails.

Expected result:
1. If S4 is not supported with VNC+QXL, a clear error should be reported when S4 is attempted for such a guest.
2. If S4 is supported with VNC+QXL, starting the guest to resume it from S4 should succeed.

Comment 2 vivian zhang 2015-02-12 02:14:52 UTC
I chose the wrong version; corrected it to 7.1.

Comment 3 Jiri Denemark 2015-02-12 08:32:41 UTC
Could you attach /var/log/libvirt/qemu/rl.log and domain XML?

Comment 4 vivian zhang 2015-02-12 09:03:05 UTC
Created attachment 990808 [details]
guest xml

Comment 5 vivian zhang 2015-02-12 09:10:27 UTC
Created attachment 990810 [details]
qemu log

Comment 6 vivian zhang 2015-02-12 09:11:54 UTC
(In reply to Jiri Denemark from comment #3)
> Could you attach /var/log/libvirt/qemu/rl.log and domain XML?

hi, Jiri
Please see the attached log and XML, thanks.

Comment 7 Jiri Denemark 2015-02-12 09:22:01 UTC
Hmm, I hoped I'd see some details about why libvirt says qemu-kvm crashed, but there's nothing interesting in the qemu log. Could you also attach debug logs from libvirtd?

Comment 8 vivian zhang 2015-02-12 10:54:21 UTC
Created attachment 990868 [details]
libvirtd.log

Comment 9 vivian zhang 2015-02-12 10:55:25 UTC
Created attachment 990869 [details]
/var/log/messages

Comment 10 vivian zhang 2015-02-12 10:56:26 UTC
Created attachment 990870 [details]
guest console log

Comment 11 vivian zhang 2015-02-12 10:58:29 UTC
(In reply to Jiri Denemark from comment #7)
> Hmm, I hoped I'd see some details about why libvirt says qemu-kvm crashed
> but there's nothing interesting in qemu log. Could you also attach debug
> logs from libvirtd?

hi, Jiri
I captured the libvirtd log, /var/log/messages, and the guest console start log; please check.
However, I could not capture a qemu core dump or any other crash file from the guest, so I am not sure whether a guest crash or a qemu crash triggers this issue.
If you have any ideas about this, please tell me.
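
One possible way to make qemu-kvm leave a core file next time, assuming a default RHEL 7 systemd setup (qemu processes spawned by libvirtd inherit its core limit):

# mkdir -p /etc/systemd/system/libvirtd.service.d
# printf '[Service]\nLimitCORE=infinity\n' > /etc/systemd/system/libvirtd.service.d/core.conf
# systemctl daemon-reload && systemctl restart libvirtd
# echo '/var/tmp/core.%e.%p' > /proc/sys/kernel/core_pattern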

Comment 12 Jiri Denemark 2015-02-12 15:00:30 UTC
So libvirt starts the domain normally, sends the cont command to qemu-kvm, and gets a RESUME event:

2015-02-12 10:37:50.507+0000: 1359: debug : qemuDomainObjEnterMonitorInternal:1564 : Entering monitor (mon=0x7f58cc00da30 vm=0x7f58d01ed270 name=rl)
2015-02-12 10:37:50.507+0000: 1359: debug : qemuMonitorJSONCommandWithFd:290 : Send command '{"execute":"cont","id":"libvirt-93"}' for write with FD -1
2015-02-12 10:37:50.507+0000: 1359: debug : qemuMonitorSend:976 : QEMU_MONITOR_SEND_MSG: mon=0x7f58cc00da30 msg={"execute":"cont","id":"libvirt-93"}
 fd=-1
2015-02-12 10:37:50.507+0000: 1289: debug : qemuMonitorIOWrite:507 : QEMU_MONITOR_IO_WRITE: mon=0x7f58cc00da30 buf={"execute":"cont","id":"libvirt-93"}
 len=38 ret=38 errno=11
2015-02-12 10:37:50.515+0000: 1289: debug : qemuMonitorIOProcess:399 : QEMU_MONITOR_IO_PROCESS: mon=0x7f58cc00da30 buf={"timestamp": {"seconds": 1423737470, "microseconds": 515161}, "event": "RESUME"}
{"return": {}, "id": "libvirt-93"}
 len=119
2015-02-12 10:37:50.515+0000: 1289: debug : qemuMonitorJSONIOProcessLine:198 : QEMU_MONITOR_RECV_EVENT: mon=0x7f58cc00da30 event={"timestamp": {"seconds": 1423737470, "microseconds": 515161}, "event": "RESUME"}
2015-02-12 10:37:50.515+0000: 1289: debug : qemuMonitorJSONIOProcessEvent:142 : mon=0x7f58cc00da30 obj=0x7f58f02b5d30
2015-02-12 10:37:50.515+0000: 1289: debug : qemuMonitorEmitEvent:1256 : mon=0x7f58cc00da30 event=RESUME
2015-02-12 10:37:50.515+0000: 1289: debug : qemuMonitorEmitResume:1307 : mon=0x7f58cc00da30
2015-02-12 10:37:50.515+0000: 1359: debug : qemuDomainObjExitMonitorInternal:1587 : Exited monitor (mon=0x7f58cc00da30 vm=0x7f58d01ed270 name=rl)
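
For reference, the same cont command can be replayed by hand against a running domain to rule out the command itself (virsh qemu-monitor-command speaks QMP by default; the empty return is the success case):

# virsh qemu-monitor-command rl '{"execute":"cont"}'
{"return":{}}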


A few moments later, we get VNC_CONNECTED and VNC_INITIALIZED events:

2015-02-12 10:38:03.674+0000: 1289: debug : qemuMonitorIOProcess:399 : QEMU_MONITOR_IO_PROCESS: mon=0x7f58cc00da30 buf={"timestamp": {"seconds": 1423737483, "microseconds": 672151}, "event": "VNC_CONNECTED", "data": {"server": {"auth": "none", "family": "ipv4", "service": "5900", "host": "127.0.0.1"}, "client": {"family": "ipv4", "service": "42758", "host": "127.0.0.1"}}}
{"timestamp": {"seconds": 1423737483, "microseconds": 672658}, "event": "VNC_INITIALIZED", "data": {"server": {"auth": "none", "family": "ipv4", "service": "5900", "host": "127.0.0.1"}, "client": {"family": "ipv4", "service": "42758", "host": "127.0.0.1"}}}
 len=516
2015-02-12 10:38:03.674+0000: 1289: debug : qemuMonitorJSONIOProcessLine:198 : QEMU_MONITOR_RECV_EVENT: mon=0x7f58cc00da30 event={"timestamp": {"seconds": 1423737483, "microseconds": 672151}, "event": "VNC_CONNECTED", "data": {"server": {"auth": "none", "family": "ipv4", "service": "5900", "host": "127.0.0.1"}, "client": {"family": "ipv4", "service": "42758", "host": "127.0.0.1"}}}
2015-02-12 10:38:03.674+0000: 1289: debug : qemuMonitorJSONIOProcessEvent:142 : mon=0x7f58cc00da30 obj=0x7f58f02b3560
2015-02-12 10:38:03.674+0000: 1289: debug : qemuMonitorEmitEvent:1256 : mon=0x7f58cc00da30 event=VNC_CONNECTED
2015-02-12 10:38:03.674+0000: 1289: debug : qemuMonitorEmitGraphics:1370 : mon=0x7f58cc00da30
2015-02-12 10:38:03.674+0000: 1289: debug : qemuMonitorJSONIOProcessLine:198 : QEMU_MONITOR_RECV_EVENT: mon=0x7f58cc00da30 event={"timestamp": {"seconds": 1423737483, "microseconds": 672658}, "event": "VNC_INITIALIZED", "data": {"server": {"auth": "none", "family": "ipv4", "service": "5900", "host": "127.0.0.1"}, "client": {"family": "ipv4", "service": "42758", "host": "127.0.0.1"}}}
2015-02-12 10:38:03.674+0000: 1289: debug : qemuMonitorJSONIOProcessEvent:142 : mon=0x7f58cc00da30 obj=0x7f58f02b3560
2015-02-12 10:38:03.674+0000: 1289: debug : qemuMonitorEmitEvent:1256 : mon=0x7f58cc00da30 event=VNC_INITIALIZED
2015-02-12 10:38:03.674+0000: 1289: debug : qemuMonitorEmitGraphics:1370 : mon=0x7f58cc00da30


And several seconds later, the QEMU monitor disconnects and we clean up everything that was left behind:

2015-02-12 10:38:07.362+0000: 1289: error : qemuMonitorIO:666 : internal error: End of file from monitor
2015-02-12 10:38:07.362+0000: 1289: debug : qemuMonitorIO:721 : Error on monitor internal error: End of file from monitor
2015-02-12 10:38:07.362+0000: 1289: debug : qemuMonitorIO:742 : Triggering EOF callback
2015-02-12 10:38:07.362+0000: 1289: debug : qemuProcessHandleMonitorEOF:306 : Received EOF on 0x7f58d01ed270 'rl'
2015-02-12 10:38:07.362+0000: 1289: debug : qemuProcessHandleMonitorEOF:324 : Monitor connection to 'rl' closed without SHUTDOWN event; assuming the domain crashed
2015-02-12 10:38:07.362+0000: 1289: debug : qemuProcessStop:5011 : Shutting down vm=0x7f58d01ed270 name=rl id=8 pid=6579 flags=0
2015-02-12 10:38:07.362+0000: 1289: debug : qemuProcessKill:4969 : vm=0x7f58d01ed270 name=rl pid=6579 flags=5
2015-02-12 10:38:07.362+0000: 1289: debug : virProcessKillPainfully:299 : vpid=6579 force=1


I'm not sure what happened to the qemu-kvm process, but libvirt does not seem to be the root cause of it.
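
The timeline above can be pulled straight out of the logs; a sketch, assuming the default RHEL 7 log locations used in this report:

# grep -E 'RESUME|VNC_CONNECTED|End of file from monitor' /var/log/libvirt/libvirtd.log
# grep -iE 'qemu-kvm|segfault|traps' /var/log/messages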

Comment 13 Jiri Denemark 2015-02-12 15:03:21 UTC
Anyway, could you please attach the qemu log that corresponds to the libvirtd log, i.e. one that shows the domain dying at about 10:38 on Feb 12?

Comment 14 vivian zhang 2015-02-13 02:01:35 UTC
Created attachment 991187 [details]
qemu log time stamp same with libvirtd.log

Comment 15 Sibiao Luo 2015-02-13 03:22:56 UTC
Hi all,

IIRC, VNC+QXL is not officially supported; QXL is only for the SPICE display protocol, not for VNC. I think it is enough for us that booting a VM with VNC+QXL does no harm to QEMU.

I tried it together with wzhang yesterday on the same host, running the same qemu-kvm command line directly (as generated by virt-manager); it could do S4 and resume successfully.

I also tried with virt-manager and the virsh command: the guest can suspend to disk, and "loading and decompressing image data" reaches 100% on resume, but at that point the VM goes to shut off (crashed); it can be booted again if you start the domain again.
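
For anyone repeating the direct command-line test, the full qemu-kvm invocation can be copied from the per-domain log (path assumed from the default libvirt layout):

# grep -m1 'qemu-kvm' /var/log/libvirt/qemu/rl.log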

host info:
3.10.0-229.el7.x86_64
qemu-kvm-rhev-2.1.2-23.el7.x86_64
libvirt-1.2.8-16.el7.x86_64
seabios-bin-1.7.5-8.el7.noarch
virt-manager-1.1.0-12.el7.noarch

Best Regards,
sluo