Bug 718575

Summary: spice client disconnect causing libvirt to shutdown domain
Product: [Community] Virtualization Tools
Reporter: gvenkat <gvenkat>
Component: libvirt
Assignee: Michal Privoznik <mprivozn>
Status: CLOSED NOTABUG
Severity: high
Priority: unspecified
Version: unspecified
CC: crobinso, dallan, xen-maint
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Last Closed: 2011-07-07 21:05:59 UTC

Description gvenkat 2011-07-03 22:49:46 UTC
Description of problem:
When a Windows XP guest (it also happens with Windows 7) is started for the first time after libvirtd starts, the guest shuts down within a few minutes of use, or when a remote spice client is disconnected. The domain is being shut down because libvirt assumes the guest has died (see debug log below). If the guest is restarted at that point, the problem does not occur and the guest is stable for weeks. This problem started with one of the libvirt/kvm updates within the last month or so.

Version-Release number of selected component (if applicable):
libvirt-0.9.2
kvm-0.14.1.0
kernel 2.6.37.6
libspice-server1-0.8.1
spice-client-0.8.1

How reproducible:
Start a Windows guest domain right after libvirtd has started and wait; it is shut down within a few minutes.

Steps to Reproduce:
1. Reboot host machine
2. Start domain via virsh
3. Use spice-client to connect from remote machine
4. Use the guest with some graphics on screen or kill/disconnect remote spice client
  
Actual results:
The guest domain is shut down.

Expected results:
The guest should be kept alive.

Additional info:

The detailed libvirtd.log with log level 1 is at
http://pastebin.com/HMmVQi8Z

Excerpts that might help:
15:08:38.741: 3887: debug : qemuMonitorJSONIOProcessLine:112 : Line [{"timestamp": {"seconds": 1309730918, "microseconds": 740876}, "event": "SPICE_DISCONNECTED", "data": {"server": {"port": "5940", "family": "ipv4", "host": "192.168.200.20"}, "client": {"port": "47844", "family": "ipv4", "host": "192.168.200.103"}}}]
15:08:38.741: 3887: debug : virJSONValueFromString:912 : string={"timestamp": {"seconds": 1309730918, "microseconds": 740876}, "event": "SPICE_DISCONNECTED", "data": {"server": {"port": "5940", "family": "ipv4", "host": "192.168.200.20"}, "client": {"port": "47844", "family": "ipv4", "host": "192.168.200.103"}}}

...

15:08:40.228: 3887: debug : udevEventHandleCallback:1462 : udev action: 'remove'
15:08:40.228: 3887: debug : udevRemoveOneDevice:1213 : Failed to find device to remove that has udev name '/sys/devices/virtual/net/vnet0/queues/rx-0'

....
15:08:40.233: 3887: error : qemuMonitorIO:576 : internal error End of file from monitor
15:08:40.233: 3887: debug : qemuMonitorIO:609 : Error on monitor internal error End of file from monitor
15:08:40.233: 3887: debug : virEventPollUpdateHandle:144 : Update handle w=8 e=12
15:08:40.233: 3887: debug : virEventPollInterruptLocked:686 : Skip interrupt, 1 -485828864
15:08:40.233: 3887: debug : qemuMonitorIO:632 : Triggering EOF callback
15:08:40.234: 3887: debug : qemuProcessHandleMonitorEOF:113 : Received EOF on 0x7f67dc13bb20 'Munchkin'
15:08:40.234: 3887: debug : qemuProcessHandleMonitorEOF:128 : Monitor connection to 'Munchkin' closed without SHUTDOWN event; assuming the domain crashed
15:08:40.234: 3887: debug : qemuProcessStop:2572 : Shutting down VM 'Munchkin' pid=8133 migrated=0
15:08:40.234: 3887: debug : qemuMonitorClose:757 : mon=0x742980
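
For anyone reading through the full log, the monitor events can be pulled out of the libvirtd debug lines with a small script. This is only a sketch: the regular expression is an assumption based on the `qemuMonitorJSONIOProcessLine` line format shown in the excerpt above, and `parse_qmp_event` is a hypothetical helper name, not part of libvirt.

```python
import json
import re
from datetime import datetime, timezone

# Assumption: libvirtd logs each monitor line as "... Line [<json>]",
# as in the excerpt above. Other libvirt versions may format it differently.
LINE_RE = re.compile(r"qemuMonitorJSONIOProcessLine:\d+ : Line \[(\{.*\})\]\s*$")


def parse_qmp_event(log_line):
    """Return (event_name, utc_time, data) for a logged monitor event, or None."""
    match = LINE_RE.search(log_line)
    if match is None:
        return None
    msg = json.loads(match.group(1))
    if "event" not in msg:
        return None  # a command reply rather than an asynchronous event
    ts = msg["timestamp"]
    when = datetime.fromtimestamp(ts["seconds"], tz=timezone.utc)
    when = when.replace(microsecond=ts["microseconds"])
    return msg["event"], when, msg.get("data", {})


# The SPICE_DISCONNECTED line from the excerpt above.
sample = (
    '15:08:38.741: 3887: debug : qemuMonitorJSONIOProcessLine:112 : Line '
    '[{"timestamp": {"seconds": 1309730918, "microseconds": 740876}, '
    '"event": "SPICE_DISCONNECTED", "data": {"server": {"port": "5940", '
    '"family": "ipv4", "host": "192.168.200.20"}, "client": {"port": "47844", '
    '"family": "ipv4", "host": "192.168.200.103"}}}]'
)

event, when, data = parse_qmp_event(sample)
print(event, when.isoformat(), data["client"]["host"])
```

Running this over every line of the log and printing only the events should make it easier to see which events arrived before the monitor EOF at 15:08:40.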

Comment 1 gvenkat 2011-07-03 22:54:02 UTC
PS: The Windows guest event logs don't show any problem so I don't believe the guest itself crashed prior to libvirt shutting down the domain. The guests behave as if the machine had a power shutdown.

Comment 2 Michal Privoznik 2011-07-07 07:56:19 UTC
I believe this is not a libvirt error. On line 104 we can see that qemu crashed. That error message is printed iff we talk to qemu via JSON but did not receive the SHUTDOWN event, which qemu sends upon domain shutdown. Moreover, receiving EOF on the monitor means the qemu process died. There may be a meaningful message in /var/log/libvirt/qemu/Munchkin.log; could you check?
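
The rule described above can be sketched roughly as follows. This is an illustration only, not libvirt's actual code (which is C, in `qemuProcessHandleMonitorEOF`); `classify_monitor_stream` and its return values are hypothetical names chosen for the example.

```python
import json


def classify_monitor_stream(lines):
    """Classify how a JSON monitor session ended.

    `lines` holds every line read from the monitor before EOF.
    Returns "shutdown" if a SHUTDOWN event was seen, else "crashed".
    """
    saw_shutdown = False
    for line in lines:
        try:
            msg = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore anything that is not valid JSON
        if isinstance(msg, dict) and msg.get("event") == "SHUTDOWN":
            saw_shutdown = True
    # EOF without a prior SHUTDOWN event is treated as a crash.
    return "shutdown" if saw_shutdown else "crashed"


# The session from this report: a SPICE_DISCONNECTED event, then EOF.
reported = ['{"event": "SPICE_DISCONNECTED", "data": {}}']
print(classify_monitor_stream(reported))
```

With the events from this report, the sketch lands in the "crashed" branch, which matches the "closed without SHUTDOWN event; assuming the domain crashed" message in the log.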

Comment 3 gvenkat 2011-07-07 19:21:23 UTC
Thank you. You might be right about that and I am not familiar with the boundaries between libvirt and qemu-kvm. So I did some experiments to see the qemu-kvm log.

To minimize the variables, I started the guest after a host reboot with virsh but did not connect any spice display. The guest remained stable.

Next I brought up a spice client but did not do anything with it.

The guest died within a few minutes, with the following lines in Munchkin.log (Munchkin is the name of the guest domain):

qemu-kvm: virtio: trying to map MMIO memory
2011-07-07 12:05:03.740: shutting down

From the previous logs I see that these lines precede every such crash but do not appear on subsequent restarts of the domain, when the guest stays stable long-term. It is only the first time after a host reboot that this happens.
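
The check I did by eye can be sketched as a script that reports the last line logged before each "shutting down" marker. It assumes the marker always ends with ": shutting down" as in the two lines above; `lines_before_shutdown` is a hypothetical helper name.

```python
def lines_before_shutdown(log_lines):
    """Return the last non-empty line seen before each 'shutting down' marker."""
    preceding = []
    previous = None
    for raw in log_lines:
        line = raw.strip()
        if not line:
            continue
        # Assumption: the marker looks like "<timestamp>: shutting down".
        if line.endswith(": shutting down"):
            preceding.append(previous)
        previous = line
    return preceding


# The two lines quoted above from Munchkin.log.
sample_log = [
    "qemu-kvm: virtio: trying to map MMIO memory",
    "2011-07-07 12:05:03.740: shutting down",
]
print(lines_before_shutdown(sample_log))

# Against the real file (path as given in comment 2):
#   with open("/var/log/libvirt/qemu/Munchkin.log") as fh:
#       print(lines_before_shutdown(fh))
```

If the "trying to map MMIO memory" line shows up only before the first shutdown after a host reboot, that would support the pattern described above.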

Searching on the web, I see one more report like that

http://fossplanet.com/f13/re-%5Bfedora-virt%5D-windows-7-guest-dying-178326/

The OP reports that it went away by itself but I suspect that he has not realized it happens only the first time and then it is stable after that.

Please feel free to close this if it is definitely not related to libvirt, but I would appreciate a pointer to where I should file this bug so that the relevant people will take a look at it.

Thanks

Comment 4 Dave Allan 2011-07-07 21:05:59 UTC
That looks to me as well like qemu is crashing, not a libvirt bug. If it's qemu built from source, I'd file a BZ with their upstream tracker: https://bugs.launchpad.net/qemu/. Otherwise, I'd open it with your distro against qemu.