Red Hat Bugzilla – Bug 1311208
virt-viewer stops responding and window cannot be closed.
Last modified: 2016-05-09 05:59:58 EDT
Virt-viewer hangs at random times, when the console is left idle, for random users using SPICE to access Windows VDI.
What problem/issue/behavior are you having trouble with? What do you expect to see?
We have Linux users (RHEL6.7) accessing Windows VDI through the RHEV User Portal. They are using SPICE as the protocol. At random times the SPICE sessions freeze, and the remote-viewer has to be killed, and re-connected by opening a new remote-viewer SPICE session from the User Portal. They are using two displays with SPICE. Sometimes this happens several times during a day, other times it may take a week between each time it happens. After the SPICE session is restarted you can see that the keyboard was active even though the image was frozen, as the password field in the unlock screen has characters in it if anything was typed while the SPICE session was frozen.
Where are you experiencing the behavior? What environment?
RHEV: RHEV-M 3.5.6 / RHEV-H 20151015.1.el6ev
Client: RHEL 6.7 clients (spice-xpi-2.7-27.el6.x86_64, virt-viewer-2.0-7.el6.x86_64)
VDI VM: Windows 7 x64
This has been an issue for quite some time. We've had RHEV-M 3.3, 3.4 and now 3.5. And we've had RHEL clients 6.4, 6.5, 6.6 and now 6.7. The hypervisor image has also been RHEV-H 6.4, 6.5, 6.6 and now 6.7.
When does the behavior occur? Frequently? Repeatedly? At certain times?
At random times of the day, and with random frequency, after the connection is idle, and with random users. Many users have been affected. We expect that everyone experiences the issue even though not everybody has reported it.
What information can you provide around timeframes and urgency?
It's becoming an urgent matter as more users than initially expected are affected by this, and their only method of accessing the corporate Windows environment is the Windows vms provided in the RHEV User Portal.
We had a user experience this issue yesterday. He tried to open his desktop around 12:25. When we look at c:\windows\temp\vdagent.log, we see the following happened at that time. It looks like there was an issue configuring the qxl display in the guest.
12:25: User returns to his machine and opens the remote-viewer window. Looks like the vdagent is struggling with the qxl device.
12:54: He reconnected to his VDI by closing the existing remote-viewer using xkill, and opening a new remote-viewer from the RHEV User Portal web page.
When this hang occurs, users are forced to close the remote viewer using xkill because it refuses to close normally, and just sits with the message "Waiting for display...".
Hypervisor version information
How many VMs are running on each hypervisor? And how many VMs will be used in parallel?
Between 8 and 15 per hypervisor
Is it a RHEL-based host or a RHEV-H hypervisor?
Red Hat Enterprise Virtualization Hypervisor release 6.7 (20151015.1.el6ev)
qemu-kvm version: rpm -qa | grep qemu-kvm
# rpm -qa | grep qemu-kvm
spice-server version: rpm -qa | grep spice-server
# rpm -qa | grep spice-server
Guest version information
What operating system(s) are running in the virtual machines?
Windows 7 x64
Is the QXL driver installed? What is its version?
Windows: Device Manager -> Display -> Red Hat QXL -> Version
Is the RHEV spice agent (vdservice or vdagent) installed on the guest and is the service running?
Windows: Control Panel -> Administrative Tools -> Windows Services -> RHEV Spice Agent -> Properties
Yes, enabled and running.
Note: The RHEV Tools installation covers the QXL driver and the Spice agent for Windows. What version of the RHEV Tools is installed?
RHEV Tools 3.5.9
Created attachment 1129799 [details]
An strace attached to the qemu process after remote-viewer froze on the connected client.
Created attachment 1129800 [details]
the qemu log of the windows vm from the hypervisor
Created attachment 1129801 [details]
A tcpdump from the traffic between the client and the hypervisor after the remote-viewer froze - collected at the client
Created attachment 1129815 [details]
a tcpdump from the traffic between the client and the hypervisor after remote-viewer froze - collected at the hypervisor
Created attachment 1129817 [details]
An strace attached to the remote-viewer after it froze
Created attachment 1129819 [details]
Log from Firefox started with SPICE_DEBUG=1 G_MESSAGES_DEBUG=all
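For anyone who needs to collect a similar log, the two variables named in this attachment can be set for any client process. The following is a minimal sketch, assuming the client binary is on PATH; the helper names, the `console.vv` connection file and the log file name are hypothetical and not taken from this bug.

```python
import os
import subprocess


def spice_debug_env(base=None):
    """Return a copy of the environment with SPICE and GLib debug output enabled."""
    env = dict(os.environ if base is None else base)
    env["SPICE_DEBUG"] = "1"          # variable named in the attachment description
    env["G_MESSAGES_DEBUG"] = "all"   # variable named in the attachment description
    return env


def launch_with_debug(command, log_path):
    """Start `command` with the debug variables set; stdout and stderr go to log_path."""
    with open(log_path, "wb") as log:
        return subprocess.Popen(
            command,
            env=spice_debug_env(),
            stdout=log,
            stderr=subprocess.STDOUT,
        )


# Hypothetical usage (file names are placeholders):
# launch_with_debug(["remote-viewer", "console.vv"], "remote-viewer-debug.log")
```

Writing the output to a file rather than the terminal makes it easier to attach the log to the bug after a freeze has been reproduced.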
In comment #0 it is mentioned that there were some errors in vdagent.log.
Can you please add that logfile or paste the error message?
Do they use a proxy?
I tried to reproduce by running a VM and remote-viewer on a RHEL-6 and
leaving it idle, but the client did not hang for me (but I did not use RHEV-M).
On my machine:
$ rpm -q spice-server virt-viewer
What is the qemu-kvm command line used?
Created attachment 1130460 [details]
Requested log in comment #8
I have attached the log you requested.
We are not using any proxy for SPICE connections.
Also, I just updated the support case with the following info:
Lie, Sigbjorn on Feb 25 2016 at 10:36 AM +01:00
I have an update that might be of interest.
I've requested a list of users who have reported the issue to our helpdesk so I can investigate them. The person who has been complaining the most, and has had several freezes every day, had a different VM config than most.
VM config for the user who's had several freezes during one day:
Number of CPU Cores: 1 (1 Socket(s), 1 Core(s) per Socket)
Defined Memory: 4GB
Physical Memory Guaranteed: 256MB
Normal VM config:
Number of CPU Cores: 2 (2 Socket(s), 1 Core(s) per Socket)
Defined Memory: 12GB
Physical Memory Guaranteed: 4GB
Based on this I am now testing the following VM config:
Number of CPU Cores: 4 (2 Socket(s), 2 Core(s) per Socket)
Defined Memory: 12GB
Physical Memory Guaranteed: 12GB
Could the SPICE freeze issue be related to assigned vm guest memory?
The cluster assigned to VDI machines has KSM enabled, and "Enable Memory Balloon Optimization" disabled.
I am increasing the guest vcpu count for some vms because we have noticed increased SPICE performance when using 4 vcpus over 2 vcpus.
Does this provide any help for further investigation?
(In reply to Frank DeLorey from comment #0)
> Virt-viewer hangs at random times, when the console is left idle
Regarding the "left idle" part, does this bug _always_ happen when the session is idle? Or does it sometimes happen when the user is interacting with the VM? Could it be https://bugzilla.redhat.com/show_bug.cgi?id=1298590 ? Uri asked about proxies, but the forceful closing of idle TCP connections could be caused by a firewall, router, ...
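For readers unfamiliar with the mechanism being asked about: a stateful network device may drop a TCP connection that carries no traffic for a while, and TCP keepalive probes are the usual way to keep such a connection active. The sketch below is generic Python socket code, not SPICE or virt-viewer code; the function name and the timing values are arbitrary assumptions chosen for illustration.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keepalive so an otherwise idle connection still carries probes."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The tuning options are platform-specific, so set each one only if it exists.
    tuning = (
        ("TCP_KEEPIDLE", idle),       # seconds of idleness before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", count),       # failed probes before the connection is dropped
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock


sock = enable_keepalive(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
```

If a middlebox were silently discarding idle connections, traffic captures taken on both ends, such as the tcpdump attachments above, would be the place to look for probes that go unanswered.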
There are no firewalls between the clients and guests according to the customer.
I had previously inquired about this and here is the answer they gave:
Lie, Sigbjorn on Feb 19 2016 at 06:31 AM -05:00
It happens when they come back to the VDI after being away for some time.
This article describes issues with keyboard and mouse, however the keyboard is still working as described earlier. In our environment it's only the image which is frozen. As well, we do not have any firewalls between the machine running the SPICE client and the hypervisor running the SPICE server/qemu. They are on different subnets, however there is only a Nexus 7000 series router in between.
Update from Customer:
Created By: Sigbjorn Lie (3/1/2016 10:58 AM)
I have an update to make.
I re-installed a RHEV-H 6 hypervisor with the RHEV-H 7 hypervisor iso, and migrated the most problematic users there. I also re-configured their VMs to match "Physical Memory Guaranteed:" with "Defined Memory:".
These users had several freezes during a single day before this change, and now they've had none for several days.
Could there be a culprit for Windows vms/SPICE in RHEV-H 6 or related to how much memory is guaranteed for the VM?
New update from customer:
I just had a user who had a freeze again, on the RHEV-H 7 based cluster.
His "Physical Memory Guaranteed" was set to 2098MB and his "Defined Memory:" to 12288MB.
The users who are no longer complaining have both "Physical Memory Guaranteed" and "Defined Memory:" set to 12288MB.
Even though this may be a workaround, it's not a good setting for VDI vms, as we will utilize much more memory than needed by having to reserve the full 12GB per vm.
Is there an issue with the amount of allocated memory for the VM and SPICE?
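To put that concern in numbers, here is a rough sketch. It assumes every VM on a host has 12 GB of defined memory, that a host runs the 8 to 15 VMs reported earlier in this bug, and that 4 GB is the "normal" guaranteed value from the earlier comment. The function name is illustrative only, and savings from KSM are ignored.

```python
def guaranteed_total_gb(vm_count, guaranteed_gb_per_vm):
    """Memory a host must be able to guarantee for vm_count identical VMs."""
    return vm_count * guaranteed_gb_per_vm


NORMAL_GUARANTEED_GB = 4       # "Physical Memory Guaranteed" in the normal config
WORKAROUND_GUARANTEED_GB = 12  # guaranteed raised to match "Defined Memory"

for vms in (8, 15):
    normal = guaranteed_total_gb(vms, NORMAL_GUARANTEED_GB)
    workaround = guaranteed_total_gb(vms, WORKAROUND_GUARANTEED_GB)
    print(f"{vms} VMs: {normal} GB guaranteed normally, {workaround} GB with the workaround")
# 8 VMs: 32 GB guaranteed normally, 96 GB with the workaround
# 15 VMs: 60 GB guaranteed normally, 180 GB with the workaround
```

Under these assumptions the workaround triples the memory each hypervisor has to guarantee, which is the cost the customer is describing.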
We have informed the customer of this and they will send us the data when this is next reported by a user.
With RHEV-H 7 and "Physical Memory Guaranteed" and "Defined Memory:" set to the same value it didn't happen again. Customer closed the case.
Closing this bug as well.
Thank you all