Bug 1311208 - virt-viewer stops responding and window cannot be closed.
Status: CLOSED INSUFFICIENT_DATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: virt-viewer
Version: 6.7
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: 6.9
Assigned To: Virt Viewer Maint
QA Contact: Virtualization Bugs
Depends On:
Blocks: 1269194
Reported: 2016-02-23 10:08 EST by Frank DeLorey
Modified: 2016-05-09 05:59 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-05-09 05:59:58 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
An strace attached to the qemu process after remote-viewer froze on the connected client. (590.69 KB, application/x-gzip)
2016-02-23 10:15 EST, Frank DeLorey
the qemu log of the windows vm from the hypervisor (79.80 KB, text/plain)
2016-02-23 10:16 EST, Frank DeLorey
A tcpdump from the traffic between the client and the hypervisor after the remote-viewer froze - collected at the client (9.64 KB, application/octet-stream)
2016-02-23 10:17 EST, Frank DeLorey
a tcpdump from the traffic between the client and the hypervisor after remote-viewer froze - collected at the hypervisor (2.02 KB, application/octet-stream)
2016-02-23 10:17 EST, Frank DeLorey
An strace attached to the remote-viewer after it froze (43.10 KB, text/plain)
2016-02-23 10:20 EST, Frank DeLorey
Log from Firefox started with SPICE_DEBUG=1 G_MESSAGES_DEBUG=all (4.15 MB, application/x-gzip)
2016-02-23 10:21 EST, Frank DeLorey
Requested log in comment #8 (17.77 KB, text/plain)
2016-02-25 04:47 EST, Sigbjorn Lie

Description Frank DeLorey 2016-02-23 10:08:32 EST
Virt-viewer hangs at random times, when the console is left idle, for random users using SPICE to access Windows VDI.
	

What problem/issue/behavior are you having trouble with?  What do you expect to see?

We have Linux users (RHEL 6.7) accessing Windows VDI through the RHEV User Portal, using SPICE as the protocol with two displays. At random times the SPICE session freezes; remote-viewer then has to be killed and a new SPICE session opened from the User Portal. Sometimes this happens several times in a day, other times a week may pass between occurrences. After the SPICE session is restarted, you can see that the keyboard was still active while the image was frozen: the password field on the unlock screen contains characters if anything was typed during the freeze.

Where are you experiencing the behavior?  What environment?

RHEV: RHEV-M 3.5.6 / RHEV-H 20151015.1.el6ev
Client: RHEL 6.7 clients (spice-xpi-2.7-27.el6.x86_64, virt-viewer-2.0-7.el6.x86_64)
VDI VM: Windows 7 x64

This has been an issue for quite some time. We've had RHEV-M 3.3, 3.4 and now 3.5. And we've had RHEL clients 6.4, 6.5, 6.6 and now 6.7. The hypervisor image has also been RHEV-H 6.4, 6.5, 6.6 and now 6.7.

When does the behavior occur? Frequently?  Repeatedly?   At certain times?

At random times of the day, and with random frequency, after the connection is idle, and with random users. Many users have been affected. We expect that everyone experiences the issue even though not everybody has reported it.

What information can you provide around timeframes and urgency?

It's becoming an urgent matter as more users than initially expected are affected by this, and their only method of accessing the corporate Windows environment is the Windows vms provided in the RHEV User Portal.

We had a user experience this issue yesterday. He tried to open his desktop around 12:25. When we look at c:\windows\temp\vdagent.log, we see the following happened at that time. It looks like there was an issue configuring the qxl display in the guest.

12:25: User returns to his machine and opens the remote-viewer window. Looks like the vdagent is struggling with the qxl device.
12:54: He reconnected to his VDI by closing the existing remote-viewer using xkill, and opening a new remote-viewer from the RHEV User Portal web page.

When this hang occurs, the user is forced to close the remote viewer using xkill because it refuses to close normally and just sits with the message "Waiting for display...".

Hypervisor version information

    How many VMs are running on each hypervisor? And how many VMs will be used in parallel?
Between 8 and 15 per hypervisor


    Is it a RHEL-based host or a RHEV-H hypervisor?
Red Hat Enterprise Virtualization Hypervisor release 6.7 (20151015.1.el6ev)

    qemu-kvm version: rpm -qa | grep qemu-kvm
# rpm -qa | grep qemu-kvm
qemu-kvm-rhev-tools-0.12.1.2-2.479.el6_7.1.x86_64
qemu-kvm-rhev-0.12.1.2-2.479.el6_7.1.x86_64

    spice-server version: rpm -qa | grep spice-server
# rpm -qa | grep spice-server
spice-server-0.12.4-12.el6_7.3.x86_64


Guest version information

    What operating system(s) are running in the virtual machines?
Windows 7 x64

    Is the QXL driver installed? What is its version?
        Windows: Device Manager -> Display -> Red Hat QXL -> Version
6.1.0.10018


    Is the RHEV spice agent (vdservice or vdagent) installed on the guest and is the service running?
        Windows: Control Panel -> Administrative Tools -> Windows Services -> RHEV Spice Agent -> Properties
Yes, enabled and running.


	Note: The RHEV Tools installation covers the QXL driver and the Spice agent for Windows. What version of the RHEV Tools is installed?
RHEV Tools 3.5.9
Comment 1 Frank DeLorey 2016-02-23 10:15 EST
Created attachment 1129799
An strace attached to the qemu process after remote-viewer froze on the connected client.
Comment 2 Frank DeLorey 2016-02-23 10:16 EST
Created attachment 1129800
the qemu log of the windows vm from the hypervisor
Comment 3 Frank DeLorey 2016-02-23 10:17 EST
Created attachment 1129801
A tcpdump from the traffic between the client and the hypervisor after the remote-viewer froze - collected at the client
Comment 4 Frank DeLorey 2016-02-23 10:17 EST
Created attachment 1129815
a tcpdump from the traffic between the client and the hypervisor after remote-viewer froze - collected at the hypervisor
Comment 5 Frank DeLorey 2016-02-23 10:20 EST
Created attachment 1129817
An strace attached to the remote-viewer after it froze
Comment 7 Frank DeLorey 2016-02-23 10:21 EST
Created attachment 1129819
Log from Firefox started with SPICE_DEBUG=1 G_MESSAGES_DEBUG=all
Comment 8 Uri Lublin 2016-02-24 11:26:57 EST
In comment #0 it is mentioned that there were some errors in vdagent.log.
Can you please add that logfile or paste the error message?

Do they use a proxy?

I tried to reproduce by running a VM and remote-viewer on a RHEL-6 and
leaving it idle, but the client did not hang for me (but I did not use RHEV-M).

On my machine:
$ rpm -q spice-server virt-viewer
spice-server-0.12.4-12.el6_7.3.x86_64
virt-viewer-2.0-7.el6.x86_64

What is the qemu-kvm command line used?
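
One way to collect the command line asked for above is to read it from /proc on the hypervisor. The following is a minimal Python sketch, assuming a Linux host and that the process path contains "qemu-kvm"; the helper names are illustrative and not part of any RHEV tooling.

```python
import glob


def parse_cmdline(raw):
    """Split the NUL-separated bytes of /proc/<pid>/cmdline into arguments.

    Empty arguments are dropped, which is good enough for display purposes.
    """
    return [arg.decode(errors="replace") for arg in raw.split(b"\0") if arg]


def find_qemu_cmdlines(pattern="qemu-kvm"):
    """Return {pid: argv} for readable processes whose argv[0] contains pattern."""
    result = {}
    for path in glob.glob("/proc/[0-9]*/cmdline"):
        try:
            with open(path, "rb") as handle:
                argv = parse_cmdline(handle.read())
        except OSError:
            continue  # process exited meanwhile or is not readable
        if argv and pattern in argv[0]:
            pid = int(path.split("/")[2])
            result[pid] = argv
    return result


if __name__ == "__main__":
    for pid, argv in find_qemu_cmdlines().items():
        print(pid, " ".join(argv))
```

Run as root on the hypervisor so that the cmdline files of all qemu processes are readable.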
Comment 12 Sigbjorn Lie 2016-02-25 04:47 EST
Created attachment 1130460
Requested log in comment #8
Comment 13 Sigbjorn Lie 2016-02-25 04:49:45 EST
I have attached the log you requested.

We are not using any proxy for SPICE connections.

Also, I just updated the support case with the following info:

Lie, Sigbjorn on Feb 25 2016 at 10:36 AM +01:00
Hi,

I have an update that might be of interest.

I've requested a list of users who have reported the issue to our helpdesk so I can investigate them. The person who has been complaining the most, and has had several freezes every day, had a different VM config than most.

VM config for the user who's had several freezes during one day:
Number of CPU Cores: 1 (1 Socket(s), 1 Core(s) per Socket)
Defined Memory: 4GB
Physical Memory Guaranteed: 256MB


Normal VM config:
Number of CPU Cores: 2 (2 Socket(s), 1 Core(s) per Socket)
Defined Memory: 12GB
Physical Memory Guaranteed: 4GB


Based on this I am now testing the following VM config:
Number of CPU Cores: 4 (2 Socket(s), 2 Core(s) per Socket)
Defined Memory: 12GB
Physical Memory Guaranteed: 12GB


Could the SPICE freeze issue be related to assigned vm guest memory?
The cluster assigned to VDI machines has KSM enabled, and "Enable Memory Balloon Optimization" disabled.
I am increasing the guest vcpu count for some vms because we have noticed increased SPICE performance when using 4 vcpus over 2 vcpus.

Does this provide any help for further investigation?


Regards,
Siggi
Comment 14 Christophe Fergeau 2016-02-29 11:08:40 EST
(In reply to Frank DeLorey from comment #0)
> Virt-viewer hangs at random times, when the console is left idle

Regarding the "left idle" part, does this bug _always_ happen when the session is idle? Or does it sometimes happen when the user is interacting with the VM? Could it be https://bugzilla.redhat.com/show_bug.cgi?id=1298590 ? Uri asked about proxies, but the forceful closing of idle TCP connections could be caused by a firewall, router, ...
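
To illustrate the idle-connection theory: a middlebox that silently drops idle TCP flows can be detected, or kept from timing out, by TCP keepalive probes. The sketch below shows the generic Linux socket options in Python. It is not how spice-gtk or remote-viewer configure their channels, and the timing values are arbitrary assumptions.

```python
import socket


def enable_keepalive(sock, idle_s=60, interval_s=10, probes=5):
    """Turn on TCP keepalive probes for an existing socket.

    idle_s, interval_s and probes are illustrative defaults, not values
    taken from SPICE or from this report.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The per-socket tuning options below are Linux-specific, so guard them.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print("keepalive enabled:", s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```

If an idle session only freezes when no such probes are sent, that would point at a device on the path dropping idle connections.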
Comment 15 Frank DeLorey 2016-02-29 11:22:45 EST
There are no firewalls between the clients and guests according to the customer.
I had previously inquired about this and here is the answer they gave:

Message
Lie, Sigbjorn on Feb 19 2016 at 06:31 AM -05:00
Hi,

It happens when they come back to the VDI after being away for some time.

This article describes issues with keyboard and mouse, however the keyboard is still working as described earlier. In our environment it's only the image which is frozen. Also, we do not have any firewalls between the machine where the SPICE client runs and the hypervisor where the SPICE server/qemu is running. They are on different subnets, however there is only a Nexus 7000 series router in between.


Regards,
Siggi
Comment 16 Frank DeLorey 2016-03-01 11:23:55 EST
Update from Customer:

 Created By: Sigbjorn Lie  (3/1/2016 10:58 AM)

I have an update to make.

I re-installed a RHEV-H 6 hypervisor with the RHEV-H 7 hypervisor iso, and migrated the most problematic users there. I also re-configured their VMs to match "Physical Memory Guaranteed:" with "Defined Memory:". 

These users had several freezes during a single day before this change, and now they've had none for several days.

Could there be a culprit for Windows vms/SPICE in RHEV-H 6 or related to how much memory is guaranteed for the VM?
Comment 17 Alexandros Gkesos 2016-03-03 07:33:50 EST
New update from customer:

I just had a user who had a freeze again, on the RHEV-H 7 based cluster.

His "Physical Memory Guaranteed" was set to 2098MB and his "Defined Memory:" to 12288MB.

The users who are no longer complaining have both "Physical Memory Guaranteed" and "Defined Memory:" set to 12288MB.

Even though this may be a workaround, it's not a good setting for VDI vms, as we will utilize much more memory than needed by having to reserve the full 12GB per vm.

Is there an issue with amount of allocated memory for the VM and SPICE?
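
To put the cost of this workaround in numbers, here is a small Python sketch using figures from this report (8 to 15 VMs per hypervisor, 12288MB defined memory, 2098MB guaranteed for the affected user). It assumes the guaranteed amount is simply reserved once per VM and ignores KSM and hypervisor overhead.

```python
def total_guaranteed_mb(vm_count, guaranteed_mb):
    """Memory in MB that must be reserved for vm_count identical VMs."""
    return vm_count * guaranteed_mb


DEFINED_MB = 12288   # "Defined Memory" from the report
PARTIAL_MB = 2098    # "Physical Memory Guaranteed" of the user who still froze

if __name__ == "__main__":
    for vms in (8, 15):
        full = total_guaranteed_mb(vms, DEFINED_MB)
        partial = total_guaranteed_mb(vms, PARTIAL_MB)
        print(f"{vms} VMs: {full} MB fully guaranteed vs {partial} MB "
              f"({full - partial} MB extra reserved)")
```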
Comment 19 Frank DeLorey 2016-03-09 10:03:07 EST
We have informed the customer of this and they will send us the data when this is next reported by a user.

Frank
Comment 20 Alexandros Gkesos 2016-05-09 05:59:58 EDT
With RHEV-H 7 and "Physical Memory Guaranteed" and "Defined Memory:" set to the same value it didn't happen again. Customer closed the case.

Closing this bug as well.
Thank you all
