Bug 1428804 - remote-viewer hangs after ~1 hour of client inactivity when SPICE proxy enabled
Summary: remote-viewer hangs after ~1 hour of client inactivity when SPICE proxy enabled
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: spice
Version: 7.7
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Frediano Ziglio
QA Contact: SPICE QE bug list
URL:
Whiteboard:
Depends On: 1719736
Blocks:
Reported: 2017-03-03 11:52 UTC by Andrei Stepanov
Modified: 2019-06-12 12:54 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1719736 (view as bug list)
Environment:
Last Closed: 2019-06-12 12:54:19 UTC
Target Upstream Version:
Embargoed:


Attachments
keepallive (proxy enabled) (45.83 KB, image/png)
2017-03-13 13:04 UTC, Radek Duda

Description Andrei Stepanov 2017-03-03 11:52:57 UTC
There was an RFE: "Implement a SPICE protocol level keepalive mechanism for all channels" https://bugzilla.redhat.com/show_bug.cgi?id=1298590

It was believed that the fix for the above bug would help solve the issues with RV hangups. However, that solution does not help: RV still hangs after a period of client inactivity.

Server:
spice-server-0.12.4-20.el7_3.x86_64
qemu-kvm-rhev-2.6.0-28.el7_3.6.x86_64

Client RHEL73:
rpm -qa | grep -E 'spice|viewer'
virt-viewer-2.0-12.el7.x86_64
gnome-font-viewer-3.14.1-4.el7.x86_64
spice-server-0.12.4-20.el7_3.x86_64
spice-gtk3-0.31-6.el7_3.2.x86_64
spice-vdagent-0.14.0-14.el7.x86_64
spice-glib-0.31-6.el7_3.2.x86_64
spice-protocol-0.12.11-1.el7.noarch

or Client Windows 10 with spice-client-msi-x86-4.1-6.el7ev.noarch

Steps to Reproduce:
1. Connect to VM with remote-viewer
2. Keep remote-viewer running.
3. Do not interact with VM in any way.

Actual results: After ~30-60 minutes RV hangs. It doesn't respond to any key presses or mouse clicks.

This bug is very tricky to reproduce. It would be good to know in advance which logs are needed.

Comment 2 Radek Duda 2017-03-06 13:03:54 UTC
Not reproducible with spice-0.12.4-17.el7 installed on the host. A keepalive packet is sent approx. every 10 minutes.

Comment 3 Radek Duda 2017-03-06 17:21:59 UTC
When SPICE proxy is disabled, keepalive packets are sent and the guest in remote-viewer does not hang. So this seems to be a SPICE proxy problem.

Comment 4 Christophe Fergeau 2017-03-06 17:29:14 UTC
What is the exact problem description now?

When spice proxy is enabled,
1) keepalive packets are not sent, and the guest hangs?
2) keepalive packets are sent, but the guest hangs?

Comment 5 Frediano Ziglio 2017-03-06 18:14:56 UTC
(In reply to Radek Duda from comment #3)
> When SPICE proxy is disabled, keepalive packets are sent and guest in
> remote-viewer does not hang. So this seems to be a SPICE proxy problem

When the client is connected through a proxy, it is the proxy that should send the keepalive packets; the server cannot, because the proxy works at L7 and the server's TCP keepalives stop at the proxy.
If the proxy (which can actually be any HTTP proxy) is not configured to send keepalive packets, this disconnection is expected.

There are some possible solutions:
- configure the proxy for keepalive;
- add keepalive support to the client, so that the client also sends keepalive packets to the proxy to keep the client <-> proxy connection alive;
- implement keepalive at the SPICE protocol level (I would avoid this);
- do not close connections on inactivity (usually not possible, as this is done for security reasons in the network infrastructure).
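
For illustration, the second option (keepalive on the client side of the client <-> proxy connection) could look roughly like the sketch below. This is not the spice-gtk implementation; it is a minimal Python example using standard socket options. The function name enable_tcp_keepalive is hypothetical, and the default values are assumptions: 600 seconds of idle time mirrors the "every 10 minutes" figure mentioned in this bug, while 75 and 9 are the RHEL 7.3 defaults quoted later in the thread.

```python
import socket


def enable_tcp_keepalive(sock, idle=600, interval=75, count=9):
    """Turn on TCP keepalive probes for an already created TCP socket."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The per-socket tuning options below are platform specific
    # (they exist on Linux), so each one is set only when available.
    for name, value in (
        ("TCP_KEEPIDLE", idle),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between unanswered probes
        ("TCP_KEEPCNT", count),       # unanswered probes before giving up
    ):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_tcp_keepalive(s)
    # A non-zero value means keepalive is enabled on this socket.
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE))
    s.close()
```

Probes sent this way only exercise the TCP connection between the client and the proxy; the proxy <-> server leg needs its own keepalive.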

Comment 6 Radek Duda 2017-03-07 09:31:17 UTC
(In reply to Christophe Fergeau from comment #4)
> What is the exact problem description now?
> 
> When spice proxy is enabled,
> 1) keepalive packets are not sent, and the guest hangs?
> 2) keepalive packets are sent, but the guest hangs?

When the proxy is enabled, keepalive packets are not sent to the client at all.

Comment 7 Frediano Ziglio 2017-03-07 12:38:46 UTC
(In reply to Radek Duda from comment #6)
> (In reply to Christophe Fergeau from comment #4)
> > What is the exact problem description now?
> > 
> > When spice proxy is enabled,
> > 1) keepalive packets are not sent, and the guest hangs?
> > 2) keepalive packets are sent, but the guest hangs?
> 
> When proxy is enabled, keepalive packets are not sent to client at all

This depends on the proxy configuration, as explained above.
We are currently discussing implementing keepalive in the client to keep the client <-> proxy connection alive.

Comment 8 Christophe Fergeau 2017-03-10 16:56:41 UTC
This spice-gtk scratch build should send keepalives every 10 minutes. Radek, can you test it and see if it helps with the hang you are seeing?
http://brewweb.engineering.redhat.com/brew/taskinfo?taskID=12738921

Comment 9 Frediano Ziglio 2017-03-13 09:19:13 UTC
> 
> Actual results: After ~30-60 minutes RV hangs. It doesn't respond to any
> keys, a mouse clicks.
> 

I was just reading this line. If there is no activity on the connection and there is some connection tracking in the middle, it is possible that some hop silently closes the connection without sending any data. However, clicks and keys are supposed to generate some traffic on the network. At this point the packets (in this case TCP) should just be dropped. But TCP should hit a timeout while waiting for an acknowledgement, so after a while RV should detect that the connection is closed (from the user-level perspective, after some timeout the kernel closes the connection and notifies user space).
So why does this not happen when sending keys/mouse events? Is the timeout too long? Or does RV just silently close the affected connection (input channel)?

Comment 10 Radek Duda 2017-03-13 13:02:30 UTC
(In reply to Christophe Fergeau from comment #8)
> This spice-gtk scratch build should send keepalives every 10 minutes, can
> you test it Radek and see if it helps with the hang you are seeing?
> http://brewweb.engineering.redhat.com/brew/taskinfo?taskID=12738921

I installed this spice-gtk package on the client (rhel7.3-z) and detected keepalive packets being sent from the client to the proxy after 10 minutes of inactivity (proxy enabled). But that is all: after another 10 minutes no further keepalive packet is sent and the guest in remote-viewer becomes unresponsive to user input.

Before, I didn't detect any keepalive at all.

Comment 11 Radek Duda 2017-03-13 13:04:46 UTC
Created attachment 1262439
keepallive (proxy enabled)

Screenshot from Wireshark: a keepalive packet is sent after ~10 minutes
10.34.130.192 - client
13.34.73.1 - proxy

Comment 12 Frediano Ziglio 2017-03-13 14:23:01 UTC
(In reply to Radek Duda from comment #10)
> (In reply to Christophe Fergeau from comment #8)
> > This spice-gtk scratch build should send keepalives every 10 minutes, can
> > you test it Radek and see if it helps with the hang you are seeing?
> > http://brewweb.engineering.redhat.com/brew/taskinfo?taskID=12738921
> 
> I installed this spice-gtk package to client (rhel7.3-z) and detected
> keepalive packets after 10 minutes of inactivity to be send from client to
> proxy (have proxy enabled). But that is all - after another 10 minutes no
> any more keepalive packet is sent and guest in remote-viewer becames
> unresponsive to user inputs. 
> 
> Before I didn't detected keepalive at all.

There are some weird things. There's no patch attached, so maybe some details are missing, however:
- the keepalives are only for a single connection (port 49734; the first 2 packets are other stuff). Where are the other connections? I have to suppose that either keepalive is enabled for a single connection or the client is closing the other connections, but in that case why does it hang?
- it seems that the interval is 10 seconds, but the default for RHEL 7.3 is 75, and the count is 5 (there are 5 keepalive packets), but the default for RHEL 7.3 is 9. Did the patch change these settings?

Surely the client is either not properly detecting the disconnection from the server or not propagating a channel close to the other channels. I am not really knowledgeable about the spice-gtk code.

Comment 13 Frediano Ziglio 2017-03-13 17:20:07 UTC
(In reply to Frediano Ziglio from comment #12)
> (In reply to Radek Duda from comment #10)
> > (In reply to Christophe Fergeau from comment #8)
> > > This spice-gtk scratch build should send keepalives every 10 minutes, can
> > > you test it Radek and see if it helps with the hang you are seeing?
> > > http://brewweb.engineering.redhat.com/brew/taskinfo?taskID=12738921
> > 
> > I installed this spice-gtk package to client (rhel7.3-z) and detected
> > keepalive packets after 10 minutes of inactivity to be send from client to
> > proxy (have proxy enabled). But that is all - after another 10 minutes no
> > any more keepalive packet is sent and guest in remote-viewer becames
> > unresponsive to user inputs. 
> > 
> > Before I didn't detected keepalive at all.
> 
> There are some weird thing. There's no patch attached so maybe there are
> some detail missing, however:
> - the keepalives are only for a single connection (port 49734, first 2
> packets are other stuff), where are the other connections? I have to suppose
> that either keepalive is enabled for a single connection or that the client
> is closing other connection but in this case why it hangs?
> - it seems that interval is 10 seconds but default for RHEL7.3 is 75 and
> count is 5 (there are 5 keep alive packets) but the default for RHEL7.3 is
> 9. Did the code patch changed these settings?
> 
> Surely the client is not detecting properly disconnection from the server or
> not propagate a channel close to other channel. Not really knowledgeable
> about spice-gtk code.

Ignore...
Our keepalive packets are the first 2, the ones to 13.34.73.1, the proxy:
2 for 2 different connections (ports 49942 and 49944).
But this seems to indicate that the client closed the connection to the proxy, as otherwise we should see further keepalives. But this should be visible to the user too.

Comment 14 Christophe Fergeau 2017-03-24 17:20:32 UTC
Most likely not solvable with a client-side TCP keepalive change as I reproduced this on a system with
# cat /proc/sys/net/ipv4/tcp_keepalive_time 
120
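
For anyone trying to reproduce this, the other system-wide keepalive settings can be collected alongside tcp_keepalive_time. The sketch below is only a convenience helper, not part of any SPICE tooling; the function name read_keepalive_sysctls is hypothetical. It reads the three standard Linux sysctl files and skips any that are missing. Note that these system-wide values only take effect on sockets that have SO_KEEPALIVE enabled.

```python
from pathlib import Path

# Standard Linux sysctl file names for TCP keepalive tuning.
KEEPALIVE_SYSCTLS = (
    "tcp_keepalive_time",    # idle seconds before the first probe
    "tcp_keepalive_intvl",   # seconds between unanswered probes
    "tcp_keepalive_probes",  # unanswered probes before giving up
)


def read_keepalive_sysctls(base=Path("/proc/sys/net/ipv4")):
    """Return the TCP keepalive settings found under base as a dict."""
    settings = {}
    for name in KEEPALIVE_SYSCTLS:
        path = Path(base) / name
        if path.is_file():
            settings[name] = int(path.read_text().strip())
    return settings


if __name__ == "__main__":
    for name, value in read_keepalive_sysctls().items():
        print(f"{name} = {value}")
```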

Comment 17 David Blechter 2019-06-11 14:50:27 UTC
Moving to 7.8; too late for 7.7. We need to solve it in RHEL 8 as well.

Comment 18 Frediano Ziglio 2019-06-11 18:57:34 UTC
Both client and server have implemented keepalives. It is just a matter of backporting the needed changesets.

Comment 19 Frediano Ziglio 2019-06-12 11:35:16 UTC
spice-server in RHEL 7.7 already has support for keepalive at the TCP level.
Note that with a proxy in the middle, both server and client must support keepalive in order to keep the proxy tunnel alive.
The patch for the client is:

commit 677782fb6aa471d5e6d007744a5c6564b1f3021f
Author: Jeremy White <jwhite>
Date:   Tue Apr 30 17:04:59 2019 -0500

    Detect timeout conditions more aggressively on Linux
    
    This mitigates a fairly rare problem we see with our kiosk mode clients.
    That is, normally if something goes wrong with a client connection
    (e.g. the session is killed, or the server is restarted ), the kiosk will
    exit on disconnect, and we get a chance to retry the connection, or
    present the user with a 'server down' style message.
    
    But in the case of a serious network problem or a server hard power
    cycle (i.e. no TCP FIN packets can flow), our end user behavior is not
    ideal - the kiosk appears to hang solid, requiring a power cycle.
    
    That's because we've got the stock keepalive timeouts, or about 2 hours
    and 11 minutes, before the client sees the disconnect.
    
    This change will cause the client to recognize the server has vanished
    without a TCP FIN after 75 seconds.
    
    See this thread:
      https://lists.freedesktop.org/archives/spice-devel/2017-March/036553.html
    
    As well as this bug:
      https://bugzilla.redhat.com/show_bug.cgi?id=1436589
    
    Signed-off-by: Jeremy White <jwhite>


which is included in spice-gtk version 0.37. The version distributed with RHEL 7.7 is 0.35, so the patch needs to be backported on the client.
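
The "about 2 hours and 11 minutes" in the commit message can be checked with a little arithmetic: with TCP keepalive, a vanished peer is detected after the idle time plus one probe interval per unanswered probe. The sketch below uses the stock Linux idle time of 7200 seconds together with the interval (75) and probe count (9) quoted earlier in this bug. The second set of numbers (30, 15, 3) is only an assumed combination that adds up to the 75 seconds named in the commit message; it is not taken from the patch itself.

```python
def keepalive_detection_seconds(idle, interval, probes):
    """Seconds until an idle connection to a vanished peer is declared dead."""
    return idle + interval * probes


def format_duration(seconds):
    """Render a number of seconds as hours, minutes and seconds."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}h {minutes}m {secs}s"


# Stock settings: 7200 s idle, 75 s between probes, 9 probes.
stock = keepalive_detection_seconds(idle=7200, interval=75, probes=9)
print(stock, format_duration(stock))  # 7875 2h 11m 15s

# Hypothetical aggressive settings (an assumption, not read from the patch).
aggressive = keepalive_detection_seconds(idle=30, interval=15, probes=3)
print(aggressive)  # 75
```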

Comment 23 Frediano Ziglio 2019-06-12 12:54:19 UTC
The spice-server part is already in place; the spice-gtk part will be merged in RHEL 7.8. Opened bug #1719736.

