Bug 1397188 - Unusual OpenShift Console Behavior
Summary: Unusual OpenShift Console Behavior
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 3.2.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Dan Winship
QA Contact: Meng Bo
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-11-21 20:29 UTC by Steven Walter
Modified: 2017-03-01 09:33 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-01-30 19:32:26 UTC
Target Upstream Version:


Description Steven Walter 2016-11-21 20:29:49 UTC
Description of problem:
After logging into the OpenShift Web Console and leaving a single page idle for about 1 minute, a dialog box reading "Server Connection Interrupted" appears. The page must then be refreshed, or nothing on it updates.

Version-Release number of selected component (if applicable):


How reproducible:
Unverified

Actual results:
See private comment for details

Expected results:
The console loads fine and does not drop idle connections

Additional info:
It does not break any builds, but it does make navigating the web UI quite painful. Performance is otherwise fine: the site loads and runs smoothly. It is only after about a minute of sitting idle on the web UI that it forces a refresh. On the CLI, if you "oc rsh" into a container, you get kicked out after about 1-2 minutes of sitting idle at the container's command prompt. Performance seems the same when using VPN or within VPN, so it is unlikely that any firewall/VPN in the middle is the cause. In the browser's web developer console, most WebSocket traffic shows as Pending for a very long time. For example:
imagestreams?watch=true&resourceVersion=972..... takes many seconds rather than the very short 0.1 or 0.01 seconds it takes in a separate test environment.

Comment 8 Ben Bennett 2016-11-23 13:54:00 UTC
Can you please get a tcpdump / wireshark trace of the traffic at the client side?  Ideally one with the working Safari and one with Chrome.  I'd be interested to see what connections exist and which end is initiating the teardown.

Comment 13 Ben Bennett 2017-01-25 19:38:38 UTC
Are they still seeing this behavior?

Comment 15 Steven Walter 2017-01-25 20:32:03 UTC
I'm verifying with them

Comment 17 Dan Winship 2017-01-25 22:57:01 UTC
The Safari capture shows a connection being established at time 1.530214, being used for a bit of traffic, and then going idle at 1.834035. Then at 61.920059 (60 seconds plus epsilon later), the server cleanly closes its end of the connection (and then the client tries to get in a few words before closing the other end of the connection, but the server has already stopped listening to it, and so responds with RSTs).

So there doesn't seem to be anything "networky" going on there; the connection has gone idle, the server has apparently been configured to close any connections that are idle for longer than 60 seconds, and the client apparently isn't expecting this.
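
As a rough check of the arithmetic behind that reading, the sketch below recomputes the idle gap from the capture timestamps quoted above and compares it with a 60-second timeout. The function names and the 0.5-second tolerance are illustrative assumptions, not anything taken from the capture or from the server configuration.

```python
# Timestamps (seconds since the start of the Safari capture) quoted in this comment.
CONNECTION_ESTABLISHED = 1.530214
LAST_ACTIVITY = 1.834035
SERVER_CLOSE = 61.920059


def idle_gap(last_activity: float, close_time: float) -> float:
    """Return how long the connection sat idle before it was closed."""
    return close_time - last_activity


def looks_like_idle_timeout(gap: float, timeout: float, tolerance: float = 0.5) -> bool:
    """Hypothetical heuristic: the close came shortly after `timeout` seconds of idleness.

    The tolerance is an assumed allowance for timer granularity and network delay.
    """
    return 0.0 <= gap - timeout <= tolerance


gap = idle_gap(LAST_ACTIVITY, SERVER_CLOSE)
print(f"idle for {gap:.6f} s before the server closed the connection")
print("consistent with a 60 s idle timeout:", looks_like_idle_timeout(gap, 60.0))
```

The gap works out to just over 60 seconds, which is what one would expect from a server-side or proxy-side idle timer rather than from packet loss.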

I don't know much about our web UI, but based on the "will not even allow me to login and it takes FOREVER to load the login page" comment it seems like the problem is that some request that is supposed to happen quickly is getting "stuck" forever(-ish) somewhere in the backend and so we don't get a response before the proxy times out the connection? Looking at the logs on the server might show something.


(It's worth noting that in both pcaps there seems to be a lot of network lossage (retransmissions, duplicates, etc) going on. But TCP seems to be coping with that lossage (because TCP), so I don't think it's related to this problem.)

Comment 18 Ben Bennett 2017-01-30 19:32:26 UTC
Based on Dan's comment it looks like something is taking too long to respond in the application.

You can increase the timeouts:
- Globally: https://docs.openshift.com/container-platform/3.3/install_config/configuring_routing.html#install-config-configuring-route-timeouts

- Per-route: https://docs.openshift.com/container-platform/3.3/architecture/core_concepts/routes.html#haproxy-template-router (see the environment variable section)

