Bug 1646901

Summary: [starter-ca-central-1] routing issues
Product: OpenShift Online
Reporter: Jiří Fiala <jfiala>
Component: Routing
Assignee: Dan Mace <dmace>
Status: CLOSED DUPLICATE
QA Contact: zhaozhanqi <zzhao>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 3.x
CC: aos-bugs
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-11-06 11:44:39 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description: HTTP 200/503 rate using different routes
Flags: none

Description Jiří Fiala 2018-11-06 09:32:01 UTC
Created attachment 1502338
HTTP 200/503 rate using different routes

Description of problem:

Requests to exposed and admitted routes often return HTTP 503 even though the application is running and responds correctly when checked locally (for example, by opening a remote shell into the pod and running curl localhost:8080).

We received a report detailing this behaviour as follows:
----
I went ahead and created three routes with different configurations:

Insecure: http://insecure-kros-converter.193b.starter-ca-central-1.openshiftapps.com/ 
With multiple routes defined, this one no longer redirects to HTTPS (previously, an insecure route with no TLS termination and Insecure Policy defined was unexpectedly redirected from HTTP to HTTPS, which correctly reported HTTP 503), but it rarely works. It seems to start working after a few attempts following a period of inactivity; however, some of the static resources often fail to load.

Secure, allow HTTP traffic: https://secure-kros-converter.193b.starter-ca-central-1.openshiftapps.com/ 
Works in a fashion similar to the above, maybe a bit more often, but still very unstable. 
Curiously, http://secure-kros-converter.193b.starter-ca-central-1.openshiftapps.com/ (i.e. the secure route but with HTTP) is noticeably more stable, although still not perfect.

Secure, redirect HTTP to HTTPS: https://redirect-kros-converter.193b.starter-ca-central-1.openshiftapps.com/ 
Probably the most stable during activity (about 50%), but it also stops working entirely after a period of inactivity and, curiously, also after another route has been used for a while. Interestingly, the HTTPS redirect works only in those periods of stability, which may suggest that the problem occurs somewhere before the HTTPS redirect decision is made, i.e. quite early in request handling.

Note that doing curl localhost:8080 in the pod terminal is 100% stable, so I don't suspect a problem with the pod.
----
A summary of the currently observed success/failure rate for the route URLs mentioned above can be found in the attached 'ca-central-1-503-rate.log' file.

Version-Release number of selected component (if applicable):
Server https://api.starter-ca-central-1.openshift.com:443
openshift v3.11.16
kubernetes v1.11.0+d4cacc0

How reproducible:
Consistently reproducible over the past few days using various routes.

Steps to Reproduce:
1. Deploy a new application.
2. Create one or more routes to the application.
3. Hit the external URL repeatedly.
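
Step 3 can be scripted. The following is a minimal sketch rather than the tooling used to produce the attached log: the helper names (fetch_status, tally_statuses, success_rate) are hypothetical, and it assumes plain GET requests with Python's standard urllib are representative of the failing traffic. Note that urllib follows redirects, so for the redirect route the tally reflects the final response.

```python
import urllib.error
import urllib.request
from collections import Counter


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code of a GET to url, or 0 if no HTTP response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses (including the router's 503) are raised as HTTPError.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, connection refused, timeout, TLS error, etc.
        return 0


def tally_statuses(url, attempts, fetch=fetch_status):
    """Request url `attempts` times and count how often each status code occurs."""
    return Counter(fetch(url) for _ in range(attempts))


def success_rate(tally):
    """Fraction of attempts that returned HTTP 200 (0.0 for an empty tally)."""
    total = sum(tally.values())
    return tally[200] / total if total else 0.0
```

For example, calling tally_statuses("http://insecure-kros-converter.193b.starter-ca-central-1.openshiftapps.com/", 50) and passing the result to success_rate gives the share of HTTP 200 responses for that route; repeating it per route allows the three configurations to be compared.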

Actual results:
Random and relatively frequent HTTP 503 responses to those requests, sometimes only for a particular static resource.

Expected results:
consistent response, as long as the app is running properly

Additional info:
This is mentioned in INC0802435
The observed behaviour is similar to https://bugzilla.redhat.com/show_bug.cgi?id=1609751
This may be a different issue from https://bugzilla.redhat.com/show_bug.cgi?id=1645206, since the router pods are not restarting frequently at all.

Comment 1 Jiří Fiala 2018-11-06 11:44:39 UTC

*** This bug has been marked as a duplicate of bug 1646903 ***