Bug 1791905 - the server sometimes is unable to handle the get routes.route.openshift.io request
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: openshift-apiserver
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 4.4.0
Assignee: Lukasz Szaszkiewicz
QA Contact: Xingxing Xia
 
Reported: 2020-01-16 16:48 UTC by Oleg Bulatov
Modified: 2020-03-24 13:00 UTC
CC: 5 users

Doc Type: No Doc Update
Last Closed: 2020-02-17 16:52:36 UTC


Attachments
http request to route.openshift.io group (103.22 KB, image/png)
2020-02-12 11:22 UTC, Lukasz Szaszkiewicz
aggregator_unavailable_apiservice metrics (44.55 KB, image/png)
2020-02-12 11:23 UTC, Lukasz Szaszkiewicz

Description Oleg Bulatov 2020-01-16 16:48:14 UTC
Description of problem:

Recently I found that some of our tests fail because of the error

unable to get route: the server is currently unable to handle the request (get routes.route.openshift.io testroute)

In our test we have disabled the openshift-apiserver operator, so openshift-apiserver shouldn't be interrupted and should stay available during the test.
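
For illustration, a minimal sketch of the failing call, assuming the generated OpenShift route client from github.com/openshift/client-go and a recent client-go (the namespace name and kubeconfig wiring below are assumptions, not taken from the actual test):

package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/tools/clientcmd"

	routev1client "github.com/openshift/client-go/route/clientset/versioned/typed/route/v1"
)

func main() {
	// Build a client from the local kubeconfig (assumption: the test
	// environment provides one; the real suite may wire this differently).
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	routes, err := routev1client.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	// This GET is served by openshift-apiserver through the aggregation
	// layer; a 503 here is what produces "the server is currently unable
	// to handle the request (get routes.route.openshift.io testroute)".
	route, err := routes.Routes("test-namespace").Get(context.TODO(), "testroute", metav1.GetOptions{})
	if err != nil {
		fmt.Printf("unable to get route: %v\n", err)
		return
	}
	fmt.Printf("route host: %s\n", route.Spec.Host)
}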

Version-Release number of selected component (if applicable):

master

How reproducible:

Sometimes. There are a few occurrences:

https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/pr-logs/pull/openshift_cluster-image-registry-operator/428/pull-ci-openshift-cluster-image-registry-operator-master-e2e-aws-operator/2202
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/pr-logs/pull/openshift_cluster-image-registry-operator/437/pull-ci-openshift-cluster-image-registry-operator-master-e2e-aws-operator/2196
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/pr-logs/pull/openshift_cluster-image-registry-operator/428/pull-ci-openshift-cluster-image-registry-operator-master-e2e-aws-operator/2195

Expected results:

The apiserver stays highly available during the test suite and always handles `get routes.route.openshift.io` requests.

Comment 2 Lukasz Szaszkiewicz 2020-02-12 11:21:31 UTC
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/pr-logs/pull/openshift_cluster-image-registry-operator/437/pull-ci-openshift-cluster-image-registry-operator-master-e2e-aws-operator/2196 seems to be a one-off issue.

The test tried to connect to the server but failed with "unable to get route: the server is currently unable to handle the request (get routes.route.openshift.io testroute)". The error message implies that the server returned an HTTP 503 status code. However, I didn't find a single request with that status (see the attached file). Additionally, the `aggregator_unavailable_apiservice` metric didn't report anything.
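
On the client side, such a 503 surfaces as a ServiceUnavailable API error. A minimal sketch of how it can be classified with the k8s.io/apimachinery error helpers (the classifyGetError helper is hypothetical, for illustration only):

package main

import (
	"fmt"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
)

// classifyGetError reports whether err is the aggregator-style 503 that
// produces "the server is currently unable to handle the request".
// (Hypothetical helper, not from the test suite.)
func classifyGetError(err error) {
	switch {
	case err == nil:
		fmt.Println("request succeeded")
	case apierrors.IsServiceUnavailable(err):
		// HTTP 503: the aggregation layer could not reach the
		// openshift-apiserver backend for route.openshift.io.
		fmt.Println("503 ServiceUnavailable:", err)
	default:
		fmt.Println("other error:", err)
	}
}

func main() {
	// Simulate the error the test observed.
	err := apierrors.NewServiceUnavailable("the server is currently unable to handle the request")
	classifyGetError(err)
}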

Comment 3 Lukasz Szaszkiewicz 2020-02-12 11:22:37 UTC
Created attachment 1662648 [details]
http request to route.openshift.io group

Comment 4 Lukasz Szaszkiewicz 2020-02-12 11:23:18 UTC
Created attachment 1662649 [details]
aggregator_unavailable_apiservice metrics

Comment 5 Lukasz Szaszkiewicz 2020-02-12 11:54:27 UTC
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/pr-logs/pull/openshift_cluster-image-registry-operator/428/pull-ci-openshift-cluster-image-registry-operator-master-e2e-aws-operator/2202 is similar; additionally, one test reports `dial tcp 10.0.131.15:10250: connect: connection refused`.

Comment 7 Lukasz Szaszkiewicz 2020-02-12 12:49:53 UTC
I've briefly checked other runs to see if they suffer from the same issue and haven't found any (https://prow.svc.ci.openshift.org/job-history/origin-ci-test/pr-logs/directory/pull-ci-openshift-cluster-image-registry-operator-master-e2e-aws-operator).

Comment 8 Lukasz Szaszkiewicz 2020-02-12 12:56:21 UTC
Assuming that the graceful termination mechanism we have works, you shouldn't see any interruptions even during restarts; a test could also tolerate brief 503s with a retry, as sketched below.
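
A hypothetical sketch of such a retry (illustrative only, not part of the actual suite), assuming the `wait` and `errors` helpers from k8s.io/apimachinery:

package main

import (
	"fmt"
	"time"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/util/wait"
)

// getRouteWithRetry polls doGet until it stops returning 503, so a short
// apiserver rollout does not immediately fail the caller. (Hypothetical
// helper, for illustration.)
func getRouteWithRetry(doGet func() error) error {
	return wait.PollImmediate(2*time.Second, 30*time.Second, func() (bool, error) {
		err := doGet()
		if apierrors.IsServiceUnavailable(err) {
			// Transient aggregator 503: keep polling.
			return false, nil
		}
		// Done: success, or a non-retryable error that aborts the poll.
		return err == nil, err
	})
}

func main() {
	attempts := 0
	err := getRouteWithRetry(func() error {
		attempts++
		if attempts < 3 {
			return apierrors.NewServiceUnavailable("simulated transient 503")
		}
		return nil
	})
	fmt.Printf("attempts=%d err=%v\n", attempts, err)
}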
Oleg, does the issue still occur?

Comment 9 Oleg Bulatov 2020-02-13 15:50:28 UTC
I haven't seen this issue for a while.

