Bug 1926146 - [sig-network-edge][Conformance][Area:Networking][Feature:Router] The HAProxy router should be able to connect to a service that is idled because a GET on the route will unidle it
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ---
Target Release: 4.8.0
Assignee: Andrew McDermott
QA Contact: Arvind iyengar
URL:
Whiteboard:
Depends On:
Blocks: 1927953 1944204
Reported: 2021-02-08 10:51 UTC by Jan Chaloupka
Modified: 2022-08-04 22:32 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
[sig-network-edge][Conformance][Area:Networking][Feature:Router] The HAProxy router should be able to connect to a service that is idled because a GET on the route will unidle it
Last Closed: 2021-07-27 22:42:10 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Github openshift origin pull 25874 0 None closed Bug 1926146: test/extended/router/idle: address flakes/failures seen in CI 2021-02-25 10:13:47 UTC
Red Hat Product Errata RHSA-2021:2438 0 None None None 2021-07-27 22:42:37 UTC

Description Jan Chaloupka 2021-02-08 10:51:13 UTC
test:
[sig-network-edge][Conformance][Area:Networking][Feature:Router] The HAProxy router should be able to connect to a service that is idled because a GET on the route will unidle it 

is failing frequently in CI, see search results:
https://search.ci.openshift.org/?maxAge=168h&context=1&type=bug%2Bjunit&name=&maxMatches=5&maxBytes=20971520&groupBy=job&search=%5C%5Bsig-network-edge%5C%5D%5C%5BConformance%5C%5D%5C%5BArea%3ANetworking%5C%5D%5C%5BFeature%3ARouter%5C%5D+The+HAProxy+router+should+be+able+to+connect+to+a+service+that+is+idled+because+a+GET+on+the+route+will+unidle+it


https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-gcp-rt-4.7/1358650050154074112

The test fails following the g.By("Validating that the idle annotations have been removed from the endpoints") step, where it times out after 3 minutes:
```
fail [github.com/openshift/origin/test/extended/router/idle.go:105]: Unexpected error:
    <*errors.errorString | 0xc0002d29b0>: {
        s: "timed out waiting for the condition",
    }
    timed out waiting for the condition
occurred
```
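The failing step is a poll-with-timeout: the test repeatedly checks whether the idle annotations are gone from the endpoints and gives up when the deadline passes. The following is a minimal, standard-library-only sketch of that pattern, not the code in idle.go. The helper names (`pollUntil`, `hasIdleAnnotations`), the annotation keys, and the simulated annotations map are all assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errWaitTimeout reuses the message seen in the failure output above.
var errWaitTimeout = errors.New("timed out waiting for the condition")

// Annotation keys assumed for illustration; verify them against the
// idling API used by your cluster version.
const (
	idledAtAnnotation      = "idling.alpha.openshift.io/idled-at"
	unidleTargetAnnotation = "idling.alpha.openshift.io/unidle-targets"
)

// hasIdleAnnotations reports whether either idle annotation is still present.
func hasIdleAnnotations(annotations map[string]string) bool {
	_, idledAt := annotations[idledAtAnnotation]
	_, targets := annotations[unidleTargetAnnotation]
	return idledAt || targets
}

// pollUntil (hypothetical helper) calls condition every interval until it
// returns true, returns an error, or the timeout elapses.
func pollUntil(interval, timeout time.Duration, condition func() (bool, error)) error {
	deadline := time.Now().Add(timeout)
	for {
		done, err := condition()
		if err != nil {
			return err
		}
		if done {
			return nil
		}
		if !time.Now().Before(deadline) {
			return errWaitTimeout
		}
		time.Sleep(interval)
	}
}

func main() {
	// Simulated endpoints annotations; a real test would re-read the
	// endpoints object from the API server on every poll.
	annotations := map[string]string{idledAtAnnotation: "2021-02-08T10:51:13Z"}

	attempts := 0
	err := pollUntil(10*time.Millisecond, 200*time.Millisecond, func() (bool, error) {
		attempts++
		if attempts == 3 {
			// Pretend unidling has completed and the annotation was removed.
			delete(annotations, idledAtAnnotation)
		}
		return !hasIdleAnnotations(annotations), nil
	})
	fmt.Println("attempts:", attempts, "error:", err)
}
```

With this shape, raising the timeout only changes how long the loop keeps retrying; it does not help if the condition can never become true, for example when the request that should trigger unidling never reaches the router.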

Comment 1 Andrew McDermott 2021-02-08 18:15:11 UTC
The CI test runs locally in a loop without failure, even with other
origin tests running. As a first pass I will just bump the timeouts
for the things the test waits on (currently set at 3m and 1m);
on a test cluster the latter may not be enough.

Marking as blocker- for the moment.

Comment 2 Standa Laznicka 2021-02-10 08:29:56 UTC
This test is failing in 100% of cases on the proxy jobs:
https://testgrid.k8s.io/redhat-openshift-ocp-release-4.7-informing#periodic-ci-openshift-release-master-ocp-4.7-e2e-aws-proxy
https://testgrid.k8s.io/redhat-openshift-ocp-release-4.8-informing#periodic-ci-openshift-release-master-ocp-4.8-e2e-aws-proxy

I can only assume you forgot to add `Proxy: http.ProxyFromEnvironment` to your transport. Please fix ASAP.
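For readers unfamiliar with the setting mentioned above: a Go `http.Transport` built by hand has a nil `Proxy` field and therefore connects directly, ignoring `HTTP_PROXY`, `HTTPS_PROXY` and `NO_PROXY`. The sketch below shows the general shape of a proxy-aware client. It is not the change made in the linked pull request, and `newProxyAwareClient`, `proxyFor` and the example hostname are hypothetical names introduced here.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// newProxyAwareClient (hypothetical helper) returns an HTTP client whose
// transport consults the proxy environment variables for each request.
func newProxyAwareClient(timeout time.Duration) *http.Client {
	return &http.Client{
		Timeout: timeout,
		Transport: &http.Transport{
			// Without this field the transport never uses a proxy.
			Proxy: http.ProxyFromEnvironment,
		},
	}
}

// proxyFor (hypothetical helper) reports which proxy URL, if any, the
// client's transport would use for rawURL. An empty string means a direct
// connection.
func proxyFor(client *http.Client, rawURL string) (string, error) {
	tr, ok := client.Transport.(*http.Transport)
	if !ok || tr.Proxy == nil {
		return "", nil
	}
	req, err := http.NewRequest(http.MethodGet, rawURL, nil)
	if err != nil {
		return "", err
	}
	u, err := tr.Proxy(req)
	if err != nil || u == nil {
		return "", err
	}
	return u.String(), nil
}

func main() {
	client := newProxyAwareClient(10 * time.Second)

	// The hostname is a placeholder, not a route from the CI job.
	proxy, err := proxyFor(client, "http://route.example.test/")
	if err != nil {
		fmt.Println("proxy lookup failed:", err)
		return
	}
	if proxy == "" {
		fmt.Println("no proxy configured; connecting directly")
		return
	}
	fmt.Println("requests would go through proxy:", proxy)
}
```

Note that `http.ProxyFromEnvironment` reads the environment once per process and never proxies requests to `localhost`, so changing the variables after the first request has no effect.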

Comment 3 Standa Laznicka 2021-02-10 09:25:19 UTC
The test also has a 100% failure rate on s390x libvirt installs, failing with "timed out waiting for the condition" (I did not investigate further as to why):
https://testgrid.k8s.io/redhat-openshift-ocp-release-4.7-informing#release-openshift-origin-installer-e2e-remote-libvirt-s390x-4.7

Comment 4 Andrew McDermott 2021-02-10 17:41:08 UTC
(In reply to Standa Laznicka from comment #2)
> This test is failing 100% cases on proxy:
> https://testgrid.k8s.io/redhat-openshift-ocp-release-4.7-informing#periodic-
> ci-openshift-release-master-ocp-4.7-e2e-aws-proxy
> https://testgrid.k8s.io/redhat-openshift-ocp-release-4.8-informing#periodic-
> ci-openshift-release-master-ocp-4.8-e2e-aws-proxy
> 
> I can only assume you forgot to add `Proxy: http.ProxyFromEnvironment` to
> your transport. Please fix ASAP.

Proxy passing: https://github.com/openshift/origin/pull/25874#issuecomment-776885859

Comment 6 jamo luhrsen 2021-02-16 00:01:18 UTC
I noticed the PR attached to this bug was merged 4 days ago, but the test case is still permafailing in this job:
https://testgrid.k8s.io/redhat-openshift-ocp-release-4.7-informing#periodic-ci-openshift-release-master-ocp-4.7-e2e-aws-proxy

I'll re-open it for now, but maybe we want a new bug?

Comment 7 jamo luhrsen 2021-02-22 19:25:23 UTC
(In reply to jamo luhrsen from comment #6)
> I noticed the PR attached to this bug was merged 4 days ago, but the test
> case is still permafailing in this job:
> https://testgrid.k8s.io/redhat-openshift-ocp-release-4.7-informing#periodic-
> ci-openshift-release-master-ocp-4.7-e2e-aws-proxy
> 
> I'll re-open it for now, but maybe we want a new bug?

Still failing. Any update?

Comment 8 jamo luhrsen 2021-02-22 19:37:20 UTC
(In reply to jamo luhrsen from comment #7)
> (In reply to jamo luhrsen from comment #6)
> > I noticed the PR attached to this bug was merged 4 days ago, but the test
> > case is still permafailing in this job:
> > https://testgrid.k8s.io/redhat-openshift-ocp-release-4.7-informing#periodic-
> > ci-openshift-release-master-ocp-4.7-e2e-aws-proxy
> > 
> > I'll re-open it for now, but maybe we want a new bug?
> 
> still failing. any update?

It looks like this is the PR to resolve this for 4.7:
https://github.com/openshift/origin/pull/25892

Comment 9 Andrew McDermott 2021-02-25 09:46:17 UTC
(In reply to jamo luhrsen from comment #7)
> (In reply to jamo luhrsen from comment #6)
> > I noticed the PR attached to this bug was merged 4 days ago, but the test
> > case is still permafailing in this job:
> > https://testgrid.k8s.io/redhat-openshift-ocp-release-4.7-informing#periodic-
> > ci-openshift-release-master-ocp-4.7-e2e-aws-proxy
> > 
> > I'll re-open it for now, but maybe we want a new bug?
> 
> still failing. any update?

The PR originally referenced only made it into 4.8:

 https://github.com/openshift/origin/pull/25874

Updated the BZ links; https://github.com/openshift/origin/pull/25892 is the backport PR for 4.7.

Comment 10 Andrew McDermott 2021-02-25 10:13:28 UTC
(In reply to jamo luhrsen from comment #6)
> I noticed the PR attached to this bug was merged 4 days ago, but the test
> case is still permafailing in this job:
> https://testgrid.k8s.io/redhat-openshift-ocp-release-4.7-informing#periodic-
> ci-openshift-release-master-ocp-4.7-e2e-aws-proxy
> 
> I'll re-open it for now, but maybe we want a new bug?

This change only landed in 4.8.

I see the test passing in: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.8-informing#periodic-ci-openshift-release-master-ocp-4.8-e2e-aws-proxy

Please can we revalidate this as the test is passing in 4.8?

The 4.7 backport is here: https://bugzilla.redhat.com/show_bug.cgi?id=1927953

Comment 13 Andrew McDermott 2021-03-29 10:54:45 UTC
Correcting target release based on https://github.com/openshift/origin/pull/25893#issuecomment-777860169

Comment 19 errata-xmlrpc 2021-07-27 22:42:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

