Bug 1949978 - [sig-network-edge][Conformance][Area:Networking][Feature:Router] The HAProxy router should pass the h2spec conformance tests [Suite:openshift/conformance/parallel/minimal]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.8.0
Assignee: Andrew McDermott
QA Contact: jechen
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-04-15 14:09 UTC by Oleg Bulatov
Modified: 2022-08-04 22:32 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
[sig-network-edge][Conformance][Area:Networking][Feature:Router] The HAProxy router should pass the h2spec conformance tests [Suite:openshift/conformance/parallel/minimal]
Last Closed: 2021-07-27 23:01:13 UTC
Target Upstream Version:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift origin pull 26083 | 0 | None | closed | Bug 1949978: test/extended/router: h2spec increase timeout and retry tests | 2021-04-19 14:10:06 UTC
Github | openshift origin pull 26086 | 0 | None | closed | Bug 1949978: test/extended/router: skip h2spec on proxy jobs | 2021-04-19 14:10:09 UTC
Github | openshift origin pull 26089 | 0 | None | open | Bug 1949978: test/extended/router: skip ingresscontroller deletion | 2021-04-19 14:10:12 UTC
Red Hat Product Errata | RHSA-2021:2438 | 0 | None | None | None | 2021-07-27 23:01:31 UTC

Description Oleg Bulatov 2021-04-15 14:09:27 UTC
The test
[sig-network-edge][Conformance][Area:Networking][Feature:Router] The HAProxy router should pass the h2spec conformance tests [Suite:openshift/conformance/parallel/minimal]

is failing frequently in CI. See the search results:
https://search.ci.openshift.org/?maxAge=168h&context=1&type=bug%2Bjunit&name=&maxMatches=5&maxBytes=20971520&groupBy=job&search=%5C%5Bsig-network-edge%5C%5D%5C%5BConformance%5C%5D%5C%5BArea%3ANetworking%5C%5D%5C%5BFeature%3ARouter%5C%5D+The+HAProxy+router+should+pass+the+h2spec+conformance+tests+%5C%5BSuite%3Aopenshift%2Fconformance%2Fparallel%2Fminimal%5C%5D

An example of a failed job: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.8-e2e-aws/1382601757305081856

Output:

fail [github.com/openshift/origin/test/extended/router/h2spec.go:155]: Unexpected error:
    <exec.CodeExitError>: {
        Err: {
            s: "error running /usr/bin/kubectl --server=https://api.ci-op-gt9b13hp-abfa2.origin-ci-int-aws.dev.rhcloud.com:6443 --kubeconfig=/tmp/kubeconfig-030368613 --namespace=e2e-test-router-h2spec-h7w64 exec h2spec -- /bin/sh -x -c cat \"/tmp/h2spec-results\":\nCommand stdout:\n\nstderr:\n+ cat /tmp/h2spec-results\ncat: /tmp/h2spec-results: No such file or directory\ncommand terminated with exit code 1\n\nerror:\nexit status 1",
        },
        Code: 1,
    }
    error running /usr/bin/kubectl --server=https://api.ci-op-gt9b13hp-abfa2.origin-ci-int-aws.dev.rhcloud.com:6443 --kubeconfig=/tmp/kubeconfig-030368613 --namespace=e2e-test-router-h2spec-h7w64 exec h2spec -- /bin/sh -x -c cat "/tmp/h2spec-results":
    Command stdout:
    
    stderr:
    + cat /tmp/h2spec-results
    cat: /tmp/h2spec-results: No such file or directory
    command terminated with exit code 1
    
    error:
    exit status 1
occurred

Comment 1 Oleg Bulatov 2021-04-15 14:11:38 UTC
Marking it as a high severity bug, as it has high impact on CI.

$ w3m -dump -cols 200 'https://search.ci.openshift.org/?search=%5C%5Bsig-network-edge%5C%5D%5C%5BConformance%5C%5D%5C%5BArea%3ANetworking%5C%5D%5C%5BFeature%3ARouter%5C%5D+The+HAProxy+router+should+pass+the+h2spec+conformance+tests+%5C%5BSuite%3Aopenshift%2Fconformance%2Fparallel%2Fminimal%5C%5D&maxAge=168h&context=1&type=bug%2Bjunit&name=4.8&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job' | grep 'of failures match'
periodic-ci-openshift-release-master-nightly-4.8-e2e-aws-proxy (all) - 31 runs, 94% failed, 34% of failures match = 32% impact
release-openshift-ocp-installer-e2e-aws-upi-4.8 (all) - 27 runs, 81% failed, 32% of failures match = 26% impact
periodic-ci-openshift-release-master-nightly-4.8-e2e-aws-fips (all) - 27 runs, 37% failed, 40% of failures match = 15% impact
periodic-ci-openshift-release-master-nightly-4.8-e2e-aws (all) - 39 runs, 31% failed, 17% of failures match = 5% impact
periodic-ci-openshift-release-master-nightly-4.8-e2e-azure (all) - 10 runs, 90% failed, 33% of failures match = 30% impact
periodic-ci-openshift-release-master-nightly-4.8-e2e-gcp-rt (all) - 26 runs, 100% failed, 12% of failures match = 12% impact
promote-release-openshift-okd-machine-os-content-e2e-aws-4.8 (all) - 84 runs, 13% failed, 64% of failures match = 8% impact
release-openshift-ocp-installer-e2e-gcp-ovn-4.8 (all) - 28 runs, 86% failed, 17% of failures match = 14% impact
release-openshift-ocp-installer-e2e-aws-ovn-4.8 (all) - 27 runs, 44% failed, 42% of failures match = 19% impact
release-openshift-ocp-installer-e2e-openstack-4.8 (all) - 21 runs, 100% failed, 24% of failures match = 24% impact
promote-release-openshift-machine-os-content-e2e-aws-4.8 (all) - 77 runs, 12% failed, 44% of failures match = 5% impact
periodic-ci-openshift-release-master-okd-4.8-e2e-aws (all) - 42 runs, 62% failed, 15% of failures match = 10% impact
periodic-ci-openshift-release-master-ci-4.8-e2e-aws-upgrade-rollback (all) - 7 runs, 29% failed, 50% of failures match = 14% impact
release-openshift-origin-installer-e2e-aws-disruptive-4.8 (all) - 4 runs, 75% failed, 33% of failures match = 25% impact
release-openshift-origin-installer-e2e-aws-shared-vpc-4.8 (all) - 4 runs, 75% failed, 33% of failures match = 25% impact
rehearse-17585-pull-ci-openshift-cluster-nfd-operator-release-4.8-e2e-aws (all) - 17 runs, 65% failed, 18% of failures match = 12% impact
release-openshift-ocp-installer-e2e-azure-ovn-4.8 (all) - 27 runs, 81% failed, 5% of failures match = 4% impact
periodic-ci-openshift-release-master-nightly-4.8-e2e-aws-upgrade (all) - 30 runs, 70% failed, 14% of failures match = 10% impact
rehearse-17717-pull-ci-openshift-openshift-apiserver-release-4.8-e2e-aws (all) - 5 runs, 60% failed, 33% of failures match = 20% impact
periodic-ci-openshift-release-master-nightly-4.8-e2e-aws-workers-rhel7 (all) - 28 runs, 64% failed, 6% of failures match = 4% impact
periodic-ci-openshift-release-master-ci-4.8-e2e-aws-compact (all) - 3 runs, 100% failed, 33% of failures match = 33% impact
periodic-ci-openshift-release-master-ci-4.8-e2e-aws-compact-serial (all) - 3 runs, 100% failed, 33% of failures match = 33% impact
release-openshift-ocp-installer-e2e-remote-libvirt-ppc64le-4.8 (all) - 14 runs, 93% failed, 8% of failures match = 7% impact
release-openshift-ocp-installer-e2e-remote-libvirt-s390x-4.8 (all) - 14 runs, 71% failed, 10% of failures match = 7% impact
release-openshift-ocp-installer-e2e-remote-libvirt-compact-s390x-4.8 (all) - 13 runs, 92% failed, 8% of failures match = 8% impact
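
The "impact" figure in each row above is approximately the product of the two percentages before it: 94% failed × 34% of failures matching ≈ 32% of all runs. A short Go sketch of that arithmetic follows. The function name is hypothetical, and the search tool presumably works from raw run counts, so a few rows differ from this estimate by one point (for example, 86% × 17% rounds to 15, while the row reports 14).

```go
package main

import (
	"fmt"
	"math"
)

// impactPercent estimates the share of all runs affected by the matching
// failure, given the percentage of runs that failed and the percentage of
// those failures that match the search. The result is rounded to the
// nearest whole percent.
func impactPercent(failedPct, matchPct float64) int {
	return int(math.Round(failedPct * matchPct / 100))
}

func main() {
	// Values copied from three of the rows above.
	fmt.Println("e2e-aws-proxy:", impactPercent(94, 34))
	fmt.Println("e2e-aws-upi:  ", impactPercent(81, 32))
	fmt.Println("e2e-aws:      ", impactPercent(31, 17))
}
```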

Comment 3 Andrew McDermott 2021-04-16 13:02:10 UTC
I will follow up with another PR that disables the test if running in a proxied environment.

Comment 4 Andrew McDermott 2021-04-16 13:55:47 UTC
Added additional fix https://github.com/openshift/origin/pull/26086, moving back to POST.

Comment 6 Hongan Li 2021-04-19 03:52:33 UTC
I checked the search results:
https://search.ci.openshift.org/?maxAge=168h&context=1&type=bug%2Bjunit&name=&maxMatches=5&maxBytes=20971520&groupBy=job&search=%5C%5Bsig-network-edge%5C%5D%5C%5BConformance%5C%5D%5C%5BArea%3ANetworking%5C%5D%5C%5BFeature%3ARouter%5C%5D+The+HAProxy+router+should+pass+the+h2spec+conformance+tests+%5C%5BSuite%3Aopenshift%2Fconformance%2Fparallel%2Fminimal%5C%5D

and am still seeing failures in:
periodic-ci-openshift-release-master-nightly-4.8-e2e-gcp-rt (all) 
release-openshift-ocp-installer-e2e-aws-mirrors-4.8 (all)
release-openshift-ocp-installer-e2e-aws-upi-4.8 (all)

Comment 7 Andrew McDermott 2021-04-19 08:05:11 UTC
Looking at one result in:

https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-upi-4.8/1384016184512352256/artifacts/e2e-aws-upi/e2e.log

  5. Streams and Multiplexing
    5.1. Stream States
        1: idle: Sends a DATA frame
      ✔ 1: idle: Sends a DATA frame
        2: idle: Sends a RST_STREAM frame
      ✔ 2: idle: Sends a RST_STREAM frame
        3: idle: Sends a WINDOW_UPDATE frame
      ✔ 3: idle: Sends a WINDOW_UPDATE frame
        4: idle: Sends a CONTINUATION frame
      ✔ 4: idle: Sends a CONTINUATION frame
        5: half closed (remote): Sends a DATA frame
      ✔ 5: half closed (remote): Sends a DATA frame
        6: half closed (remote): Sends a HEADERS frame
      ✔ 6: half closed (remote): Sends a HEADERS frame
        7: half closed (remote): Sends a CONTINUATION frame
      × 7: half closed (remote): Sends a CONTINUATION frame

Error: dial tcp 54.147.190.229:443: i/o timeout

The tests run, but some hit a connection timeout. Investigating.

Comment 8 Andrew McDermott 2021-04-19 08:44:41 UTC
It also looks like quite a lot of the tests run to (almost) completion.
Looking through the logs, the last action is to delete the
ingresscontroller that was stood up for the test, and that deletion
may be taking too long.

https://bugzilla.redhat.com/show_bug.cgi?id=1912413

Comment 10 jechen 2021-04-21 13:57:21 UTC
Checked the most recent CI reports (over 17 runs) and no longer see the h2spec conformance tests failing for the HAProxy router. Changing the bug status to verified.

https://prow.ci.openshift.org/job-history/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.8-e2e-aws

Comment 13 errata-xmlrpc 2021-07-27 23:01:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

