Description of problem:

Flaky test in e2e-azure-serial:

[Conformance][Area:Networking][Feature:Router] The HAProxy router should override the route host with a custom value [Suite:openshift/conformance/parallel/minimal]

fail [github.com/openshift/origin/test/extended/router/scoped.go:137]: Unexpected error:
<*errors.errorString | 0xc003b17e10>:
host command failed: error running &{/usr/bin/kubectl [kubectl --server=https://api.ci-op-8zqr6i6n-282fe.ci.azure.devcluster.openshift.com:6443 --kubeconfig=/tmp/admin.kubeconfig exec --namespace=e2e-test-router-scoped-wrv7w execpod -- /bin/sh -x -c
    set -e
    STOP=$(($(date '+%s') + 180))
    while [ $(date '+%s') -lt $STOP ]; do
        code=$( curl -k -s -m 5 -o /dev/null -w '%{http_code}\n' --header 'Host: 10.129.2.221' "http://10.129.2.221:1936/healthz" ) || rc=$?
        if [[ "${rc:-0}" -eq 0 ]]; then
            echo $code
            if [[ $code -eq 200 ]]; then
                exit 0
            fi
            if [[ $code -ne 503 ]]; then
                exit 1
            fi
        else
            echo "error ${rc}" 1>&2
        fi
        sleep 1
    done
    ] [] <nil> 500
 + set -e
+ date +%s
+ STOP=1568847433
+ date +%s
+ [ 1568847253 -lt 1568847433 ]
+ curl -k -s -m 5 -o /dev/null -w %{http_code}\n --header Host: 10.129.2.221 http://10.129.2.221:1936/healthz
+ code=500
+ [[ 0 -eq 0 ]]
+ echo 500
+ [[ 500 -eq 200 ]]
+ [[ 500 -ne 503 ]]
+ exit 1
command terminated with exit code 1
 [] <nil> 0xc003128ae0 exit status 1 <nil> <nil> true [0xc00206e4d8 0xc00206e4f0 0xc00206e508] [0xc00206e4d8 0xc00206e4f0 0xc00206e508] [0xc00206e4e8 0xc00206e500] [0x95ade0 0x95ade0] 0xc00376f020 <nil>}:
Command stdout:
500

stderr:
+ set -e
+ date +%s
+ STOP=1568847433
+ date +%s
+ [ 1568847253 -lt 1568847433 ]
+ curl -k -s -m 5 -o /dev/null -w %{http_code}\n --header Host: 10.129.2.221 http://10.129.2.221:1936/healthz
+ code=500
+ [[ 0 -eq 0 ]]
+ echo 500
+ [[ 500 -eq 200 ]]
+ [[ 500 -ne 503 ]]
+ exit 1
command terminated with exit code 1

error:
exit status 1

occurred

see: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/canary-openshift-ocp-installer-e2e-azure-4.2/309
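The trace above shows the polling loop giving up on its very first probe: /healthz returned 500, which is neither 200 (success) nor 503 (keep retrying), so the script exited 1 without using the 180-second window. As a rough sketch of that decision only, the status-code handling can be isolated in a small function. The name `classify_code` and its output strings are hypothetical and do not appear in the origin test; the string match here also stands in for the numeric `-eq`/`-ne` comparisons the real script uses.

```shell
#!/bin/sh
# Hypothetical helper (not part of the origin test): mirrors how the
# polling loop in this report reacts to each HTTP status code.
classify_code() {
    case "$1" in
        200) echo "ready" ;;  # loop exits 0: router is healthy
        503) echo "retry" ;;  # loop sleeps 1s and probes again
        *)   echo "fail"  ;;  # loop exits 1 immediately (the 500 seen here)
    esac
}

# The status code from the failing run in this report.
classify_code 500
```

Under this logic a single transient 500 from the router fails the whole test, which is consistent with the flake reported here.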
Note: this test is flaky; it failed in roughly 25% of runs today.
In the two days since the fix merged (https://bugzilla.redhat.com/show_bug.cgi?id=1765177), I have seen no evidence of any of the flakes associated with this bug:

https://ci-search-ci-search-next.svc.ci.openshift.org/?search=failed%3A.*override+the+route+host+for+overridden+domains&maxAge=48h&context=2&type=all
https://ci-search-ci-search-next.svc.ci.openshift.org/?search=failed%3A.*router+should+serve+a+route+that+points+to+two+services&maxAge=48h&context=2&type=all
https://ci-search-ci-search-next.svc.ci.openshift.org/?search=failed%3A.*router+should+serve+the+correct+routes+when+scoped+to+a+single+namespace&maxAge=48h&context=2&type=all
https://ci-search-ci-search-next.svc.ci.openshift.org/?search=failed%3A.*router+should+run+even+if+it+has+no+access+to+update+status+&maxAge=48h&context=2&type=all

I also see no evidence of flaking in the post-submit test grid statistics. I recommend we declare this fixed and open new bugs as necessary.
I have not seen this issue since release-openshift-ocp-installer-e2e-azure-4.3 #198, so moving to verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0062