Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1871939

Summary: [sig-network][Feature:Router] The HAProxy router should set Forwarded headers appropriately
Product: OpenShift Container Platform Reporter: Antonio Ojea <aojeagar>
Component: Networking Assignee: Stephen Greene <sgreene>
Networking sub component: router QA Contact: Hongan Li <hongli>
Status: CLOSED DUPLICATE Docs Contact:
Severity: high    
Priority: low CC: aos-bugs, bperkins, ccoleman, jchaloup, mfisher, mmasters, wking
Version: 4.6   
Target Milestone: ---   
Target Release: 4.9.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
[sig-network][Feature:Router] The HAProxy router should set Forwarded headers appropriately
Last Closed: 2021-07-01 16:53:11 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Antonio Ojea 2020-08-24 16:19:54 UTC
test:
[sig-network][Feature:Router] The HAProxy router should set Forwarded headers appropriately 

is failing frequently in CI, see search results:
https://search.ci.openshift.org/?maxAge=168h&context=1&type=bug%2Bjunit&name=&maxMatches=5&maxBytes=20971520&groupBy=job&search=%5C%5Bsig-network%5C%5D%5C%5BFeature%3ARouter%5C%5D+The+HAProxy+router+should+set+Forwarded+headers+appropriately


Failure example
[sig-network][Feature:Router] The HAProxy router should expose prometheus metrics for a route [Suite:openshift/conformance/parallel] 53s
fail [github.com/openshift/origin@/test/extended/router/metrics.go:219]: Expected
    <float64>: 1095
to be >=
    <float64>: 1767

Comment 7 Stephen Greene 2020-10-01 15:22:26 UTC
I’m adding UpcomingSprint, because I was occupied by fixing bugs with higher
priority/severity, developing new features with higher priority, or developing
new features to improve stability at a macro level. I will revisit this bug
next sprint.

Comment 8 Jan Chaloupka 2020-10-12 14:43:37 UTC
Occurring in 4.5 as well: https://search.ci.openshift.org/?maxAge=168h&context=1&type=bug%2Bjunit&name=4.5&maxMatches=5&maxBytes=20971520&groupBy=job&search=%5C%5Bsig-network%5C%5D%5C%5BFeature%3ARouter%5C%5D+The+HAProxy+router+should+set+Forwarded+headers+appropriately

Don't see the original error message for 4.7 so posting one for 4.5:

```
fail [github.com/openshift/origin/test/extended/router/headers.go:102]: Unexpected error:
    <*errors.errorString | 0xc0028c8f30>: {
        s: "host command failed: error running /usr/bin/kubectl --server=https://api.ci-op-jht456ms-daaf9.origin-ci-int-gce.dev.openshift.com:6443 --kubeconfig=/var/run/secrets/ci.openshift.io/multi-stage/kubeconfig exec --namespace=e2e-test-router-headers-7cn7d execpod -- /bin/sh -x -c \n\t\tset -e\n\t\tSTOP=$(($(date '+%s') + 180))\n\t\twhile [ $(date '+%s') -lt $STOP ]; do\n\t\t\trc=0\n\t\t\tcode=$( curl -k -s -m 5 -o /dev/null -w '%{http_code}\\n' --header 'Host: 172.30.125.224' \"http://172.30.125.224:1936/healthz\" ) || rc=$?\n\t\t\tif [[ \"${rc:-0}\" -eq 0 ]]; then\n\t\t\t\techo $code\n\t\t\t\tif [[ $code -eq 200 ]]; then\n\t\t\t\t\texit 0\n\t\t\t\tfi\n\t\t\t\tif [[ $code -ne 503 ]]; then\n\t\t\t\t\texit 1\n\t\t\t\tfi\n\t\t\telse\n\t\t\t\techo \"error ${rc}\" 1>&2\n\t\t\tfi\n\t\t\tsleep 1\n\t\tdone\n\t\t:\nCommand stdout:\n\nstderr:\nUnable to connect to the server: dial tcp 35.196.249.78:6443: i/o timeout\n\nerror:\nexit status 1\n",
    }
    host command failed: error running /usr/bin/kubectl --server=https://api.ci-op-jht456ms-daaf9.origin-ci-int-gce.dev.openshift.com:6443 --kubeconfig=/var/run/secrets/ci.openshift.io/multi-stage/kubeconfig exec --namespace=e2e-test-router-headers-7cn7d execpod -- /bin/sh -x -c 
    		set -e
    		STOP=$(($(date '+%s') + 180))
    		while [ $(date '+%s') -lt $STOP ]; do
    			rc=0
    			code=$( curl -k -s -m 5 -o /dev/null -w '%{http_code}\n' --header 'Host: 172.30.125.224' "http://172.30.125.224:1936/healthz" ) || rc=$?
    			if [[ "${rc:-0}" -eq 0 ]]; then
    				echo $code
    				if [[ $code -eq 200 ]]; then
    					exit 0
    				fi
    				if [[ $code -ne 503 ]]; then
    					exit 1
    				fi
    			else
    				echo "error ${rc}" 1>&2
    			fi
    			sleep 1
    		done
    		:
    Command stdout:
    
    stderr:
    Unable to connect to the server: dial tcp 35.196.249.78:6443: i/o timeout
    
    error:
    exit status 1
    
occurred
```

Feel free to ignore it if it's not relevant.
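
For anyone skimming the log above: the script passed to `kubectl exec` polls the router's `/healthz` endpoint once per second for up to 180 seconds and decides what to do from curl's exit status and the HTTP code. In this particular run the loop apparently never started, since stderr only shows kubectl timing out against the API server on port 6443. A minimal sketch of the per-iteration decision follows; `classify_probe` is a hypothetical helper written for illustration and is not part of openshift/origin.

```shell
# Hypothetical helper: mirrors the per-iteration decision of the wait loop
# in the log above. Not taken from openshift/origin.
#
# classify_probe RC CODE
#   RC   - exit status of the curl invocation
#   CODE - HTTP status code printed by curl (ignored when RC is non-zero)
# Prints one of: ready, retry, fail.
classify_probe() {
    rc="$1"
    code="$2"
    if [ "$rc" -ne 0 ]; then
        echo retry   # curl itself failed (timeout, refused); the loop sleeps and retries
    elif [ "$code" -eq 200 ]; then
        echo ready   # /healthz answered 200; the loop exits 0
    elif [ "$code" -eq 503 ]; then
        echo retry   # router reachable but not ready yet; the loop sleeps and retries
    else
        echo fail    # any other HTTP status makes the loop exit 1
    fi
}
```

For example, `classify_probe 0 503` prints `retry`, while `classify_probe 0 404` prints `fail`.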

Comment 12 Clayton Coleman 2021-01-28 18:03:22 UTC
The original two causes of this failure no longer occur. The new occurrence is a different issue from the one described in the most recent comment, one that is addressed by https://bugzilla.redhat.com/show_bug.cgi?id=1921857

Comment 14 Clayton Coleman 2021-06-14 14:37:15 UTC
~50% of GCP runs this is failing (also looks like a lot of azure runs, maybe 25-30%).

Is this test assuming something that only works on AWS?

Raising to high because this is blocking a lot of CI runs and is happening in 4.8

Comment 15 Stephen Greene 2021-06-14 19:53:25 UTC
(In reply to Clayton Coleman from comment #14)
> ~50% of GCP runs this is failing (also looks like a lot of azure runs, maybe
> 25-30%).
> 
> Is this test assuming something that only works on AWS?
> 
> Raising to high because this is blocking a lot of CI runs and is happening
> in 4.8

Recent spike in failures is being tracked via https://bugzilla.redhat.com/show_bug.cgi?id=1971808

Comment 16 Miciah Dashiel Butler Masters 2021-07-01 16:53:11 UTC
Some specific instances of this issue have been resolved (bug 1971808, bug 1921857), and the general issue is being tracked in bug 1802311, so I am marking this report as a duplicate of bug 1802311.

*** This bug has been marked as a duplicate of bug 1802311 ***