Bug 1959149

Summary: Creating a new custom ingress-controller triggers an oauth-apiserver rollout
Product: OpenShift Container Platform Reporter: Clayton Coleman <ccoleman>
Component: apiserver-auth Assignee: Standa Laznicka <slaznick>
Status: CLOSED DUPLICATE QA Contact: pmali
Severity: high Docs Contact:
Priority: unspecified    
Version: 4.8 CC: aos-bugs, mfojtik
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-05-11 07:32:12 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Clayton Coleman 2021-05-10 19:17:45 UTC
The router e2e tests create custom ingress controllers which appear to consistently (100%) trigger a rollout of a new oauth-apiserver.

GO111MODULE=on GOFLAGS=-mod=vendor go test -timeout 1h -count 1 -v -tags e2e -run "TestHTTPHeaderBufferSize" ./test/e2e

Running this creates a new "router-header-buffer-size" deployment:

May 10 15:10:37.000 I ns/openshift-authentication-operator deployment/authentication-operator reason/SecretUpdated Updated Secret/v4-0-config-system-router-certs -n openshift-authentication because it changed (5 times)
May 10 15:10:37.000 I ns/openshift-config-managed secret/router-certs reason/UpdatedPublishedRouterCertificates Updated the published router certificates (5 times)
...
May 10 19:10:40.179 I ns/openshift-ingress pod/router-header-buffer-size-6847776996-j5pvn node/ci-ln-7491spt-f76d1-7nzpr-worker-d-x6blq container/router reason/ContainerStart duration/3.00s
May 10 19:10:41.142 I ns/openshift-ingress pod/router-header-buffer-size-6847776996-j5pvn node/ci-ln-7491spt-f76d1-7nzpr-worker-d-x6blq container/router reason/Ready
...
May 10 15:11:27.000 I ns/openshift-authentication pod/oauth-openshift-774c68999f-gnkc4 node/ci-ln-7491spt-f76d1-7nzpr-master-0 container/oauth-openshift reason/Killing
May 10 15:11:27.000 I ns/openshift-authentication-operator deployment/authentication-operator reason/DeploymentUpdated Updated Deployment.apps/oauth-openshift -n openshift-authentication because it changed (5 times)

and triggers an auth rollout. I *assume* this is related to the router certs, but I'm not sure. The rollout causes an immense amount of disruption (which is covered by other bugs), and in general customers adding custom ingress controllers should see no impact on oauth.

I'm starting with oauth because I'm not sure which side the problem is on, but oauth should not react unless the config it cares about changes.

I'm able to trigger this on any run of https://prow.ci.openshift.org/job-history/gs/origin-ci-test/pr-logs/directory/pull-ci-openshift-cluster-ingress-operator-master-e2e-aws-operator and also directly from a launch by running the command above.

Comment 1 Clayton Coleman 2021-05-10 20:21:29 UTC
Note: my naive expectation was that non-default ingress controllers would not automatically cause oauth to roll out, because oauth should only care about the router certs for the router its route is exposed on (which technically is part of the logic inferred from route status). However, if *every* ingress controller automatically exposes oauth, that could be bad for other reasons, and our testing needs a way to keep those controllers from selecting those routes, disable that logic, or bypass the rollout. We shouldn't remove the flexibility to expose oauth on a different ingress controller, but we need to be more cautious about the impact of extra controllers on core oauth infra.
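
To illustrate the expectation above, here is a minimal Go sketch. It is not the library-go fix and not the authentication operator's actual code. It assumes the published router certs can be viewed as a map from ingress domain to certificate bytes, and that the oauth route host is known. The function names (relevantCertsHash, needsRollout) and the example domains are hypothetical. The idea is to hash only the entries whose domain serves the oauth route, so that adding an unrelated ingress controller leaves the hash unchanged.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// relevantCertsHash hashes only the routerCerts entries whose ingress domain
// matches routeHost (equal to it, or a parent domain of it).
// Assumption: routerCerts is keyed by ingress domain.
func relevantCertsHash(routerCerts map[string][]byte, routeHost string) string {
	domains := make([]string, 0, len(routerCerts))
	for domain := range routerCerts {
		if routeHost == domain || strings.HasSuffix(routeHost, "."+domain) {
			domains = append(domains, domain)
		}
	}
	// Sort so the hash does not depend on map iteration order.
	sort.Strings(domains)

	h := sha256.New()
	for _, d := range domains {
		h.Write([]byte(d))
		h.Write([]byte{0}) // separator between key and value
		h.Write(routerCerts[d])
		h.Write([]byte{0}) // separator between entries
	}
	return hex.EncodeToString(h.Sum(nil))
}

// needsRollout reports whether the certs relevant to routeHost changed.
func needsRollout(old, updated map[string][]byte, routeHost string) bool {
	return relevantCertsHash(old, routeHost) != relevantCertsHash(updated, routeHost)
}

func main() {
	host := "oauth-openshift.apps.example.com" // hypothetical route host
	before := map[string][]byte{
		"apps.example.com": []byte("default-cert"),
	}
	// A custom ingress controller adds a cert for an unrelated domain.
	after := map[string][]byte{
		"apps.example.com":               []byte("default-cert"),
		"header-buffer-size.example.com": []byte("custom-cert"),
	}
	fmt.Println(needsRollout(before, after, host)) // prints "false"
}
```

With this kind of filtering, a rollout would only follow when the cert for a domain that actually serves the oauth route changes, which still leaves room to expose oauth on a different ingress controller.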

Comment 2 Standa Laznicka 2021-05-11 07:32:12 UTC
The fix in the library-go resource syncer has finally been merged; this is being fixed as part of https://bugzilla.redhat.com/show_bug.cgi?id=1950379

*** This bug has been marked as a duplicate of bug 1950379 ***