Description of problem:
The authentication operator appears to take too long to pick up a change to its configuration.

Version-Release number of selected component (if applicable):
master

How reproducible:
100%

Steps to Reproduce:
1. configure a new identity provider
2. wait for the authentication operator to start deploying a new revision (it reports Progressing=True)

Actual results:
It can take up to 5 minutes for the authentication operator to start rolling out a new revision.

Expected results:
It should take no more than a minute after a new configuration is set for the authentication operator to start deploying a new revision.

Additional info:
From my observations, the change can _sometimes_ take up to 2 minutes to be observed by the deployment controller, but not the 5 minutes reported by the console. Those extra 2 minutes usually coincide with client-side throttling; I wonder whether and how throttling might affect the informer caches. In the other cases the change is picked up quickly. The reaction to configuration changes and the ConfigMap syncing is usually almost immediate, so while this is not a big issue, it is still questionable why the deployment controller takes longer to observe the current state.
This bug hasn't had any activity in the last 30 days. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. As such, we're marking this bug as "LifecycleStale" and decreasing the severity/priority. If you have further information on the current state of the bug, please update it, otherwise this bug can be closed in about 7 days. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant. Additionally, you can add LifecycleFrozen into Keywords if you think this bug should never be marked as stale. Please consult with bug assignee before you do that.
sprint review: this issue is still being worked on.
I tried to reproduce this issue using https://github.com/stlaz/openshift-requestheader-config.

The OAuth/cluster configuration was applied at 09:13:42Z:

```
$ oc get OAuth cluster -o yaml
...
  - apiVersion: config.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          f:kubectl.kubernetes.io/last-applied-configuration: {}
      f:spec:
        f:identityProviders: {}
    manager: kubectl-client-side-apply
    operation: Update
    time: "2021-08-18T09:13:42Z"
```

The pod rotation started less than a minute after this event, at 09:14:11Z:

```
$ oc get events -n openshift-authentication -o custom-columns=TIME:.lastTimestamp,MESSAGE:.message
...
2021-08-18T08:25:12Z   Created pod: oauth-openshift-84d7896496-rbhkl
2021-08-18T08:25:43Z   Created pod: oauth-openshift-84d7896496-zt679
2021-08-18T08:26:13Z   Created pod: oauth-openshift-84d7896496-nvm25
2021-08-18T09:14:11Z   Deleted pod: oauth-openshift-84d7896496-nvm25
2021-08-18T09:14:42Z   Deleted pod: oauth-openshift-84d7896496-rbhkl
2021-08-18T09:15:11Z   Deleted pod: oauth-openshift-84d7896496-zt679
2021-08-18T08:21:30Z   Scaled up replica set oauth-openshift-7bc48d955 to 3
2021-08-18T08:25:12Z   Scaled down replica set oauth-openshift-7bc48d955 to 2
2021-08-18T08:25:12Z   Scaled up replica set oauth-openshift-84d7896496 to 1
2021-08-18T08:25:43Z   Scaled down replica set oauth-openshift-7bc48d955 to 1
2021-08-18T08:25:43Z   Scaled up replica set oauth-openshift-84d7896496 to 2
2021-08-18T08:26:13Z   Scaled down replica set oauth-openshift-7bc48d955 to 0
2021-08-18T08:26:13Z   Scaled up replica set oauth-openshift-84d7896496 to 3
2021-08-18T09:14:11Z   Scaled down replica set oauth-openshift-84d7896496 to 2
2021-08-18T09:14:11Z   Scaled up replica set oauth-openshift-58fcdf4f49 to 1
2021-08-18T09:14:42Z   Scaled down replica set oauth-openshift-84d7896496 to 1
2021-08-18T09:14:42Z   Scaled up replica set oauth-openshift-58fcdf4f49 to 2
2021-08-18T09:15:11Z   Scaled down replica set oauth-openshift-84d7896496 to 0
2021-08-18T09:15:11Z   Scaled up replica set oauth-openshift-58fcdf4f49 to 3
```

It took roughly 1 minute to rotate all the pods (09:15:11Z - 09:14:11Z). The CAO also flipped the Progressing condition in a timely manner (at 09:15:32Z, roughly 1 minute after the deployment finished):

```
$ oc get clusteroperator authentication -o yaml
...
  - lastTransitionTime: "2021-08-18T09:15:32Z"
    message: 'AuthenticatorCertKeyProgressing: All is well'
    reason: AsExpected
    status: "False"
    type: Progressing
```

So the whole procedure took about 2 minutes (09:15:32Z - 09:13:42Z) from start to finish. Let me close this as "WORKSFORME". Perhaps the cluster was overloaded or AWS was having a hard time. Hard to say...
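The timing arithmetic above can be recomputed mechanically from the quoted timestamps. A minimal Go sketch (the `delta` helper is mine, the timestamps are copied from the outputs above):

```go
package main

import (
	"fmt"
	"time"
)

// delta returns the duration between two RFC3339 timestamps as a string.
func delta(from, to string) string {
	a, _ := time.Parse(time.RFC3339, from)
	b, _ := time.Parse(time.RFC3339, to)
	return b.Sub(a).String()
}

func main() {
	// config applied -> first old pod deleted (the operator's reaction time)
	fmt.Println(delta("2021-08-18T09:13:42Z", "2021-08-18T09:14:11Z")) // 29s
	// first old pod deleted -> last old pod deleted (the rollout itself)
	fmt.Println(delta("2021-08-18T09:14:11Z", "2021-08-18T09:15:11Z")) // 1m0s
	// config applied -> Progressing=False (the whole procedure)
	fmt.Println(delta("2021-08-18T09:13:42Z", "2021-08-18T09:15:32Z")) // 1m50s
}
```

This makes it explicit that the rollout itself is fast; only the 29-second reaction window before 09:14:11Z is in question.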
Together with Standa we clarified that the problem lies between 09:13:42Z and 09:14:11Z. In my case that window took 29 seconds, but sometimes it takes up to 2.5 minutes. The biggest suspect is client-side throttling. Putting this on the back burner for now.
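For context on the throttling suspicion: client-go's default client-side limiter is a token bucket, historically 5 requests/s with a burst of 10 (whether the operator still runs with those defaults is an assumption; its client config is not shown here). A rough model of such a bucket, not client-go itself, shows how a burst of queued requests stretches into minutes of artificial delay:

```go
package main

import "fmt"

// waitSeconds models a token bucket: the n-th request (1-indexed) in an
// instantaneous burst waits until enough tokens have refilled at rate
// `qps`, beyond the initial `burst` allowance.
func waitSeconds(n int, qps float64, burst int) float64 {
	if n <= burst {
		return 0
	}
	return float64(n-burst) / qps
}

func main() {
	// Assumed defaults: qps=5, burst=10.
	fmt.Println(waitSeconds(10, 5, 10)) // 0: still within the burst
	fmt.Println(waitSeconds(40, 5, 10)) // 6: multi-second waits, like the "Waited for ..." log lines
	// ~750 queued requests would push the last one out by 2.5 minutes,
	// the worst reaction window observed.
	fmt.Println(waitSeconds(760, 5, 10)) // 150
}
```

If the operator's many controllers share one throttled client, a resync storm could plausibly delay the Deployment update by exactly this kind of margin.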
Created attachment 1817377 [details]
kube_deployment_status_replicas_available{namespace="openshift-authentication"}

From https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.9-e2e-aws-ovn/1428583979283386368:

```
1 unexpected clusteroperator state transitions during e2e test run

Aug 20 05:49:23.886 - 46s E clusteroperator/authentication condition/Degraded status/True reason/Unknown
```

From https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.9-e2e-aws-ovn/1428583979283386368/artifacts/e2e-aws-ovn/gather-extra/artifacts/pods/openshift-authentication-operator_authentication-operator-f96986bb8-xnn6j_authentication-operator.log:

```
I0820 05:49:07.254217 1 status_controller.go:211] clusteroperator/authentication diff {
  "status": {
    "conditions": [
      {"lastTransitionTime":"2021-08-20T05:49:07Z","message":"OAuthServerDeploymentDegraded: 1 of 3 requested instances are unavailable for oauth-openshift.openshift-authentication ()","reason":"OAuthServerDeployment_UnavailablePod","status":"True","type":"Degraded"},
      {"lastTransitionTime":"2021-08-20T05:48:20Z","message":"AuthenticatorCertKeyProgressing: All is well","reason":"AsExpected","status":"False","type":"Progressing"},
      {"lastTransitionTime":"2021-08-20T05:48:20Z","message":"All is well","reason":"AsExpected","status":"True","type":"Available"},
      {"lastTransitionTime":"2021-08-20T05:23:51Z","message":"All is well","reason":"AsExpected","status":"True","type":"Upgradeable"}
    ]
  }
}
I0820 05:49:07.270768 1 event.go:282] Event(v1.ObjectReference{Kind:"Namespace", Namespace:"openshift-authentication-operator", Name:"openshift-authentication-operator", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for clusteroperator/authentication changed: Degraded changed from False to True ("OAuthServerDeploymentDegraded: 1 of 3 requested instances are unavailable for oauth-openshift.openshift-authentication ()")
```

From kubelet logs:

```
ip-10-0-181-79.us-west-1.compute.internal:9313:Aug 20 05:36:21.304475 ip-10-0-181-79 hyperkube[1681]: I0820 05:36:21.304443 1681 kubelet.go:2065] "SyncLoop ADD" source="api" pods=[openshift-authentication/oauth-openshift-854d6fff9c-b9m5k]
ip-10-0-181-79.us-west-1.compute.internal:9482:Aug 20 05:36:29.714223 ip-10-0-181-79 hyperkube[1681]: I0820 05:36:29.714198 1681 kubelet.go:2175] "SyncLoop (probe)" probe="readiness" status="ready" pod="openshift-authentication/oauth-openshift-854d6fff9c-b9m5k"
ip-10-0-181-79.us-west-1.compute.internal:12166:Aug 20 05:45:45.759285 ip-10-0-181-79 hyperkube[1681]: I0820 05:45:45.755286 1681 kubelet.go:2081] "SyncLoop DELETE" source="api" pods=[openshift-authentication/oauth-openshift-854d6fff9c-b9m5k]
ip-10-0-181-79.us-west-1.compute.internal:12280:Aug 20 05:46:12.149280 ip-10-0-181-79 hyperkube[1681]: I0820 05:46:12.149255 1681 kubelet.go:2075] "SyncLoop REMOVE" source="api" pods=[openshift-authentication/oauth-openshift-854d6fff9c-b9m5k]
ip-10-0-221-30.us-west-1.compute.internal:8613:Aug 20 05:36:21.191781 ip-10-0-221-30 hyperkube[1683]: I0820 05:36:21.191744 1683 kubelet.go:2065] "SyncLoop ADD" source="api" pods=[openshift-authentication/oauth-openshift-854d6fff9c-b6crd]
ip-10-0-221-30.us-west-1.compute.internal:8733:Aug 20 05:36:30.005871 ip-10-0-221-30 hyperkube[1683]: I0820 05:36:30.005844 1683 kubelet.go:2175] "SyncLoop (probe)" probe="readiness" status="ready" pod="openshift-authentication/oauth-openshift-854d6fff9c-b6crd"
ip-10-0-221-30.us-west-1.compute.internal:12778:Aug 20 05:46:41.518540 ip-10-0-221-30 hyperkube[1683]: I0820 05:46:41.514047 1683 kubelet.go:2075] "SyncLoop REMOVE" source="api" pods=[openshift-authentication/oauth-openshift-854d6fff9c-b6crd]
ip-10-0-221-30.us-west-1.compute.internal:12779:Aug 20 05:46:41.524046 ip-10-0-221-30 hyperkube[1683]: I0820 05:46:41.524015 1683 kubelet.go:2065] "SyncLoop ADD" source="api" pods=[openshift-authentication/oauth-openshift-9b79df6cb-dhvrb]
ip-10-0-221-30.us-west-1.compute.internal:12856:Aug 20 05:46:44.699371 ip-10-0-221-30 hyperkube[1683]: I0820 05:46:44.699343 1683 kubelet.go:2175] "SyncLoop (probe)" probe="readiness" status="ready" pod="openshift-authentication/oauth-openshift-9b79df6cb-dhvrb"
ip-10-0-179-152.us-west-1.compute.internal:12916:Aug 20 05:36:21.303538 ip-10-0-179-152 hyperkube[1680]: I0820 05:36:21.303503 1680 kubelet.go:2065] "SyncLoop ADD" source="api" pods=[openshift-authentication/oauth-openshift-854d6fff9c-j7bw9]
ip-10-0-179-152.us-west-1.compute.internal:13001:Aug 20 05:36:30.479977 ip-10-0-179-152 hyperkube[1680]: I0820 05:36:30.479933 1680 kubelet.go:2175] "SyncLoop (probe)" probe="readiness" status="ready" pod="openshift-authentication/oauth-openshift-854d6fff9c-j7bw9"
ip-10-0-179-152.us-west-1.compute.internal:15877:Aug 20 05:46:44.764966 ip-10-0-179-152 hyperkube[1680]: I0820 05:46:44.764930 1680 kubelet.go:2081] "SyncLoop DELETE" source="api" pods=[openshift-authentication/oauth-openshift-854d6fff9c-j7bw9]
ip-10-0-179-152.us-west-1.compute.internal:15953:Aug 20 05:47:11.423571 ip-10-0-179-152 hyperkube[1680]: I0820 05:47:11.423477 1680 kubelet.go:2075] "SyncLoop REMOVE" source="api" pods=[openshift-authentication/oauth-openshift-854d6fff9c-j7bw9]
ip-10-0-179-152.us-west-1.compute.internal:15954:Aug 20 05:47:11.437638 ip-10-0-179-152 hyperkube[1680]: I0820 05:47:11.437612 1680 kubelet.go:2065] "SyncLoop ADD" source="api" pods=[openshift-authentication/oauth-openshift-9b79df6cb-9wgf6]
ip-10-0-179-152.us-west-1.compute.internal:16031:Aug 20 05:47:14.544054 ip-10-0-179-152 hyperkube[1680]: I0820 05:47:14.544024 1680 kubelet.go:2175] "SyncLoop (probe)" probe="readiness" status="ready" pod="openshift-authentication/oauth-openshift-9b79df6cb-9wgf6"
ip-10-0-181-79.us-west-1.compute.internal:12281:Aug 20 05:46:12.163056 ip-10-0-181-79 hyperkube[1681]: I0820 05:46:12.163030 1681 kubelet.go:2065] "SyncLoop ADD" source="api" pods=[openshift-authentication/oauth-openshift-9b79df6cb-6f6hk]
ip-10-0-181-79.us-west-1.compute.internal:12365:Aug 20 05:46:15.096268 ip-10-0-181-79 hyperkube[1681]: I0820 05:46:15.096247 1681 kubelet.go:2175] "SyncLoop (probe)" probe="readiness" status="ready" pod="openshift-authentication/oauth-openshift-9b79df6cb-6f6hk"
```

ip-10-0-221-30.us-west-1.compute.internal:
- oauth-openshift-854d6fff9c-b6crd: <05:36:21.191744; 05:46:41.514047>
- oauth-openshift-9b79df6cb-dhvrb: <05:46:41.524015; ???>

ip-10-0-179-152.us-west-1.compute.internal:
- oauth-openshift-854d6fff9c-j7bw9: <05:36:21.303503; 05:47:11.423477>
- oauth-openshift-9b79df6cb-9wgf6: <05:47:11.437612; ???>

ip-10-0-181-79.us-west-1.compute.internal:
- oauth-openshift-854d6fff9c-b9m5k: <05:36:29.714223; 05:46:12.149280>
- oauth-openshift-9b79df6cb-6f6hk: <05:46:12.163030; ???>

From https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.9-e2e-aws-ovn/1428583979283386368/artifacts/e2e-aws-ovn/gather-extra/artifacts/pods/openshift-kube-controller-manager_kube-controller-manager-ip-10-0-221-30.us-west-1.compute.internal_kube-controller-manager.log:

```
I0820 05:45:45.720433 1 replica_set.go:599] "Too many replicas" replicaSet="openshift-authentication/oauth-openshift-854d6fff9c" need=2 deleting=1
I0820 05:45:45.720471 1 replica_set.go:227] "Found related ReplicaSets" replicaSet="openshift-authentication/oauth-openshift-854d6fff9c" relatedReplicaSets=[oauth-openshift-854d6fff9c oauth-openshift-9b79df6cb]
I0820 05:45:45.720543 1 controller_utils.go:592] "Deleting pod" controller="oauth-openshift-854d6fff9c" pod="openshift-authentication/oauth-openshift-854d6fff9c-b9m5k"
I0820 05:45:45.721679 1 event.go:291] "Event occurred" object="openshift-authentication/oauth-openshift" kind="Deployment" apiVersion="apps/v1" type="Normal" reason="ScalingReplicaSet" message="Scaled down replica set oauth-openshift-854d6fff9c to 2"
I0820 05:45:45.743913
1 event.go:291] "Event occurred" object="openshift-authentication/oauth-openshift-854d6fff9c" kind="ReplicaSet" apiVersion="apps/v1" type="Normal" reason="SuccessfulDelete" message="Deleted pod: oauth-openshift-854d6fff9c-b9m5k"
I0820 05:45:45.753023 1 replica_set.go:563] "Too few replicas" replicaSet="openshift-authentication/oauth-openshift-9b79df6cb" need=1 creating=1
I0820 05:45:45.753403 1 event.go:291] "Event occurred" object="openshift-authentication/oauth-openshift" kind="Deployment" apiVersion="apps/v1" type="Normal" reason="ScalingReplicaSet" message="Scaled up replica set oauth-openshift-9b79df6cb to 1"
I0820 05:45:45.799878 1 event.go:291] "Event occurred" object="openshift-authentication/oauth-openshift-9b79df6cb" kind="ReplicaSet" apiVersion="apps/v1" type="Normal" reason="SuccessfulCreate" message="Created pod: oauth-openshift-9b79df6cb-6f6hk"
I0820 05:46:15.169692 1 replica_set.go:599] "Too many replicas" replicaSet="openshift-authentication/oauth-openshift-854d6fff9c" need=1 deleting=1
I0820 05:46:15.169736 1 replica_set.go:227] "Found related ReplicaSets" replicaSet="openshift-authentication/oauth-openshift-854d6fff9c" relatedReplicaSets=[oauth-openshift-854d6fff9c oauth-openshift-9b79df6cb]
I0820 05:46:15.169805 1 controller_utils.go:592] "Deleting pod" controller="oauth-openshift-854d6fff9c" pod="openshift-authentication/oauth-openshift-854d6fff9c-b6crd"
I0820 05:46:15.170261 1 event.go:291] "Event occurred" object="openshift-authentication/oauth-openshift" kind="Deployment" apiVersion="apps/v1" type="Normal" reason="ScalingReplicaSet" message="Scaled down replica set oauth-openshift-854d6fff9c to 1"
I0820 05:46:15.190841 1 event.go:291] "Event occurred" object="openshift-authentication/oauth-openshift-854d6fff9c" kind="ReplicaSet" apiVersion="apps/v1" type="Normal" reason="SuccessfulDelete" message="Deleted pod: oauth-openshift-854d6fff9c-b6crd"
I0820 05:46:15.201758 1 replica_set.go:563] "Too few replicas" replicaSet="openshift-authentication/oauth-openshift-9b79df6cb" need=2 creating=1
I0820 05:46:15.202449 1 event.go:291] "Event occurred" object="openshift-authentication/oauth-openshift" kind="Deployment" apiVersion="apps/v1" type="Normal" reason="ScalingReplicaSet" message="Scaled up replica set oauth-openshift-9b79df6cb to 2"
I0820 05:46:15.262665 1 event.go:291] "Event occurred" object="openshift-authentication/oauth-openshift-9b79df6cb" kind="ReplicaSet" apiVersion="apps/v1" type="Normal" reason="SuccessfulCreate" message="Created pod: oauth-openshift-9b79df6cb-dhvrb"
I0820 05:46:44.744944 1 replica_set.go:599] "Too many replicas" replicaSet="openshift-authentication/oauth-openshift-854d6fff9c" need=0 deleting=1
I0820 05:46:44.745068 1 replica_set.go:227] "Found related ReplicaSets" replicaSet="openshift-authentication/oauth-openshift-854d6fff9c" relatedReplicaSets=[oauth-openshift-854d6fff9c oauth-openshift-9b79df6cb]
I0820 05:46:44.745094 1 event.go:291] "Event occurred" object="openshift-authentication/oauth-openshift" kind="Deployment" apiVersion="apps/v1" type="Normal" reason="ScalingReplicaSet" message="Scaled down replica set oauth-openshift-854d6fff9c to 0"
I0820 05:46:44.745202 1 controller_utils.go:592] "Deleting pod" controller="oauth-openshift-854d6fff9c" pod="openshift-authentication/oauth-openshift-854d6fff9c-j7bw9"
I0820 05:46:44.757307 1 event.go:291] "Event occurred" object="openshift-authentication/oauth-openshift-854d6fff9c" kind="ReplicaSet" apiVersion="apps/v1" type="Normal" reason="SuccessfulDelete" message="Deleted pod: oauth-openshift-854d6fff9c-j7bw9"
I0820 05:46:44.770854 1 event.go:291] "Event occurred" object="openshift-authentication/oauth-openshift" kind="Deployment" apiVersion="apps/v1" type="Normal" reason="ScalingReplicaSet" message="Scaled up replica set oauth-openshift-9b79df6cb to 3"
I0820 05:46:44.772211 1 replica_set.go:563] "Too few replicas" replicaSet="openshift-authentication/oauth-openshift-9b79df6cb" need=3 creating=1
I0820 05:46:44.813635 1 event.go:291] "Event occurred" object="openshift-authentication/oauth-openshift-9b79df6cb" kind="ReplicaSet" apiVersion="apps/v1" type="Normal" reason="SuccessfulCreate" message="Created pod: oauth-openshift-9b79df6cb-9wgf6"
```

From which:

```
I0820 05:45:45.721679 1 event.go:291] message="Scaled down replica set oauth-openshift-854d6fff9c to 2"
I0820 05:45:45.753403 1 event.go:291] message="Scaled up replica set oauth-openshift-9b79df6cb to 1"
I0820 05:46:15.170261 1 event.go:291] message="Scaled down replica set oauth-openshift-854d6fff9c to 1"
I0820 05:46:15.202449 1 event.go:291] message="Scaled up replica set oauth-openshift-9b79df6cb to 2"
I0820 05:46:44.745094 1 event.go:291] message="Scaled down replica set oauth-openshift-854d6fff9c to 0"
I0820 05:46:44.770854 1 event.go:291] message="Scaled up replica set oauth-openshift-9b79df6cb to 3"
```

From must-gather and host_service_logs/masters/crio_service.log:

```
Aug 20 05:46:41.880945 ip-10-0-221-30 crio[1645]: time="2021-08-20 05:46:41.880868895Z" level=info msg="Running pod sandbox: openshift-authentication/oauth-openshift-9b79df6cb-dhvrb/POD" id=0e7fc010-b3fe-483a-8b0a-fe15a7599f81 name=/runtime.v1alpha2.RuntimeService/RunPodSandbox
Aug 20 05:46:41.902653 ip-10-0-221-30 crio[1645]: time="2021-08-20 05:46:41.902609242Z" level=info msg="Got pod network &{Name:oauth-openshift-9b79df6cb-dhvrb Namespace:openshift-authentication ID:a6bd66adbb6aef3355ebfdebde018989f63a77f15506da65ee3eb314550df94e UID:73f63020-ce8f-478d-9e86-a62aa217a300 NetNS:/var/run/netns/4e0e52d0-09f3-4a90-8e2c-940a2439e0e3 Networks:[] RuntimeConfig:map[multus-cni-network:{IP: MAC: PortMappings:[] Bandwidth:<nil> IpRanges:[]}] Aliases:map[]}"
Aug 20 05:46:41.902653 ip-10-0-221-30 crio[1645]: time="2021-08-20 05:46:41.902654779Z" level=info msg="Adding pod openshift-authentication_oauth-openshift-9b79df6cb-dhvrb to CNI network \"multus-cni-network\" (type=multus)"
Aug 20 05:46:43.834063 ip-10-0-221-30 crio[1645]:
2021-08-20T05:46:43Z [verbose] Add: openshift-authentication:oauth-openshift-9b79df6cb-dhvrb:73f63020-ce8f-478d-9e86-a62aa217a300:ovn-kubernetes(ovn-kubernetes):eth0 {"cniVersion":"0.4.0","interfaces":[{"name":"a6bd66adbb6aef3","mac":"c2:d8:67:f4:55:dd"},{"name":"eth0","mac":"0a:58:0a:80:00:2d","sandbox":"/var/run/netns/4e0e52d0-09f3-4a90-8e2c-940a2439e0e3"}],"ips":[{"version":"4","interface":1,"address":"10.128.0.45/23","gateway":"10.128.0.1"}],"dns":{}}
Aug 20 05:46:43.834063 ip-10-0-221-30 crio[1645]: I0820 05:46:43.809646 53468 event.go:282] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-authentication", Name:"oauth-openshift-9b79df6cb-dhvrb", UID:"73f63020-ce8f-478d-9e86-a62aa217a300", APIVersion:"v1", ResourceVersion:"28289", FieldPath:""}): type: 'Normal' reason: 'AddedInterface' Add eth0 [10.128.0.45/23] from ovn-kubernetes
Aug 20 05:46:43.835370 ip-10-0-221-30 crio[1645]: time="2021-08-20 05:46:43.835334112Z" level=info msg="Got pod network &{Name:oauth-openshift-9b79df6cb-dhvrb Namespace:openshift-authentication ID:a6bd66adbb6aef3355ebfdebde018989f63a77f15506da65ee3eb314550df94e UID:73f63020-ce8f-478d-9e86-a62aa217a300 NetNS:/var/run/netns/4e0e52d0-09f3-4a90-8e2c-940a2439e0e3 Networks:[] RuntimeConfig:map[multus-cni-network:{IP: MAC: PortMappings:[] Bandwidth:<nil> IpRanges:[]}] Aliases:map[]}"
Aug 20 05:46:43.835585 ip-10-0-221-30 crio[1645]: time="2021-08-20 05:46:43.835549249Z" level=info msg="Checking pod openshift-authentication_oauth-openshift-9b79df6cb-dhvrb for CNI network multus-cni-network (type=multus)"
Aug 20 05:46:43.870277 ip-10-0-221-30 crio[1645]: time="2021-08-20 05:46:43.870235309Z" level=info msg="Ran pod sandbox a6bd66adbb6aef3355ebfdebde018989f63a77f15506da65ee3eb314550df94e with infra container: openshift-authentication/oauth-openshift-9b79df6cb-dhvrb/POD" id=0e7fc010-b3fe-483a-8b0a-fe15a7599f81 name=/runtime.v1alpha2.RuntimeService/RunPodSandbox
Aug 20 05:46:43.875169 ip-10-0-221-30 crio[1645]: time="2021-08-20 05:46:43.875145473Z" level=info msg="Creating container: openshift-authentication/oauth-openshift-9b79df6cb-dhvrb/oauth-openshift" id=7994b2e3-aae3-42bd-bbc6-7c3c1db50729 name=/runtime.v1alpha2.RuntimeService/CreateContainer
Aug 20 05:46:44.154018 ip-10-0-221-30 crio[1645]: time="2021-08-20 05:46:44.153953325Z" level=info msg="Created container 354d411a59b72cc655cb7cc066defc1f6d0202bc3e70a7de7638fe8d05deaa13: openshift-authentication/oauth-openshift-9b79df6cb-dhvrb/oauth-openshift" id=7994b2e3-aae3-42bd-bbc6-7c3c1db50729 name=/runtime.v1alpha2.RuntimeService/CreateContainer
Aug 20 05:46:44.175436 ip-10-0-221-30 crio[1645]: time="2021-08-20 05:46:44.175362799Z" level=info msg="Started container" PID=53567 containerID=354d411a59b72cc655cb7cc066defc1f6d0202bc3e70a7de7638fe8d05deaa13 description=openshift-authentication/oauth-openshift-9b79df6cb-dhvrb/oauth-openshift id=7cfed9b2-7f10-45c0-8286-e30ba809f1b9 name=/runtime.v1alpha2.RuntimeService/StartContainer sandboxID=a6bd66adbb6aef3355ebfdebde018989f63a77f15506da65ee3eb314550df94e
Aug 20 05:47:11.817443 ip-10-0-179-152 crio[1642]: time="2021-08-20 05:47:11.817393617Z" level=info msg="Running pod sandbox: openshift-authentication/oauth-openshift-9b79df6cb-9wgf6/POD" id=438c5bb5-02a0-4cd9-8409-bcff85f85b06 name=/runtime.v1alpha2.RuntimeService/RunPodSandbox
Aug 20 05:47:11.857327 ip-10-0-179-152 crio[1642]: time="2021-08-20 05:47:11.857284805Z" level=info msg="Got pod network &{Name:oauth-openshift-9b79df6cb-9wgf6 Namespace:openshift-authentication ID:d2062f1fad1bed879a3fdf284dc7f1e95e8b313ca96be0cc355e2a890310676c UID:323debbc-dd6e-42ca-aee2-499dea4789b3 NetNS:/var/run/netns/20389f00-d34a-4ce1-b8b4-cc523eed8f58 Networks:[] RuntimeConfig:map[multus-cni-network:{IP: MAC: PortMappings:[] Bandwidth:<nil> IpRanges:[]}] Aliases:map[]}"
Aug 20 05:47:11.857327 ip-10-0-179-152 crio[1642]: time="2021-08-20 05:47:11.857327467Z" level=info msg="Adding pod openshift-authentication_oauth-openshift-9b79df6cb-9wgf6 to CNI network \"multus-cni-network\" (type=multus)"
Aug 20 05:47:13.890280 ip-10-0-179-152 crio[1642]: 2021-08-20T05:47:13Z [verbose] Add: openshift-authentication:oauth-openshift-9b79df6cb-9wgf6:323debbc-dd6e-42ca-aee2-499dea4789b3:ovn-kubernetes(ovn-kubernetes):eth0 {"cniVersion":"0.4.0","interfaces":[{"name":"d2062f1fad1bed8","mac":"6e:ac:a4:09:2c:0e"},{"name":"eth0","mac":"0a:58:0a:82:00:3d","sandbox":"/var/run/netns/20389f00-d34a-4ce1-b8b4-cc523eed8f58"}],"ips":[{"version":"4","interface":1,"address":"10.130.0.61/23","gateway":"10.130.0.1"}],"dns":{}}
Aug 20 05:47:13.890280 ip-10-0-179-152 crio[1642]: I0820 05:47:13.865011 71431 event.go:282] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-authentication", Name:"oauth-openshift-9b79df6cb-9wgf6", UID:"323debbc-dd6e-42ca-aee2-499dea4789b3", APIVersion:"v1", ResourceVersion:"28523", FieldPath:""}): type: 'Normal' reason: 'AddedInterface' Add eth0 [10.130.0.61/23] from ovn-kubernetes
Aug 20 05:47:13.891836 ip-10-0-179-152 crio[1642]: time="2021-08-20 05:47:13.891750971Z" level=info msg="Got pod network &{Name:oauth-openshift-9b79df6cb-9wgf6 Namespace:openshift-authentication ID:d2062f1fad1bed879a3fdf284dc7f1e95e8b313ca96be0cc355e2a890310676c UID:323debbc-dd6e-42ca-aee2-499dea4789b3 NetNS:/var/run/netns/20389f00-d34a-4ce1-b8b4-cc523eed8f58 Networks:[] RuntimeConfig:map[multus-cni-network:{IP: MAC: PortMappings:[] Bandwidth:<nil> IpRanges:[]}] Aliases:map[]}"
Aug 20 05:47:13.892094 ip-10-0-179-152 crio[1642]: time="2021-08-20 05:47:13.892066862Z" level=info msg="Checking pod openshift-authentication_oauth-openshift-9b79df6cb-9wgf6 for CNI network multus-cni-network (type=multus)"
Aug 20 05:47:13.907960 ip-10-0-179-152 crio[1642]: time="2021-08-20 05:47:13.907919770Z" level=info msg="Ran pod sandbox d2062f1fad1bed879a3fdf284dc7f1e95e8b313ca96be0cc355e2a890310676c with infra container: openshift-authentication/oauth-openshift-9b79df6cb-9wgf6/POD" id=438c5bb5-02a0-4cd9-8409-bcff85f85b06 name=/runtime.v1alpha2.RuntimeService/RunPodSandbox
Aug 20 05:47:13.913124 ip-10-0-179-152 crio[1642]: time="2021-08-20 05:47:13.913101443Z" level=info msg="Creating container: openshift-authentication/oauth-openshift-9b79df6cb-9wgf6/oauth-openshift" id=a6971e66-2849-4606-b3c9-a227d7fd4ca3 name=/runtime.v1alpha2.RuntimeService/CreateContainer
Aug 20 05:47:14.115938 ip-10-0-179-152 crio[1642]: time="2021-08-20 05:47:14.115896673Z" level=info msg="Created container 0cefd44ed0ec845dd8e4b5c93a431462eea513987449df3cadb0fb12931f7e71: openshift-authentication/oauth-openshift-9b79df6cb-9wgf6/oauth-openshift" id=a6971e66-2849-4606-b3c9-a227d7fd4ca3 name=/runtime.v1alpha2.RuntimeService/CreateContainer
Aug 20 05:47:14.141299 ip-10-0-179-152 crio[1642]: time="2021-08-20 05:47:14.141170202Z" level=info msg="Started container" PID=71508 containerID=0cefd44ed0ec845dd8e4b5c93a431462eea513987449df3cadb0fb12931f7e71 description=openshift-authentication/oauth-openshift-9b79df6cb-9wgf6/oauth-openshift id=3a4e3297-2276-4f48-9ee1-a4c6fcaec30b name=/runtime.v1alpha2.RuntimeService/StartContainer sandboxID=d2062f1fad1bed879a3fdf284dc7f1e95e8b313ca96be0cc355e2a890310676c
Aug 20 05:46:12.490298 ip-10-0-181-79 crio[1643]: time="2021-08-20 05:46:12.490239160Z" level=info msg="Running pod sandbox: openshift-authentication/oauth-openshift-9b79df6cb-6f6hk/POD" id=7331708c-9b2d-43e5-9add-97a533157bc0 name=/runtime.v1alpha2.RuntimeService/RunPodSandbox
Aug 20 05:46:12.521892 ip-10-0-181-79 crio[1643]: time="2021-08-20 05:46:12.521837624Z" level=info msg="Got pod network &{Name:oauth-openshift-9b79df6cb-6f6hk Namespace:openshift-authentication ID:f470dfa72b93597f82e20b5a3bdff04c5cdf8d3459abc52c649e757a05a9ad99 UID:b3689bf3-844e-4595-ad6c-e379879289bc NetNS:/var/run/netns/93d326a1-8b32-4d23-a18c-37ff86b2f4c6 Networks:[] RuntimeConfig:map[multus-cni-network:{IP: MAC: PortMappings:[] Bandwidth:<nil> IpRanges:[]}] Aliases:map[]}"
Aug 20 05:46:12.521892 ip-10-0-181-79 crio[1643]: time="2021-08-20 05:46:12.521888708Z" level=info msg="Adding pod openshift-authentication_oauth-openshift-9b79df6cb-6f6hk to CNI network \"multus-cni-network\" (type=multus)"
Aug 20 05:46:14.010119 ip-10-0-181-79 crio[1643]: 2021-08-20T05:46:13Z [verbose] Add: openshift-authentication:oauth-openshift-9b79df6cb-6f6hk:b3689bf3-844e-4595-ad6c-e379879289bc:ovn-kubernetes(ovn-kubernetes):eth0 {"cniVersion":"0.4.0","interfaces":[{"name":"f470dfa72b93597","mac":"56:13:f6:51:b4:e3"},{"name":"eth0","mac":"0a:58:0a:81:00:23","sandbox":"/var/run/netns/93d326a1-8b32-4d23-a18c-37ff86b2f4c6"}],"ips":[{"version":"4","interface":1,"address":"10.129.0.35/23","gateway":"10.129.0.1"}],"dns":{}}
Aug 20 05:46:14.010119 ip-10-0-181-79 crio[1643]: I0820 05:46:13.981905 53298 event.go:282] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-authentication", Name:"oauth-openshift-9b79df6cb-6f6hk", UID:"b3689bf3-844e-4595-ad6c-e379879289bc", APIVersion:"v1", ResourceVersion:"28045", FieldPath:""}): type: 'Normal' reason: 'AddedInterface' Add eth0 [10.129.0.35/23] from ovn-kubernetes
Aug 20 05:46:14.010691 ip-10-0-181-79 crio[1643]: time="2021-08-20 05:46:14.010656394Z" level=info msg="Got pod network &{Name:oauth-openshift-9b79df6cb-6f6hk Namespace:openshift-authentication ID:f470dfa72b93597f82e20b5a3bdff04c5cdf8d3459abc52c649e757a05a9ad99 UID:b3689bf3-844e-4595-ad6c-e379879289bc NetNS:/var/run/netns/93d326a1-8b32-4d23-a18c-37ff86b2f4c6 Networks:[] RuntimeConfig:map[multus-cni-network:{IP: MAC: PortMappings:[] Bandwidth:<nil> IpRanges:[]}] Aliases:map[]}"
Aug 20 05:46:14.010855 ip-10-0-181-79 crio[1643]: time="2021-08-20 05:46:14.010835979Z" level=info msg="Checking pod openshift-authentication_oauth-openshift-9b79df6cb-6f6hk for CNI network multus-cni-network (type=multus)"
Aug 20 05:46:14.020664 ip-10-0-181-79 crio[1643]: time="2021-08-20 05:46:14.020597705Z" level=info msg="Ran pod sandbox f470dfa72b93597f82e20b5a3bdff04c5cdf8d3459abc52c649e757a05a9ad99 with infra container: openshift-authentication/oauth-openshift-9b79df6cb-6f6hk/POD" id=7331708c-9b2d-43e5-9add-97a533157bc0 name=/runtime.v1alpha2.RuntimeService/RunPodSandbox
Aug 20 05:46:14.027498 ip-10-0-181-79 crio[1643]: time="2021-08-20 05:46:14.027472074Z" level=info msg="Creating container: openshift-authentication/oauth-openshift-9b79df6cb-6f6hk/oauth-openshift" id=21c6b0d6-c28e-4fba-9848-ca6f4a7d12e8 name=/runtime.v1alpha2.RuntimeService/CreateContainer
Aug 20 05:46:14.180079 ip-10-0-181-79 crio[1643]: time="2021-08-20 05:46:14.180040980Z" level=info msg="Created container 1ec7eaa560ae11b7470d6df481e27e5654bee1524f43308e664194e6f0b8ba01: openshift-authentication/oauth-openshift-9b79df6cb-6f6hk/oauth-openshift" id=21c6b0d6-c28e-4fba-9848-ca6f4a7d12e8 name=/runtime.v1alpha2.RuntimeService/CreateContainer
Aug 20 05:46:14.193655 ip-10-0-181-79 crio[1643]: time="2021-08-20 05:46:14.193610435Z" level=info msg="Started container" PID=53397 containerID=1ec7eaa560ae11b7470d6df481e27e5654bee1524f43308e664194e6f0b8ba01 description=openshift-authentication/oauth-openshift-9b79df6cb-6f6hk/oauth-openshift id=639c66fd-50c7-4513-99f2-8363a2cbc82a name=/runtime.v1alpha2.RuntimeService/StartContainer sandboxID=f470dfa72b93597f82e20b5a3bdff04c5cdf8d3459abc52c649e757a05a9ad99
```

One can see that all the newly spawned containers have been running since they were started. The metrics confirm the same (see the attachment). The deployment has reported 3 available replicas since 05:47:54, yet 73 seconds later (at 05:49:07.270768) the operator still reports that one replica is missing, flipping the Degraded condition to True even though it should not.
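The Degraded message comes from comparing the Deployment's desired and available replica counts. A simplified model of that check (this is not the actual cluster-authentication-operator code, and it drops the " for oauth-openshift.openshift-authentication ()" suffix) shows how a stale cached Deployment status, e.g. availableReplicas still reading 2 after all pods are in fact ready, yields a spurious Degraded=True:

```go
package main

import "fmt"

// degradedMessage mimics the operator's availability check: if the
// (possibly stale, informer-cached) Deployment status shows fewer
// available replicas than desired, return a Degraded-style message;
// otherwise return the empty string (all is well).
func degradedMessage(desired, available int) string {
	if available >= desired {
		return ""
	}
	return fmt.Sprintf("%d of %d requested instances are unavailable", desired-available, desired)
}

func main() {
	// Live state at 05:47:54: all three replicas available -> no degradation.
	fmt.Println(degradedMessage(3, 3) == "") // true
	// State the operator still acted on at 05:49:07: one replica "missing".
	fmt.Println(degradedMessage(3, 2)) // 1 of 3 requested instances are unavailable
}
```

Under this model, the bug would not be in the check itself but in the staleness of the `available` figure it is fed.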
Created attachment 1817379 [details]
count(kube_pod_status_ready{namespace="openshift-authentication",pod=~"oauth-openshift-9b79df6cb-.*",condition="true"}) by (pods)

All three oauth-openshift-9b79df6cb-.* pods have been ready since 05:47:50.
Possibly other occurrences in https://search.ci.openshift.org/?search=OAuthServerDeploymentDegraded%3A+1+of+3+requested+instances+are+unavailable&maxAge=168h&context=1&type=junit&name=&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job
I’m adding UpcomingSprint, because I was occupied by fixing bugs with higher priority/severity, developing new features with higher priority, or developing new features to improve stability at a macro level. I will revisit this bug next sprint.
A note to myself - Standa suspects client-side throttling. Here are some example logs: https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_cluster-authentication-operator/478/pull-ci-openshift-cluster-authentication-operator-master-e2e-operator/1436227570038214656/artifacts/e2e-operator/gather-extra/artifacts/pods/openshift-authentication-operator_authentication-operator-8679ffdb48-57md6_authentication-operator.log
I analyzed the second case presented by jchaloup (https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.9-e2e-aws-ovn/1428583979283386368), trying to find out what happens between 05:47:54 and 05:49:07+:
```
2021-08-20T05:47:03.435822885Z I0820 05:47:03.435771 1 status_controller.go:211] clusteroperator/authentication diff {"status":{"conditions":[{"lastTransitionTime":"2021-08-20T05:43:38Z","message":"OAuthServerDeploymentDegraded: 1 of 3 requested instances are unavailable for oauth-openshift.openshift-authentication ()\nWellKnownReadyControllerDegraded: need at least 3 kube-apiservers, got 2","reason":"OAuthServerDeployment_UnavailablePod::WellKnownReadyController_SyncError","status":"True","type":"Degraded"},{"lastTransitionTime":"2021-08-20T05:36:17Z","message":"WellKnownReadyControllerProgressing: kube-apiserver oauth endpoint https://10.0.179.152:6443/.well-known/oauth-authorization-server is not yet served and authentication operator keeps waiting (check kube-apiserver operator, and check that instances roll out successfully, which can take several minutes per instance)","reason":"WellKnownReadyController_OAuthMetadataNotYetServed","status":"True","type":"Progressing"},{"lastTransitionTime":"2021-08-20T05:23:52Z","message":"WellKnownAvailable: The well-known endpoint is not yet available: need at least 3 kube-apiservers, got 2","reason":"WellKnown_NotReady","status":"False","type":"Available"},{"lastTransitionTime":"2021-08-20T05:23:51Z","message":"All is well","reason":"AsExpected","status":"True","type":"Upgradeable"}]}}
2021-08-20T05:47:03.459082069Z I0820 05:47:03.455324 1 event.go:282] Event(v1.ObjectReference{Kind:"Namespace", Namespace:"openshift-authentication-operator", Name:"openshift-authentication-operator", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for clusteroperator/authentication changed: Degraded message changed from "WellKnownReadyControllerDegraded: need at least 3 kube-apiservers, got 2" to "OAuthServerDeploymentDegraded: 1 of 3 requested instances are unavailable for oauth-openshift.openshift-authentication ()\nWellKnownReadyControllerDegraded: need at least 3 kube-apiservers, got 2",Progressing message changed from "OAuthServerDeploymentProgressing: deployment/oauth-openshift.openshift-authentication: 2/3 pods have been updated to the latest generation\nWellKnownReadyControllerProgressing: kube-apiserver oauth endpoint https://10.0.179.152:6443/.well-known/oauth-authorization-server is not yet served and authentication operator keeps waiting (check kube-apiserver operator, and check that instances roll out successfully, which can take several minutes per instance)" to "WellKnownReadyControllerProgressing: kube-apiserver oauth endpoint https://10.0.179.152:6443/.well-known/oauth-authorization-server is not yet served and authentication operator keeps waiting (check kube-apiserver operator, and check that instances roll out successfully, which can take several minutes per instance)"
2021-08-20T05:47:26.146880265Z I0820 05:47:26.146833 1 request.go:668] Waited for 1.072004557s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-oauth-apiserver/configmaps/audit
2021-08-20T05:47:27.147733220Z I0820 05:47:27.147690 1 request.go:668] Waited for 1.5958309s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-authentication/secrets/v4-0-config-system-router-certs
2021-08-20T05:47:28.147879448Z I0820 05:47:28.147836 1 request.go:668] Waited for 1.796436361s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-oauth-apiserver/configmaps/etcd-serving-ca
2021-08-20T05:47:29.015757077Z E0820 05:47:29.015703 1 base_controller.go:272] WellKnownReadyController reconciliation failed: need at least 3 kube-apiservers, got 2
2021-08-20T05:47:29.347741995Z I0820 05:47:29.347693 1 request.go:668] Waited for 1.195671848s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-oauth-apiserver/configmaps/trusted-ca-bundle
2021-08-20T05:47:30.547362686Z I0820 05:47:30.547314 1 request.go:668] Waited for 1.506119023s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-config-managed/secrets/router-certs
2021-08-20T05:47:31.547740259Z I0820 05:47:31.547698 1 request.go:668] Waited for 1.396113997s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-authentication/secrets/v4-0-config-system-session
2021-08-20T05:47:32.747015572Z I0820 05:47:32.746972 1 request.go:668] Waited for 1.188849439s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-authentication/configmaps/v4-0-config-system-cliconfig
2021-08-20T05:47:59.011135667Z E0820 05:47:59.011092 1 base_controller.go:272] WellKnownReadyController reconciliation failed: need at least 3 kube-apiservers, got 2
2021-08-20T05:48:20.594090404Z I0820 05:48:20.592221 1 status_controller.go:211] clusteroperator/authentication diff {"status":{"conditions":[{"lastTransitionTime":"2021-08-20T05:43:38Z","message":"OAuthServerDeploymentDegraded: 1 of 3 requested instances are unavailable for oauth-openshift.openshift-authentication ()\nWellKnownReadyControllerDegraded: need at least 3 kube-apiservers, got 2","reason":"OAuthServerDeployment_UnavailablePod::WellKnownReadyController_SyncError","status":"True","type":"Degraded"},{"lastTransitionTime":"2021-08-20T05:48:20Z","message":"AuthenticatorCertKeyProgressing: All is well","reason":"AsExpected","status":"False","type":"Progressing"},{"lastTransitionTime":"2021-08-20T05:48:20Z","message":"All is well","reason":"AsExpected","status":"True","type":"Available"},{"lastTransitionTime":"2021-08-20T05:23:51Z","message":"All is well","reason":"AsExpected","status":"True","type":"Upgradeable"}]}}
2021-08-20T05:48:20.610361186Z I0820 05:48:20.610313 1 event.go:282] Event(v1.ObjectReference{Kind:"Namespace", Namespace:"openshift-authentication-operator", Name:"openshift-authentication-operator", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for clusteroperator/authentication changed: Progressing changed from True to False ("AuthenticatorCertKeyProgressing: All is well"),Available changed from False to True ("All is well")
2021-08-20T05:48:20.663368755Z I0820 05:48:20.663321 1 status_controller.go:211] clusteroperator/authentication diff {"status":{"conditions":[{"lastTransitionTime":"2021-08-20T05:48:20Z","message":"OAuthServerDeploymentDegraded: 1 of 3 requested instances are unavailable for oauth-openshift.openshift-authentication ()","reason":"AsExpected","status":"False","type":"Degraded"},{"lastTransitionTime":"2021-08-20T05:48:20Z","message":"AuthenticatorCertKeyProgressing: All is well","reason":"AsExpected","status":"False","type":"Progressing"},{"lastTransitionTime":"2021-08-20T05:48:20Z","message":"All is well","reason":"AsExpected","status":"True","type":"Available"},{"lastTransitionTime":"2021-08-20T05:23:51Z","message":"All is well","reason":"AsExpected","status":"True","type":"Upgradeable"}]}}
2021-08-20T05:48:20.709108344Z I0820 05:48:20.709042 1 event.go:282] Event(v1.ObjectReference{Kind:"Namespace", Namespace:"openshift-authentication-operator", Name:"openshift-authentication-operator", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for clusteroperator/authentication changed: Degraded changed from True to False ("OAuthServerDeploymentDegraded: 1 of 3 requested instances are unavailable for oauth-openshift.openshift-authentication ()")
2021-08-20T05:48:21.793847210Z I0820 05:48:21.793805 1 request.go:668] Waited for 1.136380693s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-oauth-apiserver/configmaps/audit
2021-08-20T05:48:22.794617185Z I0820 05:48:22.794543 1 request.go:668] Waited for 1.394260215s due to client-side throttling, not priority and fairness, request: DELETE:https://172.30.0.1:443/api/v1/namespaces/openshift-oauth-apiserver/secrets/encryption-config
2021-08-20T05:48:26.194132017Z I0820 05:48:26.194089 1 request.go:668] Waited for 1.116075959s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-config-managed/secrets/router-certs
2021-08-20T05:48:27.194324546Z I0820 05:48:27.194270 1 request.go:668] Waited for 1.591906687s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-oauth-apiserver/configmaps/audit
2021-08-20T05:48:28.394654866Z I0820 05:48:28.394604 1 request.go:668] Waited for 1.797038312s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-oauth-apiserver/configmaps/etcd-serving-ca
2021-08-20T05:48:29.594191226Z I0820 05:48:29.594150 1 request.go:668] Waited for 1.374382932s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-authentication/configmaps/v4-0-config-system-service-ca
2021-08-20T05:48:30.594671125Z I0820 05:48:30.594518 1 request.go:668] Waited for 1.552775541s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-config-managed/secrets/router-certs
2021-08-20T05:48:31.794529674Z I0820 05:48:31.794489 1 request.go:668] Waited for 1.396213117s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-authentication/secrets/v4-0-config-system-session
2021-08-20T05:48:32.794677831Z I0820 05:48:32.794636 1 request.go:668] Waited for 1.187852265s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-authentication
2021-08-20T05:49:07.254257397Z I0820 05:49:07.254217 1 status_controller.go:211] clusteroperator/authentication diff {"status":{"conditions":[{"lastTransitionTime":"2021-08-20T05:49:07Z","message":"OAuthServerDeploymentDegraded: 1 of 3 requested instances are unavailable for oauth-openshift.openshift-authentication ()","reason":"OAuthServerDeployment_UnavailablePod","status":"True","type":"Degraded"},{"lastTransitionTime":"2021-08-20T05:48:20Z","message":"AuthenticatorCertKeyProgressing: All is well","reason":"AsExpected","status":"False","type":"Progressing"},{"lastTransitionTime":"2021-08-20T05:48:20Z","message":"All is well","reason":"AsExpected","status":"True","type":"Available"},{"lastTransitionTime":"2021-08-20T05:23:51Z","message":"All is well","reason":"AsExpected","status":"True","type":"Upgradeable"}]}}
2021-08-20T05:49:07.270816501Z I0820 05:49:07.270768 1 event.go:282] Event(v1.ObjectReference{Kind:"Namespace", Namespace:"openshift-authentication-operator", Name:"openshift-authentication-operator", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for clusteroperator/authentication changed: Degraded changed from False to True ("OAuthServerDeploymentDegraded: 1 of 3 requested instances are unavailable for oauth-openshift.openshift-authentication ()")
2021-08-20T05:49:26.277476479Z I0820 05:49:26.277434 1 request.go:668] Waited for 1.051254939s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-authentication/serviceaccounts/oauth-openshift
2021-08-20T05:49:29.277667550Z I0820 05:49:29.277621 1 request.go:668] Waited for 1.030436518s due to client-side throttling, not priority and fairness, request: DELETE:https://172.30.0.1:443/api/v1/namespaces/openshift-oauth-apiserver/secrets/encryption-config
2021-08-20T05:49:30.477463987Z I0820 05:49:30.477424 1 request.go:668] Waited for 1.044396892s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-config/secrets/webhook-authentication-integrated-oauth
2021-08-20T05:50:10.673277949Z I0820 05:50:10.673222 1 status_controller.go:211] clusteroperator/authentication diff {"status":{"conditions":[{"lastTransitionTime":"2021-08-20T05:50:10Z","message":"All is well","reason":"AsExpected","status":"False","type":"Degraded"},{"lastTransitionTime":"2021-08-20T05:48:20Z","message":"AuthenticatorCertKeyProgressing: All is well","reason":"AsExpected","status":"False","type":"Progressing"},{"lastTransitionTime":"2021-08-20T05:48:20Z","message":"All is well","reason":"AsExpected","status":"True","type":"Available"},{"lastTransitionTime":"2021-08-20T05:23:51Z","message":"All is well","reason":"AsExpected","status":"True","type":"Upgradeable"}]}}
2021-08-20T05:50:10.694601820Z I0820 05:50:10.693487 1 status_controller.go:211] clusteroperator/authentication diff {"status":{"conditions":[{"lastTransitionTime":"2021-08-20T05:50:10Z","message":"All is well","reason":"AsExpected","status":"False","type":"Degraded"},{"lastTransitionTime":"2021-08-20T05:48:20Z","message":"AuthenticatorCertKeyProgressing: All is well","reason":"AsExpected","status":"False","type":"Progressing"},{"lastTransitionTime":"2021-08-20T05:48:20Z","message":"All is well","reason":"AsExpected","status":"True","type":"Available"},{"lastTransitionTime":"2021-08-20T05:23:51Z","message":"All is well","reason":"AsExpected","status":"True","type":"Upgradeable"}]}}
2021-08-20T05:50:10.698154172Z I0820 05:50:10.698103 1 event.go:282] Event(v1.ObjectReference{Kind:"Namespace", Namespace:"openshift-authentication-operator", Name:"openshift-authentication-operator", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for clusteroperator/authentication changed: Degraded changed from True to False ("All is well")
2021-08-20T05:50:10.704125566Z I0820 05:50:10.704068 1 event.go:282] Event(v1.ObjectReference{Kind:"Namespace", Namespace:"openshift-authentication-operator", Name:"openshift-authentication-operator", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for clusteroperator/authentication changed: Degraded changed from True to False ("All is well")
```
I found two things that are suspicious:
- Events are emitted twice, which may suggest that there are multiple worker threads. This is very unlikely, as the OAuthServerWorkloadController runs with a single worker.
- Client-side throttling happened.
reviewed-in-sprint: not enough capacity to work on this bugzilla.
*** Bug 2024416 has been marked as a duplicate of this bug. ***
Hello Team, any update on this? The customer is continuously asking for updates. Please let us know if any information is required from our end. Regards, Akhil Harchinder, Red Hat
Hello Team, Any update on this? Regards, Hala Elwazery Red Hat
Hello @Sebastian, I'm following up again on this bugzilla, as the support case (#3057022 - Waiting on Red Hat) is still open because of this BZ and we haven't had any update since March 17. Did you have any chance to get a stable reproducer for this issue? Thank you, have a great day. Hala Elwazery - Technical Account Manager, OpenShift
Synced with Ruiyang and shared the necessary guide; he will help verify this bug.
Test payload: 4.11.0-0.nightly-2022-05-18-171831
Test steps: OCP-22434 setup part (https://polarion.engineering.redhat.com/polarion/#/project/OSE/workitem?id=OCP-22434)
1. Save the oauth cluster yaml to oauth_cluster.yaml
2. Change the oauth cluster identityProviders part
3. Measure the time until authentication's PROGRESSING status becomes True (oc get co | grep "authentication")
4. Restore the oauth cluster environment.
Repeat steps 2-4 four times.
Test result: a total of four tests were done, and the time required to obtain the expected result was as follows:
first: 62 seconds
second: 10 seconds
third: 44 seconds
fourth: 28 seconds
All runs stayed well under the 5 minutes originally reported, so the test is basically as expected.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069