Description of problem:
When the ingress operator creates or updates a router deployment, the API sets default values for the protocol field of the router container's ports, which the operator detects as an external update and attempts to revert. The operator should not update the deployment in response to API defaulting.
OpenShift release version:
The issue was introduced in OpenShift 4.11 by <https://github.com/openshift/cluster-ingress-operator/pull/694/commits/af653f9fa7368cf124e11b7ea4666bc40e601165>.
All platforms are affected.
Steps to Reproduce:
1. Launch a new cluster.
2. Check the ingress operator's logs:
oc -n openshift-ingress-operator logs deploy/ingress-operator -c ingress-operator
Actual results:
The operator's logs have "updated router deployment" repeated over and over. For example, in this CI run, I see "updated router deployment" 177 times: https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_cluster-ingress-operator/724/pull-ci-openshift-cluster-ingress-operator-master-e2e-aws-operator/1519044935800590336/artifacts/e2e-aws-operator/gather-extra/artifacts/pods/openshift-ingress-operator_ingress-operator-86dccb55cd-p4529_ingress-operator.log
Expected results:
The operator should ignore updates by the API that only set default values, and the operator should not log "updated router deployment" unless the deployment is updated outside of API defaulting. For example, in this CI run from the release-4.10 branch, I see "updated router deployment" 2 times: https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_cluster-ingress-operator/748/pull-ci-openshift-cluster-ingress-operator-release-4.10-e2e-aws-operator/1519357272591962112/artifacts/e2e-aws-operator/gather-extra/artifacts/pods/openshift-ingress-operator_ingress-operator-59b64ff4bb-7cdnw_ingress-operator.log
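The counts quoted above can be reproduced by counting matching lines in the operator log. The following is a minimal sketch, not part of the operator or its tooling: the helper name count_router_updates and the local file name ingress-operator.log are assumptions made for illustration. It counts log lines, which matches the number of messages only if each message occupies one line.

```shell
# Hypothetical helper: count log lines that report a router deployment update.
# Reads the operator log on stdin and prints the number of matching lines.
count_router_updates() {
  # grep -c prints 0 but exits non-zero when nothing matches, so mask the status.
  grep -c 'updated router deployment' || true
}

# Against a live cluster (same command as in the steps to reproduce):
#   oc -n openshift-ingress-operator logs deploy/ingress-operator -c ingress-operator | count_router_updates
# Against a log downloaded from a CI run (file name is an assumption):
#   count_router_updates < ingress-operator.log
```

A count that keeps growing on an otherwise idle cluster would be consistent with the spurious updates described in this report.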
Impact of the problem:
The spurious reconciliation requests incur excessive CPU and API usage and add noise to logs.
melvinjoseph@mjoseph-mac Downloads % oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-2022-05-04-214114   True        False         163m    Cluster version is 4.11.0-0.nightly-2022-05-04-214114
melvinjoseph@mjoseph-mac Downloads % oc -n openshift-ingress-operator logs deploy/ingress-operator -c ingress-operator
The output contained no "updated router deployment" messages.
I also checked the CI run.
The "updated router deployment" message appears only 9 times there.
Hence marking the bug as verified.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.