Bug 2081447 - Ingress operator performs spurious updates in response to API's defaulting of router deployment's router container's ports' protocol field
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.11
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.11.0
Assignee: Miciah Dashiel Butler Masters
QA Contact: Melvin Joseph
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-05-03 18:22 UTC by Miciah Dashiel Butler Masters
Modified: 2022-08-10 11:10 UTC

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-08-10 11:09:59 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-ingress-operator pull 753 | 0 | None | open | Bug 2081447: `desiredRouterDeployment`: Fix for port defaulting | 2022-05-03 18:28:24 UTC
Red Hat Product Errata | RHSA-2022:5069 | 0 | None | None | None | 2022-08-10 11:10:20 UTC

Description Miciah Dashiel Butler Masters 2022-05-03 18:22:36 UTC
Description of problem:

When the ingress operator creates or updates a router deployment, the API sets default values for the protocol field of the router container's ports, which the operator detects as an external update and attempts to revert.  The operator should not update the deployment in response to API defaulting.
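
The mismatch described above can be illustrated with a minimal Go sketch. It is not the operator's code: `ContainerPort` here is a simplified, hypothetical stand-in for the Kubernetes `corev1.ContainerPort` type, and `withDefaultProtocol` and `portsChanged` are names made up for this illustration. The only fact assumed about the API is that it defaults an empty port protocol to TCP.

```go
package main

import (
	"fmt"
	"reflect"
)

// ContainerPort is a simplified, hypothetical stand-in for the Kubernetes
// corev1.ContainerPort type; only the fields needed for this sketch are kept.
type ContainerPort struct {
	Name          string
	ContainerPort int32
	Protocol      string
}

// withDefaultProtocol returns a copy of ports in which every empty Protocol
// is set to "TCP", mirroring the default that the API applies on write.
func withDefaultProtocol(ports []ContainerPort) []ContainerPort {
	out := make([]ContainerPort, len(ports))
	for i, p := range ports {
		if p.Protocol == "" {
			p.Protocol = "TCP"
		}
		out[i] = p
	}
	return out
}

// portsChanged reports whether the current ports differ from the desired
// ports once API defaulting has been accounted for on both sides.
func portsChanged(current, desired []ContainerPort) bool {
	return !reflect.DeepEqual(withDefaultProtocol(current), withDefaultProtocol(desired))
}

func main() {
	// What the operator asks for: no protocol specified.
	desired := []ContainerPort{{Name: "http", ContainerPort: 80}}
	// What the API returns after defaulting.
	current := []ContainerPort{{Name: "http", ContainerPort: 80, Protocol: "TCP"}}

	// A naive comparison sees the defaulted field as an external change.
	fmt.Println("naive comparison differs:", !reflect.DeepEqual(current, desired))
	// Accounting for the default removes the spurious difference.
	fmt.Println("default-aware comparison differs:", portsChanged(current, desired))
}
```

An alternative with the same effect is to set the protocol explicitly when building the desired deployment, so that API defaulting has nothing left to change. Which approach the linked pull request takes is not stated in this report.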


OpenShift release version:

The issue was introduced in OpenShift 4.11 by <https://github.com/openshift/cluster-ingress-operator/pull/694/commits/af653f9fa7368cf124e11b7ea4666bc40e601165>.


Cluster Platform:

All platforms are affected.


How reproducible:

100%.


Steps to Reproduce:

1. Launch a new cluster.

2. Check the ingress operator's logs:

    oc -n openshift-ingress-operator logs deploy/ingress-operator -c ingress-operator


Actual results:

The operator's logs have "updated router deployment" repeated over and over.  For example, in this CI run, I see "updated router deployment" 177 times: https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_cluster-ingress-operator/724/pull-ci-openshift-cluster-ingress-operator-master-e2e-aws-operator/1519044935800590336/artifacts/e2e-aws-operator/gather-extra/artifacts/pods/openshift-ingress-operator_ingress-operator-86dccb55cd-p4529_ingress-operator.log


Expected results:

The operator should ignore updates by the API that only set default values, and the operator should not log "updated router deployment" unless the deployment is updated outside of API defaulting.  For example, in this CI run from the release-4.10 branch, I see "updated router deployment" 2 times: https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_cluster-ingress-operator/748/pull-ci-openshift-cluster-ingress-operator-release-4.10-e2e-aws-operator/1519357272591962112/artifacts/e2e-aws-operator/gather-extra/artifacts/pods/openshift-ingress-operator_ingress-operator-59b64ff4bb-7cdnw_ingress-operator.log


Impact of the problem:

The spurious reconciliation requests incur excessive CPU and API usage and add noise to logs.

Comment 3 Melvin Joseph 2022-05-05 05:13:53 UTC
melvinjoseph@mjoseph-mac Downloads % oc get clusterversion


NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-2022-05-04-214114   True        False         163m    Cluster version is 4.11.0-0.nightly-2022-05-04-214114

melvinjoseph@mjoseph-mac Downloads % oc -n openshift-ingress-operator logs deploy/ingress-operator -c ingress-operator
There was no "updated router deployment" message.

Also checked the CI run.
https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_cluster-ingress-operator/754/pull-ci-openshift-cluster-ingress-operator-master-e2e-aws-operator/1521829918185361408/artifacts/e2e-aws-operator/gather-extra/artifacts/pods/openshift-ingress-operator_ingress-operator-66bdf59d76-6nrc9_ingress-operator.log

There were not many "updated router deployment" log messages; the message appears only 9 times.

Hence marking the bug as verified.

Comment 5 errata-xmlrpc 2022-08-10 11:09:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

