Bug 1834989

Summary:	Ingress operator performs spurious updates in response to API's defaulting of liveness and readiness probes
Product:	OpenShift Container Platform	Reporter:	Miciah Dashiel Butler Masters <mmasters>
Component:	Networking	Assignee:	Miciah Dashiel Butler Masters <mmasters>
Networking sub component:	router	QA Contact:	Hongan Li <hongli>
Status:	CLOSED ERRATA	Docs Contact:
Severity:	unspecified
Priority:	unspecified	CC:	aos-bugs
Version:	4.5
Target Milestone:	---
Target Release:	4.5.0
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:	Cause: When the ingress operator reconciles an IngressController, the operator determines whether it needs to update the IngressController's Deployment by constructing an expected Deployment in memory, getting the actual Deployment from the API, and comparing the two. The operator leaves some values unspecified in its expected Deployment. When the API set default values for these unspecified values, the comparison would return a false positive. Consequence: The operator was repeatedly trying to update IngressControllers' Deployments in response to the API's setting default values. Fix: The operator now considers unspecified values and default values to be equal when comparing Deployments. Result: The operator should no longer update Deployments in response to API defaulting.	Story Points:	---
Clone Of:		Environment:
Last Closed:	2020-07-13 17:37:59 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Miciah Dashiel Butler Masters 2020-05-12 20:03:00 UTC

Description of problem:

To determine whether an update is needed, the ingress operator compares the deployment that it gets from the API for an ingress controller with the deployment that it expects to get.  This comparison includes the parameters of the liveness and readiness probes of the deployment's containers.  The API sets default values for these parameters, the operator detects the defaulted values as differences, and as a result the operator repeatedly tries to update the deployment.  The operator should not update the deployment in response to API defaulting.
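
The mismatch described above can be illustrated with a minimal sketch.  This is not the operator's actual code: the Probe type below is a hypothetical, trimmed-down stand-in for the Kubernetes probe type, and withDefaults and probesEqual are illustrative names.  The default values used (timeout 1, period 10, success threshold 1, failure threshold 3) are the ones the Kubernetes API applies to probes.

```go
package main

import "fmt"

// Probe is a hypothetical stand-in for the Kubernetes probe type; only the
// numeric fields that the API fills in with defaults are kept here.
type Probe struct {
	TimeoutSeconds   int32
	PeriodSeconds    int32
	SuccessThreshold int32
	FailureThreshold int32
}

// withDefaults returns a copy of p in which zero (unspecified) fields are
// replaced by the defaults that the Kubernetes API applies to probes.
func withDefaults(p Probe) Probe {
	if p.TimeoutSeconds == 0 {
		p.TimeoutSeconds = 1
	}
	if p.PeriodSeconds == 0 {
		p.PeriodSeconds = 10
	}
	if p.SuccessThreshold == 0 {
		p.SuccessThreshold = 1
	}
	if p.FailureThreshold == 0 {
		p.FailureThreshold = 3
	}
	return p
}

// probesEqual reports whether two probes match once unspecified values are
// treated as equal to their defaults.
func probesEqual(expected, actual Probe) bool {
	return withDefaults(expected) == withDefaults(actual)
}

func main() {
	// The expected probe leaves every value unspecified.
	expected := Probe{}
	// The actual probe is what comes back after API defaulting.
	actual := Probe{TimeoutSeconds: 1, PeriodSeconds: 10, SuccessThreshold: 1, FailureThreshold: 3}

	fmt.Println("naive comparison equal:", expected == actual)
	fmt.Println("default-aware comparison equal:", probesEqual(expected, actual))
}
```

With a naive comparison the two probes differ, which would trigger an update on every reconciliation; with the default-aware comparison they are considered equal, while a genuinely different value (for example a timeout of 5) is still reported as a change.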


Steps to Reproduce:

1. Launch a new cluster.
2. Check the ingress operator's logs:

    oc -n openshift-ingress-operator logs deploy/ingress-operator


Actual results:

The ingress operator's logs have "updated router deployment" repeated scores of times.


Expected results:

The ingress operator should ignore the defaults that the API sets, reach a steady state, and stop logging "updated router deployment".


Additional info:

The following CI run shows "updated router deployment" logged 105 times:

https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_cluster-ingress-operator/391/pull-ci-openshift-cluster-ingress-operator-master-e2e-aws-operator/1489/artifacts/e2e-aws-operator/pods/openshift-ingress-operator_ingress-operator-655cf4d46c-4lzjr_ingress-operator.log

I expect to see "updated router deployment" logged a smaller number of times, as in this CI run, which has only 7 occurrences of "updated router deployment":

https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_cluster-ingress-operator/391/pull-ci-openshift-cluster-ingress-operator-master-e2e-aws-operator/1494/artifacts/e2e-aws-operator/pods/openshift-ingress-operator_ingress-operator-77595bd9f5-wzlnp_ingress-operator.log

Comment 3 Hongan Li 2020-05-19 09:27:08 UTC

Verified with 4.5.0-0.nightly-2020-05-18-225907; the issue has been fixed.

Comment 4 errata-xmlrpc 2020-07-13 17:37:59 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409