Bug 1781950 - [4.3] Ingress operator logs spurious "failed to sync ingresscontroller status" errors
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Routing
Version: 4.3.0
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.3.0
Assignee: Miciah Dashiel Butler Masters
QA Contact: Hongan Li
URL:
Whiteboard:
Depends On: 1781948
Blocks:
 
Reported: 2019-12-10 23:24 UTC by Miciah Dashiel Butler Masters
Modified: 2020-01-23 11:19 UTC
CC: 6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1781948
Environment:
Last Closed: 2020-01-23 11:18:54 UTC
Target Upstream Version:


Attachments: none


Links
System ID Priority Status Summary Last Updated
Github openshift cluster-ingress-operator pull 338 None closed [release-4.3] Bug 1781950: Do not wrap errors from syncIngressControllerStatus 2020-02-04 22:30:56 UTC
Red Hat Product Errata RHBA-2020:0062 None None None 2020-01-23 11:19:16 UTC

Description Miciah Dashiel Butler Masters 2019-12-10 23:24:14 UTC
+++ This bug was initially created as a clone of Bug #1781948 +++

When the ingresscontroller's "Degraded" status condition is true, the ingress operator logs spurious "failed to sync ingresscontroller status" errors even though it has successfully updated the ingresscontroller's status.

+++ This bug was initially created as a clone of Bug #1781345 +++

Description of problem:
This is on a 4.3 OCP IPI-installed cluster on Azure. When running the node-vertical test, which deploys up to 250 gcr.io/google_containers/pause-amd64:3.0 pods per worker node in a single namespace, the ingress operator became degraded and 2 worker nodes became NotReady.
This cluster is fips-enabled and using SDN network type.

root@ip-172-31-40-229: ~/openshift-scale/workloads/workloads # oc get co
NAME                                       VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.3.0-0.nightly-2019-12-09-035405   True        False         False      148m
cloud-credential                           4.3.0-0.nightly-2019-12-09-035405   True        False         False      170m
cluster-autoscaler                         4.3.0-0.nightly-2019-12-09-035405   True        False         False      161m
console                                    4.3.0-0.nightly-2019-12-09-035405   True        False         False      156m
dns                                        4.3.0-0.nightly-2019-12-09-035405   True        False         False      166m
image-registry                             4.3.0-0.nightly-2019-12-09-035405   True        False         False      17m
ingress                                    4.3.0-0.nightly-2019-12-09-035405   False       True          True       23m
insights                                   4.3.0-0.nightly-2019-12-09-035405   True        False         False      167m
kube-apiserver                             4.3.0-0.nightly-2019-12-09-035405   True        False         False      165m
kube-controller-manager                    4.3.0-0.nightly-2019-12-09-035405   True        False         False      164m
kube-scheduler                             4.3.0-0.nightly-2019-12-09-035405   True        False         False      163m
machine-api                                4.3.0-0.nightly-2019-12-09-035405   True        False         False      166m
machine-config                             4.3.0-0.nightly-2019-12-09-035405   True        False         False      161m
marketplace                                4.3.0-0.nightly-2019-12-09-035405   True        False         False      162m
monitoring                                 4.3.0-0.nightly-2019-12-09-035405   False       True          True       22m
network                                    4.3.0-0.nightly-2019-12-09-035405   True        True          True       165m
node-tuning                                4.3.0-0.nightly-2019-12-09-035405   True        False         False      162m
openshift-apiserver                        4.3.0-0.nightly-2019-12-09-035405   True        False         False      161m
openshift-controller-manager               4.3.0-0.nightly-2019-12-09-035405   True        False         False      165m
openshift-samples                          4.3.0-0.nightly-2019-12-09-035405   True        False         False      161m
operator-lifecycle-manager                 4.3.0-0.nightly-2019-12-09-035405   True        False         False      166m
operator-lifecycle-manager-catalog         4.3.0-0.nightly-2019-12-09-035405   True        False         False      166m
operator-lifecycle-manager-packageserver   4.3.0-0.nightly-2019-12-09-035405   True        False         False      162m
service-ca                                 4.3.0-0.nightly-2019-12-09-035405   True        False         False      167m
service-catalog-apiserver                  4.3.0-0.nightly-2019-12-09-035405   True        False         False      164m
service-catalog-controller-manager         4.3.0-0.nightly-2019-12-09-035405   True        False         False      164m
storage                                    4.3.0-0.nightly-2019-12-09-035405   True        False         False      162m
root@ip-172-31-40-229: ~/openshift-scale/workloads/workloads # 


In openshift-ingress-operator logs, I am seeing:

2019-12-09T15:48:01.016Z        ERROR   operator.init.controller-runtime.controller     controller/controller.go:218    Reconciler error        {"controller": "ingress_controller", "request": "openshift-ingress-operator/default", "error": "failed to sync ingresscontroller status: IngressController is degraded", "errorCauses": [{"error": "failed to sync ingresscontroller status: IngressController is degraded"}]}

[...]

Comment 1 Dan Mace 2019-12-11 13:36:14 UTC
Just waiting for https://github.com/openshift/cluster-ingress-operator/pull/337 to merge for the automatic backport to happen.

Comment 3 Hongan Li 2019-12-16 04:11:09 UTC
Verified with 4.3.0-0.nightly-2019-12-13-180405; the issue has been fixed.

After ensuring the ingresscontroller is "Degraded", the operator log no longer contains the spurious error:
$ oc -n openshift-ingress-operator logs ingress-operator-bcf5799-d8rlv -c ingress-operator | grep "failed to sync"

Comment 5 errata-xmlrpc 2020-01-23 11:18:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062

