Bug 1781950

Summary: [4.3] Ingress operator logs spurious "failed to sync ingresscontroller status" errors
Product: OpenShift Container Platform
Component: Networking
Networking sub component: router
Reporter: Miciah Dashiel Butler Masters <mmasters>
Assignee: Miciah Dashiel Butler Masters <mmasters>
QA Contact: Hongan Li <hongli>
CC: aos-bugs, dmace, hongli, mifiedle, mmasters, wabouham
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Version: 4.3.0
Target Release: 4.3.0
Hardware: x86_64
OS: Linux
Clone Of: 1781948
Bug Depends On: 1781948
Last Closed: 2020-01-23 11:18:54 UTC

Description Miciah Dashiel Butler Masters 2019-12-10 23:24:14 UTC
+++ This bug was initially created as a clone of Bug #1781948 +++

The ingress operator logs spurious "failed to sync ingresscontroller status" errors whenever the ingresscontroller's "Degraded" status condition is true, even though it has successfully updated the ingresscontroller's status.
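
For context, here is a minimal Go sketch of the kind of pattern that can produce this log message; the function and variable names are hypothetical and are not taken from the cluster-ingress-operator source:

package main

import (
	"errors"
	"fmt"
	"log"
)

// syncIngressControllerStatus is a hypothetical stand-in for the operator's
// status-sync step. The buggy pattern: even after the status update has
// succeeded, the function still returns an error whenever Degraded is true.
func syncIngressControllerStatus(degraded bool) error {
	// ...status conditions are computed and the status update succeeds...
	if degraded {
		// Returning the condition as an error conflates "the operand is
		// degraded" with "the status sync failed".
		return errors.New("IngressController is degraded")
	}
	return nil
}

func reconcile(degraded bool) error {
	if err := syncIngressControllerStatus(degraded); err != nil {
		// This wrapped error is what ends up in the "Reconciler error" log
		// line, even though nothing about the sync actually failed.
		return fmt.Errorf("failed to sync ingresscontroller status: %v", err)
	}
	return nil
}

func main() {
	if err := reconcile(true); err != nil {
		log.Printf("Reconciler error: %v", err)
	}
}

With this pattern, every reconcile of a degraded ingresscontroller logs a sync failure even though the status update itself succeeded.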

+++ This bug was initially created as a clone of Bug #1781345 +++

Description of problem:
This is on a 4.3 OCP IPI-installed cluster on Azure. When running the node-vertical test, which deploys up to 250 gcr.io/google_containers/pause-amd64:3.0 pods per worker node in a single namespace, the ingress operator became degraded and 2 worker nodes became NotReady.
The cluster is FIPS-enabled and uses the SDN network type.

root@ip-172-31-40-229: ~/openshift-scale/workloads/workloads # oc get co
NAME                                       VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.3.0-0.nightly-2019-12-09-035405   True        False         False      148m
cloud-credential                           4.3.0-0.nightly-2019-12-09-035405   True        False         False      170m
cluster-autoscaler                         4.3.0-0.nightly-2019-12-09-035405   True        False         False      161m
console                                    4.3.0-0.nightly-2019-12-09-035405   True        False         False      156m
dns                                        4.3.0-0.nightly-2019-12-09-035405   True        False         False      166m
image-registry                             4.3.0-0.nightly-2019-12-09-035405   True        False         False      17m
ingress                                    4.3.0-0.nightly-2019-12-09-035405   False       True          True       23m
insights                                   4.3.0-0.nightly-2019-12-09-035405   True        False         False      167m
kube-apiserver                             4.3.0-0.nightly-2019-12-09-035405   True        False         False      165m
kube-controller-manager                    4.3.0-0.nightly-2019-12-09-035405   True        False         False      164m
kube-scheduler                             4.3.0-0.nightly-2019-12-09-035405   True        False         False      163m
machine-api                                4.3.0-0.nightly-2019-12-09-035405   True        False         False      166m
machine-config                             4.3.0-0.nightly-2019-12-09-035405   True        False         False      161m
marketplace                                4.3.0-0.nightly-2019-12-09-035405   True        False         False      162m
monitoring                                 4.3.0-0.nightly-2019-12-09-035405   False       True          True       22m
network                                    4.3.0-0.nightly-2019-12-09-035405   True        True          True       165m
node-tuning                                4.3.0-0.nightly-2019-12-09-035405   True        False         False      162m
openshift-apiserver                        4.3.0-0.nightly-2019-12-09-035405   True        False         False      161m
openshift-controller-manager               4.3.0-0.nightly-2019-12-09-035405   True        False         False      165m
openshift-samples                          4.3.0-0.nightly-2019-12-09-035405   True        False         False      161m
operator-lifecycle-manager                 4.3.0-0.nightly-2019-12-09-035405   True        False         False      166m
operator-lifecycle-manager-catalog         4.3.0-0.nightly-2019-12-09-035405   True        False         False      166m
operator-lifecycle-manager-packageserver   4.3.0-0.nightly-2019-12-09-035405   True        False         False      162m
service-ca                                 4.3.0-0.nightly-2019-12-09-035405   True        False         False      167m
service-catalog-apiserver                  4.3.0-0.nightly-2019-12-09-035405   True        False         False      164m
service-catalog-controller-manager         4.3.0-0.nightly-2019-12-09-035405   True        False         False      164m
storage                                    4.3.0-0.nightly-2019-12-09-035405   True        False         False      162m
root@ip-172-31-40-229: ~/openshift-scale/workloads/workloads # 


In the openshift-ingress-operator logs, I am seeing:

2019-12-09T15:48:01.016Z        ERROR   operator.init.controller-runtime.controller     controller/controller.go:218    Reconciler error        {"controller": "ingress_controller", "request": "openshift-ingress-operator/default", "error": "failed to sync ingresscontroller status: IngressController is degraded", "errorCauses": [{"error": "failed to sync ingresscontroller status: IngressController is degraded"}]}

[...]

Comment 1 Dan Mace 2019-12-11 13:36:14 UTC
Just waiting for https://github.com/openshift/cluster-ingress-operator/pull/337 to merge for the automatic backport to happen.
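
To illustrate the general direction only (this is an assumption about the fix, not a summary of the linked PR), the spurious message goes away if Degraded is reported through the status conditions rather than returned as a sync error:

package main

import (
	"fmt"
	"log"
)

// syncStatus sketches that approach: the Degraded condition is written into
// the ingresscontroller's status conditions, and the returned error is
// reserved for genuine failures such as a failed status update.
func syncStatus(updateStatus func() error) error {
	if err := updateStatus(); err != nil {
		return fmt.Errorf("failed to update ingresscontroller status: %v", err)
	}
	// Degraded=True is recorded in the conditions themselves; it is not
	// returned as an error, so a successful sync logs nothing spurious.
	return nil
}

func main() {
	// Simulate a successful status update while the operand is degraded.
	if err := syncStatus(func() error { return nil }); err != nil {
		log.Printf("Reconciler error: %v", err)
	} else {
		log.Print("status synced; Degraded is reflected in conditions only")
	}
}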

Comment 3 Hongan Li 2019-12-16 04:11:09 UTC
Verified with 4.3.0-0.nightly-2019-12-13-180405; the issue has been fixed.

Ensure the ingresscontroller is "Degraded", then confirm the spurious error is no longer logged:
$ oc -n openshift-ingress-operator logs ingress-operator-bcf5799-d8rlv -c ingress-operator | grep "failed to sync"

Comment 5 errata-xmlrpc 2020-01-23 11:18:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062