The generic retry logic now used for various resource types in ovnk master retries adding or deleting a given object indefinitely until it succeeds. As with level-driven controllers, we should add an upper bound to the number of retries, after which the retry entry is discarded.
upstream PR: https://github.com/ovn-org/ovn-kubernetes/pull/2970
Scaled a few times up to 123 nodes, unable to get any failed attempts.
4.12.0-0.nightly-2022-08-05-045104
log_ovnkube-master-gk925_ip.eu-central-1.compute.internal:I0805 17:35:39.639802 1 obj_retry.go:1245] Retry successful for *v1.Pod openshift-multus/multus-b2449 after 0 failed attempt(s)
log_ovnkube-master-gk925_ip.eu-central-1.compute.internal:I0805 17:35:39.639896 1 obj_retry.go:1245] Retry successful for *v1.Pod openshift-dns/node-resolver-ktqd7 after 0 failed attempt(s)
log_ovnkube-master-gk925_ip.eu-central-1.compute.internal:I0805 17:35:39.639942 1 obj_retry.go:1245] Retry successful for *v1.Pod openshift-machine-config-operator/machine-config-daemon-grxwt after 0 failed attempt(s)
log_ovnkube-master-gk925_ip.eu-central-1.compute.internal:I0805 17:35:39.642075 1 obj_retry.go:1245] Retry successful for *v1.Pod openshift-network-diagnostics/network-check-target-brvnm after 0 failed attempt(s)
log_ovnkube-master-gk925_ip.eu-central-1.compute.internal:I0805 17:35:39.642480 1 obj_retry.go:1245] Retry successful for *v1.Pod openshift-multus/network-metrics-daemon-d9n6v after 0 failed attempt(s)
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:7399