Is this problem happening on the same ACM instance as bug #2043568? Both problems seem to be related. We need to take a closer look at the resync logic. The frequent resyncs from multiple managed clusters could be adding too much stress to the search service on the hub. The backoff logic, which waits up to 10 minutes after each error, is meant to prevent problems like this from spiraling, but this is a contributing factor in preventing the service on the hub from fully recovering.
The aggregator log on the hub is showing parsing errors with some labels. Could we get more information about the resource(s) using those labels?

2022-01-18T21:25:12.686283758Z I0118 21:25:12.686111 1 resyncCluster.go:30] Resync for cluster: [REMOVED] edges to insert: 4524
2022-01-18T21:25:13.125077072Z W0118 21:25:13.124019 1 resyncCluster.go:297] Unable to parse string value from interface{} : ['app.kubernetes.io/part-of=day2-ops']
2022-01-18T21:25:13.125077072Z W0118 21:25:13.124046 1 resyncCluster.go:297] Unable to parse string value from interface{} : <nil>
2022-01-18T21:23:47.536773446Z W0118 21:23:47.536703 1 resyncCluster.go:297] Unable to parse string value from interface{} : ['operators.coreos.com/advanced-cluster-management.open-cluster-management=']
2022-01-18T21:23:47.536773446Z W0118 21:23:47.536731 1 resyncCluster.go:297] Unable to parse string value from interface{} : <nil>
2022-01-18T21:23:38.778877208Z W0118 21:23:38.778751 1 resyncCluster.go:297] Unable to parse string value from interface{} : ['olm.owner=compliance-operator.v0.1.47']
2022-01-18T21:23:38.778877208Z W0118 21:23:38.778794 1 resyncCluster.go:297] Unable to parse string value from interface{} : <nil>
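To illustrate one way a warning like this can arise, here is a minimal sketch of a type switch over an `interface{}` value. It is not the aggregator's code: `labelString` is a hypothetical helper, and the guess that the label arrives wrapped in a one-element list is only inferred from the bracketed values in the log.

```go
package main

import "fmt"

// labelString is a hypothetical helper, not the aggregator's real function.
// It accepts a plain string, or (as an assumption based on the bracketed
// values in the log) a one-element list holding a string.
func labelString(v interface{}) (string, bool) {
	switch t := v.(type) {
	case string:
		return t, true
	case []interface{}:
		if len(t) == 1 {
			if s, ok := t[0].(string); ok {
				return s, true
			}
		}
	}
	// nil and every other type end up here.
	return "", false
}

func main() {
	inputs := []interface{}{
		"olm.owner=compliance-operator.v0.1.47",
		[]interface{}{"app.kubernetes.io/part-of=day2-ops"},
		nil,
	}
	for _, in := range inputs {
		if s, ok := labelString(in); ok {
			fmt.Printf("parsed: %s\n", s)
		} else {
			fmt.Printf("Unable to parse string value from interface{} : %v\n", in)
		}
	}
}
```

A parser that only accepted the `string` case would reject both the list value and the nil value, which would match the pairs of warnings above. Knowing which resources carry these labels would help confirm or rule this out.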
This is indeed the same ACM instance as https://bugzilla.redhat.com/show_bug.cgi?id=2043568
*** Bug 2043568 has been marked as a duplicate of this bug. ***
This problem seems to be similar to BZ 2030005, for which a fix has been merged for ACM 2.4.2.
*** This bug has been marked as a duplicate of bug 2030005 ***