Bug 2258681 - Backingstore Reconciliation happening over 2000 times an hour [NEEDINFO]
Summary: Backingstore Reconciliation happening over 2000 times an hour
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: Multi-Cloud Object Gateway
Version: 4.13
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ODF 4.15.0
Assignee: Liran Mauda
QA Contact: Uday kurundwade
URL:
Whiteboard:
Depends On:
Blocks: 2260330 2260331
 
Reported: 2024-01-17 02:55 UTC by Alexander
Modified: 2024-03-19 15:31 UTC
CC: 6 users

Fixed In Version: 4.15.0-126
Doc Type: No Doc Update
Doc Text:
Clone Of:
Clones: 2260330 2260331
Environment:
Last Closed: 2024-03-19 15:31:29 UTC
Embargoed:
lmauda: needinfo? (allee)




Links
System ID | Status | Summary | Last Updated
Github noobaa/noobaa-operator issue 1282 | closed | 2000 reconciles every hour | 2024-01-23 08:48:46 UTC
Github noobaa/noobaa-operator pull 1257 | open | Update README | 2024-01-23 08:48:45 UTC
Github noobaa/noobaa-operator pull 1291 | merged | Remove upgradeBackingStore | 2024-01-23 08:48:44 UTC
Github noobaa/noobaa-operator pull 1293 | merged | [Backport into 5.15] backporting some bug fixes | 2024-01-23 08:48:43 UTC
Red Hat Product Errata RHSA-2024:1383 | | | 2024-03-19 15:31:47 UTC

Description Alexander 2024-01-17 02:55:51 UTC
Description of problem (please be as detailed as possible and provide log
snippets):

Backingstore Reconciliation happening over 2000 times an hour

Please review the code with me: the pkg code (backingstore.go) and the controller (backingstore_controller.go) below that calls it.

// MapSecretToBackingStores returns a list of backingstores that use the secret in their secretReference.
// Used by backingstore_controller to watch secret changes.
func MapSecretToBackingStores(secret types.NamespacedName) []reconcile.Request {
	log := util.Logger()
	log.Infof("checking which backingstore to reconcile. mapping secret %v to backingstores", secret)
	bsList := &nbv1.BackingStoreList{
		TypeMeta: metav1.TypeMeta{Kind: "BackingStoreList"},
	}
	if !util.KubeList(bsList, &client.ListOptions{Namespace: secret.Namespace}) {
		log.Infof("Cloud not found backingStores in namespace %q, while trying to find Backingstore that uses %s secrte", secret.Namespace, secret.Name)
		return nil
	}

	reqs := []reconcile.Request{}

	for _, bs := range bsList.Items {
		bsSecret, err := util.GetBackingStoreSecret(&bs)
		if err != nil {
			log.Errorf("%s", err)
		}
		if bsSecret != nil && bsSecret.Name == secret.Name {
			reqs = append(reqs, reconcile.Request{
				NamespacedName: types.NamespacedName{
					Name:      bs.Name,
					Namespace: bs.Namespace,
				},
			})
		}
	}
	log.Infof("will reconcile these backingstores: %v", reqs)

	return reqs
}


And then from its controller, which calls the function above:

	// Set another handler to watch events on secrets that are not necessarily owned by the BackingStore.
	// Only one OwnerReference can be a controller; see:
	// https://github.com/kubernetes-sigs/controller-runtime/blob/master/pkg/controller/controllerutil/controllerutil.go#L54
	secretsHandler := handler.EnqueueRequestsFromMapFunc(func(ctx context.Context, obj client.Object) []reconcile.Request {
		return backingstore.MapSecretToBackingStores(types.NamespacedName{
			Name:      obj.GetName(),
			Namespace: obj.GetNamespace(),
		})
	})
	err = c.Watch(source.Kind(mgr.GetCache(), &corev1.Secret{}), secretsHandler, logEventsPredicate)
	if err != nil {
		return err
	}
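
For reference, most of these reconciles seem to come from secret update events that do not actually change anything, since every event on any secret in the namespace fans out through MapSecretToBackingStores. Below is a minimal sketch of an event filter for the Watch above, assuming the controller-runtime v0.15-style API already used here; the name secretDataChangedPredicate is illustrative, not existing code:

import (
	"reflect"

	corev1 "k8s.io/api/core/v1"
	"sigs.k8s.io/controller-runtime/pkg/event"
	"sigs.k8s.io/controller-runtime/pkg/predicate"
)

// secretDataChangedPredicate drops secret update events whose .Data did not
// change, so resyncs and metadata-only updates stop enqueuing every
// BackingStore that references the secret.
var secretDataChangedPredicate = predicate.Funcs{
	UpdateFunc: func(e event.UpdateEvent) bool {
		oldSecret, okOld := e.ObjectOld.(*corev1.Secret)
		newSecret, okNew := e.ObjectNew.(*corev1.Secret)
		if !okOld || !okNew {
			// Not a Secret (or a partial object): let the event through.
			return true
		}
		return !reflect.DeepEqual(oldSecret.Data, newSecret.Data)
	},
}

It would be passed as an extra predicate to the same Watch call:

	err = c.Watch(source.Kind(mgr.GetCache(), &corev1.Secret{}), secretsHandler,
		logEventsPredicate, secretDataChangedPredicate)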


Should there be a limit on how many times it reconciles every hour?
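
If an explicit limit is wanted, the controller's workqueue rate limiter can be tuned when the controller is built. Note that the rate limiter only throttles requeues (reconcile errors and RequeueAfter), not fresh event-driven adds, so filtering events with a predicate as above is usually the more effective lever. A sketch assuming the standard controller-runtime/client-go APIs; the 5s/5m values are illustrative:

import (
	"time"

	"k8s.io/client-go/util/workqueue"
	"sigs.k8s.io/controller-runtime/pkg/controller"
)

// Build the controller with a slower per-item backoff than the default
// (which starts at 5ms); repeated requeues of the same BackingStore then
// back off exponentially from 5s up to a 5m cap.
c, err := controller.New("backingstore-controller", mgr, controller.Options{
	Reconciler:  r,
	RateLimiter: workqueue.NewItemExponentialFailureRateLimiter(5*time.Second, 5*time.Minute),
})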


Version of all relevant components (if applicable): ODF 4.13


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)? It does not stop operations, but it slows them down.


Is there any workaround available to the best of your knowledge? The code is working; this is an inquiry as to whether the reconciles should be limited.


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?


Is this issue reproducible? N/A; customer environment only.


Can this issue be reproduced from the UI? N/A; customer environment only.


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1.
2.
3.


Actual results:


Expected results:


Additional info:

Comment 16 errata-xmlrpc 2024-03-19 15:31:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.15.0 security, enhancement, & bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:1383

