Bug 2258681 - Backingstore Reconciliation happening over 2000 times an hour [NEEDINFO]
Summary: Backingstore Reconciliation happening over 2000 times an hour
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: Multi-Cloud Object Gateway
Version: 4.13
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ODF 4.15.0
Assignee: Liran Mauda
QA Contact: Uday kurundwade
URL:
Whiteboard:
Depends On:
Blocks: 2260330 2260331
 
Reported: 2024-01-17 02:55 UTC by Alexander
Modified: 2024-03-19 15:31 UTC
CC: 6 users

Fixed In Version: 4.15.0-126
Doc Type: No Doc Update
Doc Text:
Clone Of:
Clones: 2260330 2260331
Environment:
Last Closed: 2024-03-19 15:31:29 UTC
Embargoed:
lmauda: needinfo? (allee)




Links
System ID | Status | Summary | Last Updated
Github noobaa/noobaa-operator issue 1282 | closed | 2000 reconciles every hour | 2024-01-23 08:48:46 UTC
Github noobaa/noobaa-operator pull 1257 | open | Update README | 2024-01-23 08:48:45 UTC
Github noobaa/noobaa-operator pull 1291 | merged | Remove upgradeBackingStore | 2024-01-23 08:48:44 UTC
Github noobaa/noobaa-operator pull 1293 | merged | [Backport into 5.15] backporting some bug fixes | 2024-01-23 08:48:43 UTC
Red Hat Product Errata RHSA-2024:1383 | | | 2024-03-19 15:31:47 UTC

Description Alexander 2024-01-17 02:55:51 UTC
Description of problem (please be as detailed as possible and provide log
snippets):

Backingstore Reconciliation happening over 2000 times an hour

Please review the code with me: the pkg code (backingstore.go) and the controller (backingstore_controller.go) below that calls it.

// MapSecretToBackingStores returns a list of backingstores that use the secret in their secretReference.
// Used by backingstore_controller to watch secret changes.
func MapSecretToBackingStores(secret types.NamespacedName) []reconcile.Request {
	log := util.Logger()
	log.Infof("checking which backingstore to reconcile. mapping secret %v to backingstores", secret)
	bsList := &nbv1.BackingStoreList{
		TypeMeta: metav1.TypeMeta{Kind: "BackingStoreList"},
	}
	if !util.KubeList(bsList, &client.ListOptions{Namespace: secret.Namespace}) {
		log.Infof("Cloud not found backingStores in namespace %q, while trying to find Backingstore that uses %s secrte", secret.Namespace, secret.Name)
		return nil
	}

	reqs := []reconcile.Request{}

	for _, bs := range bsList.Items {
		bsSecret, err := util.GetBackingStoreSecret(&bs)
		if err != nil {
			log.Errorf("%s", err)
		}
		if bsSecret != nil && bsSecret.Name == secret.Name {
			reqs = append(reqs, reconcile.Request{
				NamespacedName: types.NamespacedName{
					Name:      bs.Name,
					Namespace: bs.Namespace,
				},
			})
		}
	}
	log.Infof("will reconcile these backingstores: %v", reqs)

	return reqs
}


And then from its controller, which calls the function above:

	// Set another handler to watch events on secrets that are not necessarily owned by the BackingStore.
	// Only one OwnerReference can be a controller; see:
	// https://github.com/kubernetes-sigs/controller-runtime/blob/master/pkg/controller/controllerutil/controllerutil.go#L54
	secretsHandler := handler.EnqueueRequestsFromMapFunc(func(ctx context.Context, obj client.Object) []reconcile.Request {
		return backingstore.MapSecretToBackingStores(types.NamespacedName{
			Name:      obj.GetName(),
			Namespace: obj.GetNamespace(),
		})
	})
	err = c.Watch(source.Kind(mgr.GetCache(), &corev1.Secret{}), secretsHandler, logEventsPredicate)
	if err != nil {
		return err
	}
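
For reference, most of these reconciles seem to come from secret update events that do not actually change anything, since every event on any secret in the namespace fans out through MapSecretToBackingStores. Below is a minimal sketch of an event filter for the Watch above, assuming the controller-runtime v0.15-style API already used here; the name secretDataChangedPredicate is illustrative, not existing code:

import (
	"reflect"

	corev1 "k8s.io/api/core/v1"
	"sigs.k8s.io/controller-runtime/pkg/event"
	"sigs.k8s.io/controller-runtime/pkg/predicate"
)

// secretDataChangedPredicate drops secret update events whose .Data did not
// change, so resyncs and metadata-only updates stop enqueuing every
// BackingStore that references the secret.
var secretDataChangedPredicate = predicate.Funcs{
	UpdateFunc: func(e event.UpdateEvent) bool {
		oldSecret, okOld := e.ObjectOld.(*corev1.Secret)
		newSecret, okNew := e.ObjectNew.(*corev1.Secret)
		if !okOld || !okNew {
			// Not a Secret (or a partial object): let the event through.
			return true
		}
		return !reflect.DeepEqual(oldSecret.Data, newSecret.Data)
	},
}

It would be passed as an extra predicate to the same Watch call:

	err = c.Watch(source.Kind(mgr.GetCache(), &corev1.Secret{}), secretsHandler,
		logEventsPredicate, secretDataChangedPredicate)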


Should there be a limit on how many times it reconciles every hour?
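
If an explicit limit is wanted, the controller's workqueue rate limiter can be tuned when the controller is built. Note that the rate limiter only throttles requeues (reconcile errors and RequeueAfter), not fresh event-driven adds, so filtering events with a predicate as above is usually the more effective lever. A sketch assuming the standard controller-runtime/client-go APIs; the 5s/5m values are illustrative:

import (
	"time"

	"k8s.io/client-go/util/workqueue"
	"sigs.k8s.io/controller-runtime/pkg/controller"
)

// Build the controller with a slower per-item backoff than the default
// (which starts at 5ms); repeated requeues of the same BackingStore then
// back off exponentially from 5s up to a 5m cap.
c, err := controller.New("backingstore-controller", mgr, controller.Options{
	Reconciler:  r,
	RateLimiter: workqueue.NewItemExponentialFailureRateLimiter(5*time.Second, 5*time.Minute),
})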


Version of all relevant components (if applicable): ODF 4.13


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)? It does not stop operations, but it slows them down.


Is there any workaround available to the best of your knowledge? The code is working; this is an inquiry as to whether the reconciles should be limited.


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?


Is this issue reproducible? N/A; customer environment only.


Can this issue be reproduced from the UI? N/A; customer environment only.


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1.
2.
3.


Actual results:


Expected results:


Additional info:

Comment 16 errata-xmlrpc 2024-03-19 15:31:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.15.0 security, enhancement, & bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:1383

