Bug 1974654 - Policy propagator propagates owner from root policy
Summary: Policy propagator propagates owner from root policy
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Advanced Cluster Management for Kubernetes
Classification: Red Hat
Component: GRC & Policy
Version: rhacm-2.2
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: rhacm-2.3
Assignee: Gus Parvin
QA Contact: Derek Ho
Docs Contact: Mikela Dockery
URL:
Whiteboard:
Depends On: 1974648
Blocks:
Reported: 2021-06-22 09:02 UTC by Ricardo Carrillo Cruz
Modified: 2021-11-08 20:57 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1974648
Environment:
Last Closed: 2021-11-08 20:57:00 UTC
Target Upstream Version:
Embargoed:
ming: rhacm-2.2.z+


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github open-cluster-management backlog issues | 13611 | 0 | None | None | None | 2021-06-23 18:12:22 UTC

Description Ricardo Carrillo Cruz 2021-06-22 09:02:34 UTC
Description of the problem:

We are writing a Kubernetes operator that creates ACM secondary resources.
When the operator creates a primary resource that itself causes an ACM policy creation, the policy propagator pod enters an infinite loop of reconciling the policy:

<snip>

{"level":"info","ts":1624288258.3368125,"logger":"policy-propagator","msg":"Found reconciliation request from root policy...","Namespace":"default","Name":"group1-batch-2-policy-deployment-policy"}
{"level":"info","ts":1624288258.337039,"logger":"policy-propagator","msg":"Found reconciliation request from root policy...","Namespace":"default","Name":"group1-batch-2-policy-deployment-policy"}
{"level":"info","ts":1624288258.404348,"logger":"policy-propagator","msg":"Found reconciliation request from replicated policy...","Namespace":"spoke1","Name":"default.group1-batch-1-policy-deployment-policy"}
{"level":"info","ts":1624288258.447739,"logger":"policy-propagator","msg":"Reconciliation complete.","Policy-Namespace":"default","Policy-Name":"group1-batch-1-policy-deployment-policy"}
{"level":"info","ts":1624288258.4478629,"logger":"policy-propagator","msg":"Reconciling Policy...","Request.Namespace":"default","Request.Name":"group1-batch-2-policy-deployment-policy"}
{"level":"info","ts":1624288258.4709926,"logger":"policy-propagator","msg":"Found reconciliation request from root policy...","Namespace":"default","Name":"group1-batch-1-policy-deployment-policy"}
{"level":"info","ts":1624288258.4710896,"logger":"policy-propagator","msg":"Found reconciliation request from root policy...","Namespace":"default","Name":"group1-batch-1-policy-deployment-policy"}
{"level":"info","ts":1624288258.4720683,"logger":"policy-propagator","msg":"Found reconciliation request from replicated policy...","Namespace":"spoke2","Name":"default.group1-batch-2-policy-deployment-policy"}
{"level":"info","ts":1624288258.483452,"logger":"policy-propagator","msg":"Reconciliation complete.","Policy-Namespace":"default","Policy-Name":"group1-batch-2-policy-deployment-policy"}
{"level":"info","ts":1624288258.4835362,"logger":"policy-propagator","msg":"Reconciling Policy...","Request.Namespace":"default","Request.Name":"group1-batch-1-policy-deployment-policy"}
{"level":"info","ts":1624288258.4845123,"logger":"policy-propagator","msg":"Found reconciliation request from root policy...","Namespace":"default","Name":"group1-batch-2-policy-deployment-policy"}
{"level":"info","ts":1624288258.4845293,"logger":"policy-propagator","msg":"Found reconciliation request from root policy...","Namespace":"default","Name":"group1-batch-2-policy-deployment-policy"}
{"level":"info","ts":1624288258.5060313,"logger":"policy-propagator","msg":"Found reconciliation request from root policy...","Namespace":"default","Name":"group1-batch-1-policy-deployment-policy"}
{"level":"info","ts":1624288258.506073,"logger":"policy-propagator","msg":"Found reconciliation request from root policy...","Namespace":"default","Name":"group1-batch-1-policy-deployment-policy"}
{"level":"info","ts":1624288258.5080616,"logger":"policy-propagator","msg":"Reconciliation complete.","Policy-Namespace":"default","Policy-Name":"group1-batch-1-policy-deployment-policy"}
{"level":"info","ts":1624288258.5084293,"logger":"policy-propagator","msg":"Reconciling Policy...","Request.Namespace":"default","Request.Name":"group1-batch-2-policy-deployment-policy"}

</snip>
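
To put a number on the loop, the log excerpt can be tallied per policy. The following is a minimal sketch, assuming the propagator logs are one JSON object per line with the "msg", "Request.Namespace" and "Request.Name" keys seen in the excerpt; countReconciles is a hypothetical helper name, not part of ACM.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// countReconciles reads JSON log lines and counts how many times each
// policy (keyed as "namespace/name") logged "Reconciling Policy...".
// Lines that are not valid JSON are skipped.
func countReconciles(logs string) map[string]int {
	counts := make(map[string]int)
	scanner := bufio.NewScanner(strings.NewReader(logs))
	for scanner.Scan() {
		var entry map[string]interface{}
		if err := json.Unmarshal(scanner.Bytes(), &entry); err != nil {
			continue // not a JSON log line
		}
		if entry["msg"] != "Reconciling Policy..." {
			continue
		}
		ns, _ := entry["Request.Namespace"].(string)
		name, _ := entry["Request.Name"].(string)
		counts[ns+"/"+name]++
	}
	return counts
}

func main() {
	// Three lines copied from the excerpt in this report.
	logs := `{"level":"info","ts":1624288258.447739,"logger":"policy-propagator","msg":"Reconciliation complete.","Policy-Namespace":"default","Policy-Name":"group1-batch-1-policy-deployment-policy"}
{"level":"info","ts":1624288258.4478629,"logger":"policy-propagator","msg":"Reconciling Policy...","Request.Namespace":"default","Request.Name":"group1-batch-2-policy-deployment-policy"}
{"level":"info","ts":1624288258.4835362,"logger":"policy-propagator","msg":"Reconciling Policy...","Request.Namespace":"default","Request.Name":"group1-batch-1-policy-deployment-policy"}`

	counts := countReconciles(logs)
	for _, key := range []string{
		"default/group1-batch-1-policy-deployment-policy",
		"default/group1-batch-2-policy-deployment-policy",
	} {
		fmt.Printf("%s: %d\n", key, counts[key])
	}
}
```

A count that keeps growing for the same policy while nothing in the cluster changes would match the behaviour described in this report.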

Gus Parvin looked into the issue, and it appears to be caused by the policy propagator copying the owner reference from the root policy to the replicated policies in the managed cluster namespaces. Since our operator is the owner of the root policy, the error condition appears.
Gus fixed that behaviour in the ACM code; with the fix, the operator created the policy object without problems and no infinite loop occurred.
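
The idea behind the fix can be sketched as follows. This is not the actual ACM change: Policy and OwnerReference are simplified, hypothetical stand-ins for the real Kubernetes and ACM types, replicate is a hypothetical helper, and the owner kind and name in the example are assumptions. Only the replica naming ("<root namespace>.<root name>" in the cluster namespace) is taken from the logs in this report.

```go
package main

import "fmt"

// OwnerReference is a simplified, hypothetical stand-in for the
// Kubernetes owner reference type.
type OwnerReference struct {
	Kind string
	Name string
}

// Policy is a simplified, hypothetical stand-in for the ACM Policy type.
type Policy struct {
	Namespace       string
	Name            string
	OwnerReferences []OwnerReference
	Spec            string
}

// replicate builds the copy of a root policy for a managed cluster
// namespace. The copy keeps the spec but deliberately carries no owner
// references, so the root policy's owner does not also own the replica.
func replicate(root Policy, clusterNamespace string) Policy {
	return Policy{
		Namespace:       clusterNamespace,
		Name:            root.Namespace + "." + root.Name,
		OwnerReferences: nil, // do not propagate owners from the root policy
		Spec:            root.Spec,
	}
}

func main() {
	root := Policy{
		Namespace: "default",
		Name:      "group1-batch-1-policy-deployment-policy",
		// Assumed owner for illustration: the operator's own resource.
		OwnerReferences: []OwnerReference{{Kind: "Group", Name: "group1"}},
		Spec:            "example-spec",
	}
	replica := replicate(root, "spoke1")
	fmt.Printf("%s/%s owners=%d\n",
		replica.Namespace, replica.Name, len(replica.OwnerReferences))
}
```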

Release version:

2.2

Operator snapshot version:

OCP version:

4.8

Browser Info:

Steps to reproduce:
1. Clone https://github.com/rcarrillocruz/cluster-group-lcm/commit/d7b5c8ff942085fe9525f40f996da5c668b902c1
2. Run 'make deploy' with a KUBECONFIG targeting a hub cluster with ACM
3. Run oc edit clusterrole open-cluster-management:grc-b5e97:clusterrole and add to the end:
- apiGroups:
  - ran.openshift.io
  resources:
  - groups/finalizers
  verbs:
  - update
4. Run oc apply -f group.yaml (group.yaml is located in the cloned repo)

Actual results:

Policy object is constantly recreated and policy propagator pod logs show an infinite loop reconciling the policy.

Expected results:

Policy object is created just once and no infinite loop of reconcile actions is shown in the propagator pod logs.

Comment 1 Gus Parvin 2021-06-22 15:45:47 UTC
Thanks for helping us improve grc!

Comment 2 Gus Parvin 2021-06-23 18:16:34 UTC
After discussing more with the customer we decided this is not needed in ACM 2.2, only in 2.3.  The fix has been delivered into the release and is being validated.

Comment 3 Ricardo Carrillo Cruz 2021-06-24 07:51:29 UTC
Thanks for this Gus.
Once this BZ moves to verified, I assume that means the fix is in some branch.
Is it possible for us to install a non-released ACM with OLM? I'd like to develop my operator against an ACM with the fix as soon as it's available.

Thanks

Comment 4 Mike Ng 2021-06-30 20:31:31 UTC
G2Bsync 871640744 comment 
 gparvin Wed, 30 Jun 2021 18:40:40 UTC 
 G2Bsync
The instructions for installing the pre-release ACM 2.3 are available in the repository https://github.com/open-cluster-management/deploy. Make sure you follow the instructions for obtaining access to the open-cluster-management quay organization so your pull secret can pull our images.

Comment 6 juhsu 2021-11-08 20:57:00 UTC
Fixed in 2.3 GA.

