Fedora Account System
Red Hat Associate
Red Hat Customer
Description of the problem: We are writing a Kubernetes operator that creates ACM secondary resources. When the operator creates a primary resource that itself causes an ACM policy creation, the policy propagator pod enters an infinite loop of reconciling the policy: <snip> {"level":"info","ts":1624288258.3368125,"logger":"policy-propagator","msg":"Found reconciliation request from root policy...","Namespace":"default","Name":"group1-batch-2-policy-deployment-policy"} {"level":"info","ts":1624288258.337039,"logger":"policy-propagator","msg":"Found reconciliation request from root policy...","Namespace":"default","Name":"group1-batch-2-policy-deployment-policy"} {"level":"info","ts":1624288258.404348,"logger":"policy-propagator","msg":"Found reconciliation request from replicated policy...","Namespace":"spoke1","Name":"default.group1-batch-1-policy-deployment-policy"} {"level":"info","ts":1624288258.447739,"logger":"policy-propagator","msg":"Reconciliation complete.","Policy-Namespace":"default","Policy-Name":"group1-batch-1-policy-deployment-policy"} {"level":"info","ts":1624288258.4478629,"logger":"policy-propagator","msg":"Reconciling Policy...","Request.Namespace":"default","Request.Name":"group1-batch-2-policy-deployment-policy"} {"level":"info","ts":1624288258.4709926,"logger":"policy-propagator","msg":"Found reconciliation request from root policy...","Namespace":"default","Name":"group1-batch-1-policy-deployment-policy"} {"level":"info","ts":1624288258.4710896,"logger":"policy-propagator","msg":"Found reconciliation request from root policy...","Namespace":"default","Name":"group1-batch-1-policy-deployment-policy"} {"level":"info","ts":1624288258.4720683,"logger":"policy-propagator","msg":"Found reconciliation request from replicated policy...","Namespace":"spoke2","Name":"default.group1-batch-2-policy-deployment-policy"} {"level":"info","ts":1624288258.483452,"logger":"policy-propagator","msg":"Reconciliation complete.","Policy-Namespace":"default","Policy-Name":"group1-batch-2-policy-deployment-policy"} {"level":"info","ts":1624288258.4835362,"logger":"policy-propagator","msg":"Reconciling Policy...","Request.Namespace":"default","Request.Name":"group1-batch-1-policy-deployment-policy"} {"level":"info","ts":1624288258.4845123,"logger":"policy-propagator","msg":"Found reconciliation request from root policy...","Namespace":"default","Name":"group1-batch-2-policy-deployment-policy"} {"level":"info","ts":1624288258.4845293,"logger":"policy-propagator","msg":"Found reconciliation request from root policy...","Namespace":"default","Name":"group1-batch-2-policy-deployment-policy"} {"level":"info","ts":1624288258.5060313,"logger":"policy-propagator","msg":"Found reconciliation request from root policy...","Namespace":"default","Name":"group1-batch-1-policy-deployment-policy"} {"level":"info","ts":1624288258.506073,"logger":"policy-propagator","msg":"Found reconciliation request from root policy...","Namespace":"default","Name":"group1-batch-1-policy-deployment-policy"} {"level":"info","ts":1624288258.5080616,"logger":"policy-propagator","msg":"Reconciliation complete.","Policy-Namespace":"default","Policy-Name":"group1-batch-1-policy-deployment-policy"} {"level":"info","ts":1624288258.5084293,"logger":"policy-propagator","msg":"Reconciling Policy...","Request.Namespace":"default","Request.Name":"group1-batch-2-policy-deployment-policy"} </snip> Gus Parvin looked into the issue and it appears is due to policy propagator propagating the policy owner reference from the root policy to the managed cluster namespaces. Since our operator is the owner of the root policy, the error condition appears. Gus fixed that behaviour on ACM code and after that the operator created the policy object fine and no infinite loop occurred. Release version: 2.2 Operator snapshot version: OCP version: 4.8 Browser Info: Steps to reproduce: 1. Clone https://github.com/rcarrillocruz/cluster-group-lcm/commit/d7b5c8ff942085fe9525f40f996da5c668b902c1 2. Run 'make deploy' with a KUBECONFIG targetting a hub cluster with ACM 3. Run oc edit clusterrole open-cluster-management:grc-b5e97:clusterrole and add to the end: - apiGroups: - ran.openshift.io resources: - groups/finalizers verbs: - update 4. Run oc apply -f group.yaml (group.yaml is located in the cloned repo) Actual results: Policy object is constantly recreated and policy propagator pod logs show an infinite loop reconciling the policy. Expected results: Policy object is created just once and no infinite loop of reconcile actions is shown in the propagator pod logs.
Thanks for helping us improve grc!
After discussing more with the customer we decided this is not needed in ACM 2.2, only in 2.3. The fix has been delivered into the release and is being validated.
Thanks for this Gus. Once this BZ moves to verified, I assume that means the fix is in some branch. Is it possible for us to install with OLM non-released ACM, i.e. I'd like to develop my operator against a ACM with the fix as soon as it's available. Thanks
G2Bsync 871640744 comment gparvin Wed, 30 Jun 2021 18:40:40 UTC G2Bsync The instructions for installing the pre-release ACM 2.3 is available in the repository https://github.com/open-cluster-management/deploy and make sure you follow the instructions for obtaining access to the open-cluster-management quay organization so your pull secret can pull our images.
Verified - https://github.com/open-cluster-management/backlog/issues/13611#issuecomment-893138940
Fixed in 2.3 GA.