Description of problem:

Using the ZTP flow and the repo https://github.com/openshift-kni/cnf-features-deploy we deploy an SNO cluster and the policies associated with the environment. The problem is that when the ACM policies are created in the hub cluster with the "mustonlyhave" compliance type, an issue arises with labels on the managed namespaces:

- The policy creates a namespace with specific labels (e.g. for monitoring).
- OLM tries to patch that namespace with an automatically generated label (olm.operatorgroup.uid/bb373fcd-1a63-4b7a-83cd-011226dc71ad: "").
- The policy enters the NonCompliant state.
- The policy gets applied again, removing the label.
- The loop goes on.

I'm using the hooks for PolicyGen.

Version-Release number of selected component (if applicable):
ACM 2.3.3
Hub 4.8.5
SNO 4.8.11

How reproducible:
Always

Steps to Reproduce:
1. Deploy ACM and the gitops-operator.
2. Fill the code repo as it exists in the cnf-features-deploy repo.
3. git push to the repo and let the hooks deploy the SNO and the ACM policies.
4.
Wait until it starts flapping.

Actual results:
- Policy flapping between the NonCompliant and Compliant states
- Many errors in the OLM operator logs

Expected results:
No errors

Additional info:

Logs from the OLM operator:

time="2021-09-29T10:30:12Z" level=info msg="checking ptp-operator.4.8.0-202108312109"
time="2021-09-29T10:30:12Z" level=info msg="checking performance-addon-operator.v4.8.1"
{"level":"error","ts":1632911412.328675,"logger":"controllers.operator","msg":"Could not update Operator status","request":"/ptp-operator.openshift-ptp","error":"Operation cannot be fulfilled on operators.operators.coreos.com \"ptp-operator.openshift-ptp\": the object has been modified; please apply your changes to the latest version and try again","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:293\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:248\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.1\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.UntilWithContext\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:99"}
time="2021-09-29T10:30:12Z" level=info msg="checking ptp-operator.4.8.0-202108312109"
{"level":"error","ts":1632911412.4046497,"logger":"controllers.operator","msg":"Could not update Operator status","request":"/local-storage-operator.openshift-local-storage","error":"Operation cannot be fulfilled on operators.operators.coreos.com \"local-storage-operator.openshift-local-storage\": the object has been modified; please apply your changes to the latest version and try again","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:293\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:248\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.1\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.UntilWithContext\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:99"}
{"level":"error","ts":1632911412.4187012,"logger":"controllers.operator","msg":"Could not update Operator status","request":"/local-storage-operator.openshift-local-storage","error":"Operation cannot be fulfilled on operators.operators.coreos.com \"local-storage-operator.openshift-local-storage\": the object has been modified; please apply your changes to the latest version and try again","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:293\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:248\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.1\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.UntilWithContext\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:99"}
time="2021-09-29T10:30:12Z" level=info msg="checking sriov-fec.v1.3.0"
E0929 10:30:12.680671 1 queueinformer_operator.go:290] sync {"update" "openshift-performance-addon-operator"} failed: Operation cannot be fulfilled on namespaces "openshift-performance-addon-operator": the object has been modified; please apply your changes to the latest version and try again
E0929 10:30:12.730499 1 queueinformer_operator.go:290] sync {"update" "openshift-sriov-network-operator"} failed: Operation cannot be fulfilled on namespaces "openshift-sriov-network-operator": the object has been modified; please apply your changes to the latest version and try again
time="2021-09-29T10:30:13Z" level=info msg="checking ptp-operator.4.8.0-202108312109"
time="2021-09-29T10:30:13Z" level=info msg="checking performance-addon-operator.v4.8.1"
time="2021-09-29T10:30:14Z" level=info msg="checking performance-addon-operator.v4.8.1"
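For context, the conflicting object-template has roughly the following shape (a minimal sketch, not copied from the ZTP-generated policy; the policy name and labels are illustrative):

```yaml
apiVersion: policy.open-cluster-management.io/v1
kind: ConfigurationPolicy
metadata:
  name: example-namespace-policy   # hypothetical name
spec:
  remediationAction: enforce
  object-templates:
  - complianceType: mustonlyhave   # exact match: any field added later is a violation
    objectDefinition:
      apiVersion: v1
      kind: Namespace
      metadata:
        name: openshift-ptp
        labels:
          openshift.io/cluster-monitoring: "true"
          # OLM later patches the namespace with a generated label such as
          #   olm.operatorgroup.uid/bb373fcd-1a63-4b7a-83cd-011226dc71ad: ""
          # which mustonlyhave strips again, producing the flapping above.
```

With mustonlyhave the controller keeps reverting the namespace to exactly the labels listed here, so it and OLM fight over the object indefinitely.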
Patching the ACM policy with "complianceType: musthave" is a temporary workaround you can apply, but if you modify the repo this change will be overridden by the hooks.
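The workaround amounts to switching the object-template's compliance type so that fields added by other controllers are tolerated (a sketch under the same illustrative names as above; only the complianceType line changes):

```yaml
spec:
  remediationAction: enforce
  object-templates:
  - complianceType: musthave   # only the listed fields are checked; extra labels are ignored
    objectDefinition:
      apiVersion: v1
      kind: Namespace
      metadata:
        name: openshift-ptp
        labels:
          openshift.io/cluster-monitoring: "true"
```

With musthave the policy stays Compliant as long as its own labels are present, regardless of any additional labels OLM adds.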
Hey folks, this is also happening with PVCs.

Errors on the policy:

- eventName: vz-wc-lab-policies.vz-wc-lab-image-registry-policy.16ab6a5510dab516
  lastTimestamp: "2021-10-07T05:40:23Z"
  message: 'NonCompliant; violation - Error updating the object `registry-storage`, the error is `Operation cannot be fulfilled on persistentvolumeclaims "registry-storage": the object has been modified; please apply your changes to the latest version and try again`; notification - configs [cluster] found as specified, therefore this Object template is compliant'
- eventName: vz-wc-lab-policies.vz-wc-lab-image-registry-policy.16aba20db2e6167c
  lastTimestamp: "2021-10-07T05:35:06Z"
  message: "NonCompliant; violation - Error updating the object `registry-storage`, the error is `PersistentVolumeClaim \"registry-storage\" is invalid: spec: Forbidden: spec is immutable after creation except resources.requests for bound claims\n core.PersistentVolumeClaimSpec{\n \tAccessModes: {\"ReadWriteOnce\"},\n \tSelector: nil,\n \tResources: {Requests: {s\"storage\": {i: {...}, s: \"100Gi\", Format: \"BinarySI\"}}},\n- \tVolumeName: \"\",\n+ \tVolumeName: \"local-pv-b908200e\",\n \tStorageClassName: nil,\n \tVolumeMode: &\"Filesystem\",\n \tDataSource: nil,\n }\n`; notification - configs [cluster] found as specified, therefore this Object template is compliant"
- eventName: vz-wc-lab-policies.vz-wc-lab-image-registry-policy.16ab6a5510dab516
  lastTimestamp: "2021-10-07T05:10:08Z"
  message: 'NonCompliant; violation - Error updating the object `registry-storage`, the error is `Operation cannot be fulfilled on persistentvolumeclaims "registry-storage": the object has been modified; please apply your changes to the latest version and try again`; notification - configs [cluster] found as specified, therefore this Object template is compliant'

This is the object that the policy wants to enforce:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    volume.beta.kubernetes.io/storage-class: fs-lso
  creationTimestamp: "2021-10-06T10:04:09Z"
  finalizers:
  - kubernetes.io/pvc-protection
  name: registry-storage
  namespace: openshift-image-registry
  resourceVersion: "1623757"
  uid: 36578f14-2c57-4e46-b116-8aabedf759ed
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
  volumeMode: Filesystem
  volumeName: local-pv-b908200e
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 100Gi
  phase: Bound

This is the object that the other operator wants to apply:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
    volume.beta.kubernetes.io/storage-class: fs-lso
  creationTimestamp: "2021-10-06T10:04:09Z"
  finalizers:
  - kubernetes.io/pvc-protection
  name: registry-storage
  namespace: openshift-image-registry
  resourceVersion: "1623757"
  uid: 36578f14-2c57-4e46-b116-8aabedf759ed
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
  volumeMode: Filesystem
  volumeName: local-pv-b908200e
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 100Gi
  phase: Bound
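The same workaround idea applies here: enforce the PVC with musthave and leave out the fields the PV controller sets after binding (volumeName and the pv.kubernetes.io/bind-completed annotation), so the policy never tries to rewrite an immutable, already-bound spec. A sketch, with an illustrative policy name:

```yaml
apiVersion: policy.open-cluster-management.io/v1
kind: ConfigurationPolicy
metadata:
  name: image-registry-pvc-policy   # hypothetical name
spec:
  remediationAction: enforce
  object-templates:
  - complianceType: musthave   # tolerate fields set by other controllers
    objectDefinition:
      apiVersion: v1
      kind: PersistentVolumeClaim
      metadata:
        name: registry-storage
        namespace: openshift-image-registry
        annotations:
          volume.beta.kubernetes.io/storage-class: fs-lso
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 100Gi
        # volumeName and pv.kubernetes.io/bind-completed are added by the
        # PV controller after binding; omitting them avoids the
        # "spec is immutable after creation" violation shown above.
```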
We don't have a formal test environment for ZTP on 4.10 nightly at the moment. Marking this as verified to unblock the merge to 4.9; we will verify this change in 4.9.
Reopening. Further testing showed there is still excess CPU use.
Doc Text would be helpful for documenting this in the 4.10 release notes. Please supply it.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:0056