Bug 2074626
| Summary: | Policy placement failure during ZTP SNO scale test | | |
|---|---|---|---|
| Product: | Red Hat Advanced Cluster Management for Kubernetes | Reporter: | jun |
| Component: | GRC & Policy | Assignee: | Gus Parvin <gparvin> |
| Status: | CLOSED ERRATA | QA Contact: | Derek Ho <dho> |
| Severity: | high | Docs Contact: | Mikela Dockery <mdockery> |
| Priority: | unspecified | | |
| Version: | rhacm-2.5 | CC: | akrzos, gparvin, imiller, jkulikau |
| Target Milestone: | --- | Flags: | bot-tracker-sync: rhacm-2.5+; jun: needinfo- |
| Target Release: | rhacm-2.5 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-06-09 02:10:53 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
Do you have an ACM must-gather instead of just the OpenShift must-gather?

Must gather from the hub: https://drive.google.com/file/d/1GrE2zg1QG58pJFazKlqXr0-UNeV7AmS6/view?usp=sharing

The hub must-gather is also for OCP. Waiting for this to be reproduced in a new run.

Thanks!

Reproduced in the 1700 cluster run:
[root@e24-h01-000-r640 ~]# oc --kubeconfig bm/kubeconfig get policy -n sno00205
NAME                                        REMEDIATION ACTION   COMPLIANCE STATE   AGE
ztp-install.sno00205-common-config-policy   enforce              NonCompliant       17h
status:
  compliant: NonCompliant
  details:
  - compliant: NonCompliant
    history:
    - eventName: ztp-install.sno00205-common-config-policy.16e544fbef0fbeb2
      lastTimestamp: "2022-04-12T22:00:30Z"
      message: NonCompliant; no matches for kind "ConfigurationPolicy" in version
        "policy.open-cluster-management.io/v1", please check if you have CRD deployed.
    - eventName: ztp-install.sno00205-common-config-policy.16e544fbef0fbeb2
      lastTimestamp: "2022-04-12T22:00:28Z"
      message: NonCompliant; no matches for kind "ConfigurationPolicy" in version
        "policy.open-cluster-management.io/v1", please check if you have CRD deployed.
    - eventName: ztp-install.sno00205-common-config-policy.16e544fbef0fbeb2
      lastTimestamp: "2022-04-12T22:00:25Z"
      message: NonCompliant; no matches for kind "ConfigurationPolicy" in version
        "policy.open-cluster-management.io/v1", please check if you have CRD deployed.
    - eventName: ztp-install.sno00205-common-config-policy.16e544fbef0fbeb2
      lastTimestamp: "2022-04-12T22:00:10Z"
      message: NonCompliant; no matches for kind "ConfigurationPolicy" in version
        "policy.open-cluster-management.io/v1", please check if you have CRD deployed.
    - eventName: ztp-install.sno00205-common-config-policy.16e544fbef0fbeb2
      lastTimestamp: "2022-04-12T21:59:54Z"
      message: NonCompliant; no matches for kind "ConfigurationPolicy" in version
        "policy.open-cluster-management.io/v1", please check if you have CRD deployed.
    - eventName: ztp-install.sno00205-common-config-policy.16e544fbef0fbeb2
      lastTimestamp: "2022-04-12T21:59:39Z"
      message: NonCompliant; no matches for kind "ConfigurationPolicy" in version
        "policy.open-cluster-management.io/v1", please check if you have CRD deployed.
    - eventName: ztp-install.sno00205-common-config-policy.16e544fbef0fbeb2
      lastTimestamp: "2022-04-12T21:59:24Z"
      message: NonCompliant; no matches for kind "ConfigurationPolicy" in version
        "policy.open-cluster-management.io/v1", please check if you have CRD deployed.
    - eventName: ztp-install.sno00205-common-config-policy.16e544fbef0fbeb2
      lastTimestamp: "2022-04-12T21:59:08Z"
      message: NonCompliant; no matches for kind "ConfigurationPolicy" in version
        "policy.open-cluster-management.io/v1", please check if you have CRD deployed.
    - eventName: ztp-install.sno00205-common-config-policy.16e544fbef0fbeb2
      lastTimestamp: "2022-04-12T21:58:53Z"
      message: NonCompliant; no matches for kind "ConfigurationPolicy" in version
        "policy.open-cluster-management.io/v1", please check if you have CRD deployed.
    - eventName: ztp-install.sno00205-common-config-policy.16e544fbef0fbeb2
      lastTimestamp: "2022-04-12T21:58:38Z"
      message: NonCompliant; no matches for kind "ConfigurationPolicy" in version
        "policy.open-cluster-management.io/v1", please check if you have CRD deployed.
    templateMeta:
      creationTimestamp: null
      name: sno00205-common-config-policy-config
[root@e24-h01-000-r640 ~]# oc --kubeconfig hv-sno/manifests/sno00205/kubeconfig get crd|grep configurationpolicies
configurationpolicies.policy.open-cluster-management.io 2022-04-12T22:01:16Z
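The timestamps quoted above suggest an ordering worth spelling out: the most recent NonCompliant event in the policy history is stamped 22:00:30Z, while the ConfigurationPolicy CRD on sno00205 reports a creation time of 22:01:16Z. A minimal sketch of that comparison follows, using only those two values from this report; the helper and variable names are illustrative, not part of any ACM tooling. The result is consistent with the template having been processed before the CRD existed on the managed cluster, though this report does not confirm that as the root cause.

```python
from datetime import datetime, timezone


def parse_k8s_time(ts: str) -> datetime:
    """Parse a UTC timestamp in the form oc prints it, e.g. 2022-04-12T22:00:30Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


# Values copied from the output above (newest history entry and CRD creation time).
last_error = parse_k8s_time("2022-04-12T22:00:30Z")
crd_created = parse_k8s_time("2022-04-12T22:01:16Z")

# Positive value: the CRD appeared after the last recorded error.
gap_seconds = (crd_created - last_error).total_seconds()
print(f"CRD created {gap_seconds:.0f}s after the last NonCompliant event")
```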
ACM must gather from the hub and must gather from sno00205:
https://drive.google.com/drive/folders/1GsrH1uUpiBJ0nVFkcDCDoq8p_p-jQNZe?usp=sharing
Note that something happened before the must gathers were taken and the policy got recreated:
[root@e24-h01-000-r640 ~]# oc --kubeconfig bm/kubeconfig get policy -n sno00205
NAME                                        REMEDIATION ACTION   COMPLIANCE STATE   AGE
ztp-install.sno00205-common-config-policy   enforce              Compliant          5m6s
G2Bsync 1099211258 comment gparvin Thu, 14 Apr 2022 13:53:41 UTC

We are investigating a fix to the policy framework, specifically an issue @JustinKuli has identified in the template sync controller that could cause the described behavior.

Thanks!

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat Advanced Cluster Management 2.5 security updates, images, and bug fixes), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:4956
Created attachment 1871999: must gather sno00367 part1

Description of the problem:
During the 1000-SNO ZTP deployment test, about 1% of the clusters get stuck on the policy that contains a configuration policy template, due to this error:

- eventName: ztp-install.sno00367-common-config-policy.16e4f8fd8984682b
  lastTimestamp: "2022-04-11T22:45:44Z"
  message: NonCompliant; no matches for kind "ConfigurationPolicy" in version "policy.open-cluster-management.io/v1", please check if you have CRD deployed.

Release version:
Operator snapshot version:
OCP version:
Browser Info:

Steps to reproduce:
1. SNO deployment with DU profile at scale (50 clusters or 100 clusters per hour)

Actual results:
The config policy fails to be placed and enforced on the SNO.

Expected results:
The config policy should be placed and enforced, and become compliant.

Additional info:
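With roughly 1% of clusters affected at this scale, it can help to pick out the stuck policies programmatically rather than by reading each status. The following is a rough sketch, assuming a policy object has already been loaded into a Python dict (for example from JSON output) and that the history list is ordered newest first, as in the sno00205 status quoted in this report. The function name `stuck_templates` and the trimmed sample are hypothetical and only for illustration.

```python
from typing import List

# Substring of the error message quoted in this report.
CRD_MISSING = 'no matches for kind "ConfigurationPolicy"'


def stuck_templates(policy: dict) -> List[str]:
    """Return template names whose newest history entry reports the missing-CRD error.

    Assumes history is ordered newest first, as in the status shown in this report.
    """
    stuck = []
    for detail in policy.get("status", {}).get("details", []):
        history = detail.get("history", [])
        if history and CRD_MISSING in history[0].get("message", ""):
            stuck.append(detail.get("templateMeta", {}).get("name", "<unknown>"))
    return stuck


# Sample trimmed from the sno00205 status in this report (one history entry kept).
sample = {
    "status": {
        "compliant": "NonCompliant",
        "details": [{
            "compliant": "NonCompliant",
            "history": [{
                "eventName": "ztp-install.sno00205-common-config-policy.16e544fbef0fbeb2",
                "lastTimestamp": "2022-04-12T22:00:30Z",
                "message": 'NonCompliant; no matches for kind "ConfigurationPolicy" in version '
                           '"policy.open-cluster-management.io/v1", please check if you have CRD deployed.',
            }],
            "templateMeta": {"name": "sno00205-common-config-policy-config"},
        }],
    }
}

print(stuck_templates(sample))
```

Checking only the newest entry is a deliberate choice here: a policy that hit the error once and later recovered would still carry the message in older history entries.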