Created attachment 1771602
InstallEnv and NMStateConfig CRDS

Description of problem:
When an InstallEnv references an invalid NMStateConfig, reconciliation of the InstallEnv is attempted continuously, even though the NMStateConfig has not changed. In this case, InstallEnv reconciliation and ISO generation should be re-attempted only once the NMStateConfig is changed.

Version-Release number of selected component (if applicable):
assisted-service image: quay.io/ocpmetal/assisted-service@sha256:c65af18f741660660a04e4a3b155c10a6668527bb790de06a9708f6bec17479b

Steps to Reproduce:
1. Create ClusterDeployment
2. Create invalid NMStateConfig
3. Create InstallEnv referencing invalid NMStateConfig

Actual results:
InstallEnv reconcile is continually attempted with no change made to NMStateConfig

Expected results:
InstallEnv reconcile should only be attempted if the NMStateConfig is changed
This bug is a duplicate of https://issues.redhat.com/browse/MGMT-4695

In short, for an invalid nmstate config we get the wrong status code, which makes it hard to determine whether or not we should requeue. For an invalid config, we would expect HTTP StatusBadRequest (code 400), but we get HTTP StatusInternalServerError (code 500) here.

I added some debug prints and reproduced this issue here (added prints are marked with 'ZZZ'):
https://gist.github.com/nmagnezi/cd4e21691e8c64647bd00d32b0a60b30

Note that we initially get a 500, followed by many 409s for requests that arrived in under 10 seconds. For the latter (code 409), I will try to extend the requeue time to longer than 10 seconds, though that will fix only part of the issue.

Yevgeny, any plans for https://issues.redhat.com/browse/MGMT-MGMT-4696 ?
Yevgeny, see the question in comment #1
Fix merged to master.

QE Verification:
================
You can verify the fix using the YAMLs referenced in:
https://github.com/openshift/assisted-service/pull/1696#issuecomment-848670736
Verified: The infraenv is reconciled twice with the invalid nmstate config, and then reconciled again only when an nmstateconfig matching the label is changed.

quay.io/ocpmetal/assisted-service@sha256:434617dd691c2f5f1a410ffd9866908fc0e9c72e0c3b26ced3d0d8578180fc3a
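As a rough illustration of the "matching the label" condition in the verification above, the check that decides whether a changed nmstateconfig should trigger another infraenv reconcile can be sketched as a plain label comparison. This is an assumption-laden sketch: `selectorMatches` and the label keys are hypothetical, and the actual fix may use Kubernetes label selector helpers instead.

```go
package main

import "fmt"

// selectorMatches reports whether every key/value pair in selector is
// also present in labels. In this sketch an empty selector matches
// everything. Hypothetical helper, not taken from assisted-service.
func selectorMatches(selector, labels map[string]string) bool {
	for k, v := range selector {
		if got, ok := labels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

func main() {
	// Hypothetical label key and value, for illustration only.
	selector := map[string]string{"example-label": "demo"}

	matching := map[string]string{"example-label": "demo", "extra": "x"}
	other := map[string]string{"example-label": "something-else"}

	// Only a change to the matching object would enqueue a reconcile.
	fmt.Println("matching object triggers reconcile:", selectorMatches(selector, matching))
	fmt.Println("other object triggers reconcile:", selectorMatches(selector, other))
}
```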
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438