Description of problem:
If a cluster is patched to use a dummy Cincinnati that serves invalid risk names, the CVO sets the Available condition to False, which is confusing. It would be better to report the invalid risk name error but keep the cluster available.

# oc get clusterversion/version -ojson | jq -r .status.conditions
[
  {
    "lastTransitionTime": "2021-12-08T05:53:39Z",
    "status": "False",
    "type": "Available"
  },
  {
    "lastTransitionTime": "2021-12-08T05:53:39Z",
    "message": "ClusterVersion.config.openshift.io \"version\" is invalid: status.conditionalUpdates.conditions.reason: Invalid value: \"Multiple releases\": status.conditionalUpdates.conditions.reason in body should match '^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$'",
    "status": "True",
    "type": "Failing"
  },
  {
    "lastTransitionTime": "2021-12-07T08:51:24Z",
    "message": "Error ensuring the cluster version is up to date: ClusterVersion.config.openshift.io \"version\" is invalid: status.conditionalUpdates.conditions.reason: Invalid value: \"Multiple releases\": status.conditionalUpdates.conditions.reason in body should match '^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$'",
    "status": "False",
    "type": "Progressing"
  },
  {
    "lastTransitionTime": "2021-12-08T05:50:33Z",
    "status": "True",
    "type": "RetrievedUpdates"
  }
]

Version-Release number of the following components:
4.10.0-0.nightly-2021-12-03-213835

How reproducible:
Always

Steps to Reproduce:
1. Install a cluster
2. Patch the cluster to use the dummy Cincinnati
# oc patch clusterversion/version --patch '{"spec":{"upstream":"https://raw.githubusercontent.com/shellyyang1989/upgrade-cincy/master/cincy-conditional-edge-invalid-multi-payloads.json"}}' --type=merge
clusterversion.config.openshift.io/version patched

Actual results:
The Available condition is set to False

Expected results:
It would be better to report the invalid risk name error but keep the cluster available.

Additional info:
The CVO log is available online at https://drive.google.com/file/d/1zcPyDqTePN6Hdey4je2Y6pqCXkKkG2U6/view?usp=sharing
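For reference, the rejection of "Multiple releases" can be reproduced outside the cluster. The following is a minimal sketch that assumes only the pattern quoted in the Failing condition message; the helper name is_valid_reason and the two alternative spellings are hypothetical and chosen for illustration, and this is not the validation code the Kube-API-server actually runs.

```python
import re

# Pattern copied verbatim from the Failing condition message.
REASON_PATTERN = re.compile(r"^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$")


def is_valid_reason(reason: str) -> bool:
    """Return True if the whole string satisfies the quoted pattern.

    Hypothetical helper for illustration only.
    """
    return REASON_PATTERN.fullmatch(reason) is not None


if __name__ == "__main__":
    # The first value is the one from the report; the others are
    # made-up alternatives without the space.
    for candidate in ("Multiple releases", "MultipleReleases", "Multiple_releases"):
        print(f"{candidate!r}: {is_valid_reason(candidate)}")
```

The space is the problem: the character class in the pattern allows letters, digits, underscores, commas, and colons, but no whitespace.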
This turned out to be trickier than just the invalid risk name, since we have other properties that are currently validated only by the Kube-API-server. We've opened [1] to discuss and pick a plan that covers all of them (or decide we're OK leaving them uncovered, because the risk of graph-data admins creating this invalid data seems low). I'm closing this WONTFIX for now, but depending on how [1] works out, we may re-open it later.

[1]: https://issues.redhat.com/browse/OTA-537
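In the meantime, graph-data admins could catch this particular class of bad data before publishing it. The following is a rough sketch, not an existing tool: it assumes the graph JSON carries risks under conditionalEdges[].risks[].name, and the function name invalid_risk_names and the sample document are hypothetical. It covers only the reason pattern from this report, not the other properties that are validated only by the Kube-API-server.

```python
import json
import re
from typing import List

# Pattern copied from the Failing condition message in this report.
REASON_PATTERN = re.compile(r"^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$")


def invalid_risk_names(graph: dict) -> List[str]:
    """Collect risk names that would fail the reason pattern.

    Assumes (hypothetically) that risks live under
    conditionalEdges[].risks[].name in the graph document.
    """
    bad = []
    for edge in graph.get("conditionalEdges", []):
        for risk in edge.get("risks", []):
            name = risk.get("name", "")
            if REASON_PATTERN.fullmatch(name) is None:
                bad.append(name)
    return bad


# Made-up sample data for illustration; not the file from the reproducer.
SAMPLE = json.loads("""
{
  "conditionalEdges": [
    {"risks": [{"name": "Multiple releases"}, {"name": "SomeKnownRisk"}]}
  ]
}
""")

if __name__ == "__main__":
    print(invalid_risk_names(SAMPLE))
```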