Bug 2030186 - cv.status.conditions.Available goes to False if conditionalEdges have invalid risk names
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cluster Version Operator
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.10.0
Assignee: W. Trevor King
QA Contact: Yang Yang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-12-08 07:03 UTC by Yang Yang
Modified: 2021-12-09 06:06 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-12-09 06:06:41 UTC
Target Upstream Version:
Embargoed:



Description Yang Yang 2021-12-08 07:03:19 UTC
Description of problem:
If a cluster is patched to use a dummy Cincinnati upstream whose conditional edges have invalid risk names, the CVO sets the Available condition to False, which is confusing. It would be better to report the invalid risk name error but keep the cluster Available.

# oc get clusterversion/version -ojson | jq -r .status.conditions
[
  {
    "lastTransitionTime": "2021-12-08T05:53:39Z",
    "status": "False",
    "type": "Available"
  },
  {
    "lastTransitionTime": "2021-12-08T05:53:39Z",
    "message": "ClusterVersion.config.openshift.io \"version\" is invalid: status.conditionalUpdates.conditions.reason: Invalid value: \"Multiple releases\": status.conditionalUpdates.conditions.reason in body should match '^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$'",
    "status": "True",
    "type": "Failing"
  },
  {
    "lastTransitionTime": "2021-12-07T08:51:24Z",
    "message": "Error ensuring the cluster version is up to date: ClusterVersion.config.openshift.io \"version\" is invalid: status.conditionalUpdates.conditions.reason: Invalid value: \"Multiple releases\": status.conditionalUpdates.conditions.reason in body should match '^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$'",
    "status": "False",
    "type": "Progressing"
  },
  {
    "lastTransitionTime": "2021-12-08T05:50:33Z",
    "status": "True",
    "type": "RetrievedUpdates"
  }
]
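
The rejection in the Failing message can be reproduced offline. The following is a minimal sketch, not the CVO's or the API server's actual validation code: it copies the pattern verbatim from the message above and checks candidate strings against it. The name "MultipleReleases" is a hypothetical valid alternative, not taken from the graph data.

```python
import re

# Pattern copied verbatim from the Failing condition message.
REASON_PATTERN = re.compile(r"^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$")


def is_valid_reason(reason: str) -> bool:
    """Return True if the string matches the condition-reason pattern."""
    return REASON_PATTERN.fullmatch(reason) is not None


# "Multiple releases" is the value rejected in this bug;
# "MultipleReleases" is a hypothetical name used only for contrast.
for candidate in ("Multiple releases", "MultipleReleases"):
    print(f"{candidate!r}: {is_valid_reason(candidate)}")
```

The space in "Multiple releases" is outside every character class in the pattern, which is why the ClusterVersion status update is rejected.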

Version-Release number of the following components:
4.10.0-0.nightly-2021-12-03-213835

How reproducible:
Always

Steps to Reproduce:
1. Install a cluster
2. Patch the cluster to use the dummy Cincinnati upstream
# oc patch clusterversion/version --patch '{"spec":{"upstream":"https://raw.githubusercontent.com/shellyyang1989/upgrade-cincy/master/cincy-conditional-edge-invalid-multi-payloads.json"}}' --type=merge
clusterversion.config.openshift.io/version patched
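
Graph data could in principle be checked for such names before a cluster is pointed at it. The sketch below assumes the graph JSON carries risks under conditionalEdges[].risks[].name and that the risk name is what ends up in status.conditionalUpdates.conditions.reason, as the error message suggests. The helper name, the sample graph, and its versions and risk names (other than "Multiple releases") are hypothetical.

```python
import re

# Pattern copied from the API validation error reported in this bug.
REASON_PATTERN = re.compile(r"^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$")


def invalid_risk_names(graph: dict) -> list:
    """Collect risk names that would not pass the condition-reason pattern.

    Assumes the layout conditionalEdges[].risks[].name (an assumption,
    not confirmed by the bug report itself).
    """
    bad = []
    for conditional_edge in graph.get("conditionalEdges", []):
        for risk in conditional_edge.get("risks", []):
            name = risk.get("name", "")
            if REASON_PATTERN.fullmatch(name) is None:
                bad.append(name)
    return bad


# Hypothetical graph fragment for illustration only.
sample_graph = {
    "conditionalEdges": [
        {
            "edges": [{"from": "4.10.0", "to": "4.10.1"}],
            "risks": [
                {"name": "Multiple releases"},
                {"name": "ValidRisk"},
            ],
        }
    ]
}

print(invalid_risk_names(sample_graph))
```

A check like this only covers the risk name; as noted later in the discussion, other properties are also validated only by the Kube-API-server.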

Actual results:
The Available condition in status.conditions is set to False.

Expected results:
The CVO reports the invalid risk name error but keeps the Available condition True.


Comment 1 Yang Yang 2021-12-08 07:53:05 UTC
CVO log is available online https://drive.google.com/file/d/1zcPyDqTePN6Hdey4je2Y6pqCXkKkG2U6/view?usp=sharing.

Comment 2 W. Trevor King 2021-12-09 06:06:41 UTC
This turned out to be trickier than just the invalid risk name, since we have other properties that are only validated in the Kube-API-server today. We've opened [1] to discuss and settle on a plan that covers all of them (or decide we're ok leaving them uncovered, because the risk of graph-data admins creating this invalid data seems low). I'm closing this WONTFIX for now, but depending on how [1] works out, we may end up re-opening later.

[1]: https://issues.redhat.com/browse/OTA-537

