Bug 1711964
| Summary: | "Error while reconciling" and "the update could not be applied" many hours after upgrade reported complete/successful | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Mike Fiedler <mifiedle> |
| Component: | Cluster Version Operator | Assignee: | Abhinav Dahiya <adahiya> |
| Status: | CLOSED NOTABUG | QA Contact: | liujia <jiajliu> |
| Severity: | high | Docs Contact: | |
| Priority: | low | | |
| Version: | 4.1.0 | CC: | aos-bugs, bleanhar, erich, jokerman, jupierce, mmccomas, wking |
| Target Milestone: | --- | | |
| Target Release: | 4.3.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2019-09-30 17:17:59 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |

Description
Mike Fiedler
2019-05-20 14:07:08 UTC
Unfortunately, the must-gather logs aren't going to contain any actionable info in this situation. This is what we need to fix (though in 4.1.z). At the same time, we'll likely improve the wording in the CVO to make it more clear that the problem is that another operator is flapping.

> At the same time, we'll likely improve the wording in the CVO to make it more clear that the problem is that another operator is flapping.

Some initial groundwork for this in https://github.com/openshift/cluster-version-operator/pull/194

Created attachment 1581485
4.1.2 listings

Based on https://bugzilla.redhat.com/attachment.cgi?id=1581485, the CVO is correctly reporting that it's failing to make progress on reconcile due to cloud-creds-operator. The summary for `oc get clusterversion version` cannot be all-encompassing. It provides enough detail to go look for more in the actual object. I would like to see concrete examples of status updates in the object in contrast to the message users expected.

It seems like https://bugzilla.redhat.com/show_bug.cgi?id=1714484 was the root cause of my cloud credential operator failing. So the reconciling error message was valid.

An area to consider is the message: "Cluster version is quay.io/openshift-release-dev/ocp-release:4.1.2", which reads as success to a typical user and seems to contradict: "Error while reconciling 4.1.2: the update could not be applied". Consistently displaying the error message or concatenating the messages would have left no room for misunderstanding. Since there is no ERROR column in the clusterversion output, this message presently serves as an important piece of UX for a human operator to sanity-check the CVO's state.

[ec2-user us-east-1 ~]$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version             True        False         2d20h   Cluster version is quay.io/openshift-release-dev/ocp-release:4.1.2

This is working as intended.

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days
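
For illustration only, the suggestion in this thread to concatenate the messages could be sketched roughly as below. This is not the CVO's implementation. It assumes the conditions have been read from the ClusterVersion object (for example from `oc get clusterversion version -o json`), that a failure shows up as a condition of type `Failing`, and the sample data and the `summarize` helper are hypothetical, reusing the two message strings quoted above.

```python
import json

# Hypothetical sample shaped like a ClusterVersion `status.conditions` list.
# The values reuse the strings quoted in this bug; they are NOT taken from
# the attachment, and the condition type names are an assumption.
SAMPLE_CONDITIONS = json.loads("""
[
  {"type": "Available", "status": "True", "message": "Done applying 4.1.2"},
  {"type": "Progressing", "status": "False",
   "message": "Cluster version is quay.io/openshift-release-dev/ocp-release:4.1.2"},
  {"type": "Failing", "status": "True",
   "message": "Error while reconciling 4.1.2: the update could not be applied"}
]
""")


def summarize(conditions):
    """Build one status line that never hides an active failure.

    The Progressing message is shown first; if a Failing condition is
    currently True, its message is appended instead of being dropped.
    """
    by_type = {c["type"]: c for c in conditions}
    parts = []

    progressing = by_type.get("Progressing")
    if progressing and progressing.get("message"):
        parts.append(progressing["message"])

    failing = by_type.get("Failing")
    if failing and failing.get("status") == "True" and failing.get("message"):
        parts.append("FAILING: " + failing["message"])

    return "; ".join(parts) if parts else "no status reported"


if __name__ == "__main__":
    print(summarize(SAMPLE_CONDITIONS))
```

With a combined line like this, a reader of the STATUS column would see both the version message and the reconcile error together, rather than whichever one happened to be displayed at the time.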