Bug 1823701
| Summary: | [CNV-2.4] when a single component is failing, HCO can continue reporting outdated negative conditions also on other components | ||
|---|---|---|---|
| Product: | Container Native Virtualization (CNV) | Reporter: | Lukas Bednar <lbednar> |
| Component: | Installation | Assignee: | Nahshon Unna-Tsameret <nunnatsa> |
| Status: | CLOSED ERRATA | QA Contact: | Lukas Bednar <lbednar> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | ||
| Version: | 2.4.0 | CC: | cnv-qe-bugs, fdeutsch, lbednar, ncredi, ocohen, stirabos |
| Target Milestone: | --- | Keywords: | Regression, TestBlocker |
| Target Release: | 2.4.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | hyperconverged-cluster-operator-container-v2.4.0-40 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2020-07-28 19:09:47 UTC | Type: | Bug |
This is a false positive on the HCO side. The only real issue was https://bugzilla.redhat.com/1823699, in the network component. However, because of how condition handling is currently implemented in HCO, HCO keeps reporting past negative conditions from other components (even after they have been fixed) until all the components are fully positive. See: https://github.com/kubevirt/hyperconverged-cluster-operator/blob/master/pkg/controller/hyperconverged/hyperconverged_controller.go#L334

HCO resets all of its conditions in a single shot if and only if no component reports any negative condition.

I have not experienced this issue in the past 30 builds (the last build was v2.3.0-486). I believe we can consider this issue solved.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:3194
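The stale-condition behavior described in the comments above can be illustrated with a minimal Go sketch. This is not the HCO code: `Component`, `stickyReconcile`, `freshReconcile`, and `names` are hypothetical names invented for this example, and each component's health is reduced to a single boolean. The sketch contrasts an aggregator that clears its state only when every component is healthy with one that rebuilds its state on every pass; the second is shown only as one possible way to avoid the problem, not as the fix that was actually shipped.

```go
package main

import "sort"

// Component is a hypothetical, simplified view of one operand's health.
type Component struct {
	Name    string
	Healthy bool // false if the component reports any negative condition
}

// stickyReconcile mimics the behavior described in this bug: negative
// findings are added to the state carried over from earlier passes, and
// that state is cleared only when no component reports anything negative.
func stickyReconcile(blamed map[string]bool, comps []Component) map[string]bool {
	anyNegative := false
	for _, c := range comps {
		if !c.Healthy {
			blamed[c.Name] = true
			anyNegative = true
		}
	}
	if !anyNegative {
		return map[string]bool{} // single-shot reset
	}
	return blamed
}

// freshReconcile rebuilds the state from scratch on every pass, so a
// component that has recovered is no longer blamed.
func freshReconcile(comps []Component) map[string]bool {
	blamed := map[string]bool{}
	for _, c := range comps {
		if !c.Healthy {
			blamed[c.Name] = true
		}
	}
	return blamed
}

// names returns the blamed component names in sorted order.
func names(blamed map[string]bool) []string {
	out := make([]string, 0, len(blamed))
	for n := range blamed {
		out = append(out, n)
	}
	sort.Strings(out)
	return out
}
```

If a first pass sees both `network` and `ssp` unhealthy and a second pass sees only `network` unhealthy, `stickyReconcile` still blames both components after the second pass, while `freshReconcile` blames only `network`.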
Description of problem:

[cnv-qe-jenkins@cnv-executor-lbednar ~]$ oc get KubevirtCommonTemplatesBundle -n openshift common-templates-kubevirt-hyperconverged -o yaml
apiVersion: ssp.kubevirt.io/v1
kind: KubevirtCommonTemplatesBundle
metadata:
  creationTimestamp: 2020-04-14T09:28:02Z
  generation: 1
  labels:
    app: kubevirt-hyperconverged
  name: common-templates-kubevirt-hyperconverged
  namespace: openshift
  resourceVersion: "50163"
  selfLink: /apis/ssp.kubevirt.io/v1/namespaces/openshift/kubevirtcommontemplatesbundles/common-templates-kubevirt-hyperconverged
  uid: 340168c4-6425-405f-b982-52d1bdcfb181
spec: {}
status:
  conditions:
  - lastTransitionTime: 2020-04-14T09:32:49Z
    message: Templates progressing.
    reason: progressing
    status: "False"
    type: Progressing
  - lastTransitionTime: 2020-04-14T09:32:49Z
    message: Common templates available.
    reason: available
    status: "True"
    type: Available
  - ansibleResult:
      changed: 0
      completion: 2020-04-14T10:00:38.936157
      failures: 0
      ok: 11
      skipped: 0
    lastTransitionTime: 2020-04-14T09:28:02Z
    message: Awaiting next reconciliation
    reason: Successful
    status: "True"
    type: Running

Version-Release number of selected component (if applicable):
CNV BUNDLE: 726.0.0
OCP-4.5

How reproducible:
100%

Steps to Reproduce:
1. Deploy CNV.
2. HCO does not become ready; it keeps waiting for KubevirtCommonTemplatesBundle.

Actual results:
CNV is not deployed within the 15-minute timeout.

Expected results:
CNV gets deployed.

Additional info:
I consulted Simone T., who says that common-templates-kubevirt-hyperconverged is no longer progressing and that this is just a dirty (stale) condition on HCO.