Bug 1823701

Summary: [CNV-2.4] when a single component is failing, HCO can continue reporting outdated negative conditions also on other components
Product: Container Native Virtualization (CNV)
Reporter: Lukas Bednar <lbednar>
Component: Installation
Assignee: Nahshon Unna-Tsameret <nunnatsa>
Status: CLOSED ERRATA
QA Contact: Lukas Bednar <lbednar>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 2.4.0
CC: cnv-qe-bugs, fdeutsch, lbednar, ncredi, ocohen, stirabos
Target Milestone: ---
Keywords: Regression, TestBlocker
Target Release: 2.4.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: hyperconverged-cluster-operator-container-v2.4.0-40
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-07-28 19:09:47 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Lukas Bednar 2020-04-14 10:08:01 UTC
Description of problem:

[cnv-qe-jenkins@cnv-executor-lbednar ~]$ oc get KubevirtCommonTemplatesBundle -n openshift   common-templates-kubevirt-hyperconverged -o yaml
apiVersion: ssp.kubevirt.io/v1
kind: KubevirtCommonTemplatesBundle
metadata:
  creationTimestamp: 2020-04-14T09:28:02Z
  generation: 1
  labels:
    app: kubevirt-hyperconverged
  name: common-templates-kubevirt-hyperconverged
  namespace: openshift
  resourceVersion: "50163"
  selfLink: /apis/ssp.kubevirt.io/v1/namespaces/openshift/kubevirtcommontemplatesbundles/common-templates-kubevirt-hyperconverged
  uid: 340168c4-6425-405f-b982-52d1bdcfb181
spec: {}
status:
  conditions:
  - lastTransitionTime: 2020-04-14T09:32:49Z
    message: Templates progressing.
    reason: progressing
    status: "False"
    type: Progressing
  - lastTransitionTime: 2020-04-14T09:32:49Z
    message: Common templates available.
    reason: available
    status: "True"
    type: Available
  - ansibleResult:
      changed: 0
      completion: 2020-04-14T10:00:38.936157
      failures: 0
      ok: 11
      skipped: 0
    lastTransitionTime: 2020-04-14T09:28:02Z
    message: Awaiting next reconciliation
    reason: Successful
    status: "True"
    type: Running



Version-Release number of selected component (if applicable):
CNV BUNDLE: 726.0.0
OCP-4.5


How reproducible: 100%


Steps to Reproduce:
1. Deploy CNV.
2. Observe that HCO never becomes ready; it keeps waiting for KubevirtCommonTemplatesBundle.

Actual results: CNV is not deployed within the 15-minute timeout.


Expected results: CNV is deployed successfully.


Additional info:

I consulted Simone T., who says that common-templates-kubevirt-hyperconverged is no longer progressing and that what we see is just a stale ("dirty") condition on HCO.

Comment 1 Simone Tiraboschi 2020-04-14 13:03:19 UTC
This is essentially a false positive on the HCO side.
The only real issue was https://bugzilla.redhat.com/1823699, in the network component.

However, because of how condition handling is currently implemented in HCO,
HCO keeps reporting past negative conditions from other components (even after they are fixed) until all components are fully positive.
See:
https://github.com/kubevirt/hyperconverged-cluster-operator/blob/master/pkg/controller/hyperconverged/hyperconverged_controller.go#L334
HCO resets all of its conditions in a single shot if and only if no component reports any negative condition.
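
The behaviour described in this comment can be illustrated with a small, simplified model. This is only a sketch and not the actual HCO controller code: the Condition type, the component names, the "Degraded" condition type, and the functions stickyAggregate and freshAggregate are all hypothetical names introduced for illustration. stickyAggregate mimics the "reset only when nothing is negative" rule, while freshAggregate shows one possible alternative that recomputes the set of failing components on every reconciliation pass.

```go
package main

import "fmt"

// Condition is a simplified, hypothetical stand-in for a status condition.
type Condition struct {
	Type   string // e.g. "Available", "Progressing", "Degraded" (assumed set)
	Status bool
}

// negative reports whether a component condition signals a problem.
func negative(c Condition) bool {
	switch c.Type {
	case "Available":
		return !c.Status
	case "Progressing", "Degraded":
		return c.Status
	}
	return false
}

// stickyAggregate models the reported behaviour: components flagged in an
// earlier pass stay flagged until a pass in which no component is negative.
func stickyAggregate(prev map[string]bool, components map[string][]Condition) map[string]bool {
	out := map[string]bool{}
	for name, flagged := range prev {
		out[name] = flagged
	}
	anyNegative := false
	for name, conds := range components {
		for _, c := range conds {
			if negative(c) {
				out[name] = true
				anyNegative = true
			}
		}
	}
	if !anyNegative {
		return map[string]bool{} // single-shot reset
	}
	return out
}

// freshAggregate recomputes the flagged set from the current pass only.
func freshAggregate(components map[string][]Condition) map[string]bool {
	out := map[string]bool{}
	for name, conds := range components {
		for _, c := range conds {
			if negative(c) {
				out[name] = true
			}
		}
	}
	return out
}

func main() {
	// Pass 1: templates are still progressing, network is unavailable.
	pass1 := map[string][]Condition{
		"templates": {{Type: "Progressing", Status: true}},
		"network":   {{Type: "Available", Status: false}},
	}
	// Pass 2: templates have recovered, network is still unavailable.
	pass2 := map[string][]Condition{
		"templates": {{Type: "Progressing", Status: false}, {Type: "Available", Status: true}},
		"network":   {{Type: "Available", Status: false}},
	}

	sticky := stickyAggregate(stickyAggregate(map[string]bool{}, pass1), pass2)
	fmt.Println("sticky:", sticky)
	fmt.Println("fresh: ", freshAggregate(pass2))
}
```

In this model, after the second pass the sticky aggregation still flags both "templates" and "network", even though only "network" is failing, whereas the fresh aggregation flags "network" alone. That matches the symptom in this bug: the KubevirtCommonTemplatesBundle conditions were already positive, yet HCO kept reporting the earlier negative state.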

Comment 3 Lukas Bednar 2020-07-22 12:00:15 UTC
I have not experienced this issue in the past 30 builds (the last build was v2.3.0-486).
I believe we can consider this issue solved.

Comment 6 errata-xmlrpc 2020-07-28 19:09:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:3194