Bug 1778723

Summary: HCO fails to reconcile resources in a different namespace
Product: Container Native Virtualization (CNV) Reporter: Simone Tiraboschi <stirabos>
Component: Installation Assignee: Simone Tiraboschi <stirabos>
Status: CLOSED ERRATA QA Contact: zhe peng <zpeng>
Severity: high Docs Contact:
Priority: high    
Version: 2.2.0 CC: cnv-qe-bugs, danken, fdeutsch, ncredi, rhallise, sgordon, talayan
Target Milestone: ---   
Target Release: 2.2.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: hco-bundle-registry-container-v2.2.0-225 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-01-30 16:27:33 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Simone Tiraboschi 2019-12-02 11:45:06 UTC
Description of problem:
In HCO logs we see something like:

{"level":"info","ts":1574269878.4764912,"logger":"controller_hyperconverged","msg":"Reconciling HyperConverged operator","Request.Namespace":"openshift","Request.Name":"hyperconverged-cluster"}
{"level":"info","ts":1574269878.4765778,"logger":"controller_hyperconverged","msg":"No HyperConverged resource","Request.Namespace":"openshift","Request.Name":"hyperconverged-cluster"}
{"level":"info","ts":1574269878.4878688,"logger":"controller_hyperconverged","msg":"Reconciling HyperConverged operator","Request.Namespace":"openshift","Request.Name":"hyperconverged-cluster"}
{"level":"info","ts":1574269878.4879391,"logger":"controller_hyperconverged","msg":"No HyperConverged resource","Request.Namespace":"openshift","Request.Name":"hyperconverged-cluster"}

The issue is that events from all of our secondary watches (every watch that is not on the HyperConverged resource itself) are added to the queue as a request for the owner. For resources that are cluster-wide or that live in a different namespace (for example 'openshift' instead of 'openshift-cnv'), the request added to the queue carries a namespace other than the one that holds the HyperConverged resource, so the lookup ends with "No HyperConverged resource", as in the log above.
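
The namespace mismatch can be illustrated with a minimal, self-contained Go sketch. It does not use the real controller-runtime types or any HCO code: `NamespacedName`, `Object`, `requestForOwner`, and `requestForHCO` are hypothetical names made up for illustration, and `requestForHCO` shows only one possible way to address the problem, not necessarily the fix that shipped.

```go
package main

import "fmt"

// NamespacedName is a hypothetical stand-in for the key of a reconcile request.
type NamespacedName struct {
	Namespace string
	Name      string
}

// Object is a hypothetical, simplified view of a watched secondary resource.
type Object struct {
	Namespace string // empty for cluster-wide resources
	Name      string
	OwnerName string // name of the owning HyperConverged resource
}

// requestForOwner mimics the problematic behavior: the request uses the
// owner's name but the namespace of the secondary object itself.
func requestForOwner(o Object) NamespacedName {
	return NamespacedName{Namespace: o.Namespace, Name: o.OwnerName}
}

// requestForHCO always targets the namespace where the HyperConverged
// resource lives, regardless of where the secondary object is.
// (Assumption: the caller knows that namespace.)
func requestForHCO(o Object, hcoNamespace string) NamespacedName {
	return NamespacedName{Namespace: hcoNamespace, Name: o.OwnerName}
}

func main() {
	templates := Object{
		Namespace: "openshift",
		Name:      "common-templates-hyperconverged-cluster",
		OwnerName: "hyperconverged-cluster",
	}

	bad := requestForOwner(templates)
	good := requestForHCO(templates, "openshift-cnv")

	fmt.Printf("owner-based request: %s/%s\n", bad.Namespace, bad.Name)
	fmt.Printf("HCO-based request:   %s/%s\n", good.Namespace, good.Name)
}
```

In this sketch the owner-based request points at the 'openshift' namespace, where no HyperConverged resource exists, which matches the "No HyperConverged resource" lines in the log.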


Version-Release number of selected component (if applicable):
2.2.0

How reproducible:
100%

Steps to Reproduce:
1. deploy CNV
2. check HCO logs
3.
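
Step 2 can be partly automated. The following is a minimal sketch, assuming the HCO log lines are JSON objects with the `msg`, `Request.Namespace`, and `Request.Name` fields seen in the excerpt above; `logEntry` and `isMissingHCO` are hypothetical names. It reads log lines from standard input and prints every failed lookup.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"os"
)

// logEntry holds the fields of interest from one HCO JSON log line.
type logEntry struct {
	Msg       string `json:"msg"`
	Namespace string `json:"Request.Namespace"`
	Name      string `json:"Request.Name"`
}

// isMissingHCO parses one log line and reports whether it is a
// "No HyperConverged resource" entry. Lines that are not valid JSON
// are treated as non-matching.
func isMissingHCO(line string) (logEntry, bool) {
	var e logEntry
	if err := json.Unmarshal([]byte(line), &e); err != nil {
		return logEntry{}, false
	}
	return e, e.Msg == "No HyperConverged resource"
}

func main() {
	scanner := bufio.NewScanner(os.Stdin)
	for scanner.Scan() {
		if e, ok := isMissingHCO(scanner.Text()); ok {
			fmt.Printf("failed lookup: %s/%s\n", e.Namespace, e.Name)
		}
	}
	if err := scanner.Err(); err != nil {
		fmt.Fprintln(os.Stderr, "reading input:", err)
		os.Exit(1)
	}
}
```

To use it, pipe the log output of the HCO operator pod into the program's standard input. Any reported namespace other than the one that holds the HyperConverged resource indicates this bug.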

Actual results:
{"level":"info","ts":1574269878.4765778,"logger":"controller_hyperconverged","msg":"No HyperConverged resource","Request.Namespace":"openshift","Request.Name":"hyperconverged-cluster"}

Expected results:
Successful reconciliation

Additional info:

Comment 1 Simone Tiraboschi 2019-12-02 13:30:07 UTC
Workaround:

  echo -e "spec:\n  test: 1234" > patch-file.yaml
  oc patch -n openshift KubevirtCommonTemplatesBundle common-templates-hyperconverged-cluster --type merge --patch "$(cat patch-file.yaml)"
  oc patch -n openshift-cnv KubevirtNodeLabellerBundles node-labeller-hyperconverged-cluster --type merge --patch "$(cat patch-file.yaml)"

Comment 2 Dan Kenigsberg 2020-01-07 06:56:25 UTC
We must not release cnv-2.2 without this fix, as it can cause data loss.

Comment 3 Dan Kenigsberg 2020-01-07 07:14:13 UTC
(In reply to Dan Kenigsberg from comment #2)
> We must not release cnv-2.2 without this fix, as it can cause data loss.

Sorry, this bug is related to the data loss bug 1786475, but on its own it is not a blocker. Typically, underlying CRs are unlikely to be modified on their own.

Comment 4 zhe peng 2020-01-15 06:45:32 UTC
Deployed CNV 2.2 on a PSI env.
Checked the HCO logs:

{"level":"info","ts":1579044558.9341733,"logger":"controller_hyperconverged","msg":"Reconciling HyperConverged operator","Request.Namespace":"openshift-cnv","Request.Name":"hyperconverged-cluster"}
{"level":"info","ts":1579044558.9342268,"logger":"controller_hyperconverged","msg":"KubeVirt config already exists","Request.Namespace":"openshift-cnv","Request.Name":"hyperconverged-cluster","KubeVirtConfig.Namespace":"openshift-cnv","KubeVirtConfig.Name":"kubevirt-config"}

move to verified

Comment 5 zhe peng 2020-01-15 08:02:16 UTC
Update verified version:
Client Version: 4.3.0-0.nightly-2020-01-14-043441
Server Version: 4.3.0-0.nightly-2020-01-14-043441
Kubernetes Version: v1.16.2

Comment 7 errata-xmlrpc 2020-01-30 16:27:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:0307