Bug 1853133
| Summary: | [CNV-2.4] Deployment fails on KubeVirtMetricsAggregationNotAvailable | ||
|---|---|---|---|
| Product: | Container Native Virtualization (CNV) | Reporter: | Lukas Bednar <lbednar> |
| Component: | SSP | Assignee: | Karel Šimon <ksimon> |
| Status: | CLOSED ERRATA | QA Contact: | Israel Pinto <ipinto> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | ||
| Version: | 2.4.0 | CC: | cnv-qe-bugs, dollierp, irose, lbednar, ncredi, nunnatsa, oyahud, rnetser, stirabos, talayan |
| Target Milestone: | --- | Keywords: | Regression, TestBlocker |
| Target Release: | 2.4.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | kubevirt-ssp-operator-container-v2.4.0-66, hco-bundle-registry-container-v2.3.0-445 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2020-07-28 19:10:39 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
What is the timeout? How long does it run before failing the test?

*** Bug 1853185 has been marked as a duplicate of this bug. ***

After more than 6 hours the issue is still there, and indeed KubeVirtMetricsAggregation is not available:
[cloud-user@ocp-psi-executor ~]$ oc get KubeVirtMetricsAggregation -n openshift-cnv metrics-aggregation-kubevirt-hyperconverged -o yaml
apiVersion: ssp.kubevirt.io/v1
kind: KubevirtMetricsAggregation
metadata:
  creationTimestamp: "2020-07-02T00:58:04Z"
  generation: 1
  labels:
    app: kubevirt-hyperconverged
  managedFields:
  - apiVersion: ssp.kubevirt.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:labels:
          .: {}
          f:app: {}
        f:ownerReferences: {}
      f:spec: {}
    manager: hyperconverged-cluster-operator
    operation: Update
    time: "2020-07-02T00:58:04Z"
  - apiVersion: ssp.kubevirt.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        .: {}
        f:conditions: {}
    manager: ansible-operator
    operation: Update
    time: "2020-07-02T07:30:06Z"
  name: metrics-aggregation-kubevirt-hyperconverged
  namespace: openshift-cnv
  ownerReferences:
  - apiVersion: hco.kubevirt.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: HyperConverged
    name: kubevirt-hyperconverged
    uid: 5231311f-39f4-42bb-af2e-1a4d1533c0bb
  resourceVersion: "8270927"
  selfLink: /apis/ssp.kubevirt.io/v1/namespaces/openshift-cnv/kubevirtmetricsaggregations/metrics-aggregation-kubevirt-hyperconverged
  uid: ab78b85b-4d1f-447c-ac71-ce9fd006a52f
spec: {}
status:
  conditions:
  - lastTransitionTime: "2020-07-02T07:30:01Z"
    message: Running reconciliation
    reason: Running
    status: "False"
    type: Running
  - ansibleResult:
      changed: 0
      completion: 2020-07-02T07:30:05.990898
      failures: 1
      ok: 2
      skipped: 0
    lastTransitionTime: "2020-07-02T07:30:06Z"
    message: |
      The task includes an option with an undefined variable. The error was: 'operator_version' is undefined
      The error appears to be in '/opt/ansible/roles/KubevirtMetricsAggregation/tasks/main.yml': line 2, column 3, but may
      be elsewhere in the file depending on the exact syntax problem.
      The offending line appears to be:
      ---
      - name: Set operatorVersion and targetVersion
      ^ here
    reason: Failed
    status: "True"
    type: Failure
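As a rough illustration of how to read the status above, the following sketch scans a CR's `status.conditions` and reports every condition of type `Failure` whose status is `"True"`. The field names follow the YAML shown; the helper name `failed_conditions` and the trimmed `sample_cr` dict are hypothetical, and the function expects an already-parsed object (for example, `oc get ... -o json` output loaded with `json.loads`).

```python
def failed_conditions(cr):
    """Return (reason, first message line) for each active Failure condition."""
    conditions = cr.get("status", {}).get("conditions", [])
    failures = []
    for cond in conditions:
        # The ansible-operator reports a failed run as type "Failure", status "True".
        if cond.get("type") == "Failure" and cond.get("status") == "True":
            message = cond.get("message", "").strip()
            first_line = message.splitlines()[0] if message else ""
            failures.append((cond.get("reason", ""), first_line))
    return failures


# Hypothetical sample, trimmed from the status shown above.
sample_cr = {
    "status": {
        "conditions": [
            {"type": "Running", "status": "False", "reason": "Running",
             "message": "Running reconciliation"},
            {"type": "Failure", "status": "True", "reason": "Failed",
             "message": "The task includes an option with an undefined variable. "
                        "The error was: 'operator_version' is undefined\n"
                        "The offending line appears to be:\n"},
        ]
    }
}

for reason, line in failed_conditions(sample_cr):
    print(f"{reason}: {line}")
```

Only the first line of the message is kept because the Ansible error text spans several lines and the first one names the undefined variable.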
In SSP operator logs:
--------------------------- Ansible Task StdOut -------------------------------
TASK [Set operatorVersion and targetVersion] ********************************
fatal: [localhost]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'operator_version' is undefined\n\nThe error appears to be in '/opt/ansible/roles/KubevirtCommonTemplatesBundle/tasks/main.yml': line 2, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n---\n- name: Set operatorVersion and targetVersion\n ^ here\n"}
-------------------------------------------------------------------------------
{"level":"error","ts":1593675251.394438,"logger":"logging_event_handler","msg":"","name":"common-templates-kubevirt-hyperconverged","namespace":"openshift","gvk":"ssp.kubevirt.io/v1, Kind=KubevirtCommonTemplatesBundle","event_type":"runner_on_failed","job":"8769843336475300403","EventData.Task":"Set operatorVersion and targetVersion","EventData.TaskArgs":"","EventData.FailedTaskPath":"/opt/ansible/roles/KubevirtCommonTemplatesBundle/tasks/main.yml:2","error":"[playbook task failed]","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\tsrc/github.com/operator-framework/operator-sdk/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/operator-framework/operator-sdk/pkg/ansible/events.loggingEventHandler.Handle\n\tsrc/github.com/operator-framework/operator-sdk/pkg/ansible/events/log_events.go:87"}
{"level":"error","ts":1593675251.5331721,"logger":"runner","msg":"\u001b[0;34mansible-playbook 2.9.10\u001b[0m\r\n\u001b[0;34m config file = /etc/ansible/ansible.cfg\u001b[0m\r\n\u001b[0;34m configured module search path = [u'/opt/ansible/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']\u001b[0m\r\n\u001b[0;34m ansible python module location = /usr/lib/python2.7/site-packages/ansible\u001b[0m\r\n\u001b[0;34m executable location = /usr/bin/ansible-playbook\u001b[0m\r\n\u001b[0;34m python version = 2.7.5 (default, Sep 26 2019, 13:23:47) [GCC 4.8.5 20150623 (Red Hat 4.8.5-39)]\u001b[0m\r\n\u001b[0;34mUsing /etc/ansible/ansible.cfg as config file\u001b[0m\r\n\r\nPLAYBOOK: kubevirtnodelabeller.yaml ********************************************\n\u001b[0;34m1 plays in /opt/ansible/kubevirtnodelabeller.yaml\u001b[0m\n\r\nPLAY [localhost] ***************************************************************\n\r\nTASK [Gathering Facts] *********************************************************\r\n\u001b[1;30mtask path: /opt/ansible/kubevirtnodelabeller.yaml:1\u001b[0m\n\u001b[0;32mok: [localhost]\u001b[0m\n\u001b[0;34mMETA: ran handlers\u001b[0m\n\r\nTASK [KubevirtCircuitBreaker : Extract the CR info] ****************************\r\n\u001b[1;30mtask path: /opt/ansible/roles/KubevirtCircuitBreaker/tasks/main.yml:3\u001b[0m\n\u001b[0;32mok: [localhost] => {\"ansible_facts\": {\"cr_info\": {\"apiVersion\": \"ssp.kubevirt.io/v1\", \"kind\": \"KubevirtNodeLabellerBundle\", \"metadata\": {\"creationTimestamp\": \"2020-07-02T00:58:04Z\", \"generation\": 1, \"labels\": {\"app\": \"kubevirt-hyperconverged\"}, \"managedFields\": [{\"apiVersion\": \"ssp.kubevirt.io/v1\", \"fieldsType\": \"FieldsV1\", \"fieldsV1\": {\"f:metadata\": {\"f:labels\": {\".\": {}, \"f:app\": {}}, \"f:ownerReferences\": {}}, \"f:spec\": {}}, \"manager\": \"hyperconverged-cluster-operator\", \"operation\": \"Update\", \"time\": \"2020-07-02T00:58:04Z\"}, {\"apiVersion\": \"ssp.kubevirt.io/v1\", 
\"fieldsType\": \"FieldsV1\", \"fieldsV1\": {\"f:status\": {\".\": {}, \"f:conditions\": {}}}, \"manager\": \"ansible-operator\", \"operation\": \"Update\", \"time\": \"2020-07-02T07:34:01Z\"}], \"name\": \"node-labeller-kubevirt-hyperconverged\", \"namespace\": \"openshift-cnv\", \"ownerReferences\": [{\"apiVersion\": \"hco.kubevirt.io/v1alpha1\", \"blockOwnerDeletion\": true, \"controller\": true, \"kind\": \"HyperConverged\", \"name\": \"kubevirt-hyperconverged\", \"uid\": \"5231311f-39f4-42bb-af2e-1a4d1533c0bb\"}], \"resourceVersion\": \"8273277\", \"selfLink\": \"/apis/ssp.kubevirt.io/v1/namespaces/openshift-cnv/kubevirtnodelabellerbundles/node-labeller-kubevirt-hyperconverged\", \"uid\": \"c22dd1a9-f565-4832-9d9a-2965410c41e2\"}, \"spec\": {}, \"status\": {\"conditions\": [{\"ansibleResult\": {\"changed\": 0, \"completion\": \"2020-07-02T07:17:21.089307\", \"failures\": 1, \"ok\": 6, \"skipped\": 0}, \"lastTransitionTime\": \"2020-07-02T07:17:21Z\", \"message\": \"The task includes an option with an undefined variable. 
The error was: 'operator_version' is undefined\\n\\nThe error appears to be in '/opt/ansible/roles/KubevirtNodeLabeller/tasks/main.yml': line 2, column 3, but may\\nbe elsewhere in the file depending on the exact syntax problem.\\n\\nThe offending line appears to be:\\n\\n---\\n- name: Set operatorVersion and targetVersion\\n ^ here\\n\", \"reason\": \"Failed\", \"status\": \"False\", \"type\": \"Failure\"}, {\"lastTransitionTime\": \"2020-07-02T07:34:01Z\", \"message\": \"Running reconciliation\", \"reason\": \"Running\", \"status\": \"True\", \"type\": \"Running\"}]}}}, \"changed\": false}\u001b[0m\n\r\nTASK [KubevirtCircuitBreaker : Extract the disable info] ***********************\r\n\u001b[1;30mtask path: /opt/ansible/roles/KubevirtCircuitBreaker/tasks/main.yml:6\u001b[0m\n\u001b[0;32mok: [localhost] => {\"ansible_facts\": {\"is_paused\": false}, \"changed\": false}\u001b[0m\n\u001b[0;34mMETA: \u001b[0m\n\r\nTASK [KubevirtRepoInfo : Extract the image name] *******************************\r\n\u001b[1;30mtask path: /opt/ansible/roles/KubevirtRepoInfo/tasks/main.yml:3\u001b[0m\n\u001b[0;32mok: [localhost] => {\"ansible_facts\": {\"operator_image_name\": \"registry-proxy.engineering.redhat.com/rh-osbs/container-native-virtualization-kubevirt-ssp-operator@sha256:d4b4133c2c3e9402c895856a968495a3539b32c87e3947268bbdb9763994956a\"}, \"changed\": false}\u001b[0m\n\r\nTASK [KubevirtRepoInfo : Extract the SSP registry] *****************************\r\n\u001b[1;30mtask path: /opt/ansible/roles/KubevirtRepoInfo/tasks/main.yml:6\u001b[0m\n\u001b[0;32mok: [localhost] => {\"ansible_facts\": {\"image_name_prefix\": \"container-native-virtualization-\", \"ssp_registry\": \"registry-proxy.engineering.redhat.com/rh-osbs\"}, \"changed\": false}\u001b[0m\n\r\nTASK [KubevirtRepoInfo : Show the SSP registry] ********************************\r\n\u001b[1;30mtask path: /opt/ansible/roles/KubevirtRepoInfo/tasks/main.yml:10\u001b[0m\n\u001b[0;32mok: [localhost] => 
{\u001b[0m\r\n\u001b[0;32m \"msg\": \"registry: registry-proxy.engineering.redhat.com/rh-osbs prefix: container-native-virtualization-\"\u001b[0m\r\n\u001b[0;32m}\u001b[0m\n\r\nTASK [KubevirtNodeLabeller : Set operatorVersion and targetVersion] ************\r\n\u001b[1;30mtask path: /opt/ansible/roles/KubevirtNodeLabeller/tasks/main.yml:2\u001b[0m\n\u001b[0;31mfatal: [localhost]: FAILED! => {\"msg\": \"The task includes an option with an undefined variable. The error was: 'operator_version' is undefined\\n\\nThe error appears to be in '/opt/ansible/roles/KubevirtNodeLabeller/tasks/main.yml': line 2, column 3, but may\\nbe elsewhere in the file depending on the exact syntax problem.\\n\\nThe offending line appears to be:\\n\\n---\\n- name: Set operatorVersion and targetVersion\\n ^ here\\n\"}\u001b[0m\n\r\nPLAY RECAP *********************************************************************\r\n\u001b[0;31mlocalhost\u001b[0m : \u001b[0;32mok=6 \u001b[0m changed=0 unreachable=0 \u001b[0;31mfailed=1 \u001b[0m skipped=0 rescued=0 ignored=0 \r\n\n","job":"583103985801850551","name":"node-labeller-kubevirt-hyperconverged","namespace":"openshift-cnv","error":"exit status 2","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\tsrc/github.com/operator-framework/operator-sdk/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/operator-framework/operator-sdk/pkg/ansible/runner.(*runner).Run.func1\n\tsrc/github.com/operator-framework/operator-sdk/pkg/ansible/runner/runner.go:239"}
--------------------------- Ansible Task Status Event StdOut -----------------
PLAY RECAP *********************************************************************
localhost : ok=6 changed=0 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0
This definitely looks like a bug on the SSP side; moving it there.

I think that the issue is caused by a missing value for OPERATOR_VERSION in the deployment for SSP in the CSV. If so, it should already be addressed by: https://github.com/MarSik/kubevirt-ssp-operator/pull/198

The issue is with a missing variable in the _defaults.yaml Ansible file downstream; working on a fix right now.

Seeing a slightly different message with HCO-v2.3.0-434:
- lastHeartbeatTime: "2020-07-02T10:47:10Z"
  lastTransitionTime: "2020-07-02T09:59:39Z"
  message: missing "Available" condition
  reason: KubeVirtMetricsAggregationNotAvailable
  status: "False"
  type: Available
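One of the hypotheses raised in this thread is a missing OPERATOR_VERSION value in the SSP deployment. A minimal sketch of how that could be checked on a parsed Deployment object follows. It assumes the standard Kubernetes Deployment layout (`spec.template.spec.containers[].env[]`) and looks only at literal `value` entries, not `valueFrom`; the helper name `containers_missing_env` and both sample objects are hypothetical. It does not cover the other cause mentioned above, the variable missing from the downstream _defaults.yaml file.

```python
def containers_missing_env(deployment, env_name="OPERATOR_VERSION"):
    """Return names of containers that lack a non-empty literal env_name."""
    pod_spec = deployment.get("spec", {}).get("template", {}).get("spec", {})
    missing = []
    for container in pod_spec.get("containers", []):
        # Map env var names to their literal values (valueFrom is ignored here).
        env = {e.get("name"): e.get("value") for e in container.get("env", [])}
        if not env.get(env_name):
            missing.append(container.get("name", "<unnamed>"))
    return missing


# Hypothetical deployments, for illustration only.
broken = {"spec": {"template": {"spec": {"containers": [
    {"name": "kubevirt-ssp-operator", "env": [{"name": "WATCH_NAMESPACE", "value": ""}]},
]}}}}
fixed = {"spec": {"template": {"spec": {"containers": [
    {"name": "kubevirt-ssp-operator", "env": [{"name": "OPERATOR_VERSION", "value": "v2.4.0"}]},
]}}}}

print(containers_missing_env(broken))
print(containers_missing_env(fixed))
```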

The new ssp-operator build contains the missing variable; moved to ON_QA.

Still failing with kubevirt-ssp-operator:v2.4.0-65:
> Failed to import the required Python library (openshift >= 0.9.2) on kubevirt-ssp-operator-69c7fdb484-s97hc's Python /usr/bin/python2.
> This is required for apply.
> Please read module documentation and install in the appropriate location.
> If the required library is installed, but Ansible is using the wrong Python interpreter, please consult the documentation on ansible_python_interpreter
According to the Kubernetes Ansible collection, the required version of the openshift Python module is 0.9.2 (see the file /opt/ansible/.ansible/collections/ansible_collections/community/kubernetes/plugins/module_utils/raw.py).
However, the kubevirt-ssp-operator image contains version 0.8.11:
> rpm -q python2-openshift
> python2-openshift-0.8.11-1.el7.noarch
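The mismatch above (0.8.11 installed, at least 0.9.2 required) can be confirmed with a numeric version comparison. The sketch below is only an illustration, not the check the Ansible collection itself performs; the helper names are hypothetical and it assumes purely numeric dotted versions.

```python
def parse_version(text):
    """Turn a dotted numeric version such as '0.8.11' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def satisfies_minimum(installed, required):
    """True if the installed version is at least the required version."""
    return parse_version(installed) >= parse_version(required)


# Versions taken from the rpm query and the error message above.
print(satisfies_minimum("0.8.11", "0.9.2"))
```

Comparing integer tuples rather than raw strings matters for versions such as 0.10.0, which a plain string comparison would rank below 0.9.2.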

Where do you see this error? I'm running kubevirt-ssp-operator-container-v2.4.0-65 and it is working fine.

I managed to reproduce the issue; will work on a solution on Sunday.

The upstream workaround PR can be found here: https://github.com/MarSik/kubevirt-ssp-operator/pull/200
The operator-sdk bug can be found here: https://bugzilla.redhat.com/show_bug.cgi?id=1853915

A new SSP build is available: kubevirt-ssp-operator-container-v2.4.0-66. Not sure if we have to wait for an HCO build as well.

Since CVP is broken ATM, I deployed CNV from OSBS instead of Brew.
CSV kubevirt-hyperconverged-operator.v2.4.0:
- createdAt: "2020-07-06 17:46:10"
- hyperconverged-cluster-operator:v2.4.0-62
- kubevirt-ssp-operator:v2.4.0-66
CNV deployment is successful with those versions.

New builds were provided (see Fixed In Version), moving to ON_QA.

Verified on OCP 4.5.0-rc., SSP v2.4.0-66:
Clean installation and after upgrade
$ oc get KubeVirtMetricsAggregation -n openshift-cnv metrics-aggregation-kubevirt-hyperconverged -o yaml
apiVersion: ssp.kubevirt.io/v1
kind: KubevirtMetricsAggregation
metadata:
  creationTimestamp: "2020-07-08T11:07:30Z"
  generation: 1
  labels:
    app: kubevirt-hyperconverged
  managedFields:
  - apiVersion: ssp.kubevirt.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:labels:
          .: {}
          f:app: {}
        f:ownerReferences: {}
      f:spec: {}
    manager: hyperconverged-cluster-operator
    operation: Update
    time: "2020-07-08T11:07:30Z"
  - apiVersion: ssp.kubevirt.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        f:observedVersion: {}
        f:operatorVersion: {}
        f:targetVersion: {}
    manager: Swagger-Codegen
    operation: Update
    time: "2020-07-08T11:08:21Z"
  - apiVersion: ssp.kubevirt.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        .: {}
        f:conditions: {}
    manager: ansible-operator
    operation: Update
    time: "2020-07-08T11:08:23Z"
  name: metrics-aggregation-kubevirt-hyperconverged
  namespace: openshift-cnv
  ownerReferences:
  - apiVersion: hco.kubevirt.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: HyperConverged
    name: kubevirt-hyperconverged
    uid: 463aa27a-76e5-43c5-a88c-f8792c04724a
  resourceVersion: "119186"
  selfLink: /apis/ssp.kubevirt.io/v1/namespaces/openshift-cnv/kubevirtmetricsaggregations/metrics-aggregation-kubevirt-hyperconverged
  uid: 203a87d8-7828-462e-9be9-490ee4a990d5
spec: {}
status:
  conditions:
  - lastTransitionTime: "2020-07-08T11:08:23Z"
    message: KubevirtMetricsAggregation is available.
    reason: available
    status: "True"
    type: Available
  - lastTransitionTime: "2020-07-08T11:08:23Z"
    message: KubevirtMetricsAggregation progressing
    reason: progressing
    status: "False"
    type: Progressing
  - lastTransitionTime: "2020-07-08T11:08:23Z"
    message: KubevirtMetricsAggregation degraded
    reason: degraded
    status: "False"
    type: Degraded
  - ansibleResult:
      changed: 6
      completion: 2020-07-08T11:08:23.59718
      failures: 0
      ok: 9
      skipped: 0
    lastTransitionTime: "2020-07-08T11:07:57Z"
    message: Awaiting next reconciliation
    reason: Successful
    status: "True"
    type: Running
  observedVersion: v2.4.0
  operatorVersion: v2.4.0
  targetVersion: v2.4.0
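The verified status above differs from the failing one in two ways: it carries Available/Progressing/Degraded conditions, and its three version fields agree. The sketch below checks both on a parsed CR. The criterion is an assumption made for illustration, not the HCO's actual health logic, and the helper name `is_healthy` and the sample dicts are hypothetical.

```python
def is_healthy(cr):
    """True if the CR is Available, not Progressing/Degraded, and versions agree."""
    status = cr.get("status", {})
    by_type = {c.get("type"): c.get("status") for c in status.get("conditions", [])}
    conditions_ok = (
        by_type.get("Available") == "True"
        and by_type.get("Progressing") == "False"
        and by_type.get("Degraded") == "False"
    )
    # All three version fields must be present and identical.
    versions = {status.get(key) for key in
                ("observedVersion", "operatorVersion", "targetVersion")}
    versions_ok = len(versions) == 1 and None not in versions
    return conditions_ok and versions_ok


# Hypothetical samples, trimmed from the two status blocks in this report.
verified_cr = {"status": {
    "conditions": [
        {"type": "Available", "status": "True"},
        {"type": "Progressing", "status": "False"},
        {"type": "Degraded", "status": "False"},
        {"type": "Running", "status": "True"},
    ],
    "observedVersion": "v2.4.0",
    "operatorVersion": "v2.4.0",
    "targetVersion": "v2.4.0",
}}
failing_cr = {"status": {"conditions": [
    {"type": "Running", "status": "False"},
    {"type": "Failure", "status": "True"},
]}}

print(is_healthy(verified_cr), is_healthy(failing_cr))
```

A CR with no Available condition at all, like the failing one, is treated as unhealthy, which matches the "missing Available condition" message reported earlier in the thread.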

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2020:3194

Description of problem:
- lastHeartbeatTime: "2020-07-02T04:35:15Z"
  lastTransitionTime: "2020-07-02T00:58:04Z"
  message: missing "Available" condition
  reason: KubeVirtMetricsAggregationNotAvailable
  status: "False"
  type: Available

Version-Release number of selected component (if applicable):
HCO-v2.3.0-433
OCP-4.5

How reproducible:
100

Steps to Reproduce:
1. Deploy CNV

Actual results:
Failing on KubeVirtMetricsAggregationNotAvailable

Expected results:
CNV deployed successfully

Additional info: