Bug 1877698 - ssp operator can't patch the placement api
Summary: ssp operator can't patch the placement api
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: SSP
Version: 2.5.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 2.5.0
Assignee: Karel Šimon
QA Contact: Or Bairey-Sehayek
URL:
Whiteboard:
Depends On: 1889401
Blocks:
 
Reported: 2020-09-10 08:40 UTC by Karel Šimon
Modified: 2020-11-17 13:24 UTC (History)
CC List: 3 users

Fixed In Version: kubevirt-ssp-operator-container-v2.5.0-50
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-11-17 13:24:22 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github kubevirt kubevirt-ssp-operator pull 239 0 None closed Fix placement api 2020-11-09 11:41:39 UTC
Red Hat Product Errata RHEA-2020:5127 0 None None None 2020-11-17 13:24:41 UTC

Description Karel Šimon 2020-09-10 08:40:47 UTC
Description of problem:
The operator patches the template-validator (TV) deployment and node-labeller (NL) daemonset using a merge patch or JSON patch, but it does not seem to be able to remove or edit the placement fields if they already exist.

For example:

A CR is created with the placement API - works as expected
A CR is created without the placement API - works as expected
A CR is created without the placement API and then updated to add the placement API - works as expected
A CR with the placement API is updated to remove or change the placement fields - the operator fails to apply the change
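For reference, RFC 7386 merge-patch semantics explain this failure mode: a key that is simply omitted from the patch is left untouched on the target, and only an explicit null deletes it. A minimal sketch (the `json_merge_patch` helper here is illustrative only, not the operator's actual code):

```python
def json_merge_patch(target, patch):
    """Apply an RFC 7386 JSON Merge Patch: null deletes a key,
    objects merge recursively, any other value replaces."""
    if not isinstance(patch, dict):
        return patch
    if not isinstance(target, dict):
        target = {}
    result = dict(target)
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)   # explicit null removes the key
        else:
            result[key] = json_merge_patch(result.get(key), value)
    return result

spec = {"infra": {"nodePlacement": {"nodeSelector": {"foo": "bar"}}}}

# Omitting nodePlacement from the patch leaves it untouched:
unchanged = json_merge_patch(spec, {"infra": {}})

# Only an explicit null actually deletes the field:
cleared = json_merge_patch(spec, {"infra": {"nodePlacement": None}})
```

So a merge patch built from a desired spec that merely lacks `nodePlacement` can never remove an existing `nodePlacement`; the patch has to set the field to null explicitly.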

Comment 1 Or Bairey-Sehayek 2020-10-16 20:14:53 UTC
After editing HCO, nodeSelector parameters work fine.

When attempting to reverse changes, the following error is received:

error: hyperconvergeds.hco.kubevirt.io "kubevirt-hyperconverged" could not be patched: Internal error occurred: failed calling webhook "validate-hco.kubevirt.io": Post "https://hco-operator-service.openshift-cnv.svc:4343/validate-hco-kubevirt-io-v1beta1-hyperconverged?timeout=30s": no endpoints available for service "hco-operator-service"
You can run `oc replace -f /tmp/oc-edit-ikelt.yaml` to try this update again.

The only way I have found to fix this is to redeploy CNV.

Comment 2 Or Bairey-Sehayek 2020-10-19 15:19:16 UTC
(In reply to Or Bairey-Sehayek from comment #1)
> After editing HCO, nodeSelector parameters work fine.
> 
> When attempting to reverse changes, the following error is received:
> 
> error: hyperconvergeds.hco.kubevirt.io "kubevirt-hyperconverged" could not
> be patched: Internal error occurred: failed calling webhook
> "validate-hco.kubevirt.io": Post
> "https://hco-operator-service.openshift-cnv.svc:4343/validate-hco-kubevirt-
> io-v1beta1-hyperconverged?timeout=30s": no endpoints available for service
> "hco-operator-service"
> You can run `oc replace -f /tmp/oc-edit-ikelt.yaml` to try this update again.
> 
> The only way I have found to fix this is to redeploy CNV.

Changes made:

$ oc -n openshift-cnv edit hyperconverged

BEFORE:

spec:
  infra: {}
  workloads: {}

AFTER:

spec:
  infra:
    nodePlacement:
      nodeSelector:
        foo: bar
  workloads:
    nodePlacement:
      nodeSelector:
        foo: bar


It should be noted that I tried doing this twice: once with no nodes labelled foo=bar and once with a worker labelled foo=bar. The result was the same: attempting to revert the change caused the error above.
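The description also mentions a JSON patch path; RFC 6902 does support removal, but only through an explicit `remove` operation aimed at the exact field path. A hedged sketch (this tiny `apply_json_patch` helper is hypothetical, covers only object keys, and is not the operator's code):

```python
import copy

def apply_json_patch(doc, ops):
    """Apply a minimal subset of RFC 6902 JSON Patch: add, replace, remove.
    Paths are slash-separated object keys (no array indices or escaping)."""
    doc = copy.deepcopy(doc)
    for op in ops:
        parts = [p for p in op["path"].split("/") if p]
        parent = doc
        for key in parts[:-1]:
            parent = parent[key]
        if op["op"] == "remove":
            del parent[parts[-1]]   # raises KeyError if the path is absent
        elif op["op"] in ("add", "replace"):
            parent[parts[-1]] = op["value"]
        else:
            raise ValueError("unsupported op: " + op["op"])
    return doc

spec = {"infra": {"nodePlacement": {"nodeSelector": {"foo": "bar"}}}}
reverted = apply_json_patch(
    spec, [{"op": "remove", "path": "/infra/nodePlacement"}]
)
```

Reverting a placement field is exactly such a removal, so whichever patch style the operator emits must express the deletion explicitly rather than just sending the desired spec.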

Opened a separate bug about this: https://bugzilla.redhat.com/show_bug.cgi?id=1889401

Comment 3 Kedar Bidarkar 2020-11-04 18:33:36 UTC
Checking for Infra/template-validator
---------------------------------------
[kbidarka@localhost ocp-python-wrapper]$ oc get hyperconverged kubevirt-hyperconverged -n openshift-cnv -o yaml 
apiVersion: hco.kubevirt.io/v1beta1
kind: HyperConverged
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"hco.kubevirt.io/v1beta1","kind":"HyperConverged","metadata":{"annotations":{},"name":"kubevirt-hyperconverged","namespace":"openshift-cnv"},"spec":{"bareMetalPlatform":true}}
  finalizers:
  ...
  name: kubevirt-hyperconverged
  namespace: openshift-cnv
spec:
  bareMetalPlatform: true
  infra:
    nodePlacement:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: nodeType
                operator: In
                values:
                - infra
  version: v2.5.0
  workloads: {}
---
[kbidarka@localhost ocp-python-wrapper]$ oc get pods -n openshift-cnv | grep validator 
virt-template-validator-6876b65456-z6g5r              0/1     Pending   0          54s
virt-template-validator-69df488767-cg9cv              1/1     Running   0          8m36s
virt-template-validator-69df488767-wxn8h              1/1     Running   0          8m40s

[kbidarka@localhost ocp-python-wrapper]$ oc get deployment -n openshift-cnv virt-template-validator -o yaml 
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
...
spec:
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: nodeType
                operator: In
                values:
                - infra
After reverting:
-----------------
[kbidarka@localhost ocp-python-wrapper]$ oc get pods -n openshift-cnv | grep validator
virt-template-validator-75f6c79bdf-9pdsk              1/1     Running   0          28s
virt-template-validator-75f6c79bdf-zw4ph              1/1     Running   0          24s
--------------------------------------------------------------------------------------------------------------------------------------------------
Checking for Workloads/node-labeller
------------------------------------
[kbidarka@localhost ocp-python-wrapper]$ oc get hyperconverged kubevirt-hyperconverged -n openshift-cnv -o yaml 
apiVersion: hco.kubevirt.io/v1beta1
kind: HyperConverged
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"hco.kubevirt.io/v1beta1","kind":"HyperConverged","metadata":{"annotations":{},"name":"kubevirt-hyperconverged","namespace":"openshift-cnv"},"spec":{"bareMetalPlatform":true}}
  finalizers:
  ...
  name: kubevirt-hyperconverged
  namespace: openshift-cnv
spec:
  bareMetalPlatform: true
  infra: {}
  version: v2.5.0
  workloads:
    nodePlacement:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: nodeType
                operator: In
                values:
                - infra
[kbidarka@localhost ocp-python-wrapper]$ oc get pods -n openshift-cnv  | grep labeller
[kbidarka@localhost ocp-python-wrapper]$ 

After reverting the hyperconverged CR 
----------------------------------------

[kbidarka@localhost ocp-python-wrapper]$ oc get pods -n openshift-cnv  | grep labeller
kubevirt-node-labeller-c4xth                          1/1     Running   0          23s
kubevirt-node-labeller-mssgf                          1/1     Running   0          23s

Comment 6 errata-xmlrpc 2020-11-17 13:24:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Virtualization 2.5.0 Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:5127

