Description of problem: The operator patches the virt-template-validator (TV) deployment and the node-labeller (NL) daemonset using a merge patch or JSON patch, but it does not appear to be able to remove or edit the placement fields once they exist. For example:

- A CR is created with the placement API - works as expected
- A CR is created without the placement API - works as expected
- A CR is created without the placement API and then updated to add the placement API - works as expected
- A CR with the placement API is updated to remove or change the placement fields - the operator does not apply the change
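A likely mechanism (my reading of standard Kubernetes patch semantics, not confirmed against the operator code): with a JSON merge patch, a field that is simply omitted from the patch body is left untouched on the server, so clearing placement requires an explicit null; with a JSON patch, removal must be an explicit "remove" op, which fails if the path is absent. A sketch of both styles against the TV deployment (the exact paths the operator patches are an assumption on my part):

# JSON merge patch: an omitted field is kept; an explicit null is needed to delete it
$ oc patch deployment virt-template-validator -n openshift-cnv --type=merge \
    -p '{"spec":{"template":{"spec":{"nodeSelector":null}}}}'

# JSON patch: removal needs an explicit "remove" op, and it errors if the path does not exist
$ oc patch deployment virt-template-validator -n openshift-cnv --type=json \
    -p '[{"op":"remove","path":"/spec/template/spec/affinity"}]'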
After editing HCO, nodeSelector parameters work fine.

When attempting to reverse the changes, the following error is received:

error: hyperconvergeds.hco.kubevirt.io "kubevirt-hyperconverged" could not be patched: Internal error occurred: failed calling webhook "validate-hco.kubevirt.io": Post "https://hco-operator-service.openshift-cnv.svc:4343/validate-hco-kubevirt-io-v1beta1-hyperconverged?timeout=30s": no endpoints available for service "hco-operator-service"
You can run `oc replace -f /tmp/oc-edit-ikelt.yaml` to try this update again.

The only way I have found to fix this is to redeploy CNV.
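The error indicates the validating webhook's backing service has no ready endpoints. A quick way to confirm that (command sketch; the operator pod name pattern is an assumption) would be:

$ oc get endpoints hco-operator-service -n openshift-cnv
$ oc get pods -n openshift-cnv | grep hco-operator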
(In reply to Or Bairey-Sehayek from comment #1)
> After editing HCO, nodeSelector parameters work fine.
>
> When attempting to reverse changes, the following error is received:
>
> error: hyperconvergeds.hco.kubevirt.io "kubevirt-hyperconverged" could not
> be patched: Internal error occurred: failed calling webhook
> "validate-hco.kubevirt.io": Post
> "https://hco-operator-service.openshift-cnv.svc:4343/validate-hco-kubevirt-
> io-v1beta1-hyperconverged?timeout=30s": no endpoints available for service
> "hco-operator-service"
> You can run `oc replace -f /tmp/oc-edit-ikelt.yaml` to try this update again.
>
> The only way I have found to fix this is to redeploy CNV.

Changes made:

$ oc -n openshift-cnv edit hyperconverged

BEFORE:
spec:
  infra: {}
  workloads: {}

AFTER:
spec:
  infra:
    nodePlacement:
      nodeSelector:
        foo: bar
  workloads:
    nodePlacement:
      nodeSelector:
        foo: bar

It should be noted that I tried doing this twice: once with no nodes labelled foo=bar and once with a worker labelled foo=bar. The result was the same: attempting to revert the change caused the error above.

Opened a separate bug about this: https://bugzilla.redhat.com/show_bug.cgi?id=1889401
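The revert above was done via `oc edit`; an equivalent explicit patch (a sketch of the same removal, not the exact command I ran) would be:

$ oc patch hyperconverged kubevirt-hyperconverged -n openshift-cnv --type=json \
    -p '[{"op":"remove","path":"/spec/infra/nodePlacement"},{"op":"remove","path":"/spec/workloads/nodePlacement"}]'

Either way, the request is rejected by the validating webhook for as long as hco-operator-service has no endpoints.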
Checking for Infra/template-validator
---------------------------------------
[kbidarka@localhost ocp-python-wrapper]$ oc get hyperconverged kubevirt-hyperconverged -n openshift-cnv -o yaml
apiVersion: hco.kubevirt.io/v1beta1
kind: HyperConverged
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"hco.kubevirt.io/v1beta1","kind":"HyperConverged","metadata":{"annotations":{},"name":"kubevirt-hyperconverged","namespace":"openshift-cnv"},"spec":{"bareMetalPlatform":true}}
  finalizers:
  ...
  name: kubevirt-hyperconverged
  namespace: openshift-cnv
spec:
  bareMetalPlatform: true
  infra:
    nodePlacement:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: nodeType
                operator: In
                values:
                - infra
  version: v2.5.0
  workloads: {}

[kbidarka@localhost ocp-python-wrapper]$ oc get pods -n openshift-cnv | grep validator
virt-template-validator-6876b65456-z6g5r   0/1   Pending   0   54s
virt-template-validator-69df488767-cg9cv   1/1   Running   0   8m36s
virt-template-validator-69df488767-wxn8h   1/1   Running   0   8m40s

[kbidarka@localhost ocp-python-wrapper]$ oc get deployment -n openshift-cnv virt-template-validator -o yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
  ...
spec:
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: nodeType
                operator: In
                values:
                - infra

After reverting:
-----------------
[kbidarka@localhost ocp-python-wrapper]$ oc get pods -n openshift-cnv | grep validator
virt-template-validator-75f6c79bdf-9pdsk   1/1   Running   0   28s
virt-template-validator-75f6c79bdf-zw4ph   1/1   Running   0   24s

--------------------------------------------------------------------------------------------------------------------------------------------------

Checking for Workloads/node-labeller
------------------------------------
[kbidarka@localhost ocp-python-wrapper]$ oc get hyperconverged kubevirt-hyperconverged -n openshift-cnv -o yaml
apiVersion: hco.kubevirt.io/v1beta1
kind: HyperConverged
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"hco.kubevirt.io/v1beta1","kind":"HyperConverged","metadata":{"annotations":{},"name":"kubevirt-hyperconverged","namespace":"openshift-cnv"},"spec":{"bareMetalPlatform":true}}
  finalizers:
  ...
  name: kubevirt-hyperconverged
  namespace: openshift-cnv
spec:
  bareMetalPlatform: true
  infra: {}
  version: v2.5.0
  workloads:
    nodePlacement:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: nodeType
                operator: In
                values:
                - infra

[kbidarka@localhost ocp-python-wrapper]$ oc get pods -n openshift-cnv | grep labeller
[kbidarka@localhost ocp-python-wrapper]$

After reverting the hyperconverged CR
----------------------------------------
[kbidarka@localhost ocp-python-wrapper]$ oc get pods -n openshift-cnv | grep labeller
kubevirt-node-labeller-c4xth   1/1   Running   0   23s
kubevirt-node-labeller-mssgf   1/1   Running   0   23s
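A quicker way to read the same result is to pull the affinity stanza straight off the workload objects (the daemonset name is assumed from the pod names above):

$ oc get deployment virt-template-validator -n openshift-cnv -o jsonpath='{.spec.template.spec.affinity}'
$ oc get daemonset kubevirt-node-labeller -n openshift-cnv -o jsonpath='{.spec.template.spec.affinity}'

Both should come back empty once nodePlacement has been removed from the HyperConverged CR and the pods have rolled over.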
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Virtualization 2.5.0 Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:5127