After applying workload tolerations to the HCO, they were applied to virt-handler pods but not to virt-launcher pods. As a result, no VMs can run on the tainted nodes.

# tolerations in the KubeVirt CR:
$ oc get kubevirt -n openshift-cnv -o json | jq .items[0].spec.workloads
{
  "nodePlacement": {
    "tolerations": [
      {
        "effect": "NoSchedule",
        "key": "key1",
        "operator": "Equal",
        "value": "value1"
      }
    ]
  }
}

# the virt-handler pod has the same toleration and can successfully run on the tainted nodes:
$ oc get pod -n openshift-cnv virt-handler-zlm4f -o json | jq .spec.tolerations
[
  ...
  {
    "effect": "NoSchedule",
    "key": "key1",
    "operator": "Equal",
    "value": "value1"
  },

# virt-launcher pods do not have the same toleration, so they cannot run on those nodes:
$ oc describe pod ...
Events:
  Type     Reason            Age  From               Message
  ----     ------            ---  ----               -------
  Warning  FailedScheduling  10s  default-scheduler  0/3 nodes are available: 3 node(s) had untolerated taint {key1: value1}. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling.
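For completeness, the nodes in this setup carry a taint matching the toleration above (as the scheduler events show). A taint like that would typically have been applied with something along these lines (the node name is just a placeholder):

# taint a worker node so that only tolerating pods can schedule there
$ oc adm taint nodes <node-name> key1=value1:NoSchedule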
From looking at the code, it seems pretty clear that the KubeVirt CR's nodePlacement indeed affects core components only (e.g. virt-handler, virt-controller, etc.) and does not affect virt-launcher pods. However, I don't see clear evidence that this field was ever meant to affect virt-launcher pods. The upstream documentation [1] is pretty vague, saying "nodePlacement describes scheduling configuration for specific KubeVirt components". The doc for the Replicas field [2] (another field in the same ComponentConfig struct) says: "replicas indicates how many replicas should be created for each KubeVirt infrastructure component (like virt-api or virt-controller)".

Are we sure this field is supposed to affect virt-launchers? Was it documented somewhere? Was it asked for by some user? Is there a clear use case here? I would also like to note that we already have a NodeSelector field in the KubeVirt API [3] that can set node selectors for virt-launchers.

So unless I'm missing something, I think this can be closed as not a bug. However, we can indeed improve the documentation.

[1] https://github.com/kubevirt/kubevirt/blob/v1.0.0/staging/src/kubevirt.io/api/core/v1/componentconfig.go#L39
[2] https://github.com/kubevirt/kubevirt/blob/v1.0.0/staging/src/kubevirt.io/api/core/v1/componentconfig.go#L47
[3] https://github.com/kubevirt/kubevirt/blob/v1.0.0/staging/src/kubevirt.io/api/core/v1/types.go#L2543
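To illustrate the NodeSelector point above, here is a minimal sketch of constraining a single VM's virt-launcher pod with a node selector, assuming this maps to the nodeSelector that can be set per VM under .spec.template.spec (the VM name and label key/value are made up for the example, and the manifest is trimmed to the relevant fields):

apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: example-vm-nodeselector
spec:
  template:
    spec:
      # the virt-launcher pod will only be scheduled on nodes carrying this label
      nodeSelector:
        example.com/zone: zone-a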
Thanks Itamar, I think you are correct - the workloads tolerations affect virt-handler pods only; for VMs we have to set tolerations in the VM spec:
https://docs.openshift.com/container-platform/4.13/virt/virtual_machines/advanced_vm_management/virt-specifying-nodes-for-vms.html#virt-example-vm-node-placement-tolerations_virt-specifying-nodes-for-vms

By the way, the example in that doc does not work. The toleration should be set under .spec.template.spec, so it should be:

metadata:
  name: example-vm-tolerations
apiVersion: kubevirt.io/v1
kind: VirtualMachine
spec:
  template:
    spec:
      tolerations:
      - key: "key"
        operator: "Equal"
        value: "virtualization"
        effect: "NoSchedule"

I think we can close this bug as not a bug; we just need to update the example in the documentation.
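With the toleration placed under .spec.template.spec it gets propagated to the VM's virt-launcher pod, which can be checked the same way as for virt-handler; the toleration from the VM spec should show up there alongside the default ones (the pod name below is just an example):

# check that the toleration was propagated to the virt-launcher pod
$ oc get pod virt-launcher-example-vm-tolerations-xxxxx -o json | jq .spec.tolerations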
Thanks Denys! While I agree this is not a bug, I also think you have a valid point that both the documentation and the API are not very clear. What concerns me here is that with "nodePlacement" you can make the core components schedule on tainted nodes. So virt-handler would be scheduled there, which causes a `kubevirt.io/schedulable` label to appear on that node. But to actually schedule VMs (i.e. virt-launcher pods) on that node, the user would also need to add a toleration at the VM level. This is somewhat related to the discussion that took place here: https://github.com/kubevirt/kubevirt/pull/10169#issuecomment-1651537898. Perhaps an issue can be opened to discuss this, although this is not a bug. Thank you!
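To make the interaction concrete (a rough sketch, reusing the example taint from above): once virt-handler tolerates the taint and runs on the node, the node gets marked as schedulable for VMs, which can be listed with a label selector. A VM will still be kept off that node until its own spec tolerates the taint as well.

# nodes that virt-handler has marked as schedulable for VMs
$ oc get nodes -l kubevirt.io/schedulable=true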