Bug 2192858
| Summary: | kubevirt-job pod ignores node placement configuration | | |
|---|---|---|---|
| Product: | Container Native Virtualization (CNV) | Reporter: | nijin ashok <nashok> |
| Component: | Virtualization | Assignee: | Prita Narayan <prnaraya> |
| Status: | CLOSED ERRATA | QA Contact: | Denys Shchedrivyi <dshchedr> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 4.12.2 | CC: | acardace, dshchedr, gveitmic, kbidarka, prnaraya |
| Target Milestone: | --- | | |
| Target Release: | 4.12.4 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | v4.12.4-35 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-06-27 19:10:40 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description nijin ashok 2023-05-03 10:50:10 UTC

I verified on CNV-v4.12.4-52 - it seems not everything works as expected. Two questions to clarify:
1. .spec.infra.nodePlacement.tolerations was successfully applied to virt-controller and virt-api, but *not applied to virt-operator*
 steps: 
 1) set taints on the nodes:
  $ oc adm taint nodes cnv-qe-infra-04.cnvqe3.lab.eng.rdu2.redhat.com key1=value1:NoSchedule
 
 2) added tolerations to the HCO CR:
   infra:
    nodePlacement:
      tolerations:
      - effect: NoSchedule
        key: key1
        operator: Equal
        value: value1
  3) checked that it was propagated to the KubeVirt CR:
   $ oc get kubevirt -o json | jq .items[0].spec.infra
   {
     "nodePlacement": {
       "tolerations": [
         {
           "effect": "NoSchedule",
           "key": "key1",
           "operator": "Equal",
           "value": "value1"
         }
       ]
     }
   }
  4) checked the virt pods:
 VIRT-API and VIRT-CONTROLLER pods have it:
   $ oc get pod virt-api-769645b799-2z5kp -o json | jq .spec.tolerations
      {
        "effect": "NoSchedule",
        "key": "key1",
        "operator": "Equal",
        "value": "value1"
      },
 
  VIRT-OPERATOR does not have it:
   $ oc get pod virt-operator-6c675b7888-9vlsx -o json | jq .spec.tolerations
   [
     {
       "key": "CriticalAddonsOnly",
       "operator": "Exists"
     },
     {
       "effect": "NoExecute",
       "key": "node.kubernetes.io/not-ready",
       "operator": "Exists",
       "tolerationSeconds": 300
     },
     {
       "effect": "NoExecute",
       "key": "node.kubernetes.io/unreachable",
       "operator": "Exists",
       "tolerationSeconds": 300
     },
     {
       "effect": "NoSchedule",
       "key": "node.kubernetes.io/memory-pressure",
       "operator": "Exists"
     }
   ]
  As a result, if I remove the virt-operator pod, it is re-created and gets stuck in the Pending state:
    virt-operator-6c675b7888-rt8sj                         0/1     Pending   0          40s
 The virt-operator pods should probably also include the necessary tolerations.
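 For reference, the missing tolerations can also be confirmed on the Deployment itself rather than on an individual pod - a minimal check, assuming the default openshift-cnv namespace:
   $ oc -n openshift-cnv get deployment virt-operator -o json | jq .spec.template.spec.tolerations
 Since that Deployment is owned by the ClusterServiceVersion, tolerations patched onto it directly would likely be reconciled away by OLM, so a fix would have to land in the Deployment spec that HCO/OLM installs.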
2. And another question, about .spec.workloads:
 When I set workloads tolerations on the HCO, they are applied to the virt-handler pods but not to the virt-launcher pods, so no VMs can be created on that node:
 $ oc describe pod virt-launcher-vm-label-mdfqt
 .
   Warning  FailedScheduling  22m   default-scheduler  0/3 nodes are available: 3 node(s) had untolerated taint {key1: value1}. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling.
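 As a side note, tolerations can also be set per-VM: the VirtualMachine spec accepts standard tolerations under spec.template.spec, and KubeVirt copies them into the virt-launcher pod. A minimal sketch, reusing the taint from step 1 and assuming a VM named vm-label:
   apiVersion: kubevirt.io/v1
   kind: VirtualMachine
   metadata:
     name: vm-label
   spec:
     running: true
     template:
       spec:
         domain:
           devices: {}      # minimal stub; a real VM also needs disks etc.
           memory:
             guest: 1Gi
         tolerations:       # copied into the virt-launcher pod spec
         - effect: NoSchedule
           key: key1
           operator: Equal
           value: value1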
(In reply to Denys Shchedrivyi from comment #1)

> 1. .spec.infra.nodePlacement.tolerations was successfully applied to virt-controller and virt-api, but *not applied to virt-operator*
> [...]
> The virt-operator pods should probably also include the necessary tolerations.

You're right, but I think that's trickier than expected: KubeVirt starts "living" when virt-operator starts running, so by that time the virt-operator pods have already been scheduled and placed onto nodes (before the KubeVirt CR is created). If we want to do this, HCO needs to install virt-operator with the scheduling hints already in place in its deployment spec, so I guess a different HCO bug must be filed for this to happen.

> 2. And another question, about .spec.workloads:
> When I set workloads tolerations on the HCO, they are applied to the virt-handler pods but not to the virt-launcher pods, so no VMs can be created on that node.
> [...]

This sounds like a bug, though a different one. In terms of this bug, I'd say everything works as expected, even with the shortcomings you highlighted.

Based on comment #2, closing this bug. For the virt-launcher pods, a new one was opened: bug 2216276.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Virtualization 4.12.4 Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2023:3889