Bug 2152534

Summary: Default CPU request in namespace limitrange takes precedence over the VM's configured vCPU

Product: Container Native Virtualization (CNV)
Component: Virtualization
Version: 4.11.1
Target Release: 4.11.3
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: high
Reporter: nijin ashok <nashok>
Assignee: Igor Bezukh <ibezukh>
QA Contact: Denys Shchedrivyi <dshchedr>
CC: dholler, divshah, ibezukh
Fixed In Version: hco-bundle-v4.11.3-21
Cloned As: 2164431
Bug Blocks: 2164431, 2164432
Type: Bug
Last Closed: 2023-05-18 02:56:36 UTC

Description nijin ashok 2022-12-12 09:24:48 UTC
Description of problem:

The namespace has a default CPU request of 100m configured via a limitrange:

~~~
oc describe limits
Name:       resource-limits
Namespace:  default
Type        Resource  Min  Max  Default Request  Default Limit  Max Limit/Request Ratio
----        --------  ---  ---  ---------------  -------------  -----------------------
Container   cpu       -    2    100m             2              -
Container   memory    -    5Gi  4Gi              4Gi            -
~~~

Created a new VM with 10 vCPUs in this namespace from the OpenShift console using the RHEL 7 template.

~~~
oc get vm rhel7-anonymous-flea -o yaml |yq -y '.spec.template.spec.domain.cpu'
cores: 10
sockets: 1
threads: 1
~~~

With the default cpuAllocationRatio of 10, the virt-launcher pod should get a requests.cpu of 1. However, it gets 100m, the default request from the limitrange.

~~~
oc get vmi rhel7-anonymous-flea -o yaml |yq -y '.spec.domain.resources.requests'
cpu: 100m
memory: 2Gi

oc get pod virt-launcher-rhel7-anonymous-flea-d54ls -o yaml|yq -y '.spec.containers[0].resources.requests'
cpu: 100m
devices.kubevirt.io/kvm: '1'
devices.kubevirt.io/tun: '1'
devices.kubevirt.io/vhost-net: '1'
ephemeral-storage: 50M
memory: 2364Mi
~~~ 

So although the VM will see 10 vCPUs, that CPU is not requested in the pod, which can cause CPU starvation in the virtual machine.
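For reference, the expected request is cores × (1 / cpuAllocationRatio): 10 × 0.1 = 1 CPU. A minimal sketch of where the ratio is tunable, assuming the KubeVirt CR's developerConfiguration field (the metadata below is illustrative; in CNV the KubeVirt CR is managed by the HyperConverged operator):

~~~
apiVersion: kubevirt.io/v1
kind: KubeVirt
metadata:
  name: kubevirt
  namespace: kubevirt
spec:
  configuration:
    developerConfiguration:
      # assumption: vCPU-to-host-CPU overcommit ratio; with the default of 10,
      # a 10-vCPU VM is expected to request 10 * (1/10) = 1 CPU
      cpuAllocationRatio: 10
~~~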

If I delete the limitrange, it works as expected:

~~~
oc get pod virt-launcher-rhel7-anonymous-flea-cl85h -o yaml|yq -y '.spec.containers[0].resources.requests'
cpu: '1'
devices.kubevirt.io/kvm: '1'
devices.kubevirt.io/tun: '1'
devices.kubevirt.io/vhost-net: '1'
ephemeral-storage: 50M
memory: 2364Mi
~~~

Version-Release number of selected component (if applicable):

OpenShift Virtualization   4.11.1

How reproducible:

100%

Steps to Reproduce:

1. Configure limitrange in the namespace. 
2. Create a virtual machine from the OpenShift console.
3. Start the virtual machine and check requests.cpu on the VMI and the virt-launcher pod. It will always be the default requests.cpu configured in the limitrange (see the sketch after this list).
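A condensed reproduction sketch (assuming the LimitRange from comment 5 is saved as limitrange.yaml and the VM is named rhel7-anonymous-flea; the label selector relies on the kubevirt.io/domain label that virt-launcher pods carry):

~~~
# apply a LimitRange with a defaultRequest for cpu (e.g. 100m)
oc apply -f limitrange.yaml

# start the VM and compare the effective CPU requests
virtctl start rhel7-anonymous-flea
oc get vmi rhel7-anonymous-flea -o yaml | yq -y '.spec.domain.resources.requests'
oc get pod -l kubevirt.io/domain=rhel7-anonymous-flea -o yaml | \
  yq -y '.items[0].spec.containers[0].resources.requests'
~~~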

Actual results:

The default CPU request in the namespace limitrange takes precedence over the VM's configured vCPU count.

Expected results:

The default CPU request in the limitrange should not apply when vCPUs are configured for the VM. As it stands, all VMs in the namespace get the same cpu.requests regardless of the number of vCPUs configured.

Additional info:
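A possible interim workaround (my suggestion, not from the report): set an explicit requests.cpu on the VM template, since a Kubernetes LimitRange defaultRequest only applies to containers that do not specify their own request. A sketch with illustrative values:

~~~
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: rhel7-anonymous-flea
spec:
  template:
    spec:
      domain:
        cpu:
          cores: 10
        resources:
          requests:
            cpu: "1"   # explicit request; the LimitRange default no longer applies
~~~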

Comment 1 sgott 2022-12-12 21:40:40 UTC
Can you share the limitrange object you used as a reproducer?

Comment 2 sgott 2022-12-13 18:29:01 UTC
Nijin, can you please post manifests for the VMI, the limitrange object and the resulting pod?

Comment 5 Denys Shchedrivyi 2023-02-02 17:31:45 UTC
Verified: the LimitRange no longer takes precedence over the VM's cpu/cores field.


Created LimitRange:

apiVersion: v1
kind: LimitRange
metadata:
  name: resource-limits
  namespace: default
spec:
  limits:
  - default:
      cpu: "2"
      memory: 4Gi
    defaultRequest:
      cpu: 100m
      memory: 4Gi
    max:
      cpu: "2"
      memory: 5Gi
    type: Container


Created a VM with the domain.cpu.cores field set but without resources.requests.cpu:
> $ oc get vmi vm-fedora -o yaml | yq .spec.domain.cpu.cores
> 3

> $ oc get vmi vm-fedora -o yaml | yq .spec.domain.resources.requests
> memory: 1Gi


The pod has the expected CPU value:
> $ oc get pod virt-launcher-vm-fedora-xtvkp -o yaml | yq .spec.containers[0].resources.requests
> cpu: 300m
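This matches the expectation: with the default cpuAllocationRatio of 10, 3 cores × (1/10) CPU per core = 300m.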

Comment 10 Igor Bezukh 2023-05-16 11:35:47 UTC
Yes it should be in

Comment 11 errata-xmlrpc 2023-05-18 02:56:36 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 4.13.0 Images security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:3205