Bug 2214362 - VM with containerDisk does not run when "Auto CPU Limit" feature enabled and ResourceQuota has limits.cpu=1
Keywords:
Status: CLOSED DUPLICATE of bug 2221801
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Virtualization
Version: 4.14.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.14.0
Assignee: Jed Lejosne
QA Contact: Kedar Bidarkar
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-06-12 18:29 UTC by Denys Shchedrivyi
Modified: 2023-08-21 13:18 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-08-21 13:17:27 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker CNV-29787 0 None None None 2023-06-12 18:45:56 UTC

Description Denys Shchedrivyi 2023-06-12 18:29:06 UTC
Description of problem:
 After enabling autoCPULimitNamespaceLabelSelector and creating a ResourceQuota with `limits.cpu=1`, VMs with container disks cannot run:
 
> Warning  FailedCreate      3s (x2 over 6s)  virtualmachine-controller    (combined from similar events): Error creating pod: pods "virt-launcher-vm-fedora-hjn9j" is forbidden: exceeded quota: quota-limit-1, requested: limits.cpu=1010m, used: limits.cpu=0, limited: limits.cpu=1

autoCPULimitNamespaceLabelSelector automatically sets the CPU limit to "1" for the "compute" container in the pod, but the "containerDisk" container requires an additional "10m" of CPU, so the pod exceeds the allowed limit and can't be created.
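
The arithmetic behind the rejection can be sketched as below. This is only an illustration in Python, not KubeVirt or Kubernetes code: the function and constant names are hypothetical, and the millicore values are the ones shown in the event above.

```python
# Hypothetical illustration of the quota check; not KubeVirt or Kubernetes code.
COMPUTE_LIMIT_M = 1000       # limit auto-set on the "compute" container (1 CPU)
CONTAINER_DISK_LIMIT_M = 10  # extra limit for one containerDisk, per the event


def pod_cpu_limit_m(container_disks: int) -> int:
    """Sum of CPU limits (in millicores) across the pod's containers."""
    return COMPUTE_LIMIT_M + container_disks * CONTAINER_DISK_LIMIT_M


def fits_quota(requested_m: int, used_m: int, hard_m: int) -> bool:
    """The pod is admitted only if used + requested stays within the hard limit."""
    return used_m + requested_m <= hard_m


requested = pod_cpu_limit_m(container_disks=1)
print(requested)                                     # 1010
print(fits_quota(requested, used_m=0, hard_m=1000))  # False
```

With `limits.cpu=1` (1000m) the pod misses the quota by exactly the 10m that the containerDisk adds.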


Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1. Enable autoCPULimitNamespaceLabelSelector
2. Create a ResourceQuota with a CPU limit of 1
3. Create a VM with a container disk

Actual results:
 The VMI is stuck in the Pending state because the pod can't be created

Expected results:
 The VMI starts successfully

Additional info:
 Also, due to this limitation, if I set the CPU limit to 5, for example, I can only run 4 VMs with container disks in that namespace.

> $  oc get vmi
> NAME          AGE     PHASE     IP             NODENAME                            READY
> vm-fedora     5m55s   Running   10.131.1.191   virt-den-414-zzpcr-worker-0-hk7vb   True
> vm-fedora-2   5m51s   Running   10.128.2.69    virt-den-414-zzpcr-worker-0-b54wh   True
> vm-fedora-3   5m40s   Running   10.129.2.128   virt-den-414-zzpcr-worker-0-hp7rc   True
> vm-fedora-4   41s     Running   10.128.2.70    virt-den-414-zzpcr-worker-0-b54wh   True
> vm-fedora-5   9s      Pending                                                      False

> $ oc get resourcequota
> NAME        AGE     REQUEST                LIMIT
> cpi-quota   7m20s   requests.cpu: 404m/1   limits.cpu: 4040m/5

Comment 1 Jed Lejosne 2023-06-20 20:17:28 UTC
This is a valid concern, and a nice catch!
The auto CPU limit feature was indeed created to work together with quotas to allow admins to control how many vCPUs users can create.
And yes, a quota of 1 sounds like users should be allowed to create 1 VM with 1 CPU.
However, in reality, more things might require CPU resources, like containerdisks as pointed out here. Disk hotplug is another concern, and maybe other things...

I don't see a viable way to address that in the code, and it's probably something we need to defer to documentation.
Most admins probably know if users will use container-disks/disk-hotplug and can account for it in their quota (10m per container-disk, 100m for disk-hotplug).
Otherwise, for each vCPU admins want to allocate their users, they can add 1.2 (1200m) to the quota (instead of 1), which should be plenty (enables disk-hotplug + 10 containerdisks per VM).
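
As a rough sketch of that sizing rule (Python, for illustration only): the names below are hypothetical, and the per-feature overheads are the figures quoted in this comment, 10m per container-disk and 100m for disk-hotplug.

```python
# Hypothetical quota sizing helper based on the figures in this comment.
CONTAINER_DISK_M = 10  # extra CPU limit per container-disk
HOTPLUG_M = 100        # extra CPU limit when disk-hotplug is used


def overhead_m(container_disks: int, hotplug: bool) -> int:
    """Extra CPU limit (millicores) a VM pod needs beyond its vCPUs."""
    return container_disks * CONTAINER_DISK_M + (HOTPLUG_M if hotplug else 0)


def quota_needed_m(vms: int, vcpus_per_vm: int,
                   container_disks: int, hotplug: bool) -> int:
    """limits.cpu quota (millicores) needed to run `vms` identical VMs."""
    per_vm = vcpus_per_vm * 1000 + overhead_m(container_disks, hotplug)
    return vms * per_vm


print(overhead_m(10, True))            # 200
print(quota_needed_m(1, 1, 10, True))  # 1200
print(quota_needed_m(5, 1, 1, False))  # 5050
```

The overhead in this sketch is counted per VM rather than per vCPU, so budgeting 1200m per vCPU is conservative for VMs with more than one vCPU.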

Comment 2 Antonio Cardace 2023-07-31 11:50:11 UTC
@jlejosne How do we want to proceed here? Is this something we can address in KubeVirt?

Comment 3 Jed Lejosne 2023-07-31 12:53:25 UTC
The second paragraph above should mostly answer your question.
In extreme use-cases, like when using auto CPU limit + a really small CPUAllocationRatio + few vCPUs + disk-hotplug/containerdisks, virt-launcher will need more than 1 host CPU per vCPU.
Documentation should therefore tell cluster admins that creating CPU quotas equal to the number of vCPUs used by VMs may not be enough.

We can make this simple with some quick math though. A 1 vCPU VM that uses disk-hotplug and 10 containerdisks (worst case scenario) will need 200m extra CPU.
So, as long as CPUAllocationRatio is set at or above... 2(?) everything will work fine. (If I understand it right, 2 means 500m CPU per vCPU, so plenty left for overhead).
To keep the documentation simple, we could just say "do not use auto CPU limits with a CPUAllocationRatio of 1". We could also enforce that in HCO when validating the CR.
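
The "quick math" can be written out as a small sketch. It simply encodes the reasoning as stated in this comment (which is itself tentative): a CPUAllocationRatio of N is taken to mean 1000/N millicores per vCPU, and the quota is assumed to be 1 CPU per vCPU. The function names are hypothetical, and this is not how KubeVirt computes anything internally.

```python
# Hypothetical sketch of the reasoning in this comment; not KubeVirt code.
WORST_CASE_OVERHEAD_M = 200  # disk-hotplug (100m) + 10 containerdisks (10m each)


def cpu_per_vcpu_m(allocation_ratio: int) -> int:
    """Millicores per vCPU, assuming ratio N means 1000/N millicores."""
    return 1000 // allocation_ratio


def headroom_m(allocation_ratio: int, vcpus: int = 1) -> int:
    """CPU left under a quota of 1 CPU per vCPU after the vCPUs are counted."""
    return vcpus * 1000 - vcpus * cpu_per_vcpu_m(allocation_ratio)


def overhead_fits(allocation_ratio: int, vcpus: int = 1) -> bool:
    """True if the worst-case overhead fits into the remaining headroom."""
    return headroom_m(allocation_ratio, vcpus) >= WORST_CASE_OVERHEAD_M


print(overhead_fits(1))  # False: no headroom at all
print(overhead_fits(2))  # True: 500m of headroom for 200m of overhead
```

Under these assumptions only a ratio of 1 leaves no room for the overhead of a 1 vCPU VM, which is what the proposed documentation note would rule out.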

Comment 4 Jed Lejosne 2023-08-21 13:17:27 UTC

*** This bug has been marked as a duplicate of bug 2221801 ***

