Bug 2203291 - kubevirt should allow runtimeclass to be configured in a pod
Summary: kubevirt should allow runtimeclass to be configured in a pod
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Installation
Version: 4.12.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.14.0
Assignee: Simone Tiraboschi
QA Contact: SATHEESARAN
URL:
Whiteboard:
Duplicates: 2185411 2192636
Depends On:
Blocks: 2217910 2217913
Reported: 2023-05-11 19:09 UTC by Marcelo Tosatti
Modified: 2023-11-08 14:06 UTC (History)

Fixed In Version: v4.14.0.rhel9-863
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 2217910
Environment:
Last Closed: 2023-11-08 14:05:46 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | kubevirt hyperconverged-cluster-operator pull 2370 | 0 | None | Merged | Expose defaultRuntimeClass | 2023-06-09 16:19:43 UTC
Red Hat Issue Tracker | CNV-28792 | 0 | None | None | None | 2023-05-11 19:10:27 UTC
Red Hat Product Errata | RHSA-2023:6817 | 0 | None | None | None | 2023-11-08 14:06:11 UTC

Description Marcelo Tosatti 2023-05-11 19:09:55 UTC
Description of problem:

For DPDK-type applications, the vCPU should not be interrupted or throttled
by cgroup CPU quota limits. By default, k8s sets the CPU quota to a positive
integer value, which throttles the vCPU.

To disable cpu quota for pods it is necessary to annotate the pod with

     cpu-quota.crio.io: "disable"

and set runtimeClassName to the performance profile's runtimeClassName (as described
in the "Disabling CPU CFS quota" section of https://docs.openshift.com/container-platform/4.12/scalability_and_performance/cnf-low-latency-tuning.html).

However, KubeVirt does not support setting runtimeClassName.
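
For reference, here is a minimal sketch of what the two settings look like on a plain pod, built as a Python dict and printed as JSON. Only the annotation key/value and the use of runtimeClassName come from the description above. The pod name, image, resource values, and the runtime class name "performance-example-profile" are hypothetical placeholders.

```python
import json


def build_pod_manifest(name, runtime_class, image):
    """Return a pod manifest that asks CRI-O to disable the CFS quota."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": name,
            # Annotation quoted in the bug description.
            "annotations": {"cpu-quota.crio.io": "disable"},
        },
        "spec": {
            # Runtime class of the performance profile (name is an assumption).
            "runtimeClassName": runtime_class,
            "containers": [
                {
                    "name": "app",
                    "image": image,
                    # Illustrative values only: equal requests and limits.
                    "resources": {
                        "requests": {"cpu": "2", "memory": "1Gi"},
                        "limits": {"cpu": "2", "memory": "1Gi"},
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    manifest = build_pod_manifest(
        "dpdk-app",
        "performance-example-profile",
        "registry.example.com/dpdk-app:latest",
    )
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied like any other manifest. The request in this bug is for the virt-launcher pod of a VM to end up with the equivalent runtimeClassName.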

In a discussion with Vladik, it appeared that an acceptable way to let pods
set runtimeClassName would be to create a scheduling policy for VMs,
similar to migration policies.


Version-Release number of selected component (if applicable):

4.12

How reproducible:

Always

Steps to Reproduce:
1. Start a KubeVirt VM with the cpu-quota.crio.io: "disable" annotation and runtimeClassName set (per the CNF low latency tuning document above).

Actual results:

cpu.cfs_quota_us value in the pod cgroup is not -1.

Expected results:

cpu.cfs_quota_us value in the pod cgroup is -1.
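
A minimal sketch of how the expected result could be checked once the content of the quota file has been read from the pod cgroup. In cgroup v1 the CFS quota file is named cpu.cfs_quota_us and holds -1 when no quota is set; the cgroup v2 branch (cpu.max, whose first field is "max" when unlimited) is added here as an assumption about the node and is not part of the original report. The function name is hypothetical.

```python
def quota_disabled(filename, content):
    """Return True if the cgroup file content means 'no CPU quota'."""
    value = content.strip()
    if filename == "cpu.cfs_quota_us":
        # cgroup v1: a single integer, -1 means unlimited.
        return int(value) == -1
    if filename == "cpu.max":
        # cgroup v2: "<quota> <period>", quota is "max" when unlimited.
        return value.split()[0] == "max"
    raise ValueError(f"unsupported cgroup file: {filename}")


if __name__ == "__main__":
    # Sample file contents, not taken from a real system.
    for fname, text in [("cpu.cfs_quota_us", "-1\n"), ("cpu.max", "200000 100000\n")]:
        print(fname, "quota disabled:", quota_disabled(fname, text))
```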


Additional info:

Comment 2 Marcelo Tosatti 2023-05-12 12:25:02 UTC
*** Bug 2192636 has been marked as a duplicate of this bug. ***

Comment 6 Kedar Bidarkar 2023-05-31 12:14:05 UTC
As per Petr in comment 5, it appears this needs an update in HCO first.

Simone, could you please take a look?

Comment 10 Simone Tiraboschi 2023-06-05 09:43:27 UTC
(In reply to Kedar Bidarkar from comment #6)
> As per Petr in comment 5, it appears this needs an update in HCO first.
> 
> Simone, could you please take a look?

Sure, a few questions (for the sake of documenting the new configuration option inline):
1. can the value of defaultRuntimeClass be amended as a day-two operation when we have existing VMIs?
2. if so, what's the impact on existing VMIs?
3. is it going to affect live migration with the target pod getting configured with the new value for defaultRuntimeClass?
4. is 4.14 enough or should we backport this down to 4.13?

Comment 11 Kedar Bidarkar 2023-06-06 10:05:25 UTC
Petr, I feel you could help answer Simone's questions from comment 10.

Comment 15 Petr Horáček 2023-06-15 12:21:08 UTC
*** Bug 2185411 has been marked as a duplicate of this bug. ***

Comment 16 Ivan 2023-06-20 07:45:57 UTC
@stirabos, regarding your 4th question in comment #10:

4. is 4.14 enough or should we backport this down to 4.13? --> The end partner needs this fix backported to 4.12, as that is the version they will go live with in September.

Can you please let me know if you need me to file it or you can duplicate this one for 4.12?

Thanks in advance!

Comment 17 Simone Tiraboschi 2023-06-26 09:58:10 UTC
(In reply to Ivan from comment #16)
> Can you please let me know if you need me to file it or you can duplicate
> this one for 4.12?

OK, thanks.
We will handle the BZ and the backport process on our side.

Comment 18 SATHEESARAN 2023-07-10 12:03:49 UTC
Verified with CNV v4.14 interim build (HCO bundle: v4.14.0.rhel9-1154)

tl;dr:
A new config option, defaultRuntimeClass, is introduced and it gets propagated to KubeVirt and the VMI.

Validated with the following test cases:
1. hco.spec.defaultRuntimeClass and kubevirt.spec.defaultRuntimeClass provide helpful information
for understanding the new option 'defaultRuntimeClass'.

2. hco.spec.defaultRuntimeClass validates that the input value is a string.
Boolean or numerical values were rejected, as expected.

3. When hco.spec.defaultRuntimeClass is set, the value propagates as expected to KubeVirt and the VMI.
For the value to propagate to the VMI, the performance profile has to be created first, and the same
then set in hco.spec.defaultRuntimeClass.

4. When hco.spec.defaultRuntimeClass is set, it affects only newly created, restarted, or
migrated VMs. Running VMs are not affected by the new option.

With the above observations, marking this bug as VERIFIED
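
For illustration, a rough sketch of how the new option might be set from a script: it builds a JSON merge patch for spec.defaultRuntimeClass and prints an oc patch command line. The resource name kubevirt-hyperconverged, the namespace openshift-cnv, and the runtime class name are assumptions about a typical installation, and the helper names are hypothetical. The string check only mirrors test case 2 on the client side; it is not the operator's own validation.

```python
import json
import shlex


def build_hco_patch(runtime_class):
    """Return a merge patch that sets spec.defaultRuntimeClass."""
    if not isinstance(runtime_class, str):
        # Booleans and numbers are not valid values for this field.
        raise TypeError("defaultRuntimeClass must be a string")
    return {"spec": {"defaultRuntimeClass": runtime_class}}


def patch_command(runtime_class,
                  name="kubevirt-hyperconverged",  # assumed CR name
                  namespace="openshift-cnv"):      # assumed namespace
    """Return a shell-quoted 'oc patch' command line as a string."""
    patch = json.dumps(build_hco_patch(runtime_class))
    args = ["oc", "patch", "hyperconverged", name,
            "-n", namespace, "--type", "merge", "-p", patch]
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    # "performance-example-profile" is a placeholder runtime class name.
    print(patch_command("performance-example-profile"))
```

Per test case 4 above, VMs that are already running would only pick up the value after a restart or migration.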

Comment 20 errata-xmlrpc 2023-11-08 14:05:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Virtualization 4.14.0 Images security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6817

