Bug 2166512 - VM can't start because of requests/limits CPU number mismatch after adding the overallocated one
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Virtualization
Version: 4.12.1
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.13.0
Assignee: lpivarc
QA Contact: Akriti Gupta
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2023-02-02 04:24 UTC by Gu Nini
Modified: 2023-12-09 04:25 UTC

Fixed In Version: hco-bundle-registry-container-v4.13.0.rhel9-1639
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-05-18 02:57:25 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | kubevirt kubevirt pull 9163 | 0 | None | open | bug: fix the requests/limits CPU number mismatch for VMs with isolatedEmulatorThread | 2023-02-14 08:42:43 UTC
Github | kubevirt kubevirt pull 9311 | 0 | None | open | [release-0.59] bug: fix the requests/limits CPU number mismatch for VMs with isolatedEmulatorThread | 2023-02-23 17:44:50 UTC
Red Hat Issue Tracker | CNV-24922 | 0 | None | None | None | 2023-02-02 04:25:38 UTC
Red Hat Product Errata | RHSA-2023:3205 | 0 | None | None | None | 2023-05-18 02:57:39 UTC

Description Gu Nini 2023-02-02 04:24:32 UTC
Description of problem:
After upgrading the environment to OCP + CNV 4.12.1, VMs that specify resource limits/requests can't start.

        memory:
          hugepages:
            pageSize: 1Gi
        resources:
          limits:
            cpu: "4"
            memory: 8Gi
          requests:
            cpu: "4"
            memory: 8Gi


The virtualmachine-controller logs the following:

Error creating pod: Pod "virt-launcher-rhel8-ngu-2-5gmm4" is invalid: spec.containers[0].resources.requests: Invalid value: "5": must be less than or equal to cpu limit


The overhead (1 CPU) was added to the virt-launcher pod (4 + 1 = 5). However, it was added only to the requests and not to the limits, and the validating webhook denies a VM spec whose CPU requests and limits differ:

Error "spec.template.spec.domain.resources.requests.cpu or spec.template.spec.domain.resources.limits.cpu must be equal when DedicatedCPUPlacement is true " for field "spec.template.spec.domain.cpu.dedicatedCpuPlacement".
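The interaction between the two checks quoted above can be illustrated with a minimal sketch. This is a hypothetical model in Python, not KubeVirt code: the names CpuResources, add_overhead_to_requests_only, pod_is_valid and webhook_accepts are invented for illustration, and the 1-CPU overhead is taken from the 4 + 1 = 5 figure in this report.

```python
from dataclasses import dataclass


@dataclass
class CpuResources:
    """CPU requests and limits in whole cores (hypothetical model)."""
    requests: int
    limits: int


# Assumption: one extra CPU of overhead, as in the report (4 + 1 = 5).
OVERHEAD_CPUS = 1


def add_overhead_to_requests_only(vm: CpuResources) -> CpuResources:
    """Model of the reported behavior: only the request grows."""
    return CpuResources(requests=vm.requests + OVERHEAD_CPUS, limits=vm.limits)


def pod_is_valid(pod: CpuResources) -> bool:
    """Pod validation: a CPU request must not exceed the CPU limit."""
    return pod.requests <= pod.limits


def webhook_accepts(vm: CpuResources, dedicated_cpu_placement: bool) -> bool:
    """Model of the webhook rule: requests must equal limits for dedicated CPUs."""
    return not dedicated_cpu_placement or vm.requests == vm.limits


vm = CpuResources(requests=4, limits=4)
pod = add_overhead_to_requests_only(vm)
print("webhook accepts VM:", webhook_accepts(vm, dedicated_cpu_placement=True))
print("pod resources:", pod, "valid:", pod_is_valid(pod))
```

Under these assumptions the VM spec passes the webhook (4 == 4), but the generated pod fails validation (5 > 4), and raising only the VM's limit to compensate would be rejected by the webhook.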


Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1. Upgrade OCP + CNV env from version 4.11.* to 4.12.1.
2. Try to start a VM

Actual results:
The VM stays in 'starting' status. The log shows that the requests/limits CPU numbers mismatch after the overallocated CPU is added.


Expected results:
The overallocated CPU should be added to both the requests and the limits of the VMI pod.
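As a rough illustration of the expected behavior, the sketch below applies the overhead to requests and limits alike. It is a hypothetical Python model, not the actual fix from the linked pull requests: the function add_overhead and its overhead_cpus parameter are invented here, and CPU values are assumed to be whole-core strings such as "4".

```python
def add_overhead(resources: dict, overhead_cpus: int = 1) -> dict:
    """Return pod CPU resources with the overhead applied to requests and limits alike.

    Assumes whole-core CPU strings such as "4" (hypothetical simplification).
    """
    return {
        section: {"cpu": str(int(resources[section]["cpu"]) + overhead_cpus)}
        for section in ("requests", "limits")
    }


vm_resources = {"requests": {"cpu": "4"}, "limits": {"cpu": "4"}}
pod_resources = add_overhead(vm_resources)
print(pod_resources)
```

With both values raised together, the pod's request can never exceed its limit because of the overhead alone.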
Additional info:

Comment 3 Akriti Gupta 2023-03-13 12:15:27 UTC
Verified on v4.13.0.rhel9-1689

[akriti@fedora cnv-tests]$ oc get vmi
NAME      AGE     PHASE     IP            NODENAME                            READY
example   2m44s   Running   10.129.2.79   virt-akr-413-mdxr5-worker-0-2crmn   True
[akriti@fedora cnv-tests]$ oc get vm example -o json | jq .spec.template.spec.domain.resources
{
  "limits": {
    "cpu": "4",
    "memory": "8Gi"
  },
  "requests": {
    "cpu": "4",
    "memory": "8Gi"
  }
}

VM is running successfully
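The check above could also be scripted. The following is a minimal sketch, assuming the jq output shown in this comment is available as a string; the helper names cpu_to_cores and cpu_requests_match_limits are hypothetical. It only compares the VM's own CPU requests and limits and does not inspect the virt-launcher pod.

```python
import json
from decimal import Decimal


def cpu_to_cores(quantity: str) -> Decimal:
    """Convert a Kubernetes CPU quantity such as "4" or "500m" to cores."""
    if quantity.endswith("m"):
        return Decimal(quantity[:-1]) / 1000
    return Decimal(quantity)


def cpu_requests_match_limits(resources: dict) -> bool:
    """True when the CPU request equals the CPU limit."""
    requests = cpu_to_cores(resources["requests"]["cpu"])
    limits = cpu_to_cores(resources["limits"]["cpu"])
    return requests == limits


# The jq output from this comment, pasted as a string.
JQ_OUTPUT = """
{
  "limits": {"cpu": "4", "memory": "8Gi"},
  "requests": {"cpu": "4", "memory": "8Gi"}
}
"""

resources = json.loads(JQ_OUTPUT)
print("CPU requests match limits:", cpu_requests_match_limits(resources))
```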

Comment 6 errata-xmlrpc 2023-05-18 02:57:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 4.13.0 Images security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:3205

Comment 7 Dan Kenigsberg 2023-08-10 13:09:42 UTC
(In reply to Marcelo Tosatti from comment #5)
> (In reply to sgott from comment #2)
> > Targeting this to 4.13, but we will certainly need to backport it once
> > fixed.
> 
> Yes, it would be good to have backports for 4.12.z on this.
> Hit this on a customer's PoC, and also have the documentation:
> 
> 
> https://access.redhat.com/solutions/7007632
> 

Marcelo, can you start this by proposing a backporting PR?

Kaedar, would you file a 4.12 backport BZ?

Comment 8 Red Hat Bugzilla 2023-12-09 04:25:04 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days

