Created attachment 1908863 [details]
virt-handler.log

Description of problem:
After the environment was upgraded from OCP 4.10.26 to OCP 4.11.1 / OCP-V 4.11.0, the real-time VM can't boot because it now runs as a non-root VM by default. See the attached 'virt-handler.log' for details.

Vladik Romanovsky summarized the root cause of the issue as follows:
'''
CNV runs non-root VMs by default now, this removes cap_sys_nice from the launchers.
The problem is that CNV makes this switch before upstream KubeVirt did:
https://github.com/kubevirt/kubevirt/blob/782b82aff8adc516d98421466ab9e43835efb89c/pkg/virt-controller/services/rendercontainer.go#L244
'''

Version-Release number of selected component (if applicable):
OpenShift Virtualization: 4.11.0
OpenShift: 4.11.1

How reproducible:
100%

Steps to Reproduce:
1. Upgrade the environment to OCP 4.11.1 / OCP-V 4.11.0.
2. Try to boot a real-time VM that was created before the upgrade.

Actual results:
The VM fails to boot.

Expected results:
The VM boots without issues.

Additional info:
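Note: one way to confirm whether a process (for example, the compute process inside the virt-launcher pod) still holds cap_sys_nice is to decode the CapEff mask from /proc/<pid>/status. The following is a minimal Python sketch, not part of KubeVirt. The helper names are hypothetical, and it assumes CAP_SYS_NICE is capability number 23, as defined in linux/capability.h.

```python
# Capability number for CAP_SYS_NICE, as defined in linux/capability.h.
CAP_SYS_NICE = 23


def parse_cap_eff(status_text):
    """Return the effective capability mask from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            # The value is a hexadecimal bitmask, e.g. "0000000000800000".
            return int(line.split(":", 1)[1].strip(), 16)
    raise ValueError("no CapEff line found")


def has_cap_sys_nice(status_text):
    """True if the CAP_SYS_NICE bit is set in the effective mask."""
    return bool((parse_cap_eff(status_text) >> CAP_SYS_NICE) & 1)


def process_has_cap_sys_nice(pid="self"):
    """Read /proc/<pid>/status (Linux only) and test for CAP_SYS_NICE."""
    with open(f"/proc/{pid}/status") as f:
        return has_cap_sys_nice(f.read())
```

Running process_has_cap_sys_nice() inside the launcher container would be expected to return False for a non-root launcher affected by this bug, since the capability is dropped there.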
*** Bug 2123207 has been marked as a duplicate of this bug. ***
PR raised to address this issue. https://github.com/kubevirt/kubevirt/pull/8750
PR has been merged.
https://kubevirt.io/2021/Running-Realtime-Workloads.html
Verified on v4.13.0.rhel9-1834; a VM with an RT kernel runs successfully.

Steps:
1) Set the worker-rt label on one of the worker nodes.
2) Create an MCP that points to the worker-rt node.
3) Create a PerformanceProfile that enables the RT kernel.
4) Wait for the MCP to complete the update:
> $ oc get mcp
> NAME        CONFIG                                                UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
> worker-rt   rendered-worker-rt-a560b74067dfdce8a670390145f51439   True      False      False      1              1                   1                     0                      129m

The node switched to the RT kernel:
> $ oc get node -o wide
> NAME                                STATUS   ROLES              AGE     VERSION           INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                                                       KERNEL-VERSION                         CONTAINER-RUNTIME
> virt-den-413-l7hvr-worker-0-9r8gw   Ready    worker,worker-rt   6h37m   v1.26.2+dc93b13   192.168.2.82   <none>        Red Hat Enterprise Linux CoreOS 413.92.202303221220-0 (Plow)   5.14.0-284.1.1.rt14.286.el9_2.x86_64   cri-o://1.26.1-10.rhaos4.13.gitcb86088.el9

5) Create a VM with a realtime kernel (following the doc from comment #7):
> $ oc get vm -A
> NAMESPACE   NAME              AGE   STATUS    READY
> test-rt     fedora-realtime   59m   Running   True

> $ oc get pod
> NAME                                  READY   STATUS    RESTARTS   AGE
> virt-launcher-fedora-realtime-75lfr   2/2     Running   0          59m

> [fedora@fedora-realtime ~]$ uname -r
> 5.6.19-300.rt10.2.fc32.ccrma.x86_64+rt
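Note: the kernel checks above (KERNEL-VERSION on the node, uname -r in the guest) can be scripted. The following is a rough Python sketch with a hypothetical helper name. It assumes RT builds can be recognized by a ".rt<N>" component or a "+rt" suffix in the release string, which matches the two releases shown above but is only a naming convention, not a guarantee.

```python
import re

# Heuristic: RT kernel releases seen in this report carry ".rt<N>" or "+rt".
_RT_MARKER = re.compile(r"\.rt\d+|\+rt\b")


def looks_like_rt_kernel(release):
    """Guess from a `uname -r` style string whether the kernel is an RT build."""
    return _RT_MARKER.search(release) is not None


if __name__ == "__main__":
    import platform
    # platform.release() returns the same string as `uname -r` on Linux.
    print(platform.release(), looks_like_rt_kernel(platform.release()))
```

This is only a string check on the release name and does not inspect the kernel's actual preemption configuration.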
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Virtualization 4.13.0 Images security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2023:3205