Bug 1985670
| Summary: | virt-launcher fails to create v1 controller cpu for group: Read-only file system | | |
|---|---|---|---|
| Product: | Container Native Virtualization (CNV) | Reporter: | Denis Ollier <dollierp> |
| Component: | Virtualization | Assignee: | Itamar Holder <iholder> |
| Status: | CLOSED ERRATA | QA Contact: | Israel Pinto <ipinto> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 4.9.0 | CC: | cnv-qe-bugs, mtessun, sgott |
| Target Milestone: | --- | | |
| Target Release: | 4.9.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-11-02 15:59:33 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |

Description
Denis Ollier
2021-07-24 19:58:01 UTC
This bug's root cause has been found and is now fixed upstream with this PR: https://github.com/kubevirt/kubevirt/pull/6153.

As explained in the PR itself:

Very long story short:

Background: Multiple libvirtd processes can run for multiple VMs; some of them run as root and some as non-root. The path of QEMU's configuration file differs between root and non-root VMs. For root VMs it is /etc/libvirt/qemu.conf. For non-root VMs it is /var/run/libvirt/qemu.conf (for more info: https://libvirt.org/manpages/libvirtd.html#when-run-as-non-root). In KubeVirt, we also add the string cgroup_controllers = [ ] to the configuration file (here: https://github.com/kubevirt/kubevirt/blob/main/pkg/virt-launcher/virtwrap/util/libvirt_helper.go#L454).

Bug root cause: As can be seen in this PR, the wrong configuration file (the non-root one) was being chosen for root VMs as well.

Bug outcome: Deep in libvirt's code there is an if-else branch (in the virCgroupV1DetectControllers() function) that depends on the number of controllers defined in the QEMU config file. Previously that number was 0, since we had cgroup_controllers = [ ] in the config file. But since this bug causes us to look at the wrong config file (the non-root one), the actual config file doesn't have cgroup_controllers defined at all; therefore libvirt determines the number of controllers to be -1. This change in libvirt's code path breaks KubeVirt and causes VMs to stay in Scheduled mode until they fail. To fix this, we need to make sure the configuration file is set up correctly, as this PR does.

Thanks very much to @dollierp for helping me with this bug!

It has been mitigated by modifying the default /etc/libvirt/qemu.conf file. Removing blocker tags.

Verified with http://cnv-version-explorer.apps.cnv.engineering.redhat.com/BundleDetails?ver=v4.9.0-79. virt-launcher no longer creates the file /var/run/libvirt/qemu.conf for root VMs and overrides the file /etc/libvirt/qemu.conf instead.
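To make the root cause described in the PR easier to follow, here is a minimal Go sketch of the idea. It is not the KubeVirt implementation: the helper names qemuConfPath, ensureCgroupControllers and controllerCount are hypothetical, and controllerCount is only a toy model of the behaviour reported above (0 controllers when cgroup_controllers = [ ] is present, -1 when the setting is absent). The two file paths are the ones quoted in this report.

```go
package main

import (
	"fmt"
	"strings"
)

// Paths as quoted in this bug report.
const (
	rootQemuConf    = "/etc/libvirt/qemu.conf"
	nonRootQemuConf = "/var/run/libvirt/qemu.conf"
)

// qemuConfPath (hypothetical helper) returns the config file that must be
// edited for a root or non-root VM. The bug was equivalent to always
// returning the non-root path.
func qemuConfPath(nonRoot bool) string {
	if nonRoot {
		return nonRootQemuConf
	}
	return rootQemuConf
}

// controllerCount is a toy model of the reported behaviour, not libvirt code:
// it returns the number of entries in an uncommented cgroup_controllers list,
// or -1 when the setting is missing or malformed.
func controllerCount(conf string) int {
	for _, line := range strings.Split(conf, "\n") {
		line = strings.TrimSpace(line)
		if !strings.HasPrefix(line, "cgroup_controllers") {
			continue // commented-out or unrelated line
		}
		open, end := strings.Index(line, "["), strings.LastIndex(line, "]")
		if open < 0 || end < open {
			return -1
		}
		n := 0
		for _, item := range strings.Split(line[open+1:end], ",") {
			if strings.TrimSpace(item) != "" {
				n++
			}
		}
		return n
	}
	return -1
}

// ensureCgroupControllers (hypothetical helper) appends the empty list
// when the setting is not already defined.
func ensureCgroupControllers(conf string) string {
	if controllerCount(conf) >= 0 {
		return conf
	}
	if conf != "" && !strings.HasSuffix(conf, "\n") {
		conf += "\n"
	}
	return conf + "cgroup_controllers = [ ]\n"
}

func main() {
	// A stock-like file where the setting is only present as a comment.
	stock := "#cgroup_controllers = [ \"cpu\", \"devices\" ]\n"
	patched := ensureCgroupControllers(stock)
	fmt.Println(qemuConfPath(false), controllerCount(stock), controllerCount(patched))
}
```

In this model, a root VM whose /etc/libvirt/qemu.conf was never patched sees a count of -1, which corresponds to the failing code path described above; patching the file selected by qemuConfPath(false) brings the count back to 0.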
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Virtualization 4.9.0 Images security and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:4104