Bug 2074559
| Summary: | RFE core scheduling support in libvirt | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Stefan Hajnoczi <stefanha> |
| Component: | libvirt | Assignee: | Michal Privoznik <mprivozn> |
| libvirt sub component: | General | QA Contact: | Luyao Huang <lhuang> |
| Status: | CLOSED ERRATA | Severity: | medium |
| Priority: | medium | CC: | berrange, dzheng, jdenemar, jmario, jsuchane, lmen, mprivozn, smitterl, virt-maint, xuzhang |
| Version: | 9.0 | Keywords: | FutureFeature, Triaged, Upstream |
| Target Milestone: | rc | Target Release: | --- |
| Hardware: | Unspecified | OS: | Unspecified |
| Fixed In Version: | libvirt-8.9.0-1.el9 | Target Upstream Version: | 8.9.0 |
| Last Closed: | 2023-05-09 07:26:11 UTC | Type: | Feature Request |
Description (Stefan Hajnoczi, 2022-04-12 13:43:29 UTC)
The current scenario:

- The default configuration for QEMU is to not assign any CPU affinity mask for vCPUs, I/O threads, or emulator threads (or rather, to explicitly set an all-1s mask).
- The default configuration for hardware is usually to enable SMT.
- The default configuration for the Linux KVM host is to schedule across any host CPUs.
- The machine.slice may restrict VMs to some CPUs.

Given this scenario, the out-of-the-box deployment for KVM is vulnerable to information leakage attacks due to various CPU side-channel / speculative-execution vulnerabilities. In other words, core scheduling should be considered a security fix / mitigation from a KVM point of view.

Until now the only mitigations were to disable SMT (which reduces capacity) or to set CPU affinity for VMs (which impacts VM management flexibility and VM density). In practice neither of these is especially viable, so I expect most customers have done neither and simply ignored the security risks inherent in SMT. Core scheduling finally gives us a viable mitigation that I expect customers / layered products would be willing to deploy in most scenarios.

Ideally we would enable this by default out of the box; however, there are enough caveats in the kernel docs that I think this could be risky in terms of causing performance regressions for customers in some scenarios. So reluctantly we probably need a config knob in libvirt, and have management apps (OSP, CNV, virt-manager, virt-install, Cockpit, etc.) explicitly opt in when provisioning new VMs.

RFC patches posted on the list:
https://listman.redhat.com/archives/libvir-list/2022-May/230902.html

And merged upstream as:

```
ab966b9d31 qemu: Enable SCHED_CORE for vCPUs on hotplug
d942422482 qemu: Enable SCHED_CORE for vCPUs
000477115e qemu: Enable SCHED_CORE for helper processes
279527334d qemu_process: Enable SCHED_CORE for QEMU process
4be75216be qemu_domain: Introduce qemuDomainSchedCoreStart()
6a1500b4ea qemu_conf: Introduce a knob to set SCHED_CORE
bd481a79d8 virCommand: Introduce APIs for core scheduling
c935cead2d virprocess: Core Scheduling support

v8.8.0-169-gab966b9d31
```

There's still some follow-up work needed for this to be automatically enabled, though.
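The virprocess / virCommand commits build on the kernel's core-scheduling interface, exposed since Linux 5.14 via prctl(2) with PR_SCHED_CORE. Below is a minimal standalone sketch of that kernel interface, not libvirt's actual wrappers; the fallback constants mirror <linux/prctl.h> in case an older sys/prctl.h does not define them:

```c
/* sched-core-demo.c: a minimal sketch of prctl(PR_SCHED_CORE)
 * (Linux >= 5.14).  This is NOT libvirt code. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/prctl.h>
#include <sys/wait.h>

#ifndef PR_SCHED_CORE                      /* fallbacks for older headers */
# define PR_SCHED_CORE                    62
# define PR_SCHED_CORE_CREATE             1
# define PR_SCHED_CORE_SCOPE_THREAD_GROUP 1
#endif

int main(void)
{
    /* Give this thread group a fresh core-scheduling cookie: from now
     * on, an SMT sibling of a core running one of our threads may only
     * run threads carrying the same cookie (or stay idle). */
    if (prctl(PR_SCHED_CORE, PR_SCHED_CORE_CREATE, getpid(),
              PR_SCHED_CORE_SCOPE_THREAD_GROUP, 0) < 0) {
        perror("PR_SCHED_CORE_CREATE");
        return EXIT_FAILURE;
    }

    /* Children forked after this point inherit the cookie, which is
     * how a spawned helper can end up in the same trust domain. */
    pid_t child = fork();
    if (child == 0) {
        execlp("sleep", "sleep", "5", (char *)NULL);
        _exit(127);
    }
    if (child > 0)
        waitpid(child, NULL, 0);
    return EXIT_SUCCESS;
}
```

Presumably the libvirt wrappers follow the same pattern: create a cookie for the QEMU process or its vCPU threads and hand it to other tasks via inheritance or PR_SCHED_CORE_SHARE_TO.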
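The knob introduced by 6a1500b4ea is the `sched_core` setting in /etc/libvirt/qemu.conf. As a quick reference, a sketch of the accepted values, annotated with the cookie behaviour observed in the verification below (the comments are a summary of those results, not the shipped documentation):

```
# /etc/libvirt/qemu.conf (excerpt)
#
#   "none"     - no core-scheduling cookie is assigned
#                (assumed default, since the feature is opt-in)
#   "vcpus"    - only vCPU threads share a cookie
#   "emulator" - the emulator and vCPU threads share a cookie
#   "full"     - emulator, vCPU, and helper processes share a cookie
sched_core = "vcpus"
```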
Verify this bug with libvirt-8.9.0-2.el9.x86_64:

S1: Test sched_core = "vcpus"

1. Prepare a guest that has virtiofs and current vCPUs < maximum vCPUs:

```
# virsh dumpxml vm1
  <vcpu placement='static' current='2'>10</vcpu>
  <filesystem type='mount' accessmode='passthrough'>
    <driver type='virtiofs' queue='1024'/>
    <binary path='/usr/libexec/virtiofsd' xattr='on'/>
    <source dir='/mount/test'/>
    <target dir='test'/>
    <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
  </filesystem>
```

2. Set sched_core = "vcpus" in qemu.conf and restart virtqemud:

```
# echo 'sched_core = "vcpus"' >> /etc/libvirt/qemu.conf
# service virtqemud restart
Redirecting to /bin/systemctl restart virtqemud.service
```

3. Start the guest:

```
# virsh start vm1
Domain 'vm1' started
```

4. Check the cookie values of the QEMU emulator, the vCPUs, and the helper processes, using a local ./get-cookie tool (a sketch of such a helper follows at the end of this report). Only the vCPUs' cookies are non-zero:

```
emulator:
# ./get-cookie 85455
process 85455 cookie is 0

vcpus:
# ./get-cookie 85480 85481
process 85480 cookie is 4254838555
process 85481 cookie is 4254838555

helper processes:
# ./get-cookie 85444 85446
process 85444 cookie is 0
process 85446 cookie is 0
```

5. Hotplug vCPUs via setvcpus and setvcpu, and check the new vCPUs' cookies:

```
# virsh setvcpus vm1 4
# virsh setvcpu vm1 --enable 9
# ./get-cookie 85956 85958 85974
process 85956 cookie is 4254838555
process 85958 cookie is 4254838555
process 85974 cookie is 4254838555
```

Changing the sched_core value and retesting with the same steps gives:

- sched_core = "none": the QEMU emulator, vCPU, and helper process cookies are all 0.
- sched_core = "emulator": the QEMU emulator and vCPUs have the same positive integer cookie; the helper processes' cookies are 0.
- sched_core = "full": the QEMU emulator, vCPUs, and helper processes all have the same positive integer cookie.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (libvirt bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2171
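The ./get-cookie tool used in the verification steps above is a local QA helper, not something shipped with libvirt. A plausible reconstruction, assuming it simply wraps prctl(PR_SCHED_CORE, PR_SCHED_CORE_GET, ...); reading another process's cookie requires ptrace-level access to the target, so run it as root:

```c
/* get-cookie.c: hypothetical reconstruction of the QA helper used
 * above.  Prints the core-scheduling cookie of each PID given on the
 * command line. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/prctl.h>

#ifndef PR_SCHED_CORE                 /* fallbacks for older headers */
# define PR_SCHED_CORE               62
# define PR_SCHED_CORE_GET           0
# define PR_SCHED_CORE_SCOPE_THREAD  0
#endif

int main(int argc, char **argv)
{
    for (int i = 1; i < argc; i++) {
        pid_t pid = (pid_t)atol(argv[i]);
        unsigned long cookie = 0;

        /* PR_SCHED_CORE_GET writes the task's cookie through arg5. */
        if (prctl(PR_SCHED_CORE, PR_SCHED_CORE_GET, pid,
                  PR_SCHED_CORE_SCOPE_THREAD, (unsigned long)&cookie) < 0) {
            perror("PR_SCHED_CORE_GET");
            return EXIT_FAILURE;
        }
        printf("process %d cookie is %lu\n", (int)pid, cookie);
    }
    return EXIT_SUCCESS;
}
```

Built with, e.g., `gcc -o get-cookie get-cookie.c`, invoking it as `./get-cookie 85480 85481` would produce output in the same shape as the verification transcript above.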