Bug 1332854
Summary: | <vcpu max='...'/> in domcapabilities should take KVM limits into account | | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Fangge Jin <fjin> |
Component: | libvirt | Assignee: | Andrea Bolognani <abologna> |
Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 7.3 | CC: | abologna, chayang, chhu, drjones, dyuan, eric.auger, juzhang, knoel, lhuang, mrezanin, rbalakri, virt-maint, zhguo |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | libvirt-2.0.0-1.el7 | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2016-11-03 18:44:30 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | |
Description
Fangge Jin
2016-05-04 08:40:28 UTC
We have a RHEL-only patch that uses the KVM recommended value as the hard limit (BZ 998708). We have to fix the machine type max value.

I agree that the maxCpus queried from the QMP command "query-machines" should be the recommended value, not the max, for RHEL. I should have patched QMP at the same time the other patch was done. I'll get this fixed.

Do we want to report maxCpus = 240 in both KVM and TCG mode, or only in KVM mode? Also, do we want to adapt dynamically to what KVM recommends (the value returned by the KVM_CHECK_EXTENSION ioctl() for KVM_CAP_NR_VCPUS), or do we statically return 240?

One solution is to change the max_cpus setting in hw/i386/pc.c. This impacts all the machines which have the TYPE_PC_MACHINE parent, i.e. those using DEFINE_PC_MACHINE. This will change the maxvcpus for all the RHEL machines:

<machine maxCpus='240'>pc-i440fx-rhel7.3.0</machine>
<machine canonical='pc-i440fx-rhel7.3.0' maxCpus='240'>pc</machine>
<machine maxCpus='240'>pc-i440fx-rhel7.0.0</machine>
<machine maxCpus='240'>rhel6.3.0</machine>
<machine maxCpus='240'>rhel6.4.0</machine>
<machine maxCpus='240'>rhel6.0.0</machine>
<machine maxCpus='240'>pc-i440fx-rhel7.1.0</machine>
<machine maxCpus='240'>pc-i440fx-rhel7.2.0</machine>
<machine maxCpus='240'>pc-q35-rhel7.3.0</machine>
<machine canonical='pc-q35-rhel7.3.0' maxCpus='240'>q35</machine>
<machine maxCpus='240'>rhel6.5.0</machine>
<machine maxCpus='240'>rhel6.6.0</machine>
<machine maxCpus='240'>rhel6.1.0</machine>
<machine maxCpus='240'>rhel6.2.0</machine>

(In reply to Eric Auger from comment #5)
> Do we want to report maxCpus = 240 in both kvm and TCG mode or only in KVM
> mode?

Not too worried about TCG for RHEL. It may get used in some "nested" virt type situations for libguestfs, but otherwise it's not really supported. Anyway, more than 240 cpus is probably way more than necessary for it.

> Also do we want to adapt dynamically to what KVM recommends
> (KVM_CHECK_EXTENSION ioctl()/KVM_CAP_NR_VCPUS returned value) or do we
> statically return 240?

Depends on how hard it would be to dynamically adjust. Ideally we would, as we'll want them to stay the same, but I'm not sure it's so easy to do, and it's probably enough just to bump them both at the same time when that time comes.

> One solution is to change the max_cpus setting in hw/i386/pc.c. This impacts
> all the machines which have TYPE_PC_MACHINE parent, i.e. those using
> DEFINE_PC_MACHINE.

We should discuss with others whether we want to change all machine types, or just RHEL-7.3 and later. I think probably just 7.3 and later.

Posted a series just addressing the RHEL 7.3 machines (at least the ones I am aware of). Also, the KVM_CAP_NR_VCPUS recommended value is dynamically retrieved in the KVM case. However, I was forced to use the KVM ioctl directly in hw/i386/pc.c, which looks weird, but I did not find any other solution at that point.

While respinning v1 I am now able to override all the machines' max_cpus in QEMU's kvm_init() (by the way, this is where the KVM recommended value is already fetched), so it looks better. The downside of this approach is that although it works perfectly well for qemu-monitor-command guest '{"execute":"query-machines"}', which will report 240/255 in KVM/TCG mode, it does not work for virsh capabilities, which launches QEMU with the default KVM mode. So virsh capabilities now reports 240 :-( After discussing with Andrea, there are plans to fix the issue on the libvirt side, and this may be better suited. I expect Andrea to provide more details soon.
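For reference, the KVM query discussed above can also be exercised directly from userspace. Below is a minimal standalone sketch (not QEMU or libvirt code) that reads both the recommended limit (KVM_CAP_NR_VCPUS) and the hard limit (KVM_CAP_MAX_VCPUS) through KVM_CHECK_EXTENSION on /dev/kvm; the fallback values follow the KVM API documentation.

```c
/* Minimal sketch: query KVM vCPU limits via KVM_CHECK_EXTENSION. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

int main(void)
{
    int kvm = open("/dev/kvm", O_RDWR);
    if (kvm < 0) {
        perror("open /dev/kvm");
        return 1;
    }

    /* Recommended number of vCPUs (the value the RHEL patches enforce). */
    int recommended = ioctl(kvm, KVM_CHECK_EXTENSION, KVM_CAP_NR_VCPUS);
    /* Maximum number of vCPUs supported by KVM. */
    int maximum = ioctl(kvm, KVM_CHECK_EXTENSION, KVM_CAP_MAX_VCPUS);

    /* Per the KVM API documentation: a non-positive result means
     * "not reported", so fall back accordingly. */
    if (recommended <= 0)
        recommended = 4;
    if (maximum <= 0)
        maximum = recommended;

    printf("KVM_CAP_NR_VCPUS  (recommended): %d\n", recommended);
    printf("KVM_CAP_MAX_VCPUS (maximum):     %d\n", maximum);
    return 0;
}
```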
Let me jump into the discussion :)

The information reported by libvirt is clearly suboptimal here: this has been reported and it's being worked on upstream. Shivaprasad from IBM has posted a series[1] that addresses the issue by using the MAX_VCPUS capability to limit the cpu-max value returned by the query-machines QMP command, and by exposing the NR_VCPUS capability as well.

Please note that libvirt doesn't specify whether KVM or TCG should be used when running the QEMU instance that is used for probing, among other things, the list of supported machine types and their limits; hence, the returned value must not be affected by the emulation mode.

So I believe the cpu-max value should be left alone, as it represents the architectural limit of QEMU itself (e.g. it can only create 256 vCPUs on ppc64, even though MAX_VCPUS reports 2048 on that architecture); the KVM limit, which is only relevant when actually using it, should be taken into account by higher layers like libvirt, and it will be once the series mentioned above has been merged.

I also wonder if, instead of carrying downstream patches both in QEMU and libvirt that ensure we're enforcing vCPU limits that reflect what we support in RHEL, it wouldn't make more sense to patch KVM so that the MAX_VCPUS and NR_VCPUS capabilities would return the correct values...

One last note: the proper API to use in this case would be to call 'virsh domcapabilities' and look at the 'max' attribute of the <vcpu> element, since that output is already tailored to the relevant virtualization type / emulator binary / guest architecture / machine type.

[1] https://www.redhat.com/archives/libvir-list/2016-June/msg00947.html

(In reply to Andrea Bolognani from comment #9)
> the KVM limit, which is only relevant when actually using it,
> should be taken into account by higher layers like
> libvirt, and it will be once the series mentioned above
> has been merged.

I'm OK with allowing higher levels to deal with this. We only support using QEMU through libvirt anyway.

> I also wonder if, instead of carrying downstream patches
> both in QEMU and libvirt that ensure we're enforcing vCPU
> limits that reflect what we support in RHEL, it wouldn't
> make more sense to patch KVM so that the MAX_VCPUS and
> NR_VCPUS capabilities would return the correct values...

They do. The difference is that while upstream users may be OK with using MAX_VCPUS, downstream we only support, and therefore allow, NR_VCPUS, which is the recommended number. For x86 this number is based on test results, and for other arches it may be based on the number of currently available host cpus.

If everyone agrees to handle this in libvirt only, and there's already a BZ being worked for it for 7.3, then this BZ can be closed as not-a-bug.

Thanks,
drew
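To illustrate the check recommended in comment #9: on a host where the KVM limit is honoured, the domcapabilities output contains a <vcpu> element along these lines. The command options and values shown here are illustrative only, assuming the qemu-kvm binary and the 240-vCPU limit discussed in this report.

```
# virsh domcapabilities --virttype kvm --arch x86_64 --machine pc-i440fx-rhel7.3.0
<domainCapabilities>
  <path>/usr/libexec/qemu-kvm</path>
  <domain>kvm</domain>
  <machine>pc-i440fx-rhel7.3.0</machine>
  <arch>x86_64</arch>
  <vcpu max='240'/>
  ...
</domainCapabilities>
```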
(In reply to Andrew Jones from comment #10)
> > I also wonder if, instead of carrying downstream patches
> > both in QEMU and libvirt that ensure we're enforcing vCPU
> > limits that reflect what we support in RHEL, it wouldn't
> > make more sense to patch KVM so that the MAX_VCPUS and
> > NR_VCPUS capabilities would return the correct values...
>
> They do. The difference is that while upstream users may be OK with using
> MAX_VCPUS, downstream we only support, and therefore allow, NR_VCPUS, which
> is the recommended number. For x86 this number is based on test results and
> for other arches it may be based on the number of currently available host
> cpus.

I understand that :)

My point is that we have patched both QEMU and libvirt to basically ignore MAX_VCPUS and treat NR_VCPUS as if it were the hard limit.

Why don't we drop those, and make the kernel report

  MAX_VCPUS -> 240 (ppc64), NR_VCPUS (other architectures)
  NR_VCPUS -> whatever the current value is

instead?

> If everyone agrees to handle this in libvirt only, and there's already a BZ
> being worked for it for 7.3, then this BZ can be closed as not-a-bug.

There's no libvirt BZ tracking this. Can I just move this one over and assign it to myself, or would creating a new one be preferred?

(In reply to Andrea Bolognani from comment #11)
> (In reply to Andrew Jones from comment #10)
> > > I also wonder if, instead of carrying downstream patches
> > > both in QEMU and libvirt that ensure we're enforcing vCPU
> > > limits that reflect what we support in RHEL, it wouldn't
> > > make more sense to patch KVM so that the MAX_VCPUS and
> > > NR_VCPUS capabilities would return the correct values...
> >
> > They do. The difference is that while upstream users may be OK with using
> > MAX_VCPUS, downstream we only support, and therefore allow, NR_VCPUS, which
> > is the recommended number. For x86 this number is based on test results and
> > for other arches it may be based on the number of currently available host
> > cpus.
>
> I understand that :)
>
> My point is that we have patched both QEMU and libvirt to
> basically ignore MAX_VCPUS and treat NR_VCPUS as if it were
> the hard limit.
>
> Why don't we drop those, and make the kernel report
>
>   MAX_VCPUS -> 240 (ppc64), NR_VCPUS (other architectures)
>   NR_VCPUS -> whatever the current value is
>
> instead?

We prefer allowing the kernel to be tested (without recompiling it) to the real maximums using non-RHEL userspaces. I agree we're spreading more pain now that it's also getting to libvirt though...

> > If everyone agrees to handle this in libvirt only, and there's already a BZ
> > being worked for it for 7.3, then this BZ can be closed as not-a-bug.
>
> There's no libvirt BZ tracking this. Can I just move this
> one over and assign it to myself, or would creating a new
> one be preferred?

I'm fine with just moving it. It's up to you, as you're moving it to yourself :-)

(In reply to Andrew Jones from comment #12)
> > My point is that we have patched both QEMU and libvirt to
> > basically ignore MAX_VCPUS and treat NR_VCPUS as if it were
> > the hard limit.
> >
> > Why don't we drop those, and make the kernel report
> >
> >   MAX_VCPUS -> 240 (ppc64), NR_VCPUS (other architectures)
> >   NR_VCPUS -> whatever the current value is
> >
> > instead?
>
> We prefer allowing the kernel to be tested (without recompiling it) to the
> real maximums using non-RHEL userspaces. I agree we're spreading more pain
> now that it's also getting to libvirt though...

If that's something we want to allow, then yes, the limits should be enforced by our userspace. It's not really a big deal, and both QEMU and libvirt have carried these downstream patches for years - I just thought we might take this chance to move the policy decision to a single place. Oh well :)

> > There's no libvirt BZ tracking this. Can I just move this
> > one over and assign it to myself, or would creating a new
> > one be preferred?
>
> I'm fine with just moving it. It's up to you, as you're moving it to
> yourself :-)

Okay, I'm going to do just that. Thanks, Eric, for bringing this BZ to my attention!

Thanks to you for eventually doing the job at libvirt level!
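For completeness, the same information is also available programmatically through the libvirt API virConnectGetDomainCapabilities(), the entry point that the libvirt-side fix referenced below operates on. The sketch below is illustrative only; the connection URI, emulator path, and machine type are assumptions, not values mandated by this report.

```c
/* Illustrative sketch: fetch domain capabilities XML (including
 * <vcpu max='...'/>) via the libvirt C API. Build with: gcc ... -lvirt */
#include <stdio.h>
#include <stdlib.h>
#include <libvirt/libvirt.h>

int main(void)
{
    virConnectPtr conn = virConnectOpenReadOnly("qemu:///system");
    if (!conn) {
        fprintf(stderr, "failed to connect to qemu:///system\n");
        return 1;
    }

    /* Returns the <domainCapabilities> document for the given emulator,
     * architecture, machine type and virtualization type. */
    char *xml = virConnectGetDomainCapabilities(conn,
                                                "/usr/libexec/qemu-kvm",
                                                "x86_64",
                                                "pc-i440fx-rhel7.3.0",
                                                "kvm",
                                                0);
    if (xml) {
        printf("%s\n", xml);
        free(xml);
    } else {
        fprintf(stderr, "virConnectGetDomainCapabilities failed\n");
    }

    virConnectClose(conn);
    return 0;
}
```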
This has been fixed by upstream commit

commit 8dbb34781646d29aa72e92cd9e8a3c0f2fe462da
Author: Shivaprasad G Bhat <sbhat.ibm.com>
Date:   Fri Jun 24 20:34:13 2016 +0530

    qemu: check the kvm host cpu max limits in virConnectGetDomainCapabilities

    The qemu limit and host limit both should be considered for the
    domain vcpu max limits.

    Signed-off-by: Shivaprasad G Bhat <sbhat.ibm.com>

v1.3.5-450-g8dbb347

The commit is included in libvirt 2.0.0.

Verified on packages:
libvirt-2.0.0-6.el7.x86_64
qemu-kvm-rhev-2.6.0-22.el7.x86_64

Steps:

1. virsh capabilities shows the maxCpus number changed from 255 to 240.

# virsh capabilities
<capabilities>
...
  <guest>
    <os_type>hvm</os_type>
    <arch name='i686'>
      <wordsize>32</wordsize>
      <emulator>/usr/libexec/qemu-kvm</emulator>
      <machine maxCpus='240'>pc-i440fx-rhel7.3.0</machine>
      <machine canonical='pc-i440fx-rhel7.3.0' maxCpus='240'>pc</machine>
      <machine maxCpus='240'>pc-i440fx-rhel7.0.0</machine>
      <machine maxCpus='240'>rhel6.3.0</machine>
      <machine maxCpus='240'>rhel6.4.0</machine>
      <machine maxCpus='240'>rhel6.0.0</machine>
      <machine maxCpus='240'>pc-i440fx-rhel7.1.0</machine>
      <machine maxCpus='240'>pc-i440fx-rhel7.2.0</machine>
      <machine maxCpus='240'>pc-q35-rhel7.3.0</machine>
      <machine canonical='pc-q35-rhel7.3.0' maxCpus='240'>q35</machine>
      <machine maxCpus='240'>rhel6.5.0</machine>
      <machine maxCpus='240'>rhel6.6.0</machine>
      <machine maxCpus='240'>rhel6.1.0</machine>
      <machine maxCpus='240'>rhel6.2.0</machine>
      <domain type='qemu'/>
      <domain type='kvm'>
        <emulator>/usr/libexec/qemu-kvm</emulator>
      </domain>
    </arch>
    <features>
      <cpuselection/>
      <deviceboot/>
      <disksnapshot default='on' toggle='no'/>
      <acpi default='on' toggle='yes'/>
      <apic default='on' toggle='no'/>
      <pae/>
      <nonpae/>
    </features>
  </guest>
  <guest>
    <os_type>hvm</os_type>
    <arch name='x86_64'>
      <wordsize>64</wordsize>
      <emulator>/usr/libexec/qemu-kvm</emulator>
      <machine maxCpus='240'>pc-i440fx-rhel7.3.0</machine>
      <machine canonical='pc-i440fx-rhel7.3.0' maxCpus='240'>pc</machine>
      <machine maxCpus='240'>pc-i440fx-rhel7.0.0</machine>
      <machine maxCpus='240'>rhel6.3.0</machine>
      <machine maxCpus='240'>rhel6.4.0</machine>
      <machine maxCpus='240'>rhel6.0.0</machine>
      <machine maxCpus='240'>pc-i440fx-rhel7.1.0</machine>
      <machine maxCpus='240'>pc-i440fx-rhel7.2.0</machine>
      <machine maxCpus='240'>pc-q35-rhel7.3.0</machine>
      <machine canonical='pc-q35-rhel7.3.0' maxCpus='240'>q35</machine>
      <machine maxCpus='240'>rhel6.5.0</machine>
      <machine maxCpus='240'>rhel6.6.0</machine>
      <machine maxCpus='240'>rhel6.1.0</machine>
      <machine maxCpus='240'>rhel6.2.0</machine>
      <domain type='qemu'/>
      <domain type='kvm'>
        <emulator>/usr/libexec/qemu-kvm</emulator>
      </domain>
    </arch>

2. Set <vcpu placement='static'>255</vcpu> in the guest xml and try to start the guest:

# virsh start r7t
error: Failed to start domain r7t
error: unsupported configuration: Maximum CPUs greater than specified machine type limit

Changing the number from 255 to 241 gives the same error.

3. Set <vcpu placement='static'>240</vcpu> in the guest xml; the guest starts successfully. Log in to the guest: there are 240 cpus.

# virsh start r7t
Domain r7t started

# virsh dumpxml r7t|grep "<vcpu"
<vcpu placement='static'>240</vcpu>
4. qemu-monitor-command reports cpu-max 240 for the machines:

# virsh qemu-monitor-command r7t '{"execute":"query-machines"}'
{"return":[{"hotpluggable-cpus":true,"name":"pc-i440fx-rhel7.0.0","cpu-max":240},{"hotpluggable-cpus":true,"name":"rhel6.3.0","cpu-max":240},{"hotpluggable-cpus":true,"name":"rhel6.4.0","cpu-max":240},{"hotpluggable-cpus":false,"name":"none","cpu-max":1},{"hotpluggable-cpus":true,"name":"rhel6.0.0","cpu-max":240},{"hotpluggable-cpus":true,"name":"pc-i440fx-rhel7.1.0","cpu-max":240},{"hotpluggable-cpus":true,"name":"pc-i440fx-rhel7.2.0","cpu-max":240},{"hotpluggable-cpus":true,"name":"pc-q35-rhel7.3.0","cpu-max":240,"alias":"q35"},{"hotpluggable-cpus":true,"name":"rhel6.5.0","cpu-max":240},{"hotpluggable-cpus":true,"name":"rhel6.6.0","cpu-max":240},{"hotpluggable-cpus":true,"name":"rhel6.1.0","cpu-max":240},{"hotpluggable-cpus":true,"name":"rhel6.2.0","cpu-max":240},{"hotpluggable-cpus":true,"name":"pc-i440fx-rhel7.3.0","is-default":true,"cpu-max":240,"alias":"pc"}],"id":"libvirt-39"}

5. virsh maxvcpus reports the KVM limit:

# virsh maxvcpus --type kvm
240

The maxvcpus queried by the QMP command "query-machines" is 240, and virsh capabilities returns maxCpus 240. The guest starts successfully with 240 cpus, and trying to start the guest with 241/255 cpus gives a clear error message. So, changing the status to verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2577.html