Created attachment 1472895
libvirtd log

Description of problem:
Hotplugging a vcpu results in a wrong vcpupin if the vcpu was pinned to a non-existent host cpu before the vm started.

Version-Release number of selected component:
libvirt-4.5.0-6.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
0. Host cpu info:
# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                24
On-line CPU(s) list:   0-23
Thread(s) per core:    2
Core(s) per socket:    6
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 63
Model name:            Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
Stepping:              2
CPU MHz:               2600.244
CPU max MHz:           3200.0000
CPU min MHz:           1200.0000
BogoMIPS:              4800.09
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              15360K
NUMA node0 CPU(s):     0,2,4,6,8,10,12,14,16,18,20,22
NUMA node1 CPU(s):     1,3,5,7,9,11,13,15,17,19,21,23

1. Prepare the vm xml with the vcpu configuration below:
<vcpu placement='static' current='2'>16</vcpu>    ==> 2 of 16 vcpus are enabled
<cputune>
  <vcpupin vcpu='2' cpuset='24'/>    ==> The third vcpu is pinned to host cpu 24, which does not exist
</cputune>

2. Start the guest and check the vcpu info:
# virsh start q35

# virsh vcpuinfo q35
VCPU:           0
CPU:            3
State:          running
CPU time:       4.7s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyyyyyy

VCPU:           1
CPU:            5
State:          running
CPU Affinity:   yyyyyyyyyyyyyyyyyyyyyyyy

# virsh vcpupin q35 |head -6
VCPU: CPU Affinity
----------------------------------
   0: 0-23
   1: 0-23
   2:
   3: 0-23

3. Hotplug vcpus up to 4. The command reports an error, but vcpu 2 is actually hotplugged:
# virsh setvcpus q35 4
error: Invalid value '24' for 'cpuset.cpus': Invalid argument

# virsh vcpuinfo q35
VCPU:           0
CPU:            4
State:          running
CPU time:       30.4s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyyyyyy

VCPU:           1
CPU:            20
State:          running
CPU time:       9.0s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyyyyyy

VCPU:           2
CPU:            14
State:          running
CPU Affinity:   yyyyyyyyyyyyyyyyyyyyyyyy

# virsh vcpupin q35 |head -6
VCPU: CPU Affinity
----------------------------------
   0: 0-23
   1: 0-23
   2:
   3: 0-23

# virsh vcpucount q35
maximum      config        16
maximum      live          16
current      config         2
current      live           3

4. managedsave fails:
# virsh managedsave q35
error: Failed to save domain q35 state
error: operation failed: failed to get domain xml

5. Hotplug vcpus up to 4 again:
# virsh setvcpus q35 4

# virsh vcpuinfo q35
VCPU:           0
CPU:            1
State:          running
CPU time:       15.0s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyyyyyy

VCPU:           1
CPU:            6
State:          running
CPU time:       15.7s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyyyyyy

VCPU:           2
CPU:            14
State:          running
CPU Affinity:   yyyyyyyyyyyyyyyyyyyyyyyy

VCPU:           3
CPU:            21
State:          running
CPU Affinity:   yyyyyyyyyyyyyyyyyyyyyyyy

6. Do managedsave and restore:
# virsh managedsave q35
Domain q35 state saved by libvirt

# virsh start q35
error: Failed to start domain q35
error: Invalid value '24' for 'cpuset.cpus': Invalid argument

Actual results:
As in the steps above.

Expected results:
1) Hotplugging the vcpu should fail if the vcpupin is invalid, or
2) The guest should fail to start if the vcpupin is invalid, even if the vcpu is not enabled when the guest boots, or
3) The guest should fail to define if the vcpupin is invalid, even if the vcpu is not enabled when the guest is defined.
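
The check that the expected results ask for could look roughly like the sketch below. This is not libvirt code: parse_cpuset and cpus_not_on_host are hypothetical helper names, and the host cpu list is passed in by the caller rather than detected. The sketch only assumes the cpuset syntax used in the domain xml (comma-separated cpus and ranges, with '^' marking an exclusion).

```python
def parse_cpuset(spec):
    """Expand a cpuset string such as '0-4,^3' into a set of cpu numbers.

    Hypothetical helper for illustration; not a libvirt API.
    """
    included, excluded = set(), set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError("empty element in cpuset %r" % spec)
        target = included
        if part.startswith("^"):
            # '^N' removes cpu N from the resulting set
            target = excluded
            part = part[1:]
        low, sep, high = part.partition("-")
        start = int(low)
        end = int(high) if sep else start
        if end < start:
            raise ValueError("reversed range in cpuset %r" % spec)
        target.update(range(start, end + 1))
    return included - excluded


def cpus_not_on_host(spec, host_cpus):
    """Return the sorted cpus named in spec that the host does not have."""
    return sorted(parse_cpuset(spec) - set(host_cpus))


# The host in this report has cpus 0-23, so cpuset='24' is rejected.
HOST_CPUS = range(24)
print(cpus_not_on_host("24", HOST_CPUS))
print(cpus_not_on_host("0-23", HOST_CPUS))
```

Running such a check at define or start time for every vcpupin, including those of vcpus that are not enabled, would correspond to options 2) and 3) above.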
After hotplugging vcpus up to 4 in step 3, vcpu 2 is hotplugged without a cgroup controller:

# lscgroup -g cpuset:/
cpuset:/
cpuset:/machine.slice
cpuset:/machine.slice/machine-qemu\x2d15\x2dq35.scope
cpuset:/machine.slice/machine-qemu\x2d15\x2dq35.scope/vcpu1
cpuset:/machine.slice/machine-qemu\x2d15\x2dq35.scope/vcpu0
cpuset:/machine.slice/machine-qemu\x2d15\x2dq35.scope/emulator

# virsh vcpupin q35 2 2
error: Failed to create controller cpu for group: No such file or directory
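
To spot this state quickly, the lscgroup output can be compared with the live vcpu count. The following is only a sketch: vcpus_with_cgroup and vcpus_missing_cgroup are hypothetical names, the sample text is the output shown above, and the live vcpu ids (0-2) are taken from the "current live 3" line of virsh vcpucount in the original report.

```python
import re

# lscgroup output copied from this comment
SAMPLE = r"""cpuset:/
cpuset:/machine.slice
cpuset:/machine.slice/machine-qemu\x2d15\x2dq35.scope
cpuset:/machine.slice/machine-qemu\x2d15\x2dq35.scope/vcpu1
cpuset:/machine.slice/machine-qemu\x2d15\x2dq35.scope/vcpu0
cpuset:/machine.slice/machine-qemu\x2d15\x2dq35.scope/emulator
"""


def vcpus_with_cgroup(lscgroup_output):
    """Collect the vcpu ids that have a 'vcpuN' cgroup directory."""
    found = set()
    for line in lscgroup_output.splitlines():
        match = re.search(r"/vcpu(\d+)\s*$", line)
        if match:
            found.add(int(match.group(1)))
    return found


def vcpus_missing_cgroup(lscgroup_output, live_vcpus):
    """Return the live vcpu ids that have no cgroup directory, sorted."""
    return sorted(set(live_vcpus) - vcpus_with_cgroup(lscgroup_output))


# Three vcpus are live after step 3, but only vcpu0 and vcpu1 have a cgroup.
print(vcpus_missing_cgroup(SAMPLE, range(3)))
```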
It seems the VM can start normally only when the vcpu pinned to a non-existent host cpu is offline.

# lscpu
CPU(s):                2
On-line CPU(s) list:   0,1

S1: Pin an online vcpu (0-5). The VM fails to start, although validation succeeds.

# virsh domstate test
shut off

# virsh vcpupin test 1 4 --config
error: CPU 4 in cpulist '4' exceed the maxcpu 2

Editing the XML succeeds, as shown below (**validation should fail**):

# virsh dumpxml test --inactive
<vcpu placement='static' current='6'>8</vcpu>
...
<vcpupin vcpu='1' cpuset='4'/>

# virsh start test
error: Failed to start domain test
error: Unable to write to '/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2d19\x2dtest.scope/vcpu1/cpuset.cpus': Numerical result out of range

S2: Pin an offline vcpu (6-7). Both validation and start succeed.

# virsh domstate test
shut off

# virsh dumpxml test --inactive
<vcpu placement='static' current='6'>8</vcpu>
...
<vcpupin vcpu='6' cpuset='4'/>

# virsh start test
Domain test started
Editing the VM with an illegal cpuset for iothreadpin also passes validation. In this case, whether the pin is for a vcpu or an iothread, I think validation should not pass.

# lscpu
CPU(s):                2
On-line CPU(s) list:   0,1

# virsh domstate test
shut off

(Edit the VM as follows:)

# virsh dumpxml test --inactive|grep iothread
<iothreads>1</iothreads>
<iothreadids>
  <iothread id='1'/>
</iothreadids>
<iothreadpin iothread='1' cpuset='5'/>

# virsh start test
error: Failed to start domain test
error: Unable to write to '/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2d28\x2dtest.scope/iothread1/cpuset.cpus': Numerical result out of range
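
A uniform check over both kinds of pin could be sketched as below. This is an illustration only, not what libvirt does: out_of_range_pins is a hypothetical name, the xml is a trimmed-down example combining the vcpupin and iothreadpin fragments from the comments above, and the cpuset is inspected with a shortcut (the highest number mentioned in the string) rather than a full cpuset parser.

```python
import re
import xml.etree.ElementTree as ET

# Pin elements to inspect, and the attribute that identifies the pinned thread
PIN_TAGS = {"vcpupin": "vcpu", "iothreadpin": "iothread"}


def out_of_range_pins(domain_xml, host_cpu_count):
    """List (tag, id, cpuset) for pins naming a cpu >= host_cpu_count.

    Every pin is checked, whether or not the vcpu is currently enabled.
    """
    root = ET.fromstring(domain_xml)
    problems = []
    for tag, id_attr in PIN_TAGS.items():
        for elem in root.iter(tag):
            cpuset = elem.get("cpuset", "")
            # Shortcut: look at the largest cpu number in the string
            numbers = [int(n) for n in re.findall(r"\d+", cpuset)]
            if numbers and max(numbers) >= host_cpu_count:
                problems.append((tag, elem.get(id_attr), cpuset))
    return problems


EXAMPLE_XML = """
<domain>
  <vcpu placement='static' current='6'>8</vcpu>
  <iothreads>1</iothreads>
  <cputune>
    <vcpupin vcpu='0' cpuset='0-1'/>
    <vcpupin vcpu='6' cpuset='4'/>
    <iothreadpin iothread='1' cpuset='5'/>
  </cputune>
</domain>
"""

# On the 2-cpu host above, both the offline vcpu 6 and iothread 1 are flagged.
for problem in out_of_range_pins(EXAMPLE_XML, 2):
    print(problem)
```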
This looks like a lack of validation for a corner case that wouldn't work anyway, so I don't think it's important enough to track through the RHEL process. Moving to the upstream libvirt tracker.
*** Bug 1615226 has been marked as a duplicate of this bug. ***
The upstream tracker is for tracking issues the upstream users actually care about.