Bug 1462676 - CPU hotplug configuration in RHV-M sometimes does not work when it should
Summary: CPU hotplug configuration in RHV-M sometimes does not work when it should
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Virt
Version: 4.1.3.2
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Michal Skrivanek
QA Contact: meital avital
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-06-19 09:08 UTC by jiyan
Modified: 2017-06-29 08:13 UTC
CC List: 6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-06-29 08:13:42 UTC
oVirt Team: Virt
Embargoed:


Attachments
Logs for Step3 (12.26 MB, application/x-tar)
2017-06-19 09:08 UTC, jiyan

Description jiyan 2017-06-19 09:08:04 UTC
Created attachment 1289052 [details]
Logs for Step3

Description of problem:
CPU hotplug configuration in RHV-M sometimes does not work when it should.

Version-Release number of selected component (if applicable):
RHV-M server:
rhevm-4.1.3.2-0.1.el7.noarch
ovirt-engine-setup-plugin-ovirt-engine-4.1.3.2-0.1.el7.noarch

RHV-M registered host:
qemu-kvm-rhev-2.9.0-9.el7.x86_64
libvirt-3.2.0-9.el7.x86_64
kernel-3.10.0-679.el7.x86_64
vdsm-4.19.18-1.el7ev.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Configure 'Virtual Sockets' as '1' and check the environment as follows:
1.1> In the RHV-M GUI, remove the 'CPU' filter from the 'none' scheduling policy, and make the cluster use the 'none' scheduling policy.

1.2> In the RHV-M GUI, configure the data center with hosts and storage, then use 'New VM' to create a VM called vm1 and confirm that it starts successfully.

1.3> Configure the 'System' settings of the VM as follows, and check that the VM starts normally:
  Total Virtual CPUs: 4
  Virtual Sockets: 1
  Cores per Virtual Socket: 1
  Threads per Core: 4

1.4> Check the libvirt domain XML on the registered host and run 'lscpu' in the guest:

On the host, check the libvirt domain XML:
# virsh dumpxml vm1
  <vcpu placement='static' current='4'>64</vcpu>
  <cpu mode='custom' match='exact' check='full'>
    <topology sockets='16' cores='1' threads='4'/>
  </cpu>

In the guest:
# lscpu
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    1
Core(s) per socket:    4
Socket(s):             1
NUMA node(s):          1
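
For reference, the current/maximum vCPU split shown in the XML above can also be read directly on the host with virsh (values here match this VM; exact column formatting may vary by libvirt version):

# virsh vcpucount vm1
maximum      config        64
maximum      live          64
current      config         4
current      live           4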


2. CPU hotplug with 'Virtual Sockets' set to 2 -- succeeds
2.1> After Step 1, with the VM still running, configure the 'System' settings of the VM as follows:
  Total Virtual CPUs: 8
  Virtual Sockets: 2
  Cores per Virtual Socket: 1
  Threads per Core: 4

2.2> Check the libvirt domain XML on the registered host and run 'lscpu' in the guest:

On the host, check the libvirt domain XML:
# virsh dumpxml vm1
  <vcpu placement='static' current='8'>64</vcpu>
  <vcpus>
    <vcpu id='0' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='1' enabled='yes' hotpluggable='no' order='2'/>
    <vcpu id='2' enabled='yes' hotpluggable='no' order='3'/>
    <vcpu id='3' enabled='yes' hotpluggable='no' order='4'/>
    <vcpu id='4' enabled='yes' hotpluggable='yes' order='5'/>
    <vcpu id='5' enabled='yes' hotpluggable='yes' order='6'/>
    <vcpu id='6' enabled='yes' hotpluggable='yes' order='7'/>
    <vcpu id='7' enabled='yes' hotpluggable='yes' order='8'/>
    <vcpu id='8' enabled='no' hotpluggable='yes'/>
    ...
    <vcpu id='63' enabled='no' hotpluggable='yes'/>
  </vcpus>
  <cpu mode='custom' match='exact' check='full'>
    <topology sockets='16' cores='1' threads='4'/>
  </cpu>

In the guest:
# lscpu
CPU(s):                8
On-line CPU(s) list:   0-7
Thread(s) per core:    1
Core(s) per socket:    4
Socket(s):             2
NUMA node(s):          1
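
Under the hood, this GUI hotplug maps to a live change of the vCPU count in libvirt. VDSM drives the libvirt API directly, but a rough virsh equivalent of the step above would be (a sketch of the plumbing, not what VDSM literally executes):

# virsh setvcpus vm1 8 --live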


3. CPU hotplug with 'Virtual Sockets' set to 4 -- fails
3.1> After Step 1, with the VM still running, configure the 'System' settings of the VM as follows:
  Total Virtual CPUs: 16
  Virtual Sockets: 4
  Cores per Virtual Socket: 1
  Threads per Core: 4

3.2> The following error is raised:
Error while executing action: 
vm1:
CPU_HOTPLUG_TOPOLOGY_INVALID


Actual results:
As step 3.2 shows

Expected results:
Refer to Step 2

Additional info:
The attachment includes the following logs:
log1/RHV-server-engine.log
log1/RHV-host-libvirtd.log
log1/RHV-host-qemu-vm1.log
log1/RHV-host-vdsm.log

Comment 1 Tomas Jelinek 2017-06-21 10:37:01 UTC
(In reply to jiyan from comment #0)
> [...]
> In the guest:
> # lscpu
> CPU(s):                4
> On-line CPU(s) list:   0-3
> Thread(s) per core:    1

The reason for this is that the host is an AMD machine (see bug 1462183).

> [...]
> 3. CPU hotplug with 'Virtual Sockets' set to 4 -- fails
> [...]
> 3.2> The following error is raised:
> Error while executing action:
> vm1:
> CPU_HOTPLUG_TOPOLOGY_INVALID

This happens when the host has fewer CPUs than the number you try to hot-plug.
E.g., does your host have at least 16 CPUs?
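
A quick way to check on the host:

# nproc
# lscpu | grep '^CPU(s):'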


Comment 2 jiyan 2017-06-27 11:32:42 UTC
Hi, Tomas.

I tested the same scenario in a different environment.


The physical host CPU info is as follows:
# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                24
On-line CPU(s) list:   0-23
Thread(s) per core:    2
Core(s) per socket:    6
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel

The test steps are as follows:
Step 1:
1.1> Configure the 'System' settings of the VM as follows, and check that the VM starts normally:
Total Virtual CPUs: 12
Virtual Sockets: 1
Cores per Virtual Socket: 6
Threads per Core: 2

1.2> Check the libvirt domain XML on the registered host and run 'lscpu' in the guest:

On the host, check the libvirt domain XML:
# virsh dumpxml vm1
<vcpu placement='static' current='12'>192</vcpu>
  <cpu mode='custom' match='exact' check='full'>
    <topology sockets='16' cores='6' threads='2'/>
    <numa>
      <cell id='0' cpus='0-11' memory='1048576' unit='KiB'/>
    </numa>
  </cpu>
 
In the guest:
# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                12
On-line CPU(s) list:   0-11
Thread(s) per core:    2
Core(s) per socket:    6
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
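
The numbers above are consistent: current vCPUs = sockets x cores x threads = 1 x 6 x 2 = 12, and the libvirt maximum of 192 = 16 x 6 x 2, where the 16 sockets presumably come from the engine's maximum-sockets setting (an assumption; cf. the MaxNumOfVmSockets engine-config key). A quick sanity check of the arithmetic:

# echo $((1 * 6 * 2)) $((16 * 6 * 2))
12 192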

Step 2:
2.1> Configure the 'System' settings of the VM as follows, and check that the VM starts normally:
Total Virtual CPUs: 48
Virtual Sockets: 4
Cores per Virtual Socket: 6
Threads per Core: 2

2.2> Check the libvirt domain XML on the registered host and run 'lscpu' in the guest:
On the host, check the libvirt domain XML:
# virsh dumpxml vm1
 <vcpu placement='static' current='48'>192</vcpu>
  <vcpus>
    <vcpu id='0' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='1' enabled='yes' hotpluggable='no' order='2'/>
    <vcpu id='2' enabled='yes' hotpluggable='no' order='3'/>
...
    <vcpu id='11' enabled='yes' hotpluggable='no' order='12'/>
    <vcpu id='12' enabled='yes' hotpluggable='yes' order='13'/>
    <vcpu id='13' enabled='yes' hotpluggable='yes' order='14'/>
...
    <vcpu id='45' enabled='yes' hotpluggable='yes' order='46'/>
    <vcpu id='46' enabled='yes' hotpluggable='yes' order='47'/>
    <vcpu id='47' enabled='yes' hotpluggable='yes' order='48'/>
    <vcpu id='48' enabled='no' hotpluggable='yes'/>
    <vcpu id='49' enabled='no' hotpluggable='yes'/>
...
    <vcpu id='189' enabled='no' hotpluggable='yes'/>
    <vcpu id='190' enabled='no' hotpluggable='yes'/>
    <vcpu id='191' enabled='no' hotpluggable='yes'/>
  </vcpus>

  <cpu mode='custom' match='exact' check='full'>
    <topology sockets='16' cores='6' threads='2'/>
    <numa>
      <cell id='0' cpus='0-11' memory='1048576' unit='KiB'/>
    </numa>
  </cpu>

In the guest:
# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                48
On-line CPU(s) list:   0-47
Thread(s) per core:    2
Core(s) per socket:    6
Socket(s):             4
NUMA node(s):          1
Vendor ID:             GenuineIntel
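
The enabled vCPU count can also be taken straight from the domain XML, matching the lscpu figure above:

# virsh dumpxml vm1 | grep -c "enabled='yes'"
48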



> This happens when the host has fewer CPUs than the number you try to hot-plug.
> The 'CPU' filter is not related to CPU hotplug; hotplug enforces this limit and leaves overcommit considerations to the scheduler.



In the scenario above, the host also has fewer CPUs than the number being hot-plugged, and yet it works. In the scenario below, however, Step 3 fails. In both cases the host has fewer CPUs than the number being hot-plugged, yet one succeeds while the other fails.


When I configure the VM as follows, the error is raised:
Step 3:
3.1> With the VM running, configure the 'System' settings of the VM as follows:
Total Virtual CPUs: 60
Virtual Sockets: 5
Cores per Virtual Socket: 6
Threads per Core: 2

The error info:
Error while executing action: 
vm1:
CPU_HOTPLUG_TOPOLOGY_INVALID

Comment 3 Tomas Jelinek 2017-06-29 08:13:42 UTC
(In reply to jiyan from comment #2)
> [...]
> The physical host CPU info is as follows:
> CPU(s):                24
> [...]
> Step 1:
> Total Virtual CPUs: 12
> [...]
> Step 2:
> Total Virtual CPUs: 48
> [...]
This is all correct.

> [...]
> In both cases the host has fewer CPUs than the number being hot-plugged,
> yet one succeeds while the other fails.
> [...]
> Step 3:
> Total Virtual CPUs: 60
> [...]
> Error while executing action:
> vm1:
> CPU_HOTPLUG_TOPOLOGY_INVALID

Yes, because you can overcommit only at VM start, not on hotplug. So, as far as I can see, everything is working as intended; closing.
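
In other words, a minimal sketch of the rule described above, assuming the limit that hotplug enforces is the host's online CPU count:

# at VM start: 48 vCPUs on a 24-pCPU host is allowed (overcommit is left to the scheduler)
# on hotplug:  the requested count is checked against the host, so on this host:
requested=60
[ "$requested" -le "$(nproc)" ] || echo "CPU_HOTPLUG_TOPOLOGY_INVALID"   # 60 > 24 -> rejected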

