Bug 1176020

Summary: libvirt should do a right check for numa cpus set
Product: Red Hat Enterprise Linux 7 Reporter: Luyao Huang <lhuang>
Component: libvirt Assignee: Michal Privoznik <mprivozn>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: medium    
Version: 7.1 CC: dyuan, honzhang, mprivozn, mzhan, rbalakri
Target Milestone: rc Keywords: Upstream
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: libvirt-1.2.17-5.el7 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-11-19 06:05:11 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Luyao Huang 2014-12-19 07:46:19 UTC
Description of problem:
libvirt should properly validate the cpus sets configured for guest NUMA cells.

Version-Release number of selected component (if applicable):
libvirt-1.2.8-10.el7.x86_64

How reproducible:
100%

Steps to Reproduce:

1. Prepare a guest with NUMA settings like this:

<vcpu placement='static'>4</vcpu>

  <cpu>
    <numa>
      <cell id='0' cpus='0-1' memory='512000'/>
      <cell id='1' cpus='2-3' memory='512000'/>
    </numa>
  </cpu>



2. Edit the cpus settings like this:
# virsh edit test3
<vcpu placement='static'>4</vcpu>
  <cpu>
    <numa>
      <cell id='0' cpus='0-1' memory='512000'/>
      <cell id='1' cpus='1-2' memory='512000'/>
    </numa>
  </cpu>

Domain test3 XML configuration edited.

3. Edit the cpus settings like this:

# virsh edit test3
<vcpu placement='static'>4</vcpu>
  <cpu>
    <numa>
      <cell id='0' cpus='0-1' memory='512000'/>
      <cell id='1' cpus='1-3' memory='512000'/>
    </numa>
  </cpu>
error: internal error: Number of CPUs in <numa> exceeds the <vcpu> count
Failed. Try again? [y,n,f,?]:

Actual results:
libvirt only checks whether the total number of NUMA CPUs exceeds the vCPU count.
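
To illustrate why step 2 is accepted while step 3 is rejected, here is a minimal Python sketch of a count-only check. It is not libvirt's code; parse_cpus and count_only_check are hypothetical names, and the sketch assumes the check simply sums the number of CPUs listed in each cell.

```python
def parse_cpus(spec):
    """Expand a cpus attribute such as '0-3,100' into a list of vCPU ids."""
    ids = []
    for part in spec.split(","):
        low, _, high = part.partition("-")
        ids.extend(range(int(low), int(high or low) + 1))
    return ids


def count_only_check(vcpus, cells):
    """Hypothetical count-only check: True means the config is accepted.

    Only the summed per-cell CPU count is compared with the vCPU count,
    so a CPU shared between two cells is never noticed.
    """
    return sum(len(parse_cpus(spec)) for spec in cells) <= vcpus


print(count_only_check(4, ["0-1", "1-2"]))  # step 2: 2 + 2 = 4, prints True
print(count_only_check(4, ["0-1", "1-3"]))  # step 3: 2 + 3 = 5, prints False
```

Both configurations assign vCPU 1 to two cells, yet only the second one trips the check, and only because of its size.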
      
Expected results:
libvirt should either report an error in step 2, or report no error in both step 2 and step 3.

  <cpu>
    <numa>
      <cell id='0' cpus='0-1' memory='512000'/>
      <cell id='1' cpus='1-3' memory='512000'/>
    </numa>
  </cpu>

fail (two cells use the same CPU)

  <cpu>
    <numa>
      <cell id='0' cpus='0-1' memory='512000'/>
      <cell id='1' cpus='1-1' memory='512000'/>
    </numa>
  </cpu>

fail (two cells use the same CPU)

  <cpu>
    <numa>
      <cell id='0' cpus='0-1' memory='512000'/>
      <cell id='1' cpus='2-3' memory='512000'/>
    </numa>
  </cpu>

success

Or just remove the check on the total number of NUMA CPUs and instead check whether each CPU number is valid.

Additional info:

Comment 1 Luyao Huang 2015-04-02 08:06:06 UTC
I talked with Michal and he hadn't started on it, so I proposed a patch (thanks Michal :) ):

https://www.redhat.com/archives/libvir-list/2015-April/msg00079.html

Comment 2 Michal Privoznik 2015-05-05 12:14:49 UTC
I've just pushed the patch upstream:

commit 8fedbbdb67434a5e1c81c23dfb1f744843a74091
Author:     Luyao Huang <lhuang>
AuthorDate: Tue May 5 18:13:38 2015 +0800
Commit:     Michal Privoznik <mprivozn>
CommitDate: Tue May 5 13:31:47 2015 +0200

    conf: Add the cpu duplicate use check for vm numa settings
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1176020
    
    We had a check for the vcpu count total number in <numa>
    before, however this check is not good enough. There are
    some examples:
    
    1. one of cpu id is out of maxvcpus, can set success(cpu count = 5 < 10):
    
    <vcpu placement='static'>10</vcpu>
    <cell id='0' cpus='0-3,100' memory='512000' unit='KiB'/>
    
    2. use the same cpu in 2 cell, can set success(cpu count = 8 < 10):
    <vcpu placement='static'>10</vcpu>
    <cell id='0' cpus='0-3' memory='512000' unit='KiB'/>
    <cell id='1' cpus='0-3' memory='512000' unit='KiB'/>
    
    3. use the same cpu in 2 cell, cannot set success(cpu count = 11 > 10):
    <vcpu placement='static'>10</vcpu>
    <cell id='0' cpus='0-6' memory='512000' unit='KiB'/>
    <cell id='1' cpus='0-3' memory='512000' unit='KiB'/>
    
    Add a check for numa cpus, check if duplicate use one cpu in more
    than one cell.
    
    Signed-off-by: Luyao Huang <lhuang>
    Signed-off-by: Michal Privoznik <mprivozn>


v1.2.15-28-g8fedbbd
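
The duplicate-use check described in the commit message can be sketched in Python as a pairwise intersection of the per-cell CPU sets. This is only an illustration under that assumption, not the code from the commit; parse_cpus and find_overlap are hypothetical names.

```python
from itertools import combinations


def parse_cpus(spec):
    """Expand a cpus attribute such as '0-3,100' into a set of vCPU ids."""
    ids = set()
    for part in spec.split(","):
        low, _, high = part.partition("-")
        ids.update(range(int(low), int(high or low) + 1))
    return ids


def find_overlap(cells):
    """Return the first pair of cell ids sharing a vCPU id, or None.

    `cells` maps a cell id to its cpus attribute (hypothetical input shape).
    """
    for (id_a, cpus_a), (id_b, cpus_b) in combinations(cells.items(), 2):
        if parse_cpus(cpus_a) & parse_cpus(cpus_b):
            return (id_a, id_b)
    return None


# Example 2 from the commit message: both cells claim vCPUs 0-3.
print(find_overlap({0: "0-3", 1: "0-3"}))
# Disjoint cells pass the check.
print(find_overlap({0: "0-1", 1: "2-3"}))
```

Unlike a total count, the intersection catches a shared vCPU regardless of how many CPUs the cells list overall.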

Comment 4 hongming 2015-08-07 06:52:48 UTC
Verify it as follows. 

# rpm -q libvirt
libvirt-1.2.17-3.el7.x86_64


# virsh edit test4

  <cpu>
    <numa>
      <cell id='0' cpus='0-1' memory='1024000' unit='KiB'/>
      <cell id='1' cpus='1-2' memory='1024000' unit='KiB'/>
    </numa>
  </cpu>

error: unsupported configuration: NUMA cells 1 and 0 have overlapping vCPU ids
Failed. Try again? [y,n,i,f,?]: 


# virsh edit test4

  <cpu>
    <numa>
      <cell id='0' cpus='0-1' memory='1024000' unit='KiB'/>
      <cell id='1' cpus='1' memory='1024000' unit='KiB'/>
    </numa>
  </cpu>

error: unsupported configuration: NUMA cells 1 and 0 have overlapping vCPU ids
Failed. Try again? [y,n,i,f,?]: 


# virsh edit test4

    <numa>
      <cell id='0' cpus='0-1' memory='1024000' unit='KiB'/>
      <cell id='1' cpus='0-1' memory='1024000' unit='KiB'/>
    </numa>

error: unsupported configuration: NUMA cells 1 and 0 have overlapping vCPU ids
Failed. Try again? [y,n,i,f,?]: 


# virsh edit test4

 <vcpu placement='static'>4</vcpu>

  <cpu>
    <numa>
      <cell id='0' cpus='0-1' memory='1024000' unit='KiB'/>
      <cell id='1' cpus='2-4' memory='1024000' unit='KiB'/>
    </numa>
  </cpu>

error: internal error: Number of CPUs in <numa> exceeds the <vcpu> count
Failed. Try again? [y,n,i,f,?]: 

# virsh edit test4

 <vcpu placement='static'>4</vcpu>

  <cpu>
    <numa>
      <cell id='0' cpus='0-1' memory='1024000' unit='KiB'/>
      <cell id='1' cpus='2-3' memory='1024000' unit='KiB'/>
    </numa>
  </cpu>

Domain test4 XML configuration edited.

Comment 5 hongming 2015-08-07 06:56:30 UTC
But it doesn't check the range of cpus on virsh edit. Should that be fixed in this bug?


# virsh edit test4

  <vcpu placement='static'>4</vcpu>

  <cpu>
    <numa>
      <cell id='0' cpus='0-1' memory='1024000' unit='KiB'/>
      <cell id='1' cpus='9-10' memory='1024000' unit='KiB'/>
    </numa>
  </cpu>


# virsh edit test4
Domain test4 XML configuration not changed.

# virsh start test4
error: Failed to start domain test4
error: internal error: process exited while connecting to monitor: 2015-08-07T06:51:36.357142Z qemu-kvm: -numa node,nodeid=1,cpus=9-10,memdev=ram-node1: CPU index (9) should be smaller than maxcpus (4)
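
For illustration, the missing range check amounts to requiring every CPU id in <numa> to be smaller than the <vcpu> count, which is the condition qemu-kvm enforces in the error above. A minimal Python sketch, with hypothetical helper names and no claim to match libvirt's implementation:

```python
def parse_cpus(spec):
    """Expand a cpus attribute such as '0-3,100' into a list of vCPU ids."""
    ids = []
    for part in spec.split(","):
        low, _, high = part.partition("-")
        ids.extend(range(int(low), int(high or low) + 1))
    return ids


def out_of_range_ids(vcpus, cells):
    """Return the sorted vCPU ids that are not below the vCPU count."""
    return sorted(i for spec in cells for i in parse_cpus(spec) if i >= vcpus)


# The configuration above: 4 vCPUs, cells '0-1' and '9-10'.
print(out_of_range_ids(4, ["0-1", "9-10"]))  # prints [9, 10]
```

The total count here is 4 and no CPU is shared, so neither a count check nor an overlap check can reject this configuration.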

Comment 6 hongming 2015-08-07 06:57:12 UTC
(In reply to hongming from comment #5)
> But it doesn't check the range of cpus when virsh edit. Is it should be
> fixed in this bug ?
> 
> 
> # virsh edit test4
> 
>   <vcpu placement='static'>4</vcpu>
> 
>   <cpu>
>     <numa>
>       <cell id='0' cpus='0-1' memory='1024000' unit='KiB'/>
>       <cell id='1' cpus='9-10' memory='1024000' unit='KiB'/>
>     </numa>
>   </cpu>
> 
> 
> # virsh edit test4
> Domain test4 XML configuration changed.
> 
> # virsh start test4
> error: Failed to start domain test4
> error: internal error: process exited while connecting to monitor:
> 2015-08-07T06:51:36.357142Z qemu-kvm: -numa
> node,nodeid=1,cpus=9-10,memdev=ram-node1: CPU index (9) should be smaller
> than maxcpus (4)

Comment 7 Michal Privoznik 2015-08-07 14:40:36 UTC
I've proposed the patch upstream:

https://www.redhat.com/archives/libvir-list/2015-August/msg00226.html

Comment 8 Michal Privoznik 2015-08-07 15:30:49 UTC
And moving to POST again:

http://post-office.corp.redhat.com/archives/rhvirt-patches/2015-August/msg00208.html

Comment 9 hongming 2015-08-27 09:15:52 UTC
Verified as follows. The result is as expected. Moving the status to VERIFIED.

# rpm -q libvirt
libvirt-1.2.17-6.el7.x86_64

# virsh edit test4

  <cpu>
    <numa>
      <cell id='0' cpus='0-1' memory='1024000' unit='KiB'/>
      <cell id='1' cpus='1-2' memory='1024000' unit='KiB'/>
    </numa>
  </cpu>

error: unsupported configuration: NUMA cells 1 and 0 have overlapping vCPU ids
Failed. Try again? [y,n,i,f,?]: 

# virsh edit test4

  <cpu>
    <numa>
      <cell id='0' cpus='0-1' memory='1024000' unit='KiB'/>
      <cell id='1' cpus='1' memory='1024000' unit='KiB'/>
    </numa>
  </cpu>

error: unsupported configuration: NUMA cells 1 and 0 have overlapping vCPU ids
Failed. Try again? [y,n,i,f,?]: 

# virsh edit test4

  <cpu>
    <numa>
      <cell id='0' cpus='0-1' memory='1024000' unit='KiB'/>
      <cell id='1' cpus='0-1' memory='1024000' unit='KiB'/>
    </numa>
  </cpu>

error: unsupported configuration: NUMA cells 1 and 0 have overlapping vCPU ids
Failed. Try again? [y,n,i,f,?]: 


# virsh edit test4

<vcpu placement='static'>4</vcpu>

  <cpu>
    <numa>
      <cell id='0' cpus='0-1' memory='1024000' unit='KiB'/>
      <cell id='1' cpus='2-4' memory='1024000' unit='KiB'/>
    </numa>
  </cpu>

error: internal error: Number of CPUs in <numa> exceeds the <vcpu> count
Failed. Try again? [y,n,i,f,?]: 


# virsh edit test4

 <vcpu placement='static'>4</vcpu>

  <cpu>
    <numa>
      <cell id='0' cpus='0-1' memory='1024000' unit='KiB'/>
      <cell id='1' cpus='2-3' memory='1024000' unit='KiB'/>
    </numa>
  </cpu>

Domain test4 XML configuration edited.

# virsh edit test4

 <vcpu placement='static'>4</vcpu>

  <cpu>
    <numa>
      <cell id='0' cpus='0-1' memory='1024000' unit='KiB'/>
      <cell id='1' cpus='9-10' memory='1024000' unit='KiB'/>
    </numa>
  </cpu>
error: internal error: CPU IDs in <numa> exceed the <vcpu> count
Failed. Try again? [y,n,i,f,?]:
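
The verification matrix above can be summarised with a small Python sketch that combines the three checks. The order in which the checks run is an assumption (only "count before range" can be inferred from the outputs above), the function names are hypothetical, and the first three cases are assumed to use 4 vCPUs; this is not libvirt's code.

```python
from itertools import combinations


def parse_cpus(spec):
    """Expand a cpus attribute such as '0-3,100' into a set of vCPU ids."""
    ids = set()
    for part in spec.split(","):
        low, _, high = part.partition("-")
        ids.update(range(int(low), int(high or low) + 1))
    return ids


def validate(vcpus, cells):
    """Return 'overlap', 'count', 'range', or None when the config is fine."""
    sets = [parse_cpus(spec) for spec in cells]
    if any(a & b for a, b in combinations(sets, 2)):
        return "overlap"  # "NUMA cells ... have overlapping vCPU ids"
    if sum(len(s) for s in sets) > vcpus:
        return "count"    # "Number of CPUs in <numa> exceeds the <vcpu> count"
    if any(i >= vcpus for s in sets for i in s):
        return "range"    # "CPU IDs in <numa> exceed the <vcpu> count"
    return None


for cells in (["0-1", "1-2"], ["0-1", "1"], ["0-1", "0-1"],
              ["0-1", "2-4"], ["0-1", "2-3"], ["0-1", "9-10"]):
    print(cells, validate(4, cells))
```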

Comment 11 errata-xmlrpc 2015-11-19 06:05:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2202.html