Bug 1256836 - Improvement: start vm on host with unusual numa architecture failed
Status: CLOSED WONTFIX
Product: ovirt-engine
Classification: oVirt
Component: General
3.6.0
ppc64le Linux
unspecified Severity low
: ---
: ---
Assigned To: nobody nobody
: Improvement
Depends On:
Blocks:
Reported: 2015-08-25 10:49 EDT by Artyom
Modified: 2017-01-17 16:13 EST (History)
10 users

See Also:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-01-17 16:13:19 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: SLA
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
dfediuck: ovirt-future?
rule-engine: planning_ack?
alukiano: devel_ack?
rule-engine: testing_ack?


Attachments
vdsm (15.82 MB, text/plain)
2015-08-25 10:49 EDT, Artyom
no flags Details

Description Artyom 2015-08-25 10:49:52 EDT
Created attachment 1066894 [details]
vdsm

Description of problem:
Start a VM with 1 CPU and without NUMA nodes on a host that has a NUMA architecture:
# numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 12 13 14 15 16 17
node 0 size: 12276 MB
node 0 free: 11218 MB
node 1 cpus: 6 7 8 9 10 11 18 19 20 21 22 23
node 1 size: 12288 MB
node 1 free: 11502 MB
node distances:
node   0   1 
  0:  10  21 
  1:  21  10 
and with this CPU:
# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                24
On-line CPU(s) list:   0-23
Thread(s) per core:    2
Core(s) per socket:    6
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 44
Model name:            Intel(R) Xeon(R) CPU           E5649  @ 2.53GHz
Stepping:              2
CPU MHz:               2660.000
BogoMIPS:              5066.55
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              12288K
NUMA node0 CPU(s):     0-5,12-17
NUMA node1 CPU(s):     6-11,18-23

The start fails with a libvirt error.

Version-Release number of selected component (if applicable):
host - vdsm-4.17.3-1.el7ev.noarch
engine - rhevm-3.6.0-0.12.master.el6.noarch

How reproducible:
Always

Steps to Reproduce:
1. Add a host with the same NUMA architecture as above to the engine
2. Create a VM with one CPU, pinned to the host, and without NUMA nodes
3. Start vm

Actual results:
The VM fails to start with the error:
libvirtError: internal error: CPU IDs in <numa> exceed the <vcpu> count

Expected results:
The VM runs without any errors.

Additional info:
If I define one NUMA node, the VM starts successfully.

dumpxml of vm with one numa node:
<cpu mode='custom' match='exact'>
    <model fallback='allow'>Conroe</model>
    <topology sockets='16' cores='1' threads='1'/>
    <numa>
      <cell id='0' cpus='0' memory='1048576' unit='KiB'/>
    </numa>
  </cpu>


dumpxml of vm without numa node:
<cpu match="exact">
    <model>Conroe</model>
    <topology cores="1" sockets="16" threads="1"/>
    <numa>
        <cell cpus="0,1,2,3,4,5,12,13,14,15,16,17" memory="1048576"/>
    </numa>
</cpu>
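The mismatch in the failing XML can be seen directly: the <topology> element yields 16 vCPUs (sockets × cores × threads), so valid guest CPU IDs are 0-15, yet the <numa> cell copies host node 0's CPU list, which includes IDs 16 and 17. The following is a minimal sketch of the consistency rule behind the libvirt error (the helper name is illustrative, not libvirt's actual code):

```python
def invalid_numa_cell_cpu_ids(sockets, cores, threads, cell_cpus):
    """Return the CPU IDs in a <numa> cell that exceed the vCPU count.

    Mirrors the rule behind "CPU IDs in <numa> exceed the <vcpu> count":
    every ID referenced by <cell cpus="..."> must be < sockets*cores*threads.
    """
    vcpus = sockets * cores * threads
    ids = []
    for part in cell_cpus.split(","):
        if "-" in part:  # support range syntax like "0-5"
            lo, hi = part.split("-")
            ids.extend(range(int(lo), int(hi) + 1))
        else:
            ids.append(int(part))
    return [i for i in ids if i >= vcpus]

# The failing domain: 16 vCPUs, but the cell lists host node 0's CPUs.
print(invalid_numa_cell_cpu_ids(16, 1, 1, "0,1,2,3,4,5,12,13,14,15,16,17"))
# -> [16, 17]
```

IDs 16 and 17 fall outside the 0-15 vCPU range, which is exactly what libvirt rejects.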
Comment 1 Doron Fediuck 2015-08-26 09:29:53 EDT
Please verify this topology is 'legal' from a NUMA perspective,
i.e. each cell hosts cores and RAM with some reasonable correlation.
What you describe may break the topology, since the data will end up interleaved.
Comment 2 Artyom 2015-08-27 06:37:47 EDT
It is from an IBM blade center, so I believe it is legal, but the question here is why we send a NUMA cell when I did not define one via the engine:
<numa>
                        <cell cpus="0,1,2,3,4,5,12,13,14,15,16,17" memory="1048576"/>
I see a difference between PPC hosts and regular x86_64 hosts in the vdsm log when I start a VM that is pinned to a host without creating a vNUMA node.

vdsm-4.17.3-1.el7ev.noarch

On a PPC host, we do not send a NUMA node:
u'cpuType': u'power8', u'smp': u'1', u'smartcardEnable': u'false'

On an x86_64 host:
u'cpuType': u'Conroe', u'smp': u'1', u'guestNumaNodes': [{u'nodeIndex': 0, u'cpus': u'0,1,2,3,4,5,12,13,14,15,16,17', u'memory': u'1024'}], u'smartcardEnable': u'false'
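The same inconsistency can be expressed against the vdsm create parameters above: 'smp' carries the current vCPU count, and each 'guestNumaNodes' entry carries a 'cpus' string. The helper below is a hypothetical sanity check, not part of vdsm, and it compares against 'smp' as a simple stand-in (libvirt's actual check uses the maximum vCPU count derived from <topology>):

```python
def check_guest_numa_nodes(vm_params):
    """Flag guest NUMA cells referencing CPU IDs >= the vCPU count.

    vm_params is a dict in the shape vdsm logs, e.g.
    {'smp': '1', 'guestNumaNodes': [{'nodeIndex': 0, 'cpus': '0,1', ...}]}.
    Returns {nodeIndex: [offending CPU IDs]} for each bad cell.
    """
    vcpus = int(vm_params.get("smp", "1"))
    bad = {}
    for node in vm_params.get("guestNumaNodes", []):
        too_high = [int(c) for c in node["cpus"].split(",") if int(c) >= vcpus]
        if too_high:
            bad[node["nodeIndex"]] = too_high
    return bad

# The x86_64 case from the log: smp=1, but the engine-generated cell
# lists all 12 CPU IDs of host NUMA node 0.
params = {
    "smp": "1",
    "guestNumaNodes": [
        {"nodeIndex": 0, "cpus": "0,1,2,3,4,5,12,13,14,15,16,17",
         "memory": "1024"}
    ],
}
print(check_guest_numa_nodes(params))
```

On the PPC host no 'guestNumaNodes' key is sent at all, so the same check returns an empty dict there.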
Comment 4 Doron Fediuck 2015-11-17 11:21:22 EST
(In reply to Artyom from comment #2)
> It from IBM blade center, so I believe it legal, but problem here why we
> send numa cell when I not defined one via engine:
> <numa>
>                         <cell cpus="0,1,2,3,4,5,12,13,14,15,16,17"
> memory="1048576"/>
> I see difference on PPC hosts and regular x86_64 hosts in vdsm log when I
> start vm that pinned to host without create VNUMA node
> 
> vdsm-4.17.3-1.el7ev.noarch
> 
> On PPC host:
> we not send numa node
> u'cpuType': u'power8', u'smp': u'1', u'smartcardEnable': u'false'
> 
> On x86_64 host:
> u'cpuType': u'Conroe', u'smp': u'1', u'guestNumaNodes': [{u'nodeIndex': 0,
> u'cpus': u'0,1,2,3,4,5,12,13,14,15,16,17', u'memory': u'1024'}],
> u'smartcardEnable': u'false'

PPC currently does not support NUMA.
Did you find this issue on a PPC machine or a standard AMD64 machine?
Comment 5 Artyom 2015-11-29 05:17:48 EST
1) Where did you find the information that NUMA is not supported on the ppc64 architecture? I checked it on our power8 hosts and it works fine.
2) I found this issue on x86_64, but with vdsm-4.17.11-0.el7ev.noarch and libvirt-1.2.17-13.el7.x86_64 the error does not appear and the cpu element looks like:
<cpu match="exact">
    <model>Conroe</model>
    <topology cores="1" sockets="16" threads="1"/>
    <numa>
        <cell cpus="0" memory="1048576"/>
    </numa>
</cpu>
