Bug 1254560 - [scale] vdsm failed to start vm
Summary: [scale] vdsm failed to start vm
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.6.0
Hardware: x86_64
OS: Linux
Priority: high
Severity: urgent
Target Milestone: ovirt-3.6.0-rc
Target Release: 3.6.0
Assignee: Michal Skrivanek
QA Contact: Eldad Marciano
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-08-18 11:53 UTC by Eldad Marciano
Modified: 2016-04-20 01:29 UTC

Fixed In Version: 3.6.0-11
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-04-20 01:29:15 UTC
oVirt Team: Virt
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 45113 0 master MERGED core: send guest cpus in numa node, not host cpus Never
oVirt gerrit 45225 0 ovirt-engine-3.6 MERGED core: send guest cpus in numa node, not host cpus Never

Description Eldad Marciano 2015-08-18 11:53:36 UTC
Description of problem:
A VM fails to start on a RHEV-M 3.6.0.9 build with a RHEL 7.2 host.

It seems the Engine computes the NUMA settings and the vCPU settings
separately, without cross-checking that the combined configuration is correct.

adding fromani's output:

This is what Engine sent:

Thread-555::DEBUG::2015-08-17
15:39:30,877::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling
'VM.create' in bridge with {u'vmParams': {u'acpiEnable': u'true',
u'emulatedMachine': u'pc-i440fx-rhel7.2
.0', u'tabletEnable': u'true', u'vmId':
u'ea8fd65b-990c-428d-a7bd-062e12da56bc', u'memGuaranteedSize': 128,
u'transparentHugePages': u'true', u'spiceSslCipherSuite': u'DEFAULT',
u'cpuType': u'SandyBridge', u'cus
tom': {u'device_eeb02d19-f6af-4643-8565-6e587716b7a7':
u"VmDevice:{id='VmDeviceId:{deviceId='eeb02d19-f6af-4643-8565-6e587716b7a7',
vmId='ea8fd65b-990c-428d-a7bd-062e12da56bc'}', device='ide',
type='CONTROLLER',
 bootOrder='0', specParams='[]', address='{slot=0x01, bus=0x00,
 domain=0x0000, type=pci, function=0x1}', managed='false', plugged='true',
 readOnly='false', deviceAlias='ide0', customProperties='[]', snapshotId='
null', logicalName='null', usingScsiReservation='false'}",
u'device_eeb02d19-f6af-4643-8565-6e587716b7a7device_1eba2661-58b9-475e-9c69-908617582210':
u"VmDevice:{id='VmDeviceId:{deviceId='1eba2661-58b9-475e-9c69
-908617582210', vmId='ea8fd65b-990c-428d-a7bd-062e12da56bc'}', device='unix',
type='CHANNEL', bootOrder='0', specParams='[]', address='{bus=0,
controller=0, type=virtio-serial, port=1}', managed='false', plugged
='true', readOnly='false', deviceAlias='channel0', customProperties='[]',
snapshotId='null', logicalName='null', usingScsiReservation='false'}",
u'device_eeb02d19-f6af-4643-8565-6e587716b7a7device_1eba2661-58b9-
475e-9c69-908617582210device_71345a6e-4bb7-4d2e-9ab7-2dc06ca848b4':
u"VmDevice:{id='VmDeviceId:{deviceId='71345a6e-4bb7-4d2e-9ab7-2dc06ca848b4',
vmId='ea8fd65b-990c-428d-a7bd-062e12da56bc'}', device='unix', type
='CHANNEL', bootOrder='0', specParams='[]', address='{bus=0, controller=0,
type=virtio-serial, port=2}', managed='false', plugged='true',
readOnly='false', deviceAlias='channel1', customProperties='[]', snapshot
Id='null', logicalName='null', usingScsiReservation='false'}"}, u'smp': u'1',
u'guestNumaNodes': [{u'nodeIndex': 0, u'cpus':
u'0,2,4,6,8,10,12,14,16,18,20,22', u'memory': u'512'}], u'numaTune':
{u'nodeset': u'0,1', u'mode': u'interleave'}, u'maxMemSlots': 16, u'vmType': u'kvm',
u'memSize': 512, u'smpCoresPerSocket': u'1', u'vmName': u'test', u'nice':
u'0', u'maxMemSize': 4194304, u'bootMenuEnable': u'false', u'smartcar
dEnable': u'false', u'keyboardLayout': u'en-us', u'kvmEnable': u'true',
u'pitReinjection': u'false', u'displayNetwork': u'ovirtmgmt', u'devices':
[{u'index': u'3', u'iface': u'ide', u'specParams': {u'vmPayload':
 {u'volId': u'config-2', u'file': {u'openstack/latest/meta_data.json':
 u'ewogICJsYXVuY2hfaW5kZXgiIDogIjAiLAogICJhdmFpbGFiaWxpdHlfem9uZSIgOiAibm92YSIsCiAgInV1aWQiIDogIjVkOGJkZWIwLTEzNTktNDVkYy05Mzk2LTdmNDFkMjU2OT
ViNCIsCiAgIm1ldGEiIDogewogICAgImVzc2VudGlhbCIgOiAiZmFsc2UiLAogICAgInJvbGUiIDogInNlcnZlciIsCiAgICAiZHNtb2RlIiA6ICJsb2NhbCIKICB9Cn0=',
u'openstack/latest/user_data':
u'I2Nsb3VkLWNvbmZpZwpzc2hfcHdhdXRoOiB0cnVlCmRpc
2FibGVfcm9vdDogMApvdXRwdXQ6CiAgYWxsOiAnPj4gL3Zhci9sb2cvY2xvdWQtaW5pdC1vdXRwdXQubG9nJwpjaHBhc3N3ZDoKICBleHBpcmU6IGZhbHNlCnJ1bmNtZDoKLSAnc2VkIC1pICcnL15kYXRhc291cmNlX2xpc3Q6IC9kJycgL2V0Yy9jbG91ZC9jbG91ZC5jZmc7IGVj
aG8gJydkYXRhc291cmNlX2xpc3Q6CiAgWyJOb0Nsb3VkIiwgIkNvbmZpZ0RyaXZlIl0nJyA+PiAvZXRjL2Nsb3VkL2Nsb3VkLmNmZycK'}}},
u'readonly': u'true', u'deviceId': u'3b4ae4e9-0a3a-4ca2-a0ab-021ebecb3168',
u'path': u'', u'device':
u'cdrom', u'shared': u'false', u'type': u'disk'}, {u'device': u'cirrus',
u'specParams': {u'vram': u'32768', u'heads': u'1'}, u'type': u'video',
u'deviceId': u'b04e2610-be9d-4564-af76-9ebef6f00d9a'}, {u'device':
u'vnc', u'specParams': {}, u'type': u'graphics', u'deviceId':
u'2ce08359-708c-4b51-af81-84dfbb08456f'}, {u'index': u'2', u'iface': u'ide',
u'address': {u'bus': u'1', u'controller': u'0', u'type': u'drive', u'tar
get': u'0', u'unit': u'0'}, u'specParams': {u'path': u''}, u'readonly':
u'true', u'deviceId': u'2915ba9f-adf2-4e8f-95ba-776a22425f87', u'path': u'',
u'device': u'cdrom', u'shared': u'false', u'type': u'disk'}, {
u'index': 0, u'iface': u'virtio', u'format': u'raw', u'bootOrder': u'1',
u'poolID': u'c137e30f-34f0-48f2-8c7c-3b0e52b0ff50', u'volumeID':
u'9c0a0cb9-bf2d-4b70-b536-f2b14ee38f72', u'imageID': u'dc07531c-1833-429d
-9138-5a637e587346', u'specParams': {}, u'readonly': u'false', u'domainID':
u'a96545a8-db17-41e3-87ca-c22be1efb7ff', u'optional': u'false', u'deviceId':
u'dc07531c-1833-429d-9138-5a637e587346', u'address': {u'sl
ot': u'0x06', u'bus': u'0x00', u'domain': u'0x0000', u'type': u'pci',
u'function': u'0x0'}, u'device': u'disk', u'shared': u'false',
u'propagateErrors': u'off', u'type': u'disk'}, {u'nicModel': u'pv',
u'macAddr'
: u'00:1a:4a:62:89:00', u'linkActive': u'true', u'network': u'ovirtmgmt',
u'filter': u'vdsm-no-mac-spoofing', u'specParams': {u'inbound': {},
u'outbound': {}}, u'deviceId': u'4f46f12d-0429-4a96-8971-f37316ff6174
', u'device': u'bridge', u'type': u'interface'}, {u'device':
u'virtio-serial', u'specParams': {}, u'type': u'controller', u'deviceId':
u'db018130-6677-4152-8cd2-ae0b1e94eab3', u'address': {u'slot': u'0x05', u'bu
s': u'0x00', u'domain': u'0x0000', u'type': u'pci', u'function': u'0x0'}}],
u'timeOffset': u'0', u'maxVCpus': u'16', u'spiceSecureChannels':
u'smain,sinputs,scursor,splayback,srecord,sdisplay,susbredir,ssmartcar
d', u'display': u'vnc'}, u'vmID': u'ea8fd65b-990c-428d-a7bd-062e12da56bc'}
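
To illustrate the mismatch in the call above: the VM has smp=1 and maxVCpus=16, yet its single guest NUMA node lists CPU ids up to 22. The following is a minimal sketch of the kind of cross-check that appears to be missing. It is not Engine or VDSM code; the function names are hypothetical, and the rule that valid guest CPU ids are 0 .. maxVCpus-1 is an assumption. The actual failure message is not included in this report.

```python
def parse_cpu_list(spec):
    """Expand a CPU list such as '0,2,4' or '0-3,8' into a sorted list of ints."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def invalid_guest_numa_cpus(vm_params):
    """Return {nodeIndex: [out-of-range cpu ids]} for every offending guest NUMA node."""
    # Assumption: valid guest CPU ids are 0 .. maxVCpus-1 (falling back to smp).
    limit = int(vm_params.get("maxVCpus", vm_params["smp"]))
    bad = {}
    for node in vm_params.get("guestNumaNodes", []):
        out_of_range = [c for c in parse_cpu_list(node["cpus"]) if c >= limit]
        if out_of_range:
            bad[node["nodeIndex"]] = out_of_range
    return bad


# Values copied from the VM.create call in this report.
vm_params = {
    "smp": "1",
    "maxVCpus": "16",
    "guestNumaNodes": [
        {"nodeIndex": 0, "cpus": "0,2,4,6,8,10,12,14,16,18,20,22", "memory": "512"},
    ],
}

print(invalid_guest_numa_cpus(vm_params))  # {0: [16, 18, 20, 22]}
```

Even with the generous maxVCpus limit, four of the listed ids cannot belong to this guest; against smp=1, only CPU 0 would be valid.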


relevant vdsCaps output:

        cpuCores = '12'
        cpuFlags =
        'fpu,vme,de,pse,tsc,msr,pae,mce,cx8,apic,sep,mtrr,pge,mca,cmov,pat,pse36,clflush,dts,acpi,mmx,fxsr,sse,sse2,ss,ht,tm,pbe,syscall,nx,pdpe1gb,rdtscp,lm,constant_tsc,arch_perfmon,pebs,bts,rep_good,nopl,xtopology,nonstop_tsc,aperfmperf,eagerfpu,pni,pclmulqdq,dtes64,monitor,ds_cpl,vmx,smx,est,tm2,ssse3,cx16,xtpr,pdcm,pcid,dca,sse4_1,sse4_2,x2apic,popcnt,tsc_deadline_timer,aes,xsave,avx,lahf_lm,ida,arat,pln,pts,dtherm,tpr_shadow,vnmi,flexpriority,ept,vpid,xsaveopt,model_Nehalem,model_Conroe,model_coreduo,model_core2duo,model_Penryn,model_Westmere,model_n270,model_SandyBridge'
        cpuModel = 'Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz'
        cpuSockets = '2'
        cpuSpeed = '1200.156'
        cpuThreads = '24'
        emulatedMachines = ['pc-i440fx-rhel7.1.0',
                            'rhel6.3.0',
                            'pc-q35-rhel7.2.0',
                            'pc-i440fx-rhel7.0.0',
                            'rhel6.1.0',
                            'rhel6.6.0',
                            'rhel6.2.0',
                            'pc',
                            'pc-q35-rhel7.0.0',
                            'pc-q35-rhel7.1.0',
                            'q35',
                            'pc-i440fx-rhel7.2.0',
                            'rhel6.4.0',
                            'rhel6.0.0',
                            'rhel6.5.0']
        guestOverhead = '65'
        hooks = {}
        hostdevPassthrough = 'false'
        kdumpStatus = 0
        kvmEnabled = 'true'
        lastClient = '127.0.0.1'
        lastClientIface = 'lo'
        liveMerge = 'true'
        liveSnapshot = 'true'
        memSize = '64219'
        netConfigDirty = 'False'
        numaNodeDistance = {'0': [10, 20], '1': [20, 10]}
        numaNodes = {'0': {'cpus': [0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22],
                           'totalMemory': '32722'},
                     '1': {'cpus': [1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23],
                           'totalMemory': '32768'}}
        onlineCpus =
        '0,2,4,6,8,10,12,14,16,18,20,22,1,3,5,7,9,11,13,15,17,19,21,23'
        operatingSystem = {'name': 'RHEL', 'release': '3.el7', 'version':
        '7.2'}
        packages2 = {'kernel': {'buildtime': 1438941184.0,
                                'release': '304.el7.x86_64',
                                'version': '3.10.0'},
                     'librbd1': {'buildtime': 1434713249, 'release': '3.el7',
                     'version': '0.80.7'},
                     'libvirt': {'buildtime': 1438952903, 'release': '4.el7',
                     'version': '1.2.17'},
                     'mom': {'buildtime': 1437041258, 'release': '1.el7ev',
                     'version': '0.5.0'},
                     'qemu-img': {'buildtime': 1439567932,
                                  'release': '18.el7',
                                  'version': '2.3.0'},
                     'qemu-kvm': {'buildtime': 1439567932,
                                  'release': '18.el7',
                                  'version': '2.3.0'},
                     'spice-server': {'buildtime': 1437475779,
                                      'release': '13.el7',
                                      'version': '0.12.4'},
                     'vdsm': {'buildtime': 1439374220, 'release': '1.el7ev',
                     'version': '4.17.2'}}
        reservedMem = '321'
        rngSources = ['random']
        selinux = {'mode': '0'}
        software_revision = '1'
        software_version = '4.17'
        supportedENGINEs = ['3.4', '3.5', '3.6']
        uuid = '4C4C4544-0051-4E10-804D-B5C04F515731'
        version_name = 'Snow Man'
        vlans = {}
        vmTypes = ['kvm']

host lscpu:

[root@host05-rack06 vdsm]# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                24
On-line CPU(s) list:   0-23
Thread(s) per core:    2
Core(s) per socket:    6
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 45
Model name:            Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz
Stepping:              7
CPU MHz:               1199.921
BogoMIPS:              4003.77
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              15360K
NUMA node0 CPU(s):     0,2,4,6,8,10,12,14,16,18,20,22
NUMA node1 CPU(s):     1,3,5,7,9,11,13,15,17,19,21,23
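
The guest NUMA node sent by the Engine carries exactly the CPU list of host NUMA node0 shown above, which is consistent with the linked gerrit patch title ("send guest cpus in numa node, not host cpus"). A small sketch that makes the comparison explicit; the function name is hypothetical and the data is copied from this report:

```python
def matching_host_nodes(guest_cpus_spec, host_numa_nodes):
    """Return the host NUMA node ids whose CPU set equals the guest node's CPU list."""
    sent = {int(c) for c in guest_cpus_spec.split(",")}
    return [
        node_id
        for node_id, cpus in sorted(host_numa_nodes.items())
        if sent == set(cpus)
    ]


# Host topology as reported by lscpu / vdsCaps: even CPUs on node0, odd CPUs on node1.
host_numa_nodes = {
    0: list(range(0, 24, 2)),
    1: list(range(1, 24, 2)),
}

# 'cpus' value of the guest NUMA node in the VM.create call.
guest_node_cpus = "0,2,4,6,8,10,12,14,16,18,20,22"

print(matching_host_nodes(guest_node_cpus, host_numa_nodes))  # [0]
```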



Version-Release number of selected component (if applicable):
engine - 3.6.0-0.11.master.el6
vdsm - vdsm-4.17.2-1.el7ev.noarch
libvirt - libvirt-client-1.2.17-4.el7.x86_64
kernel - 3.10.0-304.el7.x86_64
rhel - 7.2

How reproducible:
100%

Steps to Reproduce:
1. Install engine 3.6.0.9; add a DC, cluster, storage, and a host (running RHEL 7.2).
2. Create a new VM from a template.
3. Run the VM.

Actual results:
The VM fails to start.

Expected results:
The VM starts and runs successfully.

Additional info:

Comment 1 Omer Frenkel 2015-08-18 12:45:32 UTC
As a workaround, you can disable memory hot-plug in the DB:

insert into vdc_options(option_name ,option_value,version) values('HotPlugMemorySupported','{"x86_64":"false","ppc64":"false"}','3.6');

If you do this, remember to re-enable it once this bug is fixed.

Comment 2 Omer Frenkel 2015-08-18 13:26:29 UTC
Please attach the vdsm log that contains the create call and the failure message.

Comment 3 Roy Golan 2015-08-18 13:39:39 UTC
I feel this hides another bug. Can you create a VM pinned to this host, create one NUMA node for it, and pin that NUMA node?

Comment 4 Eldad Marciano 2015-08-19 08:16:44 UTC
(In reply to Omer Frenkel from comment #2)
> please attach vdsm log that contains the create + the failure message

Dev (rgolan) is currently investigating the QE setup; all the logs are with you.

Comment 6 Eldad Marciano 2016-01-27 23:05:04 UTC
I switched back to values('HotPlugMemorySupported','{"x86_64":"true","ppc64":"true"}','3.6');
and a VM starts up correctly on top of 3.6.2.

