Bug 1358383 - The engine VM doesn't restart on Conroe hosts regardless of cluster CPU level
Summary: The engine VM doesn't restart on Conroe hosts regardless of cluster CPU level
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.HostedEngine
Version: 4.0.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-4.0.6
Target Release: 4.0.6.2
Assignee: Roy Golan
QA Contact: Nikolai Sednev
URL:
Whiteboard: PM-05
Depends On:
Blocks:
Reported: 2016-07-20 15:10 UTC by Simone Tiraboschi
Modified: 2017-05-11 09:29 UTC
CC List: 9 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2017-01-18 07:26:13 UTC
oVirt Team: SLA
Embargoed:
rule-engine: ovirt-4.0.z+
mgoldboi: planning_ack+
rgolan: devel_ack+
mavital: testing_ack+


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 1370120 | 0 | medium | CLOSED | Inform the user HE VM needs to be restarted after cluster upgrade | 2023-09-14 03:30:06 UTC
oVirt gerrit | 64425 | 0 | ovirt-engine-4.0 | MERGED | utils: Encapsulte OVF build algo into OvfBuilder | 2016-11-09 12:55:17 UTC
oVirt gerrit | 64426 | 0 | ovirt-engine-4.0 | MERGED | core: Move OvfManager to bll | 2016-11-23 14:17:17 UTC
oVirt gerrit | 64427 | 0 | ovirt-engine-4.0 | MERGED | core: Add Hosted Engine OVF writer | 2016-11-23 14:17:22 UTC
oVirt gerrit | 64439 | 0 | ovirt-engine-4.0 | MERGED | core: cleanup unnecessary modifiers in IOvfBuilder | 2016-11-18 09:03:10 UTC
oVirt gerrit | 64957 | 0 | master | POST | utils: Encapsulte OVF build algo into OvfBuilder | 2016-10-31 09:32:07 UTC
oVirt gerrit | 64958 | 0 | master | POST | core: cleanup unnecessary modifiers in IOvfBuilder | 2016-10-31 09:31:58 UTC
oVirt gerrit | 64959 | 0 | master | POST | core: Move OvfManager to bll | 2016-10-31 09:31:49 UTC
oVirt gerrit | 64960 | 0 | master | MERGED | core: Add Hosted Engine OVF writer | 2016-11-16 11:30:39 UTC
oVirt gerrit | 67441 | 0 | ovirt-engine-4.0.6 | MERGED | core: cleanup unnecessary modifiers in IOvfBuilder | 2016-11-28 17:01:51 UTC
oVirt gerrit | 67442 | 0 | ovirt-engine-4.0.6 | MERGED | core: Move OvfManager to bll | 2016-11-28 17:02:06 UTC
oVirt gerrit | 67443 | 0 | ovirt-engine-4.0.6 | MERGED | core: Add Hosted Engine OVF writer | 2016-11-28 17:02:02 UTC

Internal Links: 1370120

Description Simone Tiraboschi 2016-07-20 15:10:59 UTC
Description of problem:
With hosted-engine-setup we let the user choose the CPU type and we configure the cluster for that.

In the initial vm.conf we have:
cpuType=Conroe

But this is not in the vm.conf we convert back from the OVF_STORE, so vdsm falls back to a generic CPU model (qemu64). That model requires features a Conroe host does not provide, so the VM refuses to start.

        <cpu match="exact">
                <model>qemu64</model>
                <feature name="svm" policy="disable"/>
        </cpu>
<on_poweroff>destroy</on_poweroff><on_reboot>destroy</on_reboot><on_crash>destroy</on_crash></domain>
Thread-668::ERROR::2016-07-20 10:40:54,871::vm::765::virt.vm::(_startUnderlyingVm) vmId=`0cbecd57-0085-4395-8add-0c21f35595d2`::The vm start process failed
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 706, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/virt/vm.py", line 1996, in _run
    self._connection.createXML(domxml, flags),
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 123, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 916, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3611, in createXML
    if ret is None:raise libvirtError('virDomainCreateXML() failed', conn=self)
libvirtError: unsupported configuration: guest and host CPU are not compatible: Host CPU does not provide required features: cx16
Thread-668::INFO::2016-07-20 10:40:54,877::vm::1308::virt.vm::(setDownStatus) vmId=`0cbecd57-0085-4395-8add-0c21f35595d2`::Changed state to Down: unsupported configuration: guest and host CPU are not compatible: Host CPU does not provide required features: cx16 (code=1)

Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1. deploy hosted-engine on a Conroe host
2. add the first regular SD and wait for the OVF_STORE to be created
3. try to restart the engine VM

Actual results:
vm.conf from the OVF_STORE lacks cpuType=Conroe, VDSM uses <model>qemu64</model> and the VM refuses to start with 'Host CPU does not provide required features: cx16'
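
To confirm which CPU model ended up in the generated domain XML, something like the following could be used. It is only a sketch: the XML is trimmed to the <cpu> element from the log above, and the helper names (cpu_model, disabled_features) are made up for illustration, not vdsm or libvirt APIs.

```python
import xml.etree.ElementTree as ET

# Trimmed domain XML; only the <cpu> element from the failing log is kept.
DOMAIN_XML = """\
<domain type="kvm">
  <cpu match="exact">
    <model>qemu64</model>
    <feature name="svm" policy="disable"/>
  </cpu>
</domain>
"""


def cpu_model(domain_xml):
    """Return the text of <cpu><model>, or None if it is absent."""
    root = ET.fromstring(domain_xml)
    model = root.find("cpu/model")
    return model.text if model is not None else None


def disabled_features(domain_xml):
    """Return the names of CPU features with policy="disable"."""
    root = ET.fromstring(domain_xml)
    return [
        feature.get("name")
        for feature in root.findall("cpu/feature")
        if feature.get("policy") == "disable"
    ]


print(cpu_model(DOMAIN_XML))
print(disabled_features(DOMAIN_XML))
```

A model of qemu64 here, instead of the Conroe chosen at deployment, matches the failure described in this report.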

Expected results:
ovirt-ha-agent is able to restart the engine VM

Additional info:

Comment 1 Simone Tiraboschi 2016-07-20 15:25:21 UTC
In ovf2VmParams.py we already have 
 vmParams['cpuType'] = text(tree, 'Content/CustomCpuName')
but in the ovf from the OVF_STORE we just have 
 <CustomCpuName />

So maybe we should move this to a different component.
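
A minimal sketch of why the empty element loses the CPU type, assuming a simplified OVF without namespaces. element_text and cpu_params are hypothetical stand-ins, not the actual code in ovf2VmParams.py, and unlike the line quoted above this sketch skips the key entirely when the element is empty.

```python
import xml.etree.ElementTree as ET

# Simplified OVF fragments (assumption): a real OVF from the OVF_STORE
# uses namespaces and carries many more elements.
OVF_EMPTY = "<Envelope><Content><CustomCpuName /></Content></Envelope>"
OVF_CONROE = (
    "<Envelope><Content>"
    "<CustomCpuName>Conroe</CustomCpuName>"
    "</Content></Envelope>"
)


def element_text(tree, path):
    """Hypothetical stand-in for text(): stripped element text, or None."""
    node = tree.find(path)
    if node is None or node.text is None:
        return None
    return node.text.strip() or None


def cpu_params(ovf_xml):
    """Return only the CPU-related vm.conf parameters found in the OVF."""
    tree = ET.fromstring(ovf_xml)
    params = {}
    cpu_type = element_text(tree, "Content/CustomCpuName")
    if cpu_type:
        # Only set cpuType when the OVF actually names a CPU model.
        params["cpuType"] = cpu_type
    return params


print(cpu_params(OVF_EMPTY))
print(cpu_params(OVF_CONROE))
```

With <CustomCpuName /> there is no text to read, so no usable cpuType reaches vm.conf, whatever the converter does with the empty value.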

Comment 2 Doron Fediuck 2016-07-21 12:26:14 UTC
Roy,
could this be occurring during the vm import?

Comment 3 Roy Golan 2016-07-26 15:39:29 UTC
(In reply to Doron Fediuck from comment #2)
> Roy,
> could this be occurring during the vm import?

How come we didn't see this until now? Is this due to a change in cpu_map.xml by libvirt?

We should set the custom CPU name on import. What happens when the cluster is later updated?

Comment 4 Red Hat Bugzilla Rules Engine 2016-08-10 08:06:04 UTC
Bug tickets must have version flags set prior to targeting them to a release. Please ask maintainer to set the correct version flags and only then set the target milestone.

Comment 5 Nikolai Sednev 2016-11-02 13:05:23 UTC
Does reproduction depend solely on a physical Conroe CPU, or is choosing the "Conroe" CPU compatibility mode enough?

Comment 6 Simone Tiraboschi 2016-11-02 13:29:51 UTC
I'd expect it to also happen in a nested environment with the L1 CPU set to Conroe, but I didn't specifically test it.

Comment 7 Roy Golan 2016-11-28 14:45:51 UTC
This bug was moved to QA while the patches were not yet in the 4.0.6 branch.

Comment 9 Martin Sivák 2016-11-28 14:47:38 UTC
Roy, which patches? I see the OVF writer was merged. Btw, shouldn't this be an ovirt-engine bug?

Comment 10 Roy Golan 2016-11-28 15:31:58 UTC
(In reply to Martin Sivák from comment #9)
> Roy, which patches? I see the OVF writer was merged. Btw, shouldn't this be
> an ovirt-engine bug?

It was merged to 4.0 but not to 4.0.6, which is the target version. I don't see it under the 4.0.6 tag either.

Comment 12 Nikolai Sednev 2016-12-06 11:46:33 UTC
# cat /run/ovirt-hosted-engine-ha/vm.conf
vmId=9d3d1383-0314-431e-a164-8ea57b595721
memSize=16384
display=vnc
devices={index:2,iface:ide,address:{ controller:0, target:0,unit:0, bus:1, type:drive},specParams:{},readonly:true,deviceId:7cf03305-6d43-4090-b7c7-ace489127483,path:,device:cdrom,shared:false,type:disk}
devices={index:0,iface:virtio,format:raw,poolID:00000000-0000-0000-0000-000000000000,volumeID:ecf5980c-0c4c-4e20-9627-742d4207579c,imageID:3d316ed4-b8b4-4ef1-9503-3e97506c555a,specParams:{},readonly:false,domainID:9701d2d7-aca8-4c94-ba1a-9fa2ab02ba56,optional:false,deviceId:3d316ed4-b8b4-4ef1-9503-3e97506c555a,address:{bus:0x00, slot:0x06, domain:0x0000, type:pci, function:0x0},device:disk,shared:exclusive,propagateErrors:off,type:disk,bootOrder:1}
devices={device:scsi,model:virtio-scsi,type:controller}
devices={nicModel:pv,macAddr:00:16:3E:7B:BB:BB,linkActive:true,network:ovirtmgmt,specParams:{},deviceId:0440aaed-1c5e-4f3b-ad33-4a0d69f72096,address:{bus:0x00, slot:0x03, domain:0x0000, type:pci, function:0x0},device:bridge,type:interface}
devices={device:console,specParams:{},type:console,deviceId:096d4af8-5e59-4424-8131-c1772c04d991,alias:console0}
devices={device:vga,alias:video0,type:video}
vmName=HostedEngine
spiceSecureChannels=smain,sdisplay,sinputs,scursor,splayback,srecord,ssmartcard,susbredir
smp=4
maxVCpus=4
cpuType=Conroe
emulatedMachine=rhel6.5.0
devices={device:virtio,specParams:{source:random},model:virtio,type:rng}


The engine was successfully restarted on a host that was set to the Conroe model during hosted-engine deployment.
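
The same check can be scripted. This is a rough sketch that parses vm.conf-style key=value text and tests for cpuType; the sample is abridged from the output above, and parse_vm_conf and has_cpu_type are illustrative names, not part of ovirt-hosted-engine-ha.

```python
# Abridged from the vm.conf shown above; the real file has more devices.
VM_CONF_SAMPLE = """\
vmName=HostedEngine
smp=4
maxVCpus=4
cpuType=Conroe
emulatedMachine=rhel6.5.0
devices={device:vga,alias:video0,type:video}
devices={device:scsi,model:virtio-scsi,type:controller}
"""


def parse_vm_conf(text):
    """Parse key=value lines; repeated 'devices' keys are kept as a list."""
    params = {}
    devices = []
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        # Split on the first '=' only, since values may contain '='.
        key, _, value = line.partition("=")
        if key == "devices":
            devices.append(value)
        else:
            params[key] = value
    params["devices"] = devices
    return params


def has_cpu_type(text):
    """True when vm.conf carries a non-empty cpuType entry."""
    return bool(parse_vm_conf(text).get("cpuType"))


print(has_cpu_type(VM_CONF_SAMPLE))
```

On a host, the text would come from /run/ovirt-hosted-engine-ha/vm.conf rather than from the inline sample.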

Works for me with these components on the host:
qemu-kvm-rhev-2.6.0-28.el7_3.1.x86_64
vdsm-4.18.18-1.el7ev.x86_64
ovirt-setup-lib-1.0.2-1.el7ev.noarch
ovirt-imageio-common-0.3.0-0.el7ev.noarch
sanlock-3.4.0-1.el7.x86_64
ovirt-vmconsole-1.0.4-1.el7ev.noarch
mom-0.5.8-1.el7ev.noarch
rhev-release-4.0.6-4-001.noarch
ovirt-hosted-engine-ha-2.0.6-1.el7ev.noarch
ovirt-host-deploy-1.5.3-1.el7ev.noarch
ovirt-engine-sdk-python-3.6.9.1-1.el7ev.noarch
ovirt-imageio-daemon-0.4.0-0.el7ev.noarch
libvirt-client-2.0.0-10.el7.x86_64
ovirt-vmconsole-host-1.0.4-1.el7ev.noarch
rhevm-appliance-20161116.0-1.el7ev.noarch
ovirt-hosted-engine-setup-2.0.4.1-2.el7ev.noarch
Linux version 3.10.0-514.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Wed Oct 19 11:24:13 EDT 2016
Linux 3.10.0-514.el7.x86_64 #1 SMP Wed Oct 19 11:24:13 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.3 (Maipo)

Engine:
rhevm-setup-plugins-4.0.0.3-1.el7ev.noarch
rhevm-spice-client-x86-msi-4.0-3.el7ev.noarch
rhev-release-4.0.6-4-001.noarch
rhevm-dependencies-4.0.0-1.el7ev.noarch
rhev-guest-tools-iso-4.0-6.el7ev.noarch
rhevm-doc-4.0.6-1.el7ev.noarch
rhevm-spice-client-x64-msi-4.0-3.el7ev.noarch
rhevm-4.0.6.2-0.1.el7ev.noarch
rhevm-guest-agent-common-1.0.12-3.el7ev.noarch
rhevm-branding-rhev-4.0.0-6.el7ev.noarch
Linux version 3.10.0-514.2.2.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Wed Nov 16 13:15:13 EST 2016
Linux 3.10.0-514.2.2.el7.x86_64 #1 SMP Wed Nov 16 13:15:13 EST 2016 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.3 (Maipo)

I've seen the engine VM start just fine after 3 restarts on the host.

