Bug 1479776

Summary: After OVF generation HE VM can not start
Product: [oVirt] ovirt-engine
Reporter: Artyom <alukiano>
Component: BLL.HostedEngine
Assignee: Andrej Krejcir <akrejcir>
Status: CLOSED CURRENTRELEASE
QA Contact: Artyom <alukiano>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 4.2.0
CC: akrejcir, alukiano, bugs, nsednev, stirabos, ylavi
Target Milestone: ovirt-4.2.0
Keywords: Regression, Triaged
Target Release: ---
Flags: rule-engine: ovirt-4.2+, rule-engine: blocker+
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-12-20 11:20:48 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: SLA
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1487915
Bug Blocks: 1426219
Attachments:
Description Flags
engine, vdsm and agent logs none

Description Artyom 2017-08-09 11:36:27 UTC
Created attachment 1311173
engine, vdsm and agent logs

Description of problem:
The engine generates the HE VM OVF file with an incorrectly formatted virtio-scsi controller device.

Version-Release number of selected component (if applicable):
vdsm-4.20.2-41.gitd789a5e.el7.centos.x86_64
ovirt-hosted-engine-ha-2.2.0-0.0.master.20170616124434.20170616124430.git18dac95.el7.centos.noarch
ovirt-hosted-engine-setup-2.2.0-0.0.master.20170731095521.gitac5912a.el7.centos.noarch
ovirt-engine-4.2.0-0.0.master.20170806164941.git514c1c9.el7.centos.noarch

How reproducible:
Always

Steps to Reproduce:
1. Deploy an HE environment on a clean host
2. Add a master storage domain to it and wait for the auto-import operation to finish
3. Wait more than an hour (I believe the OVF update interval can be reduced to speed this up, but I did not try it)
4. Enable global maintenance
5. Power-off HE VM
6. Start HE VM

Actual results:
The HE VM fails to start, with the following traceback in the vdsm log:
2017-08-09 12:02:25,378+0300 ERROR (vm/ff13f132) [virt.vm] (vmId='ff13f132-5ab0-4da0-bef8-afec4210042c') The vm start process failed (vm:823)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 752, in _startUnderlyingVm
    self._run()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2446, in _run
    dom = self._connection.defineXML(domxml)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 125, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 586, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3585, in defineXML
    if ret is None:raise libvirtError('virDomainDefineXML() failed', conn=self)
libvirtError: unsupported configuration: Unknown controller type 'virtio-scsi'

Expected results:
The HE VM starts successfully, without any tracebacks in the vdsm log

Additional info:
Workaround:
1. # cp /var/run/ovirt-hosted-engine-ha/vm.conf /root/
2. Remove the following line from /root/vm.conf: devices={device:virtio-scsi,specParams:{index:0,model:virtio-scsi},type:controller,deviceId:c0c21911-915b-418a-b01d-7e37ed0ef6e1,address:{type:pci,slot:0x04,bus:0x00,domain:0x0000,function:0x0}}
3. # hosted-engine --vm-start --vm-conf=/root/vm.conf
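The manual edit of the copied vm.conf can also be scripted. The following is only a minimal sketch, assuming the offending entry is always a single line that starts with "devices={device:virtio-scsi," and contains "type:controller", as in the line quoted in this report. The function names are hypothetical and not part of any oVirt tool.

```python
def is_virtio_scsi_controller(line):
    """Return True for a vm.conf 'devices=' line that describes a
    controller whose device is 'virtio-scsi' (the entry libvirt rejects)."""
    stripped = line.strip()
    return (
        stripped.startswith("devices={device:virtio-scsi,")
        and "type:controller" in stripped
    )


def drop_virtio_scsi_controller(lines):
    """Return a copy of the vm.conf lines without the bad controller entry."""
    return [line for line in lines if not is_virtio_scsi_controller(line)]


def write_filtered_copy(src, dst):
    """Read vm.conf from src, write the filtered copy to dst, and return
    the number of removed lines. The paths are supplied by the caller."""
    with open(src) as handle:
        lines = handle.readlines()
    kept = drop_virtio_scsi_controller(lines)
    with open(dst, "w") as handle:
        handle.writelines(kept)
    return len(lines) - len(kept)
```

For example, write_filtered_copy("/var/run/ovirt-hosted-engine-ha/vm.conf", "/root/vm.conf") would produce the file that is then passed to hosted-engine --vm-start via --vm-conf.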

Comment 1 Doron Fediuck 2017-08-09 12:19:31 UTC
Can you please check if this is a dup of bug 1363926?
Mainly look at comment 2 (there's a naming issue).

Comment 2 Artyom 2017-08-09 12:29:47 UTC
Yes, I saw this bug, but for me, it looks different.
# vdsm-client StorageDomain getInfo storagedomainID="93400e8c-f145-4b47-96ec-af211d60ecd5"
{
    "uuid": "93400e8c-f145-4b47-96ec-af211d60ecd5", 
    "version": "4", 
    "role": "Regular", 
    "remotePath": "yellow-vdsb.qa.lab.tlv.redhat.com:/Compute_NFS/alukiano/he_0", 
    "type": "NFS", 
    "class": "Data", 
    "pool": [
        "59898fb8-0345-01ea-0144-000000000328"
    ], 
    "name": "hosted_storage"
}

Also, in 4.2 the HostedEngineStorageDomainName parameter does not appear under engine-config at all.

I also reproduced the problem 3 times, on different environments.

Comment 3 Red Hat Bugzilla Rules Engine 2017-08-10 11:04:20 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 4 Nikolai Sednev 2017-08-13 09:27:51 UTC
Just in case: to shorten the one-hour OVF update delay, you can use these commands:
engine-config --list | grep -i ovf
engine-config -s OvfUpdateIntervalInMinutes=1
service ovirt-engine restart
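If the reproduction is automated, the two state-changing commands above can be wrapped in a small helper. This is just a sketch, assuming it runs as root on the engine machine. The helper names are hypothetical, and only the engine-config and service invocations quoted in this comment are used.

```python
import subprocess


def ovf_interval_commands(minutes):
    """Build the commands that set the OVF update interval and restart
    the engine, mirroring the commands quoted in this comment."""
    if not isinstance(minutes, int) or minutes < 1:
        raise ValueError("minutes must be a positive integer")
    return [
        ["engine-config", "-s", "OvfUpdateIntervalInMinutes=%d" % minutes],
        ["service", "ovirt-engine", "restart"],
    ]


def apply_ovf_interval(minutes, run=subprocess.check_call):
    """Run each command in order; 'run' can be replaced for a dry run."""
    for command in ovf_interval_commands(minutes):
        run(command)
```

Passing run=print gives a dry run that only shows the commands that would be executed.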

Comment 5 Artyom 2017-09-24 05:54:39 UTC
Verified on ovirt-engine-4.2.0-0.0.master.20170917124606.gita804ef7.el7.centos.noarch

Comment 6 Sandro Bonazzola 2017-12-20 11:20:48 UTC
This bugzilla is included in oVirt 4.2.0 release, published on Dec 20th 2017.

Since the problem described in this bug report should be
resolved in oVirt 4.2.0 release, published on Dec 20th 2017, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.