Bug 1363926 - Hosted Engine fails to start with "libvirtError: unsupported configuration: Unknown controller type 'virtio-scsi'"
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: vdsm
Classification: oVirt
Component: Core
Version: 4.17.33
Hardware: x86_64
OS: Linux
Target Milestone: ---
Assignee: Yedidyah Bar David
QA Contact: meital avital
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-08-04 01:04 UTC by David Galloway
Modified: 2016-09-11 12:07 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2016-09-08 08:09:06 UTC
oVirt Team: Integration
Embargoed:
rule-engine: planning_ack?
rule-engine: devel_ack?
rule-engine: testing_ack?


Attachments
vdsm.log during 'hosted-engine --vm-start' (14.87 KB, text/plain)
2016-08-04 01:04 UTC, David Galloway
vdsm.log after running "cat /etc/ovirt-hosted-engine/vm.conf > /var/run/ovirt-hosted-engine-ha/vm.conf; hosted-engine --vm-start" (26.72 KB, text/plain)
2016-08-04 01:08 UTC, David Galloway

Description David Galloway 2016-08-04 01:04:36 UTC
Created attachment 1187312 [details]
vdsm.log during 'hosted-engine --vm-start'

(I apologize if I categorized this bug horribly.. first bug I've filed on RHEV/oVirt.)

Description of problem:

For a while now, I've been unable to start my Hosted Engine VM using `hosted-engine --vm-start`.

The only indication of a problem is in vdsm.log.  See below.

Thread-2143::ERROR::2016-08-03 20:43:42,990::vm::759::virt.vm::(_startUnderlyingVm) vmId=`6406669d-7df3-4719-a3ae-bf82f523ff03`::The vm start process failed
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 703, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/virt/vm.py", line 1947, in _run
    self._connection.createXML(domxml, flags),
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 124, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 1313, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3611, in createXML
    if ret is None:raise libvirtError('virDomainCreateXML() failed', conn=self)
libvirtError: unsupported configuration: Unknown controller type 'virtio-scsi'
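
For anyone trying to find the relevant part of a long vdsm.log, a small filter along these lines may help. It is only a minimal sketch: it assumes the `::`-separated layout seen in the ERROR line above, and the field names (`thread`, `level`, and so on) are illustrative labels inferred from that one sample, not names used by vdsm.

```python
from collections import namedtuple

# Field names are illustrative labels inferred from the sample line, not vdsm's own.
LogRecord = namedtuple(
    "LogRecord", "thread level timestamp module lineno logger message"
)


def parse_vdsm_line(line):
    """Split one vdsm.log line into fields; return None for continuation lines."""
    parts = line.rstrip("\n").split("::", 6)
    if len(parts) != 7:
        # Traceback lines and similar do not carry the 7-field prefix.
        return None
    return LogRecord(*parts)


def error_records(lines):
    """Yield only the records logged at ERROR level."""
    for line in lines:
        record = parse_vdsm_line(line)
        if record is not None and record.level == "ERROR":
            yield record
```

Passing an open log file to `error_records()` and printing each record's `timestamp` and `message` should give a short list of timestamps around which to read the full log.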


Version-Release number of selected component (if applicable):
vdsm-4.17.33-1.el7ev.noarch
ovirt-hosted-engine-ha-1.3.5.7-1.el7ev.noarch
ovirt-vmconsole-1.0.4-1.el7ev.noarch
ovirt-hosted-engine-setup-1.3.7.2-1.el7ev.noarch

How reproducible:
I haven't been able to find a bug or error like this anywhere else, so it is probably not very reproducible.

Steps to Reproduce:
1. hosted-engine --vm-start
2. The VM doesn't start

Actual results:
Hosted Engine VM doesn't start

Expected results:
Hosted Engine VM starts

Additional info:
I'm not sure where the relevant parts of vdsm.log start and end, so I've included the chunk of the log where the VM attempts to start and fails.  It seems like the remote Hosted Engine VM's metadata is corrupt, because I'm able to start the VM if I quickly overwrite /var/run/ovirt-hosted-engine-ha/vm.conf using "cat /etc/ovirt-hosted-engine/vm.conf > /var/run/ovirt-hosted-engine-ha/vm.conf; hosted-engine --vm-start".  Both files are shown below for comparison.

[root@hv01 dgalloway]# cat /etc/ovirt-hosted-engine/vm.conf
vmId=6406669d-7df3-4719-a3ae-bf82f523ff03
memSize=4096
display=vnc
devices={index:2,iface:ide,address:{ controller:0, target:0,unit:0, bus:1, type:drive},specParams:{},readonly:true,deviceId:9f0fc5b8-4bee-4145-be64-6b0d2840f958,path:,device:cdrom,shared:false,type:disk}
devices={index:0,iface:virtio,format:raw,poolID:00000000-0000-0000-0000-000000000000,volumeID:a48f01b4-00e2-4846-905f-e0538cf1f1cd,imageID:7ac5eada-7d21-4f1e-b651-207ce5a28d13,specParams:{},readonly:false,domainID:ca46f740-bc09-477f-b09e-cf7e746dca73,optional:false,deviceId:7ac5eada-7d21-4f1e-b651-207ce5a28d13,address:{bus:0x00, slot:0x06, domain:0x0000, type:pci, function:0x0},device:disk,shared:exclusive,propagateErrors:off,type:disk,bootOrder:1}
devices={device:scsi,model:virtio-scsi,type:controller}
devices={nicModel:pv,macAddr:52:54:00:BF:A2:72,linkActive:true,network:rhevm,filter:vdsm-no-mac-spoofing,specParams:{},deviceId:1f3edec2-46b1-44a0-8ae1-37ff68377d32,address:{bus:0x00, slot:0x03, domain:0x0000, type:pci, function:0x0},device:bridge,type:interface}
devices={device:console,specParams:{},type:console,deviceId:b49a1225-64a6-4292-956e-c2e54eb01105,alias:console0}
vmName=RHEV-Manager
spiceSecureChannels=smain,sdisplay,sinputs,scursor,splayback,srecord,ssmartcard,susbredir
smp=1
cpuType=SandyBridge
emulatedMachine=rhel6.5.0


[root@hv01 dgalloway]# cat /var/run/ovirt-hosted-engine-ha/vm.conf
vmId=6406669d-7df3-4719-a3ae-bf82f523ff03
smp=1
memSize=4096
spiceSecureChannels=smain,sdisplay,sinputs,scursor,splayback,srecord,ssmartcard,susbredir
vmName=RHEV-Manager
display=qxl
devices={device:qxl,alias:video0,type:video,deviceId:551690f1-1c06-4ad6-a9ee-8fd7856e0a8a,address:None}
devices={index:2,iface:ide,shared:false,readonly:true,deviceId:8c3179ac-b322-4f5c-9449-c52e3665e0ae,address:{controller:0,target:0,unit:0,bus:1,type:drive},device:cdrom,path:,type:disk}
devices={device:scsi,model:virtio-scsi,type:controller,deviceId:b41a6c14-42cb-4a0f-b198-b25e7cbf3244,address:{slot:0x04,bus:0x00,domain:0x0000,type:pci,function:0x0}}
devices={device:usb,type:controller,deviceId:cb09a6cc-bef5-4119-8bdd-c16320c7f299,address:{slot:0x01,bus:0x00,domain:0x0000,type:pci,function:0x2}}
devices={device:ide,type:controller,deviceId:a0a4c8c4-53dd-4f20-adb4-c7ac73ceb0ae,address:{slot:0x01,bus:0x00,domain:0x0000,type:pci,function:0x1}}
devices={device:virtio-serial,type:controller,deviceId:3e692f20-9d10-4490-aa47-acac087fae8c,address:{slot:0x05,bus:0x00,domain:0x0000,type:pci,function:0x0}}
devices={device:virtio-serial,type:controller,deviceId:03b64140-3a32-486d-a5fb-c24c36a3ca44,address:None}
devices={device:virtio-scsi,type:controller,deviceId:6b262383-004a-41d2-9113-dcfdecedc063,address:None}
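
Comparing the two files, one difference stands out: the runtime copy ends with a line containing `device:virtio-scsi,type:controller`, whereas the working file describes the controller as `device:scsi,model:virtio-scsi`. A plausible reading (an assumption, not something confirmed in this report) is that the `device` value ends up as the libvirt controller type, which would match the error text. The sketch below flags such lines. The set of accepted controller types is taken from the libvirt domain XML documentation as I remember it and may differ between libvirt versions; the function names are hypothetical helpers, not part of vdsm or ovirt-hosted-engine-ha.

```python
import re

# Assumption: controller types accepted by libvirt around this release;
# check the domain XML documentation for the libvirt version in use.
KNOWN_CONTROLLER_TYPES = {
    "ide", "fdc", "scsi", "sata", "usb", "ccid", "virtio-serial", "pci",
}


def parse_device(line):
    """Return the top-level key/value pairs of a 'devices={...}' line, or None."""
    line = line.strip()
    if not (line.startswith("devices={") and line.endswith("}")):
        return None
    body = line[len("devices={"):-1]
    # Drop nested {...} values (address, specParams) so that their keys,
    # such as the 'type' inside 'address', cannot shadow the top-level ones.
    body = re.sub(r"\{[^{}]*\}", "", body)
    fields = {}
    for item in body.split(","):
        key, sep, value = item.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def suspicious_controllers(conf_text):
    """List 'device' values of controller entries that libvirt may not accept."""
    bad = []
    for line in conf_text.splitlines():
        dev = parse_device(line)
        if (dev and dev.get("type") == "controller"
                and dev.get("device") not in KNOWN_CONTROLLER_TYPES):
            bad.append(dev.get("device"))
    return bad
```

Run against the contents of the two files above, this should return an empty list for /etc/ovirt-hosted-engine/vm.conf and flag `virtio-scsi` for /var/run/ovirt-hosted-engine-ha/vm.conf.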

Comment 1 David Galloway 2016-08-04 01:08:58 UTC
Created attachment 1187313 [details]
vdsm.log after running "cat /etc/ovirt-hosted-engine/vm.conf > /var/run/ovirt-hosted-engine-ha/vm.conf; hosted-engine --vm-start"

Comment 2 David Galloway 2016-08-04 03:07:10 UTC
I'm sorry.. I may have just fixed this after weeks of struggling.

The main issue was that my Hosted Engine's storage domain wasn't properly configured in the engine.  In case anyone else struggles with this, here is what I did.

On a host/hypervisor I ran:

vdsClient -s 0 getConnectedStoragePoolsList 
28fc87ad-2e28-44d2-8ce4-2e63b9bad4c6

vdsClient -s 0 getStorageDomainsList 28fc87ad-2e28-44d2-8ce4-2e63b9bad4c6
67ff9a5d-b5da-4a2f-b5ce-2286bc82e3e4
2a81a4ec-23c1-48b8-b296-e0bcc4f59700
ca46f740-bc09-477f-b09e-cf7e746dca73
d88bb862-7e71-4437-9599-0bb1042a56aa

My Hosted Engine's storage domain is ca46f740-bc09-477f-b09e-cf7e746dca73.

vdsClient -s 0 getStorageDomainInfo ca46f740-bc09-477f-b09e-cf7e746dca73
	uuid = ca46f740-bc09-477f-b09e-cf7e746dca73
	version = 3
	role = Master
	remotePath = 172.21.0.10:/srv/rhevstor
	type = NFS
	class = Data
	pool = ['28fc87ad-2e28-44d2-8ce4-2e63b9bad4c6']
	name = shared_storage

On the Hosted Engine:

engine-config -g HostedEngineStorageDomainName
HostedEngineStorageDomainName: hosted_storage version: general

Aha...

engine-config -s HostedEngineStorageDomainName="shared_storage"

service ovirt-engine restart

I then removed the Hosted Engine VM's storage domain via the web UI and within seconds, it was automatically reimported with the correct name.  I had already set "engine-config -s OvfUpdateIntervalInMinutes=1" so after a few minutes of no longer seeing warnings in /var/log/ovirt-engine/engine.log, I shut it down and was able to run "hosted-engine --vm-start" successfully again.

Basically, in the RHEV UI (and thus, postgres DB?), the storage domain was originally named rhevm_export.
In the engine, HostedEngineStorageDomainName was set to "hosted_storage", and I guess in the OVF_STORE it was named shared_storage.  I'm not sure how I dug myself into that hole.

Sorry for the noise and red herring.  I'm not sure of the proper BZ protocol, but this can be closed or canceled.

Comment 3 Yedidyah Bar David 2016-08-04 05:45:06 UTC
Can you please attach sosreports from the engine and hosts? Thanks.

Comment 4 David Galloway 2016-08-04 15:27:31 UTC
(In reply to Yedidyah Bar David from comment #3)
> Can you please attach sosreports from the engine and hosts? Thanks.

See https://bugzilla.redhat.com/show_bug.cgi?id=1363926#c2.

This bug can be canceled.

Comment 5 Yedidyah Bar David 2016-08-07 06:22:39 UTC
(In reply to David Galloway from comment #4)
> (In reply to Yedidyah Bar David from comment #3)
> > Can you please attach sosreports from the engine and hosts? Thanks.
> 
> See https://bugzilla.redhat.com/show_bug.cgi?id=1363926#c2.

Already saw that, and really want to thank you for the investigation. But:

> 
> This bug can be canceled.

(In reply to David Galloway from comment #2)
> I'm sorry.. I may have just fixed this after weeks of struggling.
> 
> The main issue was my Hosted Engine's storage domain wasn't properly
> configured in the engine.

I was hoping to be able to understand how this happened. If you are completely certain this happened due to external reasons (storage server issues, network errors, a human mistake, whatever), then I agree the bug could be closed. But if you are not sure, or if you think this might have happened due to a bug in oVirt, we should probably try to solve it.

Comment 6 David Galloway 2016-08-08 14:01:48 UTC
(In reply to Yedidyah Bar David from comment #5)
> (In reply to David Galloway from comment #4)
> > I'm sorry.. I may have just fixed this after weeks of struggling.
> > 
> > The main issue was my Hosted Engine's storage domain wasn't properly
> > configured in the engine.
> 
> I was hoping to be able to understand how this happened. If you are
> completely certain this happened due to external reasons (storage server
> issues, network errors, a human mistake, whatever), then I agree the bug
> could be closed. But if you are not sure, or if you think this might have
> happened due to a bug in oVirt, we should probably try to solve it.

I don't know for sure whether this was human error or not, but it's likely my fault.  I was thinking it might be worthwhile to at least put a check in place to make sure the Hosted Engine storage domain's name matches in all necessary locations, and to warn if it does not.

I'm happy to upload sosreports somewhere if you think it's worthwhile.  They're 456.2MB and 30.8MB.  Where would you like me to put them?
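
A check of the kind suggested above could look roughly like the following. It is a minimal sketch that only compares two of the places mentioned in comment 2, working on the text printed by `vdsClient -s 0 getStorageDomainInfo <uuid>` and `engine-config -g HostedEngineStorageDomainName`. It assumes the output formats shown in that comment; the function names are hypothetical and not part of oVirt.

```python
import re


def parse_domain_info(text):
    """Parse 'key = value' lines as printed by vdsClient getStorageDomainInfo."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            info[key.strip()] = value.strip()
    return info


def parse_engine_config(text):
    """Extract the value from 'Key: value version: x' (engine-config -g output)."""
    match = re.match(r"^\s*\S+:\s*(.*?)\s+version:\s*\S+\s*$", text)
    return match.group(1) if match else None


def check_he_domain_name(domain_info_text, engine_config_text):
    """Return a warning string if the two names differ, otherwise None."""
    actual = parse_domain_info(domain_info_text).get("name")
    configured = parse_engine_config(engine_config_text)
    if actual is None or configured is None:
        return "could not determine one of the names"
    if actual != configured:
        return ("storage domain is named %r but "
                "HostedEngineStorageDomainName is %r" % (actual, configured))
    return None
```

With the outputs from comment 2 (`name = shared_storage` on the host, `hosted_storage` in the engine) this returns a warning; after the `engine-config -s` change it would return None.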

Comment 7 Yedidyah Bar David 2016-08-09 06:06:16 UTC
(In reply to David Galloway from comment #6)
> I'm happy to upload sosreports somewhere if you think it's worthwhile. 
> They're 456.2MB and 30.8MB.  Where would you like me to put them?

Any file-sharing service would do - dropbox, google drive, whatever.

