Created attachment 1167112 [details] he-fail

Description of problem:
Hosted Engine deployment failed.

Failed to execute stage 'Misc configuration': Error storage server connection: (u"domType=6, spUUID=00000000-0000-0000-0000-000000000000, conList=[{u'vfsType': u'ext3', u'connection': u'/var/lib/ovirt-hosted-engine-setup/tmp7bEkmr', u'spec': u'/var/lib/ovirt-hosted-engine-setup/tmp7bEkmr', u'id': u'5b91a354-cbdd-4eae-a12e-3bfe54259d46'}]",)
Hosted Engine deployment failed: this system is not reliable, please check the issue, fix and redeploy

Version-Release number of selected component (if applicable):
rhev-hypervisor7-ng-4.0-20160608.0.x86_64
imgbased-0.7.0-0.1.el7ev.noarch
ovirt-hosted-engine-ha-2.0.0-1.el7ev.noarch
ovirt-hosted-engine-setup-2.0.0.1-1.el7ev.noarch
redhat-release-rhev-hypervisor-4.0-0.6.el7.x86_64
vdsm-4.18.2-0.el7ev.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Install rhev-hypervisor7-ng-4.0-20160608.0.x86_64.
2. Deploy HE with the correct steps.
3. Please confirm installation settings (Yes, No)[Yes]: Yes

Actual results:
Hosted Engine deployment failed.

Expected results:
Hosted Engine deployment passes.

Additional info:
Created attachment 1167113 [details] all-log
Add "Regression" keyword due to no such issue on rhev-hypervisor7-ng-4.0-20160527.0 build. Add ova info to here. rhevm-appliance-20160526.0-1.el7ev.4.0.ova
We did not encounter this issue on rhev-hypervisor7-ng-4.0-20160607.1 with ovirt-hosted-engine-setup-2.0.0-1.el7ev.
I tested this with ovirt-hosted-engine-setup-2.0.0-1, but without a new vdsm build (to verify changes to the otopi parser). It definitely looks like it's related to vdsm:

jsonrpc.Executor/5::INFO::2016-06-12 16:57:51,553::logUtils::49::dispatcher::(wrapper) Run and protect: connectStorageServer(domType=6, spUUID=u'00000000-0000-0000-0000-000000000000', conList=[{u'vfsType': u'ext3', u'connection': u'/var/lib/ovirt-hosted-engine-setup/tmpcDfLcz', u'spec': u'/var/lib/ovirt-hosted-engine-setup/tmpcDfLcz', u'id': u'acf296dd-802f-43ea-9dea-bafd4342eb4b'}], options=None)
jsonrpc.Executor/5::ERROR::2016-06-12 16:57:51,554::task::868::Storage.TaskManager.Task::(_setError) Task=`97be7d50-ce08-421d-a634-32bd68704f55`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 875, in _run
    return fn(*args, **kargs)
  File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 2381, in connectStorageServer
    conObj = storageServer.ConnectionFactory.createConnection(conInfo)
  File "/usr/share/vdsm/storage/storageServer.py", line 791, in createConnection
    return ctor(**params)
  File "/usr/share/vdsm/storage/storageServer.py", line 213, in __init__
    self._remotePath = fileUtils.normalize_remote_path(spec)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/fileUtils.py", line 103, in normalize_remote_path
    host, tail = address.hosttail_split(remote_path)
  File "/usr/lib/python2.7/site-packages/vdsm/common/network/address.py", line 43, in hosttail_split
    raise HosttailError('%s is not a valid hosttail address:' % hosttail)
HosttailError: /var/lib/ovirt-hosted-engine-setup/tmpcDfLcz is not a valid hosttail address:

And: https://gerrit.ovirt.org/#/c/55182/

I'm not sure if this should be marked as a remote path (it isn't), or whether vdsm is handling this badly. Should we even be encountering this code path with a local mount?
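For context, a minimal hypothetical sketch (not vdsm's actual implementation, which also handles bracketed IPv6 hosts) of why host:tail parsing rejects a plain local path: an NFS-style remote path looks like "host:/export/path", while a local mount point has no separator to split on.

    # Hypothetical sketch of host:tail parsing; illustrative only.
    def hosttail_split(hosttail):
        # A remote path is expected as "host:/export/path"; a local
        # path such as "/var/lib/..." has no ":" separator at all.
        if ":" not in hosttail:
            raise ValueError("%s is not a valid hosttail address:" % hosttail)
        host, tail = hosttail.split(":", 1)
        return host, tail

    print(hosttail_split("server:/export/he"))  # ('server', '/export/he')
    try:
        hosttail_split("/var/lib/ovirt-hosted-engine-setup/tmpcDfLcz")
    except ValueError as e:
        print(e)  # ... is not a valid hosttail address: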
Bug tickets must have version flags set prior to targeting them to a release. Please ask the maintainer to set the correct version flags, and only then set the target milestone.
This now happens when we try to connect the fake storage domain we create in order to use it as a master storage domain for the SPM before having an engine.

We create it as a POSIX storage domain over a loopback device.

The issue seems to be here:

jsonrpc.Executor/5::DEBUG::2016-06-12 16:57:51,549::__init__::522::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'StoragePool.connectStorageServer' in bridge with {u'connectionParams': [{u'vfsType': u'ext3', u'connection': u'/var/lib/ovirt-hosted-engine-setup/tmpcDfLcz', u'spec': u'/var/lib/ovirt-hosted-engine-setup/tmpcDfLcz', u'id': u'acf296dd-802f-43ea-9dea-bafd4342eb4b'}], u'storagepoolID': u'00000000-0000-0000-0000-000000000000', u'domainType': 6}
jsonrpc.Executor/5::WARNING::2016-06-12 16:57:51,550::vdsmapi::143::SchemaCache::(_report_inconsistency) Provided value "6" not defined in StorageDomainType enum for StoragePool.connectStorageServer

So VDSM no longer detects/accepts that it's a POSIX storage domain, and it then tries to resolve the path as an NFS path, obviously failing:

jsonrpc.Executor/5::ERROR::2016-06-12 16:57:51,554::task::868::Storage.TaskManager.Task::(_setError) Task=`97be7d50-ce08-421d-a634-32bd68704f55`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 875, in _run
    return fn(*args, **kargs)
  File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 2381, in connectStorageServer
    conObj = storageServer.ConnectionFactory.createConnection(conInfo)
  File "/usr/share/vdsm/storage/storageServer.py", line 791, in createConnection
    return ctor(**params)
  File "/usr/share/vdsm/storage/storageServer.py", line 213, in __init__
    self._remotePath = fileUtils.normalize_remote_path(spec)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/fileUtils.py", line 103, in normalize_remote_path
    host, tail = address.hosttail_split(remote_path)
  File "/usr/lib/python2.7/site-packages/vdsm/common/network/address.py", line 43, in hosttail_split
    raise HosttailError('%s is not a valid hosttail address:' % hosttail)
HosttailError: /var/lib/ovirt-hosted-engine-setup/tmpcDfLcz is not a valid hosttail address:
Simone, could you share the client code for connectStorageServer? POSIXFS_DOMAIN is basically NFS_DOMAIN without some NFS-specific hacks. It is unlike LOCALFS, where a local directory is used for storage.
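For reference, a sketch of the storage domain type constants as vdsm defines them (values reproduced from memory, so treat them as illustrative rather than authoritative); the domType=6 in the failing call maps to POSIXFS:

    # Storage domain type constants, as in vdsm's storage/sd.py
    # (values from memory; illustrative, not authoritative).
    UNKNOWN_DOMAIN = 0
    NFS_DOMAIN = 1
    FCP_DOMAIN = 2
    ISCSI_DOMAIN = 3
    LOCALFS_DOMAIN = 4
    CIFS_DOMAIN = 5
    POSIXFS_DOMAIN = 6  # the domType=6 sent by hosted-engine-setup
    GLUSTERFS_DOMAIN = 7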
Accidentally removed the needinfo from Francesco.
(In reply to Dan Kenigsberg from comment #8)
> Simone, could you share the client code for connectStorageServer?
>
> POSIXFS_DOMAIN is basically NFS_DOMAIN without some NFS-specific hacks. It
> is unlike LOCALFS, where a local directory is used for storage.

Look here:
https://gerrit.ovirt.org/gitweb?p=ovirt-hosted-engine-setup.git;a=blob;f=src/plugins/gr-he-setup/storage/storage.py;h=6906cb94928f6605e5ac9ebebf9dbf817b63f1ae;hb=refs/heads/master
at _attach_loopback_device, and at _storageServerConnection at line 773.

In my opinion the issue is just here:

jsonrpc.Executor/5::WARNING::2016-06-12 16:57:51,550::vdsmapi::143::SchemaCache::(_report_inconsistency) Provided value "6" not defined in StorageDomainType enum for StoragePool.connectStorageServer

since it seems that 6 is no longer accepted for a POSIX storage domain.
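A hypothetical condensed sketch of what that client code does (the real implementation is _attach_loopback_device and _storageServerConnection at the link above; the names, sizes, and client handle below are illustrative, not the actual code):

    import subprocess
    import uuid

    def attach_loopback_device(image_path, size_mb=2048):
        # Create a sparse backing file, attach it to a free loop device,
        # and format it ext3; this backs the fake master storage domain.
        subprocess.check_call(["dd", "if=/dev/zero", "of=" + image_path,
                               "bs=1M", "count=0", "seek=%d" % size_mb])
        device = subprocess.check_output(
            ["losetup", "--find", "--show", image_path]).decode().strip()
        subprocess.check_call(["mkfs.ext3", "-q", device])
        return device

    def storage_server_connection(jsonrpc_client, mount_path):
        # Ask VDSM to connect a POSIX (domainType=6) storage server.
        # 'connection'/'spec' carry a local path, which is exactly what
        # normalize_remote_path/hosttail_split later fails to parse.
        return jsonrpc_client.StoragePool.connectStorageServer(
            storagepoolID="00000000-0000-0000-0000-000000000000",
            domainType=6,
            connectionParams=[{
                "id": str(uuid.uuid4()),
                "connection": mount_path,
                "spec": mount_path,
                "vfsType": "ext3",
            }])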
Just reproduced with the engine and vdsm, moving from vdsm 4.18.1 to 4.18.2.

I successfully created a POSIX FS storage domain from the engine using a loopback-mounted device on /tmp/fake_sd, and it was correctly created and connected while I was using VDSM 4.18.1:

jsonrpc.Executor/5::DEBUG::2016-06-13 11:57:41,317::task::597::Storage.TaskManager.Task::(_updateState) Task=`bbdc2632-3e52-4414-83da-7a8c937c384d`::moving from state init -> state preparing
jsonrpc.Executor/5::INFO::2016-06-13 11:57:41,317::logUtils::49::dispatcher::(wrapper) Run and protect: connectStorageServer(domType=6, spUUID=u'00000000-0000-0000-0000-000000000000', conList=[{u'id': u'1bea12a4-3a8a-468e-a069-e4b4f6f4f720', u'connection': u'/tmp/fake_sd', u'iqn': u'', u'user': u'', u'tpgt': u'1', u'vfs_type': u'ext3', u'password': '********', u'port': u''}], options=None)
jsonrpc.Executor/5::DEBUG::2016-06-13 11:57:41,318::hsm::2332::Storage.HSM::(__prefetchDomains) posix local path: /rhev/data-center/mnt/_tmp_fake__sd
jsonrpc.Executor/5::DEBUG::2016-06-13 11:57:41,324::hsm::2350::Storage.HSM::(__prefetchDomains) Found SD uuids: ()
jsonrpc.Executor/5::DEBUG::2016-06-13 11:57:41,324::hsm::2410::Storage.HSM::(connectStorageServer) knownSDs: {}
jsonrpc.Executor/5::INFO::2016-06-13 11:57:41,324::logUtils::52::dispatcher::(wrapper) Run and protect: connectStorageServer, Return response: {'statuslist': [{'status': 0, 'id': u'1bea12a4-3a8a-468e-a069-e4b4f6f4f720'}]}
jsonrpc.Executor/5::DEBUG::2016-06-13 11:57:41,325::task::1193::Storage.TaskManager.Task::(prepare) Task=`bbdc2632-3e52-4414-83da-7a8c937c384d`::finished: {'statuslist': [{'status': 0, 'id': u'1bea12a4-3a8a-468e-a069-e4b4f6f4f720'}]}
jsonrpc.Executor/5::DEBUG::2016-06-13 11:57:41,325::task::597::Storage.TaskManager.Task::(_updateState) Task=`bbdc2632-3e52-4414-83da-7a8c937c384d`::moving from state preparing -> state finished

Then I simply updated VDSM to 4.18.2, and the same storage domain started failing:

jsonrpc.Executor/3::INFO::2016-06-13 12:18:32,791::logUtils::49::dispatcher::(wrapper) Run and protect: connectStorageServer(domType=6, spUUID=u'00000001-0001-0001-0001-000000000012', conList=[{u'id': u'1bea12a4-3a8a-468e-a069-e4b4f6f4f720', u'connection': u'/tmp/fake_sd', u'iqn': u'', u'user': u'', u'tpgt': u'1', u'vfs_type': u'ext3', u'password': '********', u'port': u''}], options=None)
jsonrpc.Executor/3::ERROR::2016-06-13 12:18:32,791::task::868::Storage.TaskManager.Task::(_setError) Task=`7fca28e9-8e21-432c-a764-3950d87a0a0c`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 875, in _run
    return fn(*args, **kargs)
  File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 2381, in connectStorageServer
    conObj = storageServer.ConnectionFactory.createConnection(conInfo)
  File "/usr/share/vdsm/storage/storageServer.py", line 791, in createConnection
    return ctor(**params)
  File "/usr/share/vdsm/storage/storageServer.py", line 213, in __init__
    self._remotePath = fileUtils.normalize_remote_path(spec)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/fileUtils.py", line 103, in normalize_remote_path
    host, tail = address.hosttail_split(remote_path)
  File "/usr/lib/python2.7/site-packages/vdsm/common/network/address.py", line 43, in hosttail_split
    raise HosttailError('%s is not a valid hosttail address:' % hosttail)
HosttailError: /tmp/fake_sd is not a valid hosttail address:
jsonrpc.Executor/3::DEBUG::2016-06-13 12:18:32,792::task::887::Storage.TaskManager.Task::(_run) Task=`7fca28e9-8e21-432c-a764-3950d87a0a0c`::Task._run: 7fca28e9-8e21-432c-a764-3950d87a0a0c (6, u'00000001-0001-0001-0001-000000000012', [{u'id': u'1bea12a4-3a8a-468e-a069-e4b4f6f4f720', u'connection': u'/tmp/fake_sd', u'iqn': u'', u'user': u'', u'tpgt': u'1', u'vfs_type': u'ext3', u'password': '********', u'port': u''}]) {} failed - stopping task
I'm attaching the relevant logs.
Created attachment 1167418 [details] vdsm 4.18.1
Created attachment 1167419 [details] vdsm 4.18.2
Moving to vdsm according to comment 11
Created attachment 1167488 [details] New Storage POSIX storage domain dialog
Now connectStorageServer fails since the POSIX path is not in host:tail format, yet in the webadmin UI we also simply ask for something like '/path/to/my/data'. Check the screenshot:
https://bugzilla.redhat.com/attachment.cgi?id=1167488

So I think this will also fail for all users who, for any reason (a DAS device, a manually configured DRBD...), attached a POSIX domain with a local path before upgrading to VDSM 4.18.2.
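As a hedged sketch of one plausible direction for a fix (not necessarily the actual patch; see the gerrit change referenced in the next comment for the real change): normalize_remote_path could recognize absolute local paths before attempting the host:tail split.

    import os

    def normalize_remote_path(remote_path):
        # Plausible sketch only; consult the merged gerrit change
        # for how vdsm actually resolved this.
        if os.path.isabs(remote_path):
            # A local path (DAS, DRBD, loopback-backed POSIX domain, ...):
            # there is no host part to split off, just normalize it.
            return os.path.normpath(remote_path)
        host, tail = remote_path.split(":", 1)  # stand-in for hosttail_split
        return "%s:%s" % (host, os.path.normpath(tail))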
Simone, can you please verify https://gerrit.ovirt.org/#/c/59070/ ?
(In reply to Idan Shaby from comment #18)
> Simone, can you please verify https://gerrit.ovirt.org/#/c/59070/ ?

Yes, sure.
Yes, now it works for me.
(In reply to Simone Tiraboschi from comment #7)
> This now happens when we try to connect the fake storage domain we create
> in order to use it as a master storage domain for the SPM before having an
> engine.
>
> We create it as a POSIX storage domain over a loopback device.
>
> The issue seems to be here:
>
> jsonrpc.Executor/5::DEBUG::2016-06-12 16:57:51,549::__init__::522::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'StoragePool.connectStorageServer' in bridge with {u'connectionParams': [{u'vfsType': u'ext3', u'connection': u'/var/lib/ovirt-hosted-engine-setup/tmpcDfLcz', u'spec': u'/var/lib/ovirt-hosted-engine-setup/tmpcDfLcz', u'id': u'acf296dd-802f-43ea-9dea-bafd4342eb4b'}], u'storagepoolID': u'00000000-0000-0000-0000-000000000000', u'domainType': 6}
> jsonrpc.Executor/5::WARNING::2016-06-12 16:57:51,550::vdsmapi::143::SchemaCache::(_report_inconsistency) Provided value "6" not defined in StorageDomainType enum for StoragePool.connectStorageServer
>
> So VDSM no longer detects/accepts that it's a POSIX storage domain, and it
> then tries to resolve the path as an NFS path, obviously failing:
>
> jsonrpc.Executor/5::ERROR::2016-06-12 16:57:51,554::task::868::Storage.TaskManager.Task::(_setError) Task=`97be7d50-ce08-421d-a634-32bd68704f55`::Unexpected error
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/task.py", line 875, in _run
>     return fn(*args, **kargs)
>   File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50, in wrapper
>     res = f(*args, **kwargs)
>   File "/usr/share/vdsm/storage/hsm.py", line 2381, in connectStorageServer
>     conObj = storageServer.ConnectionFactory.createConnection(conInfo)
>   File "/usr/share/vdsm/storage/storageServer.py", line 791, in createConnection
>     return ctor(**params)
>   File "/usr/share/vdsm/storage/storageServer.py", line 213, in __init__
>     self._remotePath = fileUtils.normalize_remote_path(spec)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/fileUtils.py", line 103, in normalize_remote_path
>     host, tail = address.hosttail_split(remote_path)
>   File "/usr/lib/python2.7/site-packages/vdsm/common/network/address.py", line 43, in hosttail_split
>     raise HosttailError('%s is not a valid hosttail address:' % hosttail)
> HosttailError: /var/lib/ovirt-hosted-engine-setup/tmpcDfLcz is not a valid
> hosttail address:

Just to answer and clear my needinfo: I don't know, but it seems the issue was already resolved :)
Can you please describe how the HE deployment is being done? Are you using an ISO image from which you're installing the NGN on the host, or over PXE?
I've also hit the same issue on the latest HE over a clean RHEL 7.2 deployment. Attaching a sosreport from my host.

[ INFO ] Configuring the management bridge
[ ERROR ] Failed to execute stage 'Misc configuration': Error storage server connection: (u"domType=6, spUUID=00000000-0000-0000-0000-000000000000, conList=[{u'vfsType': u'ext3', u'connection': u'/var/lib/ovirt-hosted-engine-setup/tmpnIAJgK', u'spec': u'/var/lib/ovirt-hosted-engine-setup/tmpnIAJgK', u'id': u'b1581d1b-e40a-4a6d-917a-dcc94ebb6963'}]",)
[ INFO ] Stage: Clean up
[ INFO ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20160614153937.conf'
[ INFO ] Stage: Pre-termination
[ INFO ] Stage: Termination
[ ERROR ] Hosted Engine deployment failed: this system is not reliable, please check the issue, fix and redeploy
          Log file is located at /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20160614152953-hj0zqi.log

Components on host:
libvirt-client-1.2.17-13.el7_2.5.x86_64
vdsm-4.18.2-0.el7ev.x86_64
ovirt-host-deploy-1.5.0-1.el7ev.noarch
qemu-kvm-rhev-2.3.0-31.el7_2.15.x86_64
sanlock-3.2.4-2.el7_2.x86_64
mom-0.5.4-1.el7ev.noarch
ovirt-vmconsole-1.0.3-1.el7ev.noarch
ovirt-engine-sdk-python-3.6.5.0-2.el7ev.noarch
ovirt-vmconsole-host-1.0.3-1.el7ev.noarch
ovirt-hosted-engine-setup-2.0.0.1-1.el7ev.noarch
ovirt-hosted-engine-ha-2.0.0-1.el7ev.noarch
ovirt-setup-lib-1.0.2-1.el7ev.noarch
rhevm-appliance-20160526.0-1.el7ev.noarch

Linux version 3.10.0-327.22.2.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC) ) #1 SMP Thu Jun 9 10:09:10 EDT 2016
Linux 3.10.0-327.22.2.el7.x86_64 #1 SMP Thu Jun 9 10:09:10 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.2 (Maipo)
We merged the fix only this morning, and the issue is not OS dependent, so you will certainly hit it until we rebuild VDSM.
Created attachment 1167862 [details] sosreport from host alma03
(In reply to Nikolai Sednev from comment #22)
> Can you please describe how the HE deployment is being done?

1. Log in to the cockpit UI.
2. Click "oVirt" -> "Hosted Engine" to start deploying HE.
3. Press the "Enter" key at the confirmation prompt below:
   Are you sure you want to continue? (Yes, No)[Yes]:
4. Press the "Enter" key at the confirmation prompt below:
   Please specify the storage you would like to use (glusterfs, iscsi, fc, nfs3, nfs4)[nfs3]:
5. Input the correct full shared storage connection path to use (example: host:/path).
6. Press the "Enter" key until the prompt to specify the device to boot the VM appears.
7. Enter "disk" and press the Enter key.
8. Specify the path to the OVF archive you would like to use.
9. Deploy HE with the OVA, step by step.

> Are you using an ISO image from which you're installing the NGN on the
> host, or over PXE?

We encounter this issue both when using an ISO image and over PXE.
Test version:
rhev-hypervisor7-ng-4.0-20160616.0.x86_64
ovirt-hosted-engine-ha-2.0.0-1.el7ev.noarch
ovirt-hosted-engine-setup-2.0.0.2-1.el7ev.noarch
imgbased-0.7.0-0.1.el7ev.noarch
redhat-release-rhev-hypervisor-4.0-0.7.el7.x86_64

HE deployment succeeds with the above build; after a reboot, HE still works well. Leaving this message here for the record.
This fix is in, see:
https://gerrit.ovirt.org/gitweb?p=vdsm.git;a=shortlog;h=refs/tags/v4.18.3

The bug wasn't moved to ON_QA, I guess due to a missing target release. I'm not sure if this depends on other bugs/components, but if not, please move it to ON_QA.
Deployment of hosted-engine over NFS succeeded with these components:

Host:
rhevm-appliance-20160619.0-2.el7ev.noarch
ovirt-vmconsole-1.0.3-1.el7ev.noarch
vdsm-4.18.3-0.el7ev.x86_64
ovirt-host-deploy-1.5.0-1.el7ev.noarch
sanlock-3.2.4-2.el7_2.x86_64
ovirt-engine-sdk-python-3.6.7.0-1.el7ev.noarch
libvirt-client-1.2.17-13.el7_2.5.x86_64
ovirt-hosted-engine-setup-2.0.0.2-1.el7ev.noarch
qemu-kvm-rhev-2.3.0-31.el7_2.16.x86_64
mom-0.5.4-1.el7ev.noarch
ovirt-vmconsole-host-1.0.3-1.el7ev.noarch
ovirt-hosted-engine-ha-2.0.0-1.el7ev.noarch
ovirt-setup-lib-1.0.2-1.el7ev.noarch
Linux version 3.10.0-327.22.2.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC) ) #1 SMP Thu Jun 9 10:09:10 EDT 2016
Linux 3.10.0-327.22.2.el7.x86_64 #1 SMP Thu Jun 9 10:09:10 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.2 (Maipo)

Engine:
rhevm-doc-4.0.0-2.el7ev.noarch
rhev-release-4.0.0-17-001.noarch
rhevm-setup-plugins-4.0.0.1-1.el7ev.noarch
rhevm-spice-client-x64-msi-4.0-2.el7ev.noarch
rhevm-branding-rhev-4.0.0-1.el7ev.noarch
rhevm-guest-agent-common-1.0.12-2.el7ev.noarch
rhevm-dependencies-4.0.0-1.el7ev.noarch
rhevm-4.0.0.5-0.1.el7ev.noarch
rhevm-spice-client-x86-msi-4.0-2.el7ev.noarch
rhev-guest-tools-iso-4.0-2.el7ev.noarch
Linux version 3.10.0-327.18.2.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC) ) #1 SMP Fri Apr 8 05:09:53 EDT 2016
Linux 3.10.0-327.18.2.el7.x86_64 #1 SMP Fri Apr 8 05:09:53 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.2 (Maipo)
oVirt 4.0.0 has been released, closing current release.