Bug 1268302 - vdsm on getStorageDomainInfo returns an NFS remotePath without the host info
Status: CLOSED WONTFIX
Product: vdsm
Classification: oVirt
Component: Core
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ovirt-4.0.0-rc
Target Release: ---
Assigned To: Idan Shaby
QA Contact: Aharon Canan
Depends On:
Blocks:

Reported: 2015-10-02 09:21 EDT by Piotr Kliczewski
Modified: 2016-05-30 07:35 EDT
CC: 11 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-05-30 07:35:35 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
ylavi: ovirt-4.0.0?
rule-engine: planning_ack?
rule-engine: devel_ack?
rule-engine: testing_ack?


Attachments
setup log file (196.84 KB, text/plain)
2015-10-02 09:21 EDT, Piotr Kliczewski

Description Piotr Kliczewski 2015-10-02 09:21:39 EDT
Created attachment 1079435 [details]
setup log file

Description of problem:

I wanted to install hosted engine using my NFS server. I added an empty directory exported on the same machine, but the installation failed with:

list index out of range

How reproducible:


Steps to Reproduce:
1. I have several directories exported
- /export/data: an old storage domain with files from a previous installation
- /export/data2: an empty directory which I wanted to use for the hosted engine setup
2. hosted-engine --deploy
3. using /export/data2

Actual results:
Failed with the above error message.

Expected results:
Setup should continue without any issues.

Additional info:
Comment 1 Piotr Kliczewski 2015-10-02 09:22:24 EDT
When I removed the files from the old storage domain, the setup continued.
Comment 2 Simone Tiraboschi 2015-10-02 09:43:05 EDT
The issue seems to be here:

2015-10-02 15:09:50 DEBUG otopi.plugins.ovirt_hosted_engine_setup.storage.storage storage._getStorageDomainInfo:597 {'status': {'message': 'OK', 'code': 0}, 'info': {'uuid': 'ad8dfcc9-b91e-4345-9f3c-ad426d1e8e5e', 'version': '3', 'role': 'Master', 'remotePath': '/export/data', 'type': 'NFS', 'class': 'Data', 'pool': ['00000001-0001-0001-0001-00000000008a'], 'name': 'data'}}
2015-10-02 15:09:50 DEBUG otopi.context context._executeMethod:156 method exception
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/otopi/context.py", line 146, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/ovirt-hosted-engine-setup/storage/storage.py", line 1245, in _late_customization
    self._getExistingDomain()
  File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/ovirt-hosted-engine-setup/storage/storage.py", line 515, in _getExistingDomain
    domain_info['remotePath']
  File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/ovirt-hosted-engine-setup/storage/storage.py", line 450, in _removeNFSTrailingSlash
    if len(nfspath[1]) > 1:
IndexError: list index out of range

For some strange reason, VDSM was reporting an existing NFS storage domain with 'remotePath': '/export/data', which is not in the canonical host:path form.
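
For reference, the crash is easy to reproduce in isolation: _removeNFSTrailingSlash indexes the second element of the ':'-split remote path, so a value without a 'host:' prefix has nothing at index 1. A minimal illustrative sketch follows; the body of the real helper in storage.py is assumed here, and only the nfspath[1] check is taken from the traceback above:

def remove_nfs_trailing_slash(remote_path):
    # Assumed sketch of the helper: split 'host:/path' on ':' and strip a
    # trailing slash from the path part. Only the nfspath[1] check comes
    # from the traceback; the rest is illustrative.
    nfspath = remote_path.split(':')
    if len(nfspath[1]) > 1 and nfspath[1].endswith('/'):
        return nfspath[0] + ':' + nfspath[1][:-1]
    return remote_path

remove_nfs_trailing_slash(u'192.168.1.8:/export/data')  # fine: split gives ['192.168.1.8', '/export/data']
remove_nfs_trailing_slash(u'/export/data')              # IndexError: split gives ['/export/data'], so index 1 is out of range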

Piotr, could you please also attach vdsm.log?
Comment 3 Piotr Kliczewski 2015-10-02 09:47:48 EDT
The log is big now but I think this should be enough:

Thread-292::DEBUG::2015-10-02 15:09:50,498::__init__::298::IOProcessClient::(_run) Starting IOProcess...
Thread-280::INFO::2015-10-02 15:09:50,501::logUtils::51::dispatcher::(wrapper) Run and protect: getStorageDomainsList, Return response: {'domlist': [u'ad8dfcc9-b91e-4345-9f3c-ad426d1e8e5e']}
Thread-280::DEBUG::2015-10-02 15:09:50,501::task::1191::Storage.TaskManager.Task::(prepare) Task=`81d85995-171e-4a9e-8a4b-ef682e9223a0`::finished: {'domlist': [u'ad8dfcc9-b91e-4345-9f3c-ad426d1e8e5e']}
Thread-280::DEBUG::2015-10-02 15:09:50,501::task::595::Storage.TaskManager.Task::(_updateState) Task=`81d85995-171e-4a9e-8a4b-ef682e9223a0`::moving from state preparing -> state finished
Thread-280::DEBUG::2015-10-02 15:09:50,501::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-280::DEBUG::2015-10-02 15:09:50,502::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-280::DEBUG::2015-10-02 15:09:50,502::task::993::Storage.TaskManager.Task::(_decref) Task=`81d85995-171e-4a9e-8a4b-ef682e9223a0`::ref 0 aborting False
Thread-280::INFO::2015-10-02 15:09:50,504::xmlrpc::92::vds.XMLRPCServer::(_process_requests) Request handler for 127.0.0.1:55920 stopped
Reactor thread::INFO::2015-10-02 15:09:50,511::protocoldetector::72::ProtocolDetector.AcceptorImpl::(handle_accept) Accepting connection from 127.0.0.1:55921
Reactor thread::DEBUG::2015-10-02 15:09:50,516::protocoldetector::82::ProtocolDetector.Detector::(__init__) Using required_size=11
Reactor thread::INFO::2015-10-02 15:09:50,517::protocoldetector::118::ProtocolDetector.Detector::(handle_read) Detected protocol xml from 127.0.0.1:55921
Reactor thread::DEBUG::2015-10-02 15:09:50,517::bindingxmlrpc::1297::XmlDetector::(handle_socket) xml over http detected from ('127.0.0.1', 55921)
BindingXMLRPC::INFO::2015-10-02 15:09:50,517::xmlrpc::73::vds.XMLRPCServer::(handle_request) Starting request handler for 127.0.0.1:55921
Thread-293::INFO::2015-10-02 15:09:50,517::xmlrpc::84::vds.XMLRPCServer::(_process_requests) Request handler for 127.0.0.1:55921 started
Thread-293::DEBUG::2015-10-02 15:09:50,517::bindingxmlrpc::325::vds::(wrapper) client [127.0.0.1]
Thread-293::DEBUG::2015-10-02 15:09:50,518::task::595::Storage.TaskManager.Task::(_updateState) Task=`96311236-9763-4b5a-aaa0-6770b9967d82`::moving from state init -> state preparing
Thread-293::INFO::2015-10-02 15:09:50,518::logUtils::48::dispatcher::(wrapper) Run and protect: getStorageDomainInfo(sdUUID='ad8dfcc9-b91e-4345-9f3c-ad426d1e8e5e', options=None)
Thread-293::INFO::2015-10-02 15:09:50,518::fileSD::357::Storage.StorageDomain::(validate) sdUUID=ad8dfcc9-b91e-4345-9f3c-ad426d1e8e5e
Thread-293::DEBUG::2015-10-02 15:09:50,519::persistentDict::234::Storage.PersistentDict::(refresh) read lines (FileMetadataRW)=['CLASS=Data', 'DESCRIPTION=data', 'IOOPTIMEOUTSEC=10', 'LEASERETRIES=3', 'LEASETIMESEC=60', 'LOCKPOLICY=', 'LOCKRENEWALINTERVALSEC=5', 'MASTER_VERSION=1', 'POOL_DESCRIPTION=Default', 'POOL_DOMAINS=ad8dfcc9-b91e-4345-9f3c-ad426d1e8e5e:Active', 'POOL_SPM_ID=-1', 'POOL_SPM_LVER=-1', 'POOL_UUID=00000001-0001-0001-0001-00000000008a', 'REMOTE_PATH=192.168.1.8:/export/data', 'ROLE=Master', 'SDUUID=ad8dfcc9-b91e-4345-9f3c-ad426d1e8e5e', 'TYPE=NFS', 'VERSION=3', '_SHA_CKSUM=9b17e27ed1275d33f83e9038ee87b2b3b2c4d2a8']
Thread-293::DEBUG::2015-10-02 15:09:50,519::resourceManager::198::Storage.ResourceManager.Request::(__init__) ResName=`Storage.ad8dfcc9-b91e-4345-9f3c-ad426d1e8e5e`ReqID=`d64f28f8-8aa0-4673-9378-498643af3c3a`::Request was made in '/usr/share/vdsm/storage/hsm.py' line '2830' at 'getStorageDomainInfo'
Thread-293::DEBUG::2015-10-02 15:09:50,519::resourceManager::542::Storage.ResourceManager::(registerResource) Trying to register resource 'Storage.ad8dfcc9-b91e-4345-9f3c-ad426d1e8e5e' for lock type 'shared'
Thread-293::DEBUG::2015-10-02 15:09:50,519::resourceManager::601::Storage.ResourceManager::(registerResource) Resource 'Storage.ad8dfcc9-b91e-4345-9f3c-ad426d1e8e5e' is free. Now locking as 'shared' (1 active user)
Thread-293::DEBUG::2015-10-02 15:09:50,519::resourceManager::238::Storage.ResourceManager.Request::(grant) ResName=`Storage.ad8dfcc9-b91e-4345-9f3c-ad426d1e8e5e`ReqID=`d64f28f8-8aa0-4673-9378-498643af3c3a`::Granted request
Thread-293::DEBUG::2015-10-02 15:09:50,519::task::827::Storage.TaskManager.Task::(resourceAcquired) Task=`96311236-9763-4b5a-aaa0-6770b9967d82`::_resourcesAcquired: Storage.ad8dfcc9-b91e-4345-9f3c-ad426d1e8e5e (shared)
Thread-293::DEBUG::2015-10-02 15:09:50,520::task::993::Storage.TaskManager.Task::(_decref) Task=`96311236-9763-4b5a-aaa0-6770b9967d82`::ref 1 aborting False
Thread-293::INFO::2015-10-02 15:09:50,520::logUtils::51::dispatcher::(wrapper) Run and protect: getStorageDomainInfo, Return response: {'info': {'uuid': u'ad8dfcc9-b91e-4345-9f3c-ad426d1e8e5e', 'version': '3', 'role': 'Master', 'remotePath': u'/export/data', 'type': 'NFS', 'class': 'Data', 'pool': ['00000001-0001-0001-0001-00000000008a'], 'name': 'data'}}
Thread-293::DEBUG::2015-10-02 15:09:50,520::task::1191::Storage.TaskManager.Task::(prepare) Task=`96311236-9763-4b5a-aaa0-6770b9967d82`::finished: {'info': {'uuid': u'ad8dfcc9-b91e-4345-9f3c-ad426d1e8e5e', 'version': '3', 'role': 'Master', 'remotePath': u'/export/data', 'type': 'NFS', 'class': 'Data', 'pool': ['00000001-0001-0001-0001-00000000008a'], 'name': 'data'}}
Thread-293::DEBUG::2015-10-02 15:09:50,520::task::595::Storage.TaskManager.Task::(_updateState) Task=`96311236-9763-4b5a-aaa0-6770b9967d82`::moving from state preparing -> state finished
Thread-293::DEBUG::2015-10-02 15:09:50,520::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {'Storage.ad8dfcc9-b91e-4345-9f3c-ad426d1e8e5e': < ResourceRef 'Storage.ad8dfcc9-b91e-4345-9f3c-ad426d1e8e5e', isValid: 'True' obj: 'None'>}
Thread-293::DEBUG::2015-10-02 15:09:50,520::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-293::DEBUG::2015-10-02 15:09:50,520::resourceManager::616::Storage.ResourceManager::(releaseResource) Trying to release resource 'Storage.ad8dfcc9-b91e-4345-9f3c-ad426d1e8e5e'
Thread-293::DEBUG::2015-10-02 15:09:50,520::resourceManager::635::Storage.ResourceManager::(releaseResource) Released resource 'Storage.ad8dfcc9-b91e-4345-9f3c-ad426d1e8e5e' (0 active users)
Thread-293::DEBUG::2015-10-02 15:09:50,520::resourceManager::641::Storage.ResourceManager::(releaseResource) Resource 'Storage.ad8dfcc9-b91e-4345-9f3c-ad426d1e8e5e' is free, finding out if anyone is waiting for it.
Thread-293::DEBUG::2015-10-02 15:09:50,520::resourceManager::649::Storage.ResourceManager::(releaseResource) No one is waiting for resource 'Storage.ad8dfcc9-b91e-4345-9f3c-ad426d1e8e5e', Clearing records.
Thread-293::DEBUG::2015-10-02 15:09:50,520::task::993::Storage.TaskManager.Task::(_decref) Task=`96311236-9763-4b5a-aaa0-6770b9967d82`::ref 0 aborting False
Comment 4 Simone Tiraboschi 2015-10-02 10:05:43 EDT
Here it was correct:

15:09:50,519::persistentDict::234::Storage.PersistentDict::(refresh) read lines (FileMetadataRW)=['CLASS=Data', 'DESCRIPTION=data', 'IOOPTIMEOUTSEC=10', 'LEASERETRIES=3', 'LEASETIMESEC=60', 'LOCKPOLICY=', 'LOCKRENEWALINTERVALSEC=5', 'MASTER_VERSION=1', 'POOL_DESCRIPTION=Default', 'POOL_DOMAINS=ad8dfcc9-b91e-4345-9f3c-ad426d1e8e5e:Active', 'POOL_SPM_ID=-1', 'POOL_SPM_LVER=-1', 'POOL_UUID=00000001-0001-0001-0001-00000000008a', 'REMOTE_PATH=192.168.1.8:/export/data', 'ROLE=Master', 'SDUUID=ad8dfcc9-b91e-4345-9f3c-ad426d1e8e5e', 'TYPE=NFS', 'VERSION=3', '_SHA_CKSUM=9b17e27ed1275d33f83e9038ee87b2b3b2c4d2a8']

because it was: 'REMOTE_PATH=192.168.1.8:/export/data'

while here it has already lost the host info:
Thread-293::DEBUG::2015-10-02 15:09:50,520::task::993::Storage.TaskManager.Task::(_decref) Task=`96311236-9763-4b5a-aaa0-6770b9967d82`::ref 1 aborting False
Thread-293::INFO::2015-10-02 15:09:50,520::logUtils::51::dispatcher::(wrapper) Run and protect: getStorageDomainInfo, Return response: {'info': {'uuid': u'ad8dfcc9-b91e-4345-9f3c-ad426d1e8e5e', 'version': '3', 'role': 'Master', 'remotePath': u'/export/data', 'type': 'NFS', 'class': 'Data', 'pool': ['00000001-0001-0001-0001-00000000008a'], 'name': 'data'}}

being: 'remotePath': u'/export/data' without '192.168.1.8:'
Comment 5 Piotr Kliczewski 2015-10-02 10:20:13 EDT
It seems that vdsm reports an old IP address which was assigned to this host some time ago but has since changed. The storage domain was not configured for this vdsm; it was a leftover from many installations ago.

I am not sure why a bad configuration breaks the hosted engine setup.
Comment 6 Simone Tiraboschi 2015-10-02 10:26:49 EDT
(In reply to Piotr Kliczewski from comment #5)
> I am not sure why bad configuration breaks hosted engine setup.

It's not really a bad configuration issue: whether 192.168.1.8:/export/data is accessible or not, the issue is not there.

I don't see any valid reason for '192.168.1.8:/export/data' to become just '/export/data' in the response we got.
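
Regardless of why vdsm drops the host part, a defensive parse on the setup side would avoid the IndexError. A minimal sketch under that assumption; split_nfs_remote_path is a hypothetical helper, not the actual ovirt-hosted-engine-setup code:

def split_nfs_remote_path(remote_path):
    # Defensive sketch: str.partition never raises, unlike indexing the
    # result of split(':') as in the traceback above.
    host, sep, path = remote_path.partition(':')
    if not sep:
        # No 'host:' prefix (e.g. the bare u'/export/data' returned by
        # VDSM in this bug): treat the whole value as the path.
        return None, remote_path.rstrip('/') or '/'
    return host, path.rstrip('/') or '/'

split_nfs_remote_path(u'192.168.1.8:/export/data')  # ('192.168.1.8', '/export/data')
split_nfs_remote_path(u'/export/data')              # (None, '/export/data')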
Comment 7 Allon Mureinik 2015-10-06 05:24:41 EDT
Does this happen in 3.6.0 too?
Comment 8 Piotr Kliczewski 2015-10-06 07:06:37 EDT
Yes, I was able to reproduce it using vdsm 3.6 and ovirt-engine 3.6 by running the following steps:

1. make sure that there are no leftover domains
2. host-deploy a vdsm host
3. configure a domain using NFS and the directory /export/data
4. remove vdsm from the host
5. clean the engine DB
6. set up the engine and install vdsm on the same host
7. run hosted-engine --deploy
8. configure "the full shared storage connection path" to /export/data2

Hosted-engine setup fails as above.
Comment 9 Allon Mureinik 2015-10-06 07:43:07 EDT
Thanks Piotr!
Tentatively proposing for 3.6.0 based on that comment.

> 8. configure "the full shared storage connection path" to /export/data2
Simone - is this a valid configuration? Shouldn't the full path be myhost.mydomain:/export/data2 ?
Comment 10 Simone Tiraboschi 2015-10-14 12:01:37 EDT
(In reply to Allon Mureinik from comment #9)
> Thanks Piotr!
> Tentatively proposing for 3.6.0 based on that comment.
> 
> > 8. configure "the full shared storage connection path" to /export/data2
> Simone - is this a valid configuration? Shouldn't the full path be
> myhost.mydomain:/export/data2 ?

From the attached logs I can easily see that the path the user entered was correct:
2015-10-02 15:09:42 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:219 DIALOG:SEND                 Please specify the full shared storage connection path to use (example: host:/path): 
2015-10-02 15:09:50 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:219 DIALOG:RECEIVE    192.168.1.9:/export/data2

The issue is that we then received:
Thread-293::DEBUG::2015-10-02 15:09:50,520::task::993::Storage.TaskManager.Task::(_decref) Task=`96311236-9763-4b5a-aaa0-6770b9967d82`::ref 1 aborting False
Thread-293::INFO::2015-10-02 15:09:50,520::logUtils::51::dispatcher::(wrapper) Run and protect: getStorageDomainInfo, Return response: {'info': {'uuid': u'ad8dfcc9-b91e-4345-9f3c-ad426d1e8e5e', 'version': '3', 'role': 'Master', 'remotePath': u'/export/data', 'type': 'NFS', 'class': 'Data', 'pool': ['00000001-0001-0001-0001-00000000008a'], 'name': 'data'}}

This is not correct.
Comment 11 Red Hat Bugzilla Rules Engine 2015-10-19 06:56:22 EDT
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use the target milestone to plan a fix for an oVirt release.
Comment 12 Sandro Bonazzola 2015-10-26 08:35:17 EDT
This is an automated message. oVirt 3.6.0 RC3 has been released and GA is targeted for next week, Nov 4th 2015.
Please review this bug and, if it is not a blocker, please postpone it to a later release.
All bugs not postponed by the GA release will be automatically re-targeted to:

- 3.6.1 if severity >= high
- 4.0 if severity < high
Comment 13 Sandro Bonazzola 2016-05-02 05:52:16 EDT
Moving from 4.0 alpha to 4.0 beta since 4.0 alpha has already been released and the bug is not ON_QA.
Comment 14 Yaniv Lavi 2016-05-23 09:15:21 EDT
oVirt 4.0 beta has been released, moving to RC milestone.
Comment 15 Yaniv Lavi 2016-05-23 09:20:34 EDT
oVirt 4.0 beta has been released, moving to RC milestone.
Comment 16 Yaniv Lavi 2016-05-30 07:35:35 EDT
The use case here is to use NFS loopback with HE, which is not supported and has locking issues. If you want to be able to install locally, we need to support local storage for HE. Please open an RFE for this or vote up this use case on that RFE.
