Bug 768930 - importing iso domain fails because of wrong mountpoint
Summary: importing iso domain fails because of wrong mountpoint
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: oVirt
Classification: Retired
Component: vdsm
Version: unspecified
Hardware: Unspecified
OS: Unspecified
unspecified
medium
Target Milestone: ---
: ---
Assignee: Rami Vaknin
QA Contact: yeylon@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2011-12-19 13:00 UTC by Matthias H
Modified: 2016-04-18 06:43 UTC (History)
8 users (show)

Fixed In Version: v4.9.2-26-g9cdde3e
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-01-09 13:20:55 UTC
oVirt Team: ---


Attachments (Terms of Use)
vdsm log of failed "import domain" (68.37 KB, text/x-log)
2011-12-20 11:02 UTC, Matthias H
no flags Details

Description Matthias H 2011-12-19 13:00:13 UTC
Hi,

as the summary says importing an ISO domain fails.

Comment 1 Matthias H 2011-12-19 13:08:29 UTC
Sorry, was pressed enter when the text editor had not the focus...

Vdsm creates the wrong mountpoint when I try to import an ISO domain. 

The NFS share has the patch 10.10.11.17:/srv/isostorage/iso but vdsm creates the directory with a trailing "_" which then results in an error, because spmprotect.sh is unable to find the mountpoint.

This only happens when importing a domain, not while adding a new one.

Some debug output:

Thread-3639::DEBUG::2011-12-19 12:56:30,699::safelease::81::Storage.Misc.excCmd::(acquire) '/usr/bin/sudo -n /usr/bin/setsid /usr/bin/ionice -c1 -n0 /bin/su vdsm -s /bin/sh -c "/var/lib/vdsm/prog/libexec/vdsm/spmprotect.sh start d1d12aa7-8ead-4a27-9a1a-53079169aa5c 1 5 /var/lib/vdsm/data-center/mnt/10.10.11.17:_srv_isostorage_iso/d1d12aa7-8ead-4a27-9a1a-53079169aa5c/dom_md/leases 5000 1000 3"' (cwd /var/lib/vdsm/prog/libexec/vdsm)
Thread-3639::DEBUG::2011-12-19 12:56:30,715::safelease::81::Storage.Misc.excCmd::(acquire) FAILED: <err> = ''; <rc> = 1
Thread-3639::ERROR::2011-12-19 12:56:30,716::task::855::TaskManager.Task::(_setError) Task=`c57b6f27-3bd5-4718-aa20-957185f106e2`::Unexpected error
Traceback (most recent call last):
  File "/var/lib/vdsm/prog/share/vdsm/storage/task.py", line 863, in _run
    return fn(*args, **kargs)
  File "/var/lib/vdsm/prog/share/vdsm/logUtils.py", line 38, in wrapper
    res = f(*args, **kwargs)
  File "/var/lib/vdsm/prog/share/vdsm/storage/hsm.py", line 831, in public_attachStorageDomain
    pool.attachSD(sdUUID)
  File "/var/lib/vdsm/prog/share/vdsm/storage/securable.py", line 60, in wrapper
    return f(*args, **kwargs)
  File "/var/lib/vdsm/prog/share/vdsm/storage/sp.py", line 986, in attachSD
    dom.acquireClusterLock(self.id)
  File "/var/lib/vdsm/prog/share/vdsm/storage/sd.py", line 377, in acquireClusterLock
    self._clusterLock.acquire(hostID)
  File "/var/lib/vdsm/prog/share/vdsm/storage/safelease.py", line 83, in acquire
    raise se.AcquireLockFailure(self._sdUUID, rc, out, err)
AcquireLockFailure: Could not obtain lock: "id=d1d12aa7-8ead-4a27-9a1a-53079169aa5c, rc=1, out=['error - lease file does not exist or is not writeable', 'usage: /var/lib/vdsm/prog/libexec/vdsm/spmprotect.sh COMMAND PARAMETERS', 'Commands:', '  start { spUUID hostId renewal_interval_sec lease_path[:offset] lease_time_ms io_op_timeout_ms fail_retries }', 'Parameters:', '  spUUID -                pool uuid', '  hostId -                host id in pool', '  renewal_interval_sec -  intervals for lease renewals attempts', '  lease_path -            path to lease file/volume', '  offset -                offset of lease within file', '  lease_time_ms -         time limit within which lease must be renewed (at least 2*renewal_interval_sec)', '  io_op_timeout_ms -      I/O operation timeout', '  fail_retries -          Maximal number of attempts to retry to renew the lease before fencing (<= lease_time_ms/renewal_interval_sec)'], err=[]"
Thread-3639::DEBUG::2011-12-19 12:56:30,717::task::874::TaskManager.Task::(_run) Task=`c57b6f27-3bd5-4718-aa20-957185f106e2`::Task._run: c57b6f27-3bd5-4718-aa20-957185f106e2 ('d1d12aa7-8ead-4a27-9a1a-53079169aa5c', '9a79918c-2a2a-11e1-a08a-00142272a6db') {} failed - stopping task


and the ls output:
# ls /var/lib/vdsm/data-center/mnt/
10.10.11.17:_srv_isostorage_iso_  blockSD

Comment 2 Dan Kenigsberg 2011-12-19 13:22:33 UTC
heh, that's just another case of bug 768034.

As a workaround, please make sure that you always pass the path 10.10.11.17:/srv/isostorage/iso without a trailing space (grep vdsm log for /srv/isostorage/iso/ to see where it was included).

Comment 3 Matthias H 2011-12-19 14:29:39 UTC
I had this error several times now and I'm pretty sure that there was no trailing space in at least one try.

If you look at the path the spmprotect gets everything seems to be right:
/var/lib/vdsm/data-center/mnt/10.10.11.17:_srv_isostorage_iso/d1d12aa7-8ead-4a27-9a1a-53079169aa5c/dom_md/leases

Just the mountpoint seems not to be created properly.

Comment 4 Rami Vaknin 2011-12-19 14:45:15 UTC
s/trailing space/trailing slash/

Comment 5 Matthias H 2011-12-19 15:18:28 UTC
Ok, I detached and removed a working ISO share from within the webadmin interface and tried to import it.

I took care that I did not enter a trailing slash, but the import fails with the same error. A mountpoint with a trailing underscore character get's created and spmprotect.sh fails because it gets the correct path handed over.

Comment 6 Rami Vaknin 2011-12-19 15:53:47 UTC
1. There is an issue that ovirt adds the trailing slash during import domain even if you didn't add it in the webadmin, I think that the is client side validation that doesn't allow you to put a trailing slash.

2. The trailing slash is substituted with underscore by vdsm during creation of the mount point in the node.

3. When you import an imported-before domain, it already exists in the database (in storage_server_connections table) due to and issue that ovirt does not clean it at removal time.

You need to verify that all connectStorageServer commands in vdsm does not contain a trailing slash in the nfs' path, it would be better if you could attach the full log.

Comment 7 Matthias H 2011-12-20 11:02:08 UTC
Created attachment 548817 [details]
vdsm log of failed "import domain"

It took a while to reproduce, but here it is.

Around line 310 there's the traceback from the failed call of spmprotect.sh.


And this might be of interest, too:
# mount | grep 10.10.11.17
10.10.11.17:/srv/isostorage/iso/ on /var/lib/vdsm/data-center/mnt/10.10.11.17:_srv_isostorage_iso_ type nfs (rw,soft,timeo=600,retrans=6,nosharecache,vers=3,addr=10.10.11.17)

Comment 8 Rami Vaknin 2011-12-21 04:24:52 UTC
Thank you, here it is - the last connectStorageServer before the traceback:

Thread-112::INFO::2011-12-20 10:57:53,341::logUtils::37::dispatcher::(wrapper) Run and protect: public_connectStorageServer(domType=1, spUUID='00000000-0000-0000-0000-000000000000', conList=[{'connection': '10.10.11.17:/srv/isostorage/iso/', 'iqn': '', 'portal': '', 'user': '', 'password': '******', 'id': '1e3d56af-20fd-4609-a784-3d8506352185', 'port': ''}], options=None)


The path does contain a trailing slash ("10.10.11.17:/srv/isostorage/iso/"), hence it was mounted to:
/var/lib/vdsm/data-center/mnt/10.10.11.17:_srv_isostorage_iso_

Where the trailing underscore is there because of the trailing slash in the nfs path.

The problem is that the spmprotect.sh gets a path to the "leases" file as parameter, this path is wrong because it missing the trailing underscore:
/var/lib/vdsm/data-center/mnt/10.10.11.17:_srv_isostorage_iso/fec63632-26d4-4e49-98dc-97e6b0b48f31/dom_md/leases


Thread-113::DEBUG::2011-12-20 10:57:53,397::safelease::81::Storage.Misc.excCmd::(acquire) '/usr/bin/sudo -n /usr/bin/setsid /usr/bin/ionice -c1 -n0 /bin/su vdsm -s /bin/sh -c "/var/lib/vdsm/prog/libexec/vdsm/spmprotect.sh start fec63632-26d4-4e49-98dc-97e6b0b48f31 1 5 /var/lib/vdsm/data-center/mnt/10.10.11.17:_srv_isostorage_iso/fec63632-26d4-4e49-98dc-97e6b0b48f31/dom_md/leases 5000 1000 3"' (cwd /var/lib/vdsm/prog/libexec/vdsm)

Comment 9 Matthias H 2011-12-21 09:45:31 UTC
Ok, then oVirt really adds a trailing / to the path.

I think this should be fixed in oVirt, but also in vdsm. A trailing / is a common error and vdsm should just truncate it. This shouldn't be to hard to implement, if you can tell me the where to look, I'll write a patch for vdsm.

Comment 10 Rami Vaknin 2011-12-22 07:00:15 UTC
You're absolutly right, I assume that it will be fixed soon, I think that there is a patch in review which should handle those slashes.

Anyway, you're always most welcome to contribute, for this issue I would take a look at the following api functions in $git/vdsm/storage/hsm.py:

public_connectStorageServer
public_disconnectStorageServer
public_validateStorageDomain

They use the StorageServerConnection class in storage_connection.py (same dir), which is probably a good place to strip the trailing slash from each dict in the conList of file-type storages using os.path.normpath.

Comment 11 Matthias H 2011-12-22 10:41:33 UTC
I took a look at gerrit, but I haven't found any open report that deals with this.

The patch is much shorter that expected and it looks as if it has no side effects. I grant you all rights on this.

diff --git a/vdsm/storage/fileUtils.py b/vdsm/storage/fileUtils.py
index 45f1ce4..40ae656 100644
--- a/vdsm/storage/fileUtils.py
+++ b/vdsm/storage/fileUtils.py
@@ -185,6 +185,7 @@ def transformPath(remotePath):
     """
     Transform remote path to new one for local mount
     """
+    remotePath = os.path.normpath(remotePath) #normalize the path beforehand
     return remotePath.replace('_','__').replace('/','_')
 
 def validateAccess(targetPath, perms=(os.R_OK | os.W_OK | os.X_OK)):

Comment 12 Rami Vaknin 2011-12-25 09:37:47 UTC
Thanks, here is the patch I've talked about (I see it's already pushed), it refactors the whole mount logic, could you please tell if it solves the bug?

http://gerrit.ovirt.org/gitweb?p=vdsm.git;a=commitdiff;h=9cdde3e82fa056ed3c664365655c8d8f0ea033ee

Comment 13 Ayal Baron 2011-12-27 14:44:10 UTC
(In reply to comment #12)
> Thanks, here is the patch I've talked about (I see it's already pushed), it
> refactors the whole mount logic, could you please tell if it solves the bug?
> 
> http://gerrit.ovirt.org/gitweb?p=vdsm.git;a=commitdiff;h=9cdde3e82fa056ed3c664365655c8d8f0ea033ee

It does not.
transformPath should still normalize the path and the function hasn't moved so above patch still applies.
Rami, if you could verify the patch, that would be great.

Comment 14 Matthias H 2012-01-09 12:44:42 UTC
Finally I got the chance to test this and it works now.

This bug can be marked as resolved.

Comment 15 Yaniv Kaul 2012-01-09 13:20:55 UTC
(In reply to comment #14)
> Finally I got the chance to test this and it works now.
> 
> This bug can be marked as resolved.

Per above comment, closing BZ. Thanks!


Note You need to log in before you can comment on or make changes to this bug.