Bug 968891 - Attaching NFS storage domain failed with 'Sanlock lockspace add failure'
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Version: 3.2.0
Assigned To: Nobody's working on this, feel free to take it
Depends On:
Reported: 2013-05-30 04:33 EDT by Katarzyna Jachim
Modified: 2016-02-10 15:21 EST
CC List: 10 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Last Closed: 2013-05-30 08:32:02 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

Attachments (Terms of Use)
logs (vdsm, sanlock, engine, server, /var/log/messages...) (608.86 KB, application/x-bzip2)
2013-05-30 04:33 EDT, Katarzyna Jachim

Description Katarzyna Jachim 2013-05-30 04:33:02 EDT
Created attachment 754716 [details]
logs (vdsm, sanlock, engine, server, /var/log/messages...)

Description of problem:

Attaching an NFS storage domain failed with the following error in vdsm.log:

Thread-111::ERROR::2013-05-30 01:20:18,469::task::850::TaskManager.Task::(_setError) Task=`98ba799f-078e-418b-832c-cb03181f328e`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/", line 857, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/", line 41, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/", line 1098, in attachStorageDomain
  File "/usr/share/vdsm/storage/", line 68, in wrapper
    return f(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/", line 932, in attachSD
  File "/usr/share/vdsm/storage/", line 445, in acquireHostId
    self._clusterLock.acquireHostId(hostId, async)
  File "/usr/share/vdsm/storage/", line 188, in acquireHostId
    raise se.AcquireHostIdFailure(self._sdUUID, e)
AcquireHostIdFailure: Cannot acquire host id: ('795c691b-0aaf-4068-bb29-8ad969bc936a', SanlockException(-202, 'Sanlock lockspace add failure', 'Sanlock exception'))
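For illustration, the bottom two frames of the traceback show a common wrapping pattern: the clustered-lock code catches the low-level SanlockException and re-raises it as AcquireHostIdFailure with the storage-domain UUID attached for context. A minimal self-contained sketch of that pattern (the class and function definitions here are simplified stand-ins, not vdsm's actual code):

```python
class SanlockException(Exception):
    """Stand-in mimicking the (errno, message, detail) shape in the log."""
    def __init__(self, errno, message, detail):
        super().__init__(errno, message, detail)
        self.errno = errno

class AcquireHostIdFailure(Exception):
    """Stand-in wrapping the sanlock error together with the domain UUID."""

def acquire_host_id(sd_uuid, host_id, add_lockspace):
    # Mirrors the pattern in the last traceback frame: catch the
    # low-level error and re-raise it with the domain UUID attached.
    try:
        add_lockspace(sd_uuid, host_id)
    except SanlockException as e:
        raise AcquireHostIdFailure(sd_uuid, e)

def failing_add(sd_uuid, host_id):
    # Simulate the failure from the log; -202 is sanlock's I/O timeout result
    raise SanlockException(-202, 'Sanlock lockspace add failure',
                           'Sanlock exception')

try:
    acquire_host_id('795c691b-0aaf-4068-bb29-8ad969bc936a', 1, failing_add)
except AcquireHostIdFailure as e:
    # Prints the wrapped (uuid, SanlockException) pair, matching the
    # shape of the AcquireHostIdFailure message above
    print('Cannot acquire host id:', e.args)
```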

Version-Release number of selected component (if applicable):

How reproducible:
Happened once in automated tests.

Steps to Reproduce:
1. add, attach & activate master NFS storage domain 
2. add another NFS storage domain (NFS v3) 
3. try to attach the storage domain

I have no idea whether it matters that it is a non-master domain or that it uses NFSv3. The failure is definitely non-deterministic: a re-run of the test passed.

Actual results:
Attach failed with sanlock exception

Expected results:
Storage domain should be attached

Additional info:
Comment 1 Dafna Ron 2013-05-30 08:32:02 EDT
It's not a bug.
sanlock is doing what it's supposed to do.
We have a race here:
the create and connect of the pool was still holding the lock on the master domain.
When a second domain is added while that lock is still held, sanlock cannot acquire the lockspace and the add-domain operation fails.
Closing as not a bug.
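The race described above is the general pattern of two independent flows (pool create/connect and domain attach) touching shared lease state without coordination. As a purely illustrative sketch, serializing the two flows with a shared lock avoids the overlap (all names here are hypothetical, not vdsm's actual design, which coordinates through sanlock leases rather than in-process locks):

```python
import threading

# Hypothetical shared lock serializing the two racing flows; this is
# an illustration of the race, not vdsm's actual mechanism.
pool_lock = threading.Lock()

def create_and_connect_pool(connect):
    # Hold the lock for the whole create/connect sequence, mirroring
    # the flow that was "still holding the lock on the master domain".
    with pool_lock:
        return connect()

def attach_domain(add_lockspace):
    # The attach flow waits for the pool flow to finish instead of
    # racing it and hitting the lockspace-add failure.
    with pool_lock:
        return add_lockspace()
```

With this ordering, the second domain's lockspace add only starts after the pool flow has released its lock, so the transient failure seen in the test cannot occur.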

Below is the thread from the sanlock log showing the race on the locks. The first domain is 28a3f1a9-7e8d-4b4b-8c69-1410f769aedc and the second domain is 795c691b-0aaf-4068-bb29-8ad969bc936a.

2013-05-30 01:16:50+0300 22 [2592]: sanlock daemon started 2.6 host 104b6509-03c1-43c3-8d80-488158a1809e.puma19.scl
2013-05-30 01:18:42+0300 133 [2597]: s1 lockspace 28a3f1a9-7e8d-4b4b-8c69-1410f769aedc:250:/rhev/data-center/mnt/
2013-05-30 01:19:03+0300 154 [2598]: s1:r1 resource 28a3f1a9-7e8d-4b4b-8c69-1410f769aedc:SDM:/rhev/data-center/mnt/ for 3,12,3729
2013-05-30 01:19:10+0300 161 [2598]: s2 lockspace 28a3f1a9-7e8d-4b4b-8c69-1410f769aedc:1:/rhev/data-center/mnt/
2013-05-30 01:19:31+0300 182 [2597]: s2:r2 resource 28a3f1a9-7e8d-4b4b-8c69-1410f769aedc:SDM:/rhev/data-center/mnt/ for 3,12,3729
2013-05-30 01:19:43+0300 194 [2597]: s3 lockspace 795c691b-0aaf-4068-bb29-8ad969bc936a:1:/rhev/data-center/mnt/
2013-05-30 01:20:13+0300 224 [6557]: 795c691b aio timeout 0 0x7f33e80008c0:0x7f33e80008d0:0x7f33e8101000 ioto 10 to_count 1
2013-05-30 01:20:13+0300 224 [6557]: read_sectors delta_leader offset 0 rv -202 /rhev/data-center/mnt/
2013-05-30 01:20:18+0300 229 [6557]: 795c691b aio collect 0x7f33e80008c0:0x7f33e80008d0:0x7f33e8101000 result 512:0 close free
2013-05-30 01:20:18+0300 229 [2597]: s3 add_lockspace fail result -202
2013-05-30 01:20:43+0300 254 [2598]: s4 lockspace 28a3f1a9-7e8d-4b4b-8c69-1410f769aedc:1:/rhev/data-center/mnt/
2013-05-30 01:21:05+0300 276 [2597]: s4:r3 resource 28a3f1a9-7e8d-4b4b-8c69-1410f769aedc:SDM:/rhev/data-center/mnt/ for 3,12,3729
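In the log, s3 fails with result -202 after an aio timeout (ioto 10) while reading the delta leader; since the failure is transient (the re-run passed), one illustrative mitigation is to treat -202 as retryable at the caller. A hedged sketch, assuming a hypothetical `add_lockspace` callable and a simplified exception type (neither is vdsm's real API):

```python
import time

class SanlockException(Exception):
    """Simplified stand-in carrying sanlock's numeric result code."""
    def __init__(self, errno, message):
        super().__init__(errno, message)
        self.errno = errno

def add_lockspace_with_retry(add_lockspace, lockspace, host_id,
                             retries=3, delay=5.0):
    """Treat -202 (I/O timeout) as transient and retry a few times.

    Illustrative only: a real fix would serialize domain attach against
    the pool create/connect path rather than retrying blindly.
    """
    for attempt in range(retries):
        try:
            return add_lockspace(lockspace, host_id)
        except SanlockException as e:
            # Re-raise immediately for non-timeout errors, or once
            # the retry budget is exhausted.
            if e.errno != -202 or attempt == retries - 1:
                raise
            time.sleep(delay)

# Simulate the race window: the first attempt times out, the second succeeds
attempts = []
def flaky_add(lockspace, host_id):
    attempts.append(1)
    if len(attempts) == 1:
        raise SanlockException(-202, 'Sanlock lockspace add failure')
    return 's3'

print(add_lockspace_with_retry(flaky_add, '795c691b', 1, delay=0.0))
```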
