In the provided log I can see the following:
1. A domain is created - we perform the above:
Thread-13::DEBUG::2014-12-05 11:45:02,591::resourceManager::421::ResourceManager::(registerNamespace) Registering namespace '00968ef6-441e-4313-9fbe-e49b08be657c_volumeNS'
Thread-13::DEBUG::2014-12-05 11:45:02,591::clusterlock::144::initSANLock::(initSANLock) Initializing SANLock for domain 00968ef6-441e-4313-9fbe-e49b08be657c
Thread-13::DEBUG::2014-12-05 11:45:02,705::sd::434::Storage.StorageDomain::(initSPMlease) lease initialized successfully
Thread-13::DEBUG::2014-12-05 11:45:02,705::hsm::2647::Storage.HSM::(createStorageDomain) knownSDs: {00968ef6-441e-4313-9fbe-e49b08be657c: storage.nfsSD.findDomain}
initSANLock, in turn, performs the following:
sanlock.init_lockspace(sdUUID, idsPath)
sanlock.init_resource(sdUUID, SDM_LEASE_NAME,
                      [(leasesPath, SDM_LEASE_OFFSET)])
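For context, here is a minimal self-contained sketch of that initialization (assuming the python-sanlock binding; the mount path and the SDM_LEASE_NAME/SDM_LEASE_OFFSET values are illustrative assumptions, not taken from this build):

    # Sketch: write the on-disk sanlock records for a storage domain,
    # mirroring what initSANLock does. Nothing is acquired at this point.
    import sanlock

    sdUUID = "00968ef6-441e-4313-9fbe-e49b08be657c"
    domMD = "/rhev/data-center/mnt/server:_export/" + sdUUID + "/dom_md"  # hypothetical mount
    idsPath = domMD + "/ids"        # lockspace file (host ids)
    leasesPath = domMD + "/leases"  # resource lease file
    SDM_LEASE_NAME = "SDM"          # assumed lease name
    SDM_LEASE_OFFSET = 512 * 2048   # assumed 1 MiB offset into "leases"

    sanlock.init_lockspace(sdUUID, idsPath)                # format the lockspace record
    sanlock.init_resource(sdUUID, SDM_LEASE_NAME,          # format the SDM lease record
                          [(leasesPath, SDM_LEASE_OFFSET)])

Both calls only format on-disk records; no lockspace is joined and no lease is held yet.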
2. We disconnect from the domain's storage server.
3. We reconnect to the domain's storage server.
4. We immediately attempt createStoragePool with that domain as the master, and it fails:
Thread-13::INFO::2014-12-05 11:45:03,963::logUtils::44::dispatcher::(wrapper) Run and protect: createStoragePool(poolType=None, spUUID='195047ba-93e1-4835-8287-61fa5b7fd1be', poolName='DC_NEW', masterDom='00968ef6-441e-4313-9fbe-e49b08be657c', domList=['00968ef6-441e-4313-9fbe-e49b08be657c'], masterVersion=1, lockPolicy=None, lockRenewalIntervalSec=5, leaseTimeSec=60, ioOpTimeoutSec=10, leaseRetries=3, options=None)
Thread-13::INFO::2014-12-05 11:45:04,036::clusterlock::184::SANLock::(acquireHostId) Acquiring host id for domain 00968ef6-441e-4313-9fbe-e49b08be657c (id: 250)
Thread-13::ERROR::2014-12-05 11:45:05,037::task::866::TaskManager.Task::(_setError) Task=`ba7f2e70-5313-417e-be34-40b4fb4ffc85`::Unexpected error
Traceback (most recent call last):
File "/usr/share/vdsm/storage/task.py", line 873, in _run
return fn(*args, **kargs)
File "/usr/share/vdsm/logUtils.py", line 45, in wrapper
res = f(*args, **kwargs)
File "/usr/share/vdsm/storage/hsm.py", line 988, in createStoragePool
leaseParams)
File "/usr/share/vdsm/storage/sp.py", line 573, in create
self._acquireTemporaryClusterLock(msdUUID, leaseParams)
File "/usr/share/vdsm/storage/sp.py", line 515, in _acquireTemporaryClusterLock
msd.acquireHostId(self.id)
File "/usr/share/vdsm/storage/sd.py", line 468, in acquireHostId
self._clusterLock.acquireHostId(hostId, async)
File "/usr/share/vdsm/storage/clusterlock.py", line 199, in acquireHostId
raise se.AcquireHostIdFailure(self._sdUUID, e)
AcquireHostIdFailure: Cannot acquire host id: ('00968ef6-441e-4313-9fbe-e49b08be657c', SanlockException(19, 'Sanlock lockspace add failure', 'No such device'))
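The failing frame is SANLock.acquireHostId in clusterlock.py; acquiring a host id means asking the sanlock daemon to join the domain's lockspace through the ids file. A simplified sketch of that path (my reading of the 3.4-era code; the exception-handling details are assumptions):

    # Sketch: vdsm's acquireHostId boils down to sanlock.add_lockspace on
    # the domain's ids file; any unexpected SanlockException is wrapped.
    import errno
    import sanlock

    class AcquireHostIdFailure(Exception):  # stand-in for vdsm's se.AcquireHostIdFailure
        pass

    def acquire_host_id(sdUUID, idsPath, hostId):
        try:
            sanlock.add_lockspace(sdUUID, hostId, idsPath)
        except sanlock.SanlockException as e:
            # Already-joined / join-in-progress are tolerated; anything
            # else (here: errno 19, ENODEV) is raised as a hard failure.
            if e.errno not in (errno.EEXIST, errno.EINPROGRESS):
                raise AcquireHostIdFailure(sdUUID, e)

So the SanlockException(19, ...) above is simply the daemon's add_lockspace result bubbling up through vdsm.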
In the sanlock log I see the following errors:
2014-12-05 11:45:04+0000 4032 [35940]: s1 lockspace 00968ef6-441e-4313-9fbe-e49b08be657c:250:/rhev/data-center/mnt/10.34.63.202:_mnt_export_nfs_lv3_lsvaty_ppc-nfs/00968ef6-441e-4313-9fbe-e49b08be657c/dom_md/ids:0
2014-12-05 11:45:04+0000 4032 [68886]: open error -13 /rhev/data-center/mnt/10.34.63.202:_mnt_export_nfs_lv3_lsvaty_ppc-nfs/00968ef6-441e-4313-9fbe-e49b08be657c/dom_md/ids
2014-12-05 11:45:04+0000 4032 [68886]: s1 open_disk /rhev/data-center/mnt/10.34.63.202:_mnt_export_nfs_lv3_lsvaty_ppc-nfs/00968ef6-441e-4313-9fbe-e49b08be657c/dom_md/ids error -13
2014-12-05 11:45:05+0000 4033 [35940]: s1 add_lockspace fail result -19
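Those return codes are plain errnos: -13 is EACCES (the sanlock daemon cannot open the ids file, typically an ownership/permission problem on the export), and the add_lockspace result -19 is ENODEV, which matches the SanlockException(19, 'No such device') in the vdsm traceback. Quick sanity check of the mapping:

    import errno, os
    print(errno.errorcode[13], os.strerror(13))  # EACCES: Permission denied
    print(errno.errorcode[19], os.strerror(19))  # ENODEV: No such device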
Lukas, can you please check the ownership and permissions of the files under /dom_md and attach the output?
Does this fail consistently? After the failure you should still have the domain - what happens if you try to create the storage pool again? Does it always happen with this domain?
Nir, any suggestions based on other related issues you've handled?
Comment 22 Michal Skrivanek 2015-01-12 10:31:07 UTC
Right. 3.4.3 ppc is not compatible with 3.4.4, and vice versa.