Bug 1170202 - [PPC] Failed to attach NFS storage: Error while executing action Attach Storage Domain: AcquireHostIdFailure
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.4.3
Hardware: ppc64
OS: Linux
Priority: medium
Severity: urgent
Target Milestone: ---
Target Release: 3.4.5
Assignee: Liron Aravot
QA Contact: Aharon Canan
URL:
Whiteboard: storage
Depends On: 1160204
Blocks: 1122979
 
Reported: 2014-12-03 13:47 UTC by Lukas Svaty
Modified: 2016-04-01 21:05 UTC
CC: 16 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 1160204
Environment:
Last Closed: 2015-01-12 10:31:07 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:



Comment 18 Liron Aravot 2014-12-14 11:17:46 UTC
In the provided log I can see the following:
1. A domain is created - the log shows the SANLock initialization:
Thread-13::DEBUG::2014-12-05 11:45:02,591::resourceManager::421::ResourceManager::(registerNamespace) Registering namespace '00968ef6-441e-4313-9fbe-e49b08be657c_volumeNS'
Thread-13::DEBUG::2014-12-05 11:45:02,591::clusterlock::144::initSANLock::(initSANLock) Initializing SANLock for domain 00968ef6-441e-4313-9fbe-e49b08be657c
Thread-13::DEBUG::2014-12-05 11:45:02,705::sd::434::Storage.StorageDomain::(initSPMlease) lease initialized successfully
Thread-13::DEBUG::2014-12-05 11:45:02,705::hsm::2647::Storage.HSM::(createStorageDomain) knownSDs: {00968ef6-441e-4313-9fbe-e49b08be657c: storage.nfsSD.findDomain}

initSANLock performs the following calls:
        # initialize the lockspace (delta-lease area) in the domain's dom_md/ids file
        sanlock.init_lockspace(sdUUID, idsPath)
        # initialize the SDM cluster lease resource in the domain's dom_md/leases file
        sanlock.init_resource(sdUUID, SDM_LEASE_NAME,
                              [(leasesPath, SDM_LEASE_OFFSET)])

2. We disconnect from the domain's storage server.


3. We connect to the domain's storage server.

4. We immediately attempt createStoragePool with that domain as the master, and it fails:

Thread-13::INFO::2014-12-05 11:45:03,963::logUtils::44::dispatcher::(wrapper) Run and protect: createStoragePool(poolType=None, spUUID='195047ba-93e1-4835-8287-61fa5b7fd1be', poolName='DC_NEW', masterDom='00968ef6-441e-4313-9fbe-e49b08be657c', domList=['00968ef6-441e-4313-9fbe-e49b08be657c'], masterVersion=1, lockPolicy=None, lockRenewalIntervalSec=5, leaseTimeSec=60, ioOpTimeoutSec=10, leaseRetries=3, options=None)


Thread-13::INFO::2014-12-05 11:45:04,036::clusterlock::184::SANLock::(acquireHostId) Acquiring host id for domain 00968ef6-441e-4313-9fbe-e49b08be657c (id: 250)
Thread-13::ERROR::2014-12-05 11:45:05,037::task::866::TaskManager.Task::(_setError) Task=`ba7f2e70-5313-417e-be34-40b4fb4ffc85`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 45, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 988, in createStoragePool
    leaseParams)
  File "/usr/share/vdsm/storage/sp.py", line 573, in create
    self._acquireTemporaryClusterLock(msdUUID, leaseParams)
  File "/usr/share/vdsm/storage/sp.py", line 515, in _acquireTemporaryClusterLock
    msd.acquireHostId(self.id)
  File "/usr/share/vdsm/storage/sd.py", line 468, in acquireHostId
    self._clusterLock.acquireHostId(hostId, async)
  File "/usr/share/vdsm/storage/clusterlock.py", line 199, in acquireHostId
    raise se.AcquireHostIdFailure(self._sdUUID, e)
AcquireHostIdFailure: Cannot acquire host id: ('00968ef6-441e-4313-9fbe-e49b08be657c', SanlockException(19, 'Sanlock lockspace add failure', 'No such device'))
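
The failing frame is acquireHostId in clusterlock.py, which wraps sanlock.add_lockspace(). Roughly, it looks like the sketch below - this is reconstructed from the traceback, not the actual vdsm source; the class layout, attribute names and the EEXIST handling are assumptions, and it is Python 2-era code, so 'async' is still a valid identifier:

# Sketch only: shows how an add_lockspace failure surfaces as AcquireHostIdFailure.
import errno
import threading

import sanlock


class AcquireHostIdFailure(Exception):
    """Stand-in for the vdsm storage exception of the same name."""


class SANLock(object):
    def __init__(self, sdUUID, idsPath):
        self._sdUUID = sdUUID
        self._idsPath = idsPath
        self._lock = threading.Lock()

    def acquireHostId(self, hostId, async=False):
        with self._lock:
            try:
                # Join the domain lockspace: acquire a delta lease for this
                # host id on the domain's dom_md/ids file. This is the call
                # that fails here with SanlockException(19, ...).
                sanlock.add_lockspace(self._sdUUID, hostId, self._idsPath,
                                      async=async)
            except sanlock.SanlockException as e:
                # An already-acquired host id (EEXIST) is assumed to be benign;
                # anything else is reported as AcquireHostIdFailure.
                if e.errno != errno.EEXIST:
                    raise AcquireHostIdFailure(self._sdUUID, e)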


In the sanlock log I see the following error:
2014-12-05 11:45:04+0000 4032 [35940]: s1 lockspace 00968ef6-441e-4313-9fbe-e49b08be657c:250:/rhev/data-center/mnt/10.34.63.202:_mnt_export_nfs_lv3_lsvaty_ppc-nfs/00968ef6-441e-4313-9fbe-e49b08be657c/dom_md/ids:0
2014-12-05 11:45:04+0000 4032 [68886]: open error -13 /rhev/data-center/mnt/10.34.63.202:_mnt_export_nfs_lv3_lsvaty_ppc-nfs/00968ef6-441e-4313-9fbe-e49b08be657c/dom_md/ids
2014-12-05 11:45:04+0000 4032 [68886]: s1 open_disk /rhev/data-center/mnt/10.34.63.202:_mnt_export_nfs_lv3_lsvaty_ppc-nfs/00968ef6-441e-4313-9fbe-e49b08be657c/dom_md/ids error -13
2014-12-05 11:45:05+0000 4033 [35940]: s1 add_lockspace fail result -19
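
For reference, sanlock logs negative errno values; a quick way to decode the two seen above (just an illustration, not part of the original logs):

import errno
import os

# open error -13 on the ids file, and add_lockspace result -19:
for err in (13, 19):
    print(errno.errorcode[err], os.strerror(err))
# -> EACCES Permission denied
#    ENODEV No such device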


Lukas, can you please check the files under /dom_md and attach the output?
Does this fail consistently? After the failure you should still have the domain, so what happens if you try to create the storage pool again? Does it always happen with this domain?
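
(For the dom_md check, something along these lines would capture the relevant details - a hypothetical helper, with the path taken from the sanlock log above:)

import os
import stat

DOM_MD = ('/rhev/data-center/mnt/10.34.63.202:'
          '_mnt_export_nfs_lv3_lsvaty_ppc-nfs/'
          '00968ef6-441e-4313-9fbe-e49b08be657c/dom_md')

for name in sorted(os.listdir(DOM_MD)):
    st = os.stat(os.path.join(DOM_MD, name))
    # ownership, permissions and size are the interesting bits for the -13 open error
    print('%s uid=%d gid=%d mode=%o size=%d' % (
        name, st.st_uid, st.st_gid, stat.S_IMODE(st.st_mode), st.st_size))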

Nir, any suggestion based on other related issues you've handled?

Comment 22 Michal Skrivanek 2015-01-12 10:31:07 UTC
Right. 3.4.3 ppc is not compatible with 3.4.4, and vice versa.

