Bug 1302745

Summary: [engine-setup] creation of iso domain path is not rolled back, a next attempt leaves an empty domain
Product: [oVirt] ovirt-engine
Reporter: Yedidyah Bar David <didi>
Component: Setup.Engine
Assignee: Yedidyah Bar David <didi>
Status: CLOSED WONTFIX
QA Contact: Pavel Stehlik <pstehlik>
Severity: medium
Docs Contact:
Priority: low
Version: 3.6.1
CC: bugs, didi, lveyde, rmartins, sbonazzo, stirabos, ylavi
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-06-02 08:12:22 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Integration
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Yedidyah Bar David 2016-01-28 14:06:41 UTC
Description of problem:

If engine-setup creates an iso domain and later fails and rolls back, we remove the files we created there but not the directories, so the directories are left behind.

The next attempt to run engine-setup with the answer file created by the first one will try to create an iso domain in the same location. It checks whether that location exists and, if so, uses the name of the directory found there as the uuid of the domain. Later, if such a uuid is defined, we do not populate it with actual files (including metadata), thus leaving it empty.
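
A minimal sketch of the kind of check that seems to be missing here: only reuse a uuid-named directory if it actually holds metadata. This is not the engine-setup code. The function name find_reusable_domain is hypothetical, and the layout <export path>/<uuid>/dom_md/metadata is an assumption about file-based domains.

```python
import os
import re

# Assumed layout of a file-based domain: <export>/<uuid>/dom_md/metadata
_UUID_RE = re.compile(
    r'^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$'
)


def find_reusable_domain(export_path):
    """Return the uuid of a populated domain under export_path, or None.

    A uuid-named directory is reused only if its metadata file exists
    and is non-empty; a bare directory left behind by a rolled-back
    run is ignored, so the caller would create a fresh domain instead.
    """
    if not os.path.isdir(export_path):
        return None
    for name in sorted(os.listdir(export_path)):
        if not _UUID_RE.match(name):
            continue
        metadata = os.path.join(export_path, name, 'dom_md', 'metadata')
        if os.path.isfile(metadata) and os.path.getsize(metadata) > 0:
            return name
    return None
```

With a check like this, the leftover empty directory from the failed run would not be mistaken for an existing domain.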

engine-setup will finish successfully, but a later attempt to attach the domain will fail with:

engine.log:

2016-01-28 15:27:34,235 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.AttachStorageDomainVDSCommand] (org.ovirt.thread.pool-7-thread-26) [fd1aad0] Command 'AttachStorageDomainVDSCommand( AttachStorageDomainVDSCommandParameters:{runAsync='true', storagePoolId='00000001-0001-0001-0001-00000000023b', ignoreFailoverLimit='false', storageDomainId='3b1a2f5b-f807-4e09-86fa-a31c9e59305c'})' execution failed: IRSGenericException: IRSErrorException: Failed to AttachStorageDomainVDS, error = Error in storage domain action: (u'sdUUID=3b1a2f5b-f807-4e09-86fa-a31c9e59305c, spUUID=00000001-0001-0001-0001-00000000023b',), code = 350
2016-01-28 15:27:34,235 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.AttachStorageDomainVDSCommand] (org.ovirt.thread.pool-7-thread-26) [fd1aad0] FINISH, AttachStorageDomainVDSCommand, log id: 2d2f5ac6
2016-01-28 15:27:34,235 ERROR [org.ovirt.engine.core.bll.storage.domain.AttachStorageDomainToPoolCommand] (org.ovirt.thread.pool-7-thread-26) [fd1aad0] Command 'org.ovirt.engine.core.bll.storage.domain.AttachStorageDomainToPoolCommand' failed: EngineException: org.ovirt.engine.core.vdsbroker.irsbroker.IrsOperationFailedNoFailoverException: IRSGenericException: IRSErrorException: Failed to AttachStorageDomainVDS, error = Error in storage domain action: (u'sdUUID=3b1a2f5b-f807-4e09-86fa-a31c9e59305c, spUUID=00000001-0001-0001-0001-00000000023b',), code = 350 (Failed with error StorageDomainActionError and code 350)

vdsm.log:

jsonrpc.Executor/7::DEBUG::2016-01-28 15:27:33,168::fileSD::159::Storage.StorageDomainManifest::(__init__) Reading domain in path /rhev/data-center/mnt/didi-f19-engine.eng.lab.tlv.redhat.com:_var_lib_exports_iso/3b1a2f5b-f807-4e09-86fa-a31c9e59305c
jsonrpc.Executor/7::DEBUG::2016-01-28 15:27:33,169::__init__::318::IOProcessClient::(_run) Starting IOProcess...
jsonrpc.Executor/7::DEBUG::2016-01-28 15:27:33,198::persistentDict::192::Storage.PersistentDict::(__init__) Created a persistent dict with FileMetadataRW backend
jsonrpc.Executor/7::DEBUG::2016-01-28 15:27:33,202::persistentDict::234::Storage.PersistentDict::(refresh) read lines (FileMetadataRW)=[]
jsonrpc.Executor/7::DEBUG::2016-01-28 15:27:33,203::persistentDict::252::Storage.PersistentDict::(refresh) Empty metadata
jsonrpc.Executor/7::ERROR::2016-01-28 15:27:33,203::task::867::Storage.TaskManager.Task::(_setError) Task=`130a4b02-138c-401b-b618-31f50760c333`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 874, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 49, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 1210, in attachStorageDomain
    pool.attachSD(sdUUID)
  File "/usr/share/vdsm/storage/securable.py", line 77, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 884, in attachSD
    dom = sdCache.produce(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 100, in produce
    domain.getRealDomain()
  File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
    return self._cache._realProduce(self._sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 124, in _realProduce
    domain = self._findDomain(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 143, in _findDomain
    dom = findMethod(sdUUID)
  File "/usr/share/vdsm/storage/nfsSD.py", line 122, in findDomain
    return NfsStorageDomain(NfsStorageDomain.findDomainPath(sdUUID))
  File "/usr/share/vdsm/storage/fileSD.py", line 330, in __init__
    manifest = self.manifestClass(domainPath)
  File "/usr/share/vdsm/storage/fileSD.py", line 168, in __init__
    sd.StorageDomainManifest.__init__(self, sdUUID, domaindir, metadata)
  File "/usr/share/vdsm/storage/sd.py", line 310, in __init__
    self._domainLock = self._makeDomainLock()
  File "/usr/share/vdsm/storage/sd.py", line 439, in _makeDomainLock
    domVersion = self.getVersion()
  File "/usr/share/vdsm/storage/sd.py", line 359, in getVersion
    return self.getMetaParam(DMDK_VERSION)
  File "/usr/share/vdsm/storage/sd.py", line 356, in getMetaParam
    return self._metadata[key]
  File "/usr/share/vdsm/storage/persistentDict.py", line 89, in __getitem__
    return dec(self._dict[key])
  File "/usr/share/vdsm/storage/persistentDict.py", line 201, in __getitem__
    raise KeyError(key)
KeyError: 'VERSION'

The web interface will show:

VDSM command failed: Error in storage domain action: (u'sdUUID=3b1a2f5b-f807-4e09-86fa-a31c9e59305c, spUUID=00000001-0001-0001-0001-00000000023b',)

Failed to attach Storage Domain ISO_DOMAIN to Data Center Default. (User: admin@internal)

Version-Release number of selected component (if applicable):

Current master. engine-setup behavior is way older, not sure about ui/engine/vdsm.

How reproducible:

Always, I think.

Steps to Reproduce:
1. Run engine-setup and choose to create an nfs iso domain.
2. Make engine-setup fail after it has prepared the iso domain.
3. Run engine-setup again with the answer file generated in step 1.

Actual results:

engine-setup finishes successfully but leaves an empty domain; attaching it fails.

Expected results:

I guess that during rollback engine-setup should also try to remove the empty directories it created. This is probably not very easy to do properly with the current transaction rollback and the current filetransaction.
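
One way this could look, as a standalone sketch rather than an actual transaction element of engine-setup: record every directory the setup really creates and, on rollback, remove them deepest-first, but only if they are empty. The class name CreatedDirs and its methods are hypothetical.

```python
import os


class CreatedDirs:
    """Track directories created during setup so rollback can remove them."""

    def __init__(self):
        self._created = []

    def makedirs(self, path):
        """Create path like os.makedirs, recording each new component."""
        missing = []
        current = os.path.abspath(path)
        # Walk up until an existing ancestor is found.
        while not os.path.exists(current):
            missing.append(current)
            parent = os.path.dirname(current)
            if parent == current:
                break
            current = parent
        # Create the missing components from the top down.
        for directory in reversed(missing):
            os.mkdir(directory)
            self._created.append(directory)

    def rollback(self):
        """Remove recorded directories, deepest first, keeping non-empty ones."""
        for directory in reversed(self._created):
            try:
                os.rmdir(directory)
            except OSError:
                # Not empty or already gone: leave it alone.
                pass
        self._created = []
```

Because os.rmdir refuses to delete a non-empty directory, anything that existed before the run or that gained unrelated content stays in place.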

Additional info:

Workarounds:

Between (2.) and (3.) above, remove the iso domain.

Or, after (3.), destroy the iso domain from the web interface and create a new one.

Comment 1 Sandro Bonazzola 2016-05-02 09:48:36 UTC
Moving from 4.0 alpha to 4.0 beta since 4.0 alpha has already been released and the bug is not ON_QA.

Comment 2 Yaniv Lavi 2016-05-23 13:13:22 UTC
oVirt 4.0 beta has been released, moving to RC milestone.

Comment 3 Yaniv Lavi 2016-06-02 08:12:22 UTC
We will no longer allow users to define an ISO domain in the setup, therefore closing this bug.

Comment 4 Yaniv Lavi 2016-06-02 08:12:46 UTC
This change is from 4.0 and on.