Bug 845955 - ovirt-engine-backend: storage domain remains in locked status in case createStoragePool fails
Status: CLOSED WORKSFORME
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.1.0
Hardware: x86_64 Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.1.5
Assigned To: Liron Aravot
QA Contact: Haim
Whiteboard: storage
Keywords: ZStream
Depends On:
Blocks:
Reported: 2012-08-06 05:35 EDT by Omri Hochman
Modified: 2016-02-10 11:59 EST
CC List: 12 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-11-05 03:09:05 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
engine.log (28.30 KB, application/octet-stream), 2012-08-06 05:35 EDT, Omri Hochman
console.log (6.83 KB, application/octet-stream), 2012-08-06 05:37 EDT, Omri Hochman

Description Omri Hochman 2012-08-06 05:35:51 EDT
Created attachment 602471
engine.log

ovirt-engine-backend [scalability]: the first storage domain added to a Data-Center is stuck in status 'Locked' forever.

Description:
************
I created an iSCSI Data-Center with three hosts in its Cluster. When I tried to create the first SD, built from 100 LUNs, on that DC, the action failed with an internal rhevm error and the Storage domain got stuck in status 'Locked' forever.

Apparently the iscsi.initiator of one of the cluster hosts wasn't configured properly on the storage machine side. When I created the first Storage domain on the Data-Center, I chose a host in the cluster whose iscsi.initiator was configured properly, so the 'AddSANStorageDomainCommand' command passed successfully. The flow then failed with "Storage domain does not exist"; rhevm-engine couldn't handle the failure and rolled back the action, which left the Storage domain stuck in status "Locked".

Note: the error was thrown during the 'CreateStoragePoolVDS' method. In the engine.log excerpt below, 'ConnectStorageServerVDSCommand' returns 0 for both connections just before that.

engine.log:
************
2012-08-05 14:38:06,168 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (ajp-/127.0.0.1:8009-6) [29db3a75] Vds: tigris02.scl.lab.tlv.redhat.com
2012-08-05 14:38:06,176 ERROR [org.ovirt.engine.core.vdsbroker.VDSCommandBase] (ajp-/127.0.0.1:8009-6) [29db3a75] Command CreateStoragePoolVDS execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to CreateStoragePoolVDS, error = Storage domain does not exist: ('d8751fe5-926c-47bb-8ec7-e30e0e1104b7',)
2012-08-05 14:38:06,176 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStoragePoolVDSCommand] (ajp-/127.0.0.1:8009-6) [29db3a75] FINISH, CreateStoragePoolVDSCommand, log id: 220255a0
2012-08-05 14:38:06,176 ERROR [org.ovirt.engine.core.bll.storage.AddStoragePoolWithStoragesCommand] (ajp-/127.0.0.1:8009-6) [29db3a75] Command org.ovirt.engine.core.bll.storage.AddStoragePoolWithStoragesCommand throw Vdc Bll exception. With error message VdcBLLException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to CreateStoragePoolVDS, error = Storage domain does not exist: ('d8751fe5-926c-47bb-8ec7-e30e0e1104b7',)
2012-08-05 14:38:06,186 INFO  [org.ovirt.engine.core.bll.storage.AddStoragePoolWithStoragesCommand] (ajp-/127.0.0.1:8009-6) [29db3a75] Lock freed to object EngineLock [exclusiveLocks= key: 07913226-ead1-49fa-9243-d8db4b402926 value: POOL



ea-448b-aa28-9b07d9aa6127, connection: 10.35.160.7 };{ id: 9aacd663-afc4-4936-9d8d-0e636cddb4e0, connection: 10.35.160.10 };{ id: 87f42b20-5bea-448b-aa28-9b07d9aa6127, connection: 10.35.160.7 };{ id: 9aacd663-afc4-4936-9d8d-0e636cddb4e0, connection: 10.35.160.10 };{ id: 87f42b20-5bea-448b-aa28-9b07d9aa6127, connection: 10.35.160.7 };{ id: 9aacd663-afc4-4936-9d8d-0e636cddb4e0, connection: 10.35.160.10 };{ id: 87f42b20-5bea-448b-aa28-9b07d9aa6127, connection: 10.35.160.7 };{ id: 9aacd663-afc4-4936-9d8d-0e636cddb4e0, connection: 10.35.160.10 };{ id: 87f42b20-5bea-448b-aa28-9b07d9aa6127, connection: 10.35.160.7 };{ id: 9aacd663-afc4-4936-9d8d-0e636cddb4e0, connection: 10.35.160.10 };{ id: 87f42b20-5bea-448b-aa28-9b07d9aa6127, connection: 10.35.160.7 };]), log id: fdd5591
2012-08-05 14:40:53,930 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand] (pool-4-thread-49) [63277c85] FINISH, ConnectStorageServerVDSCommand, return: {9aacd663-afc4-4936-9d8d-0e636cddb4e0=0, 87f42b20-5bea-448b-aa28-9b07d9aa6127=0}, log id: fdd5591
2012-08-05 14:40:53,942 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStoragePoolVDSCommand] (pool-4-thread-49) [63277c85] START, CreateStoragePoolVDSCommand(vdsId = 9178cfb8-def1-11e1-9793-0be3898beab9, storagePoolId=07913226-ead1-49fa-9243-d8db4b402926, storageType=ISCSI, storagePoolName=xtreamio_dc, masterDomainId=d8751fe5-926c-47bb-8ec7-e30e0e1104b7, domainsIdList=[d8751fe5-926c-47bb-8ec7-e30e0e1104b7], masterVersion=2), log id: 2ae85fc2
2012-08-05 14:40:55,958 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-4-thread-49) [63277c85] Failed in CreateStoragePoolVDS method
2012-08-05 14:40:55,959 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-4-thread-49) [63277c85] Error code StorageDomainDoesNotExist and error message VDSGenericException: VDSErrorException: Failed to CreateStoragePoolVDS, error = Storage domain does not exist: ('d8751fe5-926c-47bb-8ec7-e30e0e1104b7',)
2012-08-05 14:40:55,959 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-4-thread-49) [63277c85] Command org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStoragePoolVDSCommand return value 
 Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.StatusOnlyReturnForXmlRpc
mStatus                       Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.StatusForXmlRpc
mCode                         358
mMessage                      Storage domain does not exist: ('d8751fe5-926c-47bb-8ec7-e30e0e1104b7',)



2012-08-05 14:40:55,959 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-4-thread-49) [63277c85] Vds: tigris02.scl.lab.tlv.redhat.com
2012-08-05 14:40:55,959 ERROR [org.ovirt.engine.core.vdsbroker.VDSCommandBase] (pool-4-thread-49) [63277c85] Command CreateStoragePoolVDS execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to CreateStoragePoolVDS, error = Storage domain does not exist: ('d8751fe5-926c-47bb-8ec7-e30e0e1104b7',)
2012-08-05 14:40:55,960 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStoragePoolVDSCommand] (pool-4-thread-49) [63277c85] FINISH, CreateStoragePoolVDSCommand, log id: 2ae85fc2
2012-08-05 14:40:55,960 ERROR [org.ovirt.engine.core.bll.storage.AddStoragePoolWithStoragesCommand] (pool-4-thread-49) [63277c85] Command org.ovirt.engine.core.bll.storage.AddStoragePoolWithStoragesCommand throw Vdc Bll exception. With error message VdcBLLException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to CreateStoragePoolVDS, error = Storage domain does not exist: ('d8751fe5-926c-47bb-8ec7-e30e0e1104b7',)
2012-08-05 14:40:55,964 INFO  [org.ovirt.engine.core.bll.storage.AddStoragePoolWithStoragesCommand] (pool-4-thread-49) [63277c85] Lock freed to object EngineLock [exclusiveLocks= key: 07913226-ead1-49fa-9243-d8db4b402926 value: POOL
, sharedLocks= ]
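
For triaging an excerpt like the one above, a small filter that groups ERROR entries by the bracketed ID (for example [63277c85], which appears to tag all entries of one flow) can make the failing step easier to spot. The following is only a sketch: the class name EngineLogTriage and the method errorsByFlow are hypothetical, the line layout is inferred from this excerpt rather than from any documented log format, and it assumes each entry sits on one physical line, so hard-wrapped excerpts need to be rejoined first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of oVirt.
class EngineLogTriage {
    // Layout inferred from the excerpt: timestamp, level, [logger], (thread), [flow id], message.
    private static final Pattern ENTRY = Pattern.compile(
        "^(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3}) (\\w+)\\s+"
        + "\\[([^\\]]+)\\] \\(([^)]+)\\) \\[([0-9a-f]+)\\] (.*)$");

    /** Returns the ERROR messages grouped by flow id, in order of first appearance. */
    static Map<String, List<String>> errorsByFlow(List<String> lines) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = ENTRY.matcher(line);
            // Continuation lines (for example "mCode 358") do not match and are skipped.
            if (m.matches() && "ERROR".equals(m.group(2))) {
                result.computeIfAbsent(m.group(5), k -> new ArrayList<>()).add(m.group(6));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "2012-08-05 14:40:55,958 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] "
                + "(pool-4-thread-49) [63277c85] Failed in CreateStoragePoolVDS method",
            "2012-08-05 14:40:55,959 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] "
                + "(pool-4-thread-49) [63277c85] Vds: tigris02.scl.lab.tlv.redhat.com");
        errorsByFlow(sample).forEach((flow, msgs) -> System.out.println(flow + " -> " + msgs));
    }
}
```

Run against the two flows in this excerpt (29db3a75 and 63277c85), such a filter would list the CreateStoragePoolVDS failures under each ID separately.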
Comment 1 Omri Hochman 2012-08-06 05:37:03 EDT
Created attachment 602472
console.log

Attaching console.log; not sure whether the console log can help.
Comment 2 Ayal Baron 2012-08-08 08:24:24 EDT
Does this happen with 10 LUNs as well? 50?
Comment 3 Haim 2012-08-08 12:35:57 EDT
(In reply to comment #2)
> Does this happen with 10 LUNs as well? 50?

It's not related to scale; we hit it in a simple scenario such as createStoragePool with one storage domain on NFS.
It's a pure engine issue.
Comment 4 Maor 2012-08-26 19:04:49 EDT
All domain statuses are changed to Locked at the beginning of the execution.

We can set the storage domain(s) to INACTIVE on failure.
This can be done in AddStoragePoolWithStoragesCommand@executeCommand, right before throwing the exception (line 107).
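
The idea in this comment (lock up front, reset the status before rethrowing) can be sketched as follows. This is a minimal illustration and not the engine code: DomainStatus, Domain, AddPoolSketch, and PoolCreator are hypothetical stand-ins, the status applied on failure is passed in as a parameter (the comment proposes INACTIVE), and moving the domains to ACTIVE on success is an assumption made only to complete the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; these are not the oVirt engine classes.
enum DomainStatus { UNATTACHED, LOCKED, ACTIVE, INACTIVE }

class Domain {
    final String id;
    DomainStatus status;

    Domain(String id, DomainStatus status) {
        this.id = id;
        this.status = status;
    }
}

class AddPoolSketch {
    /** Hypothetical hook standing in for the pool-creation call that may throw. */
    interface PoolCreator {
        void createPool(List<Domain> domains);
    }

    static void execute(List<Domain> domains, PoolCreator creator, DomainStatus onFailure) {
        // All domains are locked at the beginning of the execution.
        for (Domain d : domains) {
            d.status = DomainStatus.LOCKED;
        }
        try {
            creator.createPool(domains);
        } catch (RuntimeException e) {
            // Compensate right before rethrowing so that no domain stays LOCKED.
            for (Domain d : domains) {
                d.status = onFailure;
            }
            throw e;
        }
        // Assumption for this sketch: success leaves the domains ACTIVE.
        for (Domain d : domains) {
            d.status = DomainStatus.ACTIVE;
        }
    }

    public static void main(String[] args) {
        List<Domain> domains = new ArrayList<>();
        domains.add(new Domain("d8751fe5", DomainStatus.UNATTACHED));
        try {
            execute(domains, ds -> {
                throw new IllegalStateException("Storage domain does not exist");
            }, DomainStatus.INACTIVE);
        } catch (IllegalStateException e) {
            System.out.println("after failure: " + domains.get(0).status);
        }
    }
}
```

Keeping the failure status as a parameter leaves open which state the domain should end up in, which is the point the next reply disputes.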
Comment 5 Ayal Baron 2012-08-27 19:16:40 EDT
(In reply to comment #4)
> All domains statuses are changed to lock status at the beginning of the
> execute.
> 
> We can set the storage domain/domains as INACTIVE on failure.
> this can be done at the AddStoragePoolWithStoragesCommand@executeCommand
> right before throwing an exception (line 107).

No. IIUC, the domain is not attached to the pool at all, hence it should be in the same state as any domain that you detach from a pool without deleting it.
Comment 6 Liron Aravot 2012-10-15 04:44:54 EDT
http://gerrit.ovirt.org/#/c/8536/
Comment 7 Liron Aravot 2012-10-17 15:17:30 EDT
This bug doesn't reproduce; verified on upstream/downstream with the latest commit 478e3df87269b673eaa0c071e437e0f285ec1fa7.

The submitted patch will prevent the domain from getting stuck in Locked status when activating it fails while executing the CreateStoragePoolWithStorages command, which may be the cause of the issue Haim described in a previous comment.
Comment 8 Ayal Baron 2012-10-18 10:38:33 EDT
Haim, this doesn't reproduce for us, can you verify?
Thanks.
Comment 9 Haim 2012-10-21 07:52:03 EDT
(In reply to comment #8)
> Haim, this doesn't reproduce for us, can you verify?
> Thanks.

No. If you think the patch resolves the issue, move the bug to MODIFIED >> ON_QA and we will verify it then.
Comment 10 Liron Aravot 2012-10-28 06:21:02 EDT
Omri, please try to reproduce. I can't get it to reproduce when there is a failure in the createStoragePool method.
Comment 11 Daniel Paikov 2012-11-01 05:44:36 EDT
(In reply to comment #10)
> Omri, please try to reproduce. can't get it to reproduce when having a
> failure in createStoragePool method.

Doesn't reproduce on current upstream.
Comment 12 Ayal Baron 2012-11-05 03:09:05 EST
According to comment 11 it works.
Comment 13 Haim 2012-11-11 02:41:06 EST
(In reply to comment #12)
> According to comment 11 it works.

I don't trust this call; let's hope we won't get this from the field.