Bug 836168 - Cannot remove iscsi master storage right after removing the storage pool
Summary: Cannot remove iscsi master storage right after removing the storage pool
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.1.0
Hardware: Unspecified
OS: Unspecified
unspecified
medium
Target Milestone: ---
: 3.1.5
Assignee: Ayal Baron
QA Contact: Haim
URL:
Whiteboard: storage
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2012-06-28 09:31 UTC by Ilanit Stein
Modified: 2016-02-10 17:07 UTC (History)
13 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-10-21 22:03:39 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
vdsm log (around 2012-06-27 16:25) (737.90 KB, application/x-compressed-tar)
2012-06-28 09:31 UTC, Ilanit Stein
no flags Details
rhevm log (around 2012-06-27 16:25) (56.81 KB, application/x-gzip)
2012-06-28 09:33 UTC, Ilanit Stein
no flags Details
storage domain remove events screenshot (182.23 KB, image/png)
2012-06-28 11:42 UTC, Ilanit Stein
no flags Details

Description Ilanit Stein 2012-06-28 09:31:46 UTC
Created attachment 594980 [details]
vdsm log (around  2012-06-27 16:25)

Description of problem:

Using GUI on SI7, remove storage domain fail, because it is still connected to storage pool.


Version-Release number of selected component (if applicable):

Setup: SI7, host: RHEL6.3, vdsm: vdsm-4.9.6-16.0.el6.x86_64.

How reproducible:

Happened once, I didn't try to reproduce. 

Steps to Reproduce:
Via GUI:
1. Move master storage domain to maintenance, and wait for it to turn to "inactive" status.
2. Remove data center and wait for the storage domain marked as unattached 
3. Try to remove the storage domain and fail.

Error from vdsm.log(attached):

Thread-10225::ERROR::2012-06-27 16:25:14,882::task::853::TaskManager.Task::(_setError) Task=`b08e6b68-ecd0-4fa7-9af6-f7903c351dba`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 861, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 38, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 2167, in formatStorageDomain
    raise se.CannotFormatStorageDomainInConnectedPool(sdUUID)
CannotFormatStorageDomainInConnectedPool: Cannot format storage domain in connected pool: ('dea000eb-88d1-4865-a316-28e0afb5dfbd',)

Error from engine.log(attached):
2012-06-27 16:34:49,148 ERROR [org.ovirt.engine.core.vdsbroker.VDSCommandBase] (ajp--0.0.0.0-8009-6) [ba2502] Command FormatS
torageDomainVDS execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to FormatStora
geDomainVDS, error = Cannot format storage domain in connected pool: ('dea000eb-88d1-4865-a316-28e0afb5dfbd',)
2012-06-27 16:34:49,149 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.FormatStorageDomainVDSCommand] (ajp--0.0.0.0-8009-6)
 [ba2502] FINISH, FormatStorageDomainVDSCommand, log id: 19450fdb
2012-06-27 16:34:49,149 ERROR [org.ovirt.engine.core.bll.storage.RemoveStorageDomainCommand] (ajp--0.0.0.0-8009-6) [ba2502] C
ommand org.ovirt.engine.core.bll.storage.RemoveStorageDomainCommand throw Vdc Bll exception. With error message VdcBLLExcepti
on: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to FormatStor
ageDomainVDS, error = Cannot format storage domain in connected pool: ('dea000eb-88d1-4865-a316-28e0afb5dfbd',)
2012-06-27 16:34:49,172 ERROR [org.ovirt.engine.core.bll.storage.RemoveStorageDomainCommand] (ajp--0.0.0.0-8009-6) [ba2502] T
ransaction rolled-back for command: org.ovirt.engine.core.bll.storage.RemoveStorageDomainCommand.

Comment 1 Ilanit Stein 2012-06-28 09:33:10 UTC
Created attachment 594982 [details]
rhevm log (around 2012-06-27 16:25)

Comment 2 Ilanit Stein 2012-06-28 09:35:41 UTC
more info:

The storage domain was removed successfully, when trying to remove it few seconds after the removal failure.

Comment 3 Haim 2012-06-28 09:47:16 UTC
(In reply to comment #2)
> more info:
> 
> The storage domain was removed successfully, when trying to remove it few
> seconds after the removal failure.

which correlates perfectly with my Theory its a race in multipath.

Comment 4 Ilanit Stein 2012-06-28 11:42:02 UTC
Created attachment 594998 [details]
storage domain remove events screenshot

Comment 5 Haim 2012-06-28 11:51:41 UTC
moving component to engine-core and not vdsm as per the following analysis:

on engine-core logs, there is DestroyStoragePool command, but actually, command was never sent to VDS (all other commands are there and I managed to correlate them). 


2012-06-27 16:33:55,959 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (pool-3-thread-48) [4f863a4f] FINISH, SpmStopVDSCommand, log id: 72ffdae5
2012-06-27 16:33:55,984 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.ResetIrsVDSCommand] (pool-3-thread-48) [4f863a4f] FINISH, ResetIrsVDSCommand, log id: 3d6ca278
2012-06-27 16:33:55,985 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.SpmStopOnIrsVDSCommand] (pool-3-thread-48) [4f863a4f] FINISH, SpmStopOnIrsVDSCommand, log id: 7edad762
2012-06-27 16:33:56,015 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStoragePoolVDSCommand] (pool-3-thread-48) [4f863a4f] START, DisconnectStoragePoolVDSCommand(vdsId = b929d0c0-c04c-11e1-ade7-001
a4a169778, storagePoolId = 24f5e17d-2dd0-4139-ae1c-5c663f34d67a, vds_spm_id = 1, masterDomainId = 00000000-0000-0000-0000-000000000000, masterVersion = 0), log id: 4e84b1ec
2012-06-27 16:33:56,714 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStoragePoolVDSCommand] (pool-3-thread-48) [4f863a4f] FINISH, DisconnectStoragePoolVDSCommand, log id: 4e84b1ec
2012-06-27 16:33:56,719 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStorageServerVDSCommand] (pool-3-thread-48) [4f863a4f] START, DisconnectStorageServerVDSCommand(vdsId = b929d0c0-c04c-11e1-ade7
-001a4a169778, storagePoolId = 24f5e17d-2dd0-4139-ae1c-5c663f34d67a, storageType = ISCSI, connectionList = [{ id: 3529ac88-2f1a-4063-8f57-c9199ac158f0, connection: wolf.qa.lab.tlv.redhat.com };]), log id: 681c1
9e2
2012-06-27 16:33:59,292 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStorageServerVDSCommand] (pool-3-thread-48) [4f863a4f] FINISH, DisconnectStorageServerVDSCommand, return: {3529ac88-2f1a-4063-8
f57-c9199ac158f0=0}, log id: 681c19e2
2012-06-27 16:34:03,941 INFO  [org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand] (pool-3-thread-48) [6606f1a5] Running command: RemoveStoragePoolCommand internal: false. Entities affected :  ID: 24f5e
17d-2dd0-4139-ae1c-5c663f34d67a Type: StoragePool
2012-06-27 16:34:04,038 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand] (pool-3-thread-46) [6606f1a5] START, ConnectStorageServerVDSCommand(vdsId = bbac1b64-c04c-11e1-97c0-001a4
a169778, storagePoolId = 24f5e17d-2dd0-4139-ae1c-5c663f34d67a, storageType = ISCSI, connectionList = [{ id: 3529ac88-2f1a-4063-8f57-c9199ac158f0, connection: wolf.qa.lab.tlv.redhat.com };]), log id: 2963a672
2012-06-27 16:34:04,038 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand] (pool-3-thread-45) [6606f1a5] START, ConnectStorageServerVDSCommand(vdsId = b929d0c0-c04c-11e1-ade7-001a4
a169778, storagePoolId = 24f5e17d-2dd0-4139-ae1c-5c663f34d67a, storageType = ISCSI, connectionList = [{ id: 3529ac88-2f1a-4063-8f57-c9199ac158f0, connection: wolf.qa.lab.tlv.redhat.com };]), log id: 53433fc9
2012-06-27 16:34:04,955 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand] (pool-3-thread-46) [6606f1a5] FINISH, ConnectStorageServerVDSCommand, return: {3529ac88-2f1a-4063-8f57-c9
199ac158f0=0}, log id: 2963a672
2012-06-27 16:34:04,958 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStoragePoolVDSCommand] (pool-3-thread-46) [6606f1a5] START, ConnectStoragePoolVDSCommand(vdsId = bbac1b64-c04c-11e1-97c0-001a4a169
778, storagePoolId = 24f5e17d-2dd0-4139-ae1c-5c663f34d67a, vds_spm_id = 2, masterDomainId = dea000eb-88d1-4865-a316-28e0afb5dfbd, masterVersion = 1), log id: 7e5b93f9
2012-06-27 16:34:05,029 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand] (pool-3-thread-45) [6606f1a5] FINISH, ConnectStorageServerVDSCommand, return: {3529ac88-2f1a-4063-8f57-c9
199ac158f0=0}, log id: 53433fc9
2012-06-27 16:34:05,031 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStoragePoolVDSCommand] (pool-3-thread-45) [6606f1a5] START, ConnectStoragePoolVDSCommand(vdsId = b929d0c0-c04c-11e1-ade7-001a4a169
778, storagePoolId = 24f5e17d-2dd0-4139-ae1c-5c663f34d67a, vds_spm_id = 1, masterDomainId = dea000eb-88d1-4865-a316-28e0afb5dfbd, masterVersion = 1), log id: 48f72b68
2012-06-27 16:34:10,729 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStoragePoolVDSCommand] (pool-3-thread-46) [6606f1a5] FINISH, ConnectStoragePoolVDSCommand, log id: 7e5b93f9
2012-06-27 16:34:13,047 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStoragePoolVDSCommand] (pool-3-thread-45) [6606f1a5] FINISH, ConnectStoragePoolVDSCommand, log id: 48f72b68
2012-06-27 16:34:13,251 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.DestroyStoragePoolVDSCommand] (pool-3-thread-48) [6606f1a5] START, DestroyStoragePoolVDSCommand(storagePoolId = 24f5e17d-2dd0-4139-ae1c-5c663f34d67a, ignoreFailoverLimit = false, compatabilityVersion = null), log id: 1f06ffaf
2012-06-27 16:34:13,260 INFO  [org.ovirt.engine.core.bll.storage.SetStoragePoolStatusCommand] (pool-3-thread-48) [3335e75a] Running command: SetStoragePoolStatusCommand internal: true. Entities affected :  ID: 24f5e17d-2dd0-4139-ae1c-5c663f34d67a Type: StoragePool
2012-06-27 16:34:13,311 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (pool-3-thread-48) [3335e75a] hostFromVds::selectedVds - cyan-vdsh.qa.lab.tlv.redhat.com, spmStatus Free, storage pool Migrations_DC1
2012-06-27 16:34:13,326 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (pool-3-thread-48) [3335e75a] starting spm on vds cyan-vdsh.qa.lab.tlv.redhat.com, storage pool Migrations_DC1, prevId -1, LVER 0
2012-06-27 16:34:13,328 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (pool-3-thread-48) [3335e75a] START, SpmStartVDSCommand(vdsId = bbac1b64-c04c-11e1-97c0-001a4a169778, storagePoolId = 24f5e17d-2dd0-4139-ae1c-5c663f34d67a, prevId=-1, prevLVER=0, storagePoolFormatType=V2, recoveryMode=Manual, SCSIFencing=false), log id: 715c67c7

Comment 7 mkublin 2012-10-04 14:48:40 UTC
http://gerrit.ovirt.org/#/c/8360/ - upstream patch

Comment 8 mkublin 2012-10-11 07:50:15 UTC
Pushed upstream

Comment 12 Ayal Baron 2012-10-21 22:03:39 UTC
Apparently this does not reproduce on latest builds.
Closing worksforme.
Reopen if you manage to reproduce.


Note You need to log in before you can comment on or make changes to this bug.