Bug 975003 - engine: failure to activate a non-master storage domain which is inaccessible will trigger spmStop
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.2.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 3.3.0
Assignee: Nobody's working on this, feel free to take it
QA Contact:
URL:
Whiteboard: storage
Depends On:
Blocks:
 
Reported: 2013-06-17 11:03 UTC by Dafna Ron
Modified: 2016-02-10 17:14 UTC
CC List: 9 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-07-08 07:05:35 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:


Attachments
logs (1.16 MB, application/x-gzip)
2013-06-17 11:03 UTC, Dafna Ron

Description Dafna Ron 2013-06-17 11:03:05 UTC
Created attachment 762001
logs

Description of problem:

I tried activating a storage domain which is inaccessible. After the activation fails, the engine sends spmStop even though the domain is not the master domain and is inaccessible from all hosts.

Version-Release number of selected component (if applicable):

sf18
vdsm-4.10.2-23.0.el6ev.x86_64

How reproducible:

100%

Steps to Reproduce:
1. Create two iSCSI storage domains located on two different storage servers.
2. Put the non-master domain into maintenance.
3. From all hosts, block connectivity to the non-master storage domain.
4. Activate the non-master storage domain.

Actual results:

The engine sends SpmStop even though none of the hosts can see the storage domain.

Expected results:

The engine should not send SpmStop.

Additional info: logs

2013-06-17 13:52:42,074 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-4-thread-46) [2ec39c6d] Error code StorageDomainDoesNotExist and error message IRSGenericException: IRSErrorException: Failed to ActivateStorageDomainVDS, error = Storage domain does not exist: ('38755249-4bb3-4841-bf5b-05f4a521514d',)

2013-06-17 13:52:42,119 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (pool-4-thread-46) [2ec39c6d] SpmStopVDSCommand::Stopping SPM on vds cougar01, pool id 7fd33b43-a9f4-4eb7-a885-e9583a929ceb
2013-06-17 13:52:43,165 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (pool-4-thread-46) [2ec39c6d] FINISH, SpmStopVDSCommand, log id: 465bea95
2013-06-17 13:52:43,165 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (pool-4-thread-46) [2ec39c6d] Irs placed on server 4497d431-7c5e-4924-96e0-3f9cdbf826e5 failed. Proceed Failover
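
To check whether other engine logs show the same sequence, a small script along these lines could help. It is only a sketch: it assumes the bracketed token after the thread name (here `[2ec39c6d]`) identifies one flow, that continuation lines belong to the most recent flow, and that the two message fragments it matches on are stable. The function name is made up for this example.

```python
import re

# Bracketed id that follows the thread name, e.g. "(pool-4-thread-46) [2ec39c6d]".
FLOW_ID = re.compile(r"\)\s+\[([0-9a-f]+)\]")


def flows_with_spm_stop_after_failed_activation(lines):
    """Return flow ids where a failed activation is later followed by SpmStop."""
    failed = set()   # flows that already logged the activation failure
    flagged = []     # flows where SpmStop came after that failure
    current = None   # continuation lines carry no id, so reuse the last one seen
    for line in lines:
        match = FLOW_ID.search(line)
        if match:
            current = match.group(1)
        if current is None:
            continue
        if "Failed to ActivateStorageDomainVDS" in line:
            failed.add(current)
        elif "SpmStopVDSCommand::Stopping SPM" in line and current in failed:
            if current not in flagged:
                flagged.append(current)
    return flagged


sample = [
    "2013-06-17 13:52:42,074 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-4-thread-46) [2ec39c6d] Error code StorageDomainDoesNotExist and error message",
    " IRSGenericException: IRSErrorException: Failed to ActivateStorageDomainVDS, error = Storage domain does not exist: ('38755249-4bb3-4841-bf5b-05f4a521514d',)",
    "2013-06-17 13:52:42,119 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (pool-4-thread-46) [2ec39c6d] SpmStopVDSCommand::Stopping SPM on vds cougar01, pool id 7fd33b43-a9f4-4eb7-a885-e9583a929ceb",
]
print(flows_with_spm_stop_after_failed_activation(sample))
```

Passing the lines of a full engine.log instead of `sample` would list every flow in which SpmStop followed a failed ActivateStorageDomainVDS.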

Comment 1 Liron Aravot 2013-07-07 18:08:05 UTC
As the domain is in MAINTENANCE, the hosts/engine should not be monitoring it, so we don't know before running the activation whether any of the hosts can see it at all. Whether we should skip the failover in this case is debatable IMO.

We might "get some idea" about the activation result by performing additional checks before running the activate VDS command, to improve the chances of predicting the result. That seems to me like an RFE which might be partially covered by other upcoming features.
Allon, what's your take on it?
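
For illustration only, the kind of pre-check discussed above could feed a decision like the one sketched below. This is not engine code: `PreCheck`, `should_fail_over_spm`, and the idea of probing hosts before activation are all hypothetical names and assumptions introduced here, and the rule for master domains is assumed rather than taken from this bug.

```python
from dataclasses import dataclass


@dataclass
class PreCheck:
    """Hypothetical result of probing the domain before activation."""
    hosts_probed: int         # hosts that were asked whether they see the domain
    hosts_seeing_domain: int  # hosts that answered yes


def should_fail_over_spm(is_master_domain: bool, check: PreCheck) -> bool:
    """Decide whether a failed activation should trigger SPM failover."""
    if is_master_domain:
        # Assumption: a failure on the master domain always justifies failover.
        return True
    if check.hosts_probed == 0:
        # No information (the domain was in maintenance and unmonitored):
        # fall back to blaming the SPM host, which is the behaviour reported here.
        return True
    # If no probed host sees the domain, the fault is likely in the domain
    # itself, so moving SPM to another host would not help.
    return check.hosts_seeing_domain > 0
```

With the reproduction steps from this report (non-master domain, blocked on every host), `should_fail_over_spm(False, PreCheck(hosts_probed=3, hosts_seeing_domain=0))` evaluates to `False`, i.e. no SpmStop would be sent; the host count of 3 is made up for the example.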

Comment 2 Ayal Baron 2013-07-08 07:05:35 UTC
In general I agree that this is not very nice, but the scenario is that the user is activating a domain which she believes is now OK, and the activation fails. In this case it is reasonable to assume that the problem is specific to the host and not to the domain. As Liron mentioned, we do not monitor domains in maintenance mode, and solving this use case requires a lot of code for very little gain.

