Bug 975003 - engine: failure to activate a non-master storage domain which is inaccessible will trigger spmStop
Status: CLOSED WONTFIX
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.2.0
Hardware: x86_64 Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 3.3.0
Assigned To: Nobody's working on this, feel free to take it
storage
 
Reported: 2013-06-17 07:03 EDT by Dafna Ron
Modified: 2016-02-10 12:14 EST

Doc Type: Bug Fix
Last Closed: 2013-07-08 03:05:35 EDT
Type: Bug
oVirt Team: Storage

Attachments
logs (1.16 MB, application/x-gzip)
2013-06-17 07:03 EDT, Dafna Ron

Description Dafna Ron 2013-06-17 07:03:05 EDT
Created attachment 762001
logs

Description of problem:

I tried activating a storage domain which is inaccessible. After the activation failed, the engine sent spmStop, even though the domain is not a master domain and is inaccessible from all hosts.

Version-Release number of selected component (if applicable):

sf18
vdsm-4.10.2-23.0.el6ev.x86_64

How reproducible:

100%

Steps to Reproduce:
1. Create two iSCSI storage domains located on two different storage servers.
2. Put the non-master domain in maintenance.
3. From all hosts, block connectivity to the non-master storage domain.
4. Activate the non-master storage domain.
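
The report does not say how connectivity was blocked in step 3. One common approach, shown here only as a hedged sketch, is to drop outgoing traffic to the storage server's iSCSI portal with iptables on every host. The helper names and the example address 192.0.2.10 are hypothetical; port 3260 is the standard iSCSI port. The functions only build the command lines and do not execute them.

```python
ISCSI_PORT = 3260  # standard iSCSI target port


def _rule(action, portal_ip, port):
    # action is "-A" to append the DROP rule or "-D" to delete it again.
    return ["iptables", action, "OUTPUT", "-d", portal_ip,
            "-p", "tcp", "--dport", str(port), "-j", "DROP"]


def block_rules(portal_ips, port=ISCSI_PORT):
    """Commands to run on each host to block access to the given portals."""
    return [_rule("-A", ip, port) for ip in portal_ips]


def unblock_rules(portal_ips, port=ISCSI_PORT):
    """Commands that remove the rules created by block_rules()."""
    return [_rule("-D", ip, port) for ip in portal_ips]


if __name__ == "__main__":
    # 192.0.2.10 is a placeholder for the non-master domain's storage server.
    for cmd in block_rules(["192.0.2.10"]):
        print(" ".join(cmd))
```

Running the printed commands as root on every host in the pool would approximate step 3; the matching `unblock_rules()` output restores access afterwards.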

Actual results:

The engine sends SpmStop even though none of the hosts can see the storage domain.

Expected results:

The engine should not send SpmStop.

Additional info: logs

2013-06-17 13:52:42,074 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-4-thread-46) [2ec39c6d] Error code StorageDomainDoesNotExist and error message IRSGenericException: IRSErrorException: Failed to ActivateStorageDomainVDS, error = Storage domain does not exist: ('38755249-4bb3-4841-bf5b-05f4a521514d',)

2013-06-17 13:52:42,119 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (pool-4-thread-46) [2ec39c6d] SpmStopVDSCommand::Stopping SPM on vds cougar01, pool id 7fd33b43-a9f4-4eb7-a885-e9583a929ceb
2013-06-17 13:52:43,165 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (pool-4-thread-46) [2ec39c6d] FINISH, SpmStopVDSCommand, log id: 465bea95
2013-06-17 13:52:43,165 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (pool-4-thread-46) [2ec39c6d] Irs placed on server 4497d431-7c5e-4924-96e0-3f9cdbf826e5 failed. Proceed Failover
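
To spot this sequence in a larger engine log, a small script can pair the failed ActivateStorageDomainVDS error with a later SpmStop under the same correlation ID (2ec39c6d in the excerpt). This is a minimal sketch, assuming each log entry sits on a single line in the layout shown in the excerpt; the function name is hypothetical and not part of oVirt.

```python
import re

# timestamp, level, [logger], (thread), [correlation id], message
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]+)\)\s+"
    r"\[(?P<corr>[^\]]+)\]\s+"
    r"(?P<msg>.*)$"
)


def find_spm_stop_after_failed_activation(lines):
    """Return correlation IDs where SpmStop follows a failed activation."""
    failed = set()   # correlation IDs that logged an activation failure
    hits = []        # correlation IDs that then logged an SpmStop
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip blank or continuation lines
        corr, msg = m.group("corr"), m.group("msg")
        if m.group("level") == "ERROR" and "Failed to ActivateStorageDomainVDS" in msg:
            failed.add(corr)
        elif ("SpmStopVDSCommand::Stopping SPM" in msg
              and corr in failed and corr not in hits):
            hits.append(corr)
    return hits
```

Passing the lines of an engine log file (for example `open(path).read().splitlines()`) returns the flows in which the engine stopped the SPM after an activation failure.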
Comment 1 Liron Aravot 2013-07-07 14:08:05 EDT
As the domain is in MAINTENANCE, neither the hosts nor the engine should be monitoring it. Therefore, before the activation is executed, we don't know whether any of the hosts can see the domain at all. Whether we should skip the failover in this case is debatable, IMO.

We might "get some idea" of the activation result by performing additional checks before running the activate VDS command, which would improve our chances of predicting the result. It seems to me like an RFE, which might be partially covered by other upcoming features.
Allon, what's your take on it?
Comment 2 Ayal Baron 2013-07-08 03:05:35 EDT
In general I agree that this is not very nice, but the scenario is that the user is activating a domain which she believes is now OK, and activation fails. In this case it is reasonable to assume that the problem is specific to the host and not to the domain. As Liron mentioned, we do not monitor domains in maintenance mode, and solving this use case would require a lot of code for very little gain.
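
The trade-off discussed in comments 1 and 2 can be illustrated with a small decision sketch. It is not engine code: every name below is hypothetical, and the per-host visibility map stands in for the extra pre-activation checks that comment 1 describes as a possible RFE. `current_policy` models the behavior reported in this bug (any activation failure is treated as host-specific, so the SPM is failed over), while `precheck_policy` skips the failover when no host can see a non-master domain.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    FAIL_OVER_SPM = "fail_over_spm"
    REPORT_DOMAIN_UNREACHABLE = "report_domain_unreachable"


@dataclass(frozen=True)
class ActivationFailure:
    domain_id: str
    is_master: bool


def current_policy(failure):
    # Reported behavior: assume the problem is on the SPM host and fail over.
    return Action.FAIL_OVER_SPM


def precheck_policy(failure, visible_from):
    """visible_from maps host name -> True if that host can see the domain.

    The map is assumed to come from a hypothetical probe run before
    activation; the engine does not monitor domains in maintenance.
    """
    if failure.is_master:
        return Action.FAIL_OVER_SPM
    if not any(visible_from.values()):
        # No host sees the domain, so another SPM would not help.
        return Action.REPORT_DOMAIN_UNREACHABLE
    return Action.FAIL_OVER_SPM
```

With the domain from this report and a map in which every host reports `False`, `precheck_policy` returns `REPORT_DOMAIN_UNREACHABLE`, whereas `current_policy` always returns `FAIL_OVER_SPM`. The cost noted in comment 2 is hidden in how `visible_from` would be obtained.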
