Bug 966037 - [engine-backend] in a case of missing device, the domain is inaccessible but engine reports it as up
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.2.0
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.4.0
Assignee: Liron Aravot
QA Contact: Elad
URL:
Whiteboard: storage
Depends On:
Blocks: rhev3.4beta 1142926
 
Reported: 2013-05-22 10:38 UTC by Elad
Modified: 2016-02-10 17:38 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-12-10 16:38:19 UTC
oVirt Team: Storage
Target Upstream Version:


Attachments:
logs (1.04 MB, application/x-gzip)
2013-05-22 10:38 UTC, Elad

Description Elad 2013-05-22 10:38:02 UTC
Created attachment 751654
logs

Description of problem:

Although the host cannot perform connectStorageServer, the engine still reports the domain as 'up'.

Version-Release number of selected component (if applicable):

rhevm-3.2.0-10.26.rc.el6ev.noarch
vdsm-4.10.2-19.0.el6ev.x86_64

How reproducible:
100%

Steps to Reproduce: 
With one host and one iSCSI domain:
1. Start extending the domain and, while the extension is in progress, remove from the host the device (PV) that the domain is being extended with:

'multipath -f 1elad1313678616'

2. VDSM fails to extend the domain and then fails in connectStorageServer:


2013-05-22 13:15:01,794 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStoragePoolVDSCommand] (pool-4-thread-49) START, ConnectStoragePoolVDSCommand(HostName = nott-vds1, HostId = 61ada6ee-b58a-11e2-b34e-001a4a169734, storagePoolId = e5ab1ab3-f38e-4aef-9dfa-b4ebcad11ed4, vds_spm_id = 1, masterDomainId = da07317a-eaa1-4cf8-aaae-ac41c4b1fd87, masterVersion = 1), log id: 2e09ca40
2013-05-22 13:15:03,214 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (QuartzScheduler_Worker-77) No string for UNASSIGNED type. Use default Log
2013-05-22 13:15:04,218 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-4-thread-49) Command org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStoragePoolVDSCommand return value StatusOnlyReturnForXmlRpc [mStatus=StatusForXmlRpc [mCode=304, mMessage=Cannot find master domain: 'spUUID=e5ab1ab3-f38e-4aef-9dfa-b4ebcad11ed4, msdUUID=da07317a-eaa1-4cf8-aaae-ac41c4b1fd87']]
2013-05-22 13:15:04,218 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-4-thread-49) HostName = nott-vds1
2013-05-22 13:15:04,218 ERROR [org.ovirt.engine.core.vdsbroker.VDSCommandBase] (pool-4-thread-49) Command ConnectStoragePoolVDS execution failed. Exception: IRSNoMasterDomainException: IRSGenericException: IRSErrorException: IRSNoMasterDomainException: Cannot find master domain: 'spUUID=e5ab1ab3-f38e-4aef-9dfa-b4ebcad11ed4, msdUUID=da07317a-eaa1-4cf8-aaae-ac41c4b1fd87'
2013-05-22 13:15:04,218 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStoragePoolVDSCommand] (pool-4-thread-49) FINISH, ConnectStoragePoolVDSCommand, log id: 2e09ca40
2013-05-22 13:15:04,219 ERROR [org.ovirt.engine.core.bll.InitVdsOnUpCommand] (pool-4-thread-49) Could not connect host nott-vds1 to pool iscsi

3. The pool becomes non-responsive and the host non-operational
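
To find the same failure in a full engine.log, a minimal Python sketch along the following lines may help. It assumes the status-line format shown in the excerpt in step 2 (mCode=..., mMessage=... closed by "]]"). The function name and the regular expressions are illustrative only and are not part of ovirt-engine or VDSM.

```python
import re

# Both patterns are assumptions based on the log excerpt in step 2.
STATUS_RE = re.compile(r"mCode=(\d+), mMessage=(.*?)\]\]")
UUID_RE = re.compile(r"(spUUID|msdUUID)=([0-9a-f-]{36})")


def find_master_domain_errors(lines):
    """Collect status lines that report mCode 304 ("Cannot find master domain")."""
    hits = []
    for line in lines:
        match = STATUS_RE.search(line)
        if match is None or match.group(1) != "304":
            continue
        # Pull the pool and master domain UUIDs out of the message text.
        ids = dict(UUID_RE.findall(match.group(2)))
        hits.append({
            "code": int(match.group(1)),
            "spUUID": ids.get("spUUID"),
            "msdUUID": ids.get("msdUUID"),
        })
    return hits
```

Passing an open file object for the engine log (for example, the engine.log inside the attached archive) yields one entry per matching line. Because matching is per line, it works whether or not the wrapped log lines have been joined.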


Actual results:
The engine reports that the domain is up even though there are no active hosts in the pool and the pool is non-responsive. There is nothing the user can do to remove the damaged domain.

Expected results:
The domain status should change to unknown.

Additional info: logs

Comment 1 Liron Aravot 2013-12-05 09:43:44 UTC
Elad, the attached logs don't match the ones that you quoted.
Please try to reproduce; this shouldn't happen.
If it does, please attach the correct logs.

Comment 2 Elad 2013-12-10 16:38:19 UTC
No reproduction so far, checked on 3.2.5. Closing for now as WORKSFORME, will re-open if necessary.

