Bug 975846 - Can't bring hosts up after power outage.
Status: CLOSED DUPLICATE of bug 969640
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.2.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: unspecified
Target Milestone: ---
Target Release: 3.3.0
Assigned To: Liron Aravot
Whiteboard: storage
Keywords: Triaged
Depends On:
Blocks:
Reported: 2013-06-19 08:55 EDT by Ohad Basan
Modified: 2016-02-10 12:56 EST
CC List: 11 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-06-25 03:37:05 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Ohad Basan 2013-06-19 08:55:38 EDT
Description of problem:
After a power outage we reached a state in which the storage pool was not responsive and the hosts were down.
All storage domains were in maintenance.
When trying to activate the host, ConnectStoragePool failed and the host moved to Non Operational.
Comment 3 Liron Aravot 2013-06-23 08:40:26 EDT
Ohad,
The first occurrence of the issue in the logs is already after the bug occurred.
Can you please provide earlier logs, or steps to reproduce? It does not seem to be related to the outage; the issue seems to have been present before it. The first occurrence in the logs of the pool ID starting with 430abeff is:

2013-06-19 10:53:39,995 INFO  [org.ovirt.engine.core.bll.InitVdsOnUpCommand] (QuartzScheduler_Worker-23) [2693e676] Running command: InitVdsOnUpCommand internal: true.
2013-06-19 10:53:40,059 INFO  [org.ovirt.engine.core.bll.storage.ConnectHostToStoragePoolServersCommand] (QuartzScheduler_Worker-23) [2693e676] Running command: ConnectHostToStoragePoolServersCommand internal: true. Entities affected :  ID: 430abeff-bfeb-49c6-9638-7b02b6e71223 Type: StoragePool
2013-06-19 10:53:40,155 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand] (QuartzScheduler_Worker-23) [2693e676] START, ConnectStorageServerVDSCommand(HostName = white-vdsc.ci.lab.tlv.redhat.com, HostId = 39e7a216-75b8-11e2-af2b-00145e8327d8, storagePoolId = 430abeff-bfeb-49c6-9638-7b02b6e71223, storageType = NFS, connectionList = [{ id: fd8ce5ae-562c-42d9-9487-052e35f5cd9c, connection: shual.eng.lab.tlv.redhat.com:/volumes/shual/integration/rhevm-31-integ-iso, iqn: null, vfsType: null, mountOptions: null, nfsVersion: null, nfsRetrans: null, nfsTimeo: null };]), log id: 3bb28a28
2013-06-19 10:53:42,947 ERROR [org.ovirt.engine.core.vdsbroker.VDSCommandBase] (QuartzScheduler_Worker-29) Command GetCapabilitiesVDS execution failed. Exception: VDSNetworkException: java.net.ConnectException: Connection refused
2013-06-19 10:53:43,240 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand] (QuartzScheduler_Worker-23) [2693e676] FINISH, ConnectStorageServerVDSCommand, return: {fd8ce5ae-562c-42d9-9487-052e35f5cd9c=0}, log id: 3bb28a28
2013-06-19 10:53:43,240 INFO  [org.ovirt.engine.core.bll.storage.ConnectHostToStoragePoolServersCommand] (QuartzScheduler_Worker-23) [2693e676] Host white-vdsc.ci.lab.tlv.redhat.com storage connection was succeeded 
2013-06-19 10:53:43,274 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStoragePoolVDSCommand] (pool-3-thread-25) START, ConnectStoragePoolVDSCommand(HostName = white-vdsc.ci.lab.tlv.redhat.com, HostId = 39e7a216-75b8-11e2-af2b-00145e8327d8, storagePoolId = 430abeff-bfeb-49c6-9638-7b02b6e71223, vds_spm_id = 1, masterDomainId = 71baec3a-a456-49a9-99fd-4b297725b08d, masterVersion = 6603), log id: 3859d78b
2013-06-19 10:53:46,178 ERROR [org.ovirt.engine.core.vdsbroker.VDSCommandBase] (QuartzScheduler_Worker-32) Command GetCapabilitiesVDS execution failed. Exception: VDSNetworkException: java.net.ConnectException: Connection refused
2013-06-19 10:53:46,703 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-3-thread-25) Command org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStoragePoolVDSCommand return value 
 StatusOnlyReturnForXmlRpc [mStatus=StatusForXmlRpc [mCode=304, mMessage=Cannot find master domain: 'spUUID=430abeff-bfeb-49c6-9638-7b02b6e71223, msdUUID=71baec3a-a456-49a9-99fd-4b297725b08d']]
2013-06-19 10:53:46,704 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-3-thread-25) HostName = white-vdsc.ci.lab.tlv.redhat.com
2013-06-19 10:53:46,766 ERROR [org.ovirt.engine.core.vdsbroker.VDSCommandBase] (pool-3-thread-25) Command ConnectStoragePoolVDS execution failed. Exception: IRSNoMasterDomainException: IRSGenericException: IRSErrorException: IRSNoMasterDomainException: Cannot find master domain: 'spUUID=430abeff-bfeb-49c6-9638-7b02b6e71223, msdUUID=71baec3a-a456-49a9-99fd-4b297725b08d'
2013-06-19 10:53:46,781 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStoragePoolVDSCommand] (pool-3-thread-25) FINISH, ConnectStoragePoolVDSCommand, log id: 3859d78b
2013-06-19 10:53:46,819 ERROR [org.ovirt.engine.core.bll.InitVdsOnUpCommand] (pool-3-thread-25) Could not connect host white-vdsc.ci.lab.tlv.redhat.com to pool Integration-Stable-FC
2013-06-19 10:53:46,894 INFO  [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (QuartzScheduler_Worker-23) [2693e676] Running command: SetNonOperationalVdsCommand internal: true. Entities affected :  ID: 39e7a216-75b8-11e2-af2b-00145e8327d8 Type: VDS
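
For anyone repeating this check against earlier logs, the search for the first occurrence of the pool ID can be scripted. The following is only a rough sketch: the helper names (first_occurrence, errors_mentioning) are made up for illustration, the line pattern is inferred from the excerpt above rather than from any engine.log specification, and SAMPLE holds three lines copied from that excerpt.

```python
import re

# Timestamp, level and logger at the start of a line, as they appear in the
# excerpt above (assumed layout, not an official engine.log format).
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+\[(?P<logger>[^\]]+)\]"
)


def first_occurrence(lines, needle):
    """Return (index, timestamp, level) of the first line containing needle, or None."""
    for index, line in enumerate(lines):
        if needle in line:
            match = LINE_RE.match(line)
            ts = match.group("ts") if match else None
            level = match.group("level") if match else None
            return index, ts, level
    return None


def errors_mentioning(lines, needle):
    """Return timestamps of ERROR lines that contain needle."""
    found = []
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("level") == "ERROR" and needle in line:
            found.append(match.group("ts"))
    return found


# Three lines taken from the excerpt above.
SAMPLE = [
    "2013-06-19 10:53:39,995 INFO  [org.ovirt.engine.core.bll.InitVdsOnUpCommand] (QuartzScheduler_Worker-23) [2693e676] Running command: InitVdsOnUpCommand internal: true.",
    "2013-06-19 10:53:40,059 INFO  [org.ovirt.engine.core.bll.storage.ConnectHostToStoragePoolServersCommand] (QuartzScheduler_Worker-23) [2693e676] Running command: ConnectHostToStoragePoolServersCommand internal: true. Entities affected :  ID: 430abeff-bfeb-49c6-9638-7b02b6e71223 Type: StoragePool",
    "2013-06-19 10:53:46,766 ERROR [org.ovirt.engine.core.vdsbroker.VDSCommandBase] (pool-3-thread-25) Command ConnectStoragePoolVDS execution failed. Exception: IRSNoMasterDomainException: IRSGenericException: IRSErrorException: IRSNoMasterDomainException: Cannot find master domain: 'spUUID=430abeff-bfeb-49c6-9638-7b02b6e71223, msdUUID=71baec3a-a456-49a9-99fd-4b297725b08d'",
]

if __name__ == "__main__":
    print(first_occurrence(SAMPLE, "430abeff"))
    print(errors_mentioning(SAMPLE, "430abeff"))
```

Running the same two helpers over older, rotated log files (read with open(path) and passed in as a list of lines) would show whether the pool ID and the "Cannot find master domain" error appear before the outage.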
Comment 6 Liron Aravot 2013-06-25 01:59:21 EDT
It appears that the domain statuses were manipulated manually (a manual DB update) after the domain was in LOCKED status, in order to resolve an issue in another pool. Therefore this issue isn't a bug, as the situation was caused by manual manipulation of the DB, probably in order to fix bug 975742. As the query was general, the domain statuses in that pool were changed as well.
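
To illustrate how a general update can touch a second pool, here is a small sketch using an in-memory SQLite database. Everything in it is hypothetical: the table domain_status, its columns, the pool and domain names, and the MAINTENANCE value are invented for the example and are not the engine's schema or the query that was actually run.

```python
import sqlite3


def make_db():
    """Build a toy table with one LOCKED domain in each of two pools (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE domain_status (pool TEXT, domain TEXT, status TEXT)")
    conn.executemany(
        "INSERT INTO domain_status VALUES (?, ?, ?)",
        [("pool_a", "dom_1", "LOCKED"), ("pool_b", "dom_2", "LOCKED")],
    )
    return conn


def unlock(conn, new_status, pool=None):
    """Change LOCKED domains to new_status; with pool=None the update is not scoped."""
    if pool is None:
        # General query: every LOCKED domain in every pool is changed.
        cur = conn.execute(
            "UPDATE domain_status SET status = ? WHERE status = 'LOCKED'",
            (new_status,),
        )
    else:
        # Scoped query: only the named pool is changed.
        cur = conn.execute(
            "UPDATE domain_status SET status = ? WHERE status = 'LOCKED' AND pool = ?",
            (new_status, pool),
        )
    return cur.rowcount


def statuses(conn):
    """Return {domain: status} for inspection."""
    return dict(conn.execute("SELECT domain, status FROM domain_status"))


if __name__ == "__main__":
    general = make_db()
    print(unlock(general, "MAINTENANCE"), statuses(general))
    scoped = make_db()
    print(unlock(scoped, "MAINTENANCE", pool="pool_a"), statuses(scoped))
```

With the scoped form only pool_a's domain changes, while the general form also rewrites the status of the domain in pool_b, which is the kind of side effect described above.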

The issue of the domain being locked is a duplicate of
https://bugzilla.redhat.com/show_bug.cgi?id=969640

Therefore it seems like this one can be closed.
Comment 7 Allon Mureinik 2013-06-25 03:37:05 EDT
Agreed, closing this one.

*** This bug has been marked as a duplicate of bug 969640 ***
