Bug 1313306

Summary: SPM host whose SPM priority was changed to -1 continues to be SPM after a vdsm restart (when all other hosts are also defined with SPM priority -1)
Product: [oVirt] ovirt-engine Reporter: Natalie Gavrielov <ngavrilo>
Component: BLL.Storage Assignee: Liron Aravot <laravot>
Status: CLOSED WONTFIX QA Contact: Aharon Canan <acanan>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 3.6.3.3 CC: amureini, bugs, laravot, tnisan
Target Milestone: --- Flags: sbonazzo: ovirt-4.1-
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-06-21 15:41:12 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Storage RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
logs: engine, vdsm, supervdsm none

Description Natalie Gavrielov 2016-03-01 11:20:17 UTC
Created attachment 1131879 [details]
logs: engine, vdsm, supervdsm

Description of problem:
An SPM host whose SPM priority has been changed to -1 continues to be SPM after a vdsm restart (when the other hosts are also defined with SPM priority -1).

Version-Release number of selected component (if applicable):
rhevm-3.6.3.3-0.1.el6.noarch
vdsm-4.17.23-0.el7ev.noarch

How reproducible:
100%

Steps to Reproduce:
Environment with 3 hosts: two of them defined with SPM priority -1, and the third with SPM priority 2 (the current SPM). All hosts are active.

1. Using a REST client, change the SPM priority of the host with priority 2 to -1.
2. Restart or stop vdsm on that host.
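
For anyone reproducing step 1 without a GUI REST client, here is a minimal sketch of the request that step describes. It only builds the request and does not send it. The engine URL and host ID are hypothetical placeholders, and the `/ovirt-engine/api/hosts/{id}` path and `<spm><priority>` element layout are assumptions based on the oVirt REST API as I recall it; check them against the API documentation for your engine version before use.

```python
import xml.etree.ElementTree as ET


def build_spm_priority_request(engine_url, host_id, priority):
    """Return (method, url, body) for a host update that sets the SPM priority.

    Assumption: the host resource accepts <host><spm><priority>N</priority></spm></host>.
    """
    host = ET.Element("host")
    spm = ET.SubElement(host, "spm")
    ET.SubElement(spm, "priority").text = str(priority)
    body = ET.tostring(host, encoding="unicode")
    url = "{}/ovirt-engine/api/hosts/{}".format(engine_url.rstrip("/"), host_id)
    return "PUT", url, body


if __name__ == "__main__":
    # Hypothetical engine address and host ID, for illustration only.
    method, url, body = build_spm_priority_request(
        "https://engine.example.com", "HOST_ID", -1
    )
    print(method, url)
    print(body)
```

The returned tuple can be passed to any HTTP client together with the engine credentials and a `Content-Type: application/xml` header.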

Actual results:

The former SPM host, now with SPM priority -1, goes back to being SPM after the restart.

engine.log:

2016-02-28 16:36:34,946 ERROR [org.ovirt.vdsm.jsonrpc.client.reactors.Reactor] (SSL Stomp Reactor) [50f30802] Unable to process messages
2016-02-28 16:36:34,947 ERROR [org.ovirt.vdsm.jsonrpc.client.reactors.Reactor] (SSL Stomp Reactor) [50f30802] Unable to process messages
2016-02-28 16:36:34,953 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-73) [29827b34] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VDSM aqua-vds4 command failed: Connection reset by peer
2016-02-28 16:36:34,954 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStatusVDSCommand] (DefaultQuartzScheduler_Worker-73) [29827b34] Command 'SpmStatusVDSCommand(HostName = aqua-vds4, SpmStatusVDSCommandParameters:{runAsync='true', hostId='eadffa5b-5a5b-4fb3-aa7e-843988491061', storagePoolId='00000001-0001-0001-0001-000000000216'})' execution failed: VDSGenericException: VDSNetworkException: Connection reset by peer
2016-02-28 16:36:35,044 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-73) [41b6c75b] IrsBroker::Failed::GetStoragePoolInfoVDS
2016-02-28 16:36:35,044 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-73) [41b6c75b] ERROR, org.ovirt.engine.core.vdsbroker.irsbroker.GetStoragePoolInfoVDSCommand, exception: Connection issues during send request, log id: 61a6d8c9
2016-02-28 16:36:35,044 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-73) [41b6c75b] Exception: org.ovirt.engine.core.vdsbroker.xmlrpc.XmlRpcRunTimeException: Connection issues during send request
2016-02-28 16:36:35,045 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.GetStoragePoolInfoVDSCommand] (DefaultQuartzScheduler_Worker-73) [41b6c75b] Command 'GetStoragePoolInfoVDSCommand( GetStoragePoolInfoVDSCommandParameters:{runAsync='true', storagePoolId='00000001-0001-0001-0001-000000000216', ignoreFailoverLimit='true'})' execution failed: IRSProtocolException: 
2016-02-28 16:36:47,172 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-100) [637ea4ed] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VDSM aqua-vds4 command failed: Not SPM
2016-02-28 16:36:47,173 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (DefaultQuartzScheduler_Worker-100) [637ea4ed] Command 'HSMGetAllTasksStatusesVDSCommand(HostName = aqua-vds4, VdsIdVDSCommandParametersBase:{runAsync='true', hostId='eadffa5b-5a5b-4fb3-aa7e-843988491061'})' execution failed: IRSGenericException: IRSErrorException: IRSNonOperationalException: Not SPM

Expected results:
After the restart, the host whose SPM priority was changed from 2 to -1 should not go back to being SPM.
None of the hosts should be SPM.
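
The expected behavior can be stated as a small selection rule. This is only an illustrative sketch, not the engine's actual SPM election code: the function name and host names are hypothetical, and picking the highest priority among eligible hosts is a simplification.

```python
NEVER_SPM = -1  # priority value meaning "this host must never become SPM"


def pick_spm_candidate(hosts):
    """Pick a host to become SPM, or None if no host is eligible.

    hosts maps a host name to its SPM priority. Hosts with priority -1
    are excluded, even if they were SPM before a vdsm restart.
    """
    eligible = {name: prio for name, prio in hosts.items() if prio != NEVER_SPM}
    if not eligible:
        return None
    # Simplification: choose the eligible host with the highest priority.
    return max(eligible, key=eligible.get)


if __name__ == "__main__":
    # State after step 1 of the reproduction: every host has priority -1.
    print(pick_spm_candidate({"host1": -1, "host2": -1, "host3": -1}))
```

Under this rule, the scenario in this report (all three hosts at -1) yields no SPM at all, which matches the expected results above.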

Additional info:

Comment 1 Liron Aravot 2016-06-21 15:41:12 UTC
1. The scenario is relevant only when there was a host that was SPM and its priority was changed to -1.
2. You will rarely want a data center in which no host can become the SPM.

As the scenario is unlikely to happen, and changing the behavior means that very sensitive code needs to be updated, it seems to me that this bug is a WONTFIX.

Allon/Tal - any opinion on that? Please reopen if relevant.

Comment 2 Tal Nisan 2016-06-26 14:58:57 UTC
Indeed a corner case. If this bug is ever encountered by a customer or the community we can reopen it; for now I'm good with closing it.