Created attachment 1131879
logs: engine, vdsm, supervdsm

Description of problem:
An SPM host whose spm priority is set to -1 continues to be SPM after a vdsm restart (when the other hosts are also defined with spm priority -1).

Version-Release number of selected component (if applicable):
rhevm-3.6.3.3-0.1.el6.noarch
vdsm-4.17.23-0.el7ev.noarch

How reproducible:
100%

Steps to Reproduce:
Environment with 3 hosts, two of them defined with -1 spm priority, and another with spm priority 2 (SPM). All hosts are active.
1. Using a REST client, change the spm priority to -1 for the host with priority 2.
2. Restart or stop vdsm on that host.

Actual results:
The SPM host with spm priority -1 goes back to being SPM after the restart.

engine.log:
2016-02-28 16:36:34,946 ERROR [org.ovirt.vdsm.jsonrpc.client.reactors.Reactor] (SSL Stomp Reactor) [50f30802] Unable to process messages
2016-02-28 16:36:34,947 ERROR [org.ovirt.vdsm.jsonrpc.client.reactors.Reactor] (SSL Stomp Reactor) [50f30802] Unable to process messages
2016-02-28 16:36:34,953 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-73) [29827b34] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VDSM aqua-vds4 command failed: Connection reset by peer
2016-02-28 16:36:34,954 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStatusVDSCommand] (DefaultQuartzScheduler_Worker-73) [29827b34] Command 'SpmStatusVDSCommand(HostName = aqua-vds4, SpmStatusVDSCommandParameters:{runAsync='true', hostId='eadffa5b-5a5b-4fb3-aa7e-843988491061', storagePoolId='00000001-0001-0001-0001-000000000216'})' execution failed: VDSGenericException: VDSNetworkException: Connection reset by peer
2016-02-28 16:36:35,044 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-73) [41b6c75b] IrsBroker::Failed::GetStoragePoolInfoVDS
2016-02-28 16:36:35,044 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-73) [41b6c75b] ERROR, org.ovirt.engine.core.vdsbroker.irsbroker.GetStoragePoolInfoVDSCommand, exception: Connection issues during send request, log id: 61a6d8c9
2016-02-28 16:36:35,044 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-73) [41b6c75b] Exception: org.ovirt.engine.core.vdsbroker.xmlrpc.XmlRpcRunTimeException: Connection issues during send request
2016-02-28 16:36:35,045 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.GetStoragePoolInfoVDSCommand] (DefaultQuartzScheduler_Worker-73) [41b6c75b] Command 'GetStoragePoolInfoVDSCommand( GetStoragePoolInfoVDSCommandParameters:{runAsync='true', storagePoolId='00000001-0001-0001-0001-000000000216', ignoreFailoverLimit='true'})' execution failed: IRSProtocolException:
2016-02-28 16:36:47,172 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-100) [637ea4ed] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VDSM aqua-vds4 command failed: Not SPM
2016-02-28 16:36:47,173 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (DefaultQuartzScheduler_Worker-100) [637ea4ed] Command 'HSMGetAllTasksStatusesVDSCommand(HostName = aqua-vds4, VdsIdVDSCommandParametersBase:{runAsync='true', hostId='eadffa5b-5a5b-4fb3-aa7e-843988491061'})' execution failed: IRSGenericException: IRSErrorException: IRSNonOperationalException: Not SPM

Expected results:
The host that was changed from spm priority 2 to -1 should not go back to being SPM after the restart. None of the hosts should be SPM.

Additional info:
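The expected behaviour can be illustrated with a minimal sketch. This is not the engine's selection code; the `Host` class, the `pick_spm_candidate` function and the host names other than aqua-vds4 are hypothetical and exist only to show the rule that a host with spm priority -1 is never eligible.

```python
from dataclasses import dataclass
from typing import List, Optional

NEVER_SPM = -1  # spm priority value meaning "never select this host as SPM"


@dataclass
class Host:
    # Hypothetical model of a host; not an engine class.
    name: str
    spm_priority: int
    is_up: bool = True


def pick_spm_candidate(hosts: List[Host]) -> Optional[Host]:
    """Return the active host with the highest spm priority, or None
    when every active host has priority -1."""
    eligible = [h for h in hosts if h.is_up and h.spm_priority != NEVER_SPM]
    if not eligible:
        return None
    return max(eligible, key=lambda h: h.spm_priority)


# State after step 1 of the reproduction: all three hosts have priority -1.
hosts = [Host("aqua-vds4", -1), Host("host-b", -1), Host("host-c", -1)]
print(pick_spm_candidate(hosts))
```

With all three hosts at -1 the function returns no candidate, which matches the expected result that none of the hosts should be SPM.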
1. The scenario is relevant only when a host that was SPM had its priority changed to -1.
2. You will rarely want a data center in which no host can become the SPM.

As the scenario is unlikely to happen, and fixing it means updating very sensitive code, this bug looks like WONTFIX to me. Allon/Tal - any opinion on that? Please reopen if relevant.
Indeed a corner case. If a customer or the community ever encounters this bug we can reopen it; for now I'm good with closing it.