Bug 1313306 - SPM host that's defined with -1 spm priority after a vdsm restart continues to be SPM (when all other hosts are also defined with spm priority -1)
Status: CLOSED WONTFIX
Product: ovirt-engine
Classification: oVirt
Component: BLL.Storage
Version: 3.6.3.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ovirt-4.1.0-alpha
Target Release: ---
Assigned To: Liron Aravot
QA Contact: Aharon Canan
Depends On:
Blocks:
Reported: 2016-03-01 06:20 EST by Natalie Gavrielov
Modified: 2016-06-26 10:58 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-06-21 11:41:12 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
amureini: ovirt‑4.1?
ngavrilo: planning_ack?
ngavrilo: devel_ack?
ngavrilo: testing_ack?


Attachments
logs: engine, vdsm, supervdsm (3.29 MB, application/x-gzip)
2016-03-01 06:20 EST, Natalie Gavrielov

Description Natalie Gavrielov 2016-03-01 06:20:17 EST
Created attachment 1131879
logs: engine, vdsm, supervdsm

Description of problem:
A host that is currently SPM and whose SPM priority is then set to -1 becomes SPM again after a vdsm restart (when all other hosts are also defined with SPM priority -1).

Version-Release number of selected component (if applicable):
rhevm-3.6.3.3-0.1.el6.noarch
vdsm-4.17.23-0.el7ev.noarch

How reproducible:
100%

Steps to Reproduce:
Environment with 3 hosts, two of them defined with -1 spm priority, and another with spm priority 2 (SPM). All hosts are active.

1. Using a REST client, change the SPM priority to -1 for the host with priority 2.
2. Restart or stop vdsm on that host.
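
For reference, step 1 could be scripted roughly as follows. This is only a sketch and was not part of the original report: it builds the PUT request without sending it, the engine URL and host ID are placeholders, and the `<spm><priority>` body follows the newer (v4-style) REST API layout, which is an assumption here. The 3.6 API may expect a different element for the SPM priority, so check the API documentation of the engine version in use.

```python
import urllib.request


def build_spm_priority_request(engine_url, host_id, priority):
    """Build (but do not send) a PUT request that sets a host's SPM priority.

    Assumption: the body uses the v4-style <spm><priority> element;
    older API versions may name this element differently.
    """
    body = "<host><spm><priority>{}</priority></spm></host>".format(priority)
    url = "{}/ovirt-engine/api/hosts/{}".format(engine_url.rstrip("/"), host_id)
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/xml"},
    )


# Placeholder engine address and host ID (hypothetical values).
req = build_spm_priority_request("https://engine.example.com", "HOST-ID", -1)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Actually sending the request would additionally need authentication and TLS settings appropriate for the engine, which are left out here.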

Actual results:

The SPM host with spm priority -1 goes back to being SPM after the restart.

engine.log:

2016-02-28 16:36:34,946 ERROR [org.ovirt.vdsm.jsonrpc.client.reactors.Reactor] (SSL Stomp Reactor) [50f30802] Unable to process messages
2016-02-28 16:36:34,947 ERROR [org.ovirt.vdsm.jsonrpc.client.reactors.Reactor] (SSL Stomp Reactor) [50f30802] Unable to process messages
2016-02-28 16:36:34,953 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-73) [29827b34] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VDSM aqua-vds4 command failed: Connection reset by peer
2016-02-28 16:36:34,954 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStatusVDSCommand] (DefaultQuartzScheduler_Worker-73) [29827b34] Command 'SpmStatusVDSCommand(HostName = aqua-vds4, SpmStatusVDSCommandParameters:{runAsync='true', hostId='eadffa5b-5a5b-4fb3-aa7e-843988491061', storagePoolId='00000001-0001-0001-0001-000000000216'})' execution failed: VDSGenericException: VDSNetworkException: Connection reset by peer
2016-02-28 16:36:35,044 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-73) [41b6c75b] IrsBroker::Failed::GetStoragePoolInfoVDS
2016-02-28 16:36:35,044 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-73) [41b6c75b] ERROR, org.ovirt.engine.core.vdsbroker.irsbroker.GetStoragePoolInfoVDSCommand, exception: Connection issues during send request, log id: 61a6d8c9
2016-02-28 16:36:35,044 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-73) [41b6c75b] Exception: org.ovirt.engine.core.vdsbroker.xmlrpc.XmlRpcRunTimeException: Connection issues during send request
2016-02-28 16:36:35,045 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.GetStoragePoolInfoVDSCommand] (DefaultQuartzScheduler_Worker-73) [41b6c75b] Command 'GetStoragePoolInfoVDSCommand( GetStoragePoolInfoVDSCommandParameters:{runAsync='true', storagePoolId='00000001-0001-0001-0001-000000000216', ignoreFailoverLimit='true'})' execution failed: IRSProtocolException: 
2016-02-28 16:36:47,172 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-100) [637ea4ed] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VDSM aqua-vds4 command failed: Not SPM
2016-02-28 16:36:47,173 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (DefaultQuartzScheduler_Worker-100) [637ea4ed] Command 'HSMGetAllTasksStatusesVDSCommand(HostName = aqua-vds4, VdsIdVDSCommandParametersBase:{runAsync='true', hostId='eadffa5b-5a5b-4fb3-aa7e-843988491061'})' execution failed: IRSGenericException: IRSErrorException: IRSNonOperationalException: Not SPM

Expected results:
After the restart, the host that was changed from SPM priority 2 to -1 should not go back to being SPM.
None of the hosts should be SPM.
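
The expected behaviour can be illustrated with a small sketch. This is not the engine's actual selection code; the function name and the host names other than aqua-vds4 are made up for illustration. It only encodes the rule stated above: a host with SPM priority -1 is never a candidate, so when every host is at -1 there is no candidate at all.

```python
NEVER_SPM = -1  # SPM priority value meaning "never select this host as SPM"


def spm_candidates(priorities):
    """Return the hosts that may be selected as SPM.

    `priorities` maps host name -> SPM priority. Hosts set to -1 are
    excluded, regardless of whether they were SPM before.
    """
    return sorted(host for host, prio in priorities.items() if prio != NEVER_SPM)


# State before step 1: one host can still become SPM.
before = {"aqua-vds4": 2, "host-b": -1, "host-c": -1}
# State after step 1: every host is set to -1.
after = {"aqua-vds4": -1, "host-b": -1, "host-c": -1}

print(spm_candidates(before))
print(spm_candidates(after))
```

With the second mapping the candidate list is empty, which corresponds to the expectation that no host ends up as SPM after the restart.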

Additional info:
Comment 1 Liron Aravot 2016-06-21 11:41:12 EDT
1. The scenario is relevant only when a host that was SPM had its priority changed to -1.
2. You will rarely want a data center with no host that can become the SPM.

As the scenario is unlikely to happen, and changing the behaviour means that very sensitive code needs to be updated, it seems to me that this bug is WONTFIX.

Allon/Tal - any opinion on that? Please reopen if relevant.
Comment 2 Tal Nisan 2016-06-26 10:58:57 EDT
Indeed a corner case. If this bug is ever encountered by a customer or the community we can reopen it; for now I'm good with closing it.
