Description of problem:
After trying out VM leases on 4.1.0.4 I tried to put the SPM into maintenance which failed with error:

Message: Failed to change status of host ovirt-host2 due to a failure to stop the spm

Version-Release number of selected component (if applicable):
vdsm-4.19.4-1.el7.centos
ovirt-engine-4.1.0.4-1.el7.centos

How reproducible:
sometimes

Steps to Reproduce:
1. do some SPM related tasks (add-remove VM leases in my case)
2. try to set maintenance on SPM

Actual results:
Failed to change status of host ovirt-host2 due to a failure to stop the spm.

Expected results:
SPM changes to another host

Additional info:
tasks reported by SPM (both finished)

# vdsClient -s 0 getAllTasks
62497948-ad87-4907-856c-56d26ccdb8bd :
	 verb = create_lease
	 code = 0
	 state = finished
	 tag = spm
	 result =
	 message = 1 jobs completed successfully
	 id = 62497948-ad87-4907-856c-56d26ccdb8bd
e6dd3ea5-c280-4e47-bd68-e0b14c962579 :
	 verb = delete_lease
	 code = 0
	 state = finished
	 tag = spm
	 result =
	 message = 1 jobs completed successfully
	 id = e6dd3ea5-c280-4e47-bd68-e0b14c962579
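
For anyone triaging a similar report, here is a minimal sketch of how the getAllTasks output above could be checked for finished SPM tasks that are still listed on the host. It assumes the layout shown in this report (a "<task id> :" header followed by "key = value" lines); the function names parse_tasks and finished_spm_tasks are hypothetical helpers, not part of vdsm or vdsClient.

```python
import re

# Assumption: each task starts with a UUID followed by " :" on its own line,
# as in the vdsClient output quoted in this report.
TASK_HEADER = re.compile(
    r"^([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})\s*:\s*$"
)


def parse_tasks(text):
    """Return {task_id: {field: value}} parsed from getAllTasks-style text."""
    tasks = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        header = TASK_HEADER.match(line)
        if header:
            current = tasks.setdefault(header.group(1), {})
            continue
        if current is not None and "=" in line:
            # Split on the first "=" only, so values may contain "=".
            key, _, value = line.partition("=")
            current[key.strip()] = value.strip()
    return tasks


def finished_spm_tasks(tasks):
    """Return sorted IDs of tasks tagged 'spm' whose state is 'finished'."""
    return sorted(
        task_id
        for task_id, fields in tasks.items()
        if fields.get("state") == "finished" and fields.get("tag") == "spm"
    )


SAMPLE = """\
62497948-ad87-4907-856c-56d26ccdb8bd :
\t verb = create_lease
\t code = 0
\t state = finished
\t tag = spm
\t result =
\t message = 1 jobs completed successfully
\t id = 62497948-ad87-4907-856c-56d26ccdb8bd
e6dd3ea5-c280-4e47-bd68-e0b14c962579 :
\t verb = delete_lease
\t code = 0
\t state = finished
\t tag = spm
\t result =
\t message = 1 jobs completed successfully
\t id = e6dd3ea5-c280-4e47-bd68-e0b14c962579
"""

if __name__ == "__main__":
    for task_id in finished_spm_tasks(parse_tasks(SAMPLE)):
        print(task_id)
```

Run against the output in this report, the sketch lists both lease tasks, i.e. both are finished but still present on the SPM.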
Related engine.log snippet showing maintenance attempt and an attempt to set another host as SPM manually:

2017-02-07 10:29:37,920-05 INFO [org.ovirt.engine.core.bll.MaintenanceNumberOfVdssCommand] (default task-32) [4c637820-832f-4149-a98a-b0607cf57ccc] Running command: MaintenanceNumberOfVdssCommand internal: false. Entities affected : ID: e9d7d3ca-0ddd-4264-b126-4ec7720990b3 Type: VDSAction group MANIPULATE_HOST with role type ADMIN
2017-02-07 10:29:37,922-05 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (default task-32) [4c637820-832f-4149-a98a-b0607cf57ccc] START, SetVdsStatusVDSCommand(HostName = ovirt-host2, SetVdsStatusVDSCommandParameters:{runAsync='true', hostId='e9d7d3ca-0ddd-4264-b126-4ec7720990b3', status='PreparingForMaintenance', nonOperationalReason='NONE', stopSpmFailureLogged='true', maintenanceReason='null'}), log id: 5d2330ef
2017-02-07 10:29:37,922-05 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (default task-32) [4c637820-832f-4149-a98a-b0607cf57ccc] VDS 'ovirt-host2' is spm and moved from up calling resetIrs.
2017-02-07 10:29:37,923-05 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.ResetIrsVDSCommand] (default task-32) [4c637820-832f-4149-a98a-b0607cf57ccc] START, ResetIrsVDSCommand( ResetIrsVDSCommandParameters:{runAsync='true', storagePoolId='5898859c-00a8-02e2-008f-00000000016d', ignoreFailoverLimit='false', vdsId='e9d7d3ca-0ddd-4264-b126-4ec7720990b3', ignoreStopFailed='false'}), log id: 3cd0c1bc
2017-02-07 10:29:37,925-05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (default task-32) [4c637820-832f-4149-a98a-b0607cf57ccc] START, SpmStopVDSCommand(HostName = ovirt-host2, SpmStopVDSCommandParameters:{runAsync='true', hostId='e9d7d3ca-0ddd-4264-b126-4ec7720990b3', storagePoolId='5898859c-00a8-02e2-008f-00000000016d'}), log id: 7781430c
2017-02-07 10:29:38,934-05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (default task-32) [4c637820-832f-4149-a98a-b0607cf57ccc] SpmStopVDSCommand::Not stopping SPM on vds 'ovirt-host2', pool id '5898859c-00a8-02e2-008f-00000000016d' as there are uncleared tasks
2017-02-07 10:29:38,935-05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (default task-32) [4c637820-832f-4149-a98a-b0607cf57ccc] FINISH, SpmStopVDSCommand, log id: 7781430c
2017-02-07 10:29:38,935-05 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.ResetIrsVDSCommand] (default task-32) [4c637820-832f-4149-a98a-b0607cf57ccc] FINISH, ResetIrsVDSCommand, log id: 3cd0c1bc
2017-02-07 10:29:38,952-05 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-32) [4c637820-832f-4149-a98a-b0607cf57ccc] EVENT_ID: VDS_STATUS_CHANGE_FAILED_DUE_TO_STOP_SPM_FAILURE(27), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Failed to change status of host ovirt-host2 due to a failure to stop the spm.
2017-02-07 10:29:38,952-05 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (default task-32) [4c637820-832f-4149-a98a-b0607cf57ccc] FINISH, SetVdsStatusVDSCommand, log id: 5d2330ef
2017-02-07 10:29:38,952-05 INFO [org.ovirt.engine.core.bll.MaintenanceNumberOfVdssCommand] (default task-32) [4c637820-832f-4149-a98a-b0607cf57ccc] Lock freed to object 'EngineLock:{exclusiveLocks='null', sharedLocks='[5898859c-00a8-02e2-008f-00000000016d=<POOL, ACTION_TYPE_FAILED_OBJECT_LOCKED>]'}'
2017-02-07 10:31:29,553-05 INFO [org.ovirt.engine.core.bll.storage.pool.ForceSelectSPMCommand] (default task-1) [77ed8647-17d2-4c3b-8988-17fcc0830dd7] Running command: ForceSelectSPMCommand internal: false. Entities affected : ID: b87a0113-199d-4c72-a3b8-4989ac9e3e06 Type: VDSAction group MANIPULATE_HOST with role type ADMIN
2017-02-07 10:31:29,556-05 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.SpmStopOnIrsVDSCommand] (default task-1) [77ed8647-17d2-4c3b-8988-17fcc0830dd7] START, SpmStopOnIrsVDSCommand( SpmStopOnIrsVDSCommandParameters:{runAsync='true', storagePoolId='5898859c-00a8-02e2-008f-00000000016d', ignoreFailoverLimit='false'}), log id: 7f29a9bd
2017-02-07 10:31:29,557-05 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.ResetIrsVDSCommand] (default task-1) [77ed8647-17d2-4c3b-8988-17fcc0830dd7] START, ResetIrsVDSCommand( ResetIrsVDSCommandParameters:{runAsync='true', storagePoolId='5898859c-00a8-02e2-008f-00000000016d', ignoreFailoverLimit='false', vdsId='e9d7d3ca-0ddd-4264-b126-4ec7720990b3', ignoreStopFailed='false'}), log id: 6cac4cdb
2017-02-07 10:31:29,561-05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (default task-1) [77ed8647-17d2-4c3b-8988-17fcc0830dd7] START, SpmStopVDSCommand(HostName = ovirt-host2, SpmStopVDSCommandParameters:{runAsync='true', hostId='e9d7d3ca-0ddd-4264-b126-4ec7720990b3', storagePoolId='5898859c-00a8-02e2-008f-00000000016d'}), log id: 3c031f8b
2017-02-07 10:31:30,571-05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (default task-1) [77ed8647-17d2-4c3b-8988-17fcc0830dd7] SpmStopVDSCommand::Not stopping SPM on vds 'ovirt-host2', pool id '5898859c-00a8-02e2-008f-00000000016d' as there are uncleared tasks
2017-02-07 10:31:30,571-05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (default task-1) [77ed8647-17d2-4c3b-8988-17fcc0830dd7] FINISH, SpmStopVDSCommand, log id: 3c031f8b
2017-02-07 10:31:30,571-05 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.ResetIrsVDSCommand] (default task-1) [77ed8647-17d2-4c3b-8988-17fcc0830dd7] FINISH, ResetIrsVDSCommand, log id: 6cac4cdb
2017-02-07 10:31:30,571-05 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.SpmStopOnIrsVDSCommand] (default task-1) [77ed8647-17d2-4c3b-8988-17fcc0830dd7] FINISH, SpmStopOnIrsVDSCommand, log id: 7f29a9bd
2017-02-07 10:31:30,597-05 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-1) [77ed8647-17d2-4c3b-8988-17fcc0830dd7] EVENT_ID: USER_FORCE_SELECTED_SPM_STOP_FAILED(4,096), Correlation ID: 77ed8647-17d2-4c3b-8988-17fcc0830dd7, Job ID: 4d54d6ac-712a-40cd-babd-e94d745fac23, Call Stack: null, Custom Event ID: -1, Message: Failed to force select ovirt-host1 as the SPM due to a failure to stop the current SPM.
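
In both attempts the relevant entry is the SpmStopVDSCommand message saying the SPM is not stopped "as there are uncleared tasks". As a rough sketch for picking those entries out of a longer engine.log, the snippet below matches that message and extracts the timestamp, correlation ID, host and pool ID. It assumes one log entry per line in the layout quoted in this report; the function name find_spm_stop_refusals is hypothetical.

```python
import re

# Assumption: entries look like
# "<timestamp> <LEVEL> [<logger>] (<thread>) [<correlation id>] <message>".
REFUSAL = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}[+-]\d{2})\s+\w+\s+"
    r"\[[^\]]+\]\s+\([^)]*\)\s+\[(?P<corr>[^\]]+)\]\s+"
    r"SpmStopVDSCommand::Not stopping SPM on vds '(?P<host>[^']+)', "
    r"pool id '(?P<pool>[^']+)' as there are uncleared tasks"
)


def find_spm_stop_refusals(lines):
    """Return one dict per entry where SpmStop was skipped due to uncleared tasks."""
    hits = []
    for line in lines:
        match = REFUSAL.match(line.strip())
        if match:
            hits.append(match.groupdict())
    return hits


SAMPLE_LINES = [
    "2017-02-07 10:29:38,934-05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] "
    "(default task-32) [4c637820-832f-4149-a98a-b0607cf57ccc] SpmStopVDSCommand::Not stopping SPM "
    "on vds 'ovirt-host2', pool id '5898859c-00a8-02e2-008f-00000000016d' as there are uncleared tasks",
    "2017-02-07 10:29:38,935-05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] "
    "(default task-32) [4c637820-832f-4149-a98a-b0607cf57ccc] FINISH, SpmStopVDSCommand, log id: 7781430c",
]

if __name__ == "__main__":
    for hit in find_spm_stop_refusals(SAMPLE_LINES):
        print(hit["ts"], hit["host"], hit["pool"], hit["corr"])
```

To scan a real file, the open file object can be passed in place of SAMPLE_LINES, since the function only iterates over lines.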
Created attachment 1248445
SPM sosreport

Attached the full SPM sosreport in case other logs are needed. This is an HE environment upgraded from a fresh 4.0.6 install.
Checked the engine DB; there are no async tasks and no jobs that aren't auto-cleared:

engine=# select count(*) from async_tasks;
 count
-------
     0
(1 row)

engine=# select count(*) from job where is_auto_cleared != 't';
 count
-------
     0
(1 row)
*** This bug has been marked as a duplicate of bug 1408982 ***