Bug 1420023 - cannot put SPM to maintenance due to uncleared tasks
Summary: cannot put SPM to maintenance due to uncleared tasks
Keywords:
Status: CLOSED DUPLICATE of bug 1408982
Alias: None
Product: vdsm
Classification: oVirt
Component: General
Version: 4.19.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Dan Kenigsberg
QA Contact: Raz Tamir
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-02-07 15:49 UTC by Evgheni Dereveanchin
Modified: 2017-02-07 16:59 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-02-07 16:59:07 UTC
oVirt Team: Storage
Embargoed:


Attachments
SPM sosreport (13.61 MB, application/x-xz)
2017-02-07 16:02 UTC, Evgheni Dereveanchin

Description Evgheni Dereveanchin 2017-02-07 15:49:05 UTC
Description of problem:
After trying out VM leases on 4.1.0.4, I tried to put the SPM host into maintenance, which failed with the error:

Message: Failed to change status of host ovirt-host2 due to a failure to stop the spm

Version-Release number of selected component (if applicable):
vdsm-4.19.4-1.el7.centos
ovirt-engine-4.1.0.4-1.el7.centos

How reproducible:
sometimes

Steps to Reproduce:
1. Perform some SPM-related tasks (adding and removing VM leases, in my case)
2. Try to put the SPM host into maintenance

Actual results:
Failed to change status of host ovirt-host2 due to a failure to stop the spm.

Expected results:
The SPM role moves to another host.

Additional info:
Tasks reported by the SPM (both finished):

# vdsClient -s 0 getAllTasks
62497948-ad87-4907-856c-56d26ccdb8bd :
         verb = create_lease
         code = 0
         state = finished
         tag = spm
         result = 
         message = 1 jobs completed successfully
         id = 62497948-ad87-4907-856c-56d26ccdb8bd
e6dd3ea5-c280-4e47-bd68-e0b14c962579 :
         verb = delete_lease
         code = 0
         state = finished
         tag = spm
         result = 
         message = 1 jobs completed successfully
         id = e6dd3ea5-c280-4e47-bd68-e0b14c962579
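
For anyone checking the same condition, here is a minimal sketch that parses the text output above and lists tasks that are finished but still reported by the SPM (and so presumably not yet cleared). It assumes the output format is exactly as pasted here; `parse_tasks` and `finished_tasks` are hypothetical helper names, not part of vdsm.

```python
import re

# Sample copied from the `vdsClient -s 0 getAllTasks` output in this report.
SAMPLE = """\
62497948-ad87-4907-856c-56d26ccdb8bd :
         verb = create_lease
         code = 0
         state = finished
         tag = spm
         result = 
         message = 1 jobs completed successfully
         id = 62497948-ad87-4907-856c-56d26ccdb8bd
e6dd3ea5-c280-4e47-bd68-e0b14c962579 :
         verb = delete_lease
         code = 0
         state = finished
         tag = spm
         result = 
         message = 1 jobs completed successfully
         id = e6dd3ea5-c280-4e47-bd68-e0b14c962579
"""

# A task header is an unindented token followed by a colon, e.g. "<uuid> :".
HEADER_RE = re.compile(r"^(\S+)\s*:\s*$")


def parse_tasks(text):
    """Parse getAllTasks text output into {task_id: {field: value}}."""
    tasks = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        header = HEADER_RE.match(line)
        if header:
            current = tasks.setdefault(header.group(1), {})
        elif current is not None and "=" in line:
            # Field lines look like "   key = value"; value may be empty.
            key, _, value = line.partition("=")
            current[key.strip()] = value.strip()
    return tasks


def finished_tasks(tasks):
    """Return IDs of tasks whose state is 'finished' (assumed uncleared,
    since the SPM still reports them)."""
    return [tid for tid, fields in tasks.items()
            if fields.get("state") == "finished"]


if __name__ == "__main__":
    for task_id in finished_tasks(parse_tasks(SAMPLE)):
        print(task_id)
```

Run against the output above, this lists both task IDs, which matches the engine's "uncleared tasks" complaint.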

Comment 1 Evgheni Dereveanchin 2017-02-07 15:49:57 UTC
Related engine.log snippet showing the maintenance attempt and a subsequent attempt to manually select another host as SPM:

2017-02-07 10:29:37,920-05 INFO  [org.ovirt.engine.core.bll.MaintenanceNumberOfVdssCommand] (default task-32) [4c637820-832f-4149-a98a-b0607cf57ccc] Running command: MaintenanceNumberOfVdssCommand internal: false. Entities affected :  ID: e9d7d3ca-0ddd-4264-b126-4ec7720990b3 Type: VDSAction group MANIPULATE_HOST with role type ADMIN
2017-02-07 10:29:37,922-05 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (default task-32) [4c637820-832f-4149-a98a-b0607cf57ccc] START, SetVdsStatusVDSCommand(HostName = ovirt-host2, SetVdsStatusVDSCommandParameters:{runAsync='true', hostId='e9d7d3ca-0ddd-4264-b126-4ec7720990b3', status='PreparingForMaintenance', nonOperationalReason='NONE', stopSpmFailureLogged='true', maintenanceReason='null'}), log id: 5d2330ef
2017-02-07 10:29:37,922-05 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (default task-32) [4c637820-832f-4149-a98a-b0607cf57ccc] VDS 'ovirt-host2' is spm and moved from up calling resetIrs.
2017-02-07 10:29:37,923-05 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.ResetIrsVDSCommand] (default task-32) [4c637820-832f-4149-a98a-b0607cf57ccc] START, ResetIrsVDSCommand( ResetIrsVDSCommandParameters:{runAsync='true', storagePoolId='5898859c-00a8-02e2-008f-00000000016d', ignoreFailoverLimit='false', vdsId='e9d7d3ca-0ddd-4264-b126-4ec7720990b3', ignoreStopFailed='false'}), log id: 3cd0c1bc
2017-02-07 10:29:37,925-05 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (default task-32) [4c637820-832f-4149-a98a-b0607cf57ccc] START, SpmStopVDSCommand(HostName = ovirt-host2, SpmStopVDSCommandParameters:{runAsync='true', hostId='e9d7d3ca-0ddd-4264-b126-4ec7720990b3', storagePoolId='5898859c-00a8-02e2-008f-00000000016d'}), log id: 7781430c
2017-02-07 10:29:38,934-05 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (default task-32) [4c637820-832f-4149-a98a-b0607cf57ccc] SpmStopVDSCommand::Not stopping SPM on vds 'ovirt-host2', pool id '5898859c-00a8-02e2-008f-00000000016d' as there are uncleared tasks
2017-02-07 10:29:38,935-05 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (default task-32) [4c637820-832f-4149-a98a-b0607cf57ccc] FINISH, SpmStopVDSCommand, log id: 7781430c
2017-02-07 10:29:38,935-05 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.ResetIrsVDSCommand] (default task-32) [4c637820-832f-4149-a98a-b0607cf57ccc] FINISH, ResetIrsVDSCommand, log id: 3cd0c1bc
2017-02-07 10:29:38,952-05 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-32) [4c637820-832f-4149-a98a-b0607cf57ccc] EVENT_ID: VDS_STATUS_CHANGE_FAILED_DUE_TO_STOP_SPM_FAILURE(27), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Failed to change status of host ovirt-host2 due to a failure to stop the spm.
2017-02-07 10:29:38,952-05 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (default task-32) [4c637820-832f-4149-a98a-b0607cf57ccc] FINISH, SetVdsStatusVDSCommand, log id: 5d2330ef
2017-02-07 10:29:38,952-05 INFO  [org.ovirt.engine.core.bll.MaintenanceNumberOfVdssCommand] (default task-32) [4c637820-832f-4149-a98a-b0607cf57ccc] Lock freed to object 'EngineLock:{exclusiveLocks='null', sharedLocks='[5898859c-00a8-02e2-008f-00000000016d=<POOL, ACTION_TYPE_FAILED_OBJECT_LOCKED>]'}'
2017-02-07 10:31:29,553-05 INFO  [org.ovirt.engine.core.bll.storage.pool.ForceSelectSPMCommand] (default task-1) [77ed8647-17d2-4c3b-8988-17fcc0830dd7] Running command: ForceSelectSPMCommand internal: false. Entities affected :  ID: b87a0113-199d-4c72-a3b8-4989ac9e3e06 Type: VDSAction group MANIPULATE_HOST with role type ADMIN
2017-02-07 10:31:29,556-05 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.SpmStopOnIrsVDSCommand] (default task-1) [77ed8647-17d2-4c3b-8988-17fcc0830dd7] START, SpmStopOnIrsVDSCommand( SpmStopOnIrsVDSCommandParameters:{runAsync='true', storagePoolId='5898859c-00a8-02e2-008f-00000000016d', ignoreFailoverLimit='false'}), log id: 7f29a9bd
2017-02-07 10:31:29,557-05 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.ResetIrsVDSCommand] (default task-1) [77ed8647-17d2-4c3b-8988-17fcc0830dd7] START, ResetIrsVDSCommand( ResetIrsVDSCommandParameters:{runAsync='true', storagePoolId='5898859c-00a8-02e2-008f-00000000016d', ignoreFailoverLimit='false', vdsId='e9d7d3ca-0ddd-4264-b126-4ec7720990b3', ignoreStopFailed='false'}), log id: 6cac4cdb
2017-02-07 10:31:29,561-05 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (default task-1) [77ed8647-17d2-4c3b-8988-17fcc0830dd7] START, SpmStopVDSCommand(HostName = ovirt-host2, SpmStopVDSCommandParameters:{runAsync='true', hostId='e9d7d3ca-0ddd-4264-b126-4ec7720990b3', storagePoolId='5898859c-00a8-02e2-008f-00000000016d'}), log id: 3c031f8b
2017-02-07 10:31:30,571-05 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (default task-1) [77ed8647-17d2-4c3b-8988-17fcc0830dd7] SpmStopVDSCommand::Not stopping SPM on vds 'ovirt-host2', pool id '5898859c-00a8-02e2-008f-00000000016d' as there are uncleared tasks
2017-02-07 10:31:30,571-05 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (default task-1) [77ed8647-17d2-4c3b-8988-17fcc0830dd7] FINISH, SpmStopVDSCommand, log id: 3c031f8b
2017-02-07 10:31:30,571-05 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.ResetIrsVDSCommand] (default task-1) [77ed8647-17d2-4c3b-8988-17fcc0830dd7] FINISH, ResetIrsVDSCommand, log id: 6cac4cdb
2017-02-07 10:31:30,571-05 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.SpmStopOnIrsVDSCommand] (default task-1) [77ed8647-17d2-4c3b-8988-17fcc0830dd7] FINISH, SpmStopOnIrsVDSCommand, log id: 7f29a9bd
2017-02-07 10:31:30,597-05 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-1) [77ed8647-17d2-4c3b-8988-17fcc0830dd7] EVENT_ID: USER_FORCE_SELECTED_SPM_STOP_FAILED(4,096), Correlation ID: 77ed8647-17d2-4c3b-8988-17fcc0830dd7, Job ID: 4d54d6ac-712a-40cd-babd-e94d745fac23, Call Stack: null, Custom Event ID: -1, Message: Failed to force select ovirt-host1 as the SPM due to a failure to stop the current SPM.
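
To locate these refusals in a longer engine.log, a rough sketch along the following lines may help. It assumes the line layout seen in the snippet above (timestamp, level, logger in brackets, thread in parentheses, correlation ID in brackets, message); `LINE_RE` and `refused_spm_stops` are hypothetical names, not engine code.

```python
import re

# Layout assumed from the snippet above:
# <date> <time> <LEVEL> [<logger>] (<thread>) [<correlation id>] <message>
LINE_RE = re.compile(
    r"^(?P<ts>\S+ \S+) (?P<level>\w+)\s+\[(?P<logger>[^\]]+)\] "
    r"\((?P<thread>[^)]+)\) \[(?P<corr>[^\]]+)\] (?P<msg>.*)$"
)

# Message fragment logged by SpmStopVDSCommand in this report.
REFUSED = "as there are uncleared tasks"


def refused_spm_stops(lines):
    """Return (timestamp, correlation_id) for each refused SPM stop."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line.rstrip())
        if match and REFUSED in match.group("msg"):
            hits.append((match.group("ts"), match.group("corr")))
    return hits


# Two lines copied from the snippet above: one refusal, one FINISH.
SAMPLE_LINES = [
    "2017-02-07 10:29:38,934-05 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (default task-32) [4c637820-832f-4149-a98a-b0607cf57ccc] SpmStopVDSCommand::Not stopping SPM on vds 'ovirt-host2', pool id '5898859c-00a8-02e2-008f-00000000016d' as there are uncleared tasks",
    "2017-02-07 10:29:38,935-05 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (default task-32) [4c637820-832f-4149-a98a-b0607cf57ccc] FINISH, SpmStopVDSCommand, log id: 7781430c",
]

if __name__ == "__main__":
    for ts, corr in refused_spm_stops(SAMPLE_LINES):
        print(ts, corr)
```

The returned correlation IDs can then be used to pull the full command flow for each failed attempt.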

Comment 2 Evgheni Dereveanchin 2017-02-07 16:02:06 UTC
Created attachment 1248445
SPM sosreport

Attached the full SPM sosreport in case other logs are needed. This is a hosted-engine (HE) environment upgraded from a fresh 4.0.6 install.

Comment 3 Evgheni Dereveanchin 2017-02-07 16:12:18 UTC
Checked the engine DB; there are no async tasks, and no jobs that aren't auto-cleared:

engine=# select count(*) from async_tasks;
 count 
-------
     0
(1 row)

engine=# select count(*) from job where is_auto_cleared != 't';
 count 
-------
     0
(1 row)

Comment 4 Tal Nisan 2017-02-07 16:59:07 UTC

*** This bug has been marked as a duplicate of bug 1408982 ***

