Bug 974829

Summary: [engine-backend] spmStop is marked as FINISH even though the host has never got the request and there is a running task
Product: [Retired] oVirt
Reporter: Elad <ebenahar>
Component: ovirt-engine-core
Assignee: Liron Aravot <laravot>
Status: CLOSED WORKSFORME
QA Contact:
Severity: high
Docs Contact:
Priority: unspecified
Version: unspecified
CC: abaron, acathrow, amureini, ebenahar, iheim, jkt
Target Milestone: ---
Target Release: 3.3.4
Hardware: x86_64
OS: Unspecified
Whiteboard: storage
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-10-22 12:52:53 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description Flags
logs none

Description Elad 2013-06-16 09:44:34 UTC
Created attachment 761802
logs

Description of problem:
I encountered this scenario:
Moving the SPM host to maintenance was reported as successful even though the host never received the spmStop request.

2013-06-13 18:53:08,287 INFO  [org.ovirt.engine.core.bll.MaintenanceNumberOfVdssCommand] (pool-8-thread-49) [34f15822] Running command: MaintenanceNumberOfVdssCommand internal: false. Entities affected :  ID: 3eebf545-8974-4d1b-9710-dacd2f0b642e Type: VDS
2013-06-13 18:53:08,290 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (pool-8-thread-49) [34f15822] START, SetVdsStatusVDSCommand(HostName = nott-vds1, HostId = 3eebf545-8974-4d1b-9710-dacd2f0b642e, status=PreparingForMaintenance, nonOperationalReason=NONE), log id: 1e23f4f4
2013-06-13 18:53:08,295 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (pool-8-thread-49) [34f15822] VDS nott-vds1 is spm and moved from up calling ResetIrs.
2013-06-13 18:53:08,297 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.ResetIrsVDSCommand] (pool-8-thread-49) [34f15822] START, ResetIrsVDSCommand( storagePoolId = 4625ab3e-39de-4cdf-846b-b94c8f584444, ignoreFailoverLimit = false, compatabilityVersion = null, vdsId = 3eebf545-8974-4d1b-9710-dacd2f0b642e, ignoreStopFailed = false), log id: 291d8a98
2013-06-13 18:53:08,300 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (pool-8-thread-49) [34f15822] START, SpmStopVDSCommand(HostName = nott-vds1, HostId = 3eebf545-8974-4d1b-9710-dacd2f0b642e, storagePoolId = 4625ab3e-39de-4cdf-846b-b94c8f584444), log id: 21ab676b
2013-06-13 18:53:08,313 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (pool-8-thread-49) [34f15822] FINISH, SpmStopVDSCommand, log id: 21ab676b
2013-06-13 18:53:08,314 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.ResetIrsVDSCommand] (pool-8-thread-49) [34f15822] FINISH, ResetIrsVDSCommand, log id: 291d8a98
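
In the excerpt above, SpmStopVDSCommand goes from START to FINISH within 13 ms. To check such pairs across a full engine.log, something like the following could help. It is only a rough sketch: the regular expression is an assumption derived from the lines quoted in this report, not from the engine's complete log format, and command_durations is a made-up helper name.

```python
import re
from datetime import datetime

# Pattern is an assumption based only on the engine log lines quoted in this
# report; other engine log lines may not match it.
LINE_RE = re.compile(
    r"^\s*(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+\w+\s+"
    r"\[(?P<cls>[\w.]+)\].*?\b(?P<phase>START|FINISH),.*log id: (?P<id>[0-9a-f]+)\s*$"
)


def command_durations(lines):
    """Return {log_id: (command_name, elapsed_ms)} for matched START/FINISH pairs."""
    started = {}
    durations = {}
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # not a START/FINISH line in the assumed format
        ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
        name = m.group("cls").rsplit(".", 1)[-1]
        log_id = m.group("id")
        if m.group("phase") == "START":
            started[log_id] = ts
        elif log_id in started:
            delta = ts - started.pop(log_id)
            durations[log_id] = (name, round(delta.total_seconds() * 1000))
    return durations


# The two SpmStopVDSCommand lines from the excerpt above.
SAMPLE = [
    "2013-06-13 18:53:08,300 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (pool-8-thread-49) [34f15822] START, SpmStopVDSCommand(HostName = nott-vds1, HostId = 3eebf545-8974-4d1b-9710-dacd2f0b642e, storagePoolId = 4625ab3e-39de-4cdf-846b-b94c8f584444), log id: 21ab676b",
    "2013-06-13 18:53:08,313 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (pool-8-thread-49) [34f15822] FINISH, SpmStopVDSCommand, log id: 21ab676b",
]

print(command_durations(SAMPLE))  # -> {'21ab676b': ('SpmStopVDSCommand', 13)}
```

A short elapsed time does not by itself prove that the host never got the request; it only marks the pairs worth comparing against the vdsm log on the host.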



Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce (on a block pool with 2 hosts):
1. Have one host in maintenance and the other acting as SPM.
2. Have asynchronous tasks running on the SPM.
3. Activate the host that is in maintenance and, right after that, move the SPM host to maintenance.


Actual results:
The host moves to maintenance but remains SPM.

Expected results:
The engine should fail spmStop if the host does not receive the request.

Additional info: logs

Comment 1 Elad 2013-06-16 09:48:41 UTC
Version-Release number of selected component (if applicable):
vdsm-4.10.2-22.0.el6ev.x86_64 (vdsm)
commit ef7833f8c6631c519d69f16d83a2081dfe54a4ce (engine)


How reproducible:
happened to me once

Comment 2 Liron Aravot 2013-10-07 08:42:13 UTC
Elad, the engine and vdsm logs aren't time-synced. Please attach synced logs. Thanks.
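
If the clocks cannot be synced before the next reproduction, one log's timestamps could be shifted by a known offset so the two logs can be compared. A minimal sketch follows; shift_timestamp is a hypothetical helper, the timestamp layout is the one seen in the engine log lines in this report, and the 90-second offset in the example is purely illustrative because the actual skew between the two machines is not stated here.

```python
from datetime import datetime, timedelta

# Timestamp layout as seen in the engine log lines quoted in this report.
TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def shift_timestamp(ts_text, offset):
    """Return ts_text moved by offset, keeping millisecond precision."""
    shifted = datetime.strptime(ts_text, TS_FORMAT) + offset
    # strftime emits microseconds (6 digits); drop the last 3 to get milliseconds.
    return shifted.strftime(TS_FORMAT)[:-3]


# Illustrative offset only; the real clock skew is unknown.
print(shift_timestamp("2013-06-13 18:53:08,300", timedelta(seconds=90)))
# -> 2013-06-13 18:54:38,300
```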

Comment 3 Elad 2013-10-22 12:52:53 UTC
Closing for now, does not reproduce on downstream.