Created attachment 761802: logs

Description of problem:
I've encountered this scenario: moving the SPM to maintenance was reported as successful even though the host never received the spmStop request.

2013-06-13 18:53:08,287 INFO [org.ovirt.engine.core.bll.MaintenanceNumberOfVdssCommand] (pool-8-thread-49) [34f15822] Running command: MaintenanceNumberOfVdssCommand internal: false. Entities affected : ID: 3eebf545-8974-4d1b-9710-dacd2f0b642e Type: VDS
2013-06-13 18:53:08,290 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (pool-8-thread-49) [34f15822] START, SetVdsStatusVDSCommand(HostName = nott-vds1, HostId = 3eebf545-8974-4d1b-9710-dacd2f0b642e, status=PreparingForMaintenance, nonOperationalReason=NONE), log id: 1e23f4f4
2013-06-13 18:53:08,295 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (pool-8-thread-49) [34f15822] VDS nott-vds1 is spm and moved from up calling ResetIrs.
2013-06-13 18:53:08,297 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.ResetIrsVDSCommand] (pool-8-thread-49) [34f15822] START, ResetIrsVDSCommand( storagePoolId = 4625ab3e-39de-4cdf-846b-b94c8f584444, ignoreFailoverLimit = false, compatabilityVersion = null, vdsId = 3eebf545-8974-4d1b-9710-dacd2f0b642e, ignoreStopFailed = false), log id: 291d8a98
2013-06-13 18:53:08,300 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (pool-8-thread-49) [34f15822] START, SpmStopVDSCommand(HostName = nott-vds1, HostId = 3eebf545-8974-4d1b-9710-dacd2f0b642e, storagePoolId = 4625ab3e-39de-4cdf-846b-b94c8f584444), log id: 21ab676b
2013-06-13 18:53:08,313 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (pool-8-thread-49) [34f15822] FINISH, SpmStopVDSCommand, log id: 21ab676b
2013-06-13 18:53:08,314 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.ResetIrsVDSCommand] (pool-8-thread-49) [34f15822] FINISH, ResetIrsVDSCommand, log id: 291d8a98

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
On a block pool with 2 hosts:
1. Have one host in maintenance and the other as SPM.
2. Have asynchronous tasks running on the SPM.
3. Activate the host that is in maintenance and, right after, move the SPM to maintenance.

Actual results:
The host moves to maintenance and remains SPM.

Expected results:
The engine should fail spmStop if the host does not receive the request.

Additional info:
logs
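For anyone triaging the excerpt above, here is a minimal sketch (not taken from the attached logs or from the engine code) that pairs START and FINISH lines by their log id and reports the elapsed time as seen by the engine. The function name command_durations and the regular expression are assumptions made for illustration; the two sample lines are copied from the excerpt in this report. It only measures engine-side timing and does not show whether the host actually received the request.

```python
import re
from datetime import datetime

# Two lines copied verbatim from the engine log excerpt in this report.
LOG = """\
2013-06-13 18:53:08,300 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (pool-8-thread-49) [34f15822] START, SpmStopVDSCommand(HostName = nott-vds1, HostId = 3eebf545-8974-4d1b-9710-dacd2f0b642e, storagePoolId = 4625ab3e-39de-4cdf-846b-b94c8f584444), log id: 21ab676b
2013-06-13 18:53:08,313 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (pool-8-thread-49) [34f15822] FINISH, SpmStopVDSCommand, log id: 21ab676b
"""

# Hypothetical pattern for the START/FINISH lines shown in this report only.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) .*?"
    r"\] (?P<phase>START|FINISH), (?P<cmd>\w+).*log id: (?P<id>[0-9a-f]+)\s*$"
)


def command_durations(text):
    """Pair START/FINISH lines by log id; return {(command, log_id): milliseconds}."""
    started, durations = {}, {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # not a START/FINISH line
        ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S,%f")
        key = (m["cmd"], m["id"])
        if m["phase"] == "START":
            started[key] = ts
        elif key in started:
            delta = ts - started.pop(key)
            durations[key] = round(delta.total_seconds() * 1000)
    return durations


if __name__ == "__main__":
    for (cmd, log_id), ms in command_durations(LOG).items():
        print(f"{cmd} (log id {log_id}): {ms} ms")
```

On the two lines above, the engine logs FINISH for SpmStopVDSCommand 13 ms after START, with no error in between.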
Version-Release number of selected component (if applicable):
vdsm-4.10.2-22.0.el6ev.x86_64 (vdsm)
commit ef7833f8c6631c519d69f16d83a2081dfe54a4ce (engine)

How reproducible:
Happened to me once.
Elad, the timestamps in the engine and vdsm logs are not synchronized. Please attach time-synced logs. Thanks.
Closing for now; this does not reproduce downstream.