Created attachment 1519041 [details]
engine and vdsm logs

Description of problem:
An attempt to deactivate the SPM host while it has running tasks fails and the failure is not propagated to the user.

Version-Release number of selected component (if applicable):
ovirt-engine-4.3.0-0.6.alpha2.el7.noarch
vdsm-4.30.4-1.el7ev.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Have a task in state 'finished' on SPM
2. Try to deactivate the SPM host

Actual results:
Host maintenance fails:

2019-01-07 16:57:22,077+02 INFO [org.ovirt.engine.core.bll.MaintenanceNumberOfVdssCommand] (default task-204) [hosts_syncAction_137c8740-e9aa-4b8d] Running command: MaintenanceNumberOfVdssCommand internal: false. Entities affected : ID: f288cfa3-a78f-4d70-91ed-607fb197d47a Type: VDSAction group MANIPULATE_HOST with role type ADMIN
2019-01-07 16:57:22,086+02 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (default task-204) [hosts_syncAction_137c8740-e9aa-4b8d] START, SetVdsStatusVDSCommand(HostName = host_mixed_1, SetVdsStatusVDSCommandParameters:{hostId='f288cfa3-a78f-4d70-91ed-607fb197d47a', status='PreparingForMaintenance', nonOperationalReason='NONE', stopSpmFailureLogged='true', maintenanceReason='null'}), log id: 3b8e3077
2019-01-07 16:57:22,086+02 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (default task-204) [hosts_syncAction_137c8740-e9aa-4b8d] VDS 'host_mixed_1' is spm and moved from up calling resetIrs.
2019-01-07 16:57:22,088+02 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.ResetIrsVDSCommand] (default task-204) [hosts_syncAction_137c8740-e9aa-4b8d] START, ResetIrsVDSCommand( ResetIrsVDSCommandParameters:{storagePoolId='2288b8b4-06e6-44cc-8294-3f6ceec565f5', ignoreFailoverLimit='false', vdsId='f288cfa3-a78f-4d70-91ed-607fb197d47a', ignoreStopFailed='false'}), log id: 6dcea640
2019-01-07 16:57:22,098+02 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (default task-204) [hosts_syncAction_137c8740-e9aa-4b8d] START, SpmStopVDSCommand(HostName = host_mixed_1, SpmStopVDSCommandParameters:{hostId='f288cfa3-a78f-4d70-91ed-607fb197d47a', storagePoolId='2288b8b4-06e6-44cc-8294-3f6ceec565f5'}), log id: 1cb04123
2019-01-07 16:57:22,114+02 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (default task-204) [hosts_syncAction_137c8740-e9aa-4b8d] SpmStopVDSCommand::Not stopping SPM on vds 'host_mixed_1', pool id '2288b8b4-06e6-44cc-8294-3f6ceec565f5' as there are uncleared tasks 'Task 'af6d25f9-85ae-43fa-bbe8-55de984f215f', status 'finished''

But it's not reported to the user.
Via Webadmin, no error appears in the events nor error message.
Via REST API, deactivation response code is OK:

url:/ovirt-engine/api/hosts/f288cfa3-a78f-4d70-91ed-607fb197d47a/deactivate body:<action> <async>false</async> <grace_period> <expiry>10</expiry> </grace_period> </action>
2019-01-07 16:57:21,816 - MainThread - hosts - INFO - Using Correlation-Id: hosts_syncAction_137c8740-e9aa-4b8d
2019-01-07 16:57:22,205 - MainThread - hosts - DEBUG - Cleaning Correlation-Id: hosts_syncAction_137c8740-e9aa-4b8d
2019-01-07 16:57:22,206 - MainThread - hosts - DEBUG - Response code is valid: [200, 201]

Expected results:
In case of a failure to deactivate the SPM due to uncleared tasks, this should be propagated to the user.

Additional info:
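Because neither Webadmin nor the REST response surfaces the failure, one way to confirm it during triage is to scan engine.log for ERROR entries that carry the Correlation-Id of the deactivate request. The following is a minimal sketch, not part of oVirt: the function name `find_errors` and the regular expression are hypothetical, and the line layout is assumed from the engine log excerpts quoted in this report.

```python
import re

# Assumed layout of an engine.log entry, inferred from the excerpts in this report:
# "<date> <time>,<ms><tz> <LEVEL> [<logger>] (<thread>) [<correlation-id>] <message>"
LOG_LINE = re.compile(
    r"^(?P<timestamp>\S+ \S+)\s+(?P<level>[A-Z]+)\s+\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]*)\)\s+\[(?P<correlation_id>[^\]]*)\]\s+(?P<message>.*)$"
)


def find_errors(log_lines, correlation_id):
    """Return (timestamp, logger, message) for ERROR entries with the given id.

    Hypothetical helper for triage; lines that do not fit the assumed
    layout are skipped rather than treated as errors.
    """
    hits = []
    for line in log_lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue
        if match["level"] == "ERROR" and match["correlation_id"] == correlation_id:
            hits.append((match["timestamp"], match["logger"], match["message"]))
    return hits
```

Feeding it the lines of engine.log together with the Correlation-Id printed by the REST client (here `hosts_syncAction_137c8740-e9aa-4b8d`) should return the SpmStopVDSCommand entry about uncleared tasks, even though the HTTP response code was 200.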
Hi Ahmad,

We need a way to reach this state (deactivating the SPM host while it has running tasks) in order to verify this bug.
Currently, the issue does not reproduce and we need your help to trigger it.
Please help.
Hi Avihai,

I reproduced this issue on the development environment by forcing a value change so that the code enters the error branch.
I'll attach screenshots showing the status after the change.
Created attachment 1576299 [details]
screenshot for the change

screenshot for the change
(In reply to Ahmad Khiet from comment #3)
> Created attachment 1576299 [details]
> screenshot for the change
>
> screenshot for the change

Unfortunately, I can't verify it according to your verification on your dev env.
Either you verify this bug - which is fine by me :)

For QE to verify this bug we need a way to hit the same issue (deactivate the SPM host while it has running tasks) and see the fix at the official build 4.3.4.2.
Can you connect to my env 'storage-ge-04.scl.lab.tlv.redhat.com' and try it there or tell me how to do it via a scenario or change in the VDSM code somehow?
(In reply to Avihai from comment #4)
> (In reply to Ahmad Khiet from comment #3)
> > Created attachment 1576299 [details]
> > screenshot for the change
> >
> > screenshot for the change
>
> Unfortunately, I can't verify it according to your verification on your dev
> env.
> Either you verify this bug - which is fine by me :)
>
> For QE to verify this bug we need a way to hit the same issue(deactivate the
> SPM host while it has running tasks) and see the fix at the official build
> 4.3.4.2.
> Can you connect to my env 'storage-ge-04.scl.lab.tlv.redhat.com' and try it
> there or tell me how to do it via a scenario or change in the VDSM code
> somehow?

This bug was reported by QE. I tried to reproduce it in the usual way, but it did not reproduce, so I forced the code to enter the error branch to get this result.
I have tried to reproduce it over the last several days, and before that as well, but I have no idea how to trigger it.

What do you think?
(In reply to Ahmad Khiet from comment #5)
> (In reply to Avihai from comment #4)
> > (In reply to Ahmad Khiet from comment #3)
> > > Created attachment 1576299 [details]
> > > screenshot for the change
> > >
> > > screenshot for the change
> >
> > Unfortunately, I can't verify it according to your verification on your dev
> > env.
> > Either you verify this bug - which is fine by me :)
> >
> > For QE to verify this bug we need a way to hit the same issue(deactivate the
> > SPM host while it has running tasks) and see the fix at the official build
> > 4.3.4.2.
> > Can you connect to my env 'storage-ge-04.scl.lab.tlv.redhat.com' and try it
> > there or tell me how to do it via a scenario or change in the VDSM code
> > somehow?
>
> This bug reported by QE, and I tried to reproduce it as usual, but the bug
> did not re-produce and forced it to enter a branch to get this result.

Yes, it was reported by Elad while running a random test suite, so the steps are unknown.
Also, manual and automation efforts did not yield a reproduction, so verification assurance is low.

> I tried to reproduce within the last several days and before but I have no
> idea how to get it.
>
> what do you think?

As we discussed, verifying this as you did before, but using a 4.3.4.2 DS build instead of the master build, is the closest thing to verification in this case.
Please move it to verified once you're done.
This bugzilla is included in oVirt 4.3.4 release, published on June 11th 2019. Since the problem described in this bug report should be resolved in oVirt 4.3.4 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.