Description of problem:
Sometimes the master domain cannot be moved to maintenance mode, with this error:
Failed to deactivate storage domain nfs_0 (Data Center golden_env_mixed)

Version-Release number of selected component (if applicable):

How reproducible:
30%

Steps to Reproduce:
1. Add a master domain and wait for everything to be up
2. Try to put it into maintenance
3.

Actual results:
Failed to deactivate storage domain nfs_0 (Data Center golden_env_mixed)

Expected results:
Should be able to put it into maintenance

Additional info:
Looking at the logs, I saw that there are tasks causing the failure:

SpmStopVDSCommand::Not stopping SPM on vds 'host_mixed_4', pool id '457833c9-adf9-4939-ae00-9dc198c50039' as there are uncleared tasks
.....
DeactivateStorageDomainCommand] (org.ovirt.thread.pool-6-thread-5) [253137b9] Aborting execution due to failure to stop SPM

When I checked the tasks table I saw a lot of them (123) with:

 status | action_type
--------+-------------
      2 |        1010

vdsm_task_id = 00000000-0000-0000-0000-000000000000

I have kept an environment in this state, and we see it happening in dev CI as well.
Created attachment 1134397 [details] engine logs
Created attachment 1134398 [details] hosts logs
*** Bug 1315959 has been marked as a duplicate of this bug. ***
Hi Nelly, Please note that it is impossible to put a domain into maintenance while there are running tasks. Action type 1010 and status 2 mean that there are 123 *running* Live Migrate Disks tasks. You (the CI env) have to make sure there are no running tasks before trying to put the domain into maintenance.
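As a rough illustration of the check described in this comment, the sketch below summarizes task rows the way they were read from the tasks table. The only codes it relies on are the ones stated here (status 2 = running, action type 1010 = Live Migrate Disks); the function name `summarize_tasks`, the row layout, and the returned keys are hypothetical and not part of the engine.

```python
from collections import Counter

# Codes taken from the comment above; other values are not covered here.
STATUS_RUNNING = 2
ACTION_LIVE_MIGRATE_DISKS = 1010
# The all-zero vdsm_task_id reported in the bug description.
EMPTY_VDSM_TASK_ID = "00000000-0000-0000-0000-000000000000"


def summarize_tasks(rows):
    """Summarize (status, action_type, vdsm_task_id) tuples.

    Hypothetical helper: the row layout is an assumption for illustration.
    """
    rows = list(rows)
    # Count rows per (status, action_type) pair, like a GROUP BY.
    by_kind = Counter((status, action_type) for status, action_type, _ in rows)
    running = sum(1 for status, _, _ in rows if status == STATUS_RUNNING)
    without_vdsm_id = sum(
        1 for _, _, vdsm_id in rows if vdsm_id == EMPTY_VDSM_TASK_ID
    )
    return {
        "by_kind": dict(by_kind),
        "running": running,
        "without_vdsm_id": without_vdsm_id,
        # Any running task prevents moving the domain to maintenance.
        "blocks_maintenance": running > 0,
    }


# The state reported in this bug: 123 identical rows.
rows = [(STATUS_RUNNING, ACTION_LIVE_MIGRATE_DISKS, EMPTY_VDSM_TASK_ID)] * 123
summary = summarize_tasks(rows)
print(summary)
```

A summary like this only shows that the engine still considers the tasks running; it does not tell whether they are actually progressing or stuck.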
The tasks were stuck. There was no indication of any live migration anywhere except in the DB.
As bug 1312741 is ON_QA and there seems to be no other issue here, setting this one to ON_QA too for QA to verify.
(In reply to Allon Mureinik from comment #6)
> As bug 1312741 is ON_QA and there seems to be no other issue here, setting
> this one to ON_QA too for QA to verify.

Correction - moving to MODIFIED, as there is no 3.6.6 build yet. Once there is one, this bug should be moved to ON_QA and verified against it.
Ala, are the verification steps here similar to the ones for bug 1312741? If so, since bug 1312741 is CLOSED CURRENTRELEASE and was verified, can we move this one to VERIFIED as well?
Elad, the BZs are different. I'd suggest verifying this one too. I'd recommend first running a live migration, then trying to move the domain to maintenance and seeing what happens. Keep in mind that if there are running jobs, the domain cannot be moved to maintenance. So, as long as the migration is running, the user cannot move the domain to maintenance; however, once the operation completes, the user should be able to move the domain to maintenance.
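For a CI environment, the "wait until no tasks are running, then deactivate" flow could be sketched roughly as follows. This is only an illustration: `deactivate_when_idle` and both callables passed to it are hypothetical placeholders, and how the running tasks are counted or how the domain is deactivated is left to the caller.

```python
import time


def deactivate_when_idle(count_running_tasks, deactivate, timeout=600.0,
                         interval=5.0, sleep=time.sleep, clock=time.monotonic):
    """Call deactivate() once count_running_tasks() reports zero.

    Both callables are supplied by the caller (hypothetical hooks).
    Returns True if deactivate() was called, False on timeout.
    """
    deadline = clock() + timeout
    while count_running_tasks() > 0:
        if clock() >= deadline:
            # Tasks never cleared; do not attempt the deactivation.
            return False
        sleep(interval)
    deactivate()
    return True


# Usage with stand-in callables: two polls report running tasks, then none.
remaining = iter([2, 1, 0])
ok = deactivate_when_idle(
    lambda: next(remaining),
    lambda: print("deactivating domain"),
    sleep=lambda seconds: None,  # skip real waiting in this demo
)
print(ok)
```

If the tasks never clear, as in the stuck state reported in this bug, the helper returns False after the timeout instead of waiting forever.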
During live migration, moving the master domain to maintenance is not allowed. Once the live migration tasks are completed, moving the domain to maintenance is allowed and works well.

Verified using:
rhevm-3.6.6-0.1.el6.noarch
vdsm-4.17.27-0.el7ev.noarch