There are cases where paused tasks fail to resume and foreman-maintain cannot proceed with the upgrade. In most of these cases, the user will just delete the task with foreman-rake, but this presents a significant barrier to upgrades. We need to automate this and build it directly into foreman-maintain.

There are situations where the task cannot be resumed and should just be deleted, e.g.:

# foreman-maintain upgrade run --target-version=6.7.z -y
...
--------------------------------------------------------------------------------
Check for paused tasks: [FAIL]
There are currently 1 paused tasks in the system
--------------------------------------------------------------------------------
There are multiple steps to proceed:
1) Resume paused tasks
2) Investigate the tasks via UI
(assuming first option)

Resume paused tasks: [OK]
Total tasks found paused in error state: 1
Total tasks resumed: 1
Resumed tasks:
1) Task identifier: d775afce-232b-4310-9ff6-475f64648e79
Task action: Publish
Task errors: ERROR
--------------------------------------------------------------------------------
Rerunning the check after fix procedure
Check for paused tasks: [FAIL]
There are currently 1 paused tasks in the system
--------------------------------------------------------------------------------
...
Scenario [Checks before upgrading to Satellite 6.7.z] failed.

The following steps ended up in failing state:

[foreman-tasks-not-paused]

Resolve the failed steps and rerun the command.
In case the failures are false positives, use
--whitelist="foreman-tasks-not-paused"

The steps in warning state itself might not mean there is an error,
but it should be reviewed to ensure the behavior is expected
#

We should update foreman-maintain to offer 3 choices:

1) Resume paused tasks <RECOMMENDED>
2) Investigate the tasks via UI
3) Delete paused tasks

and if (1) fails, then proceed to (3) delete and not require any user interaction.
Critically, this will allow an upgrade run with the -y flag to NOT FAIL if there are paused tasks. A run with -y and the above updated routine would look like:

--------------------------------------------------------------------------------
Check for paused tasks: [FAIL]
There are currently 1 paused tasks in the system
--------------------------------------------------------------------------------
There are multiple steps to proceed:
1) Resume paused tasks
2) Investigate the tasks via UI
3) Delete paused tasks
(assuming first option)

Resume paused tasks: [OK]
Total tasks found paused in error state: 1
Total tasks resumed: 1
Resumed tasks:
1) Task identifier: d775afce-232b-4310-9ff6-475f64648e79
Task action: Publish
Task errors: ERROR
--------------------------------------------------------------------------------
Rerunning the check after fix procedure
Check for paused tasks: [FAIL]
There are currently 1 paused tasks in the system
--------------------------------------------------------------------------------
Deleting Paused tasks:
*** WARNING: 1 task deleted
--------------------------------------------------------------------------------
Rerunning the check after fix procedure
Check for old tasks in paused/stopped state: [OK]
--------------------------------------------------------------------------------
Check for tasks in planning state: [OK]
--------------------------------------------------------------------------------
Check to verify if any hotfix installed on system:
/ Checking for presence of hotfix(es). It may take some time to verify.
...
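To make the requested behaviour concrete, here is a minimal sketch of the resume-then-delete fallback in plain Ruby. It is not foreman-maintain code: `Task`, `TaskStore`, `remediate_paused_tasks`, the `resumable` flag and the returned symbols are all hypothetical names invented for illustration, and the real tool works against the ForemanTasks database rather than an in-memory array.

```ruby
# Hypothetical stand-in for a task record; NOT the ForemanTasks model.
Task = Struct.new(:id, :state, :resumable)

# Hypothetical in-memory store standing in for the task database.
class TaskStore
  attr_reader :tasks

  def initialize(tasks)
    @tasks = tasks
  end

  # All tasks currently in the paused state.
  def paused
    @tasks.select { |t| t.state == "paused" }
  end

  # Try to resume every paused task; tasks that cannot resume stay paused.
  def resume_paused
    paused.each { |t| t.state = "running" if t.resumable }
  end

  # Remove every task that is still paused; return how many were removed.
  def delete_paused
    doomed = paused
    @tasks -= doomed
    doomed.size
  end
end

# Option (1) first; if tasks are still paused afterwards, fall through to
# option (3), but only when assume_yes (the -y flag) was given.
def remediate_paused_tasks(store, assume_yes:)
  return :ok if store.paused.empty?

  store.resume_paused
  return :resumed if store.paused.empty?
  return :needs_user_decision unless assume_yes

  deleted = store.delete_paused
  warn "*** WARNING: #{deleted} task(s) deleted"
  :deleted
end

store = TaskStore.new([
  Task.new("d775afce-232b-4310-9ff6-475f64648e79", "paused", false)
])
puts remediate_paused_tasks(store, assume_yes: true)
```

Without `assume_yes`, the sketch stops at `:needs_user_decision` instead of deleting anything, which corresponds to prompting the user with the three choices.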
Created redmine issue https://projects.theforeman.org/issues/30870 from this bug
Moving this bug to POST for triage into Satellite since the upstream issue https://projects.theforeman.org/issues/30870 has been resolved.
This worked perfectly the first time I tried it, looks great!

# satellite-maintain upgrade run --target-version=6.9.z -y
...
--------------------------------------------------------------------------------
Check for running tasks: [OK]
--------------------------------------------------------------------------------
Check for old tasks in paused/stopped state: [FAIL]
Found 4 paused or stopped task(s) older than 30 days
--------------------------------------------------------------------------------
Continue with step [Delete old tasks]? (assuming yes)
Delete tasks:
| Deleted old tasks: 4 [OK]
--------------------------------------------------------------------------------
Rerunning the check after fix procedure
Check for old tasks in paused/stopped state: [OK]
--------------------------------------------------------------------------------
Check for pending tasks which are safe to delete: [OK]
--------------------------------------------------------------------------------
Check for tasks in planning state: [OK]
--------------------------------------------------------------------------------
...
Running Checks after upgrading to Satellite 6.9.z
================================================================================
Clean old Kernel and initramfs files from tftp-boot: [OK]
--------------------------------------------------------------------------------
Check number of fact names in database: [OK]
--------------------------------------------------------------------------------
Check whether all services are running: [OK]
--------------------------------------------------------------------------------
Check whether all services are running using the ping call: [OK]
--------------------------------------------------------------------------------
Check for paused tasks: [OK]
--------------------------------------------------------------------------------
Check to verify no empty CA cert requests exist: [OK]
--------------------------------------------------------------------------------
Check whether system is self-registered or not: [OK]
--------------------------------------------------------------------------------
Check if only installed assets are present on the system:
| Checking for presence of non-original assets... [OK]
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Upgrade finished.
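For readers unfamiliar with the "old tasks in paused/stopped state" check in the run above, the selection it reports can be sketched in plain Ruby as follows. This is only an illustration under stated assumptions: `OldTask`, `old_paused_or_stopped` and the `started_at` field are hypothetical, and which timestamp the real check compares against the 30-day cutoff is not stated in this report.

```ruby
# Hypothetical task record; field names are illustrative, not the ForemanTasks schema.
OldTask = Struct.new(:id, :state, :started_at)

SECONDS_PER_DAY = 24 * 60 * 60

# Select tasks in paused or stopped state that started more than max_age_days ago.
# Assumption: age is measured from started_at.
def old_paused_or_stopped(tasks, now: Time.now, max_age_days: 30)
  cutoff = now - max_age_days * SECONDS_PER_DAY
  tasks.select do |t|
    %w[paused stopped].include?(t.state) && t.started_at < cutoff
  end
end

now = Time.utc(2021, 4, 1)
tasks = [
  OldTask.new(1, "paused",  now - 45 * SECONDS_PER_DAY), # old and paused: selected
  OldTask.new(2, "stopped", now - 31 * SECONDS_PER_DAY), # old and stopped: selected
  OldTask.new(3, "paused",  now - 2 * SECONDS_PER_DAY),  # too recent: kept
  OldTask.new(4, "running", now - 90 * SECONDS_PER_DAY)  # wrong state: kept
]
stale = old_paused_or_stopped(tasks, now: now)
puts "Found #{stale.size} paused or stopped task(s) older than 30 days"
```

Passing `now:` explicitly keeps the example deterministic; a real check would use the current time.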
Verified on 6.8.3 upgrading to 6.9.0 snap 1.0.

1.) Manually mark a completed task as paused / error:

# foreman-rake console
Loading production environment (Rails 5.2.1)
> t = ForemanTasks::Task.where(id: ["XXX"]).first
> t.result = "error"
> t.state = "paused"
> t.save(:validate => false)

2.) Upgrade the Satellite:

# yum update rubygem-foreman_maintain
# satellite-maintain upgrade run --target-version 6.9 --whitelist="repositories-validate,repositories-setup"
[or]
# satellite-maintain upgrade run --target-version 6.9 --whitelist="repositories-validate,repositories-setup" --assumeyes

3.) Verify that the errored task is found, and the user is prompted to resume, delete, or view the task in the UI. Or, if --assumeyes was specified, then the task is deleted automatically.

4.) After upgrade, repeat step #1 for another task, and run a health check, and repeat step #3 to verify.

# satellite-maintain health check --label foreman-tasks-not-paused
[or]
# satellite-maintain health check --label foreman-tasks-not-paused --assumeyes
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Satellite 6.9 Satellite Maintenance Release), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:1312