Bug 1877466 - Upgrade: Paused tasks should not fail an upgrade - add delete routine
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Satellite Maintain
Version: 6.8.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: 6.9.0
Assignee: Suraj Patil
QA Contact: Tasos Papaioannou
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-09-09 17:03 UTC by Mike McCune
Modified: 2021-12-01 08:40 UTC (History)

Fixed In Version: rubygem-foreman_maintain-0.7.4
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-04-21 14:48:22 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Foreman Issue Tracker 30870 0 Normal Closed Upgrade: Paused tasks should not fail an upgrade - add delete routine 2021-02-11 07:16:51 UTC
Red Hat Product Errata RHBA-2021:1312 0 None None None 2021-04-21 14:48:34 UTC

Description Mike McCune 2020-09-09 17:03:38 UTC
There are cases where paused tasks fail to resume and foreman-maintain cannot proceed with the upgrade. In most of these cases, the user just deletes the task with foreman-rake, but this is a significant barrier to upgrades. We need to automate this and build it directly into foreman-maintain.

There are situations where the task cannot be resumed and should just be deleted, e.g.:

# foreman-maintain upgrade run --target-version=6.7.z -y
...
--------------------------------------------------------------------------------
Check for paused tasks:                                               [FAIL]
There are currently 1 paused tasks in the system
--------------------------------------------------------------------------------
There are multiple steps to proceed:
1) Resume paused tasks
2) Investigate the tasks via UI
(assuming first option)
Resume paused tasks:                                                  [OK]      
Total tasks found paused in error state: 1
Total tasks resumed:                     1
Resumed tasks:                           
 1) Task identifier: d775afce-232b-4310-9ff6-475f64648e79
    Task action:     Publish
    Task errors:     ERROR
--------------------------------------------------------------------------------
Rerunning the check after fix procedure
Check for paused tasks:                                               [FAIL]
There are currently 1 paused tasks in the system
--------------------------------------------------------------------------------
...

Scenario [Checks before upgrading to Satellite 6.7.z] failed.

The following steps ended up in failing state:

  [foreman-tasks-not-paused]
Resolve the failed steps and rerun
the command. In case the failures are false positives,
use --whitelist="foreman-tasks-not-paused"

The steps in warning state itself might not mean there is an error,
but it should be reviewed to ensure the behavior is expected

# 

We should update foreman-maintain to offer 3 choices:

1) Resume paused tasks <RECOMMENDED>
2) Investigate the tasks via UI
3) Delete paused tasks

If (1) fails, foreman-maintain should proceed to (3) and delete the tasks without requiring any user interaction. This is critical: it allows an upgrade run with the -y flag to NOT FAIL when there are paused tasks.
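
The resume-then-delete fallback requested above could be sketched roughly as follows. This is only an illustration of the control flow under stated assumptions, not foreman-maintain's actual code: the Task struct, its resumable flag, and the resume_paused, delete_paused, and fix_paused_tasks helpers are all hypothetical stand-ins and do not use the real ForemanTasks models.

```ruby
# Hypothetical stand-in for a task record; NOT the ForemanTasks::Task model.
# The resumable flag is an assumption used to simulate a task that stays paused.
Task = Struct.new(:id, :state, :resumable)

# Option 1: try to resume every paused task.
# A task that cannot be resumed simply remains in the "paused" state.
def resume_paused(tasks)
  tasks.each do |t|
    t.state = "running" if t.state == "paused" && t.resumable
  end
end

# Option 3: delete whatever is still paused and return how many were removed.
def delete_paused(tasks)
  before = tasks.size
  tasks.reject! { |t| t.state == "paused" }
  before - tasks.size
end

# Non-interactive (-y) flow: resume first, then fall back to delete,
# so that a stuck paused task no longer fails the check.
def fix_paused_tasks(tasks)
  return :ok if tasks.none? { |t| t.state == "paused" }

  resume_paused(tasks)
  return :resumed if tasks.none? { |t| t.state == "paused" }

  deleted = delete_paused(tasks)
  warn "*** WARNING: #{deleted} task(s) deleted"
  :deleted
end

# Example: one task resumes cleanly, the other stays paused and is deleted.
tasks = [
  Task.new("task-a", "paused", true),
  Task.new("task-b", "paused", false),
]
result = fix_paused_tasks(tasks)
```

In this sketch the delete step only runs when the re-check after resuming still finds paused tasks, which mirrors the order requested in this report: resume is the recommended first choice and deletion is the fallback.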

A run with -y and the updated routine above would look like:

--------------------------------------------------------------------------------
Check for paused tasks:                                               [FAIL]
There are currently 1 paused tasks in the system
--------------------------------------------------------------------------------
There are multiple steps to proceed:
1) Resume paused tasks
2) Investigate the tasks via UI
3) Delete paused tasks
(assuming first option)
Resume paused tasks:                                                  [OK]      
Total tasks found paused in error state: 1
Total tasks resumed:                     1
Resumed tasks:                           
 1) Task identifier: d775afce-232b-4310-9ff6-475f64648e79
    Task action:     Publish
    Task errors:     ERROR
--------------------------------------------------------------------------------
Rerunning the check after fix procedure
Check for paused tasks:                                               [FAIL]
There are currently 1 paused tasks in the system
--------------------------------------------------------------------------------
Deleting Paused tasks:

*** WARNING: 1 task deleted
--------------------------------------------------------------------------------
Rerunning the check after fix procedure
Check for old tasks in paused/stopped state:                          [OK]
--------------------------------------------------------------------------------
Check for tasks in planning state:                                    [OK]
--------------------------------------------------------------------------------
Check to verify if any hotfix installed on system: 
/ Checking for presence of hotfix(es). It may take some time to verify.        
...

Comment 1 Suraj Patil 2020-09-18 10:21:52 UTC
Created redmine issue https://projects.theforeman.org/issues/30870 from this bug

Comment 2 Bryan Kearney 2021-01-08 13:41:19 UTC
Moving this bug to POST for triage into Satellite since the upstream issue https://projects.theforeman.org/issues/30870 has been resolved.

Comment 3 Mike McCune 2021-01-14 20:28:09 UTC
This worked perfectly the first time I tried it, looks great!

# satellite-maintain  upgrade run --target-version=6.9.z -y

...
--------------------------------------------------------------------------------
Check for running tasks:                                              [OK]
--------------------------------------------------------------------------------
Check for old tasks in paused/stopped state:                          [FAIL]
Found 4 paused or stopped task(s) older than 30 days
--------------------------------------------------------------------------------
Continue with step [Delete old tasks]? (assuming yes)
Delete tasks:                                                                   
| Deleted old tasks: 4                                                [OK]      
--------------------------------------------------------------------------------
Rerunning the check after fix procedure
Check for old tasks in paused/stopped state:                          [OK]
--------------------------------------------------------------------------------
Check for pending tasks which are safe to delete:                     [OK]
--------------------------------------------------------------------------------
Check for tasks in planning state:                                    [OK]
--------------------------------------------------------------------------------
...

Running Checks after upgrading to Satellite 6.9.z
================================================================================
Clean old Kernel and initramfs files from tftp-boot:                  [OK]
--------------------------------------------------------------------------------
Check number of fact names in database:                               [OK]
--------------------------------------------------------------------------------
Check whether all services are running:                               [OK]
--------------------------------------------------------------------------------
Check whether all services are running using the ping call:           [OK]
--------------------------------------------------------------------------------
Check for paused tasks:                                               [OK]
--------------------------------------------------------------------------------
Check to verify no empty CA cert requests exist:                      [OK]
--------------------------------------------------------------------------------
Check whether system is self-registered or not:                       [OK]
--------------------------------------------------------------------------------
Check if only installed assets are present on the system: 
| Checking for presence of non-original assets...                     [OK]      
--------------------------------------------------------------------------------


--------------------------------------------------------------------------------
Upgrade finished.

Comment 4 Tasos Papaioannou 2021-01-18 18:21:52 UTC
Verified on 6.8.3 upgrading to 6.9.0 snap 1.0.

1.) Manually mark a completed task as paused / error:

# foreman-rake console
Loading production environment (Rails 5.2.1)
> t = ForemanTasks::Task.where(id: ["XXX"]).first
> t.result = "error"
> t.state = "paused"
> t.save(:validate => false)

2.) Upgrade the Satellite:

# yum update rubygem-foreman_maintain

# satellite-maintain upgrade run --target-version 6.9 --whitelist="repositories-validate,repositories-setup"
  [or]
# satellite-maintain upgrade run --target-version 6.9 --whitelist="repositories-validate,repositories-setup" --assumeyes

3.) Verify that the errored task is found, and the user is prompted to resume, delete, or view the task in the UI. Or, if --assumeyes was specified, then the task is deleted automatically.

4.) After the upgrade, repeat step #1 for another task, run a health check, and repeat step #3 to verify.

# satellite-maintain health check --label foreman-tasks-not-paused
  [or]
# satellite-maintain health check --label foreman-tasks-not-paused --assumeyes

Comment 7 errata-xmlrpc 2021-04-21 14:48:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Satellite 6.9 Satellite Maintenance Release), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:1312

