Bug 1985165

Summary: Pulp2 to Pulp3 Content switchover failed in the satellite upgrade
Product: Red Hat Satellite Reporter: Devendra Singh <desingh>
Component: Satellite Maintain Assignee: Justin Sherrill <jsherril>
Status: CLOSED ERRATA QA Contact: Devendra Singh <desingh>
Severity: high Docs Contact:
Priority: unspecified    
Version: 6.10.0 CC: apatel, aupadhye, gtalreja, jsherril, kgaikwad, osousa
Target Milestone: 6.10.0 Keywords: AutomationBlocker, Regression, Triaged, UpgradeBlocker
Target Release: Unused   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: rubygem-foreman_maintain-0.8.9 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1998223 Environment:
Last Closed: 2021-11-16 13:48:19 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Devendra Singh 2021-07-23 03:46:23 UTC
Description of problem: Pulp2 to Pulp3 Content switchover failed in the satellite upgrade


Version-Release number of selected component (if applicable):
6.10 Snap10

How reproducible:
always

Steps to Reproduce:
1. Prepare the 6.9.z setup using the upgrade template.
2. Run the pulp migration.
3. The Pulp migration completes successfully.
4. Run the upgrade from 6.9 to 6.10 Snap10; it fails at the Pulp 2 to Pulp 3 content switchover stage.

foreman-maintain upgrade run --whitelist="disk-performance" --target-version 6.10 -y
Checking for new version of satellite-maintain..
Nothing to update, can't find new version of satellite-maintain.
Running preparation steps required to run the next scenarios
.............
.............
Running Procedures before migrating to Satellite 6.10
...........
...........
Switch support for certain content from Pulp 2 to Pulp 3: 
Performing final content migration before switching content           [FAIL]
Failed executing foreman-rake katello:pulp3_migration, exit status 1:
Migration failed, You will want to investigate: https://xyz.com/foreman_tasks/tasks/6441602d-245a-4976-98c1-401c81e10f03
rake aborted!
ForemanTasks::TaskError: Task 6441602d-245a-4976-98c1-401c81e10f03: Katello::Errors::Pulp3Error: Task canceled
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.18.1.29/lib/katello/tasks/pulp3_migration.rake:35:in `block (2 levels) in <top (required)>'
/opt/rh/rh-ruby25/root/usr/share/gems/gems/rake-12.3.3/exe/rake:27:in `<top (required)>'
Tasks: TOP => katello:pulp3_migration
(See full trace by running task with --trace)
Checking for valid Katello configuraton.
Starting task.
Content migration starting.


Actual results:
Content switchover failed in the satellite upgrade.

Expected results:
The content switchover should complete successfully.

Additional info:

Comment 3 Justin Sherrill 2021-07-23 21:02:53 UTC
I'm noticing a couple things going on:

1.  the initial migration is failing with:

Some corrupted or missing content found, run 'foreman-maintain content migration-stats' for more information.

I don't see any call to 'foreman-rake katello:approve_corrupted_migration_content' to approve the corrupted or missing content.

2.  This is causing the 'content prepare' command to leave the worker services running (approve_corrupted_migration_content won't actually fix that, unsure if this really needs fixing)

3.  When the upgrade shuts down all its services, it properly detects that pulpcore-api needs shutting down, but not the pulpcore-workers, so they are left running.  At the end, however, redis is shut down, which causes the workers to die.

4.  Eventually redis is started, but the workers are already dead and work is assigned to them (https://pulp.plan.io/issues/5906 may help with this, unsure since pulp hasn't realized the workers are dead).

Let me dig into 3) as I think that will help quite a bit.
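
A rough way to observe the condition in point 3 before starting the upgrade is to list any pulpcore-worker units that are still active. This is only a sketch: the function name and the WORKER_COUNT variable are made up for illustration, and the default of four workers is an assumption taken from the instance names that appear in the workaround in this bug.

```shell
#!/bin/sh
# Sketch: print the pulpcore-worker units that systemd still reports as active.
# WORKER_COUNT is a hypothetical knob; 4 is an assumption, adjust to your setup.
WORKER_COUNT="${WORKER_COUNT:-4}"

active_pulpcore_workers() {
    i=1
    while [ "$i" -le "$WORKER_COUNT" ]; do
        # 'is-active --quiet' exits 0 only when the unit is active.
        if systemctl is-active --quiet "pulpcore-worker@$i"; then
            echo "pulpcore-worker@$i"
        fi
        i=$((i + 1))
    done
}
```

Calling active_pulpcore_workers prints one unit name per line and prints nothing when no worker is active.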

Comment 4 Justin Sherrill 2021-07-23 21:12:50 UTC
One thing to be clear though: you will need to update automation to run 'foreman-rake katello:approve_corrupted_migration_content' if there is missing or corrupt content on the filesystem.  The upgrade will not proceed if you have missing/corrupt content and have not explicitly approved it.
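
For automation, the step described above could look roughly like the following. Both commands are the ones quoted in this bug; the function name and the RUN hook (set RUN=echo to print the commands instead of executing them) are hypothetical. Note that this sketch approves unconditionally after printing the stats, so it assumes the decision to accept missing or corrupt content has already been made.

```shell
#!/bin/sh
# Sketch: log the migration stats, then explicitly approve missing/corrupt content.
# RUN is a hypothetical dry-run hook; leave it empty to really execute the commands.
RUN="${RUN:-}"

approve_corrupted_content() {
    # Record what is missing or corrupted so it ends up in the automation log.
    $RUN foreman-maintain content migration-stats || return 1
    # Explicit approval; per this bug, the upgrade will not proceed without it
    # when missing or corrupt content exists.
    $RUN foreman-rake katello:approve_corrupted_migration_content
}
```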

Comment 8 Justin Sherrill 2021-07-26 20:52:21 UTC
Connecting redmine issue https://projects.theforeman.org/issues/33149 from this bug

Comment 9 Justin Sherrill 2021-07-27 15:01:11 UTC
A simple workaround should be to run this prior to running the upgrade:

 systemctl stop pulpcore-worker@1 pulpcore-worker@2 pulpcore-worker@3 pulpcore-worker@4


I'm in the middle of testing the fix to 100% confirm this.
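
The workaround above can be wrapped for automation along these lines. This is a sketch, not part of the fix: the function name, the count argument and the RUN dry-run hook are hypothetical, and the default of four workers simply mirrors the command above.

```shell
#!/bin/sh
# Sketch: stop pulpcore-worker@1 .. pulpcore-worker@N before running the upgrade.
# RUN is a hypothetical dry-run hook (RUN=echo prints the command instead).
RUN="${RUN:-}"

stop_pulpcore_workers() {
    count="${1:-4}"   # assumption: four workers, as in the workaround above
    units=""
    i=1
    while [ "$i" -le "$count" ]; do
        units="$units pulpcore-worker@$i"
        i=$((i + 1))
    done
    # Word splitting on $units is intentional: one argument per unit.
    $RUN systemctl stop $units
}
```

With RUN=echo, stop_pulpcore_workers 2 only prints the systemctl command it would run, which is a cheap way to check the unit list first.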

Comment 10 Devendra Singh 2021-08-09 19:51:35 UTC
Verified on 6.10 Snap12.

Verification points:

1. Prepared the 6.9.z setup using the upgrade template.
2. Ran the pulp migration.
3. The Pulp migration completed successfully.
4. Ran the upgrade from 6.9.z to 6.10 Snap12 and it completed successfully.

foreman-maintain upgrade run --whitelist="disk-performance" --target-version 6.10 -y
...........
...........
Running Checks before upgrading to Satellite 6.10
.......
Unlock packages:                                                      [OK]
--------------------------------------------------------------------------------
Update package(s) :                                                   [OK]
--------------------------------------------------------------------------------
Procedures::Installer::Upgrade:                                       [OK]
--------------------------------------------------------------------------------
Execute upgrade:run rake task:                                        [OK]
--------------------------------------------------------------------------------
Running Procedures after migrating to Satellite 6.10
================================================================================
Refresh detected features:                                            [OK]
--------------------------------------------------------------------------------
Start applicable services: 
.............
.............
Upgrade finished.


5. Verified the Fixed In Version.

# rpm -qa|grep rubygem-foreman_maintain
rubygem-foreman_maintain-0.8.10-1.el7sat.noarch
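
The same check can be scripted, comparing the installed version against the Fixed In Version of this bug (0.8.9). A minimal sketch: the function names are made up for illustration, and the comparison relies on version sorting with 'sort -V', which is assumed to be available (GNU coreutils).

```shell
#!/bin/sh
# Sketch: succeed only if rubygem-foreman_maintain is at least the fixed version.
FIXED_VERSION="0.8.9"

# True when $1 >= $2 in version order (assumes GNU 'sort -V').
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

has_fix() {
    # rpm -q exits non-zero when the package is not installed.
    installed="$(rpm -q --queryformat '%{VERSION}' rubygem-foreman_maintain)" || return 1
    version_ge "$installed" "$FIXED_VERSION"
}
```

A plain string comparison would rank 0.8.10 below 0.8.9, which is why the sketch uses version-aware sorting.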

Comment 13 errata-xmlrpc 2021-11-16 13:48:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Satellite 6.10 Satellite Maintenance Release), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:4697