Bug 2093829

Summary: 'foreman-maintain content migration-stats' command stucks and consume all memory
Product: Red Hat Satellite
Reporter: Hao Chang Yu <hyu>
Component: Satellite Maintain
Assignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED ERRATA
QA Contact: Gaurav Talreja <gtalreja>
Severity: medium
Priority: high
Version: 6.9.9
CC: ahumbe, ajambhul, apatel, kgaikwad, mkalyat, nshaik, osousa, pdwyer, redhat.nrl7030, sadas, satellite6-bugs, saydas, wclark
Target Milestone: 6.9.10
Keywords: Patch, Triaged
Target Release: Unused
Hardware: Unspecified
OS: Unspecified
Fixed In Version: tfm-rubygem-katello-3.18.1.55-1
Last Closed: 2022-11-17 17:17:17 UTC
Type: Bug

Description Hao Chang Yu 2022-06-06 07:44:21 UTC
Description of problem:
The "foreman-maintain content migration-stats" command runs for an extremely long time (several hours to days) and consumes all system memory when the number of unmigratable content units is large, such as 10K+.


# foreman-maintain content migration-stats
Running Retrieve Pulp 2 to Pulp 3 migration statistics
================================================================================
Retrieve Pulp 2 to Pulp 3 migration statistics: <=========== Stuck here for a long time


# Memory consumption increases quickly.
-----------------------------------------------------------
foreman  14301 16.0  1.9 816904 384304 ?       Ssl  17:23   1:32 /opt/rh/rh-ruby25/root/usr/bin/ruby /opt/rh/rh-ruby25/root/usr/bin/rake katello:pulp3_migration_stats

foreman  14301 15.1  4.5 1337164 901500 ?      Rsl  17:23   1:36 /opt/rh/rh-ruby25/root/usr/bin/ruby /opt/rh/rh-ruby25/root/usr/bin/rake katello:pulp3_migration_stats

foreman  14301 11.7 14.3 3311908 2864024 ?     Ssl  17:23   1:50 /opt/rh/rh-ruby25/root/usr/bin/ruby /opt/rh/rh-ruby25/root/usr/bin/rake katello:pulp3_migration_stats

foreman  14301 12.4 33.5 7130700 6684896 ?     Rsl  17:23   2:18 /opt/rh/rh-ruby25/root/usr/bin/ruby /opt/rh/rh-ruby25/root/usr/bin/rake katello:pulp3_migration_stats
-----------------------------------------------------------
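
To put numbers on the growth, the RSS column can be pulled out of the samples above. This is only a small sketch: it assumes the usual `ps aux` column order (RSS is the sixth field, in KiB), `rss_kib` is a hypothetical helper name, and the command path in each sample is shortened for readability.

```ruby
# Column order assumed to match `ps aux`:
# USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
def rss_kib(ps_line)
  Integer(ps_line.split[5])
end

# The four samples from this report (command path shortened).
samples = [
  "foreman  14301 16.0  1.9 816904 384304 ?   Ssl  17:23   1:32 rake katello:pulp3_migration_stats",
  "foreman  14301 15.1  4.5 1337164 901500 ?  Rsl  17:23   1:36 rake katello:pulp3_migration_stats",
  "foreman  14301 11.7 14.3 3311908 2864024 ? Ssl  17:23   1:50 rake katello:pulp3_migration_stats",
  "foreman  14301 12.4 33.5 7130700 6684896 ? Rsl  17:23   2:18 rake katello:pulp3_migration_stats"
]

# Print cumulative CPU time (TIME column) next to resident memory in GiB.
samples.each do |line|
  cpu_time = line.split[9]
  puts format("CPU time %s  RSS %.2f GiB", cpu_time, rss_kib(line) / 1024.0 / 1024.0)
end
```

In these samples the resident size goes from about 0.37 GiB to about 6.4 GiB while the process accumulates 46 seconds of CPU time.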


Steps to Reproduce:
1. Prepare a Satellite 6.9.9 with about 20K or more RPMs, many content views, and many content view versions.

2. To simulate unmigratable RPMs, run the following commands to flag all RPMs as missing from migration.

foreman-rake console
Katello::Rpm.update_all(migrated_pulp3_href: nil, missing_from_migration: true)
exit

3. Run the "foreman-maintain content migration-stats" command.

Actual results:
The command gets stuck and memory consumption increases over time until OOM is triggered.


Expected results:
The command runs successfully and consumes a reasonable amount of system memory.


In addition to the memory issue, the output files also contain many duplicate rows, which is why the script can take several hours to days to run.

In my case, it wrote the exact same row 500 times:
---------------------------------------------------------
# grep "opa-fm-10.0.0.0-444.el7.x86_64.rpm,1,Red Hat Enterprise Linux 7 Server RPMs x86_64 7Server,Default Organization View,1.0" Rpm | wc -l
500
----------------------------------------------------------
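
The grep above checks one known row. To list every repeated row at once, something like the following Ruby sketch may help. It assumes the report is a plain text file with one row per line; the file name `Rpm` is taken from the grep above, and `duplicate_rows` is a hypothetical helper, not part of Katello or foreman-maintain.

```ruby
# Count how often each row occurs and keep only rows seen more than once.
# Accepts any enumerable of lines (an Array, or File.foreach(path)).
def duplicate_rows(lines)
  counts = Hash.new(0)
  lines.each { |line| counts[line.chomp] += 1 }
  counts.select { |_row, n| n > 1 }
end

# Possible usage against a generated report (not run here):
#   duplicate_rows(File.foreach("Rpm"))
#     .sort_by { |_row, n| -n }
#     .first(10)
#     .each { |row, n| puts "#{n}\t#{row}" }
```

Reading with `File.foreach` keeps only the distinct rows and their counts in memory rather than the whole file, which matters if the report itself has grown very large.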

Comment 13 Gaurav Talreja 2022-11-11 16:55:48 UTC
Verified.

Tested on Satellite 6.9.10 Snap 2.0
Version: katello-3.18.1-3.el7sat.noarch

Steps:
1. Prepare a Satellite 6.9.10 with 20K or more RPMs, many content views, and many content view versions.
2. # date;foreman-maintain content migration-stats;date --> Takes less than one minute
3. echo "Katello::Rpm.update_all(migrated_pulp3_href: nil, missing_from_migration: true)" | foreman-rake console
4. # date;foreman-maintain content migration-stats;date --> Takes less than ~10 minutes

Observation:
Tested both with and without the fix. Without the fix, the command gets stuck and memory consumption increases over time until OOM is triggered, with errors in syslog as mentioned. With the fix, the same command returns "Missing/Corrupted Content Summary" after ~10 minutes, and no such OOM errors are observed in syslog.

Comment 18 errata-xmlrpc 2022-11-17 17:17:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Satellite 6.9.10 Async Security Update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:8532