Bug 2093829 - 'foreman-maintain content migration-stats' command gets stuck and consumes all memory
Summary: 'foreman-maintain content migration-stats' command gets stuck and consumes all memory
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Satellite Maintain
Version: 6.9.9
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium with 3 votes
Target Milestone: 6.9.10
Assignee: satellite6-bugs
QA Contact: Gaurav Talreja
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-06-06 07:44 UTC by Hao Chang Yu
Modified: 2022-11-17 17:17 UTC
CC List: 13 users

Fixed In Version: tfm-rubygem-katello-3.18.1.55-1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-11-17 17:17:17 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Foreman Issue Tracker | 35142 | 0 | Normal | Ready For Testing | 'foreman-maintain content migration-stats' command stucks and consume all memory | 2022-11-01 10:38:36 UTC
Github | Katello/katello/pull/10180 | 0 | None | Merged | Fixes #35142 - Migration stats rake triggers OOM | 2022-07-27 11:49:41 UTC
Red Hat Knowledge Base (Solution) | 6964909 | 0 | None | None | None | 2022-06-27 15:51:05 UTC
Red Hat Product Errata | RHSA-2022:8532 | 0 | None | None | None | 2022-11-17 17:17:30 UTC

Description Hao Chang Yu 2022-06-06 07:44:21 UTC
Description of problem:
The "foreman-maintain content migration-stats" command runs for an extremely long time (several hours to days) and consumes all system memory when the number of unmigratable content units is large, such as 10K+.


# foreman-maintain content migration-stats
Running Retrieve Pulp 2 to Pulp 3 migration statistics
================================================================================
Retrieve Pulp 2 to Pulp 3 migration statistics: <=========== Stuck here for a long time


# Memory consumption increases quickly.
-----------------------------------------------------------
foreman  14301 16.0  1.9 816904 384304 ?       Ssl  17:23   1:32 /opt/rh/rh-ruby25/root/usr/bin/ruby /opt/rh/rh-ruby25/root/usr/bin/rake katello:pulp3_migration_stats

foreman  14301 15.1  4.5 1337164 901500 ?      Rsl  17:23   1:36 /opt/rh/rh-ruby25/root/usr/bin/ruby /opt/rh/rh-ruby25/root/usr/bin/rake katello:pulp3_migration_stats

foreman  14301 11.7 14.3 3311908 2864024 ?     Ssl  17:23   1:50 /opt/rh/rh-ruby25/root/usr/bin/ruby /opt/rh/rh-ruby25/root/usr/bin/rake katello:pulp3_migration_stats

foreman  14301 12.4 33.5 7130700 6684896 ?     Rsl  17:23   2:18 /opt/rh/rh-ruby25/root/usr/bin/ruby /opt/rh/rh-ruby25/root/usr/bin/rake katello:pulp3_migration_stats
-----------------------------------------------------------
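The four samples above look like `ps aux` output (an assumption; the report does not name the command that produced them), in which the sixth column is the resident set size in KiB and the tenth is cumulative CPU time. A minimal Ruby sketch that extracts those columns and converts RSS to MiB makes the growth easier to read. The constant and method names are illustrative only, and the command column is abbreviated.

```ruby
# RSS growth of PID 14301, taken from the four samples in this report.
# Assumes `ps aux` column order:
# USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
# The COMMAND column is shortened here; the numeric columns are unchanged.
PS_SAMPLES = <<~SAMPLES
  foreman  14301 16.0  1.9 816904 384304 ?       Ssl  17:23   1:32 rake katello:pulp3_migration_stats
  foreman  14301 15.1  4.5 1337164 901500 ?      Rsl  17:23   1:36 rake katello:pulp3_migration_stats
  foreman  14301 11.7 14.3 3311908 2864024 ?     Ssl  17:23   1:50 rake katello:pulp3_migration_stats
  foreman  14301 12.4 33.5 7130700 6684896 ?     Rsl  17:23   2:18 rake katello:pulp3_migration_stats
SAMPLES

# Returns one hash per sample with the cumulative CPU time and the RSS in KiB.
def parse_ps_samples(text)
  text.each_line.map(&:split).reject(&:empty?).map do |fields|
    { cpu_time: fields[9], rss_kib: Integer(fields[5]) }
  end
end

samples = parse_ps_samples(PS_SAMPLES)
samples.each do |s|
  puts format("CPU time %s  RSS %.1f MiB", s[:cpu_time], s[:rss_kib] / 1024.0)
end

growth = samples.last[:rss_kib].to_f / samples.first[:rss_kib]
puts format("RSS grew %.1fx between the first and last sample", growth)
```

Read this way, the process went from roughly 375 MiB to roughly 6.4 GiB of resident memory while accumulating only 46 seconds of additional CPU time (1:32 to 2:18).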


Steps to Reproduce:
1. Prepare a Satellite 6.9.9 with about 20K or more RPMs, many content views, and many content view versions.

2. To simulate unmigratable RPMs, run the following commands to flag all RPMs as missing from migration.

foreman-rake console
Katello::Rpm.update_all(migrated_pulp3_href: nil, missing_from_migration: true)
exit

3. Run the "foreman-maintain content migration-stats" command.

Actual results:
The command gets stuck and memory consumption increases over time until OOM is triggered.


Expected results:
The command runs successfully and consumes a reasonable amount of system memory.


In addition to the memory issue, the output files also contain many duplicate rows, which is why the script can take several hours to days to run.

In my case, it wrote exactly the same row 500 times:
---------------------------------------------------------
# grep "opa-fm-10.0.0.0-444.el7.x86_64.rpm,1,Red Hat Enterprise Linux 7 Server RPMs x86_64 7Server,Default Organization View,1.0" Rpm | wc -l
500
----------------------------------------------------------
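The `grep | wc -l` check above counts one known row. To see how widespread the duplication is, every row can be counted at once. Below is a minimal sketch in plain Ruby that uses a small in-memory sample rather than the real `Rpm` output file. The `duplicate_rows` helper is hypothetical, and the second sample row is invented for illustration. A hash counter is used instead of `Enumerable#tally` because the `rh-ruby25` interpreter shown in the process listing predates that method.

```ruby
# Hypothetical helper: returns { row => count } for every row seen more than once.
# Accepts any enumerable of lines (an Array, or File.foreach(path)).
def duplicate_rows(lines)
  counts = lines.each_with_object(Hash.new(0)) do |line, acc|
    row = line.chomp
    acc[row] += 1 unless row.empty?
  end
  counts.select { |_row, count| count > 1 }
end

# Illustrative sample. The first row is taken from the grep pattern in this
# report; the second row is made up for the example.
sample = [
  "opa-fm-10.0.0.0-444.el7.x86_64.rpm,1,Red Hat Enterprise Linux 7 Server RPMs x86_64 7Server,Default Organization View,1.0",
  "example-1.0-1.el7.x86_64.rpm,1,Example Repo,Default Organization View,1.0"
]
lines = [sample[0]] * 3 + [sample[1]]

dups = duplicate_rows(lines)
dups.each { |row, count| puts "#{count}x #{row}" }
```

Against the real output, `duplicate_rows(File.foreach("Rpm"))` would read the file line by line, assuming the file is named `Rpm` as in the grep command above.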

Comment 13 Gaurav Talreja 2022-11-11 16:55:48 UTC
Verified.

Tested on Satellite 6.9.10 Snap 2.0
Version: katello-3.18.1-3.el7sat.noarch

Steps:
1. Prepare a Satellite 6.9.10 with 20K or more RPMs, many content views, and many content view versions
2. # date;foreman-maintain content migration-stats;date --> Takes less than one minute
3. echo "Katello::Rpm.update_all(migrated_pulp3_href: nil, missing_from_migration: true)" | foreman-rake console
4. # date;foreman-maintain content migration-stats;date --> Takes about 10 minutes or less

Observation:
Tested both with and without the fix. Without the fix, the command gets stuck and memory consumption increases over time until OOM is triggered, with errors in syslog as mentioned. With the fix, the same command returns "Missing/Corrupted Content Summary" after ~10 minutes, and no such OOM errors are observed in syslog.
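
The `date; ...; date` pattern in steps 2 and 4 leaves the subtraction to the reader. A minimal Ruby sketch that measures the elapsed wall-clock time of a command with a monotonic clock is shown here. `time_command` is a hypothetical helper, and the command it runs is a harmless placeholder rather than `foreman-maintain`.

```ruby
require "rbconfig"

# Hypothetical helper: runs a command and returns [elapsed_seconds, success].
# `success` is true, false, or nil, exactly as returned by Kernel#system.
def time_command(*argv)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  success = system(*argv)
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  [elapsed, success]
end

# Placeholder command: the current Ruby interpreter sleeping briefly.
elapsed, success = time_command(RbConfig.ruby, "-e", "sleep 0.2")
puts format("finished in %.2f s (success: %s)", elapsed, success.inspect)
```

On a Satellite, the same helper could be called as `time_command("foreman-maintain", "content", "migration-stats")`.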

Comment 18 errata-xmlrpc 2022-11-17 17:17:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Satellite 6.9.10 Async Security Update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:8532

