Description Pradeep Kumar Surisetty
2017-01-29 02:05:30 UTC
Created attachment 1245470 [details]
ruby mem growth
Description of problem:
Started ReX on 6k nodes (a simple `date` command). During remote execution, Ruby memory usage grew from 1 GB to 18 GB, as shown in the attachment. This causes swapping and slowness.
My Satellite has 48 GB of memory.
Version-Release number of selected component (if applicable):
How reproducible:
Steps to Reproduce:
1. Start ReX on 6k nodes.
2.
3.
Actual results:
Expected results:
Additional info:
Comment 1 Pradeep Kumar Surisetty
2017-01-29 02:07:19 UTC
Comment 4 Pradeep Kumar Surisetty
2017-01-29 04:43:18 UTC
Attached memory growth over a period of time, collected with:
while true; do (date && ps aux --sort -rss | head -n20) >> /var/log/foreman/ps-aux1.log; sleep 60; done
foreman, qpidd, and pgsql show the highest memory usage.
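To compare snapshots in a log like the one produced by the loop above, the RSS column can be summed per executable. This is a minimal Python sketch, assuming the standard `ps aux` column layout (RSS in KiB in the sixth column). The function name `rss_by_command` and the sample values are made up for illustration and are not taken from the attachment.

```python
from collections import defaultdict


def rss_by_command(snapshot):
    """Sum resident set size (KiB) per executable name in `ps aux` output."""
    totals = defaultdict(int)
    for line in snapshot.splitlines():
        # ps aux has 11 columns; the last one (COMMAND) may contain spaces.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # skips the `date` line and the ps header line
        rss_kib = int(fields[5])
        executable = fields[10].split()[0]
        totals[executable] += rss_kib
    return dict(totals)


# Hypothetical snapshot, shaped like one iteration of the logging loop.
sample = """\
Sun Jan 29 04:43:18 UTC 2017
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
foreman   1234 12.0 10.0 900000 500000 ?       Sl   02:00   5:00 ruby /usr/bin/example
foreman   1235 11.0  8.0 800000 400000 ?       Sl   02:00   4:00 ruby /usr/bin/example
qpidd     2000  3.0  2.0 300000 100000 ?       Ssl  01:00   1:00 /usr/sbin/qpidd --config example
"""

print(rss_by_command(sample))
# {'ruby': 900000, '/usr/sbin/qpidd': 100000}
```

Running this over each snapshot in the log would show which executable accounts for the growth between samples.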
Comment 5 Pradeep Kumar Surisetty
2017-01-29 05:34:31 UTC
Passenger is the biggest contributor to this Ruby memory growth.
Comment 6 Pradeep Kumar Surisetty
2017-01-29 05:35:19 UTC
Comment 14 Pradeep Kumar Surisetty
2017-02-24 06:31:59 UTC
Ruby memory grows sharply during remote execution at scale.
For 1K+ ReX: Ruby jumped from a few MB to 5 GB, passenger-foreman jumped from a few MB to 4 GB.
For 2K+ ReX: Ruby jumped from a few MB to 8 GB, passenger-foreman jumped from a few MB to 7 GB.
If this trend continues, we might need a huge amount of memory for 40K hosts.
This will become another major memory concern, like the qpid memory issue.
Comment 16 Pradeep Kumar Surisetty
2017-04-05 03:46:21 UTC
These numbers are from a different setup (30k scale setup).
Started the ReX job `subscription-manager repos --list` on 22k hosts.
Ruby memory shot up to 98 GB
passenger-foreman up to 90 GB
postgresql up to 40 GB
This is killing most of the katello services.
Comment 17 Pradeep Kumar Surisetty
2017-04-05 03:49:35 UTC
Pradeep: we need to start distinguishing between different jobs, those that interact with Satellite and those that don't, as otherwise it might not be clear whether this is connected with https://bugzilla.redhat.com/show_bug.cgi?id=1434040.
For this bug, only scripts that do not interact with Satellite are valid. Scripts that do interact with Satellite need to be tracked against different components.
Comment 21 Pradeep Kumar Surisetty
2017-04-05 15:27:54 UTC
Sure. I will move this to a different BZ or check whether it's connected to 1434040.
IMHO we are creating a load test for the /rhsm/ endpoints:
by running `subscription-manager repos --list`, each host generates the following requests to Satellite:
/rhsm/consumers/:id/certificates/serials
/rhsm/consumers/:id
/rhsm/consumers/:id/content_overrides
/rhsm/consumers/:id/release
which means Satellite has to deal with 4*(number_of_hosts) requests in a very short time interval. No wonder its memory is growing: Passenger will probably spawn a huge number of processes to handle those requests in parallel.
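As a rough illustration of the 4*(number_of_hosts) estimate, here is a small Python sketch using the four endpoints listed above and the 22k host count mentioned earlier in this bug. The function names and the 60-second window are assumptions made for the example; the actual interval over which the requests arrive was not measured here.

```python
# Endpoints each host is reported to hit for `subscription-manager repos --list`.
RHSM_ENDPOINTS = [
    "/rhsm/consumers/:id/certificates/serials",
    "/rhsm/consumers/:id",
    "/rhsm/consumers/:id/content_overrides",
    "/rhsm/consumers/:id/release",
]


def total_requests(hosts):
    """Total requests Satellite receives if every host hits every endpoint once."""
    return len(RHSM_ENDPOINTS) * hosts


def average_rate(hosts, window_seconds):
    """Average requests per second over a hypothetical arrival window."""
    return total_requests(hosts) / window_seconds


print(total_requests(22_000))  # 88000
# Assumed 60-second window, purely for illustration.
print(round(average_rate(22_000, 60)))  # 1467
```

Even spread evenly over a full minute, that would be well over a thousand requests per second hitting Passenger, which is consistent with the process spawning suspected above.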
Comment 23 Pradeep Kumar Surisetty
2017-04-06 12:53:16 UTC
Moving the `subscription-manager repos --list` on 22k hosts issue to a different bug (1439741) to avoid confusion.