Created attachment 1175926 [details]
mongodb usage

Description of problem:
Created 10 content views and added 5 repos (rhel7 x86_64, rhel6 x86_64, rhel6 i386, rhel5 x86_64, rhel5 i386) to all of them, then published all of them concurrently. This caused huge spikes in the memory usage of mongodb and ruby:

mongodb: 9G
ruby: 6G

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Create 10 content views and add 5 repos (rhel7 x86_64, rhel6 x86_64, rhel6 i386, rhel5 x86_64, rhel5 i386) to each.
2. Publish all 10 content views concurrently.
3. Watch the memory usage of the mongodb and ruby processes.

Actual results:
mongodb memory usage spikes to ~9G and ruby to ~6G.

Expected results:
Memory usage stays within reasonable bounds during concurrent publishes.

Additional info:
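For reference, a minimal sketch of how the concurrent publish step can be driven from hammer; the content view names and organization below are placeholders, not the ones actually used here:

  # publish all 10 content views at once; --async returns immediately,
  # so the publish tasks run concurrently in foreman-tasks
  for i in $(seq 1 10); do
    hammer content-view publish --name "cv$i" \
      --organization "Default Organization" --async
  done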
Created attachment 1175927 [details]
ruby mem usage
Noticed high CPU usage from pgsql too:

    PID USER     PR NI   VIRT   RES   SHR S %CPU %MEM   TIME+ COMMAND
  22448 postgres 20  0 275260 82456 38584 R 97.7  0.2 3:52.81 postgres
  23557 postgres 20  0 285144 91440 38240 R 97.7  0.2 4:16.66 postgres
   6381 postgres 20  0 280892 87268 38524 R 97.0  0.2 4:09.29 postgres
MongoDB memory usage is growing very fast with each published version; after 5 publish iterations it reached 18G.
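Since it is hard to read mongo memory from top alone, mongo's own counters may help; a quick sketch (pulp_database is the Pulp default database name, an assumption here):

  # serverStatus().mem reports mongod's resident and virtual sizes in MB
  mongo pulp_database --eval 'printjson(db.serverStatus().mem)'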
It can be difficult to get an accurate read of mongo's memory usage (http://blog.mongodb.org/post/101911655/mongo-db-memory-usage). I'm not sure if the mongo behavior you see is an issue or not.

The ruby memory may be an issue; is that from the foreman-tasks process?

Also, was postgres slow query logging enabled? If so, did any queries stand out?
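If it wasn't enabled, here's a sketch of turning it on, assuming the default RHEL postgres data directory (the path and the 100 ms threshold are just examples):

  # log every statement that takes longer than 100 ms
  echo "log_min_duration_statement = 100" >> /var/lib/pgsql/data/postgresql.conf
  systemctl reload postgresql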
(In reply to Chris Duryee from comment #4)
> It can be difficult to get an accurate read of mongo's memory usage
> (http://blog.mongodb.org/post/101911655/mongo-db-memory-usage). I'm not sure
> if the mongo behavior you see is an issue or not.
>
> The ruby memory may be an issue, is that from the foreman-tasks process?

Yes, foreman-tasks.

> Also, was postgres slow query logging enabled? If so, did any queries stand
> out?

2016-07-05 13:06:11 EDT LOG:  duration: 179.050 ms  execute <unnamed>:
  SELECT COUNT(*) FROM "katello_errata"
  INNER JOIN katello_erratum_packages on katello_erratum_packages.erratum_id = katello_errata.id
  INNER JOIN katello_repository_errata on katello_repository_errata.erratum_id = katello_errata.id
  INNER JOIN katello_rpms on katello_rpms.filename = katello_erratum_packages.filename
  INNER JOIN katello_repository_rpms on katello_repository_rpms.rpm_id = katello_rpms.id
  WHERE "katello_repository_rpms"."repository_id" = 50 AND "katello_repository_errata"."repository_id" = 50

2016-07-05 13:06:11 EDT LOG:  duration: 167.819 ms  execute <unnamed>:
  SELECT katello_errata.id FROM "katello_errata"
  INNER JOIN katello_erratum_packages on katello_erratum_packages.erratum_id = katello_errata.id
  INNER JOIN katello_repository_errata on katello_repository_errata.erratum_id = katello_errata.id
  INNER JOIN katello_rpms on katello_rpms.filename = katello_erratum_packages.filename
  INNER JOIN katello_repository_rpms on katello_repository_rpms.rpm_id = katello_rpms.id
  WHERE "katello_repository_rpms"."repository_id" = 50 AND "katello_repository_errata"."repository_id" = 50

CPU usage of pgsql processes:

    PID USER     PR NI   VIRT   RES   SHR S  %CPU %MEM   TIME+ COMMAND
  10686 postgres 20  0 269380 76404 38564 R 100.0  0.2 4:25.69 postgres
  19150 postgres 20  0 280796 84724 38536 R  98.7  0.2 3:50.12 postgres
  20833 postgres 20  0 269172 73780 38384 S  97.3  0.1 4:06.80 postgres
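For what it's worth, the plans for those queries can be inspected with EXPLAIN ANALYZE to see which join dominates the runtime; a sketch, assuming the Satellite default database name "foreman" and reusing repository id 50 from the log above:

  su - postgres -c 'psql foreman -c "
    EXPLAIN ANALYZE
    SELECT COUNT(*) FROM katello_errata
    INNER JOIN katello_erratum_packages ON katello_erratum_packages.erratum_id = katello_errata.id
    INNER JOIN katello_repository_errata ON katello_repository_errata.erratum_id = katello_errata.id
    INNER JOIN katello_rpms ON katello_rpms.filename = katello_erratum_packages.filename
    INNER JOIN katello_repository_rpms ON katello_repository_rpms.rpm_id = katello_rpms.id
    WHERE katello_repository_rpms.repository_id = 50
      AND katello_repository_errata.repository_id = 50"'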
Hi Pradeep, you may want to create a separate bug to track mongo vs ruby, since they will involve different solutions.
I'm not sure what to file upstream. MongoDB will use as much RAM as it can get its hands on for caching, but as the link Chris shared points out, it is relatively friendly about relinquishing that RAM.

The one area pulp does have control of, and that may have exacerbated the problem seen, is a known inefficiency during publish of lots of errata. That's fixed upstream in pulp 2.9 by using a different XML library: https://pulp.plan.io/issues/1716

Otherwise, getting off of MongoDB is the best solution. ;)
Michael, that is a totally valid and fair assessment. Thanks! I will switch the component on this back to content views to see if there is anything on the ruby side.
The Pulp upstream bug status is at MODIFIED. Updating the external tracker on this bug.
The Pulp upstream bug priority is at Normal. Updating the external tracker on this bug.
The Pulp upstream bug status is at CLOSED - CURRENTRELEASE. Updating the external tracker on this bug.
Looks like ruby is the culprit. We were testing concurrent registration of 290 hosts with an increased passenger queue and pgsql max connections. Ruby is using close to 9G of memory.
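For reference, this is the kind of tuning meant by "increased passenger queue, pgsql max connections"; the directive values and paths below are illustrative assumptions, not the exact settings used:

  # Apache/Passenger: larger worker pool and request queue (values are examples)
  echo "PassengerMaxPoolSize 12"          >> /etc/httpd/conf.d/passenger.conf
  echo "PassengerMaxRequestQueueSize 500" >> /etc/httpd/conf.d/passenger.conf

  # PostgreSQL: raise the connection cap (value is an example)
  echo "max_connections = 500" >> /var/lib/pgsql/data/postgresql.conf

  systemctl restart httpd postgresql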
Created attachment 1182716 [details]
ruby mem usage during concurrent registration of 290 hosts
Created redmine issue http://projects.theforeman.org/issues/16181 from this bug
Unfortunately there is not much we can do at this moment to make Content View publishes less resource-intensive. Dynflow does not currently allow for global throttling limits, so running many different tasks at the same time will eat up resources. There are discussions about adding these global throttling limits, which would solve this problem.

For anything that is a Bulk Action in Dynflow, we can make use of a concurrency limit that is built into Dynflow: for one task that has many concurrent sub-actions, only X of them will run at a time. Content View publishes, however, are not Bulk Actions. There is an open upstream issue to add this limit to Bulk Actions in katello, which will help some of the Bulk Action tasks with memory usage: http://projects.theforeman.org/issues/16336

We will continue to look at any improvements we can make to the individual tasks' memory usage.
I see. So we basically depend on bug 1368103 now. Thank you for the explanation!
Created attachment 1326898 [details]
The increase in Dynflow memory during the publishing of 10 CVs
Thank you for your interest in Satellite 6. We have evaluated this request, and we do not expect this to be implemented in the product in the foreseeable future. We are therefore closing this out as WONTFIX. If you have any concerns about this, please feel free to contact Rich Jerrido or Bryan Kearney. Thank you.