Description of problem:
This has been happening for a while, but I keep forgetting to write it up.
The workload starts 250 pause pods, which boils down to 250 ose-pod containers and 250 gcr.io/google_containers/pause-amd64:3.0 containers. The workload is strictly starting containers.
Pods are started 40 at a time, with a break between each batch; a rough sketch of that pattern follows below.
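For concreteness, this is a minimal Go sketch of the batched-start pattern, assuming pods are created with "oc run". The pod names, the --restart=Never flag, and the 30-second break are illustrative assumptions, not the original test harness (the actual run was driven by pbench-user-benchmark, per the link below).

package main

import (
	"fmt"
	"os/exec"
	"time"
)

func main() {
	const total, batch = 250, 40
	for i := 0; i < total; i++ {
		name := fmt.Sprintf("pause-%03d", i) // hypothetical naming scheme
		// --restart=Never makes "oc run" create a bare pod.
		out, err := exec.Command("oc", "run", name,
			"--image=gcr.io/google_containers/pause-amd64:3.0",
			"--restart=Never").CombinedOutput()
		if err != nil {
			fmt.Printf("failed to start %s: %v (%s)\n", name, err, out)
		}
		if (i+1)%batch == 0 {
			time.Sleep(30 * time.Second) // break between batches (duration assumed)
		}
	}
}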
A pidstat profile of the run shows rhel-push-plugin as the #3 CPU user at ~5%. That seems high for a plugin that should get out of the way as quickly as it can, especially when no push is involved.
See an example here: http://perf-infra.ec2.breakage.org/pbench/results/ip-172-31-12-67/pbench-user-benchmark_nodeVert314_2016-11-08_19:08:38/1/reference-result/tools-default/ip-172-31-48-59.us-west-2.compute.internal/pidstat/cpu_usage.html
Click on each of the top 3 lines of the table to see the top users (openshift node, docker daemon and rhel-push-plugin).
This pattern is consistent across runs.
Version-Release number of selected component (if applicable): 1.12.3-3 from Extras.
How reproducible: Always
So is this just a matter of how many resources the plugin consumes, or is it blocked at some point, preventing pushes? (The title may refer to the former, but the description to the latter.)
runcom, sorry for the title switch :-). This refers to the CPU resources consumed by the plugin while starting 250 OCP pods. There is no blockage, and there are no pushes in this workload.
I'll try to reproduce this somehow. I suspect this is how Docker manages plugins and has nothing to do with our plugin, since the plugin is just an "if not pushing, then exit" check; it's really simple and basic. A rough sketch of that check is below.
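To illustrate what "if not pushing, then exit" means, here is a minimal sketch of an allow-everything-except-pushes authorization plugin. The AuthZPlugin.AuthZReq endpoint and JSON field names follow the Docker authorization-plugin protocol as I understand it; the regex, the TCP port, and the handler body are assumptions for illustration, not the actual rhel-push-plugin source.

package main

import (
	"encoding/json"
	"net/http"
	"regexp"
)

// Matches the docker remote API push endpoint, e.g. POST /v1.24/images/foo/push.
var pushURI = regexp.MustCompile(`/images/.+/push`)

type authZRequest struct {
	RequestMethod string `json:"RequestMethod"`
	RequestURI    string `json:"RequestUri"`
}

type authZResponse struct {
	Allow bool   `json:"Allow"`
	Msg   string `json:"Msg,omitempty"`
}

func main() {
	http.HandleFunc("/AuthZPlugin.AuthZReq", func(w http.ResponseWriter, r *http.Request) {
		var req authZRequest
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		// Anything that is not a push falls straight through: allow and return.
		resp := authZResponse{Allow: true}
		if req.RequestMethod == http.MethodPost && pushURI.MatchString(req.RequestURI) {
			// Only here would a real plugin inspect the image being pushed.
			resp.Msg = "push request inspected"
		}
		json.NewEncoder(w).Encode(resp)
	})
	// The real plugin listens on a unix socket under /run/docker/plugins;
	// a TCP port just keeps this sketch self-contained.
	http.ListenAndServe("127.0.0.1:8080", nil)
}

If the plugin really does nothing more than this, each container start should cost it only a JSON decode and a regex match, which makes the ~5% CPU in the pidstat profile above worth digging into.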
We have no plans to ship another version of Docker at this time. RHEL 7 is in its final support stages, where only security fixes will be released. Customers should move to Podman, which is available starting in RHEL 7.6.