Bug 1517559 - using ssh is 3 times faster than same action with Remote Execution distributed over 10 capsules
Keywords:
Status: CLOSED DUPLICATE of bug 1517048
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Remote Execution
Version: 6.3.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: Unspecified
Assignee: satellite6-bugs
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-11-26 20:11 UTC by Jan Hutař
Modified: 2017-11-29 09:12 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-11-29 09:12:33 UTC
Target Upstream Version:
Embargoed:



Description Jan Hutař 2017-11-26 20:11:55 UTC
Description of problem:
Running a command over ssh in a serialized loop is about 3 times faster than running the same command with Remote Execution on clients equally distributed over 10 capsules (with the dynflow database set to "in memory" on the satellite and the capsules).

The satellite is a VM with 20 cores and 47 GB of RAM. The 10 capsules (also VMs) have 8 CPUs and 16 GB of RAM each. All machines, including the hosts, are on a 10G network.


Version-Release number of selected component (if applicable):
satellite-6.3.0-21.0.beta.el7sat.noarch


How reproducible:
always


Steps to Reproduce:
1. Run a ReX job on 30k hosts with the command
   `systemctl stop rhsmcertd; systemctl disable rhsmcertd`
2. Run the same command on a subset of the hosts with a simple ssh loop


Actual results:
The job has now been running for 23 hours, 24 minutes and reports 24800 hosts as done so far, i.e. more than 3 seconds per host.

I have tried to run a simple loop with ssh (it is this complicated only because of the IP ranges we are using):

# time \
    for ip1 in $( seq 0 30 ); do
        for ip2 in 0 1 2 3; do
            ip=$( expr $ip1 \* 8 + $ip2 )
            ssh -o "StrictHostKeyChecking no" \
                -i /root/id_rsa_perf \
                root.$ip.100 \
                "systemctl stop rhsmcertd; systemctl disable rhsmcertd"
        done
    done

This ran the command on 124 hosts and finished in 2m4.953s, i.e. slightly above 1 second per host.
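
For reference, the per-host figures quoted in this report can be reproduced with a little shell arithmetic. This is only a sketch: the variable names are made up for illustration, and the inputs are the numbers given here (23 hours 24 minutes for 24800 hosts via ReX, 2m4.953s for 124 hosts via ssh).

```shell
# Per-host timings derived from the figures quoted in this report.
rex_seconds=$(( 23 * 3600 + 24 * 60 ))   # 23 h 24 min of ReX job runtime so far
rex_hosts=24800                           # hosts the job reports as done
ssh_seconds=124.953                       # 2m4.953s wall-clock time of the ssh loop
ssh_hosts=$(( 31 * 4 ))                   # 31 values of ip1 times 4 values of ip2

# awk does the floating-point division; plain shell arithmetic is integer-only.
rex_per_host=$(awk -v s="$rex_seconds" -v h="$rex_hosts" 'BEGIN { printf "%.2f", s / h }')
ssh_per_host=$(awk -v s="$ssh_seconds" -v h="$ssh_hosts" 'BEGIN { printf "%.2f", s / h }')
ratio=$(awk -v a="$rex_per_host" -v b="$ssh_per_host" 'BEGIN { printf "%.1f", a / b }')

echo "ReX: ${rex_per_host} s/host, ssh: ${ssh_per_host} s/host, ratio: ${ratio}x"
```

With these inputs it prints `ReX: 3.40 s/host, ssh: 1.01 s/host, ratio: 3.4x`, which is where the "3 times faster" in the summary comes from.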


Expected results:
I know the satellite and capsules are doing much more than just sshing to the clients (e.g. storing results for later auditing). However, the load should be distributed among the 10 capsules, and we have already tuned the dynflow database on the satellite and capsules to be in-memory only, so I would expect this action to run at least as fast as what I can achieve with plain ssh.

Comment 2 Ivan Necas 2017-11-27 06:46:06 UTC
Is this worth a separate BZ, rather than a comment with additional data and debug logs on some existing perf bug around ReX?

Comment 3 Jan Hutař 2017-11-27 12:57:18 UTC
Feel free to close this as a duplicate of whatever bug you think this relates to.

Comment 4 Adam Ruzicka 2017-11-29 09:12:33 UTC

*** This bug has been marked as a duplicate of bug 1517048 ***

