Bug 1628145 - Ansible Remote Execution Job Stalls at Scale
Summary: Ansible Remote Execution Job Stalls at Scale
Keywords:
Status: CLOSED DUPLICATE of bug 1628505
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Remote Execution
Version: 6.4
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: Unspecified
Assignee: satellite6-bugs
QA Contact:
URL:
Whiteboard:
Depends On: 1628505 1646745
Blocks:
Reported: 2018-09-12 11:28 UTC by sbadhwar
Modified: 2018-11-05 22:51 UTC
CC List: 6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-09-18 11:23:29 UTC
Target Upstream Version:
Embargoed:



Description sbadhwar 2018-09-12 11:28:10 UTC
Description of problem:
While running an Ansible package install command at scale (45K hosts, with 1 package to be installed), the job stalled and made no further progress. On closer inspection, the smart_proxy service on one of the capsules did not appear to be processing any tasks.

Version-Release number of selected component (if applicable):
Satellite 6.4 Snap 18

How reproducible:
Not sure

Steps to Reproduce:
1. Create a new remote execution job using the Ansible package command
2. Execute the job at large scale (35k or more hosts)

Actual results:
The job stalls at some point and makes no further progress

Expected results:
The job execution finishes successfully

Additional info:
Foreman debug satellite: http://debugs.theforeman.org/foreman-debug-XHcxQ.tar.xz
Foreman debug capsule: http://debugs.theforeman.org/foreman-debug-a5qE7.tar.xz

Comment 1 Adam Ruzicka 2018-09-12 11:37:58 UTC
Additional info:
There were a number of tasks on the capsule. Dynflow status showed roughly 850 events in the queue and 0/5 free workers, yet the process was completely idle.
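
The symptom described above (events queued, no free workers, yet an idle process) can be expressed as a simple check. The following is only a minimal sketch: the ExecutorSnapshot type, its field names, and the CPU threshold are hypothetical and are not part of Dynflow or smart_proxy. It assumes the observations were collected by hand or by some external monitoring.

```python
from dataclasses import dataclass


@dataclass
class ExecutorSnapshot:
    """One observation of an executor; every field here is hypothetical."""
    queued_events: int   # events waiting in the queue
    free_workers: int    # workers reported as free
    total_workers: int   # size of the worker pool
    cpu_percent: float   # CPU usage of the process, used as a proxy for "idle"


def looks_stalled(snapshots, idle_cpu_threshold=1.0):
    """Return True when every snapshot shows queued work, no free
    workers, and a process that is nevertheless idle."""
    if not snapshots:
        return False
    return all(
        s.queued_events > 0
        and s.free_workers == 0
        and s.cpu_percent <= idle_cpu_threshold
        for s in snapshots
    )


# Values modelled on the comment above: roughly 850 queued events, 0/5 free workers.
observed = [ExecutorSnapshot(850, 0, 5, 0.0) for _ in range(3)]
print(looks_stalled(observed))
```

Requiring the condition to hold across several snapshots avoids flagging a busy executor that has momentarily used all of its workers; a saturated but healthy process would show CPU activity and a shrinking queue.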

