Description of problem:
Started a ReX operation (yum install -y aide) on 1024 nodes.
The package is installed on the node, but it takes a long time for the host's status to update in the Remote Execution job UI.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Start Remote Execution (install any package) at scale
2. Monitor https://<satellite>/template_invocations/<anymachine> to see whether the package is installed
3. Check the status on the Remote Execution job UI
Actual results:
The UI takes a long time to update the status after the package is installed.
Expected results:
The UI updates the status once the package is installed.
Created attachment 1244594 [details]
Created attachment 1244595 [details]
accounting of Succeeded is wrong.
Package update completed on more than 10 nodes, but the UI continues to list only 2.
Created attachment 1244597 [details]
yum install aide on 1290
The package installation completed successfully on most of them (a few are still in progress).
The failure rate graph updates dynamically, but the success rate does not, as shown in the attachment.
Created attachment 1244598 [details]
failure-success rate during ReX in progress
IMHO, according to comment 24, this is related to https://bugzilla.redhat.com/show_bug.cgi?id=1416542#c24.
Should it be marked as duplicate?
These two are different.
Running a command via ReX on 6100 hosts. Even though 10% of them completed (the Foreman tasks stopped with success), the main UI at
https://<sat-server>/job_invocations/<job-id> shows 100% Pending.
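One way to put a number on this gap is to compare the per-host results with the aggregate counters shown on the job invocation page. The sketch below is only an illustration: the function name `counter_lag`, the state strings, and the shape of the input data are assumptions made for the example, not Satellite API output.

```python
from collections import Counter


def counter_lag(host_states, ui_counts):
    """Return, per state, how far the aggregate view is from the per-host view.

    host_states: mapping of host name -> state seen on the per-host page
                 (hypothetical values: "success", "failed", "pending").
    ui_counts:   mapping of state -> count shown on the job invocation page.
    """
    actual = Counter(host_states.values())
    states = set(actual) | set(ui_counts)
    # Positive lag: hosts already in this state that the aggregate view misses.
    return {s: actual.get(s, 0) - ui_counts.get(s, 0) for s in sorted(states)}


# Hypothetical numbers shaped like this comment: 6100 hosts, 10% finished,
# while the aggregate view still shows everything as pending.
hosts = {f"host{i}": ("success" if i < 610 else "pending") for i in range(6100)}
lag = counter_lag(hosts, {"success": 0, "pending": 6100})
print(lag)
```

Sampling this periodically during a run would show whether the aggregate view is catching up slowly or not moving at all.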
This relates to the number of messages that we need to process during ReX at scale. To improve this, we need more information about the messages that are being processed:
Then we can reduce the number of messages and suggest running more workers accordingly.
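As a back-of-the-envelope aid for the "more workers" part, a worker count can be estimated from the message rate and the average processing time per message. This is a generic capacity estimate, not anything Dynflow or Satellite computes; the function name, the 70% target utilization, and the sample inputs are all hypothetical.

```python
import math


def workers_needed(messages_per_sec, seconds_per_message, target_utilization=0.7):
    """Estimate how many workers keep up with a given message rate.

    The offered load is rate * service time. Dividing by the target
    utilization leaves headroom so the queue does not keep growing.
    """
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    offered_load = messages_per_sec * seconds_per_message
    return max(1, math.ceil(offered_load / target_utilization))


# Hypothetical inputs: 200 messages/s, 50 ms of processing per message.
print(workers_needed(200, 0.05))
```

The real inputs would have to come from measurements taken during a run at scale.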
Here is a middleware that can help us collect data about the throughput of the executor at scale:
See the usage comments in the gist for how to use it.
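As a rough illustration of the kind of data such a middleware could collect (this is not the gist's code and does not use Dynflow's middleware API), the sketch below wraps a message handler and records a count and total processing time per message type. The class name `ThroughputRecorder`, the handler signature, and the message type names are assumptions made for the example.

```python
import time
from collections import Counter


class ThroughputRecorder:
    """Wrap a message handler; count messages and time spent per type."""

    def __init__(self, handler, clock=time.monotonic):
        self.handler = handler
        self.clock = clock
        self.counts = Counter()
        self.seconds = Counter()

    def __call__(self, message_type, payload):
        started = self.clock()
        try:
            return self.handler(message_type, payload)
        finally:
            # Record the message even if the handler raised.
            self.counts[message_type] += 1
            self.seconds[message_type] += self.clock() - started

    def report(self):
        """Return {message_type: (count, total_seconds)}."""
        return {t: (self.counts[t], self.seconds[t]) for t in self.counts}


# Hypothetical usage with a no-op handler and made-up message types.
recorder = ThroughputRecorder(lambda kind, payload: payload)
for kind in ("event", "event", "plan"):
    recorder(kind, {})
print(recorder.report())
```

Dividing each count by the length of the observation window gives messages per second per type, which is the figure needed to decide what to cut and how many workers to run.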
Also, this BZ is partially related: https://bugzilla.redhat.com/show_bug.cgi?id=1417537
I'm marking this as a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1417537 because the descriptions overlap a lot. If you find a different place or occasion where the data does not seem to update properly, please file a new, more specific bug.
*** This bug has been marked as a duplicate of bug 1417537 ***
*** Bug 1438656 has been marked as a duplicate of this bug. ***