Description of problem:
Opening this bug to track a problem we discovered while running katello-agent at scale.
With 10K clients running katello-agent, we observed that katello-agent does not re-establish its connection to the server after qdrouter and qpid are restarted on the Satellite and Capsules, even after a long wait.
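For reference, the behaviour we expected from a client after a broker restart is roughly a retry loop with backoff. The following is only a minimal sketch of that idea, not katello-agent's actual code: `reconnect_with_backoff`, the `connect` callable and all delay values are hypothetical names and numbers chosen for illustration.

```python
import time


def reconnect_with_backoff(connect, max_attempts=8, base_delay=1.0,
                           max_delay=60.0, sleep=time.sleep):
    """Call connect() until it succeeds, doubling the wait after each failure.

    connect is a hypothetical callable that returns a connection object or
    raises ConnectionError. It stands in for whatever the client uses to
    reach the broker; it is not a real katello-agent API.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_attempts:
                # Out of attempts: surface the last error to the caller.
                raise
            sleep(delay)
            # Exponential backoff, capped so 10K clients do not wait forever.
            delay = min(delay * 2, max_delay)
```

With a loop like this, a client whose broker was restarted would be reachable again within a few attempts, without a manual restart of the agent.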
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Increase the Max open files limit for QPID and Qdrouter on Satellite and Capsules.
2. Restart Qdrouter and QPID on both the Satellite and the Capsules.
3. Wait 10-15 minutes before executing any operation.
4. Run an errata install on the hosts.
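To confirm that the raised limit from step 1 is in effect after the restart in step 2, the "Max open files" row of `/proc/<pid>/limits` can be checked for each broker process. This is a small sketch, assuming a Linux host; the helper names `max_open_files` and `read_limits` are made up for illustration, and the PID has to be looked up separately.

```python
def max_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' row of /proc/<pid>/limits.

    Values are ints, or the string 'unlimited' when the kernel reports that.
    Returns None if the row is not present in the given text.
    """
    prefix = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(prefix):
            # Remaining columns are: soft limit, hard limit, units.
            soft, hard = line[len(prefix):].split()[:2]
            return tuple(int(v) if v.isdigit() else v for v in (soft, hard))
    return None


def read_limits(pid):
    """Read the open-files limits of a running process (Linux only)."""
    with open(f"/proc/{pid}/limits") as f:
        return max_open_files(f.read())
```

Calling `read_limits(pid)` with the PID of the QPID or Qdrouter process on the Satellite and on each Capsule should show the new value for both the soft and the hard limit.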
Actual results:
The errata install job times out for the hosts, even with the accept action timeout set to 120 seconds.
Expected results:
Errata install works fine for the clients.
Additional info:
After the errata install job failed, we manually restarted katello-agent on the clients. We then ran the errata install again and it completed without problems.
One thing we noted on the client side after manually restarting katello-agent was that it logged a message about a task being dropped.