Description of problem:
If you suspend the qpidd process (simulating high qpid load), the candlepin event listener process hangs in its event loop. This prevents the dynflow executor from proceeding.

For example:
* On the tasks page, view the candlepin task process. It should say something like:
<pre>
{"messages"=>"b190e333-7821-302a-8131-9693e66e2144", "last_message"=>"b190e333-7821-302a-8131-9693e66e2144 - import.created", "error"=>nil, "connection"=>"Connected"}
</pre>
* Now freeze the qpidd process: kill -19 `pidof qpidd` (signal 19 is SIGSTOP on x86 Linux). Note that the candlepin event listener still thinks it's connected.
* Run "hammer ping". It will hang due to https://pulp.plan.io/issues/2253.
* Run "foreman-rake console" and then Katello::Ping.ping(services: [:foreman_tasks]). Note that the executor fails to respond.

Once qpidd is unsuspended via kill -18 (SIGCONT), things run normally again.

Version-Release number of selected component (if applicable): 6.2
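The observation that the listener still reports "Connected" while qpidd is frozen can be illustrated without qpid at all. The following is a minimal sketch, assuming a Linux host: a small echo server stands in for qpidd (the `ECHO_SERVER`, `wait_for_echo`, and `run_demo` names are hypothetical and not part of Katello or qpid). It shows that a SIGSTOPped peer keeps its TCP connection open, so a client sees a stall rather than an error.

```python
import os
import signal
import socket
import subprocess
import sys

# Stand-in for qpidd (hypothetical): a one-connection TCP echo server.
ECHO_SERVER = """
import socket
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
print(srv.getsockname()[1], flush=True)
conn, _ = srv.accept()
while True:
    data = conn.recv(1024)
    if not data:
        break
    conn.sendall(data)
"""


def wait_for_echo(sock, expected, timeout):
    """Return 'ok' if `expected` arrives within `timeout`, else 'timeout' or 'error'."""
    sock.settimeout(timeout)
    try:
        return "ok" if sock.recv(1024) == expected else "error"
    except socket.timeout:
        return "timeout"
    except OSError:
        return "error"


def run_demo():
    """Probe the echo server before, during, and after a SIGSTOP/SIGCONT cycle."""
    child = subprocess.Popen([sys.executable, "-c", ECHO_SERVER],
                             stdout=subprocess.PIPE, text=True)
    sock = None
    try:
        port = int(child.stdout.readline())
        sock = socket.create_connection(("127.0.0.1", port), timeout=5)

        sock.sendall(b"ping1")
        before = wait_for_echo(sock, b"ping1", 2.0)

        os.kill(child.pid, signal.SIGSTOP)  # like `kill -19` on x86 Linux
        sock.sendall(b"ping2")              # still succeeds: the kernel buffers the bytes
        frozen = wait_for_echo(sock, b"ping2", 1.0)

        os.kill(child.pid, signal.SIGCONT)  # like `kill -18` on x86 Linux
        after = wait_for_echo(sock, b"ping2", 5.0)  # the delayed echo arrives
        return before, frozen, after
    finally:
        if sock is not None:
            sock.close()
        child.kill()
        child.wait()
        child.stdout.close()


if __name__ == "__main__":
    print(run_demo())
```

The middle probe should time out rather than fail, which is consistent with the task still showing "connection"=>"Connected". Whether the real listener would ever notice depends on the qpid client's heartbeat and timeout settings, which this sketch does not model.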
This is a bug that jhutar originally found a few months ago. If qpid slows down, the Katello connection to qpid may eventually terminate after minutes or hours, but Katello does not notice. Qpid becomes responsive again, but the katello_event_queue then keeps filling as candlepin puts more events on it, because nothing is consuming them. I have not reproduced this, since it can take some time, but I believe that is what happened. The workaround is to restart foreman_tasks if the katello_event_queue appears not to be draining.
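One way to make "appears not to be draining" concrete is to compare successive depth readings of the katello_event_queue. The sketch below is only an illustration under assumptions: the `queue_is_stalled` helper and its `min_samples` parameter are hypothetical, and how the depth samples are collected (for example from qpid's own tooling) is left open.

```python
def queue_is_stalled(depth_samples, min_samples=3):
    """Guess whether a queue has stopped draining.

    depth_samples: message counts for the queue, oldest first.
    Returns True when the last `min_samples` readings never decrease
    and the queue is non-empty. Returns False when there is too little data.
    """
    if len(depth_samples) < min_samples:
        return False
    recent = depth_samples[-min_samples:]
    never_decreased = all(later >= earlier
                          for earlier, later in zip(recent, recent[1:]))
    return never_decreased and recent[-1] > 0


if __name__ == "__main__":
    # Depth keeps growing: candidate for restarting foreman_tasks.
    print(queue_is_stalled([5, 9, 14]))
    # Depth is shrinking: the listener is consuming events.
    print(queue_is_stalled([14, 9, 5]))
```

This is a heuristic, not a diagnosis: a queue that is consumed exactly as fast as it fills would also look flat, so the sampling interval and `min_samples` need to be chosen with the expected event rate in mind.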
Upstream bug component is Content Management
I think this is fixed. I have not seen it in some time, either on my own machines or on other machines. Marking as closed/worksforme.