After executing a "heavy" workload, the resource_manager queue may have unacked messages in it. To reproduce you can: 1. Execute the heavy workload and let it finish. We've triggered this with a `hammer product synchronize --id xx` which would sync ~30 repos (often with no new packages) and then publish those repos. 2. List the qpid queue statistics: `qpid-stat -q --ssl-certificate=/etc/pki/katello/qpid_client_striped.crt -b amqps://localhost:5671` 3. Observe a output like the following: queue dur autoDel excl msg msgIn msgOut bytes bytesIn bytesOut cons bind ================================================================================================================================================================= <snip> resource_manager Y 19 338 319 33.4k 567k 534k 1 2 <snip> 4. Observe that all Pulp tasks have completed with state finished, and yet there are still messages in the 'msg' column. 19 in the example above. You can verify all Pulp tasks as being completed by connecting to mongo, switching to the pulp_database, and running `db.task_status.find({ state:{ $ne: "finished" } } ).count()`. If all pulp tasks are finished that query will return 0. This has been observed on both EL6 and EL7.
These unacked messages are being processed correctly, so the user should not be able to observe the problem while using Satellite. It is not expected to cause incorrect operation. Based on input from the Qpid team, and from my own experiments, this bug will not cause deadlock. Fixing this is important because it makes the possible deadlocking bug[0] more difficult to determine if a system actually is deadlocked. [0]: https://bugzilla.redhat.com/show_bug.cgi?id=1288223
The Pulp upstream bug status is at ASSIGNED. Updating the external tracker on this bug.
The Pulp upstream bug priority is at Normal. Updating the external tracker on this bug.
The Pulp upstream bug priority is at High. Updating the external tracker on this bug.
The Pulp upstream bug status is at POST. Updating the external tracker on this bug.
I've tested this upstream, and they builds completed. I then installed it on the reproducer machine and it fully resolved the issue**. Downstream should build python-kombu to be the same as what is inside upstream python-kombu-3.0.24-11.pulp. That was created with this commit[0], which I believe is the only commit that is not already applied to downstream. The downstream python-kombu should contain exactly the same source and patches as [1]. This is for 6.1.z. For 6.2.0, upstream will be building a newer version of python-kombu which will be included in an upcoming beta and will later be built in downstream. I'll ensure that version will also contain this fix. [0]: https://github.com/pulp/pulp/commit/0d2de5d5272bab25c25e392b5b887665b5f89465 [1]: https://github.com/pulp/pulp/tree/0d2de5d5272bab25c25e392b5b887665b5f89465/deps/python-kombu **: 1) 2 task fail, but it ends up in an error state and pulp reports some normal pulp error (ie: a connection error). 2) After the reproducer was run, there were 0 tasks left in the resource_manager queue (or any worker) so everything got processed. This is very different than the what happened before. 3) I verified that the resource_manager emitted a task xxx started and a corresponding task xxx stopped. None were "unrun" according to the logs. This is also different than before.
The Pulp upstream bug status is at MODIFIED. Updating the external tracker on this bug.
VERIFIED with sat6.1.7 compose1 ~]# qpid-stat -q --ssl-certificate=/etc/pki/katello/qpid_client_striped.crt -b amqps://localhost:5671 | grep -ie queue -ie resource_manager Queues queue dur autoDel excl msg msgIn msgOut bytes bytesIn bytesOut cons bind resource_manager Y 0 207 207 0 343k 343k 1 2 ----------------------------------------------------------------------------- ~]# mongo pulp_database MongoDB shell version: 2.4.9 connecting to: pulp_database Server has startup warnings: Fri Feb 5 04:36:00.647 [initandlisten] Fri Feb 5 04:36:00.647 [initandlisten] ** WARNING: Readahead for /var/lib/mongodb is set to 4096KB Fri Feb 5 04:36:00.647 [initandlisten] ** We suggest setting it to 256KB (512 sectors) or less Fri Feb 5 04:36:00.647 [initandlisten] ** http://dochub.mongodb.org/core/readahead > db.task_status.find({ state:{ $ne: "finished" } } ).count() 0 ----------------------------------------------------------------------------- task_status in finished state is 0 and msg cloumn is 0
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2016:0174
The Pulp upstream bug status is at ON_QA. Updating the external tracker on this bug.
The Pulp upstream bug status is at CLOSED - CURRENTRELEASE. Updating the external tracker on this bug.