Description of problem:
Under unclear circumstances (only a partially working reproducer so far), goferd fails to close TCP connections to qdrouterd and gets stuck in "connection.close()". As a result, no TCP connection to qdrouterd is established, hence katello-agent functionality is lost.

Relevant backtraces:

for file /usr/lib64/python2.7/site-packages/proton/__init__.py, line 2458
for file /usr/lib64/python2.7/site-packages/proton/utils.py, line 219
for file /usr/lib64/python2.7/site-packages/proton/utils.py, line 234
for file /usr/lib64/python2.7/site-packages/proton/utils.py, line 220
for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/proton/connection.py, line 152
for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/proton/consumer.py, line 80
for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/proton/reliability.py, line 43
for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/model.py, line 614
for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/model.py, line 39
for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/model.py, line 648
for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/model.py, line 39
for file /usr/lib/python2.7/site-packages/gofer/messaging/consumer.py, line 88
for file /usr/lib/python2.7/site-packages/gofer/messaging/consumer.py, line 58
for file /usr/lib/python2.7/site-packages/gofer/common.py, line 267
for file /usr/lib64/python2.7/threading.py, line 811
for file /usr/lib64/python2.7/threading.py, line 784

for file /usr/lib64/python2.7/site-packages/proton/__init__.py, line 2458
for file /usr/lib64/python2.7/site-packages/proton/utils.py, line 219
for file /usr/lib64/python2.7/site-packages/proton/utils.py, line 234
for file /usr/lib64/python2.7/site-packages/proton/utils.py, line 220
for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/proton/connection.py, line 152
for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/proton/producer.py, line 79
for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/proton/reliability.py, line 43
for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/model.py, line 842
for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/model.py, line 39
for file /usr/lib/python2.7/site-packages/gofer/agent/rmi.py, line 266
for file /usr/lib/gofer/plugins/katelloplugin.py, line 228
for file /usr/lib/python2.7/site-packages/pulp_rpm/handlers/rpm.py, line 61
for file /usr/lib/python2.7/site-packages/pulp_rpm/handlers/rpmtools.py, line 341
for file /usr/lib/python2.7/site-packages/pulp_rpm/handlers/rpmtools.py, line 324
for file /usr/lib/python2.7/site-packages/pulp_rpm/handlers/rpmtools.py, line 397
for file /usr/lib/python2.7/site-packages/yum/__init__.py, line 6472
for file /usr/lib/python2.7/site-packages/pulp_rpm/handlers/rpmtools.py, line 602
for file /usr/lib/python2.7/site-packages/pulp_rpm/handlers/rpmtools.py, line 159
for file /usr/lib/python2.7/site-packages/pulp_rpm/handlers/rpm.py, line 100
for file /usr/lib/python2.7/site-packages/pulp/agent/lib/dispatcher.py, line 76
for file /usr/lib/gofer/plugins/katelloplugin.py, line 372
for file /usr/lib/python2.7/site-packages/gofer/rmi/dispatcher.py, line 454
for file /usr/lib/python2.7/site-packages/gofer/rmi/dispatcher.py, line 634
for file /usr/lib/python2.7/site-packages/gofer/agent/plugin.py, line 381
for file /usr/lib/python2.7/site-packages/gofer/agent/rmi.py, line 85
for file /usr/lib/python2.7/site-packages/gofer/threadpool.py, line 138
for file /usr/lib/python2.7/site-packages/gofer/threadpool.py, line 65
for file /usr/lib/python2.7/site-packages/gofer/common.py, line 267
for file /usr/lib64/python2.7/threading.py, line 811
for file /usr/lib64/python2.7/threading.py, line 784

Version-Release number of selected component (if applicable):
gofer-2.6.6-2.el7sat.noarch
python-qpid-proton-0.9-7.el7.x86_64

How reproducible:
???

Steps to Reproduce:
(the reproducer below might not work every time; let me know if a better one is needed)
1. In a loop, install and uninstall a package from Satellite to a content host, just to keep goferd busy
2. Try to "freeze" the qdrouterd that goferd connects to, per bz1281947
3. From time to time, check the connections with netstat

Actual results:
At a random time, the last message goferd logged was:

Dec 16 09:45:33 pmoravec-rhel7 goferd: [INFO][worker-0] root:505 - connected to toledo-capsule.gsslab.brq.redhat.com:5648

but there is no TCP connection to that address, and no "root:525 - Disconnected" or "closed: .." log appears.

Expected results:
goferd keeps an active connection whenever qdrouterd is running (and goferd is not in the middle of reconnect attempts)

Additional info:
Will provide coredumps and a tcpdump of the non-SSL communication shortly.
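
For step 3 of the reproducer, a small helper can make the netstat check less error-prone. This is only a sketch: the function name established_to_port and the sample addresses (documentation-range IPs) are hypothetical, and it assumes the usual "netstat -tn" column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State). Port 5648 is the one from the goferd log message in this report.

```python
def established_to_port(netstat_output, port):
    """Return foreign endpoints of ESTABLISHED TCP connections to `port`.

    Assumes `netstat -tn` style output; the column layout is an assumption.
    """
    suffix = ":%d" % port
    matches = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a TCP row.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        # fields[4] is the foreign address, fields[5] the socket state.
        if fields[4].endswith(suffix) and fields[5] == "ESTABLISHED":
            matches.append(fields[4])
    return matches


# Hypothetical sample output, for illustration only.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 192.0.2.10:41234        192.0.2.20:5648         ESTABLISHED
tcp        0      0 192.0.2.10:22           192.0.2.30:50514        ESTABLISHED
tcp        0      0 192.0.2.10:41240        192.0.2.20:5648         CLOSE_WAIT
"""

if __name__ == "__main__":
    print(established_to_port(SAMPLE, 5648))
```

An empty result while goferd still reports "connected to ...:5648" as its last log line would correspond to the state described under Actual results.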
Another backtrace of gofer (or rather the proton reactor) stuck in connection.close() (reactor.py line 142 or 143):

for file /usr/lib64/python2.7/site-packages/proton/reactor.py, line 142
for file /usr/lib64/python2.7/site-packages/proton/utils.py, line 235
for file /usr/lib64/python2.7/site-packages/proton/utils.py, line 220
for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/proton/connection.py, line 152
for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/proton/model.py, line 156
for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/proton/reliability.py, line 43
for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/proton/model.py, line 293
for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/model.py, line 365
for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/model.py, line 39
for file /usr/lib/python2.7/site-packages/gofer/agent/plugin.py, line 488
for file /usr/lib/python2.7/site-packages/gofer/common.py, line 267
for file /usr/lib/python2.7/site-packages/gofer/agent/plugin.py, line 316
for file /usr/lib/python2.7/site-packages/gofer/common.py, line 228
for file /usr/lib/python2.7/site-packages/gofer/agent/plugin.py, line 51
for file /usr/lib/python2.7/site-packages/gofer/threadpool.py, line 138
for file /usr/lib/python2.7/site-packages/gofer/threadpool.py, line 65
for file /usr/lib/python2.7/site-packages/gofer/common.py, line 267
for file /usr/lib64/python2.7/threading.py, line 811
for file /usr/lib64/python2.7/threading.py, line 784
Having investigated the tcpdump, I conclude this is expected behaviour of goferd / the qpid proton reactor. If qdrouterd does not respond to the AMQP close frame, the gofer/proton reactor keeps waiting for it. This BZ is caused solely by bz1281947, so I am closing it.
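
To illustrate why the thread hangs, here is a toy model of a blocking close that waits for the peer's close frame. It is a sketch under stated assumptions, not the proton or gofer code: the class SimulatedConnection and its methods are hypothetical, and a threading.Event stands in for the close frame arriving from qdrouterd.

```python
import threading


class SimulatedConnection(object):
    """Toy stand-in for an AMQP connection (hypothetical, not the proton API)."""

    def __init__(self):
        # Set when the peer's close frame "arrives".
        self._peer_closed = threading.Event()

    def peer_sends_close(self):
        """Simulate the remote side answering with its own close frame."""
        self._peer_closed.set()

    def close(self, timeout=None):
        """Send our close (omitted here), then wait for the peer's close.

        With timeout=None this blocks forever if the peer never answers,
        which mirrors the hang seen in the backtraces. Returns True if
        the peer's close was seen, False if the wait timed out.
        """
        return self._peer_closed.wait(timeout)


if __name__ == "__main__":
    conn = SimulatedConnection()
    # A "frozen" peer never answers: only the timeout gets us out.
    print(conn.close(timeout=0.1))
    # A responsive peer answers, so close() returns promptly.
    conn.peer_sends_close()
    print(conn.close(timeout=0.1))
```

In this model, a frozen peer combined with no timeout leaves close() blocked indefinitely, which is consistent with the root cause being on the qdrouterd side (bz1281947) rather than in goferd.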