Bug 1292026 - goferd stuck in connection close, causing no connection to qdrouterd is made
Summary: goferd stuck in connection close, causing no connection to qdrouterd is made
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: katello-agent
Version: 6.1.4
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: Unspecified
Assignee: Katello Bug Bin
QA Contact: Katello QA List
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-12-16 09:25 UTC by Pavel Moravec
Modified: 2015-12-16 11:25 UTC (History)
0 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-12-16 11:25:05 UTC
Target Upstream Version:
Embargoed:



Description Pavel Moravec 2015-12-16 09:25:46 UTC
Description of problem:
Under unclear circumstances (only a partially working reproducer so far), goferd fails to close TCP connections to qdrouterd and gets stuck in "connection.close()".

As a result, no TCP connection to qdrouterd is established, so katello-agent functionality is lost.

Relevant backtraces:

 for file /usr/lib64/python2.7/site-packages/proton/__init__.py, line 2458
 for file /usr/lib64/python2.7/site-packages/proton/utils.py, line 219
 for file /usr/lib64/python2.7/site-packages/proton/utils.py, line 234
 for file /usr/lib64/python2.7/site-packages/proton/utils.py, line 220
 for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/proton/connection.py, line 152
 for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/proton/consumer.py, line 80
 for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/proton/reliability.py, line 43
 for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/model.py, line 614
 for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/model.py, line 39
 for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/model.py, line 648
 for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/model.py, line 39
 for file /usr/lib/python2.7/site-packages/gofer/messaging/consumer.py, line 88
 for file /usr/lib/python2.7/site-packages/gofer/messaging/consumer.py, line 58
 for file /usr/lib/python2.7/site-packages/gofer/common.py, line 267
 for file /usr/lib64/python2.7/threading.py, line 811
 for file /usr/lib64/python2.7/threading.py, line 784


 for file /usr/lib64/python2.7/site-packages/proton/__init__.py, line 2458
 for file /usr/lib64/python2.7/site-packages/proton/utils.py, line 219
 for file /usr/lib64/python2.7/site-packages/proton/utils.py, line 234
 for file /usr/lib64/python2.7/site-packages/proton/utils.py, line 220
 for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/proton/connection.py, line 152
 for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/proton/producer.py, line 79
 for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/proton/reliability.py, line 43
 for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/model.py, line 842
 for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/model.py, line 39
 for file /usr/lib/python2.7/site-packages/gofer/agent/rmi.py, line 266
 for file /usr/lib/gofer/plugins/katelloplugin.py, line 228
 for file /usr/lib/python2.7/site-packages/pulp_rpm/handlers/rpm.py, line 61
 for file /usr/lib/python2.7/site-packages/pulp_rpm/handlers/rpmtools.py, line 341
 for file /usr/lib/python2.7/site-packages/pulp_rpm/handlers/rpmtools.py, line 324
 for file /usr/lib/python2.7/site-packages/pulp_rpm/handlers/rpmtools.py, line 397
 for file /usr/lib/python2.7/site-packages/yum/__init__.py, line 6472
 for file /usr/lib/python2.7/site-packages/pulp_rpm/handlers/rpmtools.py, line 602
 for file /usr/lib/python2.7/site-packages/pulp_rpm/handlers/rpmtools.py, line 159
 for file /usr/lib/python2.7/site-packages/pulp_rpm/handlers/rpm.py, line 100
 for file /usr/lib/python2.7/site-packages/pulp/agent/lib/dispatcher.py, line 76
 for file /usr/lib/gofer/plugins/katelloplugin.py, line 372
 for file /usr/lib/python2.7/site-packages/gofer/rmi/dispatcher.py, line 454
 for file /usr/lib/python2.7/site-packages/gofer/rmi/dispatcher.py, line 634
 for file /usr/lib/python2.7/site-packages/gofer/agent/plugin.py, line 381
 for file /usr/lib/python2.7/site-packages/gofer/agent/rmi.py, line 85
 for file /usr/lib/python2.7/site-packages/gofer/threadpool.py, line 138
 for file /usr/lib/python2.7/site-packages/gofer/threadpool.py, line 65
 for file /usr/lib/python2.7/site-packages/gofer/common.py, line 267
 for file /usr/lib64/python2.7/threading.py, line 811
 for file /usr/lib64/python2.7/threading.py, line 784


Version-Release number of selected component (if applicable):
gofer-2.6.6-2.el7sat.noarch
python-qpid-proton-0.9-7.el7.x86_64


How reproducible:
???


Steps to Reproduce:
(the reproducer below might not work every time - let me know if a better one is needed)
1. In a loop, install and uninstall a package from Satellite on a content host - just to keep goferd busy
2. Try to "freeze" the qdrouterd that goferd connects to, per bz1281947
3. From time to time, check the connections with netstat
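
Step 3 can be scripted. The following is a minimal sketch, not part of the original reproducer: it assumes `netstat -tn` is available on the content host and that goferd talks to qdrouterd on port 5648 (the port in the goferd log line quoted in this report). The function names and the sample output are hypothetical and used only for illustration.

```python
import subprocess


def established_to_port(netstat_output, port):
    """Return the ESTABLISHED TCP rows whose foreign address ends in :port."""
    matches = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows of `netstat -tn` look like:
        # Proto Recv-Q Send-Q Local-Address Foreign-Address State
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue  # skip headers and non-TCP rows
        foreign, state = fields[4], fields[5]
        if state == "ESTABLISHED" and foreign.rsplit(":", 1)[-1] == str(port):
            matches.append(line.strip())
    return matches


def check_live(port=5648):
    """Run netstat on this host and return matching rows (assumes netstat exists)."""
    output = subprocess.check_output(["netstat", "-tn"]).decode()
    return established_to_port(output, port)


# Made-up sample output (documentation IP addresses), not captured from the bug.
SAMPLE = """\
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 192.0.2.10:22           192.0.2.1:51514         ESTABLISHED
tcp        0      0 192.0.2.10:40112        192.0.2.20:5648         CLOSE_WAIT
"""

if __name__ == "__main__":
    # An empty list means goferd currently has no established connection.
    print(established_to_port(SAMPLE, 5648))
```

Calling `check_live()` periodically while the install/uninstall loop runs and watching for an empty result is one way to notice when the connection has gone missing.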


Actual results:
At a random time, the last message goferd logged was:

Dec 16 09:45:33 pmoravec-rhel7 goferd: [INFO][worker-0] root:505 - connected to toledo-capsule.gsslab.brq.redhat.com:5648

but there is no TCP connection to that host, and no "root:525 - Disconnected" or "closed: .." message appears in the log.


Expected results:
goferd keeps an active connection whenever qdrouterd is running (and goferd is not in the middle of reconnect attempts)


Additional info:
Will provide coredumps and a tcpdump of the non-SSL communication shortly

Comment 1 Pavel Moravec 2015-12-16 09:37:53 UTC
Another backtrace of gofer (rather proton reactor) stuck in connection.close():

(reactor.py line 142 or 143)
 for file /usr/lib64/python2.7/site-packages/proton/reactor.py, line 142
 for file /usr/lib64/python2.7/site-packages/proton/utils.py, line 235
 for file /usr/lib64/python2.7/site-packages/proton/utils.py, line 220
 for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/proton/connection.py, line 152
 for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/proton/model.py, line 156
 for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/proton/reliability.py, line 43
 for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/proton/model.py, line 293
 for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/model.py, line 365
 for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/model.py, line 39
 for file /usr/lib/python2.7/site-packages/gofer/agent/plugin.py, line 488
 for file /usr/lib/python2.7/site-packages/gofer/common.py, line 267
 for file /usr/lib/python2.7/site-packages/gofer/agent/plugin.py, line 316
 for file /usr/lib/python2.7/site-packages/gofer/common.py, line 228
 for file /usr/lib/python2.7/site-packages/gofer/agent/plugin.py, line 51
 for file /usr/lib/python2.7/site-packages/gofer/threadpool.py, line 138
 for file /usr/lib/python2.7/site-packages/gofer/threadpool.py, line 65
 for file /usr/lib/python2.7/site-packages/gofer/common.py, line 267
 for file /usr/lib64/python2.7/threading.py, line 811
 for file /usr/lib64/python2.7/threading.py, line 784

Comment 4 Pavel Moravec 2015-12-16 11:19:26 UTC
After investigating the tcpdump, this turns out to be expected behaviour of goferd / the qpid proton reactor. If qdrouterd does not respond to the AMQP close frame, the gofer/proton reactor keeps waiting for it.
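
The waiting behaviour can be illustrated with a toy model. This is a sketch only and does not use proton's actual API: `ToyConnection`, `on_remote_close` and `PeerNeverAnswered` are hypothetical names, and the assumption that the wait is unbounded unless a timeout is supplied is inferred from the backtraces in this report.

```python
import threading


class PeerNeverAnswered(Exception):
    """Raised when the peer does not confirm the close in time."""


class ToyConnection(object):
    """Toy model of a blocking close handshake (not proton's API)."""

    def __init__(self):
        self._remote_closed = threading.Event()

    def on_remote_close(self):
        # Would be invoked when the peer's close frame arrives.
        self._remote_closed.set()

    def close(self, timeout=None):
        # Sending the local close frame is omitted here.
        # Block until the peer confirms; timeout=None waits forever,
        # which is the situation when the peer is frozen.
        if not self._remote_closed.wait(timeout):
            raise PeerNeverAnswered(
                "no close frame from peer within %s s" % timeout)


if __name__ == "__main__":
    frozen_peer = ToyConnection()
    try:
        frozen_peer.close(timeout=0.1)  # peer never answers
    except PeerNeverAnswered as exc:
        print("gave up: %s" % exc)
```

With `timeout=None` and a peer that never answers, `close()` in this model never returns, which mirrors the threads seen stuck in the backtraces.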

This BZ is caused solely by bz1281947, so I am closing it.

