Bug 547295

Summary: qpid-stat -b threading exception during shutdown 'exceptions.TypeError: 'NoneType' object is not callable'
Product: Red Hat Enterprise MRG Reporter: Frantisek Reznicek <freznice>
Component: python-qpidAssignee: Ken Giusti <kgiusti>
Status: CLOSED ERRATA QA Contact: Frantisek Reznicek <freznice>
Severity: medium Docs Contact:
Priority: medium    
Version: 1.2CC: esammons, gsim, jkolar, kgiusti, ltoscano, tross
Target Milestone: 1.3   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Invoking any command provided by the qpid-tools package caused that tool to crash upon exiting because it incorrectly disconnected from the broker. With this update, commands provided by qpid-tools disconnect cleanly from the broker, and thus do not crash.
Story Points: ---
Clone Of: Environment:
Last Closed: 2010-10-14 16:12:24 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 615883    
Bug Blocks: 591427    
Attachments:
Description Flags
The python qmf log example
none
partial fix - cleanup broker add failure handling none

Description Frantisek Reznicek 2009-12-14 10:17:15 UTC
Description of problem:

There was observed qpid-stat -b threading exception in the client shutdown:
Brokers
  broker            cluster                        uptime   conn  sess  exch  queue
  ===================================================================================
  10.34.33.62:5672  mrg_pepe_test_cluster(ACTIVE)  20m 40s     6     3     8    30
  10.34.33.63:5672  mrg_pepe_test_cluster(ACTIVE)  20m 30s     7    11     8    34
  10.34.33.64:5672  mrg_pepe_test_cluster(ACTIVE)  20m 30s     3     3     8    31
  10.34.33.65:5672  mrg_pepe_test_cluster(ACTIVE)  20m 30s     2     2     8    38
Exception in thread Thread-10 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
  File "/usr/lib/python2.4/threading.py", line 442, in __bootstrap
  File "/usr/lib/python2.4/threading.py", line 422, in run
  File "/usr/lib/python2.4/site-packages/qpid/connection.py", line 150, in run
  File "/usr/lib/python2.4/site-packages/qpid/assembler.py", line 79, in read_segment
  File "/usr/lib/python2.4/site-packages/qpid/framer.py", line 154, in read_frame
  File "/usr/lib/python2.4/site-packages/qpid/packer.py", line 29, in unpack
  File "/usr/lib/python2.4/site-packages/qpid/framer.py", line 123, in read
exceptions.TypeError: 'NoneType' object is not callable
Unhandled exception in thread started by
Error in sys.excepthook:

Original exception was:


Version-Release number of selected component (if applicable):
RHEL 5.4 i386 / x86_64 with MRG 1.2
python-2.4.3-27.el5
python-elementtree-1.2.6-5
python-iniparse-0.2.3-4.el5
python-numeric-23.7-2.2.2
python-qpid-0.5.752581-4.el5
python-sqlite-1.1.7-1.2.1
python-urlgrabber-3.1.0-5.el5
qpidc-0.5.752581-34.el5
qpidc-debuginfo-0.5.752581-34.el5
qpidc-devel-0.5.752581-34.el5
qpidc-rdma-0.5.752581-34.el5
qpidc-ssl-0.5.752581-34.el5
qpidd-0.5.752581-34.el5
qpidd-acl-0.5.752581-34.el5
qpidd-cluster-0.5.752581-34.el5
qpidd-devel-0.5.752581-34.el5
qpid-dotnet-0.4.738274-2.el5
qpidd-rdma-0.5.752581-34.el5
qpidd-ssl-0.5.752581-34.el5
qpidd-xml-0.5.752581-34.el5
qpid-java-client-0.5.751061-9.el5
qpid-java-common-0.5.751061-9.el5


How reproducible:
<5% (hard to reproduce)

Steps to Reproduce:
1. start MRG cluster (openais and qpidd as services)
2. run 'while true; do qpid-cluster;qpid-stat -b;qpid-config queues; done'
   on all nodes
3. run on one of the nodes
 'while true; do perftest --user=guest --password=guest --count=10000 -s;done'
4. watch statistic stdout of step 2 and you should see sometimes qpid-stat ending with exception
  
Actual results:
qpid-stat ends with threading exception

Expected results:
qpid-stat should end cleanly

Additional info:

 (none)

Comment 1 Gordon Sim 2010-06-30 09:49:47 UTC
In the case of a clustered broker it adds connections for each address
in the known list but does not close these down on BrokerManager.disconnect().

The following fixes this specific case:

Index: tools/src/py/qpid-stat
===================================================================
--- tools/src/py/qpid-stat (revision 959242)
+++ tools/src/py/qpid-stat (working copy)
@@ -161,6 +161,7 @@
     def Disconnect(self):
         if self.broker:
             self.qmf.delBroker(self.broker)
+        for b in self.brokers: self.qmf.delBroker(b.broker)

     def _getCluster(self):
         packages = self.qmf.getPackages()

See also bug 609035.

Comment 2 Gordon Sim 2010-06-30 16:11:20 UTC
Fixed on trunk (959353) and in release repo (http://mrg1.lab.bos.redhat.com/git/?p=qpid.git;a=commitdiff;h=ef61c70ea10aa6c564943d481bbe349f51e17c89).

Comment 3 Gordon Sim 2010-06-30 21:33:37 UTC
Previous fix introduced new bug against standalone broker, fixed by http://mrg1.lab.bos.redhat.com/git/?p=qpid.git;a=commitdiff;h=481f24643e8d2f83b736565d5ffe4cbbb5658536.

Comment 4 Gordon Sim 2010-07-02 11:54:19 UTC
*** Bug 609035 has been marked as a duplicate of this bug. ***

Comment 5 Frantisek Reznicek 2010-07-19 08:47:25 UTC
The bug is much less visible now (reproducibility lowered) but stil there, as you can see below: (It might be linked with bug 615883):


[root@mrg-qe-10 fcluster]# qpid-stat -b
Exception in thread Thread-30:
Traceback (most recent call last):
  File "/usr/lib64/python2.4/threading.py", line 442, in __bootstrap
    self.run()
  File "/usr/lib64/python2.4/threading.py", line 422, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib/python2.4/site-packages/qpid/connection.py", line 168, in run
    if self.aborted():
  File "/usr/lib/python2.4/site-packages/qmf/console.py", line 2063, in aborted
    raise Timeout("Waiting for connection to be established with broker")
Timeout: Waiting for connection to be established with broker


[1]+  Stopped                 qpid-stat -b
[root@mrg-qe-10 fcluster]# kill %1
[1]+  Terminated              qpid-stat -b
[root@mrg-qe-10 fcluster]# qpid-stat -b ; echo $?
Failed: Exception - Unknown connection exception
Traceback (most recent call last):
  File "/usr/bin/qpid-stat", line 505, in ?
    bm.display()
  File "/usr/bin/qpid-stat", line 448, in display
    b = self.qmf.addBroker(authString + host, _connTimeout)
  File "/usr/lib/python2.4/site-packages/qmf/console.py", line 622, in addBroker
    ssl = url.scheme == URL.AMQPS, connTimeout=timeout)
  File "/usr/lib/python2.4/site-packages/qmf/console.py", line 1950, in __init__
    raise self.conn_exc
Exception: Unknown connection exception
Exception in thread Thread for broker: 10.34.33.62:5672 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
  File "/usr/lib64/python2.4/threading.py", line 442, in __bootstrap
  File "/usr/lib/python2.4/site-packages/qmf/console.py", line 2546, in run
  File "/usr/lib64/python2.4/Queue.py", line 125, in get
exceptions.TypeError: 'NoneType' object is not callable
Unhandled exception in thread started by
Error in sys.excepthook:

Original exception was:
Exception in thread Thread for broker: 10.34.33.64:5672 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
  File "/usr/lib64/python2.4/threading.py", line 442, in __bootstrap
  File "/usr/lib/python2.4/site-packages/qmf/console.py", line 2546, in run
  File "/usr/lib64/python2.4/Queue.py", line 125, in get
exceptions.TypeError: 'NoneType' object is not callable
Unhandled exception in thread started by
Error in sys.excepthook:

Original exception was:
Exception in thread Thread for broker: 10.34.33.63:5672 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
  File "/usr/lib64/python2.4/threading.py", line 442, in __bootstrap
  File "/usr/lib/python2.4/site-packages/qmf/console.py", line 2546, in run
  File "/usr/lib64/python2.4/Queue.py", line 125, in get
exceptions.TypeError: 'NoneType' object is not callable
Unhandled exception in thread started by
Error in sys.excepthook:

Original exception was:
1

The above issue was seen on RHEL5.5 x86_64 while testing persistent cluster on following packages:
[root@mrg-qe-10 fcluster]# rpm -qa | grep -E '(qpid|qmf)' | sort
python-qmf-0.7.946106-6.el5
python-qpid-0.7.946106-9.el5
qmf-0.7.946106-7.el5
qmf-devel-0.7.946106-7.el5
qpid-cpp-client*-0.7.946106-7.el5
qpid-cpp-mrg-debuginfo-0.7.946106-7.el5
qpid-cpp-server*-0.7.946106-7.el5
qpid-java-client-0.7.946106-6.el5
qpid-java-common-0.7.946106-6.el5
qpid-tools-0.7.946106-7.el5
rh-qpid-cpp-tests-0.7.946106-7.el5
ruby-qmf-0.7.946106-7.el5

-> ASSIGNED

Comment 6 Gordon Sim 2010-07-19 09:24:09 UTC
This occurrence is due to the initial exception (i.e. bug 615883). Unless it can be reproduced without any previous error being signalled I would leave it as modified and mark it dependent on 615883 for verification.

Comment 7 Gordon Sim 2010-07-20 17:57:54 UTC
Verification depends on fix for bug 615883, but otherwise this is believed fixed.

Comment 8 Frantisek Reznicek 2010-08-09 11:37:32 UTC
Retested on:
  python-qmf-0.7.946106-8.el5
  python-qpid-0.7.946106-11.el5
  qmf*-0.7.946106-11.el5
  qpid-cpp-*-0.7.946106-11.el5
  qpid-java-*-0.7.946106-7.el5
  qpid-tools-0.7.946106-8.el5

Scenario 4 node broker, running qpid-stat -b on current localhost and all remote cluster node[s].

I can reproduce such the case as well:

Traceback (most recent call last):
  File "/usr/bin/qpid-stat", line 522, in ?
    bm.display()
  File "/usr/bin/qpid-stat", line 455, in display
    b = self.qmf.addBroker(authString + host, _connTimeout)
  File "/usr/lib/python2.4/site-packages/qmf/console.py", line 622, in addBroker
    ssl = url.scheme == URL.AMQPS, connTimeout=timeout)
  File "/usr/lib/python2.4/site-packages/qmf/console.py", line 2026, in __init__
    raise self.conn_exc
qpid.exceptions.Timeout
Exception in thread Thread for broker: 10.34.33.62:5672 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
  File "/usr/lib64/python2.4/threading.py", line 442, in __bootstrap
  File "/usr/lib/python2.4/site-packages/qmf/console.py", line 2615, in run
  File "/usr/lib64/python2.4/Queue.py", line 125, in get
exceptions.TypeError: 'NoneType' object is not callable
Exception in thread Thread-30 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
  File "/usr/lib64/python2.4/threading.py", line 442, in __bootstrap
  File "/usr/lib64/python2.4/threading.py", line 422, in run
  File "/usr/lib/python2.4/site-packages/qpid/connection.py", line 167, in run
exceptions.AttributeError: 'NoneType' object has no attribute 'timeout'
Exception in thread Thread for broker: 10.34.33.64:5672 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
  File "/usr/lib64/python2.4/threading.py", line 442, in __bootstrap
  File "/usr/lib/python2.4/site-packages/qmf/console.py", line 2615, in run
  File "/usr/lib64/python2.4/Queue.py", line 125, in get
exceptions.TypeError: 'NoneType' object is not callable
Unhandled exception in thread started by
Error in sys.excepthook:

Original exception was:


There is question which reminds:
Is it acceptable to see such backtrace when python qmf console connection to broker timeouts?

An comparation with similar situation and qpid-config:
Total Exchanges: 8
          topic: 3
        headers: 1
         fanout: 1
         direct: 3

   Total Queues: 2418
        durable: 44
    non-durable: 2374
Failed: Timeout -


Ideally I would like to get exception about timeout, but do not see TypeError: 'NoneType' object is not callable because it might be read by customers are logical problem in the code.

The situation when python qmf console/clients are timeouting is very easy to trigger in my view. (Running cluster, two perftests, one tsxtest and looping qpid-config;qpid-stat -b;qpid-cluster)

Any thoughts on the questions?

Comment 9 Frantisek Reznicek 2010-08-09 13:57:16 UTC
Created attachment 437602 [details]
The python qmf log example

The attached qpid-config / qpid-cluster / qpid-stat -b log from info thread shows at line 635 the case when NoneType error occurs and for instance at the line 3055 how it should look like.

Could you possibly check attachment to doublecheck whether you share my understanding of the issue.

Keeping NEEDINFO.

Comment 10 Ken Giusti 2010-08-11 17:56:09 UTC
Created attachment 438248 [details]
partial fix - cleanup broker add failure handling

The problem is due to a lack of resource cleanup during exceptions.   The "NoneType" errors are being thrown by threads which have not been cleanly shutdown by the exception handler(s) before the python interpreter shuts down due to the exception.

To fully fix this would require refactoring some of the console.py code (see attached - probably incomplete), but also updating the tools code - such as qpid-stat - to find & fix or add exception handlers for addBroker/delBroker cleanup.

I think this may be too risky for 1.3 - especially as it concerns a failure case rather than broken functionality.  I'd recommend pushing it off to 1.3.1.  Marking it as a blocker so PM can weigh in.

Comment 11 Luigi Toscano 2010-08-17 12:08:31 UTC
I am not sure if this is the same issue, but when I run 
qpid-config foo
where foo is any string which is not a valid parameter, the usual "Usage:" message is printed as expected but after that I can see this exception:

-----------------------------------------------
Failed: SystemExit: 1
Exception in thread Thread for broker: localhost:5672 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
  File "/usr/lib64/python2.4/threading.py", line 442, in __bootstrap
  File "/usr/lib/python2.4/site-packages/qmf/console.py", line 2614, in run
  File "/usr/lib64/python2.4/Queue.py", line 125, in get
exceptions.TypeError: 'NoneType' object is not callable
Unhandled exception in thread started by 
Error in sys.excepthook:

Original exception was:
-----------------------------------------------

Packages:
qpid-cpp-server-ssl-0.7.946106-12.el5
qpid-cpp-client-ssl-0.7.946106-12.el5
qpid-cpp-mrg-debuginfo-0.7.946106-12.el5
qpid-cpp-server-0.7.946106-12.el5
qpid-cpp-client-devel-0.7.946106-12.el5
qpid-cpp-server-store-0.7.946106-12.el5
python-qpid-0.7.946106-12.el5
qpid-cpp-server-cluster-0.7.946106-12.el5
qpid-cpp-server-devel-0.7.946106-12.el5
qpid-tools-0.7.946106-8.el5
qpid-cpp-client-0.7.946106-12.el5
qpid-cpp-server-xml-0.7.946106-12.el5
python-qmf-0.7.946106-9.el5
qmf-devel-0.7.946106-12.el5
qmf-0.7.946106-12.el5


Is it the expected behaviour with this partial fix?

Comment 12 Ken Giusti 2010-08-19 22:25:30 UTC
I've commited a series of changes that tightens up the qmf shutdown code, esp. during the handling of exceptions by the qpid-tools:



Committed revision 987330: http://svn.apache.org/viewvc?view=revision&revision=987330

I believe this will fix Luigi's problem as described in comment #11.

I am keeping the merge-blocker flag ? and setting the state to "assigned", as there was some concern among the group as to whether this fix should be considered a blocker for 1.3.

Comment 13 Gordon Sim 2010-08-20 09:18:07 UTC
I don't see this as a *blocker* (its not critical and its not a regression), but as the patch is there, it improves perceptions and appears to be low risk, I could certainly agree to including it in 1.3.

Comment 14 Ken Giusti 2010-08-20 20:26:21 UTC
Merged into 1.3:

http://mrg1.lab.bos.redhat.com/git/?p=qpid.git;a=commitdiff;h=d496644d743bf016c5204374b538cc3632e93879

Respun packages:

qpid-tools-0.7.946106-9
python-qmf-0_7_946106-11

Comment 16 Frantisek Reznicek 2010-09-24 13:02:58 UTC
The issue has been fixed, no such exceptions are detected anymore, tested on RHEL 4.8 / 5.5 i386 / x86_64 on packages:
python-qpid-0.7.946106-14.el5
python-qmf-0.7.946106-13.el5


-> VERIFIED

Comment 17 Ken Giusti 2010-10-05 15:17:18 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
* Cause:  Invoking any of the commands from the qpid-tool package.
* Consequence:  The tools would not correctly disconnect from the broker, resulting in a crash on exit from the tool.
* Fix:  Tools now cleanly disconnect from the remote broker on exit.
* Result: The tools will no longer crash on exit.

Comment 18 Douglas Silas 2010-10-05 18:42:31 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1,4 +1 @@
-* Cause:  Invoking any of the commands from the qpid-tool package.
+Invoking any command provided by the qpid-tools package caused that tool to crash upon exiting because it incorrectly disconnected from the broker. With this update, commands provided by qpid-tools disconnect cleanly from the broker, and thus do not crash.-* Consequence:  The tools would not correctly disconnect from the broker, resulting in a crash on exit from the tool.
-* Fix:  Tools now cleanly disconnect from the remote broker on exit.
-* Result: The tools will no longer crash on exit.

Comment 20 errata-xmlrpc 2010-10-14 16:12:24 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0773.html