Description of problem:
There is an exception in condor_configure_store run.

Version-Release number of selected component (if applicable):
RHEL 5.5
python-2.4.3-27.el5
condor-wallaby-tools-2.6-0.5.el5.noarch
qmf-0.7.935473-1.el5
qpid-cpp-server-0.7.935473-1.el5

How reproducible:
less than 5%

Steps to Reproduce:
1. configure condor pool by remote configuration
2. condor_configure_store --default-group -l

Actual results:
# condor_configure_store --default-group -l
Group "Internal Default Group":
Group ID: 1
Name: Internal Default Group
Members:
Features (priority: name):
  0: ExecuteNode
  1: Master
  2: NodeAccess
Parameters:
  ALLOW_WRITE = *
  QMF_BROKER_HOST = host
  CONDOR_HOST = host
  ALLOW_READ = *
Exception in thread Thread-1 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
  File "/usr/lib64/python2.4/threading.py", line 442, in __bootstrap
  File "/usr/lib64/python2.4/threading.py", line 422, in run
  File "/usr/lib/python2.4/site-packages/qpid/connection.py", line 164, in run
exceptions.AttributeError: 'NoneType' object has no attribute 'timeout'
Unhandled exception in thread started by
Error in sys.excepthook:

Original exception was:

# echo $?
0
I get a similar exception almost every time I run the config tools now with:
python-qpid-0.7.946106-2
python-qmf-0.7.946106-4

Exception in thread Thread for broker: 127.0.0.1:5672 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
  File "/usr/lib64/python2.6/threading.py", line 525, in __bootstrap_inner
  File "/usr/lib/python2.6/site-packages/qmf/console.py", line 2597, in run
  File "/usr/lib64/python2.6/Queue.py", line 165, in get
<type 'exceptions.TypeError'>: exceptions must be classes or instances, not NoneType
Exceptions like this happen when the connection and session are not closed properly before the program exits. The background threads that service the connection and sessions are still running when the interpreter exits, and because of the way Python threading interacts with interpreter shutdown, this can sometimes result in odd exceptions like the ones above. They are usually easy to spot because they say "(most likely raised during interpreter shutdown)" in the description. The fix would be to ensure that the connection and any sessions are closed properly. I'm not sure whether they are being left open due to a bug in QMF code or due to improper/incomplete use of the QMF APIs. Ted could probably answer this last question.
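To illustrate the general pattern, here is a minimal sketch using only the Python standard library. The Worker class is a hypothetical stand-in for a connection's background service thread, not the qpid or QMF implementation, and it is written for current Python rather than the 2.4/2.6 interpreters in the tracebacks. The point is that close() signals the thread and joins it, so nothing is still running when the interpreter starts tearing down modules.

```python
import queue
import threading


class Worker:
    """Hypothetical stand-in for a connection's background service thread."""

    def __init__(self):
        self._stop_event = threading.Event()
        self._inbox = queue.Queue()
        self._thread = threading.Thread(target=self._run)
        # A daemon thread is not waited for at exit, which is what makes
        # an explicit close() necessary.
        self._thread.daemon = True
        self._thread.start()

    def _run(self):
        # Poll the queue with a short timeout so the stop flag is
        # checked regularly.
        while not self._stop_event.is_set():
            try:
                self._inbox.get(timeout=0.05)
            except queue.Empty:
                continue

    def send(self, item):
        self._inbox.put(item)

    def close(self):
        # Ask the thread to stop, then wait until it has really finished.
        self._stop_event.set()
        self._thread.join()

    def is_running(self):
        return self._thread.is_alive()


if __name__ == "__main__":
    worker = Worker()
    try:
        worker.send("hello")
    finally:
        worker.close()  # without this, the thread may outlive the main program
```

If close() is skipped, the daemon thread can wake up while module globals are already being cleared, which matches the 'NoneType' errors in the tracebacks above.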
Can we get some more detail on reproducing this one, including the specific tools used or example code where the connection is closed and the error still happens? For qpid-stat against a cluster there is a known bug 547295. What other tools give this exception and in what environment/setup?
The tools I use to produce this are the condor remote configuration tools (condor_configure_pool and condor_configure_store). There were some exit points that were not calling delBroker. It seems like there should be a cleaner, less error-prone means for the session to close and delete the brokers, but that may not be possible in Python. A session.close() where the API cleans up all allocated brokers would be a step in the right direction though. The simplest means to reproduce is to write a Python program that creates a QMF session, adds a broker, then exits without calling delBroker.
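One way to cover every exit point is to wrap the broker in a context manager. This is only a sketch: connected_broker is a hypothetical helper, not part of QMF, and it assumes nothing about the session beyond the addBroker/delBroker pair discussed here. The target string "localhost" in the usage comment is likewise an assumption.

```python
from contextlib import contextmanager


@contextmanager
def connected_broker(session, target):
    """Hypothetical helper: add a broker and always delete it afterwards."""
    broker = session.addBroker(target)
    try:
        yield broker
    finally:
        # Runs on normal exit, on exceptions, and on sys.exit()
        # (which raises SystemExit), so no exit point can skip it.
        session.delBroker(broker)


# Assumed usage with the QMF console API (requires the qmf package):
#
#     from qmf.console import Session
#     session = Session()
#     with connected_broker(session, "localhost") as broker:
#         ...  # query or modify the store
```

Note that os._exit() bypasses finally blocks, so this only helps for exits that go through normal Python unwinding.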
Rob confirmed on IRC that the original issues with condor_configure_pool and condor_configure_store are now resolved.
I've tested this 100 times on RHEL 5.5 x86_64/i386 with wallaby-0.9.4-1, condor-wallaby-tools-3.4-1 and it works without exception. --> VERIFIED