Description of problem: After configuring a broker for ssl communication on port 5671, Python clients that connect (like cumin and qpid-tool) are disconnected a short time later (on the order of tens of seconds). Version-Release number of selected component (if applicable): # rpm -qa | grep -i qpid qpid-cpp-client-ssl-0.14-5.el5 qpid-qmf-0.14-2.el5 qpid-cpp-server-0.14-5.el5 qpid-cpp-server-ssl-0.14-5.el5 python-qpid-0.14-3.el5 qpid-tools-0.14-1.el5 qpid-cpp-client-0.14-5.el5 python-qpid-qmf-0.14-2.el5 # rpm -qa | grep -i saslwrapper python-saslwrapper-0.10-4.el5 saslwrapper-0.10-4.el5 How reproducible: 100% Steps to Reproduce: 1. Set up a broker for ssl 2. Run qpid-tool amqps://localhost 3. Run qpid-stat -c Actual results: The qpid-tool connection will show initially in qpid-stat output (and print statements in the brokerConnected callback in qpid-tool will print). Shortly thereafter, qpid-tool will show "Broker disconnected" and qpid-stat -c will no longer show the connection. Expected results: The connection should persist as long as qpid-tool is running. Additional info:
I used following reproduction scenario: # yum install -y -q qpid-cpp-server qpid-tools nss-tools ecode=0 # echo password > /var/lib/qpidd/passwordfile ecode=0 # chown qpidd:qpidd /var/lib/qpidd/passwordfile ecode=0 # chmod 600 /var/lib/qpidd/passwordfile ecode=0 # cd /var/lib/qpidd ecode=0 # sudo -u qpidd certutil -N -d . -f passwordfile ecode=0 # sudo -u qpidd certutil -S -d . -f passwordfile -n nickname -s "CN=nickname" -t "CT,," -x -z /usr/bin/certutil Generating key. This may take a few moments... ecode=0 # grep --invert-match ^# /etc/qpidd.conf cluster-mechanism=ANONYMOUS auth=yes ssl-cert-password-file=/var/lib/qpidd/passwordfile ssl-cert-db=/var/lib/qpidd ssl-cert-name=nickname ecode=0 # yum install -y -q qpid-cpp-server-ssl ecode=0 # service qpidd restart Stopping Qpid AMQP daemon: [FAILED] Starting Qpid AMQP daemon: [ OK ] ecode=0 # tail /var/log/messages | grep qpidd | cut --delimiter=']' --fields=2 : 2012-02-15 09:44:55 notice Listening on TCP port 5672 : 2012-02-15 09:44:55 notice Listening for SSL connections on TCP port 5671 : 2012-02-15 09:44:55 notice Broker running ecode=0 Following debug messages are produced by qpidd: 2012-02-15 10:10:46 info SASL: Mechanism list: PLAIN LOGIN ANONYMOUS 2012-02-15 10:10:46 info 127.0.0.1:5671-127.0.0.1:44095 SASL: Authentication succeeded for: anonymous@QPID 2012-02-15 10:10:46 info Queue "reply-[hostname].1683.1": Policy created: type=reject; maxCount=0; maxSize=104857600 2012-02-15 10:10:46 info Queue "reply-[hostname].1683.1": Flow limit created: flowStopCount=0, flowResumeCount=0, flowStopSize=83886080, flowResumeSize=73400320 2012-02-15 10:10:46 info Queue "topic-[hostname].1683.1": Policy created: type=ring; maxCount=0; maxSize=104857600 2012-02-15 10:10:46 info Queue "qmfc-v2-[hostname].1683.1": Policy created: type=reject; maxCount=0; maxSize=104857600 2012-02-15 10:10:46 info Queue "qmfc-v2-[hostname].1683.1": Flow limit created: flowStopCount=0, flowResumeCount=0, flowStopSize=83886080, flowResumeSize=73400320 2012-02-15 10:10:46 info Queue "qmfc-v2-ui-[hostname].1683.1": Policy created: type=ring; maxCount=0; maxSize=104857600 2012-02-15 10:10:46 info Queue "qmfc-v2-hb-[hostname].1683.1": Policy created: type=ring; maxCount=0; maxSize=104857600 2012-02-15 10:10:53 info SASL: Mechanism list: PLAIN LOGIN ANONYMOUS
And I can see following AVC error on RHEL6 caused by 'service qpidd start': setroubleshoot: SELinux is preventing qpidd from read access on the directory tmp. For complete SELinux messages. run sealert -l c8bb31c9-6a8e-430c-89f4-2a810ca5c9c4 # sealert -l c8bb31c9-6a8e-430c-89f4-2a810ca5c9c4 SELinux is preventing qpidd from read access on the directory tmp. ***** Plugin catchall (100. confidence) suggests *************************** If you believe that qpidd should be allowed read access on the tmp directory by default. Then you should report this as a bug. You can generate a local policy module to allow this access. Do allow this access for now by executing: # grep qpidd /var/log/audit/audit.log | audit2allow -M mypol # semodule -i mypol.pp ecode=0 # ls -ldZ /tmp drwxrwxrwt. root root system_u:object_r:tmp_t:s0 /tmp ecode=0
My AVC message (read request into /tmp): type=AVC msg=audit(1329299160.002:19706): avc: denied { read } for pid=1509 comm="qpidd" name="tmp" dev=vda1 ino=128028 scontext=unconfined_u:system_r:qpidd_t:s0 tcontext=system_u:object_r:tmp_t:s0 tclass=dir Is similar to Bug 694418 (this is write request /tmp), but I log qpidd information into /var/log/messages.
Selinux problem moved into new Bug 790759
I have repro on this. The behavior is that qpid-tool appears to stay connected, but prints out these messages, for me every ten seconds or so: Broker connected: Broker connected at: 127.0.0.1:6667 Broker disconnected: Disconnected Broker Here is my broker start script : #! /bin/bash # NOTE Run this in cpp/src/tests source ./test_env.sh # Debugging print. -------------------------- debug=1 function print { if [ "$debug" ]; then echo "sasl_fed_ex: $1" fi } TEST_HOSTNAME=127.0.0.1 CLIENT_NAME=guest CERT_DIR=`pwd`/src_test_cert_db CERT_PW_FILE=`pwd`/src_cert.password create_certs() { #create certificate and key databases with single, simple, self-signed certificate in it mkdir ${CERT_DIR} certutil -N -d ${CERT_DIR} -f ${CERT_PW_FILE} certutil -S -d ${CERT_DIR} -n ${TEST_HOSTNAME} -s "CN=${TEST_HOSTNAME}" -t "CT,," -x -f ${CERT_PW_FILE} -z /usr/bin/certutil 2> /dev/null certutil -S -d ${CERT_DIR} -n ${CLIENT_NAME} -s "CN=${CLIENT_NAME}" -t "T,," -x -f ${CERT_PW_FILE} -z /usr/bin/certutil 2> /dev/null } delete_certs() { if [[ -e ${CERT_DIR} ]] ; then print "removing cert dir ${CERT_DIR}" rm -rf ${CERT_DIR} fi } CERTUTIL=$(type -p certutil) if [[ !(-x $CERTUTIL) ]] ; then echo "No certutil. Quitting."; exit 0; fi delete_certs create_certs 2> /dev/null if [ ! $? ]; then error "Could not create test certificate" exit 1 fi sasl_config_dir=$builddir/sasl_config echo "sasl_config_dir $sasl_config_dir" tmp_root=${builddir}/sasl_fed_ex_temp print "results dir is ${tmp_root}" rm -rf ${tmp_root} mkdir -p $tmp_root SSL_PORT=6667 TCP_PORT=5801 SSL_LIB=${moduledir}/ssl.so # --ssl-require-client-authentication \ function start_broker { print "Starting SRC broker -- logs are in $tmp_root/qpidd_src.log" $QPIDD_EXEC \ --port=${TCP_PORT} \ --ssl-port ${SSL_PORT} \ --ssl-cert-db $CERT_DIR \ --ssl-cert-password-file $CERT_PW_FILE \ --ssl-cert-name $TEST_HOSTNAME \ --sasl-config=$sasl_config_dir \ --auth yes \ --no-data-dir \ --no-module-dir \ --load-module ${SSL_LIB} \ --load-module ${SSLCONNECTOR_LIB} \ --mgmt-enable=yes \ --log-enable info+ \ --log-source yes \ --daemon \ --log-to-file $tmp_root/qpidd_src.log 2> /dev/null } start_broker ############# end broker start script ######################## Here is how I run qpid-tool: ./tools/src/py/qpid-tool amqps://127.0.0.1:6667 and here is how I check with qpid-stat: ./tools/src/py/qpid-stat -c -b localhost:5801
Well, it's not a regression. I have same behavior going back to 0.12 .
I think this is it: ( Same thing was happening when *not* using SSL -- and it was being deliberately ignored unless connection had aborted. I imitated that code. ) ( And put in a comment, for the poor souls who follow me... ) With this fix -- no more disconnecting & reconnecting -- qpid-tool works normally. Index: qpid/python/qpid/connection.py =================================================================== --- qpid/python/qpid/connection.py (revision 1300263) +++ qpid/python/qpid/connection.py (working copy) @@ -170,6 +170,10 @@ if not status: self.detach_all() break + # When we do not use SSL transport, we get periodic + # spurious timeout events on the socket. When using SSL, + # these events show up as timeout *errors*. Both should be + # ignored unless we have aborted. except socket.timeout: if self.aborted(): self.close_code = (None, "connection timed out") @@ -178,9 +182,12 @@ else: continue except socket.error, e: - self.close_code = (None, str(e)) - self.detach_all() - break + if self.aborted() or str(e) != "The read operation timed out": + self.close_code = (None, str(e)) + self.detach_all() + break + else: + continue frame_dec.write(data) seg_dec.write(*frame_dec.read()) op_dec.write(*seg_dec.read()) ~
jira qpid-3898 Committed in svn revision 1300562. When connecting through SSL, qpid-tool starts disconnecting and reconnecting every 10 seconds. The connection it makes is good -- it gets real data. But then it unilaterally decides to disconnect, immediately reconnects -- and cycles this way forever. Well -- until you stop it, anyway. qpid-stat does not do this. This is similar to a problem that was fixed long ago in the original code -- but that was written before SSL support was available in Python. With SSL support, the same problem shows up in a slightly different guise, and is not caught by the original code.
The problem is that I get "None" when I print out e.errno . And this is when running under RHEL 6 and using Python 2.6.6 .
I installed with --nodeps and --force new messaging packages and this bug seems to be not repaired. Packages: python-qpid-0.14-6.el5.noarch python-qpid-0.14-6.el6.noarch We have two basic scenarios here. scenario (1) ------------ Comment 0 covers simple to reproduce scenario - SSL+ANONYMOUS 1. Set up a broker for ssl 2. Run qpid-tool amqps://localhost 3. Run qpid-stat -c This scenario was repaired. scenario (2) ------------ For cumin to be working properly, we need at least SSL+PLAIN working. Cumin needs to authenticate to broker. SSL+PLAIN: # qpid-stat --connections --sasl-mechanism=PLAIN ampqs://cumin/cumin@localhost:5671 Failed: Timeout - Waiting for connection to be established with broker and in /var/log/messages : qpidd[6293]: 2012-03-21 10:55:01 notice Journal "TplStore": Created qpidd[6293]: 2012-03-21 10:55:01 notice Store module initialized; store-dir=/var/lib/qpidd qpidd[6293]: 2012-03-21 10:55:01 notice Listening on TCP/TCP6 port 5672 qpidd[6293]: 2012-03-21 10:55:01 notice Listening for SSL connections on TCP port 5671 qpidd[6293]: 2012-03-21 10:55:01 notice Broker running qpidd[6293]: 2012-03-21 10:55:19 error Error reading socket: Encountered end of file [-5938] only PLAIN: # qpid-stat --connections --sasl-mechanism=PLAIN ampq://cumin/cumin@localhost:5672 is working properly. But shows that cumin (set for SSL = ampqs/5671) isn't connected. I can provide broken machine(s) and other information if necessary. ---> moving back to ASSIGNED
scenario (3) ------------ Connect using qpid-tool (PLAIN+SSL) and verify connection using qpid-stat (PLAIN). # screen qpid-tool ampqs://cumin/cumin@localhost:5671 # qpid-stat --connections --sasl-mechanism=PLAIN ampq://cumin/cumin@localhost:5672 | grep qpid-tool Not working.
After discussion with mick, we have found out that I created typo in my scripts and in my comments. ! 'ampq' instead of proper 'amqp' ! I will verify again. --> moving back to MODIFIED and clearing needinfo.
It seems that on RHEL6 is problem fixed. On RHEL5 I've got: root@sgraf-rhel5x:~# qpid-stat --connections --sasl-mechanism=PLAIN amqps://cumin/cumin@localhost:5671 Exception in thread Thread-2: Traceback (most recent call last): File "/usr/lib64/python2.4/threading.py", line 442, in __bootstrap self.run() File "/usr/lib64/python2.4/threading.py", line 422, in run self.__target(*self.__args, **self.__kwargs) File "/usr/lib/python2.4/site-packages/qpid/connection.py", line 185, in run if self.aborted() or str(e) != "The read operation timed out": File "/usr/lib64/python2.4/site-packages/qmf/console.py", line 2431, in aborted raise Timeout("Waiting for connection to be established with broker") Timeout: Waiting for connection to be established with broker - while without SSL it's working - netstat shows that qpidd listens on both 5672 and 5671 ports. - qpidd log shows no errors. I can provide broken machine(s) and other information if necessary. ---> moving back to ASSIGNED
I am reproducing the problem easily -- but I believe this is something new. because: 1. The problem is happening with qpid-stat instead of qpid-tool (they use different underlying python messaging code ) 2. there is no disconnect-and-reconnect behavior as there was in the qpid-tool problem. qpid-stat is just never connecting. I cannot see any error in configuration. no ideas yet ...
I retested again following scenarios: (1) qpid-stat --connections --sasl-mechanism=PLAIN amqps://cumin/cumin@localhost:5671 (2) qpid-tool amqps://cumin/cumin@localhost:5671 (3) diff 'qpid-stat -c' against selected list of daemons RHEL5 (from RHN, 0.10) -qpid-cpp-mrg-0.10-9.el5 -qpid-tools-0.10-6.el5 -python-qpid-0.10-1.el5 (1) Failed: ConnectionFailed - (None, 'The read operation timed out') (2) qpid: Broker disconnected: Disconnected Broker (Comment 0) (3) cumin disconnected too RHEL6 (from RHN, 0.12) -qpid-cpp-0.12-6.el6 -qpid-tools-0.12-2.el6 -python-qpid-0.12-1.el6 (1) OK (2) qpid: Broker disconnected: Disconnected Broker (Comment 0) (3) cumin disconnected too Note: To reproduce on RHEL5 (qpid 0.10) I need to wait longer (may be 10 minutes). But after this disconnection, cumin UI confirms that broker is not connected.
And devel packages: RHEL5 -qpid-cpp-mrg-0.14-14.el5 -qpid-tools-0.14-1.el5 -python-qpid-0.14-6.el5 (1) Exception in thread... (2) OK (3) OK RHEL6 -qpid-cpp-0.14-12.el6 -qpid-tools-0.14-1.el6 -python-qpid-0.14-6.el6 (1) OK (2) OK (3) OK I hope this can help with tracking code changes.
qpid-stat issue moved into Bug 807919
---> VERIFIED