| Summary: | Python client is disconnected from the broker shortly after connection over ssl | ||
|---|---|---|---|
| Product: | Red Hat Enterprise MRG | Reporter: | Trevor McKay <tmckay> |
| Component: | python-qpid | Assignee: | mick <mgoulish> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Stanislav Graf <sgraf> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | Development | CC: | freznice, iboverma, jross, matt, sgraf |
| Target Milestone: | 2.1.2 | Keywords: | Regression |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | python-qpid-0.14-6 | Doc Type: | Bug Fix |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2012-12-07 17:42:25 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Bug Depends On: | 804001 | ||
| Bug Blocks: | 788116 | ||
|
Description
Trevor McKay
2012-02-14 20:41:24 UTC
I used following reproduction scenario: # yum install -y -q qpid-cpp-server qpid-tools nss-tools ecode=0 # echo password > /var/lib/qpidd/passwordfile ecode=0 # chown qpidd:qpidd /var/lib/qpidd/passwordfile ecode=0 # chmod 600 /var/lib/qpidd/passwordfile ecode=0 # cd /var/lib/qpidd ecode=0 # sudo -u qpidd certutil -N -d . -f passwordfile ecode=0 # sudo -u qpidd certutil -S -d . -f passwordfile -n nickname -s "CN=nickname" -t "CT,," -x -z /usr/bin/certutil Generating key. This may take a few moments... ecode=0 # grep --invert-match ^# /etc/qpidd.conf cluster-mechanism=ANONYMOUS auth=yes ssl-cert-password-file=/var/lib/qpidd/passwordfile ssl-cert-db=/var/lib/qpidd ssl-cert-name=nickname ecode=0 # yum install -y -q qpid-cpp-server-ssl ecode=0 # service qpidd restart Stopping Qpid AMQP daemon: [FAILED] Starting Qpid AMQP daemon: [ OK ] ecode=0 # tail /var/log/messages | grep qpidd | cut --delimiter=']' --fields=2 : 2012-02-15 09:44:55 notice Listening on TCP port 5672 : 2012-02-15 09:44:55 notice Listening for SSL connections on TCP port 5671 : 2012-02-15 09:44:55 notice Broker running ecode=0 Following debug messages are produced by qpidd: 2012-02-15 10:10:46 info SASL: Mechanism list: PLAIN LOGIN ANONYMOUS 2012-02-15 10:10:46 info 127.0.0.1:5671-127.0.0.1:44095 SASL: Authentication succeeded for: anonymous@QPID 2012-02-15 10:10:46 info Queue "reply-[hostname].1683.1": Policy created: type=reject; maxCount=0; maxSize=104857600 2012-02-15 10:10:46 info Queue "reply-[hostname].1683.1": Flow limit created: flowStopCount=0, flowResumeCount=0, flowStopSize=83886080, flowResumeSize=73400320 2012-02-15 10:10:46 info Queue "topic-[hostname].1683.1": Policy created: type=ring; maxCount=0; maxSize=104857600 2012-02-15 10:10:46 info Queue "qmfc-v2-[hostname].1683.1": Policy created: type=reject; maxCount=0; maxSize=104857600 2012-02-15 10:10:46 info Queue "qmfc-v2-[hostname].1683.1": Flow limit created: flowStopCount=0, flowResumeCount=0, flowStopSize=83886080, flowResumeSize=73400320 2012-02-15 10:10:46 info Queue "qmfc-v2-ui-[hostname].1683.1": Policy created: type=ring; maxCount=0; maxSize=104857600 2012-02-15 10:10:46 info Queue "qmfc-v2-hb-[hostname].1683.1": Policy created: type=ring; maxCount=0; maxSize=104857600 2012-02-15 10:10:53 info SASL: Mechanism list: PLAIN LOGIN ANONYMOUS And I can see following AVC error on RHEL6 caused by 'service qpidd start': setroubleshoot: SELinux is preventing qpidd from read access on the directory tmp. For complete SELinux messages. run sealert -l c8bb31c9-6a8e-430c-89f4-2a810ca5c9c4 # sealert -l c8bb31c9-6a8e-430c-89f4-2a810ca5c9c4 SELinux is preventing qpidd from read access on the directory tmp. ***** Plugin catchall (100. confidence) suggests *************************** If you believe that qpidd should be allowed read access on the tmp directory by default. Then you should report this as a bug. You can generate a local policy module to allow this access. Do allow this access for now by executing: # grep qpidd /var/log/audit/audit.log | audit2allow -M mypol # semodule -i mypol.pp ecode=0 # ls -ldZ /tmp drwxrwxrwt. root root system_u:object_r:tmp_t:s0 /tmp ecode=0 My AVC message (read request into /tmp):
type=AVC msg=audit(1329299160.002:19706): avc: denied { read } for pid=1509 comm="qpidd" name="tmp" dev=vda1 ino=128028 scontext=unconfined_u:system_r:qpidd_t:s0 tcontext=system_u:object_r:tmp_t:s0 tclass=dir
Is similar to Bug 694418 (this is write request /tmp), but I log qpidd information into /var/log/messages.
Selinux problem moved into new Bug 790759 I have repro on this. The behavior is that qpid-tool appears to stay connected, but prints out these messages, for me every ten seconds or so:
Broker connected: Broker connected at: 127.0.0.1:6667
Broker disconnected: Disconnected Broker
Here is my broker start script :
#! /bin/bash
# NOTE Run this in cpp/src/tests
source ./test_env.sh
# Debugging print. --------------------------
debug=1
function print {
if [ "$debug" ]; then
echo "sasl_fed_ex: $1"
fi
}
TEST_HOSTNAME=127.0.0.1
CLIENT_NAME=guest
CERT_DIR=`pwd`/src_test_cert_db
CERT_PW_FILE=`pwd`/src_cert.password
create_certs() {
#create certificate and key databases with single, simple, self-signed certificate in it
mkdir ${CERT_DIR}
certutil -N -d ${CERT_DIR} -f ${CERT_PW_FILE}
certutil -S -d ${CERT_DIR} -n ${TEST_HOSTNAME} -s "CN=${TEST_HOSTNAME}" -t "CT,," -x -f ${CERT_PW_FILE} -z /usr/bin/certutil 2> /dev/null
certutil -S -d ${CERT_DIR} -n ${CLIENT_NAME} -s "CN=${CLIENT_NAME}" -t "T,," -x -f ${CERT_PW_FILE} -z /usr/bin/certutil 2> /dev/null
}
delete_certs() {
if [[ -e ${CERT_DIR} ]] ; then
print "removing cert dir ${CERT_DIR}"
rm -rf ${CERT_DIR}
fi
}
CERTUTIL=$(type -p certutil)
if [[ !(-x $CERTUTIL) ]] ; then
echo "No certutil. Quitting.";
exit 0;
fi
delete_certs
create_certs 2> /dev/null
if [ ! $? ]; then
error "Could not create test certificate"
exit 1
fi
sasl_config_dir=$builddir/sasl_config
echo "sasl_config_dir $sasl_config_dir"
tmp_root=${builddir}/sasl_fed_ex_temp
print "results dir is ${tmp_root}"
rm -rf ${tmp_root}
mkdir -p $tmp_root
SSL_PORT=6667
TCP_PORT=5801
SSL_LIB=${moduledir}/ssl.so
# --ssl-require-client-authentication \
function start_broker {
print "Starting SRC broker -- logs are in $tmp_root/qpidd_src.log"
$QPIDD_EXEC \
--port=${TCP_PORT} \
--ssl-port ${SSL_PORT} \
--ssl-cert-db $CERT_DIR \
--ssl-cert-password-file $CERT_PW_FILE \
--ssl-cert-name $TEST_HOSTNAME \
--sasl-config=$sasl_config_dir \
--auth yes \
--no-data-dir \
--no-module-dir \
--load-module ${SSL_LIB} \
--load-module ${SSLCONNECTOR_LIB} \
--mgmt-enable=yes \
--log-enable info+ \
--log-source yes \
--daemon \
--log-to-file $tmp_root/qpidd_src.log 2> /dev/null
}
start_broker
############# end broker start script ########################
Here is how I run qpid-tool:
./tools/src/py/qpid-tool amqps://127.0.0.1:6667
and here is how I check with qpid-stat:
./tools/src/py/qpid-stat -c -b localhost:5801
Well, it's not a regression. I have same behavior going back to 0.12 . I think this is it:
( Same thing was happening when *not* using SSL -- and it was being deliberately ignored unless connection had aborted. I imitated that code. )
( And put in a comment, for the poor souls who follow me... )
With this fix -- no more disconnecting & reconnecting -- qpid-tool works normally.
Index: qpid/python/qpid/connection.py
===================================================================
--- qpid/python/qpid/connection.py (revision 1300263)
+++ qpid/python/qpid/connection.py (working copy)
@@ -170,6 +170,10 @@
if not status:
self.detach_all()
break
+ # When we do not use SSL transport, we get periodic
+ # spurious timeout events on the socket. When using SSL,
+ # these events show up as timeout *errors*. Both should be
+ # ignored unless we have aborted.
except socket.timeout:
if self.aborted():
self.close_code = (None, "connection timed out")
@@ -178,9 +182,12 @@
else:
continue
except socket.error, e:
- self.close_code = (None, str(e))
- self.detach_all()
- break
+ if self.aborted() or str(e) != "The read operation timed out":
+ self.close_code = (None, str(e))
+ self.detach_all()
+ break
+ else:
+ continue
frame_dec.write(data)
seg_dec.write(*frame_dec.read())
op_dec.write(*seg_dec.read())
~
jira qpid-3898 Committed in svn revision 1300562. When connecting through SSL, qpid-tool starts disconnecting and reconnecting every 10 seconds. The connection it makes is good -- it gets real data. But then it unilaterally decides to disconnect, immediately reconnects -- and cycles this way forever. Well -- until you stop it, anyway. qpid-stat does not do this. This is similar to a problem that was fixed long ago in the original code -- but that was written before SSL support was available in Python. With SSL support, the same problem shows up in a slightly different guise, and is not caught by the original code. The problem is that I get "None" when I print out e.errno . And this is when running under RHEL 6 and using Python 2.6.6 . I installed with --nodeps and --force new messaging packages and this bug seems to be not repaired. Packages: python-qpid-0.14-6.el5.noarch python-qpid-0.14-6.el6.noarch We have two basic scenarios here. scenario (1) ------------ Comment 0 covers simple to reproduce scenario - SSL+ANONYMOUS 1. Set up a broker for ssl 2. Run qpid-tool amqps://localhost 3. Run qpid-stat -c This scenario was repaired. scenario (2) ------------ For cumin to be working properly, we need at least SSL+PLAIN working. Cumin needs to authenticate to broker. SSL+PLAIN: # qpid-stat --connections --sasl-mechanism=PLAIN ampqs://cumin/cumin@localhost:5671 Failed: Timeout - Waiting for connection to be established with broker and in /var/log/messages : qpidd[6293]: 2012-03-21 10:55:01 notice Journal "TplStore": Created qpidd[6293]: 2012-03-21 10:55:01 notice Store module initialized; store-dir=/var/lib/qpidd qpidd[6293]: 2012-03-21 10:55:01 notice Listening on TCP/TCP6 port 5672 qpidd[6293]: 2012-03-21 10:55:01 notice Listening for SSL connections on TCP port 5671 qpidd[6293]: 2012-03-21 10:55:01 notice Broker running qpidd[6293]: 2012-03-21 10:55:19 error Error reading socket: Encountered end of file [-5938] only PLAIN: # qpid-stat --connections --sasl-mechanism=PLAIN ampq://cumin/cumin@localhost:5672 is working properly. But shows that cumin (set for SSL = ampqs/5671) isn't connected. I can provide broken machine(s) and other information if necessary. ---> moving back to ASSIGNED scenario (3) ------------ Connect using qpid-tool (PLAIN+SSL) and verify connection using qpid-stat (PLAIN). # screen qpid-tool ampqs://cumin/cumin@localhost:5671 # qpid-stat --connections --sasl-mechanism=PLAIN ampq://cumin/cumin@localhost:5672 | grep qpid-tool Not working. After discussion with mick, we have found out that I created typo in my scripts and in my comments. ! 'ampq' instead of proper 'amqp' ! I will verify again. --> moving back to MODIFIED and clearing needinfo. It seems that on RHEL6 is problem fixed.
On RHEL5 I've got:
root@sgraf-rhel5x:~# qpid-stat --connections --sasl-mechanism=PLAIN amqps://cumin/cumin@localhost:5671
Exception in thread Thread-2:
Traceback (most recent call last):
File "/usr/lib64/python2.4/threading.py", line 442, in __bootstrap
self.run()
File "/usr/lib64/python2.4/threading.py", line 422, in run
self.__target(*self.__args, **self.__kwargs)
File "/usr/lib/python2.4/site-packages/qpid/connection.py", line 185, in run
if self.aborted() or str(e) != "The read operation timed out":
File "/usr/lib64/python2.4/site-packages/qmf/console.py", line 2431, in aborted
raise Timeout("Waiting for connection to be established with broker")
Timeout: Waiting for connection to be established with broker
- while without SSL it's working
- netstat shows that qpidd listens on both 5672 and 5671 ports.
- qpidd log shows no errors.
I can provide broken machine(s) and other information if necessary.
---> moving back to ASSIGNED
I am reproducing the problem easily -- but I believe this is something new. because: 1. The problem is happening with qpid-stat instead of qpid-tool (they use different underlying python messaging code ) 2. there is no disconnect-and-reconnect behavior as there was in the qpid-tool problem. qpid-stat is just never connecting. I cannot see any error in configuration. no ideas yet ... I retested again following scenarios: (1) qpid-stat --connections --sasl-mechanism=PLAIN amqps://cumin/cumin@localhost:5671 (2) qpid-tool amqps://cumin/cumin@localhost:5671 (3) diff 'qpid-stat -c' against selected list of daemons RHEL5 (from RHN, 0.10) -qpid-cpp-mrg-0.10-9.el5 -qpid-tools-0.10-6.el5 -python-qpid-0.10-1.el5 (1) Failed: ConnectionFailed - (None, 'The read operation timed out') (2) qpid: Broker disconnected: Disconnected Broker (Comment 0) (3) cumin disconnected too RHEL6 (from RHN, 0.12) -qpid-cpp-0.12-6.el6 -qpid-tools-0.12-2.el6 -python-qpid-0.12-1.el6 (1) OK (2) qpid: Broker disconnected: Disconnected Broker (Comment 0) (3) cumin disconnected too Note: To reproduce on RHEL5 (qpid 0.10) I need to wait longer (may be 10 minutes). But after this disconnection, cumin UI confirms that broker is not connected. And devel packages: RHEL5 -qpid-cpp-mrg-0.14-14.el5 -qpid-tools-0.14-1.el5 -python-qpid-0.14-6.el5 (1) Exception in thread... (2) OK (3) OK RHEL6 -qpid-cpp-0.14-12.el6 -qpid-tools-0.14-1.el6 -python-qpid-0.14-6.el6 (1) OK (2) OK (3) OK I hope this can help with tracking code changes. qpid-stat issue moved into Bug 807919 ---> VERIFIED |