Bug 790563 - Python client is disconnected from the broker shortly after connection over ssl
Summary: Python client is disconnected from the broker shortly after connection over ssl
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: python-qpid
Version: Development
Hardware: Unspecified
OS: Unspecified
high
high
Target Milestone: 2.1.2
: ---
Assignee: mick
QA Contact: Stanislav Graf
URL:
Whiteboard:
Depends On: 804001
Blocks: 788116
TreeView+ depends on / blocked
 
Reported: 2012-02-14 20:41 UTC by Trevor McKay
Modified: 2012-12-07 17:44 UTC (History)
5 users (show)

Fixed In Version: python-qpid-0.14-6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-12-07 17:42:25 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 694418 0 urgent CLOSED qpidd does not start if log-to-file parameter specified in the config file due to SELinux ' avc: denied { write } for ... 2021-02-22 00:41:40 UTC
Red Hat Bugzilla 782806 0 medium CLOSED [RFE] Python qpid client ssl support 2021-02-22 00:41:40 UTC
Red Hat Bugzilla 790759 0 unspecified CLOSED qpidd broker triggers SELinux AVCs avc: type=AVC msg=audit(1329299160.002:19706): avc: denied { read } for pid=1509 c... 2021-02-22 00:41:40 UTC
Red Hat Bugzilla 807919 1 None None None 2021-01-20 06:05:38 UTC
Red Hat Bugzilla 825078 1 None None None 2021-01-20 06:05:38 UTC

Internal Links: 694418 782806 790759 807919 825078

Description Trevor McKay 2012-02-14 20:41:24 UTC
Description of problem:

After configuring a broker for ssl communication on port 5671, Python clients that connect (like cumin and qpid-tool) are disconnected a short time later (on the order of tens of seconds).

Version-Release number of selected component (if applicable):

# rpm -qa | grep -i qpid
qpid-cpp-client-ssl-0.14-5.el5
qpid-qmf-0.14-2.el5
qpid-cpp-server-0.14-5.el5
qpid-cpp-server-ssl-0.14-5.el5
python-qpid-0.14-3.el5
qpid-tools-0.14-1.el5
qpid-cpp-client-0.14-5.el5
python-qpid-qmf-0.14-2.el5

# rpm -qa | grep -i saslwrapper
python-saslwrapper-0.10-4.el5
saslwrapper-0.10-4.el5

How reproducible:

100%

Steps to Reproduce:
1. Set up a broker for ssl
2. Run qpid-tool amqps://localhost
3. Run qpid-stat -c
  
Actual results:

The qpid-tool connection will show initially in qpid-stat output (and print statements in the brokerConnected callback in qpid-tool will print).  Shortly thereafter, qpid-tool will show "Broker disconnected" and qpid-stat -c will no longer show the connection. 

Expected results:

The connection should persist as long as qpid-tool is running.

Additional info:

Comment 3 Stanislav Graf 2012-02-15 09:24:17 UTC
I used following reproduction scenario:
# yum install -y -q qpid-cpp-server qpid-tools nss-tools
ecode=0
# echo password > /var/lib/qpidd/passwordfile
ecode=0
# chown qpidd:qpidd /var/lib/qpidd/passwordfile
ecode=0
# chmod 600 /var/lib/qpidd/passwordfile
ecode=0
# cd /var/lib/qpidd
ecode=0
# sudo -u qpidd certutil -N -d . -f passwordfile
ecode=0
# sudo -u qpidd certutil -S -d . -f passwordfile -n nickname -s "CN=nickname" -t "CT,," -x -z /usr/bin/certutil
Generating key.  This may take a few moments...
ecode=0
# grep --invert-match ^# /etc/qpidd.conf 
cluster-mechanism=ANONYMOUS
auth=yes
ssl-cert-password-file=/var/lib/qpidd/passwordfile
ssl-cert-db=/var/lib/qpidd
ssl-cert-name=nickname
ecode=0
# yum install -y -q qpid-cpp-server-ssl
ecode=0
# service qpidd restart
Stopping Qpid AMQP daemon:                                 [FAILED]
Starting Qpid AMQP daemon:                                 [  OK  ]
ecode=0
# tail /var/log/messages | grep qpidd | cut --delimiter=']' --fields=2
: 2012-02-15 09:44:55 notice Listening on TCP port 5672 
: 2012-02-15 09:44:55 notice Listening for SSL connections on TCP port 5671 
: 2012-02-15 09:44:55 notice Broker running 
ecode=0

Following debug messages are produced by qpidd:
2012-02-15 10:10:46 info SASL: Mechanism list: PLAIN LOGIN ANONYMOUS
2012-02-15 10:10:46 info 127.0.0.1:5671-127.0.0.1:44095 SASL: Authentication succeeded for: anonymous@QPID
2012-02-15 10:10:46 info Queue "reply-[hostname].1683.1": Policy created: type=reject; maxCount=0; maxSize=104857600
2012-02-15 10:10:46 info Queue "reply-[hostname].1683.1": Flow limit created: flowStopCount=0, flowResumeCount=0, flowStopSize=83886080, flowResumeSize=73400320
2012-02-15 10:10:46 info Queue "topic-[hostname].1683.1": Policy created: type=ring; maxCount=0; maxSize=104857600
2012-02-15 10:10:46 info Queue "qmfc-v2-[hostname].1683.1": Policy created: type=reject; maxCount=0; maxSize=104857600
2012-02-15 10:10:46 info Queue "qmfc-v2-[hostname].1683.1": Flow limit created: flowStopCount=0, flowResumeCount=0, flowStopSize=83886080, flowResumeSize=73400320
2012-02-15 10:10:46 info Queue "qmfc-v2-ui-[hostname].1683.1": Policy created: type=ring; maxCount=0; maxSize=104857600
2012-02-15 10:10:46 info Queue "qmfc-v2-hb-[hostname].1683.1": Policy created: type=ring; maxCount=0; maxSize=104857600
2012-02-15 10:10:53 info SASL: Mechanism list: PLAIN LOGIN ANONYMOUS

Comment 5 Stanislav Graf 2012-02-15 09:25:47 UTC
And I can see following AVC error on RHEL6 caused by 'service qpidd start':

setroubleshoot: SELinux is preventing qpidd from read access on the directory tmp. For complete SELinux messages. run sealert -l c8bb31c9-6a8e-430c-89f4-2a810ca5c9c4

# sealert -l c8bb31c9-6a8e-430c-89f4-2a810ca5c9c4
SELinux is preventing qpidd from read access on the directory tmp.

*****  Plugin catchall (100. confidence) suggests  ***************************

If you believe that qpidd should be allowed read access on the tmp directory by default.
Then you should report this as a bug.
You can generate a local policy module to allow this access.
Do
allow this access for now by executing:
# grep qpidd /var/log/audit/audit.log | audit2allow -M mypol
# semodule -i mypol.pp


ecode=0
# ls -ldZ /tmp
drwxrwxrwt. root root system_u:object_r:tmp_t:s0       /tmp
ecode=0

Comment 7 Stanislav Graf 2012-02-15 10:00:47 UTC
My AVC message (read request into /tmp):
type=AVC msg=audit(1329299160.002:19706): avc:  denied  { read } for  pid=1509 comm="qpidd" name="tmp" dev=vda1 ino=128028 scontext=unconfined_u:system_r:qpidd_t:s0 tcontext=system_u:object_r:tmp_t:s0 tclass=dir

Is similar to Bug 694418 (this is write request /tmp), but I log qpidd information into /var/log/messages.

Comment 8 Stanislav Graf 2012-02-15 10:53:34 UTC
Selinux problem moved into new Bug 790759

Comment 10 mick 2012-02-29 15:52:55 UTC
I have repro on this.  The behavior is that qpid-tool appears to stay connected, but prints out these messages, for me every ten seconds or so:

  Broker connected: Broker connected at: 127.0.0.1:6667
Broker disconnected: Disconnected Broker


Here is my broker start script :
#! /bin/bash 


# NOTE   Run this in cpp/src/tests


source ./test_env.sh


# Debugging print. --------------------------
debug=1
function print {
  if [ "$debug" ]; then
    echo "sasl_fed_ex: $1"
  fi
}

TEST_HOSTNAME=127.0.0.1
CLIENT_NAME=guest

CERT_DIR=`pwd`/src_test_cert_db
CERT_PW_FILE=`pwd`/src_cert.password


create_certs() {
    #create certificate and key databases with single, simple, self-signed certificate in it
    mkdir ${CERT_DIR}
    certutil -N -d ${CERT_DIR} -f ${CERT_PW_FILE}
    certutil -S -d ${CERT_DIR} -n ${TEST_HOSTNAME} -s "CN=${TEST_HOSTNAME}" -t "CT,," -x -f ${CERT_PW_FILE} -z /usr/bin/certutil 2> /dev/null
    certutil -S -d ${CERT_DIR} -n ${CLIENT_NAME} -s "CN=${CLIENT_NAME}" -t "T,," -x -f ${CERT_PW_FILE} -z /usr/bin/certutil 2> /dev/null
}


delete_certs() {
    if [[ -e ${CERT_DIR} ]] ;  then
        print "removing cert dir ${CERT_DIR}"
        rm -rf ${CERT_DIR}
    fi
}


CERTUTIL=$(type -p certutil)
if [[ !(-x $CERTUTIL) ]] ; then
    echo "No certutil.  Quitting.";
    exit 0;
fi

delete_certs
create_certs 2> /dev/null
if [ ! $? ]; then
  error "Could not create test certificate"
  exit 1
fi

sasl_config_dir=$builddir/sasl_config
echo "sasl_config_dir $sasl_config_dir"

tmp_root=${builddir}/sasl_fed_ex_temp
print "results dir is ${tmp_root}"
rm -rf ${tmp_root}
mkdir -p $tmp_root

SSL_PORT=6667
TCP_PORT=5801

SSL_LIB=${moduledir}/ssl.so

# --ssl-require-client-authentication        \

function start_broker {
  print "Starting SRC broker -- logs are in  $tmp_root/qpidd_src.log"
  $QPIDD_EXEC                                  \
    --port=${TCP_PORT}                         \
    --ssl-port ${SSL_PORT}                     \
    --ssl-cert-db $CERT_DIR                    \
    --ssl-cert-password-file $CERT_PW_FILE     \
    --ssl-cert-name $TEST_HOSTNAME             \
    --sasl-config=$sasl_config_dir             \
    --auth yes                                 \
    --no-data-dir                              \
    --no-module-dir                            \
    --load-module ${SSL_LIB}                   \
    --load-module ${SSLCONNECTOR_LIB}          \
    --mgmt-enable=yes                          \
    --log-enable info+                         \
    --log-source yes                           \
    --daemon                                   \
    --log-to-file $tmp_root/qpidd_src.log 2> /dev/null
}


start_broker




############# end broker start script ########################

Here is how I run qpid-tool:
./tools/src/py/qpid-tool amqps://127.0.0.1:6667

and here is how I check with qpid-stat:

    ./tools/src/py/qpid-stat -c -b localhost:5801

Comment 11 mick 2012-02-29 19:59:44 UTC
Well, it's not a regression.  I have same behavior going back to 0.12 .

Comment 13 mick 2012-03-13 18:56:21 UTC
I think this is it:

( Same thing was happening when *not* using SSL -- and it was being deliberately ignored unless connection had aborted.  I imitated that code. )

( And put in a comment, for the poor souls who follow me... )

With this fix -- no more disconnecting & reconnecting -- qpid-tool works normally.




Index: qpid/python/qpid/connection.py
===================================================================
--- qpid/python/qpid/connection.py      (revision 1300263)
+++ qpid/python/qpid/connection.py      (working copy)
@@ -170,6 +170,10 @@
           if not status:
             self.detach_all() 
             break
+      # When we do not use SSL transport, we get periodic 
+      # spurious timeout events on the socket.  When using SSL,
+      # these events show up as timeout *errors*.  Both should be
+      # ignored unless we have aborted.
       except socket.timeout:
         if self.aborted():
           self.close_code = (None, "connection timed out")
@@ -178,9 +182,12 @@
         else:
           continue
       except socket.error, e:
-        self.close_code = (None, str(e))
-        self.detach_all()
-        break
+        if self.aborted() or str(e) != "The read operation timed out":
+          self.close_code = (None, str(e))
+          self.detach_all()
+          break
+        else:
+          continue
       frame_dec.write(data)
       seg_dec.write(*frame_dec.read())
       op_dec.write(*seg_dec.read())
~

Comment 14 mick 2012-03-14 14:58:23 UTC
jira  qpid-3898
Committed in svn revision 1300562.

When connecting through SSL, qpid-tool starts disconnecting and
reconnecting every 10 seconds.

The connection it makes is good -- it gets real data.  But then
it unilaterally decides to disconnect, immediately reconnects --
and cycles this way forever.   Well -- until you stop it, anyway.

qpid-stat does not do this.

This is similar to a problem that was fixed long ago in the original
code -- but that was written before SSL support was available in Python.

With SSL support, the same problem shows up in a slightly different guise, and is not caught by the original code.

Comment 22 mick 2012-03-16 11:53:46 UTC
The problem is that I get "None" when I print out e.errno .  And this is when running under RHEL 6 and using Python 2.6.6  .

Comment 24 Stanislav Graf 2012-03-21 10:50:44 UTC
I installed with --nodeps and --force new messaging packages and this bug seems to be not repaired.
Packages:
python-qpid-0.14-6.el5.noarch
python-qpid-0.14-6.el6.noarch

We have two basic scenarios here. 

scenario (1)
------------

Comment 0 covers simple to reproduce scenario - SSL+ANONYMOUS
1. Set up a broker for ssl
2. Run qpid-tool amqps://localhost
3. Run qpid-stat -c

This scenario was repaired.

scenario (2)
------------

For cumin to be working properly, we need at least SSL+PLAIN working. Cumin needs to authenticate to broker.

SSL+PLAIN:

# qpid-stat --connections --sasl-mechanism=PLAIN ampqs://cumin/cumin@localhost:5671
Failed: Timeout - Waiting for connection to be established with broker

and in /var/log/messages :
qpidd[6293]: 2012-03-21 10:55:01 notice Journal "TplStore": Created 
qpidd[6293]: 2012-03-21 10:55:01 notice Store module initialized; store-dir=/var/lib/qpidd 
qpidd[6293]: 2012-03-21 10:55:01 notice Listening on TCP/TCP6 port 5672 
qpidd[6293]: 2012-03-21 10:55:01 notice Listening for SSL connections on TCP port 5671 
qpidd[6293]: 2012-03-21 10:55:01 notice Broker running 
qpidd[6293]: 2012-03-21 10:55:19 error Error reading socket: Encountered end of file [-5938] 

only PLAIN:

# qpid-stat --connections --sasl-mechanism=PLAIN ampq://cumin/cumin@localhost:5672

is working properly. But shows that cumin (set for SSL = ampqs/5671) isn't connected.

I can provide broken machine(s) and other information if necessary.

---> moving back to ASSIGNED

Comment 26 Stanislav Graf 2012-03-21 11:07:19 UTC
scenario (3)
------------

Connect using qpid-tool (PLAIN+SSL) and verify connection using qpid-stat (PLAIN).

# screen qpid-tool ampqs://cumin/cumin@localhost:5671
# qpid-stat --connections --sasl-mechanism=PLAIN
ampq://cumin/cumin@localhost:5672 | grep qpid-tool

Not working.

Comment 27 Stanislav Graf 2012-03-21 14:06:16 UTC
After discussion with mick, we have found out that I created typo in my scripts and in my comments. 

! 'ampq' instead of proper 'amqp' !

I will verify again.

--> moving back to MODIFIED and clearing needinfo.

Comment 28 Stanislav Graf 2012-03-21 16:35:43 UTC
It seems that on RHEL6 is problem fixed.

On RHEL5 I've got:
    root@sgraf-rhel5x:~# qpid-stat --connections --sasl-mechanism=PLAIN amqps://cumin/cumin@localhost:5671
    Exception in thread Thread-2:
    Traceback (most recent call last):
      File "/usr/lib64/python2.4/threading.py", line 442, in __bootstrap
        self.run()
      File "/usr/lib64/python2.4/threading.py", line 422, in run
        self.__target(*self.__args, **self.__kwargs)
      File "/usr/lib/python2.4/site-packages/qpid/connection.py", line 185, in run
        if self.aborted() or str(e) != "The read operation timed out":
      File "/usr/lib64/python2.4/site-packages/qmf/console.py", line 2431, in aborted
        raise Timeout("Waiting for connection to be established with broker")
    Timeout: Waiting for connection to be established with broker 

- while without SSL it's working
- netstat shows that qpidd listens on both 5672 and 5671 ports.
- qpidd log shows no errors.

I can provide broken machine(s) and other information if necessary.

---> moving back to ASSIGNED

Comment 29 mick 2012-03-21 19:14:48 UTC
I am reproducing the problem easily -- but I believe this is something new.  
because:

1. The problem is happening with qpid-stat instead of qpid-tool (they use different underlying python messaging code ) 

2. there is no disconnect-and-reconnect behavior as there was in the qpid-tool problem.  qpid-stat is just never connecting.

I cannot see any error in configuration.

no ideas yet ...

Comment 30 Stanislav Graf 2012-03-23 11:51:19 UTC
I retested again following scenarios:

(1) qpid-stat --connections --sasl-mechanism=PLAIN amqps://cumin/cumin@localhost:5671
(2) qpid-tool amqps://cumin/cumin@localhost:5671
(3) diff 'qpid-stat -c' against selected list of daemons

RHEL5 (from RHN, 0.10)
-qpid-cpp-mrg-0.10-9.el5
-qpid-tools-0.10-6.el5
-python-qpid-0.10-1.el5
(1) Failed: ConnectionFailed - (None, 'The read operation timed out')
(2) qpid: Broker disconnected: Disconnected Broker (Comment 0)
(3) cumin disconnected too

RHEL6 (from RHN, 0.12)
-qpid-cpp-0.12-6.el6
-qpid-tools-0.12-2.el6
-python-qpid-0.12-1.el6
(1) OK
(2) qpid: Broker disconnected: Disconnected Broker (Comment 0)
(3) cumin disconnected too

Note: To reproduce on RHEL5 (qpid 0.10) I need to wait longer (may be 10 minutes). But after this disconnection, cumin UI confirms that broker is not connected.

Comment 31 Stanislav Graf 2012-03-23 12:36:41 UTC
And devel packages:

RHEL5
-qpid-cpp-mrg-0.14-14.el5
-qpid-tools-0.14-1.el5
-python-qpid-0.14-6.el5
(1) Exception in thread...
(2) OK
(3) OK

RHEL6
-qpid-cpp-0.14-12.el6
-qpid-tools-0.14-1.el6
-python-qpid-0.14-6.el6
(1) OK
(2) OK
(3) OK

I hope this can help with tracking code changes.

Comment 33 Stanislav Graf 2012-03-29 07:10:00 UTC
qpid-stat issue moved into Bug 807919

Comment 36 Stanislav Graf 2012-04-06 08:06:16 UTC
---> VERIFIED


Note You need to log in before you can comment on or make changes to this bug.