Bug 950501 - python clients throw "[Errno 1] _ssl.c:1217: error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry"
Status: CLOSED ERRATA
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: python-qpid
Version: 2.3
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: 3.0
Target Release: ---
Assigned To: Ken Giusti
QA Contact: mick
Keywords: OtherQA
Depends On:
Blocks: 785156
 
Reported: 2013-04-10 07:13 EDT by Stuart Auchterlonie
Modified: 2014-09-24 11:07 EDT

See Also:
Fixed In Version: python-qpid-0.22-3.el6, python-qpid-0.22-2.el5
Doc Type: Bug Fix
Doc Text:
In circumstances when the python client code tried to re-write queued data to a previously blocked socket, it was discovered that the python client code was changing the pointer used prior to the socket blocking. The underlying OpenSSL library detected that the pointer had changed since the last call to write, and threw an exception. The python client is now modified to keep the original pointer constant when new data is added. When the socket becomes writable again, the address of the pointer passed to the socket is identical to the address passed prior to the socket blocking, as required by OpenSSL.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-09-24 11:07:22 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
Reproducer (450 bytes, text/plain)
2013-05-20 15:13 EDT, Ken Giusti


External Trackers
Tracker                 ID              Priority  Status        Summary                                       Last Updated
Apache JIRA             qpid-4872       None      None          None                                          Never
Red Hat Product Errata  RHEA-2014:1296  normal    SHIPPED_LIVE  Red Hat Enterprise MRG Messaging 3.0 Release  2014-09-24 15:00:06 EDT

Description Stuart Auchterlonie 2013-04-10 07:13:57 EDT
See upstream: https://issues.apache.org/jira/browse/QPID-3175

Description of problem:

When a Python client uses the SSL transport layer and sends messages to the broker in bursts, asynchronously (sync=False in Sender.send), the following exception is occasionally thrown:

[Errno 1] _ssl.c:1217: error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry

The working theory is that when the client's socket buffer fills up, the next underlying SSLSocket.write() raises an SSLError (with SSL_ERROR_WANT_WRITE as its code), and this isn't handled properly.

Setting the socket to blocking is one possible workaround.
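
As a rough illustration of that workaround, the sketch below switches a socket back to blocking mode. It is written for current Python 3, and `force_blocking` is a hypothetical helper name, not part of python-qpid. Where python-qpid would need such a call is not shown here and is left as an assumption.

```python
import socket


def force_blocking(sock):
    """Put a socket into blocking mode and return it.

    setblocking(True) is equivalent to settimeout(None): a write on a full
    socket then waits instead of reporting that it would block.
    """
    sock.setblocking(True)
    return sock
```

The trade-off is that a blocked write stalls the calling thread until the peer drains its side, which may be why non-blocking mode was chosen in the first place.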

Version-Release number of selected component (if applicable):

python-qpid-0.18-4.el6

How reproducible:

"sometimes"

Steps to Reproduce:
1. Push a lot of data at the ssl socket such that a blocking socket would block, but the non blocking socket returns the relevant error.
  
Actual results:

[Errno 1] _ssl.c:1217: error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry

Expected results:

The code handles the full socket gracefully or uses blocking sockets if that is appropriate

Additional info:
Comment 1 Ken Giusti 2013-05-20 13:54:16 EDT
I suspect the root of the problem is actually due to this:

http://bugs.python.org/issue8240

which hasn't been fixed upstream to date.

I haven't been able to reproduce this, but I suspect that the SSL_ERROR_WANT_WRITE is being raised at some point, and the output buffer is being updated before the write is retried.  This would cause python to re-allocate the output buffer, which would change the underlying pointer, which would then cause the SSL_Write to fail with the given exception.

While we _could_ re-write the python client's SSL code to save a reference to the original buffer object and re-supply the write()/recv() call with the same arguments on re-try as required, we're still not guaranteed that the python implementation will ensure that the same physical pointer is used.
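
A minimal sketch of the "save a reference to the original buffer" idea, written for current Python 3 rather than the Python 2 of this report. `PendingWriter`, `enqueue`, and `flush` are hypothetical names, not python-qpid's API, and this is not the upstream fix. It only illustrates keeping the chunk handed to a failed write untouched while new data queues up separately. As noted above, reusing the same Python object does not by itself guarantee what pointer the interpreter passes to OpenSSL.

```python
import ssl


class PendingWriter(object):
    """Queue outgoing bytes; never modify a chunk whose write must be retried."""

    def __init__(self, sock):
        self.sock = sock
        self.queue = []        # chunks not yet offered to the socket
        self.in_flight = None  # chunk to re-offer unchanged after WANT_WRITE

    def enqueue(self, data):
        # New data never touches in_flight; it waits in the queue.
        self.queue.append(data)

    def flush(self):
        """Write as much as possible. Return True when nothing is left."""
        while self.in_flight is not None or self.queue:
            if self.in_flight is None:
                self.in_flight = b"".join(self.queue)
                self.queue = []
            try:
                sent = self.sock.send(self.in_flight)
            except ssl.SSLWantWriteError:
                # Socket is full: keep in_flight as-is and retry later.
                return False
            rest = self.in_flight[sent:]
            self.in_flight = rest if rest else None
        return True
```

A caller would invoke `flush()` again once the socket is reported writable, for example by `select`.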

I'd much rather we turn on blocking mode.  AFAIK, non-ssl TCP connections use blocking mode.  Why does the SSL implementation use non-blocking?

Justin - Rafi is the original author of this code, could he weigh in on why non-blocking was used?

-K
Comment 2 Ken Giusti 2013-05-20 15:13:33 EDT
Created attachment 750695
Reproducer

Attached a simple reproducer.

If I run three or four of these in separate shells, each in a loop, it will trigger the exception.
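
Running several copies by hand can be approximated with a small launcher. The sketch below is an assumption-laden convenience, not part of the attachment: `count_hits` and `python_cmd` are hypothetical names, `repro.py` is a placeholder file name for the reproducer, and the search string is taken from the error message in this report.

```python
import subprocess
import sys


def python_cmd(script):
    """Build a command line that runs `script` with the current interpreter."""
    return [sys.executable, script]


def count_hits(cmd, copies=4, rounds=1, needle=b"bad write retry"):
    """Start `copies` instances of cmd at once, `rounds` times over.

    Returns how many runs printed `needle` on stdout or stderr.
    """
    hits = 0
    for _round in range(rounds):
        procs = [
            subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
            for _copy in range(copies)
        ]
        for proc in procs:
            output, _unused = proc.communicate()
            if needle in output:
                hits += 1
    return hits


# Example (placeholder script name):
#   count_hits(python_cmd("repro.py"), copies=4, rounds=50)
```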
Comment 3 Ken Giusti 2013-05-22 15:27:00 EDT
Fix submitted upstream:

http://svn.apache.org/viewvc?view=revision&revision=1485331
Comment 4 mick 2013-07-26 05:18:35 EDT
Ken -- How frequently should I see the error when I run several of your reproducers in separate windows?
Comment 5 Ken Giusti 2013-07-26 10:18:47 EDT
Hi Mick,

From what I can recall - it was very hard to reproduce.

Having said that - having it happen once is certainly enough.  I (hope) that we should not get this error once the fix is in place.

But there's more:  this fix is actually a work-around for a bug in python.  They've since fixed this bug, but I'll be darned if I can tell what release of python the fix is in :(

So, if you're NOT seeing a reproducer using the latest python for your environment, it -could- be due to the bug being fixed in python.  I suspect that's probably not likely, but should be confirmed before wasting too much time.

The python bug info is here:

http://bugs.python.org/issue8240

It looks like it was fixed back in May, but that bug report doesn't say which release(s) contain the fix.

In any case, our work-around is still necessary to deal with existing (unpatched) python environments.
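
When comparing results across machines, it may help to record which Python and OpenSSL builds each run used. The sketch below (current Python 3; `describe_tls_runtime` is a hypothetical name) only reports versions. It cannot tell whether the fix for the python bug is present in a given build.

```python
import ssl
import sys


def describe_tls_runtime():
    """Return a one-line summary of the interpreter and its OpenSSL build."""
    python_version = sys.version.split()[0]
    return "Python %s / %s" % (python_version, ssl.OPENSSL_VERSION)


print(describe_tls_runtime())
```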
Comment 6 mick 2013-07-30 12:53:16 EDT
Notes on reproducing this.


===========================================
  Making the SSL info, for the broker:
===========================================

#! /bin/bash

#----------------------------------------------------
# create certificate and key databases with a single,
# simple, self-signed certificate in it
#----------------------------------------------------

CERT_DIR=test_cert_dir
CERT_PW_FILE=cert.password
TEST_HOSTNAME=127.0.0.1

rm -rf ${CERT_DIR} ${CERT_PW_FILE}
mkdir ${CERT_DIR}

echo password > ${CERT_PW_FILE}

certutil -N -d ${CERT_DIR} -f ${CERT_PW_FILE}
certutil -S -d ${CERT_DIR} -n ${TEST_HOSTNAME} \
         -s "CN=${TEST_HOSTNAME}" -t "CT,," -x \
         -f ${CERT_PW_FILE} -z /usr/bin/certutil 2> /dev/null



======================================
Running the broker
======================================
qpidd -d                                            \
  --port     5801                                   \
  --ssl-port 5802                                   \
  --load-module /usr/lib64/qpid/daemon/ssl.so       \
  --require-encryption                              \
  --auth no                                         \
  --ssl-cert-password-file /home/mick/cert.password \
  --ssl-cert-db /home/mick/test_cert_dir            \
  --ssl-cert-name 127.0.0.1



=============================================
Repro script
( May need several in separate CLIs. )
=============================================

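#!/usr/bin/env python
#
# Sends several large messages asynchronously over SSL so that the
# client's non-blocking socket fills up and writes have to be retried.
from qpid.messaging import *

# Port 5802 matches the --ssl-port passed to qpidd above
# (the original notes used 5671 here).
conn = Connection( "amqps://127.0.0.1:5802" )
try:
  conn.open()
  ssn = conn.session()
  snd = ssn.sender( "ken; {create: always}" )
  # sync=False: do not wait for the broker before sending the next message.
  snd.send( u"m"*1024*1024*4, sync=False )
  snd.send( u"m"*1024*1024*2, sync=False )
  snd.send( u"m"*1024*1024*4, sync=False )
  snd.send( u"m"*1024*1024*2, sync=False )
  print("X")
except SendError as e:
  print(e)
except KeyboardInterrupt:
  pass

conn.close()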
Comment 7 mick 2013-07-30 14:37:20 EDT


"stable" packages on bug-repro machine

{

cyrus-sasl-2.1.23-13.el6_3.1.x86_64
cyrus-sasl-devel-2.1.23-13.el6_3.1.x86_64
cyrus-sasl-gssapi-2.1.23-13.el6_3.1.x86_64
cyrus-sasl-lib-2.1.23-13.el6_3.1.x86_64
cyrus-sasl-md5-2.1.23-13.el6_3.1.x86_64
cyrus-sasl-plain-2.1.23-13.el6_3.1.x86_64
python-qpid-0.18-4.el6.noarch
python-qpid-qmf-0.18-15.el6.x86_64
python-saslwrapper-0.18-1.el6_3.x86_64
qpid-cpp-client-0.18-14.el6.x86_64
qpid-cpp-client-devel-0.18-14.el6.x86_64
qpid-cpp-client-devel-docs-0.18-14.el6.noarch
qpid-cpp-client-rdma-0.18-14.el6.x86_64
qpid-cpp-client-ssl-0.18-14.el6.x86_64
qpid-cpp-debuginfo-0.14-22.el6_3.x86_64
qpid-cpp-server-0.18-14.el6.x86_64
qpid-cpp-server-cluster-0.18-14.el6.x86_64
qpid-cpp-server-devel-0.18-14.el6.x86_64
qpid-cpp-server-rdma-0.18-14.el6.x86_64
qpid-cpp-server-ssl-0.18-14.el6.x86_64
qpid-cpp-server-store-0.18-14.el6.x86_64
qpid-cpp-server-xml-0.18-14.el6.x86_64
qpid-java-client-0.18-7.el6.noarch
qpid-java-common-0.18-7.el6.noarch
qpid-java-example-0.18-7.el6.noarch
qpid-jca-0.18-8.el6.noarch
qpid-jca-xarecovery-0.18-8.el6.noarch
qpid-proton-c-0.4-2.2.el6.x86_64
qpid-proton-c-devel-0.4-2.2.el6.x86_64
qpid-qmf-0.18-15.el6.x86_64
qpid-qmf-debuginfo-0.14-14.el6_3.x86_64
qpid-qmf-devel-0.18-15.el6.x86_64
qpid-tests-0.18-2.el6.noarch
qpid-tools-0.18-8.el6.noarch
saslwrapper-0.18-1.el6_3.x86_64
saslwrapper-devel-0.18-1.el6_3.x86_64
}






latest packages on 32-bit machine

---------------------------------------------------------

{

cyrus-sasl-2.1.23-13.el6_3.1.i686
cyrus-sasl-devel-2.1.23-13.el6_3.1.i686
cyrus-sasl-gssapi-2.1.23-13.el6_3.1.i686
cyrus-sasl-lib-2.1.23-13.el6_3.1.i686
cyrus-sasl-md5-2.1.23-13.el6_3.1.i686
cyrus-sasl-plain-2.1.23-13.el6_3.1.i686
python-qpid-0.22-4.el6.noarch
python-qpid-qmf-0.22-7.el6.i686
python-saslwrapper-0.22-3.el6.i686
qpid-cpp-client-0.22-8.el6.i686
qpid-cpp-client-devel-0.22-8.el6.i686
qpid-cpp-client-devel-docs-0.22-8.el6.noarch
qpid-cpp-client-rdma-0.22-8.el6.i686
qpid-cpp-client-ssl-0.22-8.el6.i686
qpid-cpp-debuginfo-0.22-8.el6.i686
qpid-cpp-server-0.22-8.el6.i686
qpid-cpp-server-devel-0.22-8.el6.i686
qpid-cpp-server-ha-0.22-8.el6.i686
qpid-cpp-server-rdma-0.22-8.el6.i686
qpid-cpp-server-ssl-0.22-8.el6.i686
qpid-cpp-server-store-0.22-8.el6.i686
qpid-cpp-server-xml-0.22-8.el6.i686
qpid-cpp-tar-0.22-8.el6.noarch
qpid-java-client-0.22-5.el6.noarch
qpid-java-common-0.22-5.el6.noarch
qpid-java-example-0.22-5.el6.noarch
qpid-proton-c-0.4-2.2.el6.i686
qpid-proton-c-devel-0.4-2.2.el6.i686
qpid-proton-debuginfo-0.4-2.2.el6.i686
qpid-qmf-0.22-7.el6.i686
qpid-qmf-debuginfo-0.22-7.el6.i686
qpid-qmf-devel-0.22-7.el6.i686
qpid-snmpd-1.0.0-12.el6.i686
qpid-snmpd-debuginfo-1.0.0-12.el6.i686
qpid-tests-0.22-4.el6.noarch
qpid-tools-0.22-3.el6.noarch
saslwrapper-0.22-3.el6.i686

}






latest packages on 64-bit machine

--------------------------------------------------------------------

{

cyrus-sasl-2.1.23-13.el6_3.1.x86_64
cyrus-sasl-devel-2.1.23-13.el6_3.1.x86_64
cyrus-sasl-gssapi-2.1.23-13.el6_3.1.x86_64
cyrus-sasl-lib-2.1.23-13.el6_3.1.x86_64
cyrus-sasl-md5-2.1.23-13.el6_3.1.x86_64
cyrus-sasl-plain-2.1.23-13.el6_3.1.x86_64
python-qpid-0.22-4.el6.noarch
python-qpid-qmf-0.22-7.el6.x86_64
python-saslwrapper-0.22-3.el6.x86_64
qpid-cpp-client-0.22-8.el6.x86_64
qpid-cpp-client-devel-0.22-8.el6.x86_64
qpid-cpp-client-devel-docs-0.22-8.el6.noarch
qpid-cpp-client-rdma-0.22-8.el6.x86_64
qpid-cpp-client-ssl-0.22-8.el6.x86_64
qpid-cpp-debuginfo-0.22-8.el6.x86_64
qpid-cpp-server-0.22-8.el6.x86_64
qpid-cpp-server-devel-0.22-8.el6.x86_64
qpid-cpp-server-ha-0.22-8.el6.x86_64
qpid-cpp-server-rdma-0.22-8.el6.x86_64
qpid-cpp-server-ssl-0.22-8.el6.x86_64
qpid-cpp-server-store-0.22-8.el6.x86_64
qpid-cpp-server-xml-0.22-8.el6.x86_64
qpid-cpp-tar-0.22-8.el6.noarch
qpid-java-client-0.22-5.el6.noarch
qpid-java-common-0.22-5.el6.noarch
qpid-java-example-0.22-5.el6.noarch
qpid-proton-c-0.4-2.2.el6.x86_64
qpid-proton-c-devel-0.4-2.2.el6.x86_64
qpid-proton-debuginfo-0.4-2.2.el6.x86_64
qpid-qmf-0.22-7.el6.x86_64
qpid-qmf-debuginfo-0.22-7.el6.x86_64
qpid-qmf-devel-0.22-7.el6.x86_64
qpid-snmpd-1.0.0-12.el6.x86_64
qpid-snmpd-debuginfo-1.0.0-12.el6.x86_64
qpid-tests-0.22-4.el6.noarch
qpid-tools-0.22-3.el6.noarch
saslwrapper-0.22-3.el6.x86_64
saslwrapper-devel-0.22-3.el6.x86_64

}
Comment 9 errata-xmlrpc 2014-09-24 11:07:22 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-1296.html
