Description of problem:
When receiving a large number of messages over SSL using receiver prefetch, the client fails with the exception "An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full". The exception seems to originate from the sslDataIn method of the SslAsynchIO class.

Version-Release number of selected component (if applicable):
any (e.g. MRG 2.3, but also qpid 0.22)

How reproducible:
100%

Steps to Reproduce:
1) Create a large queue on a broker (C++ / Linux)
2) Start feeding messages into the queue using a C++/Linux program (in my case, messages of approximately 1 kB)
3) Connect with a receiver (C++/Windows) using SSL and prefetch 1000 (no client authentication; I used username & password)
4) Wait a few seconds to see the error in the receiver

Particular reproducer program: see https://issues.apache.org/jira/secure/attachment/12595257/client.cpp.

Actual results:
The receiver gets stuck and logs:
debug Exception constructed: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full. (C:\some\path\source\qpid\cpp\src\qpid\sys\windows\SslAsynchIO.cpp:350)

Expected results:
No stuck consumer, no error.

Additional info:
1) Decreasing the capacity seems to reduce the frequency with which the problem appears. However, with 1 MB messages, even capacity 1 doesn't seem to work.
2) Attempting to reproduce in-house, we ended up with a stuck client but without the error seen (could be a logging issue, though).
3) Increasing the BufferCount value in AsynchIO.h from 4 to 5 seems to solve the problem, at least in the sense that the error no longer reproduces.
See JIRA QPID-5033
The JIRA issue is fixed upstream; detailed comments on the nature of the fix are available there:
https://issues.apache.org/jira/browse/QPID-5033

Downstream patch available in branch: 0.22-mrg-cjansen-bz995496
Justin, I see the BZ has the flag mrg-2.3.x+: does that mean the fix will be backported both to 0.30-* (MRG-M 3.1) and _also_ to 0.18-* (MRG 2.5.*)? (The customer is asking about this.)
Pavel, I don't know why this has 2.3.x. That's got to be wrong. (Remember, don't set multiple version flags!)

This is set to appear in our 3.1 builds (coming soon). If you want it backported, you need to clone this bug and raise it for 2.5.x. We're not opposed, just need to look at the scope of the change.

(In reply to Pavel Moravec from comment #4)
> Justin, I see the BZ has flag mrg-2.3.x+ : does it mean the fix will be
> backported both to 0.30-* (MRG-M 3.1) and _also_ to 0.18-* (MRG 2.5.*)?
>
> (customer is asking around this)
The exception no longer appears in the clients' logs.

Verified with brokers on RHEL 6.6 i686 and RHEL 6.6 x86_64, and with clients on Windows 7 x86 and x64, Windows 8 x86 and x64, Windows Server 2008 x64 and R2, and Windows Server 2012 R2, using packages qpid-cpp-win-3.30.5.1-1 for MS Visual Studio 2008 and 2010 and qpid-cpp-0.30-6.el6.

--> VERIFIED
Retested also with .NET client; this issue did not occur.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHEA-2015-0805.html