Bug 1030219 - C++ client on Windows connecting a broker via AMQP 1.0 protocol fails
Status: CLOSED ERRATA
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: Development
Hardware: Unspecified
OS: Windows
Priority: high
Severity: medium
Target Milestone: 3.0
Target Release: ---
Assigned To: Chuck Rolke
QA Contact: Petra Svobodová
Depends On:
Blocks: 1010399
Reported: 2013-11-14 03:03 EST by Petra Svobodová
Modified: 2014-09-24 11:09 EDT

Fixed In Version: qpid-cpp-0.22-27
Doc Type: Bug Fix
Last Closed: 2014-09-24 11:09:12 EDT
Type: Bug


Attachments
clients crashes call stacks transcripts (2.98 KB, application/zip), 2013-12-06 09:06 EST, Petra Svobodová


External Trackers
Apache JIRA: QPID-5363
Description Petra Svobodová 2013-11-14 03:03:07 EST
Description of problem:
The C++ client on Windows, connecting to a broker via the AMQP 1.0 protocol, sends and receives messages and then crashes before closing; it returns exit code -1073741819.

I did not see this issue with the C# client.

Version-Release number of selected component (if applicable):
qpid-cpp-win-3.22.24.1-1

How reproducible:
About 20% of runs on Windows Server 2008 R2; about 3% of runs on other machines.

Steps to Reproduce:
1. Unpack and build C++ examples from qpid-cpp-win package.
2. Run in a loop: 'hello_world.exe <broker_hostname>:5672 amq.topic "{protocol: amqp1.0}"'
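
A minimal sketch of a loop harness for step 2, useful for estimating the failure rate. The names `run_in_loop`, `shell_runner` and `LoopResult` are hypothetical and not part of the qpid-cpp-win package. It assumes that any non-zero result from a run counts as a failure; note that the value returned by `std::system` is implementation-defined (on Windows it is normally the exit code of the command).

```cpp
#include <cstdlib>
#include <functional>
#include <string>

// Totals collected over a series of runs (hypothetical helper type).
struct LoopResult {
    int runs;
    int failures;
};

// Calls run_once the given number of times and counts non-zero results.
LoopResult run_in_loop(const std::function<int()>& run_once, int iterations) {
    LoopResult result{0, 0};
    for (int i = 0; i < iterations; ++i) {
        ++result.runs;
        if (run_once() != 0) {
            ++result.failures;
        }
    }
    return result;
}

// Builds a runner that executes a command line through the system shell.
// The command string is supplied by the caller, for example the
// hello_world.exe invocation from step 2 with a real broker hostname.
std::function<int()> shell_runner(const std::string& command) {
    return [command] { return std::system(command.c_str()); };
}
```

For example, `run_in_loop(shell_runner(cmd), 100)` with `cmd` set to the command line from step 2 would report how many of 100 runs ended with a non-zero result.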


Actual results:
"hello_world.exe" application sends and receives the message and sometimes fails with exit code -1073741819.

Expected results:
The application should send and receive the message and close cleanly with exit code 0.

Additional info:
This bug may be related to https://bugzilla.redhat.com/show_bug.cgi?id=1029780
Comment 1 Justin Ross 2013-11-14 06:51:36 EST
Chuck, please assess.
Comment 2 Chuck Rolke 2013-11-14 14:58:54 EST
This is easy to reproduce when testing with VS2008 x86 32-bit. I need to get a complete kit built from sources to drill into the code.

A) Unhandled exception at 0x680ad568 (qpidmessagingd.dll) in hello_world.exe: 0xC0000005: Access violation reading location 0xddddddf9

B) qpidmessagingd.dll is loaded at 0x68020000

C) Stack frames
qpidmessagingd.dll!qpid::messaging::amqp::TcpTransport::close()  Line 123 + 0x10 bytes	C++
qpidmessagingd.dll!qpid::messaging::amqp::ConnectionContext::close()  Line 134 + 0x29 bytes	C++
qpidmessagingd.dll!qpid::messaging::amqp::ConnectionHandle::close()  Line 62	C++
qpidmessagingd.dll!qpid::messaging::Connection::close()  Line 78 + 0x24 bytes	C++
hello_world.exe!main(int argc=4, char * * argv=0x0053a5c8)  Line 51 + 0xe bytes	C++
hello_world.exe!__tmainCRTStartup()  Line 586 + 0x19 bytes	C
hello_world.exe!mainCRTStartup()  Line 403	C
kernel32.dll!7548336a() 	
[Frames below may be incorrect and/or missing, no symbols loaded for kernel32.dll]	
ntdll.dll!77729f72() 	
ntdll.dll!77729f45()
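
The numbers in this report can be cross-checked with a little arithmetic. The sketch below uses hypothetical helper names (nothing here comes from qpid). It shows that the exit code -1073741819 is the same 32-bit value as the 0xC0000005 access violation in (A), computes the offset of the faulting instruction inside qpidmessagingd.dll from (A) and (B), and checks whether the faulting read address carries the 0xDD byte pattern. The MSVC debug heap fills freed memory with 0xDD, so that pattern is consistent with, but does not prove, a read through a pointer taken from already freed memory.

```cpp
#include <cstdint>

// Reinterpret a signed 32-bit process exit code as an unsigned status value.
constexpr std::uint32_t to_ntstatus(std::int32_t exit_code) {
    return static_cast<std::uint32_t>(exit_code);
}

// Offset of an address from the base at which its module was loaded.
constexpr std::uint32_t module_offset(std::uint32_t address,
                                      std::uint32_t load_base) {
    return address - load_base;
}

// True when the upper three bytes of the address are all 0xDD, which is
// what a small offset from a pointer read out of 0xDD-filled memory gives.
constexpr bool looks_like_freed_debug_heap(std::uint32_t address) {
    return (address & 0xFFFFFF00u) == 0xDDDDDD00u;
}

// Values taken from the description and from items (A) and (B) above.
static_assert(to_ntstatus(-1073741819) == 0xC0000005u,
              "exit code matches the access violation status");
static_assert(module_offset(0x680ad568u, 0x68020000u) == 0x8d568u,
              "faulting instruction offset inside the DLL");
static_assert(looks_like_freed_debug_heap(0xddddddf9u),
              "read address carries the 0xDD fill pattern");
```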
Comment 3 Chuck Rolke 2013-11-19 13:59:45 EST
*** Bug 1029780 has been marked as a duplicate of this bug. ***
Comment 4 Chuck Rolke 2013-11-19 14:16:12 EST
This issue is tracked down to a double free of some objects. A proposed fix is up for review.
Comment 5 Chuck Rolke 2013-11-20 15:54:24 EST
Fixed on trunk in committed revision 1543935.
Comment 7 Petra Svobodová 2013-11-29 07:43:18 EST
Tried to verify on packages qpid-cpp-win-3.22.29.1-1.

This issue seems to be fixed; I did not see it on any supported architecture. However, https://bugzilla.redhat.com/show_bug.cgi?id=1029780, which is marked as a duplicate of this bug, still occurred quite often (in about 10% of cases) and crashed both clients, C++ and C#, the C++ client more often.

I am not sure whether bz 1029780 is really a duplicate of this bz. Chuck, could you possibly look at it, please?

--> NEEDINFO
Comment 8 Chuck Rolke 2013-12-02 11:16:05 EST
The bug in this BZ and in bz 1029780 is "an access violation at connection close". The problem was determined to be a race condition between two threads in the AsynchIO layer. That is, the AsynchIO driver launches two events (close and eof), and these two events are picked up by different threads. In the Windows environment the close function calls the eof function, so two threads can be running the eof function at the same time. The fix serializes the eof function's access to internal structures so that objects are handled properly.

However, the nature of the bug is that the problem exists regardless of anything layered on top of the AsynchIO interface, including SSL and the .NET or any other binding.

I can't say with certainty that the two bugs are the same without seeing a stack trace of bz 1029780 and comparing the signatures. It would be helpful for debugging an issue like this if the test environment could produce a stack trace.
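
The race described above, and the general shape of a serializing fix, can be sketched as follows. This is an illustration only: `IoHandle` and `run_close_and_eof_concurrently` are hypothetical names, not the qpid AsynchIO code, and the mutex-plus-flag approach is an assumption about one way to serialize the eof path, not a copy of the committed change.

```cpp
#include <mutex>
#include <thread>

// Hypothetical stand-in for an I/O object whose close() path calls eof().
class IoHandle {
public:
    // Picked up by the thread that handles the eof event.
    void eof() {
        std::lock_guard<std::mutex> guard(lock_);
        if (released_) {
            return;  // the other thread has already torn the object down
        }
        released_ = true;
        ++teardown_count_;  // stands in for freeing internal structures
    }

    // Picked up by the thread that handles the close event.
    void close() { eof(); }

    int teardown_count() const {
        std::lock_guard<std::mutex> guard(lock_);
        return teardown_count_;
    }

private:
    mutable std::mutex lock_;
    bool released_ = false;
    int teardown_count_ = 0;
};

// Delivers close and eof on two different threads and reports how many
// times the teardown step actually ran.
int run_close_and_eof_concurrently() {
    IoHandle io;
    std::thread closer([&io] { io.close(); });
    std::thread reader([&io] { io.eof(); });
    closer.join();
    reader.join();
    return io.teardown_count();
}
```

Without the lock and the `released_` check, both threads could reach the teardown step, which matches the double free mentioned in comment 4.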
Comment 9 Petra Svobodová 2013-12-06 09:06:34 EST
Created attachment 833620 [details]
clients crashes call stacks transcripts

The attachment contains call stack transcripts produced with the WinDbg tool. The first file contains call stacks of crashes on qpid-cpp-win-3.22.24.1-1 (AMQP 1.0 and SSL crashes), and the second contains call stacks of crashes on qpid-cpp-win-3.22.29.1-1 (SSL crash only; the AMQP 1.0 crash did not occur).
Comment 10 Petra Svobodová 2013-12-09 01:50:28 EST
This issue does not occur on packages qpid-cpp-win-3.22.29.1-1, but bz 1029780 does, so bz 1029780 is not a duplicate of this bug.

Thank you for your quick answer, Chuck!

Verified on packages qpid-cpp-win-3.22.29.1-1 and platforms:
- Windows 7 x86 and x64
- Windows XP x86 SP3
- Windows Server 2003 x86 and x64
- Windows Server 2008 x86, x64 and R2

--> VERIFIED
Comment 11 errata-xmlrpc 2014-09-24 11:09:12 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-1296.html
