Bug 1030219 - C++ client on Windows connecting a broker via AMQP 1.0 protocol fails
Summary: C++ client on Windows connecting a broker via AMQP 1.0 protocol fails
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: Development
Hardware: Unspecified
OS: Windows
Priority: high
Severity: medium
Target Milestone: 3.0
Target Release: ---
Assignee: Chuck Rolke
QA Contact: Petra Svobodová
URL:
Whiteboard:
Depends On:
Blocks: 1010399
Reported: 2013-11-14 08:03 UTC by Petra Svobodová
Modified: 2014-09-24 15:09 UTC

Fixed In Version: qpid-cpp-0.22-27
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-09-24 15:09:12 UTC
Target Upstream Version:
Embargoed:


Attachments
clients crashes call stacks transcripts (2.98 KB, application/zip)
2013-12-06 14:06 UTC, Petra Svobodová


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Apache JIRA | QPID-5363 | 0 | None | None | None | Never
Red Hat Product Errata | RHEA-2014:1296 | 0 | normal | SHIPPED_LIVE | Red Hat Enterprise MRG Messaging 3.0 Release | 2014-09-24 19:00:06 UTC

Description Petra Svobodová 2013-11-14 08:03:07 UTC
Description of problem:
The C++ client on Windows, connecting to a broker via the AMQP 1.0 protocol, sends and receives messages and then crashes before closing; it returns exit code -1073741819.

I did not see this issue with the C# client.

Version-Release number of selected component (if applicable):
qpid-cpp-win-3.22.24.1-1

How reproducible:
On Windows Server 2008 R2 in about 20% of cases; on other machines in about 3% of cases.

Steps to Reproduce:
1. Unpack and build C++ examples from qpid-cpp-win package.
2. Run in a loop: 'hello_world.exe <broker_hostname>:5672 amq.topic "{protocol: amqp1.0}"'
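
Because the crash is intermittent, step 2 needs many iterations to show up. The loop could be driven by a small C++ harness such as the sketch below. This is only an illustration: `LoopStats`, `systemRunner`, and `runInLoop` are hypothetical names, not part of the qpid-cpp-win package. It relies on `std::system`, which on Windows returns the exit code of the command; on POSIX systems the return value is a wait status instead.

```cpp
#include <cassert>
#include <cstdlib>
#include <functional>
#include <iostream>
#include <string>

// Counters for a repeated run of one command (hypothetical helper type).
struct LoopStats {
    int runs = 0;      // how many times the command was started
    int failures = 0;  // how many runs returned a non-zero status
};

// Default runner: hands the command line to the system command interpreter.
int systemRunner(const std::string& command) {
    return std::system(command.c_str());
}

// Runs `command` `iterations` times and counts the runs that did not exit with 0.
// The runner is injectable so the counting logic can be exercised without a broker.
LoopStats runInLoop(const std::string& command, int iterations,
                    const std::function<int(const std::string&)>& runner = systemRunner) {
    assert(iterations >= 0);
    LoopStats stats;
    for (int i = 0; i < iterations; ++i) {
        const int status = runner(command);
        ++stats.runs;
        if (status != 0) {
            ++stats.failures;
            std::cerr << "run " << i << " exited with status " << status << '\n';
        }
    }
    return stats;
}
```

A caller would pass the full hello_world.exe command line from step 2, with the `<broker_hostname>` placeholder replaced by a real host name, and compare `failures` against `runs` to estimate the failure rate.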


Actual results:
"hello_world.exe" application sends and receives the message and sometimes fails with exit code -1073741819.

Expected results:
The application should send and receive the message and close cleanly with exit code 0.

Additional info:
This bug may be related to https://bugzilla.redhat.com/show_bug.cgi?id=1029780
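
The exit code in the description is easier to interpret as an unsigned 32-bit value: -1073741819 is 0xC0000005, the Windows access violation status that also appears in the debugger output in Comment 2. The sketch below shows the conversion; the function names are hypothetical and chosen only for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Reinterprets a signed process exit code as the unsigned 32-bit status value.
std::uint32_t toNtStatus(std::int32_t exitCode) {
    return static_cast<std::uint32_t>(exitCode);  // conversion is modulo 2^32
}

// Formats a 32-bit value as 0xXXXXXXXX.
std::string toHex(std::uint32_t value) {
    char buf[11];  // "0x" + 8 hex digits + terminating NUL
    std::snprintf(buf, sizeof buf, "0x%08X", static_cast<unsigned>(value));
    return buf;
}

// True when the exit code is the access violation status (0xC0000005).
bool isAccessViolation(std::int32_t exitCode) {
    return toNtStatus(exitCode) == 0xC0000005u;
}

// Self-check with the exit code reported in this bug.
void checkReportedExitCode() {
    assert(toHex(toNtStatus(-1073741819)) == "0xC0000005");
    assert(isAccessViolation(-1073741819));
}
```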

Comment 1 Justin Ross 2013-11-14 11:51:36 UTC
Chuck, please assess.

Comment 2 Chuck Rolke 2013-11-14 19:58:54 UTC
It is easy to reproduce when testing with VS2008 x86 32-bit. I need to get a complete kit built from sources to drill into the code.

A) Unhandled exception at 0x680ad568 (qpidmessagingd.dll) in hello_world.exe: 0xC0000005: Access violation reading location 0xddddddf9

B) qpidmessagingd.dll is loaded at 0x68020000

C) Stack frames
qpidmessagingd.dll!qpid::messaging::amqp::TcpTransport::close()  Line 123 + 0x10 bytes	C++
qpidmessagingd.dll!qpid::messaging::amqp::ConnectionContext::close()  Line 134 + 0x29 bytes	C++
qpidmessagingd.dll!qpid::messaging::amqp::ConnectionHandle::close()  Line 62	C++
qpidmessagingd.dll!qpid::messaging::Connection::close()  Line 78 + 0x24 bytes	C++
hello_world.exe!main(int argc=4, char * * argv=0x0053a5c8)  Line 51 + 0xe bytes	C++
hello_world.exe!__tmainCRTStartup()  Line 586 + 0x19 bytes	C
hello_world.exe!mainCRTStartup()  Line 403	C
kernel32.dll!7548336a() 	
[Frames below may be incorrect and/or missing, no symbols loaded for kernel32.dll]	
ntdll.dll!77729f72() 	
ntdll.dll!77729f45()
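
Two numbers in items A and B can be worked out directly. Subtracting the load address from the faulting address gives the offset of the fault inside qpidmessagingd.dll. The address being read, 0xddddddf9, is 0x1C above 0xDDDDDDDD, and the MSVC debug heap fills freed memory with the byte 0xDD, so the read looks like a member access through a pointer taken from an already freed object. That reading is an interpretation, not something stated in the comment, although it is consistent with the double free reported in Comment 4. The helper names below are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Offset of `address` inside a module loaded at `base`.
constexpr std::uint32_t moduleOffset(std::uint32_t address, std::uint32_t base) {
    return address - base;
}

// A pointer-sized value made of the 0xDD byte the MSVC debug heap writes into freed blocks.
constexpr std::uint32_t kFreedFill = 0xDDDDDDDDu;

// True when `address` is the freed-memory fill value plus a small member offset.
// The 0x1000 limit is an arbitrary assumption for this sketch.
constexpr bool looksLikeFreedPointer(std::uint32_t address, std::uint32_t maxOffset = 0x1000u) {
    return address >= kFreedFill && address - kFreedFill <= maxOffset;
}

// Self-check with the values from the debugger output above.
void checkReportedAddresses() {
    assert(moduleOffset(0x680ad568u, 0x68020000u) == 0x8D568u);
    assert(0xddddddf9u - kFreedFill == 0x1Cu);
    assert(looksLikeFreedPointer(0xddddddf9u));
}
```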

Comment 3 Chuck Rolke 2013-11-19 18:59:45 UTC
*** Bug 1029780 has been marked as a duplicate of this bug. ***

Comment 4 Chuck Rolke 2013-11-19 19:16:12 UTC
This issue has been tracked down to a double free of some objects. A proposed fix is up for review.

Comment 5 Chuck Rolke 2013-11-20 20:54:24 UTC
Fixed on trunk; committed as revision 1543935.

Comment 7 Petra Svobodová 2013-11-29 12:43:18 UTC
Tried to verify on packages qpid-cpp-win-3.22.29.1-1

This issue seems to be fixed; I did not see it on any supported architecture. However, https://bugzilla.redhat.com/show_bug.cgi?id=1029780, which is marked as a duplicate of this bug, occurred quite often (in about 10% of cases) and crashed both clients, C++ and C#; C++ more often.

I am not sure whether bz 1029780 is really a duplicate of this bz. Chuck, could you please look at it?

--> NEEDINFO

Comment 8 Chuck Rolke 2013-12-02 16:16:05 UTC
The bug in this BZ and in bz 1029780 is "an access violation at connection close". The problem was determined to be a race condition between two threads in the AsynchIO layer: the AsynchIO driver launches two events (close and eof), and these two events are picked up by different threads. In the Windows environment the close function calls the eof function, so two threads run the eof function at the same time. The fix serializes the eof function's access to internal structures so that objects are handled properly.

However, by its nature the problem exists regardless of what is layered on top of the AsynchIO interface, including SSL, the .NET binding, or any other binding.

I can't say with certainty that the two bugs are the same without seeing a stack trace of bz 1029780 and comparing the signatures. It would be helpful for debugging an issue like this if the test environment could produce a stack trace.
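
The race described in this comment can be sketched as follows. This is a minimal illustration under stated assumptions, not the Qpid AsynchIO code and not the committed fix: `Transport`, `close`, `eof`, `releaseCount`, and `raceCloseAndEof` are hypothetical names. It mirrors the Windows path in which close calls eof, and it serializes eof with a mutex and a flag so that the internal objects are released once even when two threads arrive together.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a transport whose close path ends in eof handling.
class Transport {
public:
    // Run by the thread that picks up the close event; on this path close invokes eof.
    void close() { eof(); }

    // Run by the thread that picks up the eof event. The lock serializes access
    // to the internal state, and the flag makes a second call a no-op.
    void eof() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (released_) {
            return;  // another thread already released the internal objects
        }
        released_ = true;
        ++releaseCount_;  // stands in for freeing the internal objects
    }

    int releaseCount() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return releaseCount_;
    }

private:
    mutable std::mutex mutex_;
    bool released_ = false;
    int releaseCount_ = 0;
};

// Delivers close and eof on two different threads and reports how many
// times the internal objects were released.
int raceCloseAndEof() {
    Transport transport;
    std::thread closer([&transport] { transport.close(); });
    std::thread reader([&transport] { transport.eof(); });
    closer.join();
    reader.join();
    assert(transport.releaseCount() <= 1);  // never released twice
    return transport.releaseCount();
}
```

Without the lock and the flag, both threads could reach the release step, which is the double free pattern reported in Comment 4.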

Comment 9 Petra Svobodová 2013-12-06 14:06:34 UTC
Created attachment 833620
clients crashes call stacks transcripts

The attachment contains call stack transcripts produced by the windbg tool. The first file contains call stacks of crashes on qpid-cpp-win-3.22.24.1-1 (AMQP 1.0 and SSL crashes), and the second contains call stacks of crashes on qpid-cpp-win-3.22.29.1-1 (SSL crash only; the AMQP 1.0 crash did not occur).

Comment 10 Petra Svobodová 2013-12-09 06:50:28 UTC
This issue does not occur on packages qpid-cpp-win-3.22.29.1-1, but bz 1029780 does, so bz 1029780 is not a duplicate of this bug.

Thank you for your quick answer, Chuck!

Verified on packages qpid-cpp-win-3.22.29.1-1 on the following platforms:
- Windows 7 x86 and x64
- Windows XP x86 SP3
- Windows Server 2003 x86 and x64
- Windows Server 2008 x86, x64 and R2

--> VERIFIED

Comment 11 errata-xmlrpc 2014-09-24 15:09:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-1296.html

