Bug 1025693

Summary: qpid sender hang on close() after using a std::cout while sending messages with AMQP 1.0
Product: Red Hat Enterprise MRG Reporter: Zdenek Kraus <zkraus>
Component: qpid-cppAssignee: Gordon Sim <gsim>
Status: CLOSED CURRENTRELEASE QA Contact: Zdenek Kraus <zkraus>
Severity: medium Docs Contact:
Priority: medium    
Version: 3.0CC: esammons, freznice, gsim, iboverma, jross
Target Milestone: 3.0   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: qpid-cpp-0.22-25.el6 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-01-21 12:55:32 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1010399    

Description Zdenek Kraus 2013-11-01 10:11:51 UTC
Description of problem:
We encounter this deadlock on the 1 CPU machine. When sender using AMQP 1.0 sends messages to a policy reject limited queue, and while sending messages uses a stdout, it hangs on the sender.close()

Version-Release number of selected component (if applicable):
qpid-cpp-client-0.22-23.el6.x86_64

How reproducible:
~8%

Steps to Reproduce:
1. patch the client (see additinal info) to use stdout, and also to make subsequent closes for sender sesssion and close (latter step is not necessary, but show where is the deadlock)

2. compile
g++ -lqpidmessaging spout.cpp OptionParser.cpp -g -o spout

3. loop until hands, restarting broker, sending exceeding amount of messages to limited queue with flow control off, with patched spout (see additional info)

4. backtrack the deadlock (see actual results)
$ gdb
(gdb) attach <SPOUT PID>
(gdb) t a a bt

Actual results:
Thread 2 (Thread 0x7fad3d024700 (LWP 6657)):
#0  0x00000039c98e8f03 in epoll_wait () from /lib64/libc.so.6
#1  0x00000035c814b79d in qpid::sys::Poller::wait (this=0x6cce20, timeout=...)
    at /usr/src/debug/qpid-0.22/cpp/src/qpid/sys/epoll/EpollPoller.cpp:566
#2  0x00000035c814be71 in qpid::sys::Poller::run (this=0x6cce20)
    at /usr/src/debug/qpid-0.22/cpp/src/qpid/sys/epoll/EpollPoller.cpp:518
#3  0x00000035c814138a in qpid::sys::(anonymous namespace)::runRunnable (
    p=<value optimized out>)
    at /usr/src/debug/qpid-0.22/cpp/src/qpid/sys/posix/Thread.cpp:35
#4  0x00000039ca007851 in start_thread () from /lib64/libpthread.so.0
#5  0x00000039c98e890d in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7fad3dcc5840 (LWP 6656)):
#0  0x00000039ca00b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000035c8638224 in wait (this=0x6d8e50)
    at /usr/src/debug/qpid-0.22/cpp/src/qpid/sys/posix/Condition.h:63
#2  wait (this=0x6d8e50) at /usr/src/debug/qpid-0.22/cpp/src/qpid/sys/Monitor.h:41
#3  qpid::messaging::amqp::ConnectionContext::wait (this=0x6d8e50)
    at /usr/src/debug/qpid-0.22/cpp/src/qpid/messaging/amqp/ConnectionContext.cpp:423
#4  0x00000035c863bf18 in qpid::messaging::amqp::ConnectionContext::detach (
    this=0x6d8e50, ssn=..., lnk=...)
    at /usr/src/debug/qpid-0.22/cpp/src/qpid/messaging/amqp/ConnectionContext.cpp:253
#5  0x00000035c864d9ad in qpid::messaging::amqp::SenderHandle::close (
    this=<value optimized out>)
    at /usr/src/debug/qpid-0.22/cpp/src/qpid/messaging/amqp/SenderHandle.cpp:47
#6  0x00000000004032e9 in main (argc=10, argv=0x7fff188e0528) at spout.cpp:167

Expected results:


Additional info:
## spout.patch:
163a164
>                 std::cout << "jg15mRAy9IFCH0hXAEiYymeMFjIYRzp1OTBopkAqTw1UR1qgcV4sgQYMOL21iUSUFaR5gb2yuI4p1520eRajokMISqtsyYkwb4R9YSXFiEqirD27hQEbvvH2pxUJs7YeEDSZ9ohMCa1YK46ILJQp3YYtf9B8vih3vljynkCc9nABj0DqqgqcWnzCj1l0IDaFhGgw4FrJQqk4NHSIy4rLgwQ5tctfVOJsqiis83gVK0QQPqUWDA0P5dMr0Nnn9i8abKjDjVe3nvnybOBoCqzYAWlCpjjtJorzgtfrUdxCSRBN3qBrooAu1iwHsZUXnDfaZa7Jx7hyVaivFZiaQOF545WQ9zVGeeHUrXHqj09NgiANTEXp8BFbmdpxnf6rerJd9h3zKKhm1MjGPMG4ETH2zC2l2oXHPZh5jpwlDVlUbempAxvVMhiK32qgFFZsIY4GBREVQKmHlCe19e4ZJFtgpAUIsNNHeewePzj0XctjhX6f1078NrkX75T241CLxlYe1" << std::endl;
165a167,168
>             sender.close();
>             session.close();

## NOTE: the string is the same, but IMO it could use any long enough string with the endl at the end.


### LOOPING reproducer command
$ for i in $(seq 1 100); do echo -e "\n\n-- $i --"; service qpidd restart; /var/dtests/node_data/clients/spout --broker 192.168.6.2:5672 --connection-options "{  protocol : 'amqp1.0' }" --count 7 --content "jg15mRAy9IFCH0hXAEiYymeMFjIYRzp1OTBopkAqTw1UR1qgcV4sgQYMOL21iUSUFaR5gb2yuI4p1520eRajokMISqtsyYkwb4R9YSXFiEqirD27hQEbvvH2pxUJs7YeEDSZ9ohMCa1YK46ILJQp3YYtf9B8vih3vljynkCc9nABj0DqqgqcWnzCj1l0IDaFhGgw4FrJQqk4NHSIy4rLgwQ5tctfVOJsqiis83gVK0QQPqUWDA0P5dMr0Nnn9i8abKjDjVe3nvnybOBoCqzYAWlCpjjtJorzgtfrUdxCSRBN3qBrooAu1iwHsZUXnDfaZa7Jx7hyVaivFZiaQOF545WQ9zVGeeHUrXHqj09NgiANTEXp8BFbmdpxnf6rerJd9h3zKKhm1MjGPMG4ETH2zC2l2oXHPZh5jpwlDVlUbempAxvVMhiK32qgFFZsIY4GBREVQKmHlCe19e4ZJFtgpAUIsNNHeewePzj0XctjhX6f1078NrkX75T241CLxlYe1"  "test_queue_policy_reject_both;{'node': {'x-declare': {'arguments': {'qpid.flow_stop_count': 0, 'qpid.max_count': 8, 'qpid.policy_type': 'reject', 'qpid.flow_resume_count': 0, 'qpid.flow_stop_size': 0, 'qpid.max_size': 4096, 'qpid.flow_resume_size': 0}}}, 'create': 'always'}"; done

Comment 1 Zdenek Kraus 2013-11-04 09:59:55 UTC
This issue was also encountered on 2 cpu machine, with much larger string sent to stdout.

Comment 2 Gordon Sim 2013-11-04 15:02:57 UTC
Fixed upstream: https://svn.apache.org/r1538629

Comment 3 Zdenek Kraus 2014-01-06 16:39:21 UTC
This was tested on RHEL 6.5 i686, x86_64 with following packages:

qpid-cpp-server-0.22-30.el6
qpid-cpp-server-xml-0.22-30.el6
qpid-proton-debuginfo-0.6-1.el6
qpid-cpp-debuginfo-0.22-30.el6
qpid-jca-0.22-1.el6
qpid-jca-xarecovery-0.22-1.el6
qpid-qmf-devel-0.22-25.el6
qpid-tools-0.22-7.el6
qpid-proton-c-0.6-1.el6
qpid-qmf-0.22-25.el6
python-qpid-qmf-0.22-25.el6
qpid-proton-c-devel-0.6-1.el6
qpid-java-example-0.22-5.el6
python-qpid-0.22-9.el6
qpid-cpp-client-0.22-30.el6
qpid-cpp-client-ssl-0.22-30.el6
qpid-cpp-server-devel-0.22-30.el6
qpid-java-client-0.22-5.el6
qpid-cpp-server-ssl-0.22-30.el6
qpid-java-common-0.22-5.el6
qpid-cpp-client-devel-0.22-30.el6
qpid-cpp-server-store-0.22-30.el6
qpid-cpp-client-devel-docs-0.22-30.el6

no hang on 10k runs.

fix work as expected.