Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 894416

Summary: qpid java client is not able to receive more than 500 map messages with qmf structured data
Product: Red Hat Enterprise MRG
Component: qpid-java
Version: Development
Reporter: Frantisek Reznicek <freznice>
Assignee: Frantisek Reznicek <freznice>
QA Contact: MRG Quality Engineering <mrgqe-bugs>
CC: esammons, iboverma, jross, lzhaldyb, rrajasek
Status: CLOSED NOTABUG
Severity: urgent
Priority: high
Target Milestone: 2.3
Hardware: Unspecified
OS: Unspecified
Type: Bug
Doc Type: Bug Fix
Last Closed: 2013-01-17 09:22:14 UTC
Attachments: Testing clients (no flags)

Description Frantisek Reznicek 2013-01-11 17:15:25 UTC
Description of problem:

The qpid Java client is not able to receive more than 500 map messages carrying QMF structured data.

The Java client application stops while reading the 501st map message (containing clientConnect QMF map data). A jstack analysis shows it hanging here:

[root@kvm-guest-01 bz845223]# jstack 1524
2013-01-11 11:23:05
Full thread dump Java HotSpot(TM) Server VM (20.12-b01 mixed mode):

"Attach Listener" daemon prio=10 tid=0x09244000 nid=0xad5 runnable [0x00000000]
   java.lang.Thread.State: RUNNABLE

"Dispatcher-0-Conn-1" prio=10 tid=0x6ff1fc00 nid=0x603 in Object.wait() [0x6fe0b000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0xa0000118> (a org.apache.qpid.client.util.FlowControllingBlockingQueue)
        at java.lang.Object.wait(Object.java:485)
        at org.apache.qpid.client.util.FlowControllingBlockingQueue.take(FlowControllingBlockingQueue.java:92)
        - locked <0xa0000118> (a org.apache.qpid.client.util.FlowControllingBlockingQueue)
        at org.apache.qpid.client.AMQSession$Dispatcher.run(AMQSession.java:3348)
        at java.lang.Thread.run(Thread.java:662)

"ack-flusher" daemon prio=10 tid=0x6ff7f400 nid=0x602 in Object.wait() [0x6fe5c000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0xa00001d8> (a java.util.TaskQueue)
        at java.util.TimerThread.mainLoop(Timer.java:509)
        - locked <0xa00001d8> (a java.util.TaskQueue)
        at java.util.TimerThread.run(Timer.java:462)

"IoReceiver - localhost/127.0.0.1:5672" daemon prio=10 tid=0x6ff20400 nid=0x601 runnable [0x6fead000]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:129)
        at org.apache.qpid.transport.network.io.IoReceiver.run(IoReceiver.java:147)
        at java.lang.Thread.run(Thread.java:662)

"IoSender - localhost/127.0.0.1:5672" daemon prio=10 tid=0x6ff04800 nid=0x600 in Object.wait() [0x6fefe000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0xa00018e8> (a java.lang.Object)
        at java.lang.Object.wait(Object.java:485)
        at org.apache.qpid.transport.network.io.IoSender.run(IoSender.java:284)
        - locked <0xa00018e8> (a java.lang.Object)
        at java.lang.Thread.run(Thread.java:662)

"Low Memory Detector" daemon prio=10 tid=0x09011400 nid=0x5fe runnable [0x00000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread1" daemon prio=10 tid=0x0900fc00 nid=0x5fd waiting on condition [0x00000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" daemon prio=10 tid=0x09004400 nid=0x5fc waiting on condition [0x00000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x09002800 nid=0x5fb runnable [0x00000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x08ff0c00 nid=0x5fa in Object.wait() [0x704f1000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0xa007c900> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
        - locked <0xa007c900> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)

"Reference Handler" daemon prio=10 tid=0x08fef400 nid=0x5f9 in Object.wait() [0x70542000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0xa0000288> (a java.lang.ref.Reference$Lock)
        at java.lang.Object.wait(Object.java:485)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
        - locked <0xa0000288> (a java.lang.ref.Reference$Lock)

"main" prio=10 tid=0x08f6fc00 nid=0x5f5 waiting on condition [0xb7387000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0xa0067f58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
        at org.apache.qpid.client.BasicMessageConsumer.getMessageFromQueue(BasicMessageConsumer.java:460)
        at org.apache.qpid.client.BasicMessageConsumer_0_10.getMessageFromQueue(BasicMessageConsumer_0_10.java:397)
        at org.apache.qpid.client.BasicMessageConsumer.receive(BasicMessageConsumer.java:422)
        at org.apache.qpid.client.BasicMessageConsumer.receive(BasicMessageConsumer.java:397)
        at EventReceiver.main(EventReceiver.java:61)

"VM Thread" prio=10 tid=0x08feb800 nid=0x5f8 runnable

"GC task thread#0 (ParallelGC)" prio=10 tid=0x08f77000 nid=0x5f6 runnable

"GC task thread#1 (ParallelGC)" prio=10 tid=0x08f78400 nid=0x5f7 runnable

"VM Periodic Task Thread" prio=10 tid=0x09013400 nid=0x5ff waiting on condition

JNI global references: 1444

So far this has been seen only on RHEL 5.9 i386.

Version-Release number of selected component (if applicable):
python-qpid-0.18-4.el5
python-qpid-qmf-0.18-13.el5
qpid-cpp-client-0.18-13.el5
qpid-cpp-client-devel-0.18-13.el5
qpid-cpp-client-devel-docs-0.18-13.el5
qpid-cpp-client-rdma-0.18-13.el5
qpid-cpp-client-ssl-0.18-13.el5
qpid-cpp-mrg-debuginfo-0.18-13.el5
qpid-cpp-server-0.18-13.el5
qpid-cpp-server-cluster-0.18-13.el5
qpid-cpp-server-devel-0.18-13.el5
qpid-cpp-server-ha-0.18-13.el5
qpid-cpp-server-rdma-0.18-13.el5
qpid-cpp-server-ssl-0.18-13.el5
qpid-cpp-server-store-0.18-13.el5
qpid-cpp-server-xml-0.18-13.el5
qpid-java-client-0.18-6.el5
qpid-java-common-0.18-6.el5
qpid-java-example-0.18-6.el5
qpid-qmf-0.18-13.el5
qpid-qmf-debuginfo-0.18-13.el5
qpid-qmf-devel-0.18-13.el5
qpid-tests-0.18-2.el5
qpid-tools-0.18-7.el5
rh-qpid-cpp-tests-0.18-13.el5
ruby-qpid-qmf-0.18-13.el5  

How reproducible:
100%

Steps to Reproduce:
0. service qpidd restart
1. make clean && make   # build the clients
2. while true; do ./event_receiver localhost:5672 "qmf.default.topic/agent.ind.event.org_apache_qpid_broker.ClientConnect.#" "" -1; done   # termA
3. ./event_receiver.sh localhost:5672 "qmf.default.topic/agent.ind.event.org_apache_qpid_broker.clientConnect.#"   # termB
4. ./event_receiver.py localhost:5672 "qmf.default.topic/agent.ind.event.org_apache_qpid_broker.clientConnect.#"   # termC
5. ./event_receiver localhost:5672 "qmf.default.topic/agent.ind.event.org_apache_qpid_broker.clientConnect.#"   # termD
6. # sometimes the C++ client from termD crashes
  
Actual results:
500 QMF events/alerts are received OK, but receiving the 501st hangs.

Expected results:
No such limitation should exist.

Additional info:

Comment 1 Frantisek Reznicek 2013-01-11 17:17:56 UTC
Created attachment 677011 [details]
Testing clients

Comment 2 Frantisek Reznicek 2013-01-11 19:10:15 UTC
The attached testing clients show that both the C++ and Python event receivers can receive an unlimited number of messages/events, but the Java one hangs while processing the 501st message/event.

Comment 4 Justin Ross 2013-01-14 15:24:22 UTC
Rajith, please assess.

Comment 6 Rajith Attapattu 2013-01-15 20:58:04 UTC
There is no bug in the Java client; the issue is in the test program.

The test uses CLIENT_ACKNOWLEDGE but never acknowledges any messages. Once the credits are exhausted, the broker stops sending further messages and the client appears hung because it is simply waiting for more.

Why 500? Because the default message credit is set to max_prefetch, which defaults to 500.

Please modify the test program to acknowledge the messages and everything should be fine.

Regards,

Rajith
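Illustratively, the fix described above amounts to calling acknowledge() on each received message in a CLIENT_ACKNOWLEDGE session. The sketch below is a hypothetical minimal receiver (not the attached test code) against the legacy Qpid JMS client; the connection URL and the destination string are assumptions modeled on the reproducer's subscription:

```java
import javax.jms.Connection;
import javax.jms.Destination;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.Session;

import org.apache.qpid.client.AMQConnection;

public class EventReceiverAcked {
    public static void main(String[] args) throws Exception {
        // Example URL only; adjust credentials, vhost and broker address.
        Connection conn = new AMQConnection(
            "amqp://guest:guest@clientid/test?brokerlist='tcp://localhost:5672'");
        conn.start();

        // CLIENT_ACKNOWLEDGE: the application must ack explicitly.
        Session session = conn.createSession(false, Session.CLIENT_ACKNOWLEDGE);
        Destination dest = session.createQueue(
            "qmf.default.topic/agent.ind.event.org_apache_qpid_broker.clientConnect.#");
        MessageConsumer consumer = session.createConsumer(dest);

        while (true) {
            Message msg = consumer.receive();
            // ... process the QMF event ...

            // Without this call the broker's per-session credit
            // (max_prefetch, default 500) is never replenished, and
            // receive() blocks forever on the 501st message.
            msg.acknowledge();
        }
    }
}
```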

Comment 7 Frantisek Reznicek 2013-01-17 09:22:14 UTC
Dear Rajith, you are indeed right.
Not acking the messages hits the max_prefetch limit.
I worked around it in two ways:
  - acking the messages
  - raising max_prefetch

and in each case the limit disappears, or moves to the new value, as expected.

This defect therefore has no remaining content.
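For reference, the prefetch limit mentioned above can be adjusted on this client via the connection URL's maxprefetch option (the value 1000 here is purely illustrative, as are the credentials and host):

```
amqp://guest:guest@clientid/test?maxprefetch='1000'&brokerlist='tcp://localhost:5672'
```

The same limit can also be raised JVM-wide with the max_prefetch system property, e.g. -Dmax_prefetch=1000.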

-> CLOSED / NOTABUG