Bug 474865 - Java client appears not to handle commit failed error
Java client appears not to handle commit failed error
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-java (Show other bugs)
1.1
All Linux
high Severity high
: 1.1.1
: ---
Assigned To: Rajith Attapattu
ppecka
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2008-12-05 12:07 EST by Rajith Attapattu
Modified: 2011-07-14 10:07 EDT (History)
3 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-07-14 10:07:20 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Rajith Attapattu 2008-12-05 12:07:00 EST
I ran a java test with a tx size=5000.
It looks like the queue size wasn't enough and the commit failed.
error Failed to prepare: Enqueue capacity threshold exceeded on queue "testQueueD". (JournalImpl.cpp:538)
error Execution exception: internal-error: Commit failed (qpid/broker/SemanticState.cpp:125)

However the broker needs to propogate this exception to the client so it can handle this exception correctly. Currently nothing is sent and the client times out on the sync.

Error when running test timed out waiting for sync: complete = 180058, point = 185059
org.apache.qpid.transport.SessionException: timed out waiting for sync: complete = 180058, point = 185059
        at org.apache.qpid.transport.Session.sync(Session.java:568)
Comment 1 Gordon Sim 2008-12-05 12:19:50 EST
The broker does send an exception in this case. I ran a perftest with --pub-tx 10000 --durable yes with default journal layout. That hits the exact same error and an exception is sent to the client that then reports it.

I wonder whether this is therefore a bug in the java client. Categorising it as such for now until further evidence proves otherwise.
Comment 2 Rajith Attapattu 2008-12-15 18:08:50 EST
After further analyzing the logs for the above bz I found that the
broker does send a connection close with the correct error code.

But then the java client tries to reconnect immediately (as indicated by
the logs below). The broker has marked the session as session_busy and
then it becomes not_attached. On the broker side the following line is
printed repeatedly. 

2008-dec-15 17:24:27 error Channel exception: not-attached: Channel 1 is
not attached (qpid/amqp_0_10/SessionHandler.cpp:67)

Eventually the java client throws the following exception after it times
out trying to resume the session.

org.apache.qpid.transport.SessionException: timed out waiting for
session to become open (state=DETACHED)

However the original reason that caused the session close is not visible
(unless u enable tracing).

As far I can see there are two things wrong here.

1) If the session was closed due to an error, IMO it should throw the
exception up the stack and should not try to reconnect. It should only
attempt a reconnect if the underlying connection was broken due to
network issue or broker crashing.

2) The broker on the other hand should not allow to resume a session
that was closed due to a semantic error.

I would appreciate if you guys could help clarify the correct behavior.


-------------- Java client logs --------------------------------------
IoReceiver - localhost/127.0.0.1:5672 2008-12-15 17:06:50,860 DEBUG
[apache.qpid.transport.Connection] RECV: [conn:13b8f864] ch=0
ConnectionClose(replyCode=FRAMING_ERROR, replyText=Enqueue capacity
threshold exceeded on queue "testQueueD". (JournalImpl.cpp:538))
IoReceiver - localhost/127.0.0.1:5672 2008-12-15 17:06:50,860 DEBUG
[apache.qpid.transport.Connection] RECV: [conn:13b8f864] ch=0
ConnectionClose(replyCode=FRAMING_ERROR, replyText=Enqueue capacity
threshold exceeded on queue "testQueueD". (JournalImpl.cpp:538))

IoReceiver - localhost/127.0.0.1:5672 2008-12-15 17:06:50,867 DEBUG
[apache.qpid.transport.Connection] RECV: [conn:13b8f864] AMQP.1 0-10
IoReceiver - localhost/127.0.0.1:5672 2008-12-15 17:06:50,867 DEBUG
[apache.qpid.transport.Connection] RECV: [conn:13b8f864] AMQP.1 0-10
IoReceiver - localhost/127.0.0.1:5672 2008-12-15 17:06:50,868 DEBUG
[apache.qpid.transport.Connection] RECV: [conn:13b8f864] ch=0
ConnectionStart(serverProperties={qpid.federation_tag=ef5b9951-7798-41a1-8f80-0ff30b8f093a}, mechanisms=[ANONYMOUS], locales=[en_US])

IoReceiver - localhost/127.0.0.1:5672 2008-12-15 17:06:50,954 DEBUG
[apache.qpid.transport.Connection] RECV: [conn:13b8f864] ch=1
SessionDetached(name=[B@36baa466, code=SESSION_BUSY)
IoReceiver - localhost/127.0.0.1:5672 2008-12-15 17:06:50,954 DEBUG
[apache.qpid.transport.Connection] RECV: [conn:13b8f864] ch=1
SessionDetached(name=[B@36baa466, code=SESSION_BUSY)
IoReceiver - localhost/127.0.0.1:5672 2008-12-15 17:06:53,597 DEBUG
[apache.qpid.transport.Connection] RECV: [conn:13b8f864] ch=1
SessionDetached(name=[B@177f409c, code=NOT_ATTACHED)
IoReceiver - localhost/127.0.0.1:5672 2008-12-15 17:06:53,597 DEBUG
[apache.qpid.transport.Connection] RECV: [conn:13b8f864] ch=1
SessionDetached(name=[B@177f409c, code=NOT_ATTACHED)

-----------------------------------------------------------------------
Comment 3 Rajith Attapattu 2008-12-15 18:10:14 EST
We should perhaps change the description of the bz as it has nothing to do with tx anymore, rather a more general problem with the java failover mechanism.
Is that allowed or should I create a new bz?
Comment 4 Rajith Attapattu 2009-04-15 12:26:06 EDT
This issue is fixed as of rev749740 on Qpid trunk.
The packages for MRG 1.1.1 is built of rev 751061 and is inclusive of this fix.
Comment 5 Justin Ross 2011-06-28 14:16:47 EDT
Frantisek, does this one have test coverage?
Comment 6 ppecka 2011-07-14 10:03:53 EDT
VERIFIED on RHEL 5.6 / 6.2 (i686/x86_64)

# rpm -qa | grep qpid | sort -u
python-qpid-0.10-1.el5
python-qpid-qmf-0.10-10.el5
qpidc-debuginfo-0.5.752581-3.el5
qpid-cpp-client-0.10-8.el5
qpid-cpp-client-devel-0.10-8.el5
qpid-cpp-client-devel-docs-0.10-8.el5
qpid-cpp-client-ssl-0.10-8.el5
qpid-cpp-server-0.10-8.el5
qpid-cpp-server-cluster-0.10-8.el5
qpid-cpp-server-devel-0.10-8.el5
qpid-cpp-server-ssl-0.10-8.el5
qpid-cpp-server-store-0.10-8.el5
qpid-cpp-server-xml-0.10-8.el5
qpid-java-client-0.10-6.el5
qpid-java-common-0.10-6.el5
qpid-java-example-0.10-6.el5
qpid-qmf-0.10-10.el5
qpid-qmf-devel-0.10-10.el5
qpid-tools-0.10-6.el5

--> VERIFIED

Note You need to log in before you can comment on or make changes to this bug.