Description of problem:
There are 2^16 channels available on each connection. Each session in use requires one of these channels to be assigned. If more than 2^16 sessions are created and deleted, the channel allocated to new sessions 'wraps around' and can collide with the channel allocated to a pre-existing session, resulting in a session-busy exception. This exception renders the pre-existing session unusable on the client side (though the broker still assumes it is attached).

Version-Release number of selected component (if applicable):
1.1.6

How reproducible:
Easily.

Steps to Reproduce:
1. start broker
2. run attached test program
3. send a stream of messages to test-queue (e.g. for i in `seq 1 100000`; do echo message $i; sleep 1; done | sender)

Actual results:
The test program prints out the messages it receives. After some time (when 2^16 + 2 extra sessions have been created) it reports a session-busy exception (e.g. Failed to create session: session-busy: Channel 2 attached to anonymous.620486fd-ab5d-4beb-a58a-48759b9ad489). After this occurs, no messages are received by the test program.

Expected results:
The pre-existing session should remain operational and continue to receive messages regardless of wraparound.

Additional info:
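For illustration only, the wraparound described above can be modelled with a small sketch. This is not the qpid client code: the class `WrappingAllocator`, its methods, and the helper `sessions_until_collision` are hypothetical names, and the counter is assumed to start at channel 0. The real client evidently starts elsewhere, since the report sees the failure on Channel 2 after 2^16 + 2 extra sessions, so the exact count below should not be read as matching the test program.

```python
# Hypothetical model of the pre-fix behaviour; not the actual qpid client code.
CHANNEL_COUNT = 2 ** 16  # channels available per connection, per the report


class WrappingAllocator:
    """Hands out channel numbers from a counter that wraps at 2^16."""

    def __init__(self):
        self.next_channel = 0  # assumed starting point
        self.in_use = set()

    def attach(self):
        channel = self.next_channel
        self.next_channel = (self.next_channel + 1) % CHANNEL_COUNT
        if channel in self.in_use:
            # Stands in for the session-busy exception seen in the report.
            raise RuntimeError(f"session-busy: Channel {channel} already attached")
        self.in_use.add(channel)
        return channel

    def detach(self, channel):
        self.in_use.discard(channel)


def sessions_until_collision():
    """Keep one session attached while short-lived sessions come and go."""
    alloc = WrappingAllocator()
    long_lived = alloc.attach()  # stays attached for the whole run
    created = 1
    while True:
        try:
            channel = alloc.attach()
        except RuntimeError:
            # The counter has wrapped onto the long-lived session's channel.
            return long_lived, created
        created += 1
        alloc.detach(channel)
```

In this model the short-lived sessions never collide with each other, because each one is detached before the next is created. The only collision is with the session that was never deleted, which is the situation the test program sets up.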
Created attachment 365127: Test program
Created attachment 365281: Faster test program
This is a faster test program that turns on logging only at the point where things go wrong.
Fixed by SVN r828108. Back ported to 1.1.6 hotfix and 1.1.x branch.
Fixed in SVN r828108
Created attachment 366095: reproducer
Verified on RHEL4 and RHEL5, both i386 and x86_64.
qpidd-0.5.752581-30.el4
qpidd-0.5.752581-30.el5
The hotfix introduced two regressions:
- an NPE in java clients - rajith will attach a reproducer
- abort with "terminate called after throwing an instance of 'qpid::TransportFailure'" in failover_soak test.

The fixes are on the hotfix branch http://git.et.redhat.com/git/qpid.git/?p=qpid.git;a=shortlog;h=refs/heads/mrg_1.1.6_hotfix
- NPE: http://git.et.redhat.com/git/qpid.git/?p=qpid.git;a=commit;h=a1087043290555a0989f930bcb9e379412966436
- terminate: http://git.et.redhat.com/git/qpid.git/?p=qpid.git;a=commit;h=15dac6d16477b07ae119512291138f6b34260745

To reproduce the abort, run run_failover_soak in a loop. It should show up within a couple of iterations.
Release note added. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Corrected issue with channel collision, a pre-existing session should remain operational and continue to receive messages regardless of
wraparound (529489)
Release note updated.

Diffed Contents:
@@ -1,2 +1,9 @@
-Corrected issue with channel collision, a pre-existing session should remain operational and continue to receive messages regardless of
+Messaging bug fix
-wraparound (529489)
+
+C: There are 2^16 channels available on each connection. Each session in use requires one of these channels to be assigned. If more than 2^16 sessions are created and deleted, the channel allocated to new sessions 'wraps around'.
+C: Can result in a collision with a channel that has previously been allocated, resulting in a session-busy exception. This exception renders the pre-existing session unusable on the client side
+F:
+R: A pre-existing session will remain operational and continue to receive messages regardless of
+wraparound
+
+FURTHER INFORMATION REQUIRED FOR RELNOTE
Release note updated.

Diffed Contents:
@@ -1,9 +1,4 @@
-Messaging bug fix
-
 C: There are 2^16 channels available on each connection. Each session in use requires one of these channels to be assigned. If more than 2^16 sessions are created and deleted, the channel allocated to new sessions 'wraps around'.
 C: Can result in a collision with a channel that has previously been allocated, resulting in a session-busy exception. This exception renders the pre-existing session unusable on the client side
-F:
+F: Channel allocation searches for an unused slot rather than just wrapping around.
-R: A pre-existing session will remain operational and continue to receive messages regardless of
+R: A pre-existing session will remain operational and continue to receive messages regardless of wraparound
-wraparound
-
-FURTHER INFORMATION REQUIRED FOR RELNOTE
Release note updated.

Diffed Contents:
@@ -1,4 +1,7 @@
 C: There are 2^16 channels available on each connection. Each session in use requires one of these channels to be assigned. If more than 2^16 sessions are created and deleted, the channel allocated to new sessions 'wraps around'.
 C: Can result in a collision with a channel that has previously been allocated, resulting in a session-busy exception. This exception renders the pre-existing session unusable on the client side
 F: Channel allocation searches for an unused slot rather than just wrapping around.
-R: A pre-existing session will remain operational and continue to receive messages regardless of wraparound
+R: A pre-existing session will remain operational and continue to receive messages regardless of wraparound
+
+
+There are 2^16 channels available on each connection. Each session in use requires one of these channels to be assigned. If more than 2^16 sessions are created and deleted, the channel allocated to new sessions 'wraps around'. This was occasionally causing a collision with a channel that had previously been allocated, resulting in a session-busy exception. This exception renders the pre-existing session unusable on the client side. Channel allocation now searches for an unused slot rather than just wrapping around, which prevents the collisions.
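The fix summarised in the release note (search for an unused slot rather than just wrapping around) could look roughly like the sketch below. It is an illustration under stated assumptions, not the change made in SVN r828108: `SearchingAllocator` and its methods are hypothetical names, and the linear probe from the current counter position is only one plausible way to search.

```python
# Hypothetical sketch of "search for an unused slot"; not the actual qpid fix.
CHANNEL_COUNT = 2 ** 16  # channels available per connection, per the report


class SearchingAllocator:
    """Skips channels that are still attached instead of reusing them."""

    def __init__(self):
        self.next_channel = 0  # assumed starting point
        self.in_use = set()

    def attach(self):
        # Probe at most CHANNEL_COUNT slots, starting from the counter.
        for offset in range(CHANNEL_COUNT):
            channel = (self.next_channel + offset) % CHANNEL_COUNT
            if channel not in self.in_use:
                self.in_use.add(channel)
                self.next_channel = (channel + 1) % CHANNEL_COUNT
                return channel
        # Every channel on the connection is attached.
        raise RuntimeError("no free channel on this connection")

    def detach(self, channel):
        self.in_use.discard(channel)
```

With this approach a long-lived session keeps its channel across any number of wraparounds, and allocation only fails when all 2^16 channels are attached at the same time.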
Release note looks good.
Release note updated.

Diffed Contents:
@@ -1,7 +1,4 @@
 C: There are 2^16 channels available on each connection. Each session in use requires one of these channels to be assigned. If more than 2^16 sessions are created and deleted, the channel allocated to new sessions 'wraps around'.
 C: Can result in a collision with a channel that has previously been allocated, resulting in a session-busy exception. This exception renders the pre-existing session unusable on the client side
 F: Channel allocation searches for an unused slot rather than just wrapping around.
-R: A pre-existing session will remain operational and continue to receive messages regardless of wraparound
+R: A pre-existing session will remain operational and continue to receive messages regardless of wraparound
-
-
-There are 2^16 channels available on each connection. Each session in use requires one of these channels to be assigned. If more than 2^16 sessions are created and deleted, the channel allocated to new sessions 'wraps around'. This was occasionally causing a collision with a channel that had previously been allocated, resulting in a session-busy exception. This exception renders the pre-existing session unusable on the client side. Channel allocation now searches for an unused slot rather than just wrapping around, which prevents the collisions.
Thanks Alan :) LKB
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2009-1633.html