Bug 526868
Summary: connecting to a disabled network address takes too long to fail
Product: Red Hat Enterprise MRG
Component: qpid-cpp
Version: 1.1.6
Status: CLOSED ERRATA
Severity: high
Priority: urgent
Target Milestone: 1.2
Hardware: All
OS: Linux
Reporter: Gordon Sim <gsim>
Assignee: Andrew Stitcher <astitcher>
QA Contact: Jan Sarenik <jsarenik>
CC: acme, iboverma, jsarenik, lbrindle, tross
Doc Type: Bug Fix
Doc Text:
Messaging bug fix:
C: If a network address becomes disabled, the client blocks trying to connect until the TCP implementation times out.
C: The time taken for the network connection to fail was excessive.
F: The connection behavior was adjusted.
R: Connection attempts to a disabled address now fail in a much shorter time frame.
If a network address became disabled, the client would block trying to connect until the TCP implementation timed out. The time taken for the network connection to fail was excessive. The connection behavior was adjusted so that connection attempts to a disabled address now fail promptly.
Last Closed: 2009-12-03 09:15:44 UTC
Bug Blocks: 527551
Description
Gordon Sim
2009-10-02 08:21:42 UTC
Created attachment 363455 [details]
One possible fix
The attached patch is one possible fix. It simply moves the initialisation of the heartbeat timer to just before the connect call.
Note that even as the code stands without this patch, any change to the heartbeat that might happen due to negotiation would be ignored, which is not strictly correct (though I think it would only be an issue when using qpidd if the heartbeat specified for the client was greater than the broker's maximum value).
A more desirable fix would probably be to connect in non-blocking mode.
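As a rough illustration of what connecting in non-blocking mode can look like at the socket level, here is a minimal POSIX sketch. It is not taken from qpid-cpp and does not reproduce qpid's own connector code: `connect_with_timeout` and `fail_and_close` are hypothetical helpers, the sketch handles IPv4 literals only, and the timeout is a plain millisecond value rather than a negotiated heartbeat.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <arpa/inet.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: close the socket, report err through errno, return -1.
static int fail_and_close(int fd, int err) {
    close(fd);
    errno = err;
    return -1;
}

// Hypothetical helper (not a qpid API): connect to an IPv4 literal and give
// up after timeout_ms milliseconds instead of waiting for the TCP stack's own
// timeout. Returns a connected socket (left in non-blocking mode) or -1 with
// errno set; errno is ETIMEDOUT when the deadline passes.
int connect_with_timeout(const char* ipv4, uint16_t port, int timeout_ms) {
    assert(timeout_ms >= 0);

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ipv4, &addr.sin_addr) != 1) {
        errno = EINVAL;
        return -1;
    }

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;

    // Non-blocking mode makes connect() return immediately.
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return fail_and_close(fd, errno);

    int rc = connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr));
    if (rc == 0) return fd;                    // completed at once
    if (errno != EINPROGRESS) return fail_and_close(fd, errno);

    // The socket becomes writable once the attempt succeeds or fails.
    pollfd pfd{};
    pfd.fd = fd;
    pfd.events = POLLOUT;
    do {
        // Simplification: an interrupted wait restarts with the full timeout.
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);
    if (rc < 0) return fail_and_close(fd, errno);
    if (rc == 0) return fail_and_close(fd, ETIMEDOUT);

    // SO_ERROR holds the final outcome of the connection attempt.
    int err = 0;
    socklen_t len = sizeof(err);
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0) err = errno;
    if (err != 0) return fail_and_close(fd, err);
    return fd;
}
```

A caller could then bound the wait explicitly, for example `int fd = connect_with_timeout("192.0.2.1", 5672, 2000);`, where the address and port are placeholders chosen for illustration. With a two-second limit, an unreachable address is reported as ETIMEDOUT after roughly two seconds rather than after the much longer TCP-level timeout.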
Agreed, non-blocking connect + start the heartbeat timer after it (immediately) returns + poll for connection completion seems to be the right fix.

Fixed this using the existing non-blocking code.

There is one change to the previous connect failure behaviour: the exception that gets thrown when Connection::open() fails no longer has any useful error text. The error text does now get logged as a warning though.

Added the line 'settings.heartbeat = 2;' to replaying_sender.cpp and compiled it on RHEL 4 and 5, i386 and x86_64, once with the old qpidc-devel (0.5.752581-26.el5), where the long delay occurred. Everything works fine on the latest (0.5.752581-28.el5) versions.

Created attachment 365480 [details]
the one-line patched replaying_sender which triggers the bug
Release note added. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Connecting to a disabled network address is now failing quickly (526868)

Release note updated.

Diffed Contents:
@@ -1 +1,10 @@
-Connecting to a disabled network address is now failing quickly (526868)
+Messaging bug fix:
+
+C: If a network address becomes disabled, the
+client will block trying to connect until the tcp implementation times it out.
+C: The time taken for the network connection to fail was excessive.
+F: The connection behavior was adjusted
+R: Network timeouts will now fail in a much shorter time frame.
+
+If a network address becomes disabled, the
+client would block trying to connect until the TCP implementation timed out. The time taken for the network connection to fail was excessive. The connection behavior was adjusted so that network timeouts now fail promptly.

Release note updated.

Diffed Contents:
@@ -6,5 +6,4 @@
 F: The connection behavior was adjusted
 R: Network timeouts will now fail in a much shorter time frame.
 
-If a network address becomes disabled, the
+If a network address becomes disabled, the client would block trying to connect until the TCP implementation timed out. The time taken for the network connection to fail was excessive. The connection behavior was adjusted so that network timeouts now fail promptly.
-client would block trying to connect until the TCP implementation timed out. The time taken for the network connection to fail was excessive. The connection behavior was adjusted so that network timeouts now fail promptly.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHEA-2009-1633.html