| Summary: | [AMPQ1.0] Setting idle-timeout to ffffffff, broker fires infinite loop of ConnectionTicker with zero timeout | ||
|---|---|---|---|
| Product: | Red Hat Enterprise MRG | Reporter: | Pavel Moravec <pmoravec> |
| Component: | qpid-cpp | Assignee: | messaging-bugs <messaging-bugs> |
| Status: | CLOSED ERRATA | QA Contact: | Zdenek Kraus <zkraus> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 3.2 | CC: | jross, mcressma, zkraus |
| Target Milestone: | 3.2.5 | ||
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | All | ||
| Whiteboard: | |||
| Fixed In Version: | qpid-cpp-1.36.0-2 | Doc Type: | Bug Fix |
| Doc Text: |
Cause:
Setting idle-timeout to 0xffffffff in an incoming AMQP1.0 connection
Consequence:
causes broker to go into a loop sending messages to the logfile and high CPU usage results.
Fix:
The code generating timer events was corrected to handle this case.
Result:
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2017-07-04 07:49:26 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
fixed in upstream -> POST This bug was accidentally moved from POST to MODIFIED via an error in automation, please see mmccune with any questions tested on RHEL 6 i386, x86_64 RHEL 7 x86_64 with qpid-cpp-server-1.36.0-5 Fix works as expected -> VERIFIED Also there was a small issue with "# open" command in reproducer, here is working version with some protocol comments
send_and_receive( # open
# AMQP Frame [SPEC.2]
'0000009f' # AMQP Frame length
'02' # AMQP Frame Doff
'00' # AMQP Frame Type
'0000' # AMQP Frame Channel
'0053' # ??
'10' # Performative Open [SPEC.1]
'd0' # compound type [SPEC.3] (4byte fields)
'0000008a' # compound size bytes
'0000000a' # count of items in compound (for open has to be 10)
'a1' # container-id string
'24' # length
'61343639623263372d623030642d346465362d626361362d623066656261326562643562' #data
'40' # hostname null
'40' # max-frame size null
'40' # channel-max null
'70' # idle-timeout milliseconds [SPEC.6] -> uint [SPEC.5] (4bytes)
'ffffffff' # idle-timeout
'40' # outgoing-locales null
'40' # incoming-locales null
'40' # offered-capabilities null
'40' # desired-capabilities null
'd1' # properties compound
'00000054' # compound size
'00000006' # compound item count
'a313717069642e636c69656e745f70726f63657373a30c717069642d72656365697665a30f717069642e636c69656e745f'
'70696471000071efa310717069642e636c69656e745f70706964710000689b', # compound data
191)
# references
# [SPEC] AMQP 1.0 Protocol specification
# http://docs.oasis-open.org/amqp/core/v1.0/amqp-core-complete-v1.0.pdf
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:1671 |
Description of problem: Setting idle-timeout to ffffffff / 4294967295 in an incoming AMQP1.0 connection, broker starts to raise errors: 2016-03-07 22:19:24 [System] error ConnectionTicker couldn't setup next timer firing: 455.058ms[0ns] in a tight loop to logfile, causing also high CPU usage of qpid broker. Version-Release number of selected component (if applicable): any (incl 0.34-5) How reproducible: 100% Steps to Reproduce: 1. since our clients cant set idle timeout in ms granularity, let mimic qpid-receive by below python script sending the same frames just with timeout ffffffff: import socket import struct import binascii from time import sleep import sys sys.path.insert(0,'bitstring') from bitstring import BitStream, BitArray client_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM) client_socket.connect(('127.0.0.1', 5672)) def send_and_receive(hexstream, recv_len): hex = binascii.unhexlify(hexstream) data = BitArray(bytes=hex) client_socket.sendall(data.tobytes()) if (recv_len > 0): data = client_socket.recv(recv_len) send_and_receive('414d515003010000', 48) # AMQP1.0 hello send_and_receive('0000004f02010000005341d00000003f00000003a309414e4f4e594d4f5553a023616e6f6e796d6f757340646863702d312d3234312e6272712e7265646861742e636f6da1093132372e302e302e31', 16+8) # sasl.init send_and_receive('414d515000010000', 0) # AMQP1.0 hello, auth done send_and_receive('0000009f02000000005310d00000000000000aa12461343639623263372d623030642d346465362d626361362d62306665626132656264356240404070'+'ffffffff'+'40404040d10000005400000006a313717069642e636c69656e745f70726f63657373a30c717069642d72656365697665a30f717069642e636c69656e745f70696471000071efa310717069642e636c69656e745f70706964710000689b', 191) # open send_and_receive('0000002102000000005311d00000001100000004405200707fffffff707fffffff', 35) # begin send_and_receive('000000a202000000005312d0000000920000000aa12e7465737451756575655f39326336333861342d636137382d343337302d616165362d30663930653063313062303452004250025000005328d00000001b0000000ba109746573745175657565520040520042404040404040005329d00000002800000007a10974657374517565756552004052004240a3106372656174652d6f6e2d64656d616e6440405200', 197) # attach sleep(100000) client_socket.close() 2. Have python bitstring installed (just unpack the archive from https://pypi.python.org/pypi/bitstring/3.1.3) 3. Run the script against a qpid broker (update client_socket.connect line if needed) 4. Watch broker logs and CPU usage Actual results: 4. broker logging: 2016-03-07 22:19:24 [System] error ConnectionTicker couldn't setup next timer firing: 455.058ms[0ns] and CPU >100% Expected results: no such errors, normal CPU usage Additional info: see upstream https://issues.apache.org/jira/browse/QPID-7127 and commit https://svn.apache.org/r1733966