Bug 1315485

Summary: [AMPQ1.0] Setting idle-timeout to ffffffff, broker fires infinite loop of ConnectionTicker with zero timeout
Product: Red Hat Enterprise MRG Reporter: Pavel Moravec <pmoravec>
Component: qpid-cppAssignee: messaging-bugs <messaging-bugs>
Status: CLOSED ERRATA QA Contact: Zdenek Kraus <zkraus>
Severity: high Docs Contact:
Priority: high    
Version: 3.2CC: jross, mcressma, zkraus
Target Milestone: 3.2.5   
Target Release: ---   
Hardware: x86_64   
OS: All   
Whiteboard:
Fixed In Version: qpid-cpp-1.36.0-2 Doc Type: Bug Fix
Doc Text:
Cause: Setting idle-timeout to 0xffffffff in an incoming AMQP1.0 connection Consequence: causes broker to go into a loop sending messages to the logfile and high CPU usage results. Fix: The code generating timer events was corrected to handle this case. Result:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-07-04 07:49:26 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Pavel Moravec 2016-03-07 21:34:49 UTC
Description of problem:
Setting idle-timeout to ffffffff / 4294967295 in an incoming AMQP1.0 connection, broker starts to raise errors:

2016-03-07 22:19:24 [System] error ConnectionTicker couldn't setup next timer firing: 455.058ms[0ns]

in a tight loop to logfile, causing also high CPU usage of qpid broker.


Version-Release number of selected component (if applicable):
any (incl 0.34-5)


How reproducible:
100%


Steps to Reproduce:
1. since our clients cant set idle timeout in ms granularity, let mimic qpid-receive by below python script sending the same frames just with timeout ffffffff:

import socket
import struct
import binascii
from time import sleep
import sys
sys.path.insert(0,'bitstring')
from bitstring import BitStream, BitArray

client_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client_socket.connect(('127.0.0.1', 5672))

def send_and_receive(hexstream, recv_len):
    hex = binascii.unhexlify(hexstream)
    data = BitArray(bytes=hex)
    client_socket.sendall(data.tobytes())
    if (recv_len > 0):
        data = client_socket.recv(recv_len)

send_and_receive('414d515003010000', 48) # AMQP1.0 hello

send_and_receive('0000004f02010000005341d00000003f00000003a309414e4f4e594d4f5553a023616e6f6e796d6f757340646863702d312d3234312e6272712e7265646861742e636f6da1093132372e302e302e31', 16+8) # sasl.init
send_and_receive('414d515000010000', 0) # AMQP1.0 hello, auth done
send_and_receive('0000009f02000000005310d00000000000000aa12461343639623263372d623030642d346465362d626361362d62306665626132656264356240404070'+'ffffffff'+'40404040d10000005400000006a313717069642e636c69656e745f70726f63657373a30c717069642d72656365697665a30f717069642e636c69656e745f70696471000071efa310717069642e636c69656e745f70706964710000689b', 191) # open

send_and_receive('0000002102000000005311d00000001100000004405200707fffffff707fffffff', 35) # begin
send_and_receive('000000a202000000005312d0000000920000000aa12e7465737451756575655f39326336333861342d636137382d343337302d616165362d30663930653063313062303452004250025000005328d00000001b0000000ba109746573745175657565520040520042404040404040005329d00000002800000007a10974657374517565756552004052004240a3106372656174652d6f6e2d64656d616e6440405200', 197) # attach

sleep(100000)
client_socket.close()

2. Have python bitstring installed (just unpack the archive from https://pypi.python.org/pypi/bitstring/3.1.3)
3. Run the script against a qpid broker (update client_socket.connect line if needed)
4. Watch broker logs and CPU usage


Actual results:
4. broker logging:

2016-03-07 22:19:24 [System] error ConnectionTicker couldn't setup next timer firing: 455.058ms[0ns]

and CPU >100%


Expected results:
no such errors, normal CPU usage


Additional info:
see upstream https://issues.apache.org/jira/browse/QPID-7127 and commit https://svn.apache.org/r1733966

Comment 1 Pavel Moravec 2016-03-07 21:37:15 UTC
fixed in upstream -> POST

Comment 3 Mike McCune 2016-03-28 23:04:30 UTC
This bug was accidentally moved from POST to MODIFIED via an error in automation, please see mmccune with any questions

Comment 5 Zdenek Kraus 2017-05-10 15:14:43 UTC
tested on RHEL 6 i386, x86_64 RHEL 7 x86_64
with qpid-cpp-server-1.36.0-5

Fix works as expected
-> VERIFIED

Comment 6 Zdenek Kraus 2017-05-10 15:16:05 UTC
Also there was a small issue with "# open" command in reproducer, here is working version with some protocol comments

send_and_receive( # open                                                                                                   
  # AMQP Frame [SPEC.2]                                                                                                
  '0000009f'  # AMQP Frame length                                                                                          
  '02'  # AMQP Frame Doff                                                                                                  
  '00'  # AMQP Frame Type                                                                                                  
  '0000'  # AMQP Frame Channel                                                                                             
  '0053'  # ??                                                                                                             
  '10'  # Performative Open [SPEC.1]                                                                                   
    'd0'  # compound type [SPEC.3] (4byte fields)                                                                      
    '0000008a'  # compound size bytes                                                                                      
    '0000000a'  # count of items in compound (for open has to be 10)                                                       
    'a1'  # container-id string                                                                                            
      '24'  # length                                                                                                       
      '61343639623263372d623030642d346465362d626361362d623066656261326562643562'  #data                                    
    '40'  # hostname null                                                                                                  
    '40'  # max-frame size null                                                                                            
    '40'  # channel-max null                                                                                               
    '70'  # idle-timeout milliseconds [SPEC.6] -> uint [SPEC.5] (4bytes)                                           
      'ffffffff' # idle-timeout                                                                                            
    '40'  # outgoing-locales null                                                                                          
    '40'  # incoming-locales null                                                                                          
    '40'  # offered-capabilities null                                                                                      
    '40'  # desired-capabilities null                                                                                      
    'd1'  # properties compound                                                                                            
      '00000054'  # compound size                                                                                          
      '00000006'  # compound item count                                                                                    
      'a313717069642e636c69656e745f70726f63657373a30c717069642d72656365697665a30f717069642e636c69656e745f'                 
      '70696471000071efa310717069642e636c69656e745f70706964710000689b', # compound data                                    
191)


# references                                                                                                            
# [SPEC] AMQP 1.0 Protocol specification                                                                                
#     http://docs.oasis-open.org/amqp/core/v1.0/amqp-core-complete-v1.0.pdf

Comment 8 errata-xmlrpc 2017-07-04 07:49:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1671