Bug 624988 - qpidd broker suffers from error 'framing-error: Frame size too small 0 (qpid/framing/AMQFrame.cpp:101)' which can result in client getting SessionDetached()
Summary: qpidd broker suffers from error 'framing-error: Frame size too small 0 (qpid/...
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: beta
Hardware: All
OS: Linux
high
high
Target Milestone: 1.3
: ---
Assignee: Kim van der Riet
QA Contact: MRG Quality Engineering
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2010-08-18 10:16 UTC by Frantisek Reznicek
Modified: 2015-11-16 01:12 UTC (History)
2 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-08-31 08:24:46 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)

Description Frantisek Reznicek 2010-08-18 10:16:17 UTC
Description of problem:

While bug 456272 regression testing, there was detected new problem related to sudden and unexpected SessionDetached()exception propagation from broker to client.

This exception on client side is most probably linked to broker's error:
  2010-08-13 16:19:17 error framing-error: Frame size too small 0 (qpid/framing/AMQFrame.cpp:101)

The python client is creating simultaneously multiple queues in multiple threads in the broker. The testing environment also consist of loading 'find' processes which take huge amount of disk I/O and CPU utilization. The environment under which the problem occurrend can be clearly titled as stressy.



Version-Release number of selected component (if applicable):
python-qmf-0.7.946106-8.el5
python-qpid-0.7.946106-12.el5
qpid-cpp-*-0.7.946106-12.el5

How reproducible:
very rarely (<1%)

Steps to Reproduce:
Run the MRG/Messaging/qpid_broker_jfail_bz456272 test in the loop or alternatively:
1. start broker with message store
2. launch few machine loading find processes
  (find /usr -type f -exec cat {} \; | \
   split -b 10m -  ${FIND_RESULT_LOG}.$(date +%Y%m%d_%H%M%S) ) &
   # split allows to splitting for search result deletion
3. launch bz456272 reproducer in the loop, can be found here (*.py)
   http://cvs.devel.redhat.com/cgi-bin/cvsweb.cgi/tests/distribution/MRG/Messaging/qpid_broker_jfail_bz456272/
   'python ./bz456272.py' or better 'python ./bz456272.pyc'

Actual results:
Broker rarely exits the python client with SessionDetached() and prints framing error:
2010-08-13 16:19:17 error framing-error: Frame size too small 0 (qpid/framing/AMQFrame.cpp:101)

Expected results:
No framing error, no exception on client's side:

Additional info (the log from the failing case):

[16:19:12] run 10/12--------------------------[err_cnt:0]-
qpidd-transcript-------------
2010-08-13 16:19:12 info Loaded Module: /usr/lib64/qpid/daemon/msgstore.so
2010-08-13 16:19:12 info Loaded Module: /usr/lib64/qpid/daemon/ssl.so
2010-08-13 16:19:12 info Loaded Module: /usr/lib64/qpid/daemon/watchdog.so
2010-08-13 16:19:12 info Loaded Module: /usr/lib64/qpid/daemon/rdma.so
2010-08-13 16:19:12 info Loaded Module: /usr/lib64/qpid/daemon/xml.so
2010-08-13 16:19:12 info Loaded Module: /usr/lib64/qpid/daemon/cluster.so
2010-08-13 16:19:12 info Loaded Module: /usr/lib64/qpid/daemon/replicating_listener.so
2010-08-13 16:19:12 info Loaded Module: /usr/lib64/qpid/daemon/replication_exchange.so
2010-08-13 16:19:12 info Loaded Module: /usr/lib64/qpid/daemon/acl.so
2010-08-13 16:19:12 info No message store configured, persistence is disabled.
2010-08-13 16:19:12 info Management enabled
2010-08-13 16:19:12 info ManagementAgent generated broker ID: 37763838-616d-4418-8e8d-9b3805e3b105
2010-08-13 16:19:12 notice Journal "TplStore": Created
2010-08-13 16:19:12 notice Store module initialized; store-dir=/mnt/tests/distribution/MRG/Messaging/qpid_broker_jfail_bz456272/rhts_qpidd/20100813_154811/broker.9670
2010-08-13 16:19:12 info > Default files per journal: 16
2010-08-13 16:19:12 info > Default journal file size: 32 (wpgs)
2010-08-13 16:19:12 info > Default write cache page size: 32 (Kib)
2010-08-13 16:19:12 info > Default number of write cache pages: 32
2010-08-13 16:19:12 info > TPL files per journal: 8
2010-08-13 16:19:12 info > TPL journal file size: 32 (wpgs)
2010-08-13 16:19:12 info > TPL write cache page size: 128 (Kib)
2010-08-13 16:19:12 info > TPL number of write cache pages: 8
2010-08-13 16:19:12 info Registered xml exchange
2010-08-13 16:19:12 info Registered replication exchange
2010-08-13 16:19:12 info Most recent persistence id found: 0x0
2010-08-13 16:19:12 notice SASL disabled: No Authentication Performed
2010-08-13 16:19:12 notice Listening on TCP port 5672
2010-08-13 16:19:12 info SSL plugin not enabled, you must set --ssl-cert-db to enable it.
2010-08-13 16:19:12 info Rdma: Disabled: no rdma devices found
2010-08-13 16:19:12 info Policy file not specified. ACL Disabled, no ACL checking being done!
2010-08-13 16:19:12 notice Broker running
2010-08-13 16:19:16 info Queue "mgmt-rhts_tyan-gt24-04.rhts.eng.bos.redhat.com.29819_0_{(2010, 8, 13, 16, 19, 16, 4, 225, 1)}": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:19:16 info Queue "repl-rhts_tyan-gt24-04.rhts.eng.bos.redhat.com.29819_0_{(2010, 8, 13, 16, 19, 16, 4, 225, 1)}": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:19:16 info Queue "mgmt-rhts_tyan-gt24-04.rhts.eng.bos.redhat.com.29819_1_{(2010, 8, 13, 16, 19, 16, 4, 225, 1)}": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:19:16 info Queue "repl-rhts_tyan-gt24-04.rhts.eng.bos.redhat.com.29819_1_{(2010, 8, 13, 16, 19, 16, 4, 225, 1)}": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:19:16 info Queue "mgmt-rhts_tyan-gt24-04.rhts.eng.bos.redhat.com.29819_2_{(2010, 8, 13, 16, 19, 16, 4, 225, 1)}": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:19:16 info Queue "repl-rhts_tyan-gt24-04.rhts.eng.bos.redhat.com.29819_2_{(2010, 8, 13, 16, 19, 16, 4, 225, 1)}": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:19:16 info Queue "mgmt-rhts_tyan-gt24-04.rhts.eng.bos.redhat.com.29819_3_{(2010, 8, 13, 16, 19, 16, 4, 225, 1)}": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:19:16 info Queue "repl-rhts_tyan-gt24-04.rhts.eng.bos.redhat.com.29819_3_{(2010, 8, 13, 16, 19, 16, 4, 225, 1)}": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:19:16 info Queue "mgmt-rhts_tyan-gt24-04.rhts.eng.bos.redhat.com.29819_4_{(2010, 8, 13, 16, 19, 16, 4, 225, 1)}": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:19:16 info Queue "repl-rhts_tyan-gt24-04.rhts.eng.bos.redhat.com.29819_4_{(2010, 8, 13, 16, 19, 16, 4, 225, 1)}": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:19:17 info Queue "mgmt-rhts_tyan-gt24-04.rhts.eng.bos.redhat.com.29819_5_{(2010, 8, 13, 16, 19, 16, 4, 225, 1)}": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:19:17 info Queue "repl-rhts_tyan-gt24-04.rhts.eng.bos.redhat.com.29819_5_{(2010, 8, 13, 16, 19, 16, 4, 225, 1)}": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:19:17 error framing-error: Frame size too small 0 (qpid/framing/AMQFrame.cpp:101)
2010-08-13 16:19:41 notice Journal "q_queue_000100": Created
2010-08-13 16:19:41 notice Journal "q_queue_000001": Created
2010-08-13 16:19:42 notice Journal "q_queue_000010": Created
2010-08-13 16:19:42 notice Journal "q_queue_100000": Created
2010-08-13 16:19:42 notice Journal "q_queue_001000": Created
2010-08-13 16:19:42 info Queue "mgmt-rhts_tyan-gt24-04.rhts.eng.bos.redhat.com.29819_4_{(2010, 8, 13, 16, 19, 42, 4, 225, 1)}": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:19:42 info Queue "repl-rhts_tyan-gt24-04.rhts.eng.bos.redhat.com.29819_4_{(2010, 8, 13, 16, 19, 42, 4, 225, 1)}": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:19:43 info Queue "q_queue_000001": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:19:43 info Queue "q_queue_000100": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:19:43 info Queue "q_queue_000010": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:19:43 info Queue "q_queue_100000": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:19:43 info Queue "q_queue_001000": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:20:00 notice Journal "q_queue_000002": Created
2010-08-13 16:20:00 notice Journal "q_queue_200000": Created
2010-08-13 16:20:00 notice Journal "q_queue_020000": Created
2010-08-13 16:20:00 notice Journal "q_queue_002000": Created
2010-08-13 16:20:00 notice Journal "q_queue_000200": Created
2010-08-13 16:20:00 notice Journal "q_queue_000020": Created
2010-08-13 16:20:01 warning Timer callback overran by 2ms [taking 5000ns]
2010-08-13 16:20:01 info Queue "q_queue_000002": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:20:01 info Queue "q_queue_000200": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:20:01 info Queue "q_queue_000020": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:20:01 info Queue "q_queue_002000": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:20:01 info Queue "q_queue_020000": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:20:01 info Queue "q_queue_200000": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:20:15 warning Timer callback overran by 2ms [taking 4000ns]
2010-08-13 16:20:17 warning Timer callback overran by 2ms [taking 22000ns]
2010-08-13 16:20:17 warning Timer callback overran by 2ms [taking 3000ns]
2010-08-13 16:20:18 notice Journal "q_queue_000003": Created
2010-08-13 16:20:18 notice Journal "q_queue_030000": Created
2010-08-13 16:20:18 notice Journal "q_queue_300000": Created
2010-08-13 16:20:18 notice Journal "q_queue_000030": Created
2010-08-13 16:20:18 notice Journal "q_queue_003000": Created
2010-08-13 16:20:18 notice Journal "q_queue_000300": Created
2010-08-13 16:20:19 info Queue "q_queue_003000": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:20:19 info Queue "q_queue_000003": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:20:19 info Queue "q_queue_300000": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:20:19 info Queue "q_queue_030000": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:20:19 info Queue "q_queue_000300": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:20:19 info Queue "q_queue_000030": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:20:27 warning Timer callback overran by 2ms [taking 15000ns]
2010-08-13 16:20:27 warning Timer callback overran by 2ms [taking 3000ns]
2010-08-13 16:20:28 warning Timer callback overran by 2ms [taking 5000ns]
2010-08-13 16:20:28 warning Timer callback overran by 2ms [taking 3000ns]
2010-08-13 16:20:28 warning Timer callback overran by 2ms [taking 13000ns]
2010-08-13 16:20:28 warning Timer callback overran by 2ms [taking 3000ns]
2010-08-13 16:20:32 warning Timer callback overran by 2ms [taking 10000ns]
2010-08-13 16:20:32 warning Timer callback overran by 2ms [taking 16000ns]
2010-08-13 16:20:33 warning Timer callback overran by 2ms [taking 15000ns]
2010-08-13 16:20:33 warning Timer callback overran by 2ms [taking 3000ns]
2010-08-13 16:20:36 notice Journal "q_queue_040000": Created
2010-08-13 16:20:36 notice Journal "q_queue_000400": Created
2010-08-13 16:20:36 notice Journal "q_queue_000004": Created
2010-08-13 16:20:36 notice Journal "q_queue_004000": Created
2010-08-13 16:20:36 notice Journal "q_queue_000040": Created
2010-08-13 16:20:36 notice Journal "q_queue_400000": Created
2010-08-13 16:20:37 warning Timer callback overran by 2ms [taking 12000ns]
2010-08-13 16:20:37 warning Timer callback overran by 2ms [taking 10000ns]
2010-08-13 16:20:37 warning Timer callback overran by 2ms [taking 3000ns]
2010-08-13 16:20:37 warning Timer callback overran by 2ms [taking 3000ns]
2010-08-13 16:20:38 info Queue "q_queue_000400": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:20:38 info Queue "q_queue_400000": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:20:39 info Queue "q_queue_040000": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:20:39 info Queue "q_queue_004000": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:20:39 info Queue "q_queue_000004": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:20:39 info Queue "q_queue_000040": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:20:42 warning Timer callback overran by 2ms [taking 10000ns]
2010-08-13 16:20:42 warning Timer callback overran by 2ms [taking 4000ns]
2010-08-13 16:20:47 warning Timer callback overran by 2ms [taking 8000ns]
2010-08-13 16:20:53 warning Timer callback overran by 2ms [taking 18000ns]
2010-08-13 16:20:54 notice Journal "q_queue_000050": Created
2010-08-13 16:20:54 notice Journal "q_queue_050000": Created
2010-08-13 16:20:54 notice Journal "q_queue_500000": Created
2010-08-13 16:20:54 notice Journal "q_queue_000005": Created
2010-08-13 16:20:54 notice Journal "q_queue_005000": Created
2010-08-13 16:20:54 notice Journal "q_queue_000500": Created
2010-08-13 16:20:56 info Queue "q_queue_500000": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:20:56 info Queue "q_queue_005000": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:20:56 info Queue "q_queue_000500": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:20:56 info Queue "q_queue_050000": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:20:56 info Queue "q_queue_000050": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:20:56 info Queue "q_queue_000005": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:21:00 warning Timer callback overran by 2ms [taking 13000ns]
2010-08-13 16:21:00 warning Timer callback overran by 2ms [taking 3000ns]
2010-08-13 16:21:00 warning Timer callback overran by 2ms [taking 5000ns]
2010-08-13 16:21:01 warning Timer callback overran by 2ms [taking 17000ns]
2010-08-13 16:21:02 warning Timer callback overran by 2ms [taking 9000ns]
2010-08-13 16:21:02 warning Timer callback overran by 2ms [taking 4000ns]
2010-08-13 16:21:02 warning Timer callback overran by 2ms [taking 9000ns]
2010-08-13 16:21:04 warning Timer callback overran by 2ms [taking 5000ns]
2010-08-13 16:21:04 warning Timer callback overran by 2ms [taking 3000ns]
2010-08-13 16:21:09 warning Timer callback overran by 2ms [taking 7000ns]
2010-08-13 16:21:09 warning Timer callback overran by 2ms [taking 9000ns]
2010-08-13 16:21:12 notice Journal "q_queue_000600": Created
2010-08-13 16:21:12 notice Journal "q_queue_600000": Created
2010-08-13 16:21:12 notice Journal "q_queue_000060": Created
2010-08-13 16:21:12 notice Journal "q_queue_000006": Created
2010-08-13 16:21:12 notice Journal "q_queue_060000": Created
2010-08-13 16:21:12 notice Journal "q_queue_006000": Created
2010-08-13 16:21:14 info Queue "q_queue_060000": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:21:14 info Queue "q_queue_000006": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:21:14 info Queue "q_queue_006000": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:21:14 info Queue "q_queue_600000": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:21:14 info Queue "q_queue_000600": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:21:14 info Queue "q_queue_000060": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:21:18 warning Timer callback overran by 2ms [taking 3000ns]
2010-08-13 16:21:18 warning Timer callback overran by 2ms [taking 3000ns]
2010-08-13 16:21:18 warning Timer callback overran by 2ms [taking 3000ns]
2010-08-13 16:21:18 warning Timer callback overran by 2ms [taking 11000ns]
2010-08-13 16:21:22 warning Timer callback overran by 2ms [taking 8000ns]
2010-08-13 16:21:22 warning Timer callback overran by 2ms [taking 13000ns]
2010-08-13 16:21:22 warning Timer callback overran by 2ms [taking 12000ns]
2010-08-13 16:21:22 warning Timer callback overran by 2ms [taking 3000ns]
2010-08-13 16:21:22 warning Timer callback overran by 2ms [taking 12000ns]
2010-08-13 16:21:27 warning Timer callback overran by 2ms [taking 4000ns]
2010-08-13 16:21:27 warning Timer callback overran by 2ms [taking 8000ns]
2010-08-13 16:21:27 warning Timer callback overran by 2ms [taking 4000ns]
2010-08-13 16:21:27 warning Timer callback overran by 2ms [taking 4000ns]
2010-08-13 16:21:27 warning Timer callback overran by 2ms [taking 10000ns]
2010-08-13 16:21:30 notice Journal "q_queue_000007": Created
2010-08-13 16:21:30 notice Journal "q_queue_070000": Created
2010-08-13 16:21:30 notice Journal "q_queue_700000": Created
2010-08-13 16:21:30 notice Journal "q_queue_000700": Created
2010-08-13 16:21:30 notice Journal "q_queue_000070": Created
2010-08-13 16:21:30 notice Journal "q_queue_007000": Created
2010-08-13 16:21:32 info Queue "q_queue_000070": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:21:32 info Queue "q_queue_000007": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:21:32 info Queue "q_queue_007000": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:21:32 info Queue "q_queue_000700": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:21:32 info Queue "q_queue_070000": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:21:32 info Queue "q_queue_700000": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:21:43 warning Timer callback overran by 2ms [taking 15000ns]
2010-08-13 16:21:44 warning Timer callback overran by 2ms [taking 17000ns]
2010-08-13 16:21:44 warning Timer callback overran by 2ms [taking 4000ns]
2010-08-13 16:21:44 warning Timer callback overran by 2ms [taking 5000ns]
2010-08-13 16:21:45 warning Timer callback overran by 2ms [taking 5000ns]
2010-08-13 16:21:45 warning Timer callback overran by 2ms [taking 4000ns]
2010-08-13 16:21:45 warning Timer callback overran by 2ms [taking 5000ns]
2010-08-13 16:21:45 warning Timer callback overran by 2ms [taking 12000ns]
2010-08-13 16:21:46 warning Timer callback overran by 2ms [taking 12000ns]
2010-08-13 16:21:46 warning Timer callback overran by 2ms [taking 4000ns]
2010-08-13 16:21:46 warning Timer callback overran by 2ms [taking 10000ns]
2010-08-13 16:21:48 notice Journal "q_queue_008000": Created
2010-08-13 16:21:48 notice Journal "q_queue_000008": Created
2010-08-13 16:21:48 notice Journal "q_queue_800000": Created
2010-08-13 16:21:48 notice Journal "q_queue_080000": Created
2010-08-13 16:21:48 notice Journal "q_queue_000080": Created
2010-08-13 16:21:48 notice Journal "q_queue_000800": Created
2010-08-13 16:21:49 warning Timer callback overran by 2ms [taking 5000ns]
2010-08-13 16:21:49 warning Timer callback overran by 2ms [taking 3000ns]
2010-08-13 16:21:49 warning Timer callback overran by 2ms [taking 3000ns]
2010-08-13 16:21:54 warning Timer callback overran by 2ms [taking 16000ns]
2010-08-13 16:21:55 info Queue "q_queue_800000": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:21:55 info Queue "q_queue_000008": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:21:55 info Queue "q_queue_000080": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:21:56 warning Timer callback overran by 2ms [taking 10000ns]
2010-08-13 16:21:57 warning Timer callback overran by 2ms [taking 12000ns]
2010-08-13 16:21:57 warning Timer callback overran by 2ms [taking 4000ns]
2010-08-13 16:21:57 warning Timer callback overran by 2ms [taking 3000ns]
2010-08-13 16:21:57 warning Timer callback overran by 2ms [taking 3000ns]
2010-08-13 16:21:57 warning Timer callback overran by 2ms [taking 7000ns]
2010-08-13 16:21:57 warning Timer callback overran by 2ms [taking 9000ns]
2010-08-13 16:21:57 warning Timer callback overran by 2ms [taking 4000ns]
2010-08-13 16:21:58 warning Timer callback overran by 2ms [taking 6000ns]
2010-08-13 16:21:58 warning Timer callback overran by 2ms [taking 7000ns]
2010-08-13 16:21:58 warning Timer callback overran by 2ms [taking 3000ns]
2010-08-13 16:21:58 warning Timer callback overran by 2ms [taking 3000ns]
2010-08-13 16:21:58 warning Timer callback overran by 2ms [taking 4000ns]
2010-08-13 16:21:58 warning Timer callback overran by 2ms [taking 5000ns]
2010-08-13 16:21:59 warning Timer callback overran by 2ms [taking 23000ns]
2010-08-13 16:22:01 warning Timer callback overran by 2ms [taking 5000ns]
2010-08-13 16:22:01 warning Timer callback overran by 2ms [taking 3000ns]
2010-08-13 16:22:06 notice Journal "q_queue_900000": Created
2010-08-13 16:22:06 notice Journal "q_queue_000090": Created
2010-08-13 16:22:06 notice Journal "q_queue_000009": Created
2010-08-13 16:22:07 info Queue "q_queue_008000": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:22:07 notice Journal "q_queue_009000": Created
2010-08-13 16:22:07 info Queue "q_queue_080000": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:22:07 notice Journal "q_queue_090000": Created
2010-08-13 16:22:08 info Queue "q_queue_000800": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:22:08 notice Journal "q_queue_000900": Created
2010-08-13 16:22:09 info Queue "q_queue_900000": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:22:09 info Queue "q_queue_000090": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:22:10 info Queue "q_queue_000009": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:22:18 info Queue "q_queue_090000": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:22:18 info Queue "q_queue_009000": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:22:18 info Queue "q_queue_000900": Policy created: type=reject; maxCount=0; maxSize=104857600
2010-08-13 16:22:19 warning Timer callback overran by 2ms [taking 7000ns]
2010-08-13 16:22:19 warning Timer callback overran by 2ms [taking 4000ns]
2010-08-13 16:22:22 notice Shut down
core-test-transcript---------
Exception in thread Thread-19:
Traceback (most recent call last):
  File "/usr/lib64/python2.4/threading.py", line 442, in __bootstrap
    self.run()
  File "/usr/lib64/python2.4/threading.py", line 422, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib/python2.4/site-packages/qpid/queue.py", line 84, in run
    self.listener(o)
  File "/usr/lib/python2.4/site-packages/qpid/management.py", line 200, in topicCb
    self.ssn.channel.session_completed(self.ssn.receiver._completed)
  File "/usr/lib/python2.4/site-packages/qpid/generator.py", line 25, in <lambda>
    method = lambda self, *args, **kwargs: self.invoke(op, args, kwargs)
  File "/usr/lib/python2.4/site-packages/qpid/connection.py", line 229, in invoke
    self.connection.write_op(ctl)
  File "/usr/lib/python2.4/site-packages/qpid/connection.py", line 200, in write_op
    self.flush()
  File "/usr/lib/python2.4/site-packages/qpid/framer.py", line 57, in flush
    self._write(cipher_buf)
  File "/usr/lib/python2.4/site-packages/qpid/framer.py", line 68, in _write
    n = self.sock.send(buf)
error: (32, 'Broken pipe')

Traceback (most recent call last):
  File "bz456272.py", line 72, in run
    self.mgr.AddQueue(qid, True)
  File "rhtslib_management.py", line 105, in AddQueue
    arguments=args)
  File "/usr/lib/python2.4/site-packages/qpid/generator.py", line 25, in <lambda>
    method = lambda self, *args, **kwargs: self.invoke(op, args, kwargs)
  File "/usr/lib/python2.4/site-packages/qpid/session.py", line 138, in invoke
    return self.do_invoke(op, args, kwargs)
  File "/usr/lib/python2.4/site-packages/qpid/session.py", line 150, in do_invoke
    raise SessionDetached()
SessionDetached
  File "/usr/lib64/python2.4/threading.py", line 442, in __bootstrap
    self.run()
  File "bz456272.py", line 85, in run
    traceback.print_stack()
===> Preparing ... 
AddQueue(0) - connecting to broker
AddQueue(0) - connected
AddQueue(0) ... Waiting for phase 1 ...
AddQueue(1) - connecting to broker
AddQueue(1) - connected
AddQueue(1) ... Waiting for phase 1 ...
AddQueue(2) - connecting to broker
AddQueue(2) - connected
AddQueue(2) ... Waiting for phase 1 ...
AddQueue(3) - connecting to broker
AddQueue(3) - connected
AddQueue(3) ... Waiting for phase 1 ...
AddQueue(4) - connecting to broker
AddQueue(4) - connected
AddQueue(5) - connecting to broker
AddQueue(4) ... Waiting for phase 1 ...
AddQueue(5) - connected
AddQueue(5) ... Waiting for phase 1 ...
===> All threads ready for phase 1 ... sleeping a few (10) seconds!
AddQueue(0) ... waiting for phase 2... 
AddQueue(1) ... waiting for phase 2... 
AddQueue(2) ... waiting for phase 2... 
AddQueue(3) ... waiting for phase 2... 
AddQueue(4) ... waiting for phase 2... 
AddQueue(5) ... waiting for phase 2... 
===> Doing loop 1 of 10
---> Sleeping for another 15 seconds, to allow all threads to get ready for phase 2 ...
---> Removing lock
[1281730781.998739] AddQueue(2) ... Adding queue 'q_queue_000100'
[1281730781.999683] AddQueue(0) ... Adding queue 'q_queue_000001'
[1281730782.000617] AddQueue(1) ... Adding queue 'q_queue_000010'
[1281730782.001600] AddQueue(0) done (time used: 0.001917)
AddQueue(0) ... Waiting for phase 1 ...
[1281730782.001675] AddQueue(5) ... Adding queue 'q_queue_100000'
[1281730782.001935] AddQueue(2) done (time used: 0.003196)
AddQueue(2) ... Waiting for phase 1 ...
[1281730782.002523] AddQueue(1) done (time used: 0.001906)
AddQueue(1) ... Waiting for phase 1 ...
[1281730782.002666] AddQueue(5) done (time used: 0.000991)
AddQueue(5) ... Waiting for phase 1 ...
[1281730782.002771] AddQueue(3) ... Adding queue 'q_queue_001000'
[1281730782.003716] AddQueue(4) ... Adding queue 'q_queue_010000'
[1281730782.003806] AddQueue(4) FAILED (time used: 0.000090) : 
Exception Info:
[1281730782.003970] AddQueue(3) done (time used: 0.001199)
AddQueue(3) ... Waiting for phase 1 ...
Stack Trace:
AddQueue(4) ... Waiting for phase 1 ...
---> Waiting for all threads to get ready again
---> ready: 0


===> Doing loop 2 of 10
---> Sleeping for another 15 seconds, to allow all threads to get ready for phase 2 ...
AddQueue(0) ... waiting for phase 2... 
AddQueue(2) ... waiting for phase 2... 
AddQueue(5) ... waiting for phase 2... 
AddQueue(3) ... waiting for phase 2... 
AddQueue(1) ... waiting for phase 2... 
AddQueue(4) ... waiting for phase 2... 
---> Removing lock
[1281730800.124553] AddQueue(0) ... Adding queue 'q_queue_000002'
[1281730800.125485] AddQueue(5) ... Adding queue 'q_queue_200000'
[1281730800.126431] AddQueue(4) ... Adding queue 'q_queue_020000'
[1281730800.127374] AddQueue(5) done (time used: 0.001889)
AddQueue(5) ... Waiting for phase 1 ...
[1281730800.127460] AddQueue(3) ... Adding queue 'q_queue_002000'
[1281730800.128119] AddQueue(4) done (time used: 0.001688)
AddQueue(4) ... Waiting for phase 1 ...
[1281730800.128193] AddQueue(2) ... Adding queue 'q_queue_000200'
[1281730800.128846] AddQueue(0) done (time used: 0.004293)
AddQueue(0) ... Waiting for phase 1 ...
[1281730800.129006] AddQueue(3) done (time used: 0.001546)
AddQueue(3) ... Waiting for phase 1 ...
[1281730800.129126] AddQueue(1) ... Adding queue 'q_queue_000020'
[1281730800.129878] AddQueue(1) done (time used: 0.000752)
AddQueue(1) ... Waiting for phase 1 ...
[1281730800.130141] AddQueue(2) done (time used: 0.001948)
AddQueue(2) ... Waiting for phase 1 ...
---> Waiting for all threads to get ready again
---> ready: 0


===> Doing loop 3 of 10
---> Sleeping for another 15 seconds, to allow all threads to get ready for phase 2 ...
AddQueue(5) ... waiting for phase 2... 
AddQueue(4) ... waiting for phase 2... 
AddQueue(3) ... waiting for phase 2... 
AddQueue(0) ... waiting for phase 2... 
AddQueue(1) ... waiting for phase 2... 
AddQueue(2) ... waiting for phase 2... 
---> Removing lock
[1281730818.157759] AddQueue(0) ... Adding queue 'q_queue_000003'
[1281730818.158421] AddQueue(4) ... Adding queue 'q_queue_030000'
[1281730818.159349] AddQueue(0) done (time used: 0.001590)
AddQueue(0) ... Waiting for phase 1 ...
[1281730818.159455] AddQueue(2) ... Adding queue 'q_queue_000300'
[1281730818.159642] AddQueue(5) ... Adding queue 'q_queue_300000'
[1281730818.160329] AddQueue(4) done (time used: 0.001908)
AddQueue(4) ... Waiting for phase 1 ...
[1281730818.160437] AddQueue(5) done (time used: 0.000795)
AddQueue(5) ... Waiting for phase 1 ...
[1281730818.160484] AddQueue(1) ... Adding queue 'q_queue_000030'
[1281730818.160878] AddQueue(1) done (time used: 0.000394)
AddQueue(1) ... Waiting for phase 1 ...
[1281730818.160906] AddQueue(3) ... Adding queue 'q_queue_003000'
[1281730818.161177] AddQueue(3) done (time used: 0.000271)
AddQueue(3) ... Waiting for phase 1 ...
[1281730818.161332] AddQueue(2) done (time used: 0.001877)
AddQueue(2) ... Waiting for phase 1 ...
---> Waiting for all threads to get ready again
---> ready: 0


===> Doing loop 4 of 10
---> Sleeping for another 15 seconds, to allow all threads to get ready for phase 2 ...
AddQueue(0) ... waiting for phase 2... 
AddQueue(4) ... waiting for phase 2... 
AddQueue(5) ... waiting for phase 2... 
AddQueue(2) ... waiting for phase 2... 
AddQueue(1) ... waiting for phase 2... 
AddQueue(3) ... waiting for phase 2... 
---> Removing lock
[1281730836.155377] AddQueue(0) ... Adding queue 'q_queue_000004'
[1281730836.155513] AddQueue(4) ... Adding queue 'q_queue_040000'
[1281730836.156124] AddQueue(4) done (time used: 0.000611)
AddQueue(4) ... Waiting for phase 1 ...
[1281730836.156194] AddQueue(2) ... Adding queue 'q_queue_000400'
[1281730836.157592] AddQueue(5) ... Adding queue 'q_queue_400000'
[1281730836.157899] AddQueue(0) done (time used: 0.002522)
AddQueue(0) ... Waiting for phase 1 ...
[1281730836.158040] AddQueue(2) done (time used: 0.001846)
AddQueue(2) ... Waiting for phase 1 ...
[1281730836.158115] AddQueue(3) ... Adding queue 'q_queue_004000'
[1281730836.158714] AddQueue(1) ... Adding queue 'q_queue_000040'
[1281730836.159196] AddQueue(3) done (time used: 0.001081)
AddQueue(3) ... Waiting for phase 1 ...
[1281730836.159287] AddQueue(1) done (time used: 0.000573)
AddQueue(1) ... Waiting for phase 1 ...
[1281730836.159772] AddQueue(5) done (time used: 0.002180)
AddQueue(5) ... Waiting for phase 1 ...
---> Waiting for all threads to get ready again
---> ready: 0


===> Doing loop 5 of 10
---> Sleeping for another 15 seconds, to allow all threads to get ready for phase 2 ...
AddQueue(4) ... waiting for phase 2... 
AddQueue(0) ... waiting for phase 2... 
AddQueue(2) ... waiting for phase 2... 
AddQueue(5) ... waiting for phase 2... 
AddQueue(1) ... waiting for phase 2... 
AddQueue(3) ... waiting for phase 2... 
---> Removing lock
[1281730854.161455] AddQueue(1) ... Adding queue 'q_queue_000050'
[1281730854.162380] AddQueue(4) ... Adding queue 'q_queue_050000'
[1281730854.163361] AddQueue(1) done (time used: 0.001906)
AddQueue(1) ... Waiting for phase 1 ...
[1281730854.163444] AddQueue(5) ... Adding queue 'q_queue_500000'
[1281730854.164083] AddQueue(0) ... Adding queue 'q_queue_000005'
[1281730854.164477] AddQueue(5) done (time used: 0.001033)
AddQueue(5) ... Waiting for phase 1 ...
[1281730854.164616] AddQueue(4) done (time used: 0.002236)
AddQueue(4) ... Waiting for phase 1 ...
[1281730854.165453] AddQueue(0) done (time used: 0.001370)
AddQueue(0) ... Waiting for phase 1 ...
[1281730854.165771] AddQueue(3) ... Adding queue 'q_queue_005000'
[1281730854.166363] AddQueue(2) ... Adding queue 'q_queue_000500'
[1281730854.167016] AddQueue(3) done (time used: 0.001245)
AddQueue(3) ... Waiting for phase 1 ...
[1281730854.167135] AddQueue(2) done (time used: 0.000772)
AddQueue(2) ... Waiting for phase 1 ...
---> Waiting for all threads to get ready again
---> ready: 0


===> Doing loop 6 of 10
---> Sleeping for another 15 seconds, to allow all threads to get ready for phase 2 ...
AddQueue(1) ... waiting for phase 2... 
AddQueue(0) ... waiting for phase 2... 
AddQueue(2) ... waiting for phase 2... 
AddQueue(4) ... waiting for phase 2... 
AddQueue(5) ... waiting for phase 2... 
AddQueue(3) ... waiting for phase 2... 
---> Removing lock
[1281730872.159513] AddQueue(2) ... Adding queue 'q_queue_000600'
[1281730872.160431] AddQueue(5) ... Adding queue 'q_queue_600000'
[1281730872.161379] AddQueue(1) ... Adding queue 'q_queue_000060'
[1281730872.162357] AddQueue(5) done (time used: 0.001926)
AddQueue(5) ... Waiting for phase 1 ...
[1281730872.162531] AddQueue(2) done (time used: 0.003018)
AddQueue(2) ... Waiting for phase 1 ...
[1281730872.162699] AddQueue(1) done (time used: 0.001320)
AddQueue(1) ... Waiting for phase 1 ...
[1281730872.162774] AddQueue(4) ... Adding queue 'q_queue_060000'
[1281730872.162952] AddQueue(0) ... Adding queue 'q_queue_000006'
[1281730872.163990] AddQueue(0) done (time used: 0.001038)
AddQueue(0) ... Waiting for phase 1 ...
[1281730872.164055] AddQueue(3) ... Adding queue 'q_queue_006000'
[1281730872.164615] AddQueue(3) done (time used: 0.000560)
AddQueue(3) ... Waiting for phase 1 ...
[1281730872.164704] AddQueue(4) done (time used: 0.001930)
AddQueue(4) ... Waiting for phase 1 ...
---> Waiting for all threads to get ready again
---> ready: 0


===> Doing loop 7 of 10
---> Sleeping for another 15 seconds, to allow all threads to get ready for phase 2 ...
AddQueue(2) ... waiting for phase 2... 
AddQueue(5) ... waiting for phase 2... 
AddQueue(1) ... waiting for phase 2... 
AddQueue(0) ... waiting for phase 2... 
AddQueue(4) ... waiting for phase 2... 
AddQueue(3) ... waiting for phase 2... 
---> Removing lock
[1281730890.162640] AddQueue(0) ... Adding queue 'q_queue_000007'
[1281730890.163562] AddQueue(4) ... Adding queue 'q_queue_070000'
[1281730890.164529] AddQueue(0) done (time used: 0.001889)
AddQueue(0) ... Waiting for phase 1 ...
[1281730890.164643] AddQueue(1) ... Adding queue 'q_queue_000070'
[1281730890.164866] AddQueue(5) ... Adding queue 'q_queue_700000'
[1281730890.165660] AddQueue(4) done (time used: 0.002098)
AddQueue(4) ... Waiting for phase 1 ...
[1281730890.165800] AddQueue(5) done (time used: 0.000934)
AddQueue(5) ... Waiting for phase 1 ...
[1281730890.165872] AddQueue(2) ... Adding queue 'q_queue_000700'
[1281730890.168876] AddQueue(2) done (time used: 0.003004)
AddQueue(2) ... Waiting for phase 1 ...
[1281730890.168976] AddQueue(1) done (time used: 0.004333)
AddQueue(1) ... Waiting for phase 1 ...
[1281730890.169028] AddQueue(3) ... Adding queue 'q_queue_007000'
[1281730890.169553] AddQueue(3) done (time used: 0.000525)
AddQueue(3) ... Waiting for phase 1 ...
---> Waiting for all threads to get ready again
---> ready: 0


===> Doing loop 8 of 10
---> Sleeping for another 15 seconds, to allow all threads to get ready for phase 2 ...
AddQueue(0) ... waiting for phase 2... 
AddQueue(5) ... waiting for phase 2... 
AddQueue(4) ... waiting for phase 2... 
AddQueue(2) ... waiting for phase 2... 
AddQueue(1) ... waiting for phase 2... 
AddQueue(3) ... waiting for phase 2... 
---> Removing lock
[1281730908.164828] AddQueue(1) ... Adding queue 'q_queue_000080'
[1281730908.164997] AddQueue(3) ... Adding queue 'q_queue_008000'
[1281730908.165667] AddQueue(5) ... Adding queue 'q_queue_800000'
[1281730908.166178] AddQueue(0) ... Adding queue 'q_queue_000008'
[1281730908.167496] AddQueue(4) ... Adding queue 'q_queue_080000'
[1281730908.168318] AddQueue(3) done (time used: 0.003321)
AddQueue(3) ... Waiting for phase 1 ...
[1281730908.168476] AddQueue(0) done (time used: 0.002298)
AddQueue(0) ... Waiting for phase 1 ...
[1281730908.168646] AddQueue(4) done (time used: 0.001150)
AddQueue(4) ... Waiting for phase 1 ...
[1281730908.168800] AddQueue(5) done (time used: 0.003133)
AddQueue(5) ... Waiting for phase 1 ...
[1281730908.169423] AddQueue(1) done (time used: 0.004595)
AddQueue(1) ... Waiting for phase 1 ...
[1281730908.169476] AddQueue(2) ... Adding queue 'q_queue_000800'
[1281730908.169962] AddQueue(2) done (time used: 0.000486)
AddQueue(2) ... Waiting for phase 1 ...
---> Waiting for all threads to get ready again
---> ready: 0


===> Doing loop 9 of 10
---> Sleeping for another 15 seconds, to allow all threads to get ready for phase 2 ...
AddQueue(3) ... waiting for phase 2... 
AddQueue(1) ... waiting for phase 2... 
AddQueue(4) ... waiting for phase 2... 
AddQueue(5) ... waiting for phase 2... 
AddQueue(2) ... waiting for phase 2... 
AddQueue(0) ... waiting for phase 2... 
---> Removing lock
[1281730926.168189] AddQueue(4) ... Adding queue 'q_queue_090000'
[1281730926.169123] AddQueue(5) ... Adding queue 'q_queue_900000'
[1281730926.170089] AddQueue(4) done (time used: 0.001900)
AddQueue(4) ... Waiting for phase 1 ...
[1281730926.171932] AddQueue(3) ... Adding queue 'q_queue_009000'
[1281730926.172504] AddQueue(0) ... Adding queue 'q_queue_000009'
[1281730926.172886] AddQueue(5) done (time used: 0.003763)
AddQueue(5) ... Waiting for phase 1 ...
[1281730926.173000] AddQueue(3) done (time used: 0.001068)
AddQueue(3) ... Waiting for phase 1 ...
[1281730926.173099] AddQueue(1) ... Adding queue 'q_queue_000090'
[1281730926.174987] AddQueue(1) done (time used: 0.001888)
AddQueue(1) ... Waiting for phase 1 ...
[1281730926.175138] AddQueue(0) done (time used: 0.002634)
AddQueue(0) ... Waiting for phase 1 ...
[1281730926.175237] AddQueue(2) ... Adding queue 'q_queue_000900'
[1281730926.175968] AddQueue(2) done (time used: 0.000731)
AddQueue(2) ... Waiting for phase 1 ...
---> Waiting for all threads to get ready again
---> ready: 0


---> Exiting...
AddQueue(4) ... waiting for phase 2... 
AddQueue(5) ... waiting for phase 2... 
AddQueue(3) ... waiting for phase 2... 
AddQueue(1) ... waiting for phase 2... 
AddQueue(0) ... waiting for phase 2... 
AddQueue(2) ... waiting for phase 2... 
AddQueue(0) ... Disconnecting
AddQueue(1) ... Disconnecting
AddQueue(2) ... Disconnecting
AddQueue(3) ... Disconnecting
AddQueue(4) ... Disconnecting
AddQueue(5) ... Disconnecting

Comment 1 Frantisek Reznicek 2010-08-23 10:23:49 UTC
Reproducibility is higher now as I succeeded to trigger the issue again on RHEL 5.5 x86_64 on different machine (with same package set).

How reproducible:
40% on RHEL 5.5 x86_64

Comment 2 Kim van der Riet 2010-08-27 13:42:13 UTC
After examining the logs with trace-level logging enabled, it seems these failures are connected with qmf traffic (if I understand Gordon's analysis correctly), specifically when a qmf session is closed. In all cases I reproduced, the frame size 0 error was immediately preceded by a SessionCompletedBody with very large (corrupted) command completion values. In once case, this corrupted message also sparked an Execution exception on account of bad session completion information.

2010-08-23 19:14:01 trace RECV [127.0.0.1:57357]: Frame[BEbe; channel=1; {SessionCompletedBody: commands={ [0,10] [65537,50462837] }; }]
2010-08-23 19:14:01 debug anonymous.rhts_mrg10.lab.bos.redhat.com.11352_3_{(2010, 8, 23, 19, 14, 1, 0, 235, 1)}: sender marked completed: { [0,10] [65537,50462837] }
2010-08-23 19:14:01 debug anonymous.rhts_mrg10.lab.bos.redhat.com.11352_3_{(2010, 8, 23, 19, 14, 1, 0, 235, 1)}: sender confirmed point moved to (11+0)
2010-08-23 19:14:01 debug Exception constructed: Frame size too small 0 (qpid/framing/AMQFrame.cpp:101)
2010-08-23 19:14:01 error framing-error: Frame size too small 0 (qpid/framing/AMQFrame.cpp:101)


I commented out the management client from the test code (rhtslib_management.py:58, 59, 78) and reran the test. Although client connection timeout errors were observed in some runs, no further occurrence of the "Frame size 0" error was seen.

Since the client is using the old API which has known issues with thread safety, and since this is a multi-threaded client, I would suggest that in this case, the broker is not at fault, and is behaving correctly given the corrupted frame is likely being sent from the Python client.

I would propose changing this bug to state CLOSED/NOTABUG unless there are any objections or other considerations/tests.

Comment 3 Gordon Sim 2010-08-31 08:24:46 UTC
Thread-safety issue in old management client is now tracked by bug 628856 (whether or not this is deemed worth fixing given it is now superceded is another question). I'm going to mark this closed as suggested by Kim.


Note You need to log in before you can comment on or make changes to this bug.