Bug 872934 - Unit test causing segfault on clustered broker
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: Development
Hardware: Unspecified OS: Unspecified
Priority: high Severity: urgent
Target Milestone: 2.3
Target Release: ---
Assigned To: mick
QA Contact: Petr Matousek
Keywords: Regression
Depends On:
Blocks:
Reported: 2012-11-04 07:48 EST by Petr Matousek
Modified: 2013-03-19 12:38 EDT
CC List: 5 users

See Also:
Fixed In Version: qpid-cpp-0.18-10
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-03-19 12:38:38 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
clustered broker log (5672) (12.15 KB, text/plain)
2012-11-04 08:02 EST, Petr Matousek
clustered broker log (5673) (12.59 KB, text/plain)
2012-11-04 08:02 EST, Petr Matousek
clustered broker log (5672) (25.90 KB, text/plain)
2012-11-04 08:44 EST, Petr Matousek
clustered broker log (5673) (27.43 KB, text/plain)
2012-11-04 08:45 EST, Petr Matousek
broker coredump (5672) (15.34 KB, text/plain)
2012-11-04 08:46 EST, Petr Matousek
broker coredump (5673) (14.74 KB, text/plain)
2012-11-04 08:47 EST, Petr Matousek

Description Petr Matousek 2012-11-04 07:48:23 EST
Description of problem:

The following QMF unit test causes a segfault on a RHEL 5 clustered broker in the exit phase:
qpid_tests.broker_0_10.qmf_events.EventTests.test_queue_autodelete_exclusive ......................................................................... fail
Error during teardown:  Traceback (most recent call last):
    File "/usr/bin/qpid-python-test", line 340, in run
      phase()
    File "/usr/lib/python2.4/site-packages/qpid/tests/messaging/__init__.py", line 55, in teardown
      self.teardown_connection(self.conn)
    File "/usr/lib/python2.4/site-packages/qpid/tests/messaging/__init__.py", line 59, in teardown_connection
      conn.close(timeout=self.timeout())
    File "<string>", line 6, in close
    File "/usr/lib/python2.4/site-packages/qpid/messaging/endpoints.py", line 316, in close
      ssn.close(timeout=timeout)
    File "<string>", line 6, in close
    File "/usr/lib/python2.4/site-packages/qpid/messaging/endpoints.py", line 749, in close
      if not self._ewait(lambda: self.closed, timeout=timeout):
    File "/usr/lib/python2.4/site-packages/qpid/messaging/endpoints.py", line 566, in _ewait
      result = self.connection._ewait(lambda: self.error or predicate(), timeout)
    File "/usr/lib/python2.4/site-packages/qpid/messaging/endpoints.py", line 209, in _ewait
      self.check_error()
    File "/usr/lib/python2.4/site-packages/qpid/messaging/endpoints.py", line 202, in check_error
      raise self.error
  ConnectionError: (104, 'Connection reset by peer')
Totals: 1 tests, 0 passed, 0 skipped, 0 ignored, 1 failed
[1]-  Segmentation fault      (core dumped) qpidd --cluster-name pematous --data-dir=broker1 >&broker1.log
[2]+  Segmentation fault      (core dumped) qpidd -p 5673 --cluster-name pematous --data-dir=broker2 >&broker2.log

The brokers' log files are attached.

A standalone broker does not suffer from this problem.

Version-Release number of selected component (if applicable):
qpid-cpp-*-0.18-6

How reproducible:
100%

Steps to Reproduce:
1. qpidd  --cluster-name pematous --data-dir=broker1 &>broker1.log &
2. qpidd -p 5673 --cluster-name pematous --data-dir=broker2 &>broker2.log &
3. PYTHONPATH=${PYTHONPATH}:. qpid-python-test -m qpid_tests qpid_tests.broker_0_10.qmf_events.EventTests.test_queue_autodelete_exclusive
  
Actual results:
Segfault

Expected results:
The test passes.

Additional info:
Comment 2 Petr Matousek 2012-11-04 08:00:53 EST
NOTE: this was also seen on rhel6
Comment 3 Petr Matousek 2012-11-04 08:02:00 EST
Created attachment 637988
clustered broker log (5672)
Comment 4 Petr Matousek 2012-11-04 08:02:33 EST
Created attachment 638000
clustered broker log (5673)
Comment 5 Petr Matousek 2012-11-04 08:44:29 EST
Created attachment 638007
clustered broker log (5672)
Comment 6 Petr Matousek 2012-11-04 08:45:07 EST
Created attachment 638008
clustered broker log (5673)
Comment 7 Petr Matousek 2012-11-04 08:46:43 EST
Created attachment 638009
broker coredump (5672)
Comment 8 Petr Matousek 2012-11-04 08:47:18 EST
Created attachment 638010
broker coredump (5673)
Comment 9 mick 2012-11-09 08:38:39 EST
This problem was introduced into the 0.18-mrg branch at this point:

  Bug 869002 - QPID-4394
  sha: 795e416d4a3a07c655ae47e23c48b7244436a87c

Broker::createQueue thinks it has created the queue.
Cluster::deliverToQueue disagrees.   

...still investigating...
Comment 10 mick 2012-11-19 13:46:46 EST
partial fix:

to avoid the SEGV, 

in file src/qpid/cluster/Cluster.cpp

in fn   Cluster::deliverToQueue

in the code block    if ( ! q )

there should be a return; after the call to leave(l);

Otherwise, control will pass to the code below that block, and we will attempt to deliver a message to a queue that we already know to be nonexistent, using a pointer that is null.
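
The guard described above can be illustrated with a minimal C++ sketch. This is not the qpid source: `Queue`, `Cluster`, `addQueue`, `findQueue`, `hasLeft`, and the string message type are hypothetical stand-ins, and `leave()` is simplified to take no arguments. Only the shape of the `if ( ! q )` block, with a `return;` after the call to leave, is taken from this comment.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a broker queue (not qpid::broker::Queue).
struct Queue {
    std::vector<std::string> messages;
    void deliver(const std::string& msg) { messages.push_back(msg); }
};

// Hypothetical stand-in for the cluster member (not qpid::cluster::Cluster).
class Cluster {
public:
    void addQueue(const std::string& name) {
        queues_[name] = std::make_shared<Queue>();
    }

    // Returns a null pointer when the queue does not exist.
    std::shared_ptr<Queue> findQueue(const std::string& name) const {
        auto it = queues_.find(name);
        if (it == queues_.end()) return nullptr;
        return it->second;
    }

    bool hasLeft() const { return left_; }

    void deliverToQueue(const std::string& queueName, const std::string& msg) {
        std::shared_ptr<Queue> q = findQueue(queueName);
        if (!q) {
            leave();
            return;  // without this, control falls through to q->deliver() on a null q
        }
        assert(q);   // past the guard, q is known to be non-null
        q->deliver(msg);
    }

private:
    void leave() { left_ = true; }  // simplified: the real call takes a lock argument

    std::map<std::string, std::shared_ptr<Queue>> queues_;
    bool left_ = false;
};
```

In this sketch, calling `deliverToQueue` with an unknown queue name marks the member as having left and returns without dereferencing the null pointer.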


This change must definitely be made, but it is still only a partial fix, as the test in question still fails.  

...still investigating...
Comment 11 mick 2012-11-20 14:28:27 EST
The SEGV has been fixed by a recent checkin that added a throw to the code mentioned in comment 10, above.

( The fix came from Alan's checkin for BZ 875660 )
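
For comparison with the early return suggested in comment 10, a throw in the same position also prevents the null dereference. The sketch below is only an assumption about the general shape. The exception type, the message text, and the names `Queue`, `QueueMap`, and `deliverOrThrow` are hypothetical and are not taken from the actual checkin.

```cpp
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a broker queue (not qpid::broker::Queue).
struct Queue {
    std::vector<std::string> messages;
};

using QueueMap = std::map<std::string, std::shared_ptr<Queue>>;

// Throws instead of returning when the target queue is missing, so the
// delivery code after the guard is never reached with a null pointer.
void deliverOrThrow(const QueueMap& queues,
                    const std::string& name,
                    const std::string& msg) {
    auto it = queues.find(name);
    if (it == queues.end() || !it->second) {
        // Assumed exception type and wording, for illustration only.
        throw std::runtime_error("delivery to non-existent queue: " + name);
    }
    it->second->messages.push_back(msg);
}
```

Unlike a plain `return;`, a throw forces the caller to handle the missing queue explicitly or let the error propagate.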

I believe that the continuing test failure is benign, and only indicates that the testing code needs to be made a little smarter.  ( I will explain in the new BZ. )

Please redefine this bug as concerning the SEGV only.
I am moving the failure of the test to a new BZ.
Comment 12 mick 2012-11-20 15:37:12 EST
new BZ is 878638.
Comment 13 Petr Matousek 2012-11-28 05:06:27 EST
The SEGV has been fixed; however, the above-mentioned test still fails and causes the broker to shut down due to cluster delivery to a non-existent queue.
This issue is tracked by bug 878638.

Verified on rhel5.9 and rhel6.3 (x86_64, i386)

packages used for testing:
qpid-cpp-*-0.18-10

-> VERIFIED
