Bug 786418

Summary: QMF errors ignored by cluster, causing cluster de-sync
Product: Red Hat Enterprise MRG
Component: qpid-cpp
Version: 2.1
Hardware: All
OS: Linux
Status: CLOSED NOTABUG
Severity: high
Priority: high
Reporter: Pavel Moravec <pmoravec>
Assignee: Ken Giusti <kgiusti>
QA Contact: Frantisek Reznicek <freznice>
CC: esammons, freznice, iboverma, jross, kgiusti, pematous, rdassen
Target Milestone: 2.2
Doc Type: Bug Fix
Last Closed: 2012-06-21 15:37:54 UTC

Attachments: Test cases: QMF create() vs AMQP queue.declare

Description Pavel Moravec 2012-02-01 11:52:29 UTC
Description of problem:
Cluster error handling ignores errors raised via QMF. As a result, a node affected by an error keeps running while the other nodes never see that error, i.e. the cluster becomes de-synchronized.

Particular example: via QMF, create a huge durable queue on a 2-node cluster, such that node1 does not have sufficient free disk space for the queue journals while node2 does. The cluster won't detect that node1 failed to create the queue, leaving the cluster running with one node that has the queue and one node that does not.


Version-Release number of selected component (if applicable):
any (including qpid-0.12)


How reproducible:
100%


Steps to Reproduce:
1) A 2-node cluster is running.
2) Leave less than 13M of free disk space on node1 (while node2 has enough free space).
3) On node1, run the attached simple program (see the sketch after this list), which creates queue HugeDurableQueue with qpid.file_count=64 and qpid.file_size=16384 (i.e. a journal of approx. 13M is to be created).
4) The QMF response will be negative (correct), but both nodes keep running, with node1 not having the queue provisioned while node2 has it.
5) Repeating the test by sending the QMF command to node2 (which has enough free disk space) produces a positive QMF response - the user gets no indication of any problem on the cluster.
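
For reference, a minimal sketch of what the attached reproducer does, assuming the Python qmf.console bindings shipped with qpid-tools (the host name "node1" and the property values come from the steps above; this is a reconstruction, not the attached program itself):

from qmf.console import Session

# Connect to node1 of the cluster and look up the broker management object.
session = Session()
broker = session.addBroker("amqp://node1:5672")
broker_obj = session.getObjects(_class="broker",
                                _package="org.apache.qpid.broker")[0]

# Invoke the QMF create() method: the request and the (negative) response
# travel in-band as ordinary messages, so the cluster never sees the error.
result = broker_obj.create("queue",                   # object type
                           "HugeDurableQueue",        # queue name
                           {"durable": True,
                            "qpid.file_count": 64,
                            "qpid.file_size": 16384},
                           False)                     # strict
print result.status, result.text

session.delBroker(broker)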

  
Actual results:
a) node1 is not shut down
b) invoking the QMF command against node2 returns a positive response


Expected results:
a) node1 is shut down due to a local error not encountered by the other broker(s)
b) invoking the QMF command against any node returns a negative response


Additional info:
Upstream JIRA QPID-3796 has been created for the same issue.

Comment 1 Justin Ross 2012-02-02 16:49:43 UTC
Ken, please assess.

Comment 2 Ken Giusti 2012-02-03 17:05:33 UTC
Created attachment 559326 [details]
Test cases: QMF create() vs AMQP queue.declare

Shows the inconsistent behaviour between the two methods used to create a queue. The state of the cluster is invalid after the QMF method is used.

Comment 3 Ken Giusti 2012-02-03 17:13:13 UTC
Alan Conway hit the nail on the head:

"I think that is a bug. Haven't looked at the code but I would guess the QMF
errors are sent as QMF response messages, and not raised as AMQP errors, which
is what the cluster code is looking out for."

He's exactly correct - QMF creates a queue "in-band", using regular data traffic that just happens to have meaning to the broker. To the cluster, this traffic isn't anything special - it's just passed around like regular data.

But when AMQP commands are used to create the queue (queue.declare), the cluster is able to see the create happen, as these commands are "out of band", and the cluster is monitoring this traffic. 
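For comparison, a sketch of the out-of-band path using the Python qpid.messaging API (host name is again an assumption): creating a sender against an address with a "create" policy issues an AMQP queue.declare under the hood, which the cluster intercepts and replicates.

from qpid.messaging import Connection

conn = Connection("node1:5672")
conn.open()
try:
    ssn = conn.session()
    # The create policy triggers queue.declare - an AMQP command the
    # cluster monitors, unlike an in-band QMF method request.
    ssn.sender("HugeDurableQueue; {create: always, node: {durable: True, "
               "x-declare: {arguments: {'qpid.file_count': 64, "
               "'qpid.file_size': 16384}}}}")
finally:
    conn.close()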

qpid-config currently uses the AMQP commands method, not the QMF method, which is probably why we haven't hit this sooner.

This issue will be seen for all other object types (exchanges, bindings, etc.) - it is not specific to queues.

To solve this, we have to make QMF and clustering aware of each other.

Do we make the cluster sniff for well-known QMF addresses? Or should QMF indicate these operations directly to the cluster code via an API?

Comment 5 Ken Giusti 2012-06-21 17:19:46 UTC
The result, while not ideal, cannot be prevented because the cluster cannot be guaranteed to operate correctly in this configuration.

The host environment differs between the clustered brokers - one host has more available disk space than the other. This contradicts the prescribed deployment guidelines for clustering: the environments must provide equivalent resources. If that requirement is not met, discrepancies will eventually be introduced.