Bug 860018

Summary:

Possible message loss if a cluster is partitioned

Product:

Red Hat Enterprise MRG

Reporter:

Alan Conway <aconway>

Component:

qpid-cpp

Assignee:

messaging-bugs <messaging-bugs>

Status:

CLOSED WONTFIX

QA Contact:

MRG Quality Engineering <mrgqe-bugs>

Severity:

medium

Docs Contact:

Priority:

medium

Version:

2.2

CC:

jross, mcressma

Target Milestone:

---

Target Release:

---

Hardware:

Unspecified

OS:

Unspecified

Whiteboard:

Fixed In Version:

Doc Type:

Bug Fix

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2019-11-27 23:15:45 UTC

Type:

Bug

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

---

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Attachments:

Description	Flags
Reproducer	none

Description Alan Conway 2012-09-24 15:35:31 UTC

Description of problem:

If a cluster is partitioned, it is theoretically possible for some messages sent by a client of an inquorate broker to be lost because of a mismatch between the cman and corosync status.

Version-Release number of selected component (if applicable): 0.18

How reproducible: never observed; this is a theoretical bug.

Steps to Reproduce: Unknown

Actual results: message loss

Expected results: no message loss

Additional info:

Qpidd monitors cman for quorum changes and shuts the broker down if it becomes inquorate. This is intended to prevent message loss by forcing clients to fail over to a healthy broker and replay their unacknowledged messages.

However, there is a possible race between the point where qpidd checks the quorum status with cman and the point where it multicasts messages to corosync.

Each time a broker joins or leaves the cluster, the cluster is considered to be in a new configuration. Each configuration is identified by a sequence number called the ring-id. Although cman and corosync are dealing with the same cluster, they update their cluster status independently. As qpidd is currently coded, it can see a cman status from an older configuration that indicates quorum, yet send corosync messages to a newer, inquorate configuration.

To be sure it never sends messages to an inquorate cluster, qpidd needs to check before each mcast that the cman and corosync ring-ids are the same AND that cman reports the cluster as quorate. If not, qpidd needs to wait until the ring-ids converge before mcasting anything.
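The check described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not qpidd code: `McastGate`, `CmanView`, `CorosyncView` and the `on...Change` callbacks are hypothetical names, the ring-id is simplified to a single integer, and no real cman or corosync API is called. The two views are updated independently, and a multicast is allowed only while both ring-ids match and cman reports quorum.

```cpp
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical snapshots of what each layer last reported.
// These are illustration-only types, not real cman or corosync structures.
struct CmanView { std::uint64_t ringId; bool quorate; };
struct CorosyncView { std::uint64_t ringId; };

// Gate consulted before every multicast (hypothetical helper).
class McastGate {
  public:
    // Called when cman reports a new configuration or quorum state.
    void onCmanChange(std::uint64_t ringId, bool quorate) {
        std::lock_guard<std::mutex> l(lock_);
        cman_ = {ringId, quorate};
        changed_.notify_all();
    }

    // Called when corosync reports a new configuration.
    void onCorosyncChange(std::uint64_t ringId) {
        std::lock_guard<std::mutex> l(lock_);
        corosync_ = {ringId};
        changed_.notify_all();
    }

    // True only if both layers agree on the ring-id AND cman is quorate.
    bool safeToMcast() const {
        std::lock_guard<std::mutex> l(lock_);
        return safe();
    }

    // Block until the views converge on a quorate configuration,
    // or until the timeout expires. Returns false on timeout.
    template <class Rep, class Period>
    bool waitUntilSafe(std::chrono::duration<Rep, Period> timeout) {
        std::unique_lock<std::mutex> l(lock_);
        return changed_.wait_for(l, timeout, [this] { return safe(); });
    }

  private:
    bool safe() const {
        return cman_.quorate && cman_.ringId == corosync_.ringId;
    }

    mutable std::mutex lock_;
    std::condition_variable changed_;
    CmanView cman_{0, false};   // nothing is known yet, so treat as inquorate
    CorosyncView corosync_{0};
};
```

A sender would call `waitUntilSafe()` before each mcast and treat a timeout as "still inquorate". Note that in a real broker the check and the mcast would also have to be atomic with respect to configuration changes; this sketch does not address that.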

The fix should be reasonably straightforward, but testing will probably be very difficult. I'm not sure how the problem could be reproduced.

Comment 2 Pavel Moravec 2012-10-04 10:49:29 UTC

Created attachment 621562
Reproducer

A "weak" reproducer: using the script test_bz860018.sh, I was able to get message loss only very rarely (once in many hours of running) and/or message duplication (twice in the same period).

The repro simply runs qpid-send and qpid-receive (with message loss and duplication checks on) against a broker where network failure is emulated.
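For illustration only, a loss and duplication check of this kind can be thought of as follows. This is a minimal sketch and not the implementation used by qpid-receive; `checkSequence` and `CheckResult` are hypothetical names, and it assumes the sender numbered its messages 1..sent.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Result of comparing received sequence numbers with what was sent.
struct CheckResult {
    std::vector<std::uint64_t> lost;        // sent but never received
    std::vector<std::uint64_t> duplicated;  // received more than once
};

// Assumption: the sender numbered its messages 1..sent.
// Arrival order does not matter for this check.
CheckResult checkSequence(const std::vector<std::uint64_t>& received,
                          std::uint64_t sent) {
    // Count how many times each sequence number arrived.
    std::map<std::uint64_t, unsigned> seen;
    for (std::uint64_t sn : received) ++seen[sn];

    CheckResult result;
    for (std::uint64_t sn = 1; sn <= sent; ++sn) {
        auto it = seen.find(sn);
        if (it == seen.end())
            result.lost.push_back(sn);
        else if (it->second > 1)
            result.duplicated.push_back(sn);
    }
    return result;
}
```

For example, receiving 1, 2, 2, 4, 5 after 5 messages were sent reports message 3 as lost and message 2 as duplicated.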

The network failure is emulated following https://access.redhat.com/knowledge/solutions/79523, by dropping all traffic on the Ethernet interface used by corosync+cman (note: the test machine needs 2 NICs so that AMQP traffic keeps passing).

The reproducer has two flaws:
1) It takes ages to detect and recover from a split-brain. Usually, a node reboot is required to un-fence. The script mimics this in some way, just without reboots.

2) Message loss or duplication is seen quite rarely, so the script needs to run for a long time to verify a possible fix.