Red Hat Bugzilla – Bug 143555
Want redundant quorum partitions
Last modified: 2009-04-16 16:26:34 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (compatible; Konqueror/3.3; Linux) (KHTML, like Gecko)
Description of problem:
It would be nice to have redundant quorum partitions.
As of now, both are MANDATORY for cluster services to start.
But what if I have two geographically separated storage arrays and Oracle 10g with ASM (datafile mirroring), and I want the cluster to run even if one of them is broken/unpowered/stolen-by-martians?
/dev/sda resides on array 1 in Rome
/dev/sdb resides on array 2 in Paris
If Rome gets hit by a meteor, the surviving cluster nodes will not start anymore.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Make one of the quorum raw devices unavailable
2. service clumanager start
Actual Results: Starting Red Hat Cluster Manager...
Starting Quorum Daemon:
Message from syslogd@dbrhs2 at Wed Dec 22 08:33:46 2004 ...
dbrhs2 cluquorumd: <emerg> Not Starting Cluster Manager: Shared State Error: Bad file descriptor
Unable to open /dev/raw/raw2 read/write.
initSharedFD: unable to validate partition /dev/raw/raw2. Configuration error?
Expected Results: Starting Red Hat Cluster Manager...
Starting Quorum Daemon: [ OK ]
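(For reference, one way to carry out step 1 above on a RHEL3 node is to drop the quorum device's raw binding before starting clumanager. A rough sketch; it assumes the raw(8) convention that rebinding to major/minor 0 0 clears a mapping:)

# unbind the second quorum raw device (rebinding to 0 0 is assumed to clear it)
raw /dev/raw/raw2 0 0
# confirm the binding is gone
raw -qa
# starting the cluster should now fail as in the Actual Results above
service clumanager start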
Are you intending to use back-end (e.g. storage-handled) data replication?
If so, simply mirror both quorum partitions as well. Note that
disaster tolerance (e.g. site-disaster) doesn't currently work without
administrator intervention.
That is, you can mirror all of the data (as long as the cluster
members' hostnames and service IP addresses can be the same in both
locations) and run the cluster in either place. When site A fails, an
administrator must (a) prevent site A from restarting the cluster
services and (b) start the cluster services at site B.
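(A rough sketch of that administrator procedure using only the standard init scripts; the real steps at a given site also depend on how fencing and storage access are handled:)

# site A (failed site), once its nodes are reachable again:
chkconfig clumanager off     # keep clumanager from rejoining at boot
service clumanager stop

# site B (surviving site): bring the cluster and its services up
service clumanager start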
In future releases of Cluster Suite, quorum partitions will not be required.
Evaluating for future enhancement.
No back-end data replication is available.
I don't care about the data, as it's managed by Oracle 10g ASM (datafile mirroring).
I have 2 nodes and 2 arrays: node1 and array1 in site A and node2 and
array2 in site B:
sda -> quorum on array1 (raw1)
sdb -> ASM data on array1
sdc -> quorum on array2 (raw2)
sdd -> ASM data on array2
If either site fails (fire, blackout, meteor), the node in the other
site also fails (it can't reach one of the quorum partitions).
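(For reference, on RHEL3 this layout would normally be wired up through /etc/sysconfig/rawdevices; the partition suffixes below are assumptions, since the report only names whole disks:)

# /etc/sysconfig/rawdevices (sketch)
# quorum partition on array1 (site A)
/dev/raw/raw1 /dev/sda1
# quorum partition on array2 (site B)
/dev/raw/raw2 /dev/sdc1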
I've resolved it this way:
sda -> Linux raid autodetect on array1
sdb -> Linux raid autodetect on array1
sdc -> ASM data on array1
sdd -> Linux raid autodetect on array2
sde -> Linux raid autodetect on array2
sdf -> ASM data on array2
md0 -> RAID1 (sda sdd) -> quorum partition (raw1)
md1 -> RAID1 (sdb sde) -> quorum partition (raw2)
It seems to work, but md is not cluster-aware; I am prone to inconsistency between
the two nodes' views of the mirrors.
Maybe using an md device as raw protects me from inconsistency?
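(For reference, a minimal sketch of that md workaround on one node, assuming mdadm is used and the mirror members are the first partitions on each disk; as noted in the reply below, this is not a supported configuration:)

# mark the member partitions as "Linux raid autodetect" (type fd) with fdisk, then
# build the two mirrors, one leg on each array:
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdd1
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdb1 /dev/sde1

# bind the md devices as the quorum raw devices in /etc/sysconfig/rawdevices:
#   /dev/raw/raw1 /dev/md0
#   /dev/raw/raw2 /dev/md1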
I don't think MD as raw will entirely fix this situation. However, if
you're not actually sharing data (and the only shared data is the
shared clumanager partitions themselves), the chances of an
inconsistency should be very low (or nonexistent). Clumanager only
shares small pieces of data, and those pieces are typically protected
by a lock.
(That said, running clumanager on top of MD or software LVM partitions
is not supported. What is needed for this to work properly is a
cluster-aware volume manager.)
Clustered LVM is available for RHEL4, and RHEL4 doesn't use shared partitions
for state information.
Since they're not required any longer, I'm closing this against RHCS3.