Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 236580

Summary:

[HA LVM]: Bringing site back on-line after failure causes problems

Product:

[Retired] Red Hat Cluster Suite

Reporter:

Jonathan Earl Brassow <jbrassow>

Component:

rgmanager

Assignee:

Jonathan Earl Brassow <jbrassow>

Status:

CLOSED CURRENTRELEASE

QA Contact:

Cluster QE <mspqa-list>

Severity:

medium

Priority:

medium

CC:

cfeist, cluster-maint, lhh

Hardware:

All

OS:

Linux

Doc Type:

Bug Fix

Last Closed:

2009-02-05 00:19:57 UTC

Attachments:

Description	Flags
lvm.sh script with bad device exclusion	none

Description Jonathan Earl Brassow 2007-04-16 15:31:48 UTC

Our HA cluster is configured as follows:
Two-node cluster - one node in the 'B' datacentre and one node in the 'C' datacentre.
Two disk arrays - one in each datacentre.
Two services - each using LVM volumes that are mirrored across the two disk arrays.
IPMI is the only automatic fencing mechanism.

To simulate a failure of the 'C' datacentre, we simultaneously shut the power off to the 'C' node while
disabling the SAN ports to the 'C' disk array. To prevent the 'B' node from fencing the 'C' node, the
network interface on the 'B' node that connects to the 'C' node IPMI device was disabled.

After the 'C' node had missed too many heartbeats, the 'B' node attempted to fence the 'C' node using
fence_ipmilan. This failed because the 'B' node couldn't connect to the 'C' node IPMI device.

I then initiated a manual fence with the fence_ack_manual command. The 'B' node successfully took
over the services from the 'C' node. It handled the volume group inconsistencies, and successfully
activated the previously mirrored volumes as linear volumes.

Up to this point I'm very happy with how it's operating!

The problems begin if I then power on the 'C' node again. At the point when the 'C' node is powered on,
the 'B' node is running all the services and the SAN ports to the 'C' disk array are still unavailable.

When rgmanager starts on the 'C' node, it attempts to stop all the resources that are running on the 'B'
node. It then appears to attempt to start the services locally, even though they are running on the 'B'
node. When I run clustat on the 'C' node, it now reports that all the services are failed and that the last
node they ran on was the 'C' node.

I wanted to see if the logical volumes were still active on the 'B' node; however, when I entered the
'lvs -a -o +devices,tags' command on the 'B' node, it hung and never returned. No LVM commands would
return on the 'B' node. The only way I could recover the 'B' node was to power it off and on again. I
couldn't reboot the node because the Cluster Suite services were hung.

When I enter the 'lvs -a -o +devices,tags' command on the 'C' node, the Cluster Suite-managed
volumes are NOT active, but they are tagged with BOTH nodenames!
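
The double-tagging described above can be spotted mechanically. The following is only a rough sketch, not part of rgmanager or lvm.sh: the function name find_multi_tagged_lvs is made up for illustration, and it assumes the only tags on these volumes are the HA LVM ownership tags (the node names), so any volume with more than one tag is suspect.

```shell
# Hypothetical helper (not part of rgmanager): print "vg/lv: tags" for every
# logical volume that carries more than one tag.
# Expects lines of the form "lv_name|vg_name|tag1,tag2" on stdin.
find_multi_tagged_lvs() {
    # lvs indents its output, so strip blanks first; a comma in the third
    # field means at least two tags are present on that volume.
    tr -d ' \t' | awk -F'|' '$3 ~ /,/ { print $2 "/" $1 ": " $3 }'
}

# Possible use on a live node (needs root and the LVM tools):
#   lvs --noheadings --separator '|' -o lv_name,vg_name,lv_tags | find_multi_tagged_lvs
```

An empty result would mean no volume is claimed by more than one tag; it says nothing about whether the volumes are active.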

Comment 1 Jonathan Earl Brassow 2007-04-16 15:34:16 UTC

I'm seeing something very different from you, but it may be worth trying with the changes I've made.

Here's what I see:
Site fail-over works fine.  If I reactivate the failed site (including the storage device), when the service
tries to move back, it fails to activate due to a conflict it sees in the available devices.  [The failed device
has now come back - leaving an LVM metadata conflict.]  This leaves the service in the 'failed' state.

Here's what I've done.
I've added some code to determine what the valid devices are, and use those and only those devices 
when activating.  This solved the problem for me.  You will need to ensure that this works fine with 
your multipath setup.  I don't think there should be issues in that regard, but I don't want to guess.
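
To illustrate the "use those and only those devices" idea, here is a minimal sketch. It is not the code in the attached lvm.sh; the function name build_device_filter and the example device and volume group names are assumptions. It relies only on the standard LVM devices/filter setting and the --config command-line option, and it assumes the list of valid devices has already been worked out by some other means.

```shell
# Hypothetical helper (not the attached lvm.sh): build an LVM devices/filter
# setting that accepts exactly the devices given as arguments and rejects
# every other device.
build_device_filter() (
    filter=""
    for dev in "$@"; do
        # Accept this device only; the pattern is anchored so that
        # /dev/sda1 does not also match /dev/sda10.
        filter="${filter}\"a|^${dev}\$|\", "
    done
    # The final "r|.*|" entry rejects everything not accepted above.
    printf 'devices { filter = [ %s"r|.*|" ] }\n' "$filter"
)

# Possible use (device and VG names are examples only):
#   vgchange -a y --config "$(build_device_filter /dev/sda1)" myvg
```

Passing the filter with --config keeps the restriction local to the one command, so the lvm.conf on disk is left untouched.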

This may not be the issue you are seeing, but the bug I found could certainly cause similar problems.  
Be sure that you have the latest updates.  I've attached the lvm.sh file to be placed in /usr/share/cluster 
on all the machines.  When we've gone through a few successful iterations of testing we will be sure to 
commit the changes.

Comment 2 Jonathan Earl Brassow 2007-04-16 15:35:49 UTC

Created attachment 152701 [details]
lvm.sh script with bad device exclusion

Comment 3 Jonathan Earl Brassow 2007-04-18 18:17:17 UTC

Bad device exclusion script checked in with minor changes.

assigned -> post

Another concern I have with the user's implementation is the initrd.  The initrd
should (must) contain the correctly modified lvm.conf.
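
One way to sanity-check an lvm.conf is sketched below. It assumes that "correctly modified" means the usual HA LVM arrangement, where volume_list in lvm.conf contains a tag of the form "@<nodename>"; this bug does not spell that out. The function name is hypothetical, and the check only handles a volume_list written on a single line.

```shell
# Hypothetical check (not part of rgmanager): succeeds if the given lvm.conf
# has an active (uncommented) volume_list line that contains "@<node>".
# Assumes the whole volume_list is written on one line.
has_node_tag_in_volume_list() (
    conf=$1
    node=$2
    grep -E '^[[:space:]]*volume_list[[:space:]]*=' "$conf" \
        | grep -q "\"@${node}\""
)

# Possible use (the node name is an example only):
#   has_node_tag_in_volume_list /etc/lvm/lvm.conf node-c && echo "tag present"
```

The same check could be run against the copy of lvm.conf unpacked from the initrd, since the two files need to agree.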

Comment 4 Chris Feist 2009-02-05 00:19:57 UTC

This has been built and is in the current RHEL4 release of rgmanager.