Bug 689466 - Issue with RGManager and CLVMD
Summary: Issue with RGManager and CLVMD
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: lvm2-cluster
Version: 5.6
Hardware: x86_64
OS: Linux
Priority: medium
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: LVM and device-mapper development team
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-03-21 15:49 UTC by rauch
Modified: 2018-11-28 21:27 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-02-10 18:21:13 UTC
Target Upstream Version:


Links
System                           ID     Private  Priority  Status  Summary  Last Updated
Red Hat Knowledge Base (Legacy)  18610  0        None      None    None     Never

Description rauch 2011-03-21 15:49:07 UTC
Description of problem:

We have the following issue on a two-node cluster (node1 + node2) with qdisk.

On node1, clustat no longer shows the cluster services, and it does not list the RGManager status for either cluster member.

On cluster member node2, clustat shows the cluster services and reports RGManager as running on both nodes.

RGManager status logging stopped on the cluster members 2 days ago, without any errors.

The clustat output from both cluster nodes is shown below:


node1 ~ # clustat
Service states unavailable: Temporary failure; try again
Cluster Status for CL_XY @ Fri Mar 18 08:51:21 2011
Member Status: Quorate

Member Name   ID   Status
------ ----   ---- ------
node1         1    Online, Local
node2         2    Online
/dev/dm-11    0    Online, Quorum Disk


node2 ~ # clustat
Cluster Status for CL_XY @ Fri Mar 18 08:53:33 2011
Member Status: Quorate

Member Name   ID   Status
------ ----   ---- ------
node1         1    Online, RG-Worker
node2         2    Online, Local, RG-Master
/dev/dm-11    0    Online, Quorum Disk

Service Name   Owner (Last)   State
------- ----   ----- ------   -----
service:1      node1          started
service:2      node1          started
service:3      node2          started
service:4      node2          started

Furthermore, LVM commands do not complete. They just hang, although it was possible to kill the processes by pressing Ctrl+C.

The exact same issue happened a month ago on this system with RHEL5.5. In the meantime the cluster was updated to RHEL5.6. Debugging for RGManager and cman is now enabled in the /etc/cluster/cluster.conf file.


Version-Release number of selected component (if applicable):

kernel-2.6.18-238.5.1.el5.x86_64
cman-2.0.115-68.el5_6.1.x86_64
rgmanager-2.0.52-9.el5.x86_64

How reproducible:
not reproducible

Steps to Reproduce:
1.
2.
3.
  
Actual results:
RGManager stops writing logs.
clustat does not show service states on one node.
LVM commands hang.

Expected results:


Additional info:
The cluster services include mounted disks.
This setup uses an LVM HA configuration (RHEL5.6).

Comment 1 ot 2011-08-26 15:44:29 UTC
I am facing the same issue. Any idea when a fix will be available? Are there any workarounds?

(In reply to comment #0)

Comment 2 Lon Hohberger 2012-02-10 17:54:11 UTC
Looks like this slipped through the cracks.

clustat is just a victim here; if the cluster is locked up (e.g. fencing causing a problem or clvmd causing a problem), the errors will bubble up to the top.

That lvm commands were hanging indicates that this is not an rgmanager issue.
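
The symptom pattern described here (membership still quorate, but service states unavailable) can be checked mechanically. The following is a minimal sketch, not part of rgmanager or any Red Hat tool. It assumes clustat output shaped like the samples in comment #0, including the RG-Worker / RG-Master flags shown there; the function names and the sample constants are hypothetical.

```python
import re

# Sample text modeled on the clustat output quoted in comment #0.
NODE1_OUTPUT = """Service states unavailable: Temporary failure; try again
Cluster Status for CL_XY @ Fri Mar 18 08:51:21 2011
Member Status: Quorate

Member Name   ID   Status
------ ----   ---- ------
node1         1    Online, Local
node2         2    Online
/dev/dm-11    0    Online, Quorum Disk
"""

NODE2_OUTPUT = """Cluster Status for CL_XY @ Fri Mar 18 08:53:33 2011
Member Status: Quorate

Member Name   ID   Status
------ ----   ---- ------
node1         1    Online, RG-Worker
node2         2    Online, Local, RG-Master
/dev/dm-11    0    Online, Quorum Disk
"""


def summarize_clustat(output):
    """Summarize the text printed by clustat on one node (hypothetical helper)."""
    summary = {
        "services_unavailable": "Service states unavailable" in output,
        "quorate": bool(re.search(r"^Member Status:\s*Quorate\s*$", output, re.M)),
        "rgmanager_members": [],
    }
    for line in output.splitlines():
        # Member rows have the shape: <name> <numeric id> <status flags>
        match = re.match(r"^(\S+)\s+(\d+)\s+(Online.*)$", line.strip())
        # Assumption: an "RG-" flag means rgmanager state is visible for that member.
        if match and "RG-" in match.group(3):
            summary["rgmanager_members"].append(match.group(1))
    return summary


def looks_like_victim(summary):
    """True when membership is quorate but service states cannot be read."""
    return summary["quorate"] and summary["services_unavailable"]
```

A result of True from looks_like_victim only says that clustat could not read service states while the cluster was quorate. It does not identify whether fencing, clvmd, or something else is the underlying cause.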

Comment 4 Alasdair Kergon 2012-02-10 18:21:13 UTC
If anyone sees this again with the current versions of the packages, then please reopen this and attach relevant LVM diagnostics to this bug.
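
Because the reported symptom is commands that hang, diagnostics are easier to gather if every command runs under a timeout. The following is a minimal sketch under stated assumptions: it needs Python 3.5 or later (not the stock interpreter on RHEL 5), the command list is only a suggestion and not an official checklist, and the helper names are hypothetical. It also assumes the hung process can be killed, as the reporter observed with Ctrl+C; a process stuck in uninterruptible sleep would still block.

```python
import subprocess


def run_with_timeout(argv, timeout_s=30):
    """Run one command and return (status, output) instead of blocking forever."""
    try:
        done = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            timeout=timeout_s,
            universal_newlines=True,
        )
    except subprocess.TimeoutExpired:
        # The child is killed by subprocess.run before this exception is raised.
        return ("timeout", "")
    except OSError as exc:
        # For example, the command is not installed on this node.
        return ("error", str(exc))
    return ("exit %d" % done.returncode, done.stdout)


# Assumed selection of commands; adjust to what is installed on the node.
DIAGNOSTIC_COMMANDS = [
    ["clustat"],
    ["cman_tool", "status"],
    ["cman_tool", "nodes"],
    ["group_tool", "ls"],
    ["lvmdump", "-c"],  # -c asks lvmdump to gather cluster data if clvmd is running
]


def collect(commands=DIAGNOSTIC_COMMANDS, timeout_s=30):
    """Run each command in turn and map its command line to (status, output)."""
    return {" ".join(argv): run_with_timeout(argv, timeout_s) for argv in commands}
```

Calling collect() on each node and recording which entries come back as "timeout" shows which layer is stuck, which is useful context to include alongside the LVM diagnostics.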

