191961 – clustat segfault when node is fenced

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Cluster Suite
Classification:	Retired
Component:	rgmanager
Sub Component:
Version:	4
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Lon Hohberger
QA Contact:	Cluster QE
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2006-05-16 16:09 UTC by Lenny Maiorani
Modified:	2009-04-16 20:20 UTC
CC List:	2 users
Fixed In Version:	RHBA-2007:149
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2007-06-21 16:13:00 UTC
Embargoed:

Description Lenny Maiorani 2006-05-16 16:09:15 UTC

Description of problem:
At nearly the same time as node3 was fenced, clustat was running on node1 and
segfaulted.


Version-Release number of selected component (if applicable):
1.9.46-1.3speed


Actual results:
May 15 19:18:45 sqaone01 kernel: CMAN: removing node sqaone03 from the cluster : Missed too many heartbeats
May 15 19:18:45 sqaone01 kernel: clustat[22394]: segfault at 000000000000002a rip 0000003765bb1463 rsp 0000007fbffffa90 error 4
May 15 19:18:46 sqaone01 fenced: sqaone03 not a cluster member after 0 sec post_fail_delay
May 15 19:18:46 sqaone01 fenced: fencing node "sqaone03"
May 15 19:18:46 sqaone01 fenced: fence "sqaone03" success
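
The kernel segfault line in the log above can be decoded to narrow down what kind of access failed. The following is a minimal sketch, not part of rgmanager: `decode_segfault` is a hypothetical helper, and it assumes the standard x86 page-fault error-code bits (bit 0 = protection violation, bit 1 = write, bit 2 = user mode) and that the kernel prints the `error` field in hexadecimal.

```python
import re

# x86 page-fault error-code bits (assumed layout, see lead-in).
PF_PROT = 0x1   # 0: page not present, 1: protection violation
PF_WRITE = 0x2  # 0: read access,      1: write access
PF_USER = 0x4   # 0: kernel mode,      1: user mode

# Matches the "prog[pid]: segfault at ADDR rip RIP rsp RSP error N" format
# seen in the log from this report.
SEGFAULT_RE = re.compile(
    r"(?P<prog>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"rip (?P<rip>[0-9a-f]+) rsp (?P<rsp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)


def decode_segfault(line):
    """Return a dict describing the fault, or None if the line does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"), 16)  # assumption: error code is printed in hex
    return {
        "program": m.group("prog"),
        "pid": int(m.group("pid")),
        "fault_address": int(m.group("addr"), 16),
        "access": "write" if err & PF_WRITE else "read",
        "mode": "user" if err & PF_USER else "kernel",
        "cause": "protection violation" if err & PF_PROT else "page not present",
    }


line = (
    "May 15 19:18:45 sqaone01 kernel: clustat[22394]: segfault at "
    "000000000000002a rip 0000003765bb1463 rsp 0000007fbffffa90 error 4"
)
info = decode_segfault(line)
print(info)
```

Under these assumptions, the line describes a user-mode read of an unmapped page at address 0x2a. An address that close to zero would be consistent with reading a field through a null or nearly null pointer, but the log alone does not confirm that.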


Expected results:
no segfault

Comment 1 Lon Hohberger 2006-05-16 19:55:34 UTC

Did ccsd die as well?

Comment 2 Lenny Maiorani 2006-05-17 15:06:44 UTC

Nothing else died. The node was still running OK after the segfault.

Comment 3 Lon Hohberger 2006-05-17 15:37:06 UTC

Thank you -- I will keep looking; so far, I have not been able to reproduce it,
so I think it is a timing issue of some sort (e.g., getting a member list while
cman is handling the transition).

Comment 4 Lenny Maiorani 2006-06-07 22:24:52 UTC

I think the new rgmanager (now using 1.9.46-1.4.2x) has fixed this.
