253341 – failure to start cluster service which had been modified for correction

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 5
Classification:	Red Hat
Component:	conga
Sub Component:
Version:	5.0
Hardware:	All
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	---
Assignee:	Ryan McCabe
QA Contact:	Brian Brock
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2007-08-17 22:43 UTC by Corey Marthaler
Modified:	2009-04-16 22:24 UTC
CC List:	5 users
Fixed In Version:	RHBA-2007-0642
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2007-11-07 17:00:45 UTC
Target Upstream Version:
Embargoed:
Dependent Products:

Attachments
log from luci server (13.89 KB, text/plain) 2007-08-17 22:46 UTC, Corey Marthaler

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2007:0642	0	normal	SHIPPED_LIVE	clustermon bug fix and enhancement update	2007-10-30 15:42:04 UTC

Description Corey Marthaler 2007-08-17 22:43:21 UTC

Description of problem:
This may be an issue with rgmanager.

I defined an NFS cluster service consisting of two GFS filesystems. I created
both GFS resources with invalid devices and then attempted to start the
service. The service failed to start, as expected, with an error about not
being able to match a device to what's in the fs resource definition.

I then corrected the mistake using conga, updated the cluster, and verified that
the .conf file was correct on all nodes. From here, I was unable to successfully
restart the service. Every time I tried, it remained in the failed state. I then
attempted to restart rgmanager by hand and that worked. I then reloaded luci and
all was happy.

So why wasn't conga able to start that service without manual intervention? If I
had used conga's big hammer and restarted the entire cluster, it probably would
have worked, but an entire cluster restart shouldn't be necessary.


Version-Release number of selected component (if applicable):
ricci-0.10.0-4.el5
luci-0.10.0-4.el5

How reproducible:
every time

Comment 1 Corey Marthaler 2007-08-17 22:46:11 UTC

Created attachment 161775
log from luci server

Here is what luci was trying while I was playing around with this.

Comment 2 Ryan McCabe 2007-08-17 23:44:35 UTC

To clarify, when you say you restarted rgmanager, do you mean you ran 'service
rgmanager restart', or did you start the service through rgmanager (i.e.,
clusvcadm -e svcname)?

Comment 3 Corey Marthaler 2007-08-20 13:46:32 UTC

I restarted it with 'service rgmanager restart'.

Comment 4 Ryan McCabe 2007-08-20 15:13:36 UTC

This is a conga bug. Lon says failed services need to be explicitly disabled
before being started again. Patch to be checked in shortly.
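
For illustration only, the disable-then-enable sequence described above might
look like this when done by hand. This is a rough sketch, not the conga patch:
restart_failed_service is a hypothetical helper name, svcname is a placeholder
for the real service name, and it assumes clusvcadm is run on a cluster node
with rgmanager running.

```shell
#!/bin/sh
# Sketch: recover an rgmanager service that is stuck in the failed state.
# restart_failed_service is a hypothetical helper, not part of conga or rgmanager.
restart_failed_service() {
    svc="$1"
    # A failed service has to be disabled explicitly before it can start again.
    clusvcadm -d "$svc" || return 1
    # With the service now disabled, enable (start) it.
    clusvcadm -e "$svc"
}
```

It would be called as 'restart_failed_service svcname'. If the disable step
fails, the function returns without attempting the enable.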

Comment 6 Corey Marthaler 2007-08-22 16:03:28 UTC

Fix verified in 0.10.0-5.el5.

Comment 8 errata-xmlrpc 2007-11-07 17:00:45 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0642.html
