Bug 201565

Summary: clusvcmgrd Service "Status" operation fails returning "ERROR: Memory fault"
Product: [Retired] Red Hat Cluster Suite
Reporter: Carlos Rodrigues <cjmrodrigues>
Component: clumanager
Assignee: Lon Hohberger <lhh>
Status: CLOSED NOTABUG
QA Contact: Cluster QE <mspqa-list>
Severity: medium
Priority: medium
Version: 3
CC: cluster-maint
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2006-08-08 16:23:17 UTC

Description Carlos Rodrigues 2006-08-07 14:20:12 UTC
From Bugzilla Helper:
User-Agent: Opera/9.01 (Windows NT 5.1; U; en)

Description of problem:
The clusvcmgrd service "Status" operation fails, returning "ERROR: Memory fault",
which causes the cluster service to be stopped:

Aug  7 10:42:38 rtc2 clusvcmgrd: [18687]: <err> service error: User script '/home/core/DSCP/cluster/bin/clu_Offline status' returned
Aug  7 10:42:39 rtc2 clusvcmgrd: [18687]: <err> service error: LTRH Running
Aug  7 10:42:39 rtc2 clusvcmgrd: [18687]: <err> service error: ERROR: Memory fault
Aug  7 10:42:39 rtc2 clusvcmgrd: [18687]: <err> service error: Check status failed on user script for Offline
Aug  7 10:42:39 rtc2 clusvcmgrd[18686]: <warning> Restarting locally failed service Offline

Version-Release number of selected component (if applicable):
clumanager-1.2.22-2

How reproducible:
Sometimes


Steps to Reproduce:
1. Configure a short monitor interval for this service.
2. Wait for a failure (happens around once a week).


Comment 1 Lon Hohberger 2006-08-08 16:23:17 UTC
The errors in the logs are stdout/stderr from the application script
'/home/core/DSCP/cluster/bin/clu_Offline', and are being reported to syslog by
the service handler.

That is, the error is coming from the application script, not from clusvcmgrd
itself.

If this is an intermittent problem (which you are confident does not indicate an
actual error with your application), then you could have the application script
retry when this particular error occurs.
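A minimal sketch of such a retry, written as a standalone Python helper rather
than as a change to the actual script (which has not been attached). The
function name, the retry count, the delay, and the choice to match on the text
"Memory fault" are all assumptions for illustration; only the error text itself
comes from the log above.

```python
import subprocess
import time

# Assumption: the intermittent failure can be recognized by this text in the
# command's output, as it appears in the syslog excerpt.
TRANSIENT_MARKER = "Memory fault"


def status_with_retry(cmd, retries=3, delay=1.0, marker=TRANSIENT_MARKER):
    """Run cmd, retrying only when it fails AND its output contains marker.

    Any other failure is returned immediately so that real problems are
    still reported to the service manager.
    Returns (exit_code, output) of the last attempt.
    """
    attempt = 0
    while True:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # capture stderr together with stdout
            universal_newlines=True,
        )
        attempt += 1
        transient = proc.returncode != 0 and marker in proc.stdout
        if not transient or attempt > retries:
            return proc.returncode, proc.stdout
        time.sleep(delay)  # brief pause before the next attempt
```

A wrapper would call this with the existing status command as an argument list
and exit with the returned code. The number of retries should stay small
enough that the whole check finishes well within the monitor interval.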

Comment 2 Lon Hohberger 2006-08-08 21:32:27 UTC
If you'd like to attach the script, I can see if there's an easy way to make it
retry in this case.