Red Hat Bugzilla – Bug 201565
clusvcmgrd Service "Status" operation fails returning "ERROR: Memory fault"
Last modified: 2009-04-16 16:20:49 EDT
From Bugzilla Helper:
User-Agent: Opera/9.01 (Windows NT 5.1; U; en)
Description of problem:
The clusvcmgrd service "Status" operation fails, returning "ERROR: Memory fault",
which causes the cluster service to be stopped:
Aug 7 10:42:38 rtc2 clusvcmgrd: : <err> service error: User script '/home/core/DSCP/cluster/bin/clu_Offline status' returned
Aug 7 10:42:39 rtc2 clusvcmgrd: : <err> service error: LTRH Running
Aug 7 10:42:39 rtc2 clusvcmgrd: : <err> service error: ERROR: Memory
Aug 7 10:42:39 rtc2 clusvcmgrd: : <err> service error: Check status failed on user script for Offline
Aug 7 10:42:39 rtc2 clusvcmgrd: <warning> Restarting locally failed
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Configure a short monitor interval for this service.
2. Wait for a failure (happens around once a week).
The errors in the logs are the stdout/stderr output of the application script
'/home/core/DSCP/cluster/bin/clu_Offline', which the service handler reports to
syslog.
That is, the error is coming from the application script, not from clusvcmgrd.
If this is an intermittent problem (which you are confident does not indicate an
actual error with your application), then you could have the application script
retry when this particular error occurs.
If you'd like to attach the script, I can see if there's an easy way to make it
retry in this case.
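As a rough illustration of the retry idea, here is a minimal sketch of a wrapper around the status check. It is not based on the actual clu_Offline script, which was not attached. It assumes that a failing run exits non-zero and prints the text "Memory fault", and the names REAL_SCRIPT, MAX_TRIES, RETRY_DELAY, RETRY_PATTERN and status_with_retry are hypothetical:

```shell
#!/bin/sh
# Hypothetical retry wrapper for the "status" action of a user script.
# All variable and function names here are illustrative assumptions.

REAL_SCRIPT="${REAL_SCRIPT:-/home/core/DSCP/cluster/bin/clu_Offline}"
MAX_TRIES="${MAX_TRIES:-3}"            # total attempts before giving up
RETRY_DELAY="${RETRY_DELAY:-2}"        # seconds to wait between attempts
RETRY_PATTERN="${RETRY_PATTERN:-Memory fault}"  # assumed transient error text

status_with_retry() {
    try=1
    while :; do
        # Capture stdout and stderr so the output can be inspected.
        out=$("$REAL_SCRIPT" status 2>&1)
        rc=$?

        # Pass the output through so it still reaches the service handler.
        if [ -n "$out" ]; then
            printf '%s\n' "$out"
        fi

        if [ "$rc" -eq 0 ]; then
            return 0
        fi

        # Retry only for the one error treated as transient;
        # any other failure is reported immediately.
        case "$out" in
            *"$RETRY_PATTERN"*) ;;
            *) return "$rc" ;;
        esac

        if [ "$try" -ge "$MAX_TRIES" ]; then
            return "$rc"
        fi
        try=$((try + 1))
        sleep "$RETRY_DELAY"
    done
}

# Dispatch only when called with an action, e.g. "wrapper status".
if [ "$#" -gt 0 ]; then
    case "$1" in
        status) status_with_retry; exit $? ;;
        *)      exec "$REAL_SCRIPT" "$@" ;;
    esac
fi
```

With something like this configured as the service's user script, a single "Memory fault" during a status check would be retried up to MAX_TRIES times. Any other failure is reported on the first attempt, so genuine application errors would still trigger the normal restart handling.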