Bug 670630 - NFS share is working fine, but keeps failing-over
Keywords:
Status: CLOSED DUPLICATE of bug 661881
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: resource-agents
Version: 6.0
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Marek Grac
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-01-18 20:59 UTC by joshua
Modified: 2011-03-18 15:11 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-03-18 15:10:32 UTC
Target Upstream Version:


Attachments
sanitized cluster.conf config file (2.21 KB, text/plain)
2011-01-18 21:05 UTC, joshua

Description joshua 2011-01-18 20:59:31 UTC
Description of problem:

The NFS share I've set up on RHEL6 Cluster Suite does in fact work. However, it keeps "failing":

Jan 18 15:49:50 cdtg-rtp-sun-1 rgmanager[6559]: Recovering failed service service:CDTG-NFS-share
Jan 18 15:49:51 cdtg-rtp-sun-1 rgmanager[23610]: Adding export: xxx.18.188.0/25:/data/cluster-storage/ (fsid=40050,rw)
Jan 18 15:49:51 cdtg-rtp-sun-1 rgmanager[23683]: Adding IPv4 address xxx.18.188.202/25 to bond0
Jan 18 15:49:54 cdtg-rtp-sun-1 ntpd[6419]: Listening on interface #1969 bond0, xxx.18.188.202#123 Enabled
Jan 18 15:49:54 cdtg-rtp-sun-1 rgmanager[6559]: Service service:CDTG-NFS-share started

From another machine, I can see that the NFS service is up:
# showmount -e xxx.18.188.202
Export list for xxx.18.188.202:
/data/cluster-storage xxx.18.188.0/25

... then about a minute later...

Jan 18 15:51:00 cdtg-rtp-sun-1 rgmanager[24355]: nfsclient:CDTG-NFS-Service is missing!
Jan 18 15:51:00 cdtg-rtp-sun-1 rgmanager[6559]: status on nfsclient "CDTG-NFS-Service" returned 1 (generic error)
Jan 18 15:51:00 cdtg-rtp-sun-1 rgmanager[24412]: Removing export: xxx.18.188.0/25:/data/cluster-storage/
Jan 18 15:51:00 cdtg-rtp-sun-1 rgmanager[24447]: Adding export: xxx.18.188.0/25:/data/cluster-storage/ (fsid=40050,rw)
Jan 18 15:51:10 cdtg-rtp-sun-1 rgmanager[6559]: Stopping service service:CDTG-NFS-share
Jan 18 15:51:10 cdtg-rtp-sun-1 rgmanager[24658]: Removing IPv4 address xxx.18.188.202/25 from bond0
Jan 18 15:51:12 cdtg-rtp-sun-1 ntpd[6419]: Deleting interface #1969 bond0, xxx.18.188.202#123, interface stats: received=0, sent=0, dropped=0, active_time=78 secs
Jan 18 15:51:20 cdtg-rtp-sun-1 rgmanager[24720]: Removing export: xxx.18.188.0/25:/data/cluster-storage/
Jan 18 15:51:20 cdtg-rtp-sun-1 rgmanager[6559]: Service service:CDTG-NFS-share is recovering
Jan 18 15:51:24 cdtg-rtp-sun-1 rgmanager[6559]: Service service:CDTG-NFS-share is now running on member 2

... why is this?  The export works, but rgmanager keeps failing it over on both nodes! :-(


Version-Release number of selected component (if applicable):

rgmanager-3.0.12-10.el6.x86_64

Comment 2 joshua 2011-01-18 21:05:38 UTC
Created attachment 474147
sanitized cluster.conf config file

Comment 3 joshua 2011-01-18 22:45:41 UTC
Not sure if this has any bearing on the issue, but I found this similar complaint:

http://www.redhat.com/archives/linux-cluster/2010-March/msg00019.html

Comment 4 joshua 2011-01-18 22:51:05 UTC
The xxx.18.188.202 IP address didn't have a hostname before, and as such "clufindhostname -i xxx.18.188.202" would fail with exit code 2.  I've added a reverse lookup entry for it, and clufindhostname is much happier now and exits with code 0.  However, this doesn't stop rgmanager from failing over an otherwise working service.
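
For anyone wanting to confirm the reverse lookup independently of clufindhostname, a minimal Python sketch follows. It only asks the system resolver for a PTR-style answer; it is not what clufindhostname does internally, and the function name reverse_lookup and the injectable resolver parameter are made up for this illustration.

```python
import socket


def reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return the primary hostname for ip, or None if the lookup fails.

    The resolver argument exists only so the function can be exercised
    without network access; by default the system resolver is used.
    """
    try:
        hostname, _aliases, _addresses = resolver(ip)
    except (socket.herror, socket.gaierror):
        # No reverse entry (or an unusable address): report it as missing.
        return None
    return hostname
```

Calling reverse_lookup() with the service IP should return a hostname once the reverse entry exists, and None before that. As noted above, a successful lookup did not by itself stop the failovers.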

Comment 5 Lon Hohberger 2011-03-18 15:10:32 UTC
Ah ha -

*** This bug has been marked as a duplicate of bug 661881 ***

Comment 6 Lon Hohberger 2011-03-18 15:11:34 UTC
clufindhostname is not working right; also, remove the trailing slash from your mount point name in your fs resource.
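
The trailing-slash check can be sketched in Python as below. This assumes the mount point is stored in a "mountpoint" attribute on "fs" elements in cluster.conf. The sample fragment is hypothetical (the attached cluster.conf is not reproduced in this report), and the device, fstype and resource name in it are placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment for illustration only; not the reporter's real config.
SAMPLE_CONF = """<cluster name="example" config_version="1">
  <rm>
    <resources>
      <fs name="cluster-storage" device="/dev/example" fstype="ext4"
          mountpoint="/data/cluster-storage/"/>
    </resources>
  </rm>
</cluster>"""


def find_trailing_slash_mountpoints(xml_text):
    """Return (name, mountpoint, suggested) for each fs element whose
    mountpoint attribute ends in a slash."""
    root = ET.fromstring(xml_text)
    hits = []
    for fs in root.iter("fs"):
        mountpoint = fs.get("mountpoint", "")
        # "/" on its own is a valid mount point, so only flag longer paths.
        if len(mountpoint) > 1 and mountpoint.endswith("/"):
            suggested = mountpoint.rstrip("/") or "/"
            hits.append((fs.get("name"), mountpoint, suggested))
    return hits
```

Passing the contents of /etc/cluster/cluster.conf to find_trailing_slash_mountpoints() in place of SAMPLE_CONF lists any fs resources to edit. This only flags the trailing slash; it does not address the clufindhostname problem tracked in bug 661881.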

