| Summary: | Ganesha service fails to restart after reboot with missing nfs folder under /var/lib | ||
|---|---|---|---|
| Product: | Red Hat Gluster Storage | Reporter: | Shashank Raj <sraj> |
| Component: | nfs-ganesha | Assignee: | Kaleb KEITHLEY <kkeithle> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | storage-qa-internal <storage-qa-internal> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | rhgs-3.1 | CC: | jthottan, kkeithle, ndevos, nlevinki, sashinde, skoduri, sraj |
| Target Milestone: | --- | Keywords: | ZStream |
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2016-06-20 12:36:34 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
|
Description
Shashank Raj
2016-05-01 18:29:26 UTC
sosreport of the node can be found under http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1332047
From ganesha.log:
02/05/2016 03:17:11 : epoch 5726795d : dhcp37-180.lab.eng.blr.redhat.com : ganesha.nfsd-13892[main] Bind_sockets_V6 :DISP :WARN :Cannot bind RQUOTA tcp6 socket, error 98 (Address already in use)
02/05/2016 03:17:11 : epoch 5726795d : dhcp37-180.lab.eng.blr.redhat.com : ganesha.nfsd-13892[main] Bind_sockets :DISP :FATAL :Error binding to V6 interface. Cannot continue.
02/05/2016 03:17:11 : epoch 5726795d : dhcp37-180.lab.eng.blr.redhat.com : ganesha.nfsd-13892[main] unregister_fsal :FSAL :CRIT :Unregister FSAL GLUSTER with non-zero refcount=1
02/05/2016 03:17:11 : epoch 5726795d : dhcp37-180.lab.eng.blr.redhat.com : ganesha.nfsd-13892[main] glusterfs_unload :FSAL :CRIT :FSAL Gluster unable to unload. Dying ...
Rquota port (875) was already in use by some other process.
[root@dhcp37-180 ~]# netstat -ntaunlp | grep 875
tcp 0 0 10.70.37.180:875 10.70.37.127:24007 TIME_WAIT -
The output does not list the PID of any process using this port. It looks like the port was being used by a process that is or was connected to the glusterd port (24007), but it is strange that its PID is not listed in the netstat output above.
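One way to confirm whether the port can be bound at all, independent of what netstat reports, is to attempt the bind directly. The sketch below is illustrative only: `port_is_bindable` is a hypothetical helper, not part of nfs-ganesha, and it binds a plain IPv4 TCP socket with default options, which may differ from how ganesha.nfsd sets up its own sockets.

```python
import errno
import socket


def port_is_bindable(port, host="0.0.0.0"):
    """Return True if a TCP socket can be bound to (host, port) right now."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        return True
    except OSError as exc:
        # On Linux, EADDRINUSE is errno 98, the error reported in ganesha.log.
        if exc.errno == errno.EADDRINUSE:
            return False
        # Anything else (for example, lack of permission) is a different problem.
        raise
    finally:
        sock.close()
```

Calling `port_is_bindable(875)` as root on the affected node would show whether the RQuota port is still unavailable. Binding ports below 1024 normally requires root, so an unprivileged call raises a permission error rather than returning a result.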
However, after a different port was configured for RQuota in '/etc/ganesha/ganesha.conf', the nfs-ganesha process started.
NFS_Core_Param {
#Use supplied name other than IP in NSM operations
NSM_Use_Caller_Name = true;
#Copy lock states into "/var/lib/nfs/ganesha" dir
Clustered = false;
#By default port number '2049' is used for NFS service.
#Configure ports for MNT, NLM, RQuota services.
#The ports chosen here are from '/etc/sysconfig/nfs'
MNT_Port = 20048;
NLM_Port = 32803;
Rquota_Port = 8750;
}
%include "/etc/ganesha/exports/export.tiervolume.conf"
[root@dhcp37-180 ~]#
[root@dhcp37-180 ~]# showmount -e localhost
Export list for localhost:
/tiervolume (everyone)
[root@dhcp37-180 ~]#
This seems like a known issue. Since the ports we use for NLM/RQuota are not registered, we can occasionally run into this problem if another process is already using them. We need to document that, in such cases, a different port should be configured, opened via firewalld, and only then should nfs-ganesha be started. Please open another bug to track this issue.
This bug can be used to track why the '/var/lib/nfs' link went missing. Thanks!
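The port workaround described above (pick a different RQuota port, open it in the firewall, then start the service) can be sketched as follows. This is a minimal illustration, not a supported tool: `set_rquota_port` is a hypothetical helper that rewrites only the `Rquota_Port` line of a ganesha.conf-style text, the sample text is a cut-down version of the block quoted earlier, and the commands in the trailing comments assume a stock firewalld setup and a service unit named nfs-ganesha.

```python
import re


def set_rquota_port(conf_text, port):
    """Return conf_text with the value of Rquota_Port replaced by `port`."""
    if not 1 <= port <= 65535:
        raise ValueError("port out of range: %r" % (port,))
    pattern = re.compile(r"^(\s*Rquota_Port\s*=\s*)\d+(\s*;)", re.MULTILINE)
    new_text, count = pattern.subn(
        lambda m: "%s%d%s" % (m.group(1), port, m.group(2)), conf_text
    )
    if count == 0:
        raise ValueError("no Rquota_Port setting found")
    return new_text


# Cut-down sample of the NFS_Core_Param block, with the port that was in use.
SAMPLE = """NFS_Core_Param {
MNT_Port = 20048;
NLM_Port = 32803;
Rquota_Port = 875;
}
"""

updated = set_rquota_port(SAMPLE, 8750)
print(updated)

# Assumed follow-up steps on the node (not executed here):
#   firewall-cmd --permanent --add-port=8750/tcp
#   firewall-cmd --reload
#   systemctl start nfs-ganesha
```

The helper raises an error instead of appending a new setting when no `Rquota_Port` line exists, so a mistyped or missing parameter is noticed rather than silently ignored. If RQuota is also served over UDP, the same port would need to be opened for udp as well.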
With respect to the missing /var/lib/nfs folder, I tried to re-create the issue but was not able to reproduce it.

One thing to note here is that while setting up ganesha, we move the existing /var/lib/nfs to the /var/lib/nfs.backup folder and create '/var/lib/nfs' as a link to a folder in our shared_storage. While tearing down the ganesha setup, we restore /var/lib/nfs/backup to '/var/lib/nfs'. Since the current ganesha ocf scripts check for the presence of '/var/lib/nfs' before taking any action, if by any chance that folder link is removed by another process, the scripts leave the folder as is during both setup and teardown.

I request Shashank to keep monitoring the state of '/var/lib/nfs' and provide definite steps for reproducing the issue.

(In reply to Soumya Koduri from comment #4)
> With respect to /var/lib/nfs folder missing, I tried to re-create the issue
> but not able to reproduce.
>
> One thing to note here is while setting up ganesha, we move existing
> /var/lib/nfs to /var/lib/nfs.backup folder and create a link to
> '/var/lib/nfs' to a folder in our shared_storage. While tearing down the
> ganesha setup, we restore back /var/lib/nfs/backup to '/var/lib/nfs'. Since

Sorry for the typo above. It's /var/lib/nfs.backup.

Not seen in 3.1.3 testing. Reopen if necessary for 3.2. |
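The setup and teardown handling of '/var/lib/nfs' described in comment #4 can be sketched roughly as below. This illustrates the described behaviour under stated assumptions and is not the actual ganesha ocf script: the function names and the `nfs_dir`, `backup_dir`, and `shared_dir` parameters are hypothetical, teardown here checks specifically for a symlink rather than mere presence, and the demo runs against a temporary directory instead of /var/lib.

```python
import os
import shutil
import tempfile


def setup_link(nfs_dir, backup_dir, shared_dir):
    """Move nfs_dir to backup_dir and replace it with a symlink to shared_dir."""
    if not os.path.lexists(nfs_dir):
        # Presence check: nothing to do if nfs_dir is already missing.
        return False
    shutil.move(nfs_dir, backup_dir)
    os.symlink(shared_dir, nfs_dir)
    return True


def teardown_link(nfs_dir, backup_dir):
    """Remove the symlink at nfs_dir and restore the backup in its place."""
    if not os.path.islink(nfs_dir):
        # Link missing (or not a link): leave everything as it is.
        return False
    os.unlink(nfs_dir)
    shutil.move(backup_dir, nfs_dir)
    return True


# Demo in a scratch directory standing in for /var/lib.
root = tempfile.mkdtemp()
nfs_dir = os.path.join(root, "nfs")
backup_dir = os.path.join(root, "nfs.backup")
shared_dir = os.path.join(root, "shared_storage", "nfs")
os.makedirs(nfs_dir)
os.makedirs(shared_dir)

did_setup = setup_link(nfs_dir, backup_dir, shared_dir)
is_link_after_setup = os.path.islink(nfs_dir)

# Simulate some other process removing the link before teardown runs.
os.unlink(nfs_dir)
did_teardown = teardown_link(nfs_dir, backup_dir)
backup_left_behind = os.path.isdir(backup_dir)

shutil.rmtree(root)
print(did_setup, is_link_after_setup, did_teardown, backup_left_behind)
```

In the simulated case the teardown step skips its work, so the original contents stay in the backup directory and the nfs path stays missing. That matches the state reported in this bug, although the thread does not establish that this is what actually happened on the affected node.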