Description of problem:
The ganesha folder is missing from shared storage after a few nodes are rebooted.

Version-Release number of selected component (if applicable):
glusterfs-3.7.9-1

How reproducible:
Once

Steps to Reproduce:
1. Configure ganesha on a 4-node gluster cluster.
2. Create a volume and enable ganesha on the volume.
3. Take down 2 of the nodes and bring them back after some time.
4. After the nodes come back, make sure the shared volume is mounted on all the nodes.
5. Start the pcs, pacemaker and nfs-ganesha services on the nodes that came back up.
6. Observe that the nfs-ganesha folder is missing from the shared volume, which causes the statd service to fail on the rebooted node with the messages below in the logs:

Apr 5 07:13:30 dhcp37-127 systemd: Starting NFS status monitor for NFSv2/3 locking....
Apr 5 07:13:30 dhcp37-127 rpc.statd[31060]: Version 1.3.0 starting
Apr 5 07:13:30 dhcp37-127 rpc.statd[31060]: Flags: TI-RPC
Apr 5 07:13:30 dhcp37-127 rpc.statd[31060]: Failed to open directory sm: No such file or directory
Apr 5 07:13:30 dhcp37-127 rpc.statd[31060]: Initializing NSM state
Apr 5 07:13:30 dhcp37-127 rpc.statd[31060]: Failed to create /var/lib/nfs/statd/state.new: No such file or directory
Apr 5 07:13:30 dhcp37-127 systemd: nfs-ganesha-lock.service: control process exited, code=exited status=1
Apr 5 07:13:30 dhcp37-127 systemd: Failed to start NFS status monitor for NFSv2/3 locking..
Apr 5 07:13:30 dhcp37-127 systemd: Unit nfs-ganesha-lock.service entered failed state.
Apr 5 07:13:30 dhcp37-127 systemd: nfs-ganesha-lock.service failed.

nfs-ganesha lock service status from the 2 nodes:

[root@dhcp37-127 ~]# service nfs-ganesha-lock status
Redirecting to /bin/systemctl status nfs-ganesha-lock.service
● nfs-ganesha-lock.service - NFS status monitor for NFSv2/3 locking.
   Loaded: loaded (/usr/lib/systemd/system/nfs-ganesha-lock.service; static; vendor preset: disabled)
   Active: failed (Result: exit-code) since Tue 2016-04-05 07:13:30 IST; 5min ago
  Process: 31059 ExecStart=/usr/sbin/rpc.statd --no-notify $STATDARGS (code=exited, status=1/FAILURE)

Apr 05 07:13:30 dhcp37-127.lab.eng.blr.redhat.com systemd[1]: Starting NFS status monitor for NFSv2/3 locking....
Apr 05 07:13:30 dhcp37-127.lab.eng.blr.redhat.com rpc.statd[31060]: Version 1.3.0 starting
Apr 05 07:13:30 dhcp37-127.lab.eng.blr.redhat.com rpc.statd[31060]: Flags: TI-RPC
Apr 05 07:13:30 dhcp37-127.lab.eng.blr.redhat.com rpc.statd[31060]: Failed to open directory sm: No such file or directory
Apr 05 07:13:30 dhcp37-127.lab.eng.blr.redhat.com rpc.statd[31060]: Initializing NSM state
Apr 05 07:13:30 dhcp37-127.lab.eng.blr.redhat.com rpc.statd[31060]: Failed to create /var/lib/nfs/statd/state.new: No such file or directory
Apr 05 07:13:30 dhcp37-127.lab.eng.blr.redhat.com systemd[1]: nfs-ganesha-lock.service: control process exited, code=exited status=1
Apr 05 07:13:30 dhcp37-127.lab.eng.blr.redhat.com systemd[1]: Failed to start NFS status monitor for NFSv2/3 locking..
Apr 05 07:13:30 dhcp37-127.lab.eng.blr.redhat.com systemd[1]: Unit nfs-ganesha-lock.service entered failed state.
Apr 05 07:13:30 dhcp37-127.lab.eng.blr.redhat.com systemd[1]: nfs-ganesha-lock.service failed.

[root@dhcp37-174 ~]# service nfs-ganesha-lock status
Redirecting to /bin/systemctl status nfs-ganesha-lock.service
● nfs-ganesha-lock.service - NFS status monitor for NFSv2/3 locking.
   Loaded: loaded (/usr/lib/systemd/system/nfs-ganesha-lock.service; static; vendor preset: disabled)
   Active: failed (Result: exit-code) since Tue 2016-04-05 06:35:45 IST; 49min ago
  Process: 12973 ExecStart=/usr/sbin/rpc.statd --no-notify $STATDARGS (code=exited, status=1/FAILURE)

Apr 05 06:35:45 dhcp37-174.lab.eng.blr.redhat.com systemd[1]: Starting NFS status monitor for NFSv2/3 locking....
Apr 05 06:35:45 dhcp37-174.lab.eng.blr.redhat.com rpc.statd[12974]: Version 1.3.0 starting
Apr 05 06:35:45 dhcp37-174.lab.eng.blr.redhat.com rpc.statd[12974]: Flags: TI-RPC
Apr 05 06:35:45 dhcp37-174.lab.eng.blr.redhat.com rpc.statd[12974]: Failed to open directory sm: No such file or directory
Apr 05 06:35:45 dhcp37-174.lab.eng.blr.redhat.com rpc.statd[12974]: Initializing NSM state
Apr 05 06:35:45 dhcp37-174.lab.eng.blr.redhat.com rpc.statd[12974]: Failed to create /var/lib/nfs/statd/state.new: No such file or directory
Apr 05 06:35:45 dhcp37-174.lab.eng.blr.redhat.com systemd[1]: nfs-ganesha-lock.service: control process exited, code=exited status=1
Apr 05 06:35:45 dhcp37-174.lab.eng.blr.redhat.com systemd[1]: Failed to start NFS status monitor for NFSv2/3 locking..
Apr 05 06:35:45 dhcp37-174.lab.eng.blr.redhat.com systemd[1]: Unit nfs-ganesha-lock.service entered failed state.
Apr 05 06:35:45 dhcp37-174.lab.eng.blr.redhat.com systemd[1]: nfs-ganesha-lock.service failed.

[root@dhcp37-127 ~]# cd /var/run/gluster/shared_storage/
[root@dhcp37-127 shared_storage]# ls
[root@dhcp37-127 shared_storage]# pwd
/var/run/gluster/shared_storage

Actual results:
The ganesha folder is missing from shared storage after a few nodes are rebooted.

Expected results:
The nfs-ganesha folder should not get deleted from the shared volume.

Additional info:
sosreports are placed under http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1324064
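For anyone retrying the reproduction, step 4 and the final observation from the report can be scripted per node. This is only a minimal sketch: it assumes a Linux /proc/mounts-style mount table and the shared storage path shown in the report, and the function names are hypothetical helpers, not part of gluster or nfs-ganesha.

```python
import os

# Path taken from the report; adjust if the shared storage is mounted elsewhere.
SHARED_STORAGE = "/var/run/gluster/shared_storage"


def is_mounted(mount_point, mounts_text):
    """Return True if mount_point is a mount target in /proc/mounts-style text.

    Both the path as given and its symlink-resolved form are accepted,
    because /var/run is often a symlink to /run.
    """
    candidates = {os.path.normpath(mount_point), os.path.realpath(mount_point)}
    for line in mounts_text.splitlines():
        fields = line.split()
        # The second field of each mount table line is the mount target.
        if len(fields) >= 2 and os.path.normpath(fields[1]) in candidates:
            return True
    return False


def ganesha_dir_present(shared_root, name="nfs-ganesha"):
    """Return True if the nfs-ganesha folder exists under the shared storage root."""
    return os.path.isdir(os.path.join(shared_root, name))


def check_node(shared_root=SHARED_STORAGE, mounts_path="/proc/mounts"):
    """Return (mounted, ganesha_dir_present) for the local node."""
    with open(mounts_path) as f:
        mounted = is_mounted(shared_root, f.read())
    return mounted, ganesha_dir_present(shared_root)
```

Running check_node() on each node before and after the reboot would show whether the folder disappears while the volume is still mounted, or whether the mount itself is missing.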
I am unable to reproduce the issue on our cluster. Could you please reproduce the issue and provide us the setup? Also, before rebooting any node, please verify on all the nodes that '/var/lib/nfs' is a symbolic link to the right location under the gluster_shared_storage volume.
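The symlink check requested above could be done with something like the following. It is a rough sketch: the exact target directory is not given in this thread, so the hypothetical helper only confirms that the link exists, is not dangling, and resolves to somewhere under the shared storage root.

```python
import os


def symlink_points_under(link_path, expected_root):
    """Return True if link_path is a live symlink resolving under expected_root."""
    if not os.path.islink(link_path):
        return False  # a plain directory or a missing path
    if not os.path.exists(link_path):
        return False  # dangling symlink: the target has gone missing
    target = os.path.realpath(link_path)
    root = os.path.realpath(expected_root)
    # commonpath equals root only when target is root itself or lies below it.
    return os.path.commonpath([target, root]) == root
```

For example, symlink_points_under("/var/lib/nfs", "/var/run/gluster/shared_storage") run on each node before the reboot would flag a node where the link is absent or points elsewhere.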
I haven't seen this issue with the latest ganesha builds. I will keep an eye on it and update the bug accordingly.
Based on the comments above, closing this bug. Please re-open if the issue still exists.