Bug 1400599

Summary: [GANESHA] failed to create directory of hostname of new node in /var/lib/nfs/ganesha/ on already existing cluster nodes
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Manisha Saini <msaini>
Component: common-ha    Assignee: Kaleb KEITHLEY <kkeithle>
Status: CLOSED ERRATA QA Contact: Manisha Saini <msaini>
Severity: unspecified Docs Contact:
Priority: medium    
Version: rhgs-3.2    CC: amukherj, rcyriac, rhinduja, rhs-bugs, skoduri, storage-qa-internal
Target Milestone: ---   
Target Release: RHGS 3.2.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: glusterfs-3.8.4-9 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1400613 Environment:
Last Closed: 2017-03-23 05:53:56 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1351528, 1400613, 1405576, 1405577    

Description Manisha Saini 2016-12-01 15:05:28 UTC
Description of problem:
When a new node is added to the ganesha cluster, a directory named after its hostname is created under /var/lib/nfs/ganesha/ (e.g. /var/lib/nfs/ganesha/dhcp46-219.lab.eng.blr.redhat.com) as a symlink into the shared storage volume:

dhcp46-219.lab.eng.blr.redhat.com -> /var/run/gluster/shared_storage/nfs-ganesha/dhcp46-219.lab.eng.blr.redhat.com/nfs/ganesha

After the new node is added, this directory (named after the new node's hostname) is missing from all the other existing nodes in the ganesha cluster, and those nodes log:

ganesha.nfsd-12930[dbus_heartbeat] nfs4_load_recov_clids_nolock :CLIENT ID :EVENT :Failed to open v4 recovery dir (/var/lib/nfs/ganesha/dhcp46-208.lab.eng.blr.redhat.com/v4recov), errno=2
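
For illustration only, a minimal manual workaround sketch (not part of the fix, which is in the patch referenced in the comments): the missing entry can be created by hand on each existing node, assuming the shared-storage layout shown above and using the new node's hostname from this setup.

# run on each already existing node; NEWNODE is the hostname of the added node
NEWNODE=dhcp46-208.lab.eng.blr.redhat.com
# the recovery directory lives on the gluster shared storage volume
mkdir -p /var/run/gluster/shared_storage/nfs-ganesha/${NEWNODE}/nfs/ganesha/v4recov
# link it into /var/lib/nfs/ganesha/ in the same format as the existing entries
ln -s /var/run/gluster/shared_storage/nfs-ganesha/${NEWNODE}/nfs/ganesha /var/lib/nfs/ganesha/${NEWNODE}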


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Create a 4-node cluster and configure ganesha on it.
2. Add a new node to the cluster after completing the prerequisites for adding a node (a sanity check is sketched after the command below):

[root@dhcp47-3 gdeploy]# /usr/libexec/ganesha/ganesha-ha.sh --add /var/run/gluster/shared_storage/nfs-ganesha/ dhcp46-208.lab.eng.blr.redhat.com 10.70.44.157
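
As a hedged sanity check (hostnames taken from this report), the expected entry can be verified on every pre-existing node after the --add; in the failing case both commands report "No such file or directory" (errno=2), matching the log messages below.

# on each already existing node, the new node's entry should exist and point into shared storage
ls -ld /var/lib/nfs/ganesha/dhcp46-208.lab.eng.blr.redhat.com
ls -ld /var/lib/nfs/ganesha/dhcp46-208.lab.eng.blr.redhat.com/v4recov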

The following messages were observed in ganesha.log on all the existing nodes:

01/12/2016 20:13:49 : epoch 8cd10000 : dhcp46-241.lab.eng.blr.redhat.com : ganesha.nfsd-12930[dbus_heartbeat] nfs4_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 90
01/12/2016 20:13:49 : epoch 8cd10000 : dhcp46-241.lab.eng.blr.redhat.com : ganesha.nfsd-12930[dbus_heartbeat] nfs4_start_grace :STATE :EVENT :NFS Server recovery event 5 nodeid -1 ip 
01/12/2016 20:13:49 : epoch 8cd10000 : dhcp46-241.lab.eng.blr.redhat.com : ganesha.nfsd-12930[dbus_heartbeat] nfs4_load_recov_clids_nolock :CLIENT ID :EVENT :Recovery for nodeid -1 dir (/var/lib/nfs/ganesha//v4recov)
01/12/2016 20:13:57 : epoch 8cd10000 : dhcp46-241.lab.eng.blr.redhat.com : ganesha.nfsd-12930[reaper] nfs_in_grace :STATE :EVENT :NFS Server Now IN GRACE
01/12/2016 20:14:05 : epoch 8cd10000 : dhcp46-241.lab.eng.blr.redhat.com : ganesha.nfsd-12930[dbus_heartbeat] nfs4_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 90
01/12/2016 20:14:05 : epoch 8cd10000 : dhcp46-241.lab.eng.blr.redhat.com : ganesha.nfsd-12930[dbus_heartbeat] nfs4_start_grace :STATE :EVENT :NFS Server recovery event 5 nodeid -1 ip dhcp46-208.lab.eng.blr.redhat.com
01/12/2016 20:14:05 : epoch 8cd10000 : dhcp46-241.lab.eng.blr.redhat.com : ganesha.nfsd-12930[dbus_heartbeat] nfs4_load_recov_clids_nolock :CLIENT ID :EVENT :Recovery for nodeid -1 dir (/var/lib/nfs/ganesha/dhcp46-208.lab.eng.blr.redhat.com/v4recov)
01/12/2016 20:14:05 : epoch 8cd10000 : dhcp46-241.lab.eng.blr.redhat.com : ganesha.nfsd-12930[dbus_heartbeat] nfs4_load_recov_clids_nolock :CLIENT ID :EVENT :Failed to open v4 recovery dir (/var/lib/nfs/ganesha/dhcp46-208.lab.eng.blr.redhat.com/v4recov), errno=2
01/12/2016 20:14:05 : epoch 8cd10000 : dhcp46-241.lab.eng.blr.redhat.com : ganesha.nfsd-12930[dbus_heartbeat] nfs4_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 90
01/12/2016 20:14:05 : epoch 8cd10000 : dhcp46-241.lab.eng.blr.redhat.com : ganesha.nfsd-12930[dbus_heartbeat] nfs4_start_grace :STATE :EVENT :NFS Server recovery event 5 nodeid -1 ip dhcp46-208.lab.eng.blr.redhat.com
01/12/2016 20:14:05 : epoch 8cd10000 : dhcp46-241.lab.eng.blr.redhat.com : ganesha.nfsd-12930[dbus_heartbeat] nfs4_load_recov_clids_nolock :CLIENT ID :EVENT :Recovery for nodeid -1 dir (/var/lib/nfs/ganesha/dhcp46-208.lab.eng.blr.redhat.com/v4recov)
01/12/2016 20:14:05 : epoch 8cd10000 : dhcp46-241.lab.eng.blr.redhat.com : ganesha.nfsd-12930[dbus_heartbeat] nfs4_load_recov_clids_nolock :CLIENT ID :EVENT :Failed to open v4 recovery dir (/var/lib/nfs/ganesha/dhcp46-208.lab.eng.blr.redhat.com/v4recov), errno=2
01/12/2016 20:15:37 : epoch 8cd10000 : dhcp46-241.lab.eng.blr.redhat.com : ganesha.nfsd-12930[reaper] nfs_in_grace :STATE :EVENT :NFS Server Now NOT IN GRACE


Actual results:

No directory named after the new node's hostname is created in /var/lib/nfs/ganesha/ on the existing cluster nodes.
 


Expected results:
A directory named after the new node's hostname should be created in /var/lib/nfs/ganesha/ on all existing nodes.
No messages such as "Failed to open v4 recovery dir (/var/lib/nfs/ganesha/dhcp46-208.lab.eng.blr.redhat.com/v4recov), errno=2" should be observed in ganesha.log.
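
A simple way to check this during verification (assuming the default ganesha log location /var/log/ganesha.log on these builds; adjust the path if the log lives elsewhere):

# should print 0 on every existing node after the new node has been added
grep -c 'Failed to open v4 recovery dir' /var/log/ganesha.log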


Additional info:

Comment 4 Atin Mukherjee 2016-12-13 15:28:54 UTC
upstream mainline patch : http://review.gluster.org/#/c/16036/

Comment 7 Atin Mukherjee 2016-12-16 12:53:16 UTC
downstream patch : https://code.engineering.redhat.com/gerrit/#/c/93172/

Comment 9 Manisha Saini 2016-12-20 09:56:15 UTC
Verified this bug on 

# rpm -qa | grep ganesha
nfs-ganesha-2.4.1-3.el7rhgs.x86_64
nfs-ganesha-gluster-2.4.1-3.el7rhgs.x86_64
glusterfs-ganesha-3.8.4-9.el7rhgs.x86_64

The hostname directory of the new node is now created on the already existing nodes.

A cleanup is still required on the new node itself, which is being addressed in bug
https://bugzilla.redhat.com/show_bug.cgi?id=1406330

As the issue addressed in this bug is resolved, marking this bug as verified.

Comment 11 errata-xmlrpc 2017-03-23 05:53:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html