Created attachment 1147228 [details]
glusterd logs from one node

Description of problem:
ganesha-exported volumes do not get synced up on a shut-down node when it comes back up.

Version-Release number of selected component (if applicable):
[root@dhcp37-180 ganesha]# rpm -qa|grep glusterfs
glusterfs-cli-3.7.9-1.el7rhgs.x86_64
glusterfs-fuse-3.7.9-1.el7rhgs.x86_64
glusterfs-geo-replication-3.7.9-1.el7rhgs.x86_64
glusterfs-libs-3.7.9-1.el7rhgs.x86_64
glusterfs-3.7.9-1.el7rhgs.x86_64
glusterfs-client-xlators-3.7.9-1.el7rhgs.x86_64
glusterfs-api-3.7.9-1.el7rhgs.x86_64
glusterfs-rdma-3.7.9-1.el7rhgs.x86_64
glusterfs-server-3.7.9-1.el7rhgs.x86_64
glusterfs-debuginfo-3.7.9-1.el7rhgs.x86_64
glusterfs-ganesha-3.7.9-1.el7rhgs.x86_64

[root@dhcp37-180 ganesha]# rpm -qa|grep ganesha
nfs-ganesha-2.3.1-3.el7rhgs.x86_64
nfs-ganesha-gluster-2.3.1-3.el7rhgs.x86_64
glusterfs-ganesha-3.7.9-1.el7rhgs.x86_64

How reproducible:
Always

Steps to Reproduce:
1) Configure a 4-node cluster and configure ganesha on it.
2) Create 4 volumes and start them.
3) Enable ganesha on the volumes in the following order:
   - enable ganesha on v1 and shut down node4 (it has .export_added as 2)
   - enable ganesha on v2 and shut down node3 (it has .export_added as 3)
   - enable ganesha on v3 and shut down node2 (it has .export_added as 4)
   - enable ganesha on v4; node1 will have .export_added as 5
4) Bring back all the nodes.
5) Start the nfs-ganesha, pcsd and pacemaker services.
6) Run showmount on all the nodes:

Node1:
[root@dhcp37-180 ganesha]# showmount -e localhost
Export list for localhost:
/v1 (everyone)
/v2 (everyone)
/v3 (everyone)
/v4 (everyone)
[root@dhcp37-180 ganesha]#

Node2:
[root@dhcp37-158 ganesha]# showmount -e localhost
Export list for localhost:
/v1 (everyone)
/v2 (everyone)
/v3 (everyone)
[root@dhcp37-158 ganesha]#

Node3:
[root@dhcp37-127 ganesha]# showmount -e localhost
Export list for localhost:
/v1 (everyone)
/v2 (everyone)
[root@dhcp37-127 ganesha]#

Node4:
[root@dhcp37-174 ganesha]# showmount -e localhost
Export list for localhost:
/v1 (everyone)
[root@dhcp37-174 ganesha]#

7) Observe ganesha.conf on all the nodes:

Node1:
%include "/etc/ganesha/exports/export.v1.conf"
%include "/etc/ganesha/exports/export.v2.conf"
%include "/etc/ganesha/exports/export.v3.conf"
%include "/etc/ganesha/exports/export.v4.conf"
[root@dhcp37-180 ganesha]#

Node2:
%include "/etc/ganesha/exports/export.v1.conf"
%include "/etc/ganesha/exports/export.v2.conf"
%include "/etc/ganesha/exports/export.v3.conf"
[root@dhcp37-158 ganesha]#

Node3:
%include "/etc/ganesha/exports/export.v1.conf"
%include "/etc/ganesha/exports/export.v2.conf"
[root@dhcp37-127 ganesha]#

Node4:
%include "/etc/ganesha/exports/export.v1.conf"
[root@dhcp37-174 ganesha]#

8) Check the export files on all nodes:

Node1:
[root@dhcp37-180 exports]# ls
export.v1.conf  export.v2.conf  export.v3.conf  export.v4.conf

Node2:
[root@dhcp37-158 exports]# ls
export.v1.conf  export.v2.conf  export.v3.conf

Node3:
[root@dhcp37-127 exports]# ls
export.v1.conf  export.v2.conf

Node4:
[root@dhcp37-174 exports]# ls
export.v1.conf

Observe that the exported volumes do not get synced up on all the nodes.

9) Do a refresh-config on all volumes:

For v1:
[root@dhcp37-180 ganesha]# /usr/libexec/ganesha/ganesha-ha.sh --refresh-config /etc/ganesha/ v1
Refresh-config completed on dhcp37-127.
Refresh-config completed on dhcp37-158.
Refresh-config completed on dhcp37-174.
Success: refresh-config completed.

For v2:
[root@dhcp37-180 ganesha]# /usr/libexec/ganesha/ganesha-ha.sh --refresh-config /etc/ganesha/ v2
Refresh-config completed on dhcp37-127.
Refresh-config completed on dhcp37-158.
cat: /etc/ganesha/exports/export.v2.conf: No such file or directory
Error: refresh-config failed on dhcp37-174.

For v3:
[root@dhcp37-180 ganesha]# /usr/libexec/ganesha/ganesha-ha.sh --refresh-config /etc/ganesha/ v3
cat: /etc/ganesha/exports/export.v3.conf: No such file or directory
Error: refresh-config failed on dhcp37-127.

For v4:
[root@dhcp37-180 ganesha]# /usr/libexec/ganesha/ganesha-ha.sh --refresh-config /etc/ganesha/ v4
cat: /etc/ganesha/exports/export.v4.conf: No such file or directory
Error: refresh-config failed on dhcp37-127.

Actual results:
ganesha-exported volumes do not get synced up on a shut-down node when it comes back up.

Expected results:
When a node comes up, or after a glusterd restart, all the exported volumes should get synced up on all the nodes.

Additional info:
I suspect the behaviour below may have led to this issue -

For each volume to be exported, we use dbus-send.sh "dynamic_export_add()". In case of this dbus command failing, we remove the corresponding export.$vol.conf file and return an error. So when any of the offline nodes comes up, glusterd will try to sync the options, which in turn will result in calling "dynamic_export_add()". But since nfs-ganesha is not started by then, the export file may have got removed. And later when nfs-ganesha is not started, those volumes will not be exported as their export entries are not present in '/etc/ganesha/exports'.

So the fix can be: in case of "dynamic_export_add()" failure, return an error without removing the export file. That way, when nfs-ganesha comes up, it can export the corresponding volume. The side-effect would be that in case of refresh-config too, if the AddExport fails, the export shall still remain, which I think is fine.
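For context, a minimal sketch of what the dynamic_export_add() path in dbus-send.sh does, and of the proposed change (return an error without removing the export file). This is a simplified illustration and not the shipped script; the variable names are placeholders:

#!/bin/bash
# Simplified illustration of dynamic_export_add() (not the actual dbus-send.sh)
GANESHA_DIR="/etc/ganesha"
VOL="$1"
CONF="$GANESHA_DIR/exports/export.$VOL.conf"

# Ask the running nfs-ganesha daemon to pick up the new export over D-Bus
dbus-send --system --print-reply --dest=org.ganesha.nfsd \
    /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.AddExport \
    string:"$CONF" string:"EXPORT(Path=/$VOL)"

if [ $? -ne 0 ]; then
    # Proposed change: keep export.$VOL.conf in place so the volume can still
    # be exported once nfs-ganesha comes up; today the file is removed here
    # before the error is returned.
    exit 1
fi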
(In reply to Soumya Koduri from comment #2)
> I suspect the behaviour below may have led to this issue -
>
> For each volume to be exported, we use dbus-send.sh "dynamic_export_add()".
> In case of this dbus command failing, we remove the corresponding
> export.$vol.conf file and return an error. So when any of the offline nodes
> comes up, glusterd will try to sync the options, which in turn will result
> in calling "dynamic_export_add()". But since nfs-ganesha is not started by
> then, the export file may have got removed. And later when nfs-ganesha is
> not started, those volumes will not be exported as their export entries are

typo - %s/nfs-ganesha is not started/nfs-ganesha is started/

> not present in '/etc/ganesha/exports'.
>
> So the fix can be: in case of "dynamic_export_add()" failure, return an
> error without removing the export file. That way, when nfs-ganesha comes
> up, it can export the corresponding volume. The side-effect would be that
> in case of refresh-config too, if the AddExport fails, the export shall
> still remain, which I think is fine.
Looks like the above-mentioned theory is not the case here. After a brief chat with Kaushal from the glusterd team, it became clear that hook scripts/runner jobs executed as part of any volume set option change are not run by default when a node/glusterd gets restarted. We need special handling if we want glusterd to sync any information (in our case, the export files) when it gets restarted. The fix seems non-trivial.
The work-around is to manually copy the export.$vol.conf files to the node which got rebooted and force-start the volume.
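For illustration, a minimal sketch of that work-around, assuming the rebooted node can reach a healthy peer named node1 and the affected volume is v1 (both names are placeholders):

# On the rebooted node: copy the export file from a healthy peer
scp node1:/etc/ganesha/exports/export.v1.conf /etc/ganesha/exports/

# Force-start the volume so that it gets exported again
gluster volume start v1 force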
Hi Soumya,

Sorry for the late update. I had worked on this bug, but didn't update the BZ. Your comment 4 is right. After a discussion with Atin, he suggested adding a similar routine for the "volume set option" path to the glusterd_restart_brick() function. I just looked into the code; it seems doable.

As a work-around, a simple "gluster vol <volname> start force" will be enough (no need to copy the file).
Thanks, Jiffin. Kaushal had suggested looking at "glusterd_compare_friend_data". I am not sure whether the fix is trivial or too risky for this release. We do need to copy the updated export.$vol.conf files from an online node before restarting the volume (instead of letting default ones be created), in case a refresh-config was done for that volume before the node came back online.
Agreed on that.
http://review.gluster.org/#/c/14063/
Verified this bug with the latest nfs-ganesha-2.3.1-6 build:

[root@dhcp43-175 exports]# rpm -qa|grep ganesha
nfs-ganesha-2.3.1-6.el7rhgs.x86_64
nfs-ganesha-gluster-2.3.1-6.el7rhgs.x86_64
glusterfs-ganesha-3.7.9-4.el7rhgs.x86_64

>>>>>>> Stopping glusterd on a few nodes; enabling ganesha on volumes and then starting glusterd on those nodes:

>> On a 4-node cluster, stopped glusterd on 2 nodes:

[root@dhcp42-20 exports]# service glusterd stop
Redirecting to /bin/systemctl stop glusterd.service

[root@dhcp42-239 exports]# service glusterd stop
Redirecting to /bin/systemctl stop glusterd.service

>> Enabled ganesha on 4 volumes:

[root@dhcp43-175 exports]# gluster vol set v1 ganesha.enable on
volume set: success
[root@dhcp43-175 exports]# gluster vol set v2 ganesha.enable on
volume set: success
[root@dhcp43-175 exports]# gluster vol set v3 ganesha.enable on
volume set: success
[root@dhcp43-175 exports]# gluster vol set v4 ganesha.enable on
volume set: success

>> showmount -e localhost and /etc/ganesha/exports on the nodes:

[root@dhcp43-175 exports]# ls
export.v1.conf  export.v2.conf  export.v3.conf  export.v4.conf
[root@dhcp43-175 exports]# showmount -e localhost
Export list for localhost:
/v1 (everyone)
/v2 (everyone)
/v3 (everyone)
/v4 (everyone)

[root@dhcp42-196 exports]# ls
export.v1.conf  export.v2.conf  export.v3.conf  export.v4.conf
[root@dhcp42-196 exports]# showmount -e localhost
Export list for localhost:
/v1 (everyone)
/v2 (everyone)
/v3 (everyone)
/v4 (everyone)

[root@dhcp42-20 exports]# ls
[root@dhcp42-20 exports]# showmount -e localhost
Export list for localhost:
[root@dhcp42-20 exports]#

[root@dhcp42-239 exports]# ls
[root@dhcp42-239 exports]# showmount -e localhost
Export list for localhost:
[root@dhcp42-239 exports]#

>> Started glusterd on the 2 nodes where it was down:

[root@dhcp42-20 exports]# service glusterd start
Redirecting to /bin/systemctl start glusterd.service

[root@dhcp42-239 exports]# service glusterd start
Redirecting to /bin/systemctl start glusterd.service

>> Checked showmount and the export files under /etc/ganesha/exports on the nodes where glusterd was started:

[root@dhcp42-20 exports]# ls
export.v1.conf  export.v2.conf  export.v3.conf  export.v4.conf
[root@dhcp42-20 exports]# showmount -e localhost
Export list for localhost:
/v1 (everyone)
/v2 (everyone)
/v3 (everyone)
/v4 (everyone)

[root@dhcp42-239 exports]# ls
export.v1.conf  export.v2.conf  export.v3.conf  export.v4.conf
[root@dhcp42-239 exports]# showmount -e localhost
Export list for localhost:
/v1 (everyone)
/v2 (everyone)
/v3 (everyone)
/v4 (everyone)

As can be seen above, if glusterd is down on a few nodes and we export volumes from the other nodes, then once glusterd starts, the volumes get exported on the nodes where glusterd had been down as well.
This particular scenario is working fine.
----------------------------------------------------------------------------

>>>>>>> Reboot scenario, tested for 4 volumes:

[root@dhcp42-20 ~]# gluster vol list
gluster_shared_storage
v1
v2
v3
v4

>> Enable ganesha on the volumes:

> Enable ganesha on v1 and shut down node4; on node4 before shutdown:

[root@dhcp42-196 ~]# showmount -e localhost
Export list for localhost:
/v1 (everyone)
[root@dhcp42-196 ~]# cd /etc/ganesha/exports/
[root@dhcp42-196 exports]# ls
export.v1.conf
[root@dhcp42-196 exports]#

> Enable ganesha on v2 and shut down node3; on node3 before shutdown:

[root@dhcp43-175 exports]# showmount -e localhost
Export list for localhost:
/v1 (everyone)
/v2 (everyone)
[root@dhcp43-175 exports]# ls
export.v1.conf  export.v2.conf

> Enable ganesha on v3 and shut down node2; on node2 before shutdown:

[root@dhcp42-239 exports]# ls
export.v1.conf  export.v2.conf  export.v3.conf
[root@dhcp42-239 exports]# showmount -e localhost
Export list for localhost:
/v1 (everyone)
/v2 (everyone)
/v3 (everyone)

> Enable ganesha on v4; on node1:

[root@dhcp42-20 exports]# ls
export.v1.conf  export.v2.conf  export.v3.conf  export.v4.conf
[root@dhcp42-20 exports]# showmount -e localhost
Export list for localhost:
/v1 (everyone)
/v2 (everyone)
/v3 (everyone)
/v4 (everyone)

>> Bring back all the nodes and make sure the pcs and pacemaker services are running.
>> After some time, restart the nfs-ganesha service on the 3 nodes which came back.
>> Observe the entries under /etc/ganesha/exports, showmount and ganesha.conf:

On node4:
[root@dhcp42-196 exports]# ls
export.v1.conf
[root@dhcp42-196 exports]# showmount -e localhost
Export list for localhost:
[root@dhcp42-196 exports]#

On node3:
[root@dhcp43-175 exports]# ls
export.v1.conf  export.v2.conf
[root@dhcp43-175 exports]# showmount -e localhost
Export list for localhost:
/v1 (everyone)
/v2 (everyone)
[root@dhcp43-175 exports]#

On node2:
[root@dhcp42-239 exports]# ls
export.v1.conf  export.v2.conf  export.v3.conf
[root@dhcp42-239 exports]# showmount -e localhost
Export list for localhost:
/v1 (everyone)
/v2 (everyone)
/v3 (everyone)
[root@dhcp42-239 exports]#

On node1:
[root@dhcp42-20 exports]# ls
export.v1.conf  export.v2.conf  export.v3.conf  export.v4.conf
[root@dhcp42-20 exports]# showmount -e localhost
Export list for localhost:
/v1 (everyone)
/v2 (everyone)
/v3 (everyone)
/v4 (everyone)
[root@dhcp42-20 exports]#

>> So the originally reported issue, that the exported volumes do not get synced up on nodes when they come back into the cluster, is still reproducible.

Hence assigning it back.
We have tested the above-mentioned case. If we impose the order as mentioned above, glusterd doesn't seem to start post node reboot, as pacemaker takes a while to start the required services. Alternatively, we tried the following work-arounds:

1) Enable the services via systemd in the order pacemaker, pcsd and glusterd. This approach didn't work either.

2) `systemctl disable glusterd` and start it manually post reboot. This seems to have worked on only one node and not on the other 2 nodes. This case still needs RCA.

Meanwhile, since glusterd is enabled by default via systemd and imposing an order on the services doesn't seem right, the following work-arounds may help us get past this issue:

* either scp a good copy of the export files of the volumes in question to the nodes which got rebooted and then force-start those volumes (as in the sketch earlier in this bug);
* or make 'systemctl restart glusterd' copy the missing export files and export them; a rough sketch of one way to wire this up follows below. But this would need some code changes.
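For illustration only, the second idea could be prototyped as a small helper hooked into glusterd's startup, for example via a systemd drop-in with ExecStartPost pointing at the script below. The peer hostname, the script path and the use of ssh/scp are assumptions, not part of any shipped package:

#!/bin/bash
# Hypothetical /usr/local/sbin/ganesha-export-sync.sh, run after glusterd starts
PEER="node1"                     # assumption: a peer known to have good export files
EXPORT_DIR="/etc/ganesha/exports"

# Copy any export.<vol>.conf present on the peer but missing locally
for f in $(ssh "$PEER" ls "$EXPORT_DIR" 2>/dev/null); do
    if [ ! -e "$EXPORT_DIR/$f" ]; then
        scp "$PEER:$EXPORT_DIR/$f" "$EXPORT_DIR/"
    fi
done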
Upstream link: http://review.gluster.org/#/c/14906/
Downstream link: https://code.engineering.redhat.com/gerrit/#/c/84776/
Tested and verified the fix in build:

nfs-ganesha-2.4.1-2.el7rhgs.x86_64
nfs-ganesha-gluster-2.4.1-2.el7rhgs.x86_64
glusterfs-ganesha-3.8.4-8.el7rhgs.x86_64
Edited the doc text further for the errata.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHEA-2017-0493.html