Description of problem:
ganesha.enable remains on in the volume info file after nfs-ganesha is disabled on the cluster.

Version-Release number of selected component (if applicable):
nfs-ganesha-2.3.1-8

How reproducible:
Always

Steps to Reproduce:
1. Disable nfs-ganesha on an already running ganesha cluster, using the command below:

   gluster nfs-ganesha disable

2. Observe that 'gluster volume info' shows ganesha.enable as off, while the info file under /var/lib/glusterd/vols/<volname> still shows ganesha.enable as on.

>> ganesha.enable is off and features.cache-invalidation is on in vol info:

[root@dhcp42-157 nfs-ganesha]# gluster vol info v1

Volume Name: v1
Type: Replicate
Volume ID: 6f57ba0e-c09d-4b38-9a15-d21632af6e26
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.70.42.157:/bricks/brick0/b0
Brick2: 10.70.42.171:/bricks/brick0/b0
Options Reconfigured:
performance.readdir-ahead: on
nfs.disable: on
features.cache-invalidation: on
ganesha.enable: off
nfs-ganesha: disable
cluster.enable-shared-storage: enable

>> Info file under /var/lib/glusterd/vols/v1 has ganesha.enable and features.cache-invalidation as on:

type=2
count=2
status=1
sub_count=2
stripe_count=1
replica_count=2
disperse_count=0
redundancy_count=0
version=4
transport-type=0
volume-id=6f57ba0e-c09d-4b38-9a15-d21632af6e26
username=1899af9b-4751-47c8-9344-d60ba84d6310
password=130e4ff3-ee46-463b-a2a9-f3357e28b61b
op-version=30700
client-op-version=30000
quota-version=0
parent_volname=N/A
restored_from_snap=00000000-0000-0000-0000-000000000000
snap-max-hard-limit=256
performance.readdir-ahead=on
nfs.disable=on
features.cache-invalidation=on
ganesha.enable=on
brick-0=10.70.42.157:-bricks-brick0-b0
brick-1=10.70.42.171:-bricks-brick0-b0

>> ganesha.enable is on in the v1.tcp-fuse.vol and trusted-v1.tcp-fuse.vol files under /var/lib/glusterd/vols/v1:

volume v1-ganesha
    type features/ganesha
    option ganesha.enable on
    subvolumes v1-md-cache
end-volume

3. Once nfs-ganesha is enabled again on the same cluster, ganesha gets enabled automatically on all volumes, which should not happen.

Actual results:
ganesha.enable remains on in the volume info file after nfs-ganesha is disabled on the cluster.

Expected results:
ganesha should be disabled on all volumes once the cluster is torn down.

Additional info:
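A quick way to spot the stale option is to compare the CLI view with glusterd's on-disk store (a sketch based on the paths above; 'v1' is the volume from this report):

# CLI view of the option (in-memory volinfo)
gluster volume info v1 | grep ganesha.enable

# Persisted store that glusterd re-reads on restart
grep ganesha.enable /var/lib/glusterd/vols/v1/info

# Generated volfiles can retain the option as well
grep ganesha.enable /var/lib/glusterd/vols/v1/*fuse.vol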
> 3. Once nfs-ganesha is enabled again on the same cluster, ganesha gets
> enabled automatically on all volumes, which should not happen.

I thought this happened only during an upgrade, when glusterd reads the persisted options. On a working node, glusterd reads the options from the volinfo file, not the volfile. Could you please confirm the same?
CCing Atin.

Shashank, to gauge the impact of this issue, could you please test the scenarios below and share the results? Thanks! A rough sketch of scenario 2 as shell commands follows this list.

1) * Set up the nfs-ganesha cluster
   * Export a few volumes
   * Tear down the cluster
   * Now re-set up the nfs-ganesha cluster
   Do you see the volumes automatically exported this time?

2) * Set up the nfs-ganesha cluster
   * Export a few volumes (say 5)
   * Now unexport a few of them (say 3)
   * Reboot a few nodes of the cluster / restart glusterd on a few of the nodes.
   Check the behaviour of those unexported volumes on those nodes.

3) * Set up the nfs-ganesha cluster
   * Export a few volumes (say 5)
   * Now unexport a few of them (say 3)
   * Reboot all the nodes of the cluster / restart glusterd on all the nodes.
   Check the behaviour of those unexported volumes on those nodes.
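Sketch for scenario 2 (volume names are placeholders; the restart step is per node):

# Export five volumes, then unexport three of them
for v in vol1 vol2 vol3 vol4 vol5; do gluster volume set $v ganesha.enable on; done
for v in vol1 vol2 vol3; do gluster volume set $v ganesha.enable off; done

# On a few nodes only: restart glusterd (or reboot the node)
systemctl restart glusterd

# On those nodes, verify the unexported volumes stayed off in the persisted store
for v in vol1 vol2 vol3; do grep ganesha.enable /var/lib/glusterd/vols/$v/info; done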
As I read through the issue, I don't think this is related to upgrade/update. IMO, we should have seen this issue in the nfs-ganesha disable workflow too, if all the nodes in the cluster go down after disabling and come back one after another. Can you please confirm the same?
We tried various scenarios, and the only one where we see a failure is the following sequence (condensed as commands below):

* Export some volumes
* Disable nfs-ganesha
* Stop glusterd
* Start glusterd
* Enable nfs-ganesha
* Try to export the volumes; it fails with "ganesha.enable is already on"
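A sketch of the failing sequence ('v1' stands for any previously exported volume):

gluster volume set v1 ganesha.enable on   # export the volume
gluster nfs-ganesha disable               # tear down the ganesha cluster
systemctl stop glusterd
systemctl start glusterd                  # glusterd restores volinfo from the stale info file
gluster nfs-ganesha enable
gluster volume set v1 ganesha.enable on   # fails: ganesha.enable is already on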
Thanks Shashank and Jiffin. The issue is that the node where glusterd was restarted still has ganesha.enable=ON for those volumes, whereas on all the other nodes ganesha.enable was reset to OFF once the nfs-ganesha cluster was re-set up. The possible workarounds seem to be: 1) manually edit the volinfo file on the node where ganesha.enable is 'ON' and then restart glusterd, or 2) tear down and re-set up the NFS-Ganesha cluster. Atin, from the glusterd perspective, please let us know if option (1) is acceptable. Thanks!
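For reference, option (1) would look roughly like this on the affected node (hand-editing glusterd's store is risky, so this is only a sketch of the proposed workaround; 'v1' is a placeholder):

systemctl stop glusterd
# Flip the stale value in the persisted volinfo
sed -i 's/^ganesha.enable=on$/ganesha.enable=off/' /var/lib/glusterd/vols/v1/info
systemctl start glusterd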
I checked the code. We do call glusterd_store_options () in glusterd_op_set_ganesha (), which means the global options get persisted into the store. I am really not sure how we ended up in a situation where the changes were in memory but not persisted into the store. We need to RCA this completely before settling on a final workaround.
I do see that the global options are persisted in the /var/lib/glusterd/options file.
It's not the global option but the volume-level ganesha.enable option that does not get persisted.
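The two stores can be inspected separately to see the discrepancy (a sketch; the exact key name in the options file is an assumption, so a case-insensitive grep is used):

# Global options -- these are persisted correctly
grep -i ganesha /var/lib/glusterd/options

# Volume-level option -- this is the one left stale
grep ganesha.enable /var/lib/glusterd/vols/v1/info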
Doc text looks good to me
The upstream patch http://review.gluster.org/#/c/14831/2 has been merged into the 3.8 branch.
This change is already part of the rebase.
Verified this bug with the ganesha build below, and it is working as expected:

[root@dhcp43-110 ~]# rpm -qa|grep ganesha
nfs-ganesha-debuginfo-2.4.0-2.el7rhgs.x86_64
nfs-ganesha-2.4.0-2.el7rhgs.x86_64
nfs-ganesha-gluster-2.4.0-2.el7rhgs.x86_64
glusterfs-ganesha-3.8.4-1.el7rhgs.x86_64

>>>>> When ganesha is enabled on the volume:

[root@dhcp43-110 ~]# gluster vol info testvolume | grep ganesha
ganesha.enable: on

>>>>> After disabling ganesha on the volume:

[root@dhcp43-110 ~]# gluster vol set testvolume ganesha.enable off
volume set: success
[root@dhcp43-110 ~]# gluster vol info testvolume | grep ganesha
ganesha.enable: off
[root@dhcp43-110 ~]# cat /var/lib/glusterd/vols/testvolume/info | grep ganesha
ganesha.enable=off

>>>>> After disabling ganesha on the cluster while ganesha is enabled on the volume:

[root@dhcp43-110 ~]# gluster vol set testvolume ganesha.enable on
volume set: success
[root@dhcp43-110 ~]# cat /var/lib/glusterd/vols/testvolume/info | grep ganesha
ganesha.enable=on
[root@dhcp43-110 ~]# gluster vol info testvolume | grep ganesha
ganesha.enable: on
nfs-ganesha: enable
[root@dhcp43-110 ~]# gluster nfs-ganesha disable
Disabling NFS-Ganesha will tear down entire ganesha cluster across the trusted pool. Do you still want to continue? (y/n) y
This will take a few minutes to complete. Please wait ..
nfs-ganesha : success
[root@dhcp43-110 ~]# cat /var/lib/glusterd/vols/testvolume/info | grep ganesha
ganesha.enable=off
[root@dhcp43-110 ~]# gluster vol info testvolume | grep ganesha
ganesha.enable: off
nfs-ganesha: disable

*****************************************

Also tried the scenario mentioned in comment 5:

* Export some volumes
* Disable nfs-ganesha
* Stop glusterd
* Start glusterd
* Enable nfs-ganesha
* Try to export the volumes (previously this failed with "ganesha.enable is already on")

It works as expected; I don't see any failures now.

*****************************************

Based on the above observation, marking this bug as Verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2017-0486.html