Description of problem:
=======================
When a config value is set while glusterd is offline on one of the nodes in the cluster, the config value is set successfully on the remaining nodes. But if the node (which had glusterd offline) is rebooted, the config value on this node falls back to the default of 256.

Version-Release number of selected component (if applicable):
==============================================================
glusterfs-3.4.1.7.snap.mar27.2014git-1.el6.x86_64

How reproducible:
=================
1/1

Steps to Reproduce:
1. Create a four node cluster (node1 to node4)
2. Create and start volumes (vol0 to vol3)
3. Stop glusterd on node2 (service glusterd stop)
4. Set the config value to 20 for vol3 from node1. This should be successful.
5. Reboot node2

Actual results:
===============
On node2, the config value is set to the default 256, but on node1 the config value is 20.

Expected results:
=================
On node2 also the config value should be 20.

Additional info:
================
Commands:
=========
Initial value on node1:
+++++++++++++++++++++++
[root@snapshot-09 ~]# gluster snapshot config vol3

Snapshot System Configuration:
snap-max-hard-limit : 256
snap-max-soft-limit : 90%

Snapshot Volume Configuration:

Volume : vol3
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 256
Effective snap-max-soft-limit : 230 (90%)
[root@snapshot-09 ~]#

Service glusterd stop on node2:
+++++++++++++++++++++++++++++++
[root@snapshot-10 ~]# service glusterd stop
[root@snapshot-10 ~]#                                      [  OK  ]

Set the config value from node1:
++++++++++++++++++++++++++++++++
[root@snapshot-09 ~]# gluster snapshot config vol3 snap-max-hard-limit 20
Changing snapshot-max-hard-limit will lead to deletion of snapshots if they exceed the new limit.
Do you want to continue? (y/n) y
snapshot config: vol3 for snap-max-hard-limit set successfully
[root@snapshot-09 ~]# gluster snapshot config vol3

Snapshot System Configuration:
snap-max-hard-limit : 256
snap-max-soft-limit : 90%

Snapshot Volume Configuration:

Volume : vol3
snap-max-hard-limit : 20
Effective snap-max-hard-limit : 20
Effective snap-max-soft-limit : 18 (90%)
[root@snapshot-09 ~]#

Reboot node2 and, once node2 is back, check the config value on node2:
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[root@snapshot-10 ~]# reboot
[  OK  ]
Broadcast message from root@snapshot-10 (/dev/pts/0) at 5:54 ...

The system is going down for reboot NOW!
[root@snapshot-10 ~]#
[root@snapshot-10 ~]# gluster snapshot config vol3

Snapshot System Configuration:
snap-max-hard-limit : 256
snap-max-soft-limit : 90%

Snapshot Volume Configuration:

Volume : vol3
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 256
Effective snap-max-soft-limit : 230 (90%)
[root@snapshot-10 ~]#
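For quick re-runs, the steps above can be condensed into the following shell sketch. The hostnames node1/node2 are the ones used in this report; the --mode=script flag, used here only to skip the y/n confirmation prompt, is an assumption for this build.

# On node2: take glusterd out of the cluster
service glusterd stop

# On node1: set the per-volume hard limit while node2 is down
gluster --mode=script snapshot config vol3 snap-max-hard-limit 20
gluster snapshot config vol3      # should show snap-max-hard-limit : 20

# On node2: reboot, wait for glusterd to come back, then re-check
reboot
gluster snapshot config vol3      # bug: shows the default 256 instead of 20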
Marking snapshot BZs to RHS 3.0.
Could you please let me know whether iptables was off when you brought the node up? If iptables is enabled, the peer might not have reconnected, so the volinfo might not have been synced, and the node would then still show the previously set value.
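For reference, a minimal check along those lines on the rebooted node; the service names follow RHEL 6 conventions and are assumptions here:

service iptables status     # should be stopped, or have the gluster ports open
gluster peer status         # every peer should be reported as Connected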
Found out that this happens when the config limit is set globally, because glusterd.info is not synced. I'll update here when I find a solution for this.
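A rough way to compare what each glusterd has actually persisted locally is shown below; the store paths follow the usual /var/lib/glusterd layout, and which keys appear in which file on this build is an assumption:

cat /var/lib/glusterd/glusterd.info              # file mentioned above
cat /var/lib/glusterd/options                    # cluster-wide options, if present
grep snap-max /var/lib/glusterd/vols/vol3/info   # per-volume snapshot limits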
Moved this to the 'New' state so that anyone familiar with this code path can pick it up.
The patch which fixes this problem has been posted upstream; I will send the patch downstream once it gets merged upstream.
https://code.engineering.redhat.com/gerrit/#/c/26709/
Version : glusterfs-3.6.0.17-1.el6rhs.x86_64
=======
When snap-max-hard-limit is set while glusterd is stopped on one node, the limit is updated when that node comes back up. But when snap-max-soft-limit is set with the same steps, it is not updated on that node when it comes back up.

Steps followed:
~~~~~~~~~~~~~~~
Hard limit:
-----------
1) Create and start a volume (vol0)
2) Stop glusterd on node2 (service glusterd stop)
3) Set the config value to 30 for vol0 from node3.

gluster snapshot config vol0 snap-max-hard-limit 30
Changing snapshot-max-hard-limit will lead to deletion of snapshots if they exceed the new limit.
Do you want to continue? (y/n) y
snapshot config: vol0 for snap-max-hard-limit set successfully

gluster snapshot config vol0

Snapshot System Configuration:
snap-max-hard-limit : 256
snap-max-soft-limit : 90%
auto-delete : disable

Snapshot Volume Configuration:

Volume : vol0
snap-max-hard-limit : 30
Effective snap-max-hard-limit : 30
Effective snap-max-soft-limit : 27 (90%)

4) Reboot node2. Once node2 is back, check the config value:

gluster snapshot config vol0

Snapshot System Configuration:
snap-max-hard-limit : 256
snap-max-soft-limit : 90%
auto-delete : disable

Snapshot Volume Configuration:

Volume : vol0
snap-max-hard-limit : 30
Effective snap-max-hard-limit : 30
Effective snap-max-soft-limit : 27 (90%)

The snap-max-hard-limit shows as '30' (as expected).
=====================================================

Soft limit:
-----------
gluster snapshot config snap-max-soft-limit 10
Changing snapshot-max-soft-limit will lead to deletion of snapshots if they exceed the new limit.
Do you want to continue? (y/n) y
snapshot config: System for snap-max-soft-limit set successfully

On Node1, Node3 and Node4:
==========================
gluster snapshot config

Snapshot System Configuration:
snap-max-hard-limit : 256
snap-max-soft-limit : 10%
auto-delete : disable

Snapshot Volume Configuration:

Volume : vol0
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 256
Effective snap-max-soft-limit : 25 (10%)

Volume : vol1
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 256
Effective snap-max-soft-limit : 25 (10%)

Volume : vol2
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 256
Effective snap-max-soft-limit : 25 (10%)

Volume : vol3
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 256
Effective snap-max-soft-limit : 25 (10%)

On Node2 when it is back up:
============================
It still shows 90%, which is the default soft limit. All the other nodes show snap-max-soft-limit as 10%, but Node2 shows 90%.

gluster snapshot config

Snapshot System Configuration:
snap-max-hard-limit : 256
snap-max-soft-limit : 90%
auto-delete : disable

Snapshot Volume Configuration:

Volume : vol0
snap-max-hard-limit : 30
Effective snap-max-hard-limit : 30
Effective snap-max-soft-limit : 27 (90%)

Volume : vol1
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 256
Effective snap-max-soft-limit : 230 (90%)

Volume : vol2
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 256
Effective snap-max-soft-limit : 230 (90%)

Volume : vol3
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 256
Effective snap-max-soft-limit : 230 (90%)

[root@rhs-arch-srv2 ~]# gluster snapshot config vol0

Snapshot System Configuration:
snap-max-hard-limit : 256
snap-max-soft-limit : 90%
auto-delete : disable

Snapshot Volume Configuration:

Volume : vol0
snap-max-hard-limit : 30
Effective snap-max-hard-limit : 30
Effective snap-max-soft-limit : 27 (90%)

Moving the bug back to 'Assigned' state.
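To spot the stale node at a glance, a loop like the one below can be run from any machine; the hostnames and passwordless ssh are assumptions:

for h in node1 node2 node3 node4; do
    echo "== $h =="
    ssh "$h" 'gluster snapshot config | grep -E "snap-max-(hard|soft)-limit|auto-delete"'
done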
Logs are uploaded at the location below (Comment 11): http://rhsqe-repo.lab.eng.blr.redhat.com/bugs_necessary_info/snapshots/1082951/
Sorry, there was another patch which was dependent on this one and which was merged upstream. I think I moved the state of the bug too early. This patch fixes it: https://code.engineering.redhat.com/gerrit/#/c/27013/. Hence moving this bug to POST.
Version : glusterfs 3.6.0.18 built on Jun 16 2014
========
Setting the hard limit for a volume while glusterd is down on one node, then rebooting that node and checking 'gluster snapshot config vol3', still does not update the hard-limit value on that node.

gluster snapshot config vol3 snap-max-hard-limit 100
Changing snapshot-max-hard-limit will lead to deletion of snapshots if they exceed the new limit.
Do you want to continue? (y/n) y
snapshot config: vol3 for snap-max-hard-limit set successfully

From Node 1,3,4:
----------------
gluster snapshot config vol3

Snapshot System Configuration:
snap-max-hard-limit : 256
snap-max-soft-limit : 90%
auto-delete : disable

Snapshot Volume Configuration:

Volume : vol3
snap-max-hard-limit : 100
Effective snap-max-hard-limit : 100 ---------------> set to 100
Effective snap-max-soft-limit : 90 (90%)

From Node 2, where glusterd was stopped and the node was rebooted:
------------------------------------------------------------------
gluster snapshot config vol3

Snapshot System Configuration:
snap-max-hard-limit : 256
snap-max-soft-limit : 90%
auto-delete : disable

Snapshot Volume Configuration:

Volume : vol3
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 256 -----------> still shows 256
Effective snap-max-soft-limit : 230 (90%)
When I investigated, I found that the above-mentioned problem was not caused by my fix. There is another patch, merged downstream yesterday, which passes the wrong key to the friend-import dict. Because of that, volume information was not propagating properly. I'll talk with the relevant person and update here shortly.
Kaushal sent a fix for it: https://code.engineering.redhat.com/gerrit/#/c/27112/. Hopefully the above patch fixes the problem. Hence moving this bug to the MODIFIED state.
I did a test run with the above-mentioned patch and didn't face any problems. Looks good to me.
Tested after applying Kaushal's patch. Following are the results. I had a 2-node setup:

Node 1:
[root@snapshot-24 glusterfs]# gluster snapshot config

Snapshot System Configuration:
snap-max-hard-limit : 60
snap-max-soft-limit : 100%
auto-delete : disable

Snapshot Volume Configuration:

Volume : vol1
snap-max-hard-limit : 50
Effective snap-max-hard-limit : 50
Effective snap-max-soft-limit : 50 (100%)

Node 2:
[root@snapshot-27 ~]# gluster snapshot config

Snapshot System Configuration:
snap-max-hard-limit : 60
snap-max-soft-limit : 100%
auto-delete : disable

Snapshot Volume Configuration:

Volume : vol1
snap-max-hard-limit : 50
Effective snap-max-hard-limit : 50
Effective snap-max-soft-limit : 50 (100%)

-----------------------------------------------------------------------

Node 2:
[root@snapshot-27 ~]# service glusterd stop

Node 1:
[root@snapshot-24 glusterfs]# gluster snapshot config snap-max-hard-limit 100
Changing snapshot-max-hard-limit will lead to deletion of snapshots if they exceed the new limit.
Do you want to continue? (y/n) y
snapshot config: System for snap-max-hard-limit set successfully

[root@snapshot-24 glusterfs]# gluster snapshot config snap-max-soft-limit 80
Changing snapshot-max-soft-limit will lead to deletion of snapshots if they exceed the new limit.
Do you want to continue? (y/n) y
snapshot config: System for snap-max-soft-limit set successfully

[root@snapshot-24 glusterfs]# gluster snapshot config vol1 snap-max-hard-limit 80
Changing snapshot-max-hard-limit will lead to deletion of snapshots if they exceed the new limit.
Do you want to continue? (y/n) y
snapshot config: vol1 for snap-max-hard-limit set successfully

Node 2:
[root@snapshot-27 ~]# reboot

After Node 2 came back, the following was the snapshot config output.

Node 1:
[root@snapshot-24 rhs-glusterfs]# gluster snapshot config

Snapshot System Configuration:
snap-max-hard-limit : 100
snap-max-soft-limit : 80%
auto-delete : disable

Snapshot Volume Configuration:

Volume : vol1
snap-max-hard-limit : 80
Effective snap-max-hard-limit : 80
Effective snap-max-soft-limit : 64 (80%)

Node 2:
[root@snapshot-27 rhs-glusterfs]# gluster snapshot config

Snapshot System Configuration:
snap-max-hard-limit : 100
snap-max-soft-limit : 80%
auto-delete : disable

Snapshot Volume Configuration:

Volume : vol1
snap-max-hard-limit : 80
Effective snap-max-hard-limit : 80
Effective snap-max-soft-limit : 64 (80%)
Version: glusterfs 3.6.0.19 built on Jun 18 2014
========
snap-max-hard-limit and snap-max-soft-limit work as expected. But with the same steps, when auto-delete is ENABLED from Node1 while glusterd is stopped on Node2, and Node2 is then rebooted, 'gluster snapshot config' on Node2 shows auto-delete as DISABLED.

NODE1, NODE3, NODE4:
====================
gluster snapshot config

Snapshot System Configuration:
snap-max-hard-limit : 35
snap-max-soft-limit : 12%
auto-delete : enable ------------------> AUTO DELETE is enabled

Snapshot Volume Configuration:

Volume : vol0
snap-max-hard-limit : 20
Effective snap-max-hard-limit : 20
Effective snap-max-soft-limit : 2 (12%)

Volume : vol1
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 35
Effective snap-max-soft-limit : 4 (12%)

Volume : vol2
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 35
Effective snap-max-soft-limit : 4 (12%)

Volume : vol3
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 35
Effective snap-max-soft-limit : 4 (12%)

Volume : vol4
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 35
Effective snap-max-soft-limit : 4 (12%)

NODE 2, where glusterd was down and the node was then rebooted:
===============================================================
gluster snapshot config

Snapshot System Configuration:
snap-max-hard-limit : 35
snap-max-soft-limit : 12%
auto-delete : disable ------------------> AUTO DELETE is disabled

Snapshot Volume Configuration:

Volume : vol0
snap-max-hard-limit : 20
Effective snap-max-hard-limit : 20
Effective snap-max-soft-limit : 2 (12%)

Volume : vol1
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 35
Effective snap-max-soft-limit : 4 (12%)

Volume : vol2
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 35
Effective snap-max-soft-limit : 4 (12%)

Volume : vol3
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 35
Effective snap-max-soft-limit : 4 (12%)

Volume : vol4
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 35
Effective snap-max-soft-limit : 4 (12%)

Moving it back to 'Assigned'.
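A focused re-check of just the auto-delete flag could look like this; the 'auto-delete enable' syntax and the hostnames are assumptions based on the output above:

# From Node1, while glusterd on Node2 is stopped:
gluster snapshot config auto-delete enable

# After Node2 has been rebooted and glusterd is back, on each node:
gluster snapshot config | grep auto-delete   # Node2 is expected to show 'enable' as well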
https://code.engineering.redhat.com/gerrit/#/c/27241/1 Fixes the auto-delete problem.
Version : glusterfs 3.6.0.20 built on Jun 19 2014
========
Set snap-max-hard-limit and snap-max-soft-limit while glusterd was offline on one node. Rebooted the node; it is updated with the config values. Also checked auto-delete with similar steps; it works as expected.

Marking the bug 'Verified'!
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHEA-2014-1278.html
*** Bug 1058821 has been marked as a duplicate of this bug. ***