Description of problem:
=======================
When a config value is set while glusterd is offline on one of the nodes in the cluster, the config value is set successfully on the remaining nodes. But if the node (which had glusterd offline) is rebooted, the config value on this node falls back to the default of 256.

Version-Release number of selected component (if applicable):
==============================================================
glusterfs-3.4.1.7.snap.mar27.2014git-1.el6.x86_64

How reproducible:
=================
1/1

Steps to Reproduce:
1. Create a four node cluster (node1 to node4)
2. Create and start volumes (vol0 to vol3)
3. Stop glusterd on node2 (service glusterd stop)
4. Set the config value to 20 for vol3 from node1. This should be successful.
5. Reboot node2

Actual results:
===============
On node2, the config value is set to the default 256, but on node1 the config value is 20.

Expected results:
=================
On node2 also the config value should be 20.

Additional info:
================
Commands:
=========
Initial value on node1:
+++++++++++++++++++++++
[root@snapshot-09 ~]# gluster snapshot config vol3

Snapshot System Configuration:
snap-max-hard-limit : 256
snap-max-soft-limit : 90%

Snapshot Volume Configuration:

Volume : vol3
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 256
Effective snap-max-soft-limit : 230 (90%)
[root@snapshot-09 ~]#

Service glusterd stop on node2:
+++++++++++++++++++++++++++++++
[root@snapshot-10 ~]# service glusterd stop
[root@snapshot-10 ~]#                                      [  OK  ]

Set the config value from node1:
++++++++++++++++++++++++++++++++
[root@snapshot-09 ~]# gluster snapshot config vol3 snap-max-hard-limit 20
Changing snapshot-max-hard-limit will lead to deletion of snapshots if they exceed the new limit.
Do you want to continue? (y/n) y
snapshot config: vol3 for snap-max-hard-limit set successfully
[root@snapshot-09 ~]# gluster snapshot config vol3

Snapshot System Configuration:
snap-max-hard-limit : 256
snap-max-soft-limit : 90%

Snapshot Volume Configuration:

Volume : vol3
snap-max-hard-limit : 20
Effective snap-max-hard-limit : 20
Effective snap-max-soft-limit : 18 (90%)
[root@snapshot-09 ~]#

Reboot node2 and, once node2 is back, check the config value on node2:
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[root@snapshot-10 ~]# reboot
[  OK  ]
Broadcast message from root@snapshot-10 (/dev/pts/0) at 5:54 ...

The system is going down for reboot NOW!
[root@snapshot-10 ~]#
[root@snapshot-10 ~]# gluster snapshot config vol3

Snapshot System Configuration:
snap-max-hard-limit : 256
snap-max-soft-limit : 90%

Snapshot Volume Configuration:

Volume : vol3
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 256
Effective snap-max-soft-limit : 230 (90%)
[root@snapshot-10 ~]#
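For quick re-runs, the steps above can be condensed into the following shell sketch. The hostnames node1/node2 are the ones used in this report; the --mode=script flag, used here only to skip the y/n confirmation prompt, is an assumption for this build.

# On node2: take glusterd out of the cluster
service glusterd stop

# On node1: set the per-volume hard limit while node2 is down
gluster --mode=script snapshot config vol3 snap-max-hard-limit 20
gluster snapshot config vol3      # should show snap-max-hard-limit : 20

# On node2: reboot, wait for glusterd to come back, then re-check
reboot
gluster snapshot config vol3      # bug: shows the default 256 instead of 20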
Marking snapshot BZs to RHS 3.0.
Could you please let me know whether iptables was off when you brought the node up? If iptables is enabled, the peer might not have reconnected, so the volinfo might not have been synced, and the node would then still show the previously set value.
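For reference, a minimal check along those lines on the rebooted node; the service names follow RHEL 6 conventions and are assumptions here:

service iptables status     # should be stopped, or have the gluster ports open
gluster peer status         # every peer should be reported as Connected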
Found out that this happens when the config limit is set globally, because glusterd.info is not synced. I'll update here when I find a solution for this.
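A rough way to compare what each glusterd has actually persisted locally is shown below; the store paths follow the usual /var/lib/glusterd layout, and which keys appear in which file on this build is an assumption:

cat /var/lib/glusterd/glusterd.info              # file mentioned above
cat /var/lib/glusterd/options                    # cluster-wide options, if present
grep snap-max /var/lib/glusterd/vols/vol3/info   # per-volume snapshot limits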
Moved this to the 'New' state so that anyone familiar with this code path can pick it up.
The patch which fixes this problem has been posted upstream; I will send the patch downstream once it gets merged upstream.
https://code.engineering.redhat.com/gerrit/#/c/26709/
Version : glusterfs-3.6.0.17-1.el6rhs.x86_64
=======
When snap-max-hard-limit is set while glusterd is stopped on one node, the limit is updated when that node comes back up. But when snap-max-soft-limit is set with the same steps, it is not updated on that node when it comes back up.

Steps followed:
~~~~~~~~~~~~~~~
Hard limit:
-----------
1) Create and start a volume (vol0)
2) Stop glusterd on node2 (service glusterd stop)
3) Set the config value to 30 for vol0 from node3.

gluster snapshot config vol0 snap-max-hard-limit 30
Changing snapshot-max-hard-limit will lead to deletion of snapshots if they exceed the new limit.
Do you want to continue? (y/n) y
snapshot config: vol0 for snap-max-hard-limit set successfully

gluster snapshot config vol0

Snapshot System Configuration:
snap-max-hard-limit : 256
snap-max-soft-limit : 90%
auto-delete : disable

Snapshot Volume Configuration:

Volume : vol0
snap-max-hard-limit : 30
Effective snap-max-hard-limit : 30
Effective snap-max-soft-limit : 27 (90%)

4) Reboot node2. Once node2 is back, check the config value:

gluster snapshot config vol0

Snapshot System Configuration:
snap-max-hard-limit : 256
snap-max-soft-limit : 90%
auto-delete : disable

Snapshot Volume Configuration:

Volume : vol0
snap-max-hard-limit : 30
Effective snap-max-hard-limit : 30
Effective snap-max-soft-limit : 27 (90%)

The snap-max-hard-limit shows as '30' (as expected).
=====================================================

Soft limit:
-----------
gluster snapshot config snap-max-soft-limit 10
Changing snapshot-max-soft-limit will lead to deletion of snapshots if they exceed the new limit.
Do you want to continue? (y/n) y
snapshot config: System for snap-max-soft-limit set successfully

On Node1, Node3 and Node4:
==========================
gluster snapshot config

Snapshot System Configuration:
snap-max-hard-limit : 256
snap-max-soft-limit : 10%
auto-delete : disable

Snapshot Volume Configuration:

Volume : vol0
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 256
Effective snap-max-soft-limit : 25 (10%)

Volume : vol1
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 256
Effective snap-max-soft-limit : 25 (10%)

Volume : vol2
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 256
Effective snap-max-soft-limit : 25 (10%)

Volume : vol3
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 256
Effective snap-max-soft-limit : 25 (10%)

On Node2 when it is back up:
============================
It still shows 90%, which is the default soft limit. All the other nodes show snap-max-soft-limit as 10%, but Node2 shows 90%.

gluster snapshot config

Snapshot System Configuration:
snap-max-hard-limit : 256
snap-max-soft-limit : 90%
auto-delete : disable

Snapshot Volume Configuration:

Volume : vol0
snap-max-hard-limit : 30
Effective snap-max-hard-limit : 30
Effective snap-max-soft-limit : 27 (90%)

Volume : vol1
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 256
Effective snap-max-soft-limit : 230 (90%)

Volume : vol2
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 256
Effective snap-max-soft-limit : 230 (90%)

Volume : vol3
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 256
Effective snap-max-soft-limit : 230 (90%)

[root@rhs-arch-srv2 ~]# gluster snapshot config vol0

Snapshot System Configuration:
snap-max-hard-limit : 256
snap-max-soft-limit : 90%
auto-delete : disable

Snapshot Volume Configuration:

Volume : vol0
snap-max-hard-limit : 30
Effective snap-max-hard-limit : 30
Effective snap-max-soft-limit : 27 (90%)

Moving the bug back to 'Assigned' state.
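To spot the stale node at a glance, a loop like the one below can be run from any machine; the hostnames and passwordless ssh are assumptions:

for h in node1 node2 node3 node4; do
    echo "== $h =="
    ssh "$h" 'gluster snapshot config | grep -E "snap-max-(hard|soft)-limit|auto-delete"'
done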
Logs are uploaded at the location below (Comment 11): http://rhsqe-repo.lab.eng.blr.redhat.com/bugs_necessary_info/snapshots/1082951/
Sorry, there was another patch which was dependent on this one and which was merged upstream. I think I moved the state of the bug too early. This patch fixes it: https://code.engineering.redhat.com/gerrit/#/c/27013/. Hence moving this bug to POST.
Version : glusterfs 3.6.0.18 built on Jun 16 2014
========
Setting the hard limit for a volume while glusterd is down on one node, then rebooting that node and checking 'gluster snapshot config vol3', still does not update the hard-limit value on that node.

gluster snapshot config vol3 snap-max-hard-limit 100
Changing snapshot-max-hard-limit will lead to deletion of snapshots if they exceed the new limit.
Do you want to continue? (y/n) y
snapshot config: vol3 for snap-max-hard-limit set successfully

From Node 1,3,4:
----------------
gluster snapshot config vol3

Snapshot System Configuration:
snap-max-hard-limit : 256
snap-max-soft-limit : 90%
auto-delete : disable

Snapshot Volume Configuration:

Volume : vol3
snap-max-hard-limit : 100
Effective snap-max-hard-limit : 100 ---------------> set to 100
Effective snap-max-soft-limit : 90 (90%)

From Node 2, where glusterd was stopped and the node was rebooted:
------------------------------------------------------------------
gluster snapshot config vol3

Snapshot System Configuration:
snap-max-hard-limit : 256
snap-max-soft-limit : 90%
auto-delete : disable

Snapshot Volume Configuration:

Volume : vol3
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 256 -----------> still shows 256
Effective snap-max-soft-limit : 230 (90%)
When I investigated, I found that the above-mentioned problem was not caused by my fix. There is another patch, merged downstream yesterday, which passes the wrong key to the friend-import dict. Because of that, volume information was not propagating properly. I'll talk with the relevant person and update here shortly.
Kaushal sent a fix for it: https://code.engineering.redhat.com/gerrit/#/c/27112/. Hopefully the above patch fixes the problem. Hence moving this bug to the MODIFIED state.
I did a test run with the above-mentioned patch and didn't face any problems. Looks good to me.
Tested after applying Kaushal's patch. Following are the results. I had a 2-node setup:

Node 1:
[root@snapshot-24 glusterfs]# gluster snapshot config

Snapshot System Configuration:
snap-max-hard-limit : 60
snap-max-soft-limit : 100%
auto-delete : disable

Snapshot Volume Configuration:

Volume : vol1
snap-max-hard-limit : 50
Effective snap-max-hard-limit : 50
Effective snap-max-soft-limit : 50 (100%)

Node 2:
[root@snapshot-27 ~]# gluster snapshot config

Snapshot System Configuration:
snap-max-hard-limit : 60
snap-max-soft-limit : 100%
auto-delete : disable

Snapshot Volume Configuration:

Volume : vol1
snap-max-hard-limit : 50
Effective snap-max-hard-limit : 50
Effective snap-max-soft-limit : 50 (100%)

-----------------------------------------------------------------------

Node 2:
[root@snapshot-27 ~]# service glusterd stop

Node 1:
[root@snapshot-24 glusterfs]# gluster snapshot config snap-max-hard-limit 100
Changing snapshot-max-hard-limit will lead to deletion of snapshots if they exceed the new limit.
Do you want to continue? (y/n) y
snapshot config: System for snap-max-hard-limit set successfully

[root@snapshot-24 glusterfs]# gluster snapshot config snap-max-soft-limit 80
Changing snapshot-max-soft-limit will lead to deletion of snapshots if they exceed the new limit.
Do you want to continue? (y/n) y
snapshot config: System for snap-max-soft-limit set successfully

[root@snapshot-24 glusterfs]# gluster snapshot config vol1 snap-max-hard-limit 80
Changing snapshot-max-hard-limit will lead to deletion of snapshots if they exceed the new limit.
Do you want to continue? (y/n) y
snapshot config: vol1 for snap-max-hard-limit set successfully

Node 2:
[root@snapshot-27 ~]# reboot

After Node 2 came back, the following was the snapshot config output.

Node 1:
[root@snapshot-24 rhs-glusterfs]# gluster snapshot config

Snapshot System Configuration:
snap-max-hard-limit : 100
snap-max-soft-limit : 80%
auto-delete : disable

Snapshot Volume Configuration:

Volume : vol1
snap-max-hard-limit : 80
Effective snap-max-hard-limit : 80
Effective snap-max-soft-limit : 64 (80%)

Node 2:
[root@snapshot-27 rhs-glusterfs]# gluster snapshot config

Snapshot System Configuration:
snap-max-hard-limit : 100
snap-max-soft-limit : 80%
auto-delete : disable

Snapshot Volume Configuration:

Volume : vol1
snap-max-hard-limit : 80
Effective snap-max-hard-limit : 80
Effective snap-max-soft-limit : 64 (80%)
Version: glusterfs 3.6.0.19 built on Jun 18 2014
========
snap-max-hard-limit and snap-max-soft-limit work as expected. But with the same steps, when auto-delete is ENABLED from Node1 while glusterd is stopped on Node2, and Node2 is then rebooted, 'gluster snapshot config' on Node2 shows auto-delete as DISABLED.

NODE1, NODE3, NODE4:
====================
gluster snapshot config

Snapshot System Configuration:
snap-max-hard-limit : 35
snap-max-soft-limit : 12%
auto-delete : enable ------------------> AUTO DELETE is enabled

Snapshot Volume Configuration:

Volume : vol0
snap-max-hard-limit : 20
Effective snap-max-hard-limit : 20
Effective snap-max-soft-limit : 2 (12%)

Volume : vol1
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 35
Effective snap-max-soft-limit : 4 (12%)

Volume : vol2
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 35
Effective snap-max-soft-limit : 4 (12%)

Volume : vol3
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 35
Effective snap-max-soft-limit : 4 (12%)

Volume : vol4
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 35
Effective snap-max-soft-limit : 4 (12%)

NODE 2, where glusterd was down and the node was then rebooted:
===============================================================
gluster snapshot config

Snapshot System Configuration:
snap-max-hard-limit : 35
snap-max-soft-limit : 12%
auto-delete : disable ------------------> AUTO DELETE is disabled

Snapshot Volume Configuration:

Volume : vol0
snap-max-hard-limit : 20
Effective snap-max-hard-limit : 20
Effective snap-max-soft-limit : 2 (12%)

Volume : vol1
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 35
Effective snap-max-soft-limit : 4 (12%)

Volume : vol2
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 35
Effective snap-max-soft-limit : 4 (12%)

Volume : vol3
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 35
Effective snap-max-soft-limit : 4 (12%)

Volume : vol4
snap-max-hard-limit : 256
Effective snap-max-hard-limit : 35
Effective snap-max-soft-limit : 4 (12%)

Moving it back to 'Assigned'.
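A focused re-check of just the auto-delete flag could look like this; the 'auto-delete enable' syntax and the hostnames are assumptions based on the output above:

# From Node1, while glusterd on Node2 is stopped:
gluster snapshot config auto-delete enable

# After Node2 has been rebooted and glusterd is back, on each node:
gluster snapshot config | grep auto-delete   # Node2 is expected to show 'enable' as well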
https://code.engineering.redhat.com/gerrit/#/c/27241/1 Fixes the auto-delete problem.
Version : glusterfs 3.6.0.20 built on Jun 19 2014
========
Set snap-max-hard-limit and snap-max-soft-limit while glusterd was offline on one node. Rebooted the node; it is updated with the config values. Also checked auto-delete with similar steps; it works as expected.

Marking the bug 'Verified'!
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHEA-2014-1278.html
*** Bug 1058821 has been marked as a duplicate of this bug. ***