Bug 1807431

Summary: Setting cluster.heal-timeout requires volume restart
Product: [Community] GlusterFS
Component: selfheal
Version: 5
Hardware: x86_64
OS: Linux
Severity: low
Priority: unspecified
Status: CLOSED NEXTRELEASE
Keywords: Triaged
Reporter: Ravishankar N <ravishankar>
Assignee: Ravishankar N <ravishankar>
CC: bugs, glenk1973, ravishankar
Clone Of: 1744548
Bug Depends On: 1743988, 1744548
Bug Blocks: 1747301
Last Closed: 2020-02-28 08:55:21 UTC

Description Ravishankar N 2020-02-26 10:47:38 UTC
+++ This bug was initially created as a clone of Bug #1744548 +++

+++ This bug was initially created as a clone of Bug #1743988 +++

Description of problem:
Setting the `cluster.heal-timeout` option requires a volume restart to take effect.

Version-Release number of selected component (if applicable):
6.5

How reproducible:
Every time

Steps to Reproduce:
1. Provision a 3-peer replica volume (I used three docker containers).
2. Set `cluster.favorite-child-policy` to `mtime`.
3. Mount the volume on one of the containers (say `gluster-0`, serving as a server and a client).
4. Stop the self-heal daemon.
5. Set `cluster.entry-self-heal`, `cluster.data-self-heal` and `cluster.metadata-self-heal` to off.
6. Set `cluster.quorum-type` to none.
7. Write "first write" to file `test.txt` on the mounted volume.
8. Kill the brick process on `gluster-2`.
9. Write "second write" to `test.txt`.
10. Force start the volume (`gluster volume start <volume> force`).
11. Kill the brick processes on `gluster-0` and `gluster-1`.
12. Write "third write" to `test.txt`.
13. Force start the volume.
14. Verify that "split-brain" appears in the output of the `gluster volume heal <volume> info` command.
15. Set `cluster.heal-timeout` to `60`.
16. Start the self-heal daemon.
17. Issue the `gluster volume heal <volume> info` command after 70 seconds.
18. Verify that the output at step 17 does not contain "split-brain".
19. Verify that the content of `test.txt` is "third write". 
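
For reference, the steps above map roughly onto the following CLI sequence. This is only a sketch: the volume name (`testvol`), brick paths and mount point are placeholders chosen for illustration, not the actual test environment.

#gluster volume create testvol replica 3 gluster-0:/bricks/b1 gluster-1:/bricks/b1 gluster-2:/bricks/b1
#gluster volume start testvol
#gluster volume set testvol cluster.favorite-child-policy mtime
#mount -t glusterfs gluster-0:/testvol /mnt/testvol               # on gluster-0
#gluster volume set testvol cluster.self-heal-daemon off          # step 4
#gluster volume set testvol cluster.entry-self-heal off
#gluster volume set testvol cluster.data-self-heal off
#gluster volume set testvol cluster.metadata-self-heal off
#gluster volume set testvol cluster.quorum-type none
# ... writes to /mnt/testvol/test.txt and brick kills as in steps 7-13 ...
#gluster volume heal testvol info                                 # step 14: reports "split-brain"
#gluster volume set testvol cluster.heal-timeout 60               # step 15
#gluster volume set testvol cluster.self-heal-daemon on           # step 16
#sleep 70 && gluster volume heal testvol info                     # step 17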

Actual results:
The output at step 17 contains "split-brain".

Expected results:
The output at step 17 should _not_ contain "split-brain".


Additional info:
According to what Ravishankar N said on Slack (https://gluster.slack.com/archives/CH9M2KF60/p1566346818102000), changing volume options such as `cluster.heal-timeout` should not require a process restart. If I add a `gluster volume start <volume> force` command immediately after step 16 above, then I get the Expected results.
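
For completeness, the workaround mentioned above looks roughly like this (volume name is a placeholder; as noted later in this bug, the force start restarts the processes, which is why the new timeout gets picked up):

#gluster volume set testvol cluster.heal-timeout 60
#gluster volume set testvol cluster.self-heal-daemon on
#gluster volume start testvol force        # extra step: without this, the old timeout stays in effect
#sleep 70 && gluster volume heal testvol info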

--- Additional comment from Glen K on 2019-08-21 06:04:23 UTC ---

I should add that `cluster.quorum-type` is set to `none` for the test.

--- Additional comment from Ravishankar N on 2019-08-21 09:56:54 UTC ---

Okay, so after some investigation, I don't think this is an issue. When you change the heal-timeout, it does get propagated to the self-heal daemon. But since the default value is 600 seconds, the threads that do the heal only wake up after that time. Once they wake up, subsequent runs do seem to honour the new heal-timeout value.

On a glusterfs 6.5 setup:
#gluster v create testvol replica 2 127.0.0.2:/home/ravi/bricks/brick{1..2} force
#gluster v set testvol client-log-level DEBUG
#gluster v start testvol
#gluster v set testvol heal-timeout 5
#tail -f /var/log/glusterfs/glustershd.log|grep finished
You don't see anything in the log yet about the crawls.
But once you manually launch heal, the threads are woken up and further crawls happen every 5 seconds.
#gluster v heal testvol

Now in glustershd.log:
[2019-08-21 09:55:02.024160] D [MSGID: 0] [afr-self-heald.c:843:afr_shd_index_healer] 0-testvol-replicate-0: finished index sweep on subvol testvol-client-0. 
[2019-08-21 09:55:02.024271] D [MSGID: 0] [afr-self-heald.c:843:afr_shd_index_healer] 0-testvol-replicate-0: finished index sweep on subvol testvol-client-1.
[2019-08-21 09:55:08.023252] D [MSGID: 0] [afr-self-heald.c:843:afr_shd_index_healer] 0-testvol-replicate-0: finished index sweep on subvol testvol-client-1.
[2019-08-21 09:55:08.023358] D [MSGID: 0] [afr-self-heald.c:843:afr_shd_index_healer] 0-testvol-replicate-0: finished index sweep on subvol testvol-client-0.
[2019-08-21 09:55:14.024438] D [MSGID: 0] [afr-self-heald.c:843:afr_shd_index_healer] 0-testvol-replicate-0: finished index sweep on subvol testvol-client-1.
[2019-08-21 09:55:14.024546] D [MSGID: 0] [afr-self-heald.c:843:afr_shd_index_healer] 0-testvol-replicate-0: finished index sweep on subvol testvol-client-0.

Glen, could you check if that works for you? i.e. after setting the heal-timeout, manually launch heal via `gluster v heal testvol`.

--- Additional comment from Glen K on 2019-08-21 18:15:39 UTC ---

In my steps above, I set the heal-timeout while the self-heal daemon is stopped:

...
4. Stop the self-heal daemon.
...
15. Set `cluster.heal-timeout` to `60`.
16. Start the self-heal daemon.
...

I would expect that the configuration would certainly take effect after a restart of the self-heal daemon.

Yes, launching heal manually causes the heal to happen right away, but the purpose of the test is to verify the heal happens automatically. From a user perspective, the current behaviour of the heal-timeout setting appears to be at odds with the "configuration changes take effect without restart" feature; I think it is reasonable to request that changing the heal-timeout setting results in the thread sleeps being reset to the new setting.

--- Additional comment from Ravishankar N on 2019-08-22 07:11:53 UTC ---

(In reply to Glen K from comment #3)
> 
> I would expect that the configuration would certainly take effect after a
> restart of the self-heal daemon.

In steps 4 and 16, I assume you toggled `cluster.self-heal-daemon` off and on respectively. This does not actually kill the shd process per se; it just disables/enables the heal crawls. In 6.5, a volume start force does restart shd, so changing the order of the steps should do the trick, i.e.

13. Set `cluster.heal-timeout` to `60`.
14. Force start the volume.
15. Verify that "split-brain" appears in the output of `gluster volume heal <volume> info` command.


> Yes, launching heal manually causes the heal to happen right away, but the
> purpose of the test is to verify the heal happens automatically. From a user
> perspective, the current behaviour of the heal-timeout setting appears to be
> at odds with the "configuration changes take effect without restart"
> feature; I think it is reasonable to request that changing the heal-timeout
> setting results in the thread sleeps being reset to the new setting.

Fair enough, I'll attempt a fix on master, let us see how the review goes.
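
If the fix works out, the expectation (my reading of it, not a description of the eventual patch) is that the 6.5 experiment above works without the manual heal launch:

#gluster v set testvol cluster.heal-timeout 5
#tail -f /var/log/glusterfs/glustershd.log|grep finished

With the fix, the "finished index sweep" messages should start appearing roughly every 5 seconds on their own, without a `gluster v heal testvol` or a volume restart.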

--- Additional comment from Worker Ant on 2019-08-22 12:15:15 UTC ---

REVIEW: https://review.gluster.org/23288 (afr: wake up index healer threads) posted (#1) for review on master by Ravishankar N

--- Additional comment from Worker Ant on 2019-08-30 04:25:40 UTC ---

REVIEW: https://review.gluster.org/23288 (afr: wake up index healer threads) merged (#4) on master by Ravishankar N

Comment 1 Worker Ant 2020-02-26 10:51:32 UTC
REVIEW: https://review.gluster.org/24177 (afr: wake up index healer threads) posted (#1) for review on release-5 by Ravishankar N

Comment 2 Worker Ant 2020-02-28 08:55:21 UTC
REVIEW: https://review.gluster.org/24177 (afr: wake up index healer threads) merged (#2) on release-5 by Ravishankar N