Bug 1764091

Summary: Setting cluster.heal-timeout requires volume restart
Product: [Red Hat Storage] Red Hat Gluster Storage
Component: replicate
Version: rhgs-3.5
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: low
Priority: unspecified
Keywords: Triaged
Reporter: Ravishankar N <ravishankar>
Assignee: Ravishankar N <ravishankar>
QA Contact: Arthy Loganathan <aloganat>
CC: pasik, pprakash, puebele, ravishankar, rhs-bugs, rkothiya, sheggodu, storage-qa-internal
Target Milestone: ---
Target Release: RHGS 3.5.z Batch Update 3
Fixed In Version: glusterfs-6.0-38
Doc Type: No Doc Update
Clone Of: 1744548
Last Closed: 2020-12-17 04:50:17 UTC

Description Ravishankar N 2019-10-22 08:52:02 UTC
+++ This bug was initially created as a clone of Bug #1744548 +++

+++ This bug was initially created as a clone of Bug #1743988 +++

Description of problem:
Setting `cluster.heal-timeout` requires a volume restart for the new value to take effect.

Version-Release number of selected component (if applicable):
6.5

How reproducible:
Every time

Steps to Reproduce (scripted as a sketch after the list):
1. Provision a 3-peer replica volume (I used three docker containers).
2. Set `cluster.favorite-child-policy` to `mtime`.
3. Mount the volume on one of the containers (say `gluster-0`, serving as a server and a client).
4. Stop the self-heal daemon.
5. Set `cluster.entry-self-heal`, `cluster.data-self-heal` and `cluster.metadata-self-heal` to off.
6. Set `cluster.quorum-type` to none.
7. Write "first write" to file `test.txt` on the mounted volume.
8. Kill the brick process on `gluster-2`.
9. Write "second write" to `test.txt`.
10. Force start the volume (`gluster volume start <volume> force`)
11. Kill the brick processes on `gluster-0` and `gluster-1`.
12. Write "third write" to `test.txt`.
13. Force start the volume.
14. Verify that "split-brain" appears in the output of `gluster volume heal <volume> info` command.
15. Set `cluster.heal-timeout` to `60`.
16. Start the self-heal daemon.
17. Issue the `gluster volume heal <volume> info` command after 70 seconds.
18. Verify that the output at step 17 does not contain "split-brain".
19. Verify that the content of `test.txt` is "third write". 
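
A scripted sketch of these steps, assuming volume name testvol, brick path /bricks/brick1 on each container, a FUSE mount at /mnt on gluster-0, and that "stop/start the self-heal daemon" means toggling `cluster.self-heal-daemon` (a later comment from Ravishankar makes the same assumption):

# steps 1-6: create, configure, and mount the volume (run on gluster-0)
gluster peer probe gluster-1
gluster peer probe gluster-2
gluster volume create testvol replica 3 gluster-{0,1,2}:/bricks/brick1 force
gluster volume start testvol
gluster volume set testvol cluster.favorite-child-policy mtime
mount -t glusterfs gluster-0:/testvol /mnt
gluster volume set testvol cluster.self-heal-daemon off
gluster volume set testvol cluster.entry-self-heal off
gluster volume set testvol cluster.data-self-heal off
gluster volume set testvol cluster.metadata-self-heal off
gluster volume set testvol cluster.quorum-type none

# steps 7-14: create a split-brain on test.txt
echo "first write" > /mnt/test.txt
# kill the brick process on gluster-2 (PID from 'gluster volume status testvol'), then:
echo "second write" > /mnt/test.txt
gluster volume start testvol force
# kill the brick processes on gluster-0 and gluster-1, then:
echo "third write" > /mnt/test.txt
gluster volume start testvol force
gluster volume heal testvol info        # reports "split-brain"

# steps 15-19: set heal-timeout to 60 seconds and re-enable the self-heal daemon
gluster volume set testvol cluster.heal-timeout 60
gluster volume set testvol cluster.self-heal-daemon on
sleep 70
gluster volume heal testvol info        # expected: no "split-brain"
cat /mnt/test.txt                       # expected: "third write"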

Actual results:
The output at step 17 contains "split-brain".

Expected results:
The output at step 17 should _not_ contain "split-brain".


Additional info:
According to what Ravishankar N said on Slack (https://gluster.slack.com/archives/CH9M2KF60/p1566346818102000), changing volume options such as `cluster.heal-timeout` should not require a process restart. If I add a `gluster volume start <volume> force` command immediately after step 16 above, then I get the Expected results.
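
For illustration, the tail of the run with the workaround added (same assumed names as the sketch under the steps above):

gluster volume set testvol cluster.heal-timeout 60
gluster volume set testvol cluster.self-heal-daemon on
gluster volume start testvol force      # workaround: with this forced start, the new timeout takes effect
sleep 70
gluster volume heal testvol info        # no longer reports "split-brain"
cat /mnt/test.txt                       # "third write"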

--- Additional comment from Glen K on 2019-08-21 06:04:23 UTC ---

I should add that `cluster.quorum-type` is set to `none` for the test.

--- Additional comment from Ravishankar N on 2019-08-21 09:56:54 UTC ---

Okay, so after some investigation, I don't think this is an issue. When you change the heal-timeout, it does get propagated to the self-heal daemon. But since the default value is 600 seconds, the threads that do the heal only wake up after that time. Once they wake up, subsequent runs do seem to honour the new heal-timeout value.

On a glusterfs 6.5 setup:
#gluster v create testvol replica 2 127.0.0.2:/home/ravi/bricks/brick{1..2} force
#gluster v set testvol client-log-level DEBUG
#gluster v start testvol
#gluster v set testvol heal-timeout 5
#tail -f /var/log/glusterfs/glustershd.log|grep finished
You don't see anything in the log yet about the crawls.
But once you manually launch heal, the threads are woken up and further crawls happen every 5 seconds.
#gluster v heal testvol

Now in glustershd.log:
[2019-08-21 09:55:02.024160] D [MSGID: 0] [afr-self-heald.c:843:afr_shd_index_healer] 0-testvol-replicate-0: finished index sweep on subvol testvol-client-0. 
[2019-08-21 09:55:02.024271] D [MSGID: 0] [afr-self-heald.c:843:afr_shd_index_healer] 0-testvol-replicate-0: finished index sweep on subvol testvol-client-1.
[2019-08-21 09:55:08.023252] D [MSGID: 0] [afr-self-heald.c:843:afr_shd_index_healer] 0-testvol-replicate-0: finished index sweep on subvol testvol-client-1.
[2019-08-21 09:55:08.023358] D [MSGID: 0] [afr-self-heald.c:843:afr_shd_index_healer] 0-testvol-replicate-0: finished index sweep on subvol testvol-client-0.
[2019-08-21 09:55:14.024438] D [MSGID: 0] [afr-self-heald.c:843:afr_shd_index_healer] 0-testvol-replicate-0: finished index sweep on subvol testvol-client-1.
[2019-08-21 09:55:14.024546] D [MSGID: 0] [afr-self-heald.c:843:afr_shd_index_healer] 0-testvol-replicate-0: finished index sweep on subvol testvol-client-0.

Glen, could you check if that works for you? i.e. after setting the heal-timeout, manually launch heal via `gluster v heal testvol`.

--- Additional comment from Glen K on 2019-08-21 18:15:39 UTC ---

In my steps above, I set the heal-timeout while the self-heal daemon is stopped:

...
4. Stop the self-heal daemon.
...
15. Set `cluster.heal-timeout` to `60`.
16. Start the self-heal daemon.
...

I would expect that the configuration would certainly take effect after a restart of the self-heal daemon.

Yes, launching heal manually causes the heal to happen right away, but the purpose of the test is to verify the heal happens automatically. From a user perspective, the current behaviour of the heal-timeout setting appears to be at odds with the "configuration changes take effect without restart" feature; I think it is reasonable to request that changing the heal-timeout setting results in the thread sleeps being reset to the new setting.

--- Additional comment from Ravishankar N on 2019-08-22 07:11:53 UTC ---

(In reply to Glen K from comment #3)
> 
> I would expect that the configuration would certainly take effect after a
> restart of the self-heal daemon.

In steps 4 and 16, I assume you toggled `cluster.self-heal-daemon` off and on respectively. This does not actually kill the shd process; it only disables/enables the heal crawls (see the sketch at the end of this comment). In 6.5, a volume start force does restart shd, so changing the order of the steps should do the trick, i.e.

13. Set `cluster.heal-timeout` to `60`.
14. Force start the volume.
15. Verify that "split-brain" appears in the output of `gluster volume heal <volume> info` command.


> Yes, launching heal manually causes the heal to happen right away, but the
> purpose of the test is to verify the heal happens automatically. From a user
> perspective, the current behaviour of the heal-timeout setting appears to be
> at odds with the "configuration changes take effect without restart"
> feature; I think it is reasonable to request that changing the heal-timeout
> setting results in the thread sleeps being reset to the new setting.

Fair enough, I'll attempt a fix on master; let us see how the review goes.
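
A way to see the shd behaviour described above on a 6.5 setup (a sketch, reusing the assumed testvol) is to watch the "Self-heal Daemon" line of `gluster volume status`:

gluster volume status testvol           # note the "Self-heal Daemon" PID
gluster volume set testvol cluster.self-heal-daemon off
gluster volume set testvol cluster.self-heal-daemon on
gluster volume status testvol           # PID unchanged: the shd process was not killed, only the crawls were toggled
gluster volume start testvol force      # per the above, this does restart shd on 6.5
gluster volume status testvol           # shd comes back having re-read cluster.heal-timeout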

--- Additional comment from Worker Ant on 2019-08-22 12:15:15 UTC ---

REVIEW: https://review.gluster.org/23288 (afr: wake up index healer threads) posted (#1) for review on master by Ravishankar N

--- Additional comment from Worker Ant on 2019-08-30 04:25:40 UTC ---

REVIEW: https://review.gluster.org/23288 (afr: wake up index healer threads) merged (#4) on master by Ravishankar N

Comment 7 Arthy Loganathan 2020-09-22 05:44:53 UTC
Followed the steps mentioned in the bug; the output of the `gluster volume heal <volume> info` command, run after the number of seconds configured via cluster.heal-timeout, does not contain "split-brain", as expected.

Verified the fix in:
glusterfs-server-6.0-45.el7rhgs.x86_64

Comment 9 errata-xmlrpc 2020-12-17 04:50:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (glusterfs bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:5603