Description of problem:
I found the following two bugs:
1) Ping-latency numbers are not reaching afr because of a wrong call in default_notify, which I missed during the review.
2) During the mount, I observed that the brick with the lowest ping latency within halo-max-latency was shown as down. I found a bug in the child-up event handling, which I also fixed.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
REVIEW: https://review.gluster.org/19883 (cluster/afr: Make sure latency-arg is passed to afr) posted (#1) for review on master by Pranith Kumar Karampuri
REVIEW: https://review.gluster.org/19884 (cluster/afr: Keep child-up until ping-event) posted (#1) for review on master by Pranith Kumar Karampuri
COMMIT: https://review.gluster.org/19883 committed in master by "Pranith Kumar Karampuri" <pkarampu> with a commit message-

cluster/afr: Make sure latency-arg is passed to afr

xlator_notify doesn't pass on the extra arguments that the calling function receives, so the XLATOR_NOTIFY macro should be used instead to pass the extra arguments to the function.

BUG: 1567881
fixes bz#1567881
Change-Id: Ic15b6c446638cbacf3149693147a754219037c47
Signed-off-by: Pranith Kumar K <pkarampu>
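The general C pitfall behind this fix can be sketched as follows. This is a minimal illustration, not the GlusterFS code: parent_notify, forward_fixed, forward_with_extra and seen_latency are hypothetical names, and the sketch assumes the extra argument is a pointer to a 64-bit latency value. It only shows that a variadic function has to read its extra argument and hand it on explicitly, otherwise the callee never sees it.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of what the parent handler received (-1 = nothing). */
static int64_t seen_latency = -1;

/* Hypothetical parent handler: expects one extra pointer argument. */
static int parent_notify(int event, void *data, ...)
{
    va_list ap;
    va_start(ap, data);
    void *extra = va_arg(ap, void *);
    va_end(ap);

    (void)event;
    seen_latency = extra ? *(int64_t *)extra : -1;
    return 0;
}

/* Forwards only the fixed arguments, so the caller's extra argument is lost. */
static int forward_fixed(int event, void *data, ...)
{
    return parent_notify(event, data, (void *)NULL);
}

/* Reads the extra argument and passes it on explicitly. */
static int forward_with_extra(int event, void *data, ...)
{
    va_list ap;
    va_start(ap, data);
    void *extra = va_arg(ap, void *);
    va_end(ap);

    return parent_notify(event, data, extra);
}
```

In this sketch, calling forward_fixed with a latency pointer leaves seen_latency at -1, while forward_with_extra delivers the value to the parent handler.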
COMMIT: https://review.gluster.org/19884 committed in master by "Pranith Kumar Karampuri" <pkarampu> with a commit message-

cluster/afr: Keep child-up until ping-event

Problem:
Suppose we have 2 bricks, brick-A and brick-B, with brick-A within halo-max-latency and brick-B above halo-max-latency, and both halo-min-replicas and halo-max-replicas are set to '1'. In this case, brick-A comes online and its ping-latency is then updated. When brick-B comes online, we have 2 up-bricks, so the code tries to find the brick with the worst latency to mark it down. Since brick-B had just come online, it always had '0' latency, so brick-A used to be marked offline and brick-B would eventually be the one left online even though brick-A is better suited.

Fix:
Consider the latency of a just-up child as HALO_MAX_LATENCY so that, until its ping-latency is known, the just-up brick is the one found as the worst child. Also keep ping-latency as -1 until child-up during initialization.

BUG: 1567881
fixes bz#1567881
Change-Id: I148262fe505468190f0eb99225d0f6d57cdb6f04
Signed-off-by: Pranith Kumar K <pkarampu>
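The selection logic described in this commit message can be sketched roughly as below. This is an illustrative model under stated assumptions, not the afr implementation: struct child, child_init, child_up, child_ping and find_worst_up_child are hypothetical names, and the value of SKETCH_HALO_MAX_LATENCY is assumed for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed sentinel for "latency not measured yet"; the value is illustrative. */
#define SKETCH_HALO_MAX_LATENCY 99999

/* Hypothetical per-child state. */
struct child {
    int up;          /* 1 once a child-up event has been seen */
    int64_t latency; /* -1 before child-up, sentinel until the first ping */
};

/* Initialization: the child is down and its latency is unknown. */
static void child_init(struct child *c)
{
    c->up = 0;
    c->latency = -1;
}

/* Child-up: treat the unmeasured latency as the worst possible one. */
static void child_up(struct child *c)
{
    c->up = 1;
    c->latency = SKETCH_HALO_MAX_LATENCY;
}

/* Ping event: record the measured latency. */
static void child_ping(struct child *c, int64_t latency)
{
    c->latency = latency;
}

/* Return the index of the up child with the highest latency, or -1 if none. */
static int find_worst_up_child(const struct child *children, size_t n)
{
    int worst = -1;
    int64_t worst_latency = -1;

    for (size_t i = 0; i < n; i++) {
        if (!children[i].up)
            continue;
        if (children[i].latency > worst_latency) {
            worst = (int)i;
            worst_latency = children[i].latency;
        }
    }
    return worst;
}
```

With this model, a brick that has just come up and has no ping result yet is picked as the worst child, so an already measured brick within the latency limit is not marked down in its place.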
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-v4.1.0, please open a new bug report.

glusterfs-v4.1.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2018-June/000102.html
[2] https://www.gluster.org/pipermail/gluster-users/