Description of problem:
In automated testing, volfile changes happen often, and in one such situation IO is issued using a volfile that does not yet reflect that a brick has gone down.
The sequence of events is:
1) The brick is stopped.
2) The volfile changes are made.
3) The clients are notified.
Between step 1 and step 3, IOs are still being sent to the brick
that has been stopped, resulting in IO failures.
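The reordering implied by the patch title below ("stop brick after volfile creation") can be sketched as pseudocode. The function names here are illustrative only, not actual glusterd symbols:

```
# Racy ordering (before the fix):
stop_brick(brick)          # clients may still route IO to this brick
generate_volfile(volume)
notify_clients(volume)     # only now do clients learn the brick is gone

# Fixed ordering (per the patch title):
generate_volfile(volume)   # new volfile no longer references the brick
notify_clients(volume)
stop_brick(brick)          # brick goes down after clients can switch graphs
```

This does not eliminate every timing hazard, but it closes the window in which clients hold a stale volfile while the brick is already down.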
Version-Release number of selected component (if applicable):
3.12 and below
Being a race condition, the failure is hard to reproduce unless IO
happens in that window.
However, with tiered volumes in the automated glusto tests, the chances of hitting it are higher.
Steps to Reproduce:
1. Run the glusto test for basic tier sanity.
2. The glusto test fails with an IO error.

Actual results:
An IO error happens.

Expected results:
No IO error should happen; IOs should be redirected to the new subvolume.
REVIEW: https://review.gluster.org/17757 (Glusterd: stop brick after volfile creation) posted (#1) for review on master by hari gowtham (firstname.lastname@example.org)
REVIEW: https://review.gluster.org/17757 (Glusterd: stop brick after volfile creation) posted (#2) for review on master by hari gowtham (email@example.com)
Patch https://review.gluster.org/#/c/glusterfs/+/21331/ removes tier
functionality from GlusterFS.
https://bugzilla.redhat.com/show_bug.cgi?id=1642807 is used as the tracking bug
for this. The recommendation is to convert your tier volume to a regular volume
(either replicate, ec, or plain distribute) with the "tier detach" command before
upgrading, and to use backend features such as dm-cache to provide the caching
for better performance and functionality.
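The detach recommendation above can be carried out with the tier detach subcommands. This is a sketch of the usual sequence; VOLNAME is a placeholder, and the exact syntax should be checked against the documentation for your GlusterFS version:

```sh
# Start migrating data off the hot tier back to the cold tier
gluster volume tier VOLNAME detach start

# Poll until the migration reports completed
gluster volume tier VOLNAME detach status

# Once migration is complete, remove the hot tier,
# leaving a regular (replicate/ec/distribute) volume
gluster volume tier VOLNAME detach commit
```

The commit step should only be run after status shows the migration has finished, otherwise data still on the hot tier can be lost.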
With GD2 in development, this fix became obsolete, and since tier is no longer supported, it is not necessary. Hence closing the bug.
This bug is moved to https://github.com/gluster/glusterfs/issues/1115, and will be tracked there from now on. Visit GitHub issues URL for further details