Red Hat Bugzilla – Bug 1470137
brick unavailable with the old volfile.
Last modified: 2017-07-13 02:51:56 EDT
Description of problem:
In automated testing, volfile changes happen often. In one such situation, IO is sent using the old volfile, which does not know that the brick has gone down.
The sequence in which this currently happens is:
1) The brick is stopped.
2) Volfile changes are made.
3) Clients are notified.
Between step 1 and step 3, IOs are still sent to the brick
that has been stopped, resulting in IO failures.
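The window described above can be illustrated with a small toy model. This is only a sketch and not glusterd code: the `Cluster` class, `remove_brick`, and the brick names are hypothetical, and the reordered case (brick stopped last) is an assumption based on the title of the review posted for this bug.

```python
# Toy model of the race. All names here are hypothetical, not glusterd APIs.

class Cluster:
    def __init__(self, bricks):
        self.running = set(bricks)          # bricks that are actually up
        self.client_volfile = list(bricks)  # bricks the client believes it can use

    def stop_brick(self, brick):
        self.running.discard(brick)

    def notify_clients(self, new_volfile):
        self.client_volfile = list(new_volfile)

    def do_io(self):
        """Send one IO to each brick in the client's volfile; return the failures."""
        return [b for b in self.client_volfile if b not in self.running]


def remove_brick(cluster, brick, stop_first):
    """Run the two steps in the chosen order, issuing IO after each step."""
    new_volfile = [b for b in cluster.client_volfile if b != brick]
    steps = [
        lambda: cluster.stop_brick(brick),            # brick is stopped
        lambda: cluster.notify_clients(new_volfile),  # clients get the new volfile
    ]
    if not stop_first:
        steps.reverse()  # assumed reordering: notify clients, then stop the brick
    failures = []
    for step in steps:
        step()
        failures += cluster.do_io()
    return failures


# Stopping the brick first leaves a window where IO still targets it.
print(remove_brick(Cluster(["b1", "b2"]), "b2", stop_first=True))
# Stopping the brick last means no IO reaches a stopped brick in this model.
print(remove_brick(Cluster(["b1", "b2"]), "b2", stop_first=False))
```

In this model the failure only appears when an IO lands between the stop and the client notification, which matches why the problem is timing dependent.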
Version-Release number of selected component (if applicable):
3.12 and below
Because this is a race condition, the failure is hard to reproduce
unless IO happens in that window.
However, with tiered volumes in the automated glusto tests, the chances of this happening are higher.
Steps to Reproduce:
1. Run the glusto test for basic tier sanity.
2. The glusto test fails, reporting that it got an IO error.
Actual results:
An IO error occurs.
Expected results:
No IO error should occur. IOs should be redirected to the new subvolume.
REVIEW: https://review.gluster.org/17757 (Glusterd: stop brick after volfile creation) posted (#1) for review on master by hari gowtham (firstname.lastname@example.org)
REVIEW: https://review.gluster.org/17757 (Glusterd: stop brick after volfile creation) posted (#2) for review on master by hari gowtham (email@example.com)