Description of problem:
=======================
During rebalance, continuous "table not found" warning messages are seen in the rebalance logs. Also, no files were rebalanced and rebalance completed with failures.

Version-Release number of selected component (if applicable):
3.8.4-2.26.git0a405a4.el7rhgs.x86_64

How reproducible:
1/1

Steps to Reproduce:
===================
1) Create a distributed-replicate volume and start it.
2) Enable the settings required for md-cache on the volume (see the gluster volume info output for the enabled md-cache settings).
3) FUSE mount the volume on multiple clients.
4) Perform the tasks below simultaneously from multiple clients:
   a) From client-1, create files: for i in {1..20000};do touch f$i;done
   b) From client-2, create hard links for the created files: for i in {1..20000};do ln f$i fl$i;done
   c) From client-3, change the permissions of the created files: for i in {1..20000};do ln f$i fl$i;done
   d) From client-4, do a continuous lookup.
5) While the tasks in step 4 are in progress, add a few bricks to the volume and start rebalance. Wait until steps 4 and 5 complete.

Actual results:
===============
During rebalance, continuous "table not found" warning messages are seen in the rebalance logs. Also, no files were rebalanced and rebalance completed with failures.

Expected results:
=================
There should not be any "table not found" warning messages in the rebalance logs, and files should be rebalanced without any failures or issues.
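Step 5 can be sketched as the small shell helper below, which only prints the gluster commands instead of running them. The volume name and brick paths passed to it are hypothetical placeholders, not values from this report, and for a distributed-replicate volume the number of added bricks has to be a multiple of the replica count.

```shell
# Print (do not run) the commands for step 5: expand the volume, then rebalance.
# Usage: step5_commands VOLNAME BRICK...
# VOLNAME and the brick paths are placeholders, not values from this report.
step5_commands() (
    vol=$1
    shift
    echo "gluster volume add-brick $vol $*"
    echo "gluster volume rebalance $vol start"
    echo "gluster volume rebalance $vol status"
)
```

For example, step5_commands testvol server1:/bricks/b5 server2:/bricks/b5 prints three lines that can be reviewed and then run by hand on one of the storage nodes.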
Correction to Description -> Steps to Reproduce -> 4 -> c: the command that was used for changing the file permissions is
for i in {1..20000};do chmod 660 f$i;done

glusterfs version: 3.8.4-2.el7.x86_64

Repeated the same steps on another setup without md-cache. During rebalance, I didn't see any "table not found" warning messages in the rebalance logs, but no files were rebalanced and rebalance completed with failures.
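With that correction, the per-client workloads from step 4 can be sketched as the shell functions below. This is a minimal sketch, not the exact script used in the test: the function names and the DIR/COUNT parameters are made up for illustration, and unlike the original one-liners the link and chmod loops ignore errors, since a file may not exist yet when the clients run concurrently.

```shell
# Per-client workloads from step 4, using the corrected chmod command.
# Usage: <function> DIR COUNT   (DIR stands for the FUSE mount point)

# client-1: create the files f1..fCOUNT
create_files() (
    i=1
    while [ "$i" -le "$2" ]; do
        touch "$1/f$i"
        i=$((i + 1))
    done
)

# client-2: create a hard link fl$i for each f$i
link_files() (
    i=1
    while [ "$i" -le "$2" ]; do
        # f$i may not exist yet when clients run concurrently; keep going.
        ln "$1/f$i" "$1/fl$i" 2>/dev/null || true
        i=$((i + 1))
    done
)

# client-3: change the permissions of each f$i
chmod_files() (
    i=1
    while [ "$i" -le "$2" ]; do
        chmod 660 "$1/f$i" 2>/dev/null || true
        i=$((i + 1))
    done
)
```

Each client would call one function against its own mount of the volume, for example create_files /mnt/testvol 20000 on client-1, where /mnt/testvol is a placeholder mount point.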
RCA: The upcall notification is sent to the rebalance process as well, hence these log messages.

Fix: The rebalance process doesn't load md-cache, so upcall notifications do not need to be sent to it. One way is to exclude the rebalance client from the set of clients that receive notifications. Will discuss this with the upcall and rebalance maintainers and arrive at a conclusion.
Executed a few rebalance tests with 3.8.4-2.el7.x86_64 and did not see these "table not found" warning messages. We are seeing this issue only with the build loaded with md-cache.
The log messages are fixed; the fix should be part of the build provided on 7/9/2016. Moving this to ON_QA.
Upstream patch: http://review.gluster.org/#/c/15398/
Downstream patch: https://code.engineering.redhat.com/gerrit/#/c/87047/

It is fixed in version 3.8.4-3.
Verified the fix against glusterfs version 3.8.4-5.el7rhgs.x86_64. "table not found" warning messages are no longer seen in the rebalance logs during rebalance. Hence, moving this BZ to Verified.
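A check of this kind can be sketched with grep. The helper name below is made up for illustration, and the log location in the usage note is an assumption about the default layout rather than something stated in this report.

```shell
# Count "table not found" lines in a rebalance log.
# Usage: count_table_not_found LOGFILE
count_table_not_found() {
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    grep -c "table not found" "$1" || true
}
```

Assuming the rebalance log is at /var/log/glusterfs/VOLNAME-rebalance.log, running count_table_not_found on that file after rebalance completes should print 0 on a fixed build.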
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2017-0486.html