Description of problem:
When both data bricks are up, writes proceed at optimal speed; after a data brick is killed, writes slow down drastically.

Version-Release number of selected component (if applicable):
Gluster version: 3.8.4-9

How reproducible:
100%

Logs and volume profiles are placed at rhsqe-repo.lab.eng.blr.redhat.com:/var/www/html/sosreports/<bug>

Steps to Reproduce:
1. For comparison, create a 1x(2+1) arbiter volume (a sketch of the setup follows the job file below).
2. Write 2 GB of data using fio with the command below:
   fio /randomwritejob.ini --client=/clients.list
3. Kill a data brick and write the same data again with fio; writing the 2 GB now takes very long to complete.

Expected results:
There should be no difference in writing the same data in either scenario.

Additional info:
[root@dhcp46-206 /]# vim /randomwritejob.ini
[root@dhcp46-206 /]# cat /randomwritejob.ini
[global]
rw=randrw
io_size=1g
fsync_on_close=1
size=1g
bs=64k
rwmixread=20
openfiles=1
startdelay=0
ioengine=sync
verify=md5

[write]
directory=/mnt/samsung
nrfiles=1
filename_format=f.$jobnum.$filenum
numjobs=2
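A minimal sketch of steps 1 and 3, assuming hypothetical hostnames (server1-server3), hypothetical brick paths (/bricks/brick1), and the volume name samsung taken from the mount point in the job file; the actual layout of the test setup may differ:

# Step 1: create and mount a 1x(2+1) arbiter volume (the first two bricks hold data, the third is the arbiter)
gluster volume create samsung replica 3 arbiter 1 \
    server1:/bricks/brick1 server2:/bricks/brick1 server3:/bricks/brick1
gluster volume start samsung
mount -t glusterfs server1:/samsung /mnt/samsung

# Step 3: kill one data brick process before rerunning fio
gluster volume status samsung        # note the PID of a data brick (server1 or server2)
kill -9 <data-brick-pid>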
RCA: afr_replies_interpret() used the 'readable' matrix to trigger client-side heals after inode refresh, but for arbiter volumes 'readable' is always zero. So when dd is run with a data brick down, spurious data heals are triggered repeatedly. These heals open an fd, causing eager lock to be disabled (open fd count > 1) in AFR transactions, leading to extra LOCK + FXATTROP operations and lowering the throughput.
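A hedged way to observe this from the client side while fio runs with the data brick down (volume name samsung assumed as in the sketch above): repeated heal-info queries should keep showing entries needing data heal, and the volume profile should show a noticeably larger share of lock (FINODELK) and FXATTROP FOPs than the baseline run with both data bricks up.

gluster volume heal samsung info          # spurious data heals show up here repeatedly
gluster volume profile samsung start
# ... run the fio workload with the data brick down ...
gluster volume profile samsung info       # compare FINODELK and FXATTROP counts against the baseline run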
Upstream patch http://review.gluster.org/#/c/16277/
Downstream patch https://code.engineering.redhat.com/gerrit/#/c/93735
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2017-0486.html