Please provide a public description of the problem.
Description of problem:
------------------------
Testbed : 12*(4+2), 6 servers, 6 workload-generating clients.
Benchmark : 3.1.3 with io-threads enabled. 3.2 testing was done with io-threads enabled and mdcache parameters set.

It looks like we have regressed with 3.2 on large-file writes/random writes:

****************** Sequential Writes ******************
3.1.3 : 2838601.16 kB/sec
3.2   : 2506687.55 kB/sec
Regression : ~12%

****************** Random Writes ******************
3.1.3 : 617384.17 kB/sec
3.2   : 480226.17 kB/sec
Regression : ~22%

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
glusterfs-3.8.4-10.el7rhgs.x86_64

How reproducible:
------------------
100%

Actual results:
----------------
Regressions on sequential and random large-file writes.

Expected results:
-----------------
Regression threshold is within +-10%.
REVIEW: http://review.gluster.org/16377 (cluster/ec: Do not start heal on good file while IO is going on) posted (#1) for review on master by Ashish Pandey (aspandey)
I just missed mentioning this info.

Possible RCA - After patch http://review.gluster.org/#/c/13733/, before writing on a file we set a dirty flag and at the end we remove this flag. This creates an index entry in .glusterfs/indices/xattrop/ which remains there throughout the write fop. Every 60 seconds shd will come up, scan this entry and start heal. Heal in turn takes a lot of locks to find and heal the file, which raises the inodelk fop count and could be a possible culprit.

I disabled the shd and wrote a file:

time dd if=/dev/urandom of=a1 count=1024 bs=1M conv=fdatasync

The profile shows only 4 INODELK calls:

Brick: apandey:/brick/gluster/testvol-6
---------------------------------------
Cumulative Stats:
   Block Size:              32768b+               65536b+
 No. of Reads:                    0                     0
No. of Writes:                 8188                     2

 %-latency   Avg-latency     Min-Latency     Max-Latency     No. of calls         Fop
 ---------   -----------     -----------     -----------     ------------        ----
      0.00         0.00 us         0.00 us         0.00 us              1     RELEASE
      0.00        47.00 us        47.00 us        47.00 us              1      STATFS
      0.00        49.50 us        46.00 us        53.00 us              2       FLUSH
      0.00        38.00 us        26.00 us        52.00 us              4     INODELK
      0.00        92.50 us        85.00 us       100.00 us              2     XATTROP
      0.00       305.00 us       305.00 us       305.00 us              1      CREATE
      0.00       138.00 us        32.00 us       395.00 us              4    FXATTROP
      0.00       164.14 us       119.00 us       212.00 us              7      LOOKUP
      0.92        72.73 us        43.00 us      8431.00 us           8190       WRITE
     99.08  64142355.00 us  64142355.00 us  64142355.00 us              1       FSYNC

With shd enabled it is around 54:

Brick: apandey:/brick/gluster/testvol-1
---------------------------------------
Cumulative Stats:
   Block Size:              32768b+               65536b+
 No. of Reads:                    0                     0
No. of Writes:                 8190                     1

 %-latency   Avg-latency     Min-Latency     Max-Latency     No. of calls         Fop
 ---------   -----------     -----------     -----------     ------------        ----
      0.00         0.00 us         0.00 us         0.00 us              7     RELEASE
      0.00         0.00 us         0.00 us         0.00 us             21  RELEASEDIR
      0.00        30.00 us        30.00 us        30.00 us              1      STATFS
      0.00         5.76 us         2.00 us         9.00 us             21     OPENDIR
      0.00        64.50 us        30.00 us        99.00 us              2       FLUSH
      0.00        23.17 us        20.00 us        27.00 us              6       FSTAT
      0.00        95.50 us        89.00 us       102.00 us              2     XATTROP
      0.00       272.00 us       272.00 us       272.00 us              1      CREATE
      0.00        61.67 us        42.00 us        85.00 us              6        OPEN
      0.00        98.94 us        31.00 us       428.00 us             16    FXATTROP
      0.00        79.92 us        22.00 us       190.00 us             38      LOOKUP
      0.12      2379.48 us      1376.00 us      4600.00 us             42     READDIR
      0.74        74.70 us        42.00 us     49556.00 us           8191       WRITE
     10.29    163490.19 us        19.00 us   1405941.00 us             52     INODELK
     19.02    320668.04 us        26.00 us  15705174.00 us             49    GETXATTR
     69.83  57700430.00 us  57700430.00 us  57700430.00 us              1       FSYNC
REVIEW: http://review.gluster.org/16377 (cluster/ec: Do not start heal on good file while IO is going on) posted (#2) for review on master by Ashish Pandey (aspandey)
REVIEW: http://review.gluster.org/16377 (cluster/ec: Do not start heal on good file while IO is going on) posted (#3) for review on master by Ashish Pandey (aspandey)
COMMIT: http://review.gluster.org/16377 committed in master by Pranith Kumar Karampuri (pkarampu)
------
commit 578e9b5b5b45245ed044bab066533411e2141db6
Author: Ashish Pandey <aspandey>
Date:   Wed Jan 11 17:19:30 2017 +0530

    cluster/ec: Do not start heal on good file while IO is going on

    Problem:
    Writes on a file have slowed down significantly after
    http://review.gluster.org/#/c/13733/

    RC:
    When an update fop starts on a file, it sets a dirty flag at the
    start and removes it at the end, which makes an index entry in
    indices/xattrop. During IO, SHD scans this, finds the index and
    starts heal even if all the fragments are healthy and up to date.
    This heal takes inodelk for different types of heal. If the IO runs
    for a long time, this happens every 60 seconds. Due to this extra,
    unnecessary locking, IO gets slowed down.

    Solution:
    Before starting any type of heal, check whether the file needs heal
    or not.

    Change-Id: Ib9519a43e7e4b2565d3f3153f9ca0fb92174fe51
    BUG: 1409191
    Signed-off-by: Ashish Pandey <aspandey>
    Reviewed-on: http://review.gluster.org/16377
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Smoke: Gluster Build System <jenkins.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    Reviewed-by: Xavier Hernandez <xhernandez>
REVIEW: http://review.gluster.org/16444 (cluster/ec: Do not start heal on good file while IO is going on) posted (#1) for review on release-3.9 by Ashish Pandey (aspandey)
REVIEW: https://review.gluster.org/16551 (cluster/ec: Do not start heal on good file while IO is going on) posted (#1) for review on release-3.10 by Xavier Hernandez (xhernandez)
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.11.0, please open a new bug report.

glusterfs-3.11.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-May/000073.html
[2] https://www.gluster.org/pipermail/gluster-users/