Description of problem:
The storage system slows down dramatically as data is added to the volume. This problem was discovered while downloading and un-tarring the boost libraries source code. On an empty volume the extraction takes 1 minute. On a volume holding 40 TB, the extraction takes 20 minutes.

Version-Release number of selected component (if applicable):
gluster 3.6.2

How reproducible:
This problem is reproducible on my systems and has been reproduced by Ben Turner on the Red Hat systems (email traffic from 2/17/15). Ben was working with Shyam to resolve this issue. Debugging info was sent to Shyam on/around 2/10/2015 (strace, io-stats, etc.).

Steps to Reproduce:
1. Create a volume and add 20 TB of data (see the sketch below).
2. time tar -xf boost.tar
3. Create a new volume with no data.
4. time tar -xf boost.tar

Actual results:
volume with 20 TB: 20 minutes
empty volume: 1 minute

Expected results:
Extraction time should be roughly the same regardless of how much data the volume already holds.

Additional info:
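For reference, the reproduction boils down to something like the following. This is only a sketch; the volume name, brick path, mount point and tarball location are placeholders, not taken from the reporter's setup:

# create and mount a test volume (names and paths are examples only)
gluster volume create testvol server1:/bricks/testvol/brick1
gluster volume start testvol
mount -t glusterfs server1:/testvol /mnt/testvol

# time the extraction on the empty volume, then again after
# ~20 TB of data has been copied onto the volume
cd /mnt/testvol
time tar -xf /tmp/boost_1_57_0.tar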
Shyam, if you passed this on to someone else to work on it, please adjust the status/assignee. Thanks!
Assigned to self.
Updating the analysis done to date. See mail thread:
http://www.gluster.org/pipermail/gluster-devel/2015-February/043792.html

--------------------------------------
1) Volumes were classified as slow (i.e. with a lot of pre-existing data) and fast (new volumes carved from the same backend file system that the slow bricks are on, with little or no data).
2) We ran an strace of tar and also collected io-stats output from these volumes; both show that create and mkdir are slower on the slow volume than on the fast one. This appears to be the overall reason for the slowness.
3) The tarball is extracted into a new directory on the gluster mount, so all lookups etc. happen within this new namespace on the volume.
4) Checked memory footprints of the slow bricks and fast bricks; nothing untoward noticed there.
5) Restarted the slow volume, just as a test case to start from scratch; no improvement in performance (this was on David's setup).
---------------

My base filesystem has 40 TB and the tar takes 19 minutes. I copied over 10 TB and it took the tar extraction from 1 minute to 7 minutes. My suspicion is that it is related to the number of files and not necessarily file size. Shyam is looking into reproducing this behavior on a Red Hat system.

-------------
On 02/12/2015 11:18 AM, David F. Robinson wrote:
> Shyam,
>
> You asked me to stop/start the slow volume to see if it fixed the timing
> issue. I stopped/started homegfs_backup (the production volume with 40+
> TB) and it didn't make it faster. I didn't stop/start the fast volume
> to see if it made it slower. I just did that and sent out an email. I
> saw a similar result as Pranith.
-----------
Current dev update:
-------------------
A patch (yet to be posted upstream) that makes pruning of the inode table asynchronous to the FOP call is in progress. It still needs to be tested on a setup with well-defined characteristics that can demonstrate the problem, and hence validate the fix.
REVIEW: http://review.gluster.org/9936 (core: prune inode table asynchronously) posted (#1) for review on master by Shyamsundar Ranganathan (srangana)
REVIEW: http://review.gluster.org/9936 (core: prune inode table asynchronously) posted (#2) for review on master by Shyamsundar Ranganathan (srangana)
Update (dev):
-------------
With the change in comment #5, ran a series of tests to see if it improves the performance. Current observations:

1) Tried reproducing the issue on a couple of setups in order to test the fix, but the "old volume is always slow" behavior is not observed every time, so the ability to reproduce the issue reliably is currently in doubt.

2) In the cases where the issue did reproduce, the behavior was as follows:
- Have a large amount of data on the volume.
- Fill up various internal caches by listing all files in the volume and also cat'ing their contents.
- The first untar run after this is slow; from the next run onward things catch up and are fine (these were on a pure distribute volume).
- When the untar is slow, mkdir and create are the FOPs that show the slowness, as in the investigation on the bug report.
- The slowness is almost entirely in the brick processes; the client stack does not seem to be contributing anything (based on io-stats output).
- With the prune fix, the situation does not improve.

3) Currently continuing to debug, on the setup, possible causes for this first run slowing down, as the reason for the slowness seems to be in alignment with what was initially reported and analyzed.
Test that reproduces the issue:
1) Create ~5TB of 320KB files spread across several directories (a rough data-generation sketch follows below).
2) Run "find . -exec cat {} \; > /dev/null" so that metadata and data are read from the gluster volume through a FUSE mount.
3) After (2) has been running for about 1-2 hours, stop the command.
4) Now extract the said tarball on the FUSE mount. This takes approximately 2x the time it would take if it were done right after step (1), or after restarting the volume and remounting it on the clients.

Analysis:
1) The slowness demonstrated in step (4) is seen only for the first 1-2 tar extractions, after which the time comes back to the good number.
2) In these slow runs, mkdir and create are the FOPs that slow down the entire process.
3) Placing io-stats on top of POSIX on the bricks, and checking the time deltas of the FOP times at posix -> server xlator -> FUSE, it is seen that the time is consumed in POSIX, so that layer needs further investigation.

@David, would it be possible to run a similar io-stats capture at your end on your volumes, so that I can corroborate that this is a posix-xlator-or-below issue only?
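For step (1) above, a rough data-generation sketch (the mount point, directory layout and counts are arbitrary; adjust the loop bounds to reach the desired total size):

# ~5 TB of 320 KB files is roughly 16 million files;
# laid out here as 1600 directories x 10000 files (adjust to taste)
for d in $(seq 1 1600); do
    mkdir -p /mnt/glustervol/dataset/dir$d
    for f in $(seq 1 10000); do
        dd if=/dev/zero of=/mnt/glustervol/dataset/dir$d/file$f bs=320K count=1 2>/dev/null
    done
done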
Further observations and info
-----------------------------
A) Replicated the test on David's (the reporter's) setup with an io-stats xlator on top of the posix xlator and below the server xlator (as usual), and observed almost all of the latency in the posix xlator's mkdir and create FOPs. This points to the slowness being in the posix xlator or below it. David had also run a tar extraction directly on the brick, and that was fast (10 seconds), so there is no real slowness on XFS directly. Current suspicion is on the link creation in the flat .glusterfs namespace, as that would have accumulated a lot of files (the customer setup holds about 50TB of data).

B) Attempting a pure XFS test with a workflow that behaves like gluster, i.e. create a file or directory and hard/soft link it within a flat .glusterfs-style namespace, to see if a relatively filled-up directory causes slowness (a rough sketch of such a workload follows below).

C) Also requested an strace of the brick processes from David, to compare latency deviations between the fast and slow cases.
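For (B), a rough shell sketch of a gluster-like workload on a plain XFS mount. The mount point and file count are illustrative, and the gfid is faked with uuidgen; gluster's posix xlator internally does the equivalent hard link for regular files and a symlink for directories:

# simulate gluster-style .glusterfs linking on a plain XFS mount
TOP=/mnt/xfstest                         # hypothetical local XFS mount point
mkdir -p $TOP/data $TOP/.glusterfs
for i in $(seq 1 100000); do
    f=$TOP/data/file$i
    touch $f
    gfid=$(uuidgen)                      # stand-in for the gfid gluster would assign
    sub=$TOP/.glusterfs/${gfid:0:2}/${gfid:2:2}
    mkdir -p $sub
    ln $f $sub/$gfid                     # hard link, as the posix xlator does for files
done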
C) The strace from David points to the following: linkat and symlink show a high average latency on the bricks that belong to the slow volume, and are fast on the bricks that belong to the fast volume. The difference is orders of magnitude, as follows.

Fast brick:
Avg. linkat latency from strace: 0.000012 seconds
#calls: 22025
Time taken for linkat: 0.000012 * #calls = 0.26914 seconds

Slow brick:
Avg. linkat latency from strace: 0.0103351812 seconds
#calls: 22025
Time taken for linkat: 227 seconds

Similar data is seen for symlink into the .glusterfs directories.

The latency above is for a single brick; extrapolating (i.e. multiplying by 2 for both bricks) shows that we would save ~450 seconds on linkat if it were fast, which points to the problem being linkat and symlink into the .glusterfs directories.

Next: looking at option B (in comment #9), and possibly getting an xfs_metadump from David as well as entry counts in the 5 best and 5 worst linkat directories, to see how to approach this from an XFS/VFS perspective.
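For reference, one way to derive per-syscall averages like the above from "strace -ff -T" output (the trace file prefix is a placeholder; strace -T appends the time spent in each call as <seconds> at the end of every line, and this rough cut does not filter out unfinished calls):

# average and total linkat latency across the per-thread strace output files
grep -h 'linkat(' brick-trace.* | \
awk -F'<' '{ gsub(/>.*/, "", $NF); sum += $NF; n++ }
           END { printf "calls=%d avg=%.9f total=%.3f s\n", n, sum/n, sum }'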
Data from David's setup on the 5 fastest and 5 slowest linkat directories, in terms of entry counts:

Fast:
/data/brick01bkp/homegfs_bkp/.glusterfs/82/93/
/data/brick01bkp/homegfs_bkp/.glusterfs/f1/34/
/data/brick01bkp/homegfs_bkp/.glusterfs/df/b0/
/data/brick01bkp/homegfs_bkp/.glusterfs/67/f7/
/data/brick01bkp/homegfs_bkp/.glusterfs/96/e0/

Slow:
/data/brick01bkp/homegfs_bkp/.glusterfs/e0/24/
/data/brick01bkp/homegfs_bkp/.glusterfs/80/e1/
/data/brick01bkp/homegfs_bkp/.glusterfs/e3/f4/
/data/brick01bkp/homegfs_bkp/.glusterfs/da/80/
/data/brick01bkp/homegfs_bkp/.glusterfs/a9/ca/

Fast:
[root@gfs01bkp ~]# ls -l /data/brick01bkp/homegfs_bkp/.glusterfs/82/93/ | wc -l
81
[root@gfs01bkp ~]# ls -l /data/brick01bkp/homegfs_bkp/.glusterfs/f1/34/ | wc -l
72
[root@gfs01bkp ~]# ls -l /data/brick01bkp/homegfs_bkp/.glusterfs/df/b0/ | wc -l
77
[root@gfs01bkp ~]# ls -l /data/brick01bkp/homegfs_bkp/.glusterfs/67/f7/ | wc -l
58
[root@gfs01bkp ~]# ls -l /data/brick01bkp/homegfs_bkp/.glusterfs/96/e0/ | wc -l
57

Slow:
[root@gfs01bkp ~]# ls -l /data/brick01bkp/homegfs_bkp/.glusterfs/e0/24/ | wc -l
69
[root@gfs01bkp ~]# ls -l /data/brick01bkp/homegfs_bkp/.glusterfs/80/e1/ | wc -l
56
[root@gfs01bkp ~]# ls -l /data/brick01bkp/homegfs_bkp/.glusterfs/e3/f4/ | wc -l
61
[root@gfs01bkp ~]# ls -l /data/brick01bkp/homegfs_bkp/.glusterfs/da/80/ | wc -l
64
[root@gfs01bkp ~]# ls -l /data/brick01bkp/homegfs_bkp/.glusterfs/a9/ca/ | wc -l
79
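For anyone wanting to survey all of the .glusterfs hash directories on a brick rather than just these ten, something like the following should work (the brick path is from the listings above; the formatting is arbitrary):

# entry count per .glusterfs/xx/yy directory, smallest to largest
cd /data/brick01bkp/homegfs_bkp/.glusterfs
for d in ??/??; do
    printf "%6d %s\n" "$(ls -A "$d" | wc -l)" "$d"
done | sort -n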
Information requested has been provided and uploaded to this bug in the comments above.
Created attachment 1034827 [details]
Testing without gluster in the stack

Update for point (B) in comment #9:

Attempted to create a simple program that runs on a local XFS mount, to see if the problem can be replicated without gluster in the picture. The attached program was run for a few hours both on a local setup and on David's setup, but did not show any of the linkat/symlink creation slowness observed with Gluster. This means of detecting or narrowing down the problem does not seem to be bearing fruit.
Created attachment 1035401 [details]
Further system information

Some further information from David's setup:

1) [root@gfs01bkp homegfs]# cat /etc/redhat-release
Scientific Linux release 6.6 (Carbon)

2) ext4 also possibly shows the same increased untar times (in a more controlled test).

3) Post reboot, a rerun of the tests still shows the same slowness:

[root@gfs01bkp homegfs]# date; time tar -xPf boost_1_57_0.tar ; date; time rm -rf boost_1_57_0; date
Fri Jun 5 12:43:39 EDT 2015

real    26m47.391s
user    0m0.642s
sys     0m3.873s
Fri Jun 5 13:10:26 EDT 2015

real    0m33.857s
user    0m0.130s
sys     0m1.048s
Fri Jun 5 13:11:00 EDT 2015

The attached file contains data from the following commands:
- kernel version (uname -a)
- xfsprogs version (xfs_repair -V)
- number of CPUs
- contents of /proc/mounts
- contents of /proc/partitions
- xfs_info output on the filesystem in question
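For anyone reproducing this, a one-shot capture of the same system information might look like the following (the output file name is arbitrary, and the xfs_info argument assumes the brick filesystem is mounted at /data/brick01bkp):

{
  uname -a
  xfs_repair -V
  nproc                      # number of CPUs
  cat /proc/mounts
  cat /proc/partitions
  xfs_info /data/brick01bkp  # mount point of the brick filesystem (assumed)
} > sysinfo-$(hostname).txt 2>&1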
This bug is still ASSIGNED, as the inode table pruning code does not help the situation.
I see a few items related to small-file performance planned for the near future: compound FOPs, tiering improvements for small files. IMO, we need to fix this issue for all of those enhancements to really make sense. And in any case, we really need to understand what is going on here. I'm planning to spend some time on this bug to see if I can help figure that out.
Created attachment 1190117 [details]
small-file performance over multiple iterations

Each iteration here creates and writes the same number of files and reports the aggregate throughput for that iteration. So iteration 1 writes 128K files, iteration 2 adds another 128K files on top of that, and so on. A sync is performed between iterations. Ideally, the throughput reported in each iteration should be the same.

1. Gluster-HDD is a single-brick gluster volume on a hard disk, FUSE mounted on the clients.
2. kNFS-HDD is the same brick filesystem as in 1., but this time kernel-NFS exported and NFS mounted on the clients.
3. Gluster-NVMeSSD is a single-brick gluster volume, like in 1., but this time created on an NVMe SSD (note that the throughput is much higher in this case, and is shown on the secondary Y axis).

The Gluster-HDD line shows the severe performance degradation that has been reported in this bz -- throughput in iteration 8 is less than 1/4th of the throughput in iteration 1. kNFS-HDD and Gluster-NVMeSSD do not show this degradation, though there is a slight drop in the kNFS case.

To me these results suggest that:
(1) the performance degradation we are seeing in gluster is due to increased seeks, likely due to fragmentation or sub-optimal allocation. The seek hypothesis would explain why the degradation is not seen in the SSD case.
(2) the fragmentation or sub-optimal allocation is primarily due to the extra metadata that gluster maintains. This would explain why it is not seen in the kNFS case.
(In reply to Manoj Pillai from comment #17)

More details on the test:

Versions:
glusterfs*-3.8.2-1.el7.x86_64
RHEL 7.2 (server): kernel-3.10.0-327.el7.x86_64
RHEL 7.1 (clients): kernel-3.10.0-229.el7.x86_64

Benchmarking tool: smallfile version 3.0 [https://github.com/bengland2/smallfile]

smallfile_cli.py --top ${top_dir} --host-set ${hosts_str} --threads 4 --files 8192 --file-size 64 --record-size 64 --fsync Y --response-times N --operation create

There are 4 clients, so the number of files in each iteration is: (4 clients) * (4 threads per client) * 8192 = 128K files. File size is 64KB. Each iteration creates a new "top directory" under the mountpoint, and all files created in that iteration go into that sub-directory in a hierarchy with 100 files per directory (the default).

Gluster volume info:

Volume Name: perfvol
Type: Distribute
<...>
Status: Started
Number of Bricks: 1
Transport-type: tcp
Bricks:
<...>
Options Reconfigured:
cluster.lookup-optimize: on
server.event-threads: 4
client.event-threads: 8
performance.client-io-threads: on
transport.address-family: inet
performance.readdir-ahead: on
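A sketch of the iteration loop implied above (the mount point, iteration count and variable names are illustrative; each iteration gets its own top directory, with a sync in between, per the description in comment #17):

mnt=/mnt/perfvol              # FUSE mount point on the clients (example)
for i in $(seq 1 8); do
    mkdir $mnt/iter$i
    python smallfile_cli.py --top $mnt/iter$i --host-set ${hosts_str} \
        --threads 4 --files 8192 --file-size 64 --record-size 64 \
        --fsync Y --response-times N --operation create
    sync                      # flush between iterations, per the test description
done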
Manoj,

This is interesting data, and I wonder if you got a chance to gather strace output from the brick using "strace -ff -T -p <pid-of-brick> -o <path-to-the-file-where-you-want-the-output-saved>". This gives the time spent in each of the syscalls. It would be interesting to know what kinds of syscalls are leading to this problem. What do you think?

Pranith
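For completeness, one way to attach strace to the brick process (the volume name and output prefix are examples; the brick PID can also be read from "gluster volume status"):

# attach strace to the brick process, one output file per thread
brick_pid=$(pgrep -f 'glusterfsd.*perfvol')   # or take the PID from 'gluster volume status perfvol'
strace -ff -T -p $brick_pid -o /var/tmp/brick-trace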
(In reply to Pranith Kumar K from comment #19)

There is strace analysis in Shyam's comment #10. Please take a look. We can collect again if needed.

In the meantime, I'd very much like to get a primer on the metadata gluster maintains on the brick: what, why, and how it figures in the I/O path. If you can point to a link where this is documented, or post here, or discuss offline, that would be very useful.
GlusterFS 3.6 is nearing its End-Of-Life; only important security bugs still stand a chance of getting fixed. Moving this to the mainline 'version'. If this needs to get fixed in 3.7 or 3.8, this bug should get cloned.
Some findings based on the tests I did following Brian's inputs.

We removed all the xlators between fuse and posix, so the volfile reduces to this:

volume test-posix-0
    type storage/posix
    option directory /home/brick/test20
end-volume

Mounted this volfile using:
glusterfs --volfile=/root/volfile.vol /d

Smallfile CLI we used:
for i in {1..80}; do mkdir /mnt/smf/$i ; python smallfile_cli.py --top /mnt/smf/$i --threads 4 --files 8192 --file-size 64 --record-size 64 --fsync Y --response-times N --operation create >> /root/smf.out; done

1) In this test I ran the small-file workload using the CLI above 25 times. We observed that the elapsed time kept increasing. These are the times:

[root@gprfs039 ~]# grep 'sec elapsed' just-posix-smf.out
45.026515 sec elapsed time
45.630780 sec elapsed time
47.106841 sec elapsed time
48.994986 sec elapsed time
49.056766 sec elapsed time
47.511098 sec elapsed time
57.042012 sec elapsed time
57.128427 sec elapsed time
63.952732 sec elapsed time
77.634293 sec elapsed time
69.540255 sec elapsed time
84.495499 sec elapsed time
83.755443 sec elapsed time
76.225897 sec elapsed time
86.382487 sec elapsed time
113.626848 sec elapsed time
109.282297 sec elapsed time
95.230611 sec elapsed time
110.965802 sec elapsed time
90.876091 sec elapsed time
98.172049 sec elapsed time
106.055803 sec elapsed time
133.958628 sec elapsed time
104.454683 sec elapsed time
119.641330 sec elapsed time

2) In this test we did a remount and resumed the runs where we had stopped. The first run took 97 seconds, which was in the same range as before.

3) In this test we did a drop-caches. What we observed is that the elapsed time started at around 440 seconds and came down to 95 seconds over time, but then hovered again at around 95-120 seconds:

442.857238 sec elapsed time
314.401744 sec elapsed time
260.829069 sec elapsed time
219.472399 sec elapsed time
201.450154 sec elapsed time
178.373704 sec elapsed time
158.367833 sec elapsed time
151.658923 sec elapsed time
144.989891 sec elapsed time
126.476212 sec elapsed time
133.021995 sec elapsed time
106.850061 sec elapsed time
105.401238 sec elapsed time
118.126849 sec elapsed time
115.904676 sec elapsed time
94.786972 sec elapsed time
119.076134 sec elapsed time

4) In this test we kept creating new bricks on the same XFS partition, by generating new volfiles and running the small-file create workload with the following CLI:

for i in {2..26}; do echo "volume test-posix-0" > /root/volfile.vol ; echo "    type storage/posix" >> /root/volfile.vol; echo "    option directory /home/brick/test$i" >> /root/volfile.vol; echo "end-volume" >> /root/volfile.vol; mkdir /home/brick/test$i; glusterfs --volfile=/root/volfile.vol /d; mkdir -p /d/smf/$i ; python smallfile_cli.py --top /d/smf/$i --threads 4 --files 8192 --file-size 64 --record-size 64 --fsync Y --response-times N --operation create >> /root/separate-bricks.out; umount /d; done

Run time didn't degrade in this case!!
Here are the times:

[root@gprfs039 ~]# grep 'sec elapsed' /root/separate-bricks.out
48.763393 sec elapsed time
48.581379 sec elapsed time
48.777564 sec elapsed time
49.469651 sec elapsed time
48.915736 sec elapsed time
50.647200 sec elapsed time
48.117454 sec elapsed time
49.835503 sec elapsed time
48.549795 sec elapsed time
50.297137 sec elapsed time
51.053001 sec elapsed time
49.238791 sec elapsed time
49.562103 sec elapsed time
46.240161 sec elapsed time
50.360794 sec elapsed time
48.729912 sec elapsed time
51.763603 sec elapsed time
49.370971 sec elapsed time
48.930266 sec elapsed time
48.070540 sec elapsed time
48.472625 sec elapsed time
48.350298 sec elapsed time
47.316538 sec elapsed time
46.672402 sec elapsed time
47.304169 sec elapsed time

Will update with more info once we find out more about why this is the case.

Pranith
Ran strace on the bricks for the fast and slow case, these are the 10 maximum syscall latencies for all of the syscalls. I am seeing high latencies for fsync and lstat in the slow case:

Normal case (10 largest latencies per syscall, in seconds):
chown      0.000067 0.000070 0.000071 0.000079 0.000086 0.000099 0.000141 0.000147 0.000172 0.000249
close      0.000202 0.000202 0.000203 0.000204 0.000204 0.000204 0.000206 0.000207 0.000220 0.000505
fgetxattr  0.000143 0.000147 0.000151 0.000152 0.000155 0.000158 0.000160 0.000162 0.000162 0.000193
fstat      0.000130 0.000131 0.000132 0.000133 0.000133 0.000134 0.000134 0.000151 0.000156 0.000158
fsync      0.018422 0.018479 0.018490 0.018572 0.018582 0.033627 0.036283 0.047154 0.078602 0.695221
getdents   0.000108 0.000109 0.000109 0.000110 0.000112 0.000114 0.000114 0.000116 0.000118 0.000164
lgetxattr  0.000081 0.000081 0.000082 0.000082 0.000083 0.000084 0.000085 0.000089 0.000091 0.000094
linkat     0.000097 0.000099 0.000099 0.000100 0.000105 0.000113 0.000138 0.000152 0.000154 0.000182
lseek      0.000039 0.000074 0.000087 0.000103 0.000106 0.000113 0.000113 0.000114 0.000115 0.000134
lsetxattr  0.000095 0.000097 0.000098 0.000099 0.000100 0.000103 0.000106 0.000139 0.000167 0.000171
lstat      0.002063 0.002094 0.003154 0.003503 0.003998 0.004202 0.004760 0.005640 0.008268 0.015741
mkdir      0.000258 0.000259 0.000265 0.000268 0.000269 0.000269 0.000276 0.000279 0.000293 0.001713
openat     0.000118 0.000120 0.000123 0.000123 0.000123 0.000124 0.000124 0.000126 0.000126 0.000130
open       0.002263 0.003762 0.003774 0.005012 0.005164 0.005207 0.005640 0.005902 0.006392 0.193450
pwrite     0.000224 0.000224 0.000228 0.000248 0.000249 0.000250 0.000259 0.000266 0.000398 0.000636
read       0.000085 0.000086 0.000087 0.000088 0.000098 0.000124 0.000130 0.000145 0.000162 0.000238
readlink   0.000020 0.000021 0.000021 0.000023 0.000024 0.000028
readv      0.000956 0.006536 0.047662 0.061173 0.224371 0.332077 2.278480 48.556456 85.082750 127.490240
stat       0.000085 0.000089 0.000099 0.000116 0.000120 0.000128 0.000131 0.000132 0.000135 0.000145
symlink    0.000180 0.000183 0.000183 0.000183 0.000190 0.000195 0.000196 0.000202 0.000208 0.000222
write      0.000089 0.000127 0.000130 0.000135 0.000136 0.000136 0.000137 0.000140 0.000140 0.000258
writev     0.000148 0.000157 0.000157 0.000159 0.000160 0.000163 0.000170 0.000904 0.004772 0.088366

Slow case (10 largest latencies per syscall, in seconds):
chown        0.000169 0.000169 0.000170 0.000174 0.000175 0.000177 0.000179 0.000179 0.000193 0.000265
close        0.000191 0.000192 0.000205 0.000210 0.000218 0.000224 0.000315 0.000321 0.000416 0.000454
fgetxattr    0.000209 0.000232 0.000250 0.000279 0.000286 0.000361 0.000443 0.000519 0.000824 0.000877
fstat        0.000231 0.000264 0.000264 0.000275 0.000314 0.000363 0.000440 0.000640 0.001134 0.002430
fsync        1.500296 1.503076 1.513502 1.516628 1.522290 1.523135 1.535945 1.546670 1.548597 1.590475
getdents     0.000117 0.000118 0.000118 0.000119 0.000120 0.000120 0.000125 0.000125 0.000125 0.000126
lgetxattr    0.000155 0.000157 0.000157 0.000158 0.000164 0.000164 0.000171 0.000184 0.000205 0.000351
linkat       0.000214 0.000224 0.000228 0.000228 0.000231 0.000231 0.000234 0.000252 0.010239 0.070905
lremovexattr 0.000028 0.000028 0.000028 0.000029 0.000029 0.000029 0.000030 0.000031 0.000031 0.000032
lseek        0.000148 0.000151 0.000152 0.000158 0.000158 0.000162 0.000164 0.000178 0.000178 0.000188
lsetxattr    0.000161 0.000162 0.000164 0.000165 0.000165 0.000166 0.000167 0.000177 0.000292 0.000488
lstat        1.529056 1.529124 1.530052 1.531486 1.531578 1.531743 1.533177 1.722707 1.931263 2.287300
mkdir        0.000249 0.000259 0.000260 0.000268 0.000272 0.000306 0.000358 0.000415 0.000477 0.002294
openat       0.000121 0.000124 0.000126 0.000126 0.000127 0.000129 0.000131 0.000131 0.000135 0.000190
open         0.022604 0.024123 0.025090 0.030248 0.033538 0.034225 0.036417 0.037107 0.043395 0.047846
pwrite       0.000279 0.000285 0.000285 0.000290 0.000292 0.000294 0.000302 0.000336 0.000374 0.000374
read         0.000165 0.000165 0.000168 0.000169 0.000187 0.000194 0.000207 0.000212 0.000550 0.000552
readv        0.201163 0.242594 0.247809 0.256123 0.263686 0.271234 0.273045 0.274275 0.280773 0.286999
stat         0.000169 0.000170 0.000181 0.000230 0.000233 0.000245 0.000279 0.000321 0.000323 0.000775
symlink      0.000248 0.000251 0.000252 0.000253 0.000259 0.000263 0.000267 0.000274 0.000286 0.000298
truncate     0.000028 0.000028 0.000029 0.000031 0.000032 0.000032 0.000032 0.000033 0.000033 0.000034
write        0.000184 0.000184 0.000187 0.000193 0.000194 0.000204 0.000218 0.000221 0.000227 0.000229
writev       0.000497 0.000580 0.000603 0.000726 0.000804 0.001008 0.014676 0.021616 0.027423 0.030991
Oops, big comment :-). Just check for fsync/lstat headings in both cases.
(In reply to Pranith Kumar K from comment #23)
> Ran strace on the bricks for the fast and slow case, these are the 10
> maximum syscall latencies for all of the syscalls. I am seeing high
> latencies for fsync and lstat in the slow case:
>
> Normal case:
> ...
> fsync  0.018422 0.018479 0.018490 0.018572 0.018582 0.033627 0.036283 0.047154 0.078602 0.695221
> ...
> lstat  0.002063 0.002094 0.003154 0.003503 0.003998 0.004202 0.004760 0.005640 0.008268 0.015741
> ...
>
> Slow case:
> ...
> fsync  1.500296 1.503076 1.513502 1.516628 1.522290 1.523135 1.535945 1.546670 1.548597 1.590475
> ...
> lstat  1.529056 1.529124 1.530052 1.531486 1.531578 1.531743 1.533177 1.722707 1.931263 2.287300
> ...

Ok, did we learn anything about whether the number of calls is roughly similar between the fast and slow runs? If so, the next best thing is probably to try and dig into where the latency originates via the function graph tracer.
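Two possible ways to gather that data (nothing gluster-specific is assumed here; $brick_pid is the brick process PID as in the strace command from comment #19, and the output paths are arbitrary): "strace -c" gives per-syscall call counts and cumulative times to compare between the fast and slow runs, and the kernel's function graph tracer can be pointed at the brick PID to see where the time goes inside a slow fsync/lstat.

# per-syscall call counts and cumulative times, to compare fast vs slow runs
strace -c -f -p $brick_pid -o /var/tmp/brick-syscall-summary.txt

# function-graph trace of the brick process (requires debugfs mounted)
cd /sys/kernel/debug/tracing
echo $brick_pid > set_ftrace_pid
echo function_graph > current_tracer
echo 1 > tracing_on
sleep 30                       # capture a window while the workload is slow
echo 0 > tracing_on
cat trace > /var/tmp/brick-ftrace.txt
echo nop > current_tracer      # restore the default tracer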
I will need to rerun this test to get that information, because I couldn't get the granularity of the tests right in the last run. What I will do this time is run the test 25 times creating new bricks and 25 times on the same brick, and capture the outputs.
It has been exactly 2 years since the last update! Not sure if the behavior is the same on the latest versions. Need to re-test and evaluate.
A lot has changed with respect to performance improvements in gluster. Closing this bug for now; please retest with the latest release and reopen if the problem is still present.