Condensed information from comment #5 and comment #6. Time is in seconds
Total times (seconds):

fop                       331           340
entrylk-total-time        10750         10090.8
getxattr-total-time       93.4708       0.117269
opendir-total-time        3388.05       3252.09
readdirp-total-time       3841.3        3436.54
inodelk-total-time        6131.27       202.09
finodelk-total-time       0.011168      0.003563
rmdir-total-time          7849.47       6923.74
lookup-total-time         35545.3       43867.4

Total calls:

fop                       331           340
entrylk-total-calls       69120612      70560624
getxattr-total-calls      383820        864
opendir-total-calls       17280936      17281152
readdirp-total-calls      5760317       5760213
inodelk-total-calls       34560300      1440162
finodelk-total-calls      48            48
rmdir-total-calls         17280072      17280072
lookup-total-calls        18604130      18201803
* inodelk calls have dropped drastically, from 34560300 to 1440162 (a difference of 33120138), and so has inodelk time (a drop of about 5929 seconds). This shows that bz 1369312 has reduced inodelk calls. Also note that entrylk time has decreased by about 660 seconds. So, bz 1369312 has brought down synchronization costs to a large extent.
* the rest of the times have also decreased in 340, except for lookup time, which shows a huge increase of 8322 seconds (offsetting the 5929 seconds saved on inodelk calls). Note that the total number of lookup calls has actually decreased by 402327 in 340. This indicates a large latency increase in lookup calls in the brick stack.
Note that bz 1577750 also reports a decrease in lookup performance. Both of these bugs might have a common cause: increased lookup latency on the brick stack.
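The deltas claimed above can be recomputed directly from the two summary tables; a minimal sketch (values copied from the tables, fop subset chosen for illustration):

```python
# Recompute per-fop deltas between run 331 and run 340.
# "time" is total time in seconds, "calls" is total call count,
# both taken verbatim from the summary tables above.
run_331 = {
    "inodelk": {"time": 6131.27, "calls": 34560300},
    "entrylk": {"time": 10750.0, "calls": 69120612},
    "lookup":  {"time": 35545.3, "calls": 18604130},
}
run_340 = {
    "inodelk": {"time": 202.09, "calls": 1440162},
    "entrylk": {"time": 10090.8, "calls": 70560624},
    "lookup":  {"time": 43867.4, "calls": 18201803},
}

for fop in run_331:
    dt = run_340[fop]["time"] - run_331[fop]["time"]
    dc = run_340[fop]["calls"] - run_331[fop]["calls"]
    print(f"{fop}: time delta {dt:+.1f}s, call delta {dc:+d}")
```

This reproduces the numbers in the analysis: inodelk saves ~5929s and 33120138 calls, entrylk saves ~660s, while lookup costs an extra ~8322s despite 402327 fewer calls.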
There is a significant difference in latency (a delta of roughly 650 seconds of total lookup time) between the fastest and the slowest bricks. Note that the number of lookup calls has remained almost constant across bricks, but the average latency varies (the columns appear to be: average latency in µs, call count, total time in seconds):
bash-4.4$ for i in `cat 340-lookup-time.txt | cut -d" " -f 3 | sort | tail -5`; do grep $i 340-lookup-time.txt ; done
3039.95 252665 768.089
3144.69 252627 794.434
3514.66 252744 888.309
3623.13 252578 915.123
3866.47 252698 977.049
bash-4.4$ for i in `cat 340-lookup-time.txt | cut -d" " -f 3 | sort | head -5`; do grep $i 340-lookup-time.txt ; done
1265.85 252557 319.699
1281.93 255107 327.029
1625.88 252840 411.087
1700.64 252750 429.837
1717.36 252557 433.731
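The spread can also be computed numerically rather than via lexical `sort`; a small sketch over the ten rows listed above, assuming the columns are average latency in µs, call count, and total time in seconds (the product of the first two reproduces the third within rounding, which the code checks):

```python
# Per-brick lookup rows from the two listings above:
# (avg_latency_us, calls, total_time_s)
rows = [
    (3039.95, 252665, 768.089),
    (3144.69, 252627, 794.434),
    (3514.66, 252744, 888.309),
    (3623.13, 252578, 915.123),
    (3866.47, 252698, 977.049),
    (1265.85, 252557, 319.699),
    (1281.93, 255107, 327.029),
    (1625.88, 252840, 411.087),
    (1700.64, 252750, 429.837),
    (1717.36, 252557, 433.731),
]

# Sanity check for the assumed units:
# avg latency (us) * calls / 1e6 should equal total time (s).
for avg_us, calls, total_s in rows:
    assert abs(avg_us * calls / 1e6 - total_s) < 0.5

slowest = max(rows, key=lambda r: r[2])
fastest = min(rows, key=lambda r: r[2])
print(f"spread: {slowest[2] - fastest[2]:.0f}s total, "
      f"{slowest[0] / fastest[0]:.1f}x avg latency")
# -> spread: 657s total, 3.1x avg latency
```

So within this sample the slowest brick spends roughly 3x more time per lookup than the fastest, while call counts stay within ~1% of each other.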
Thanks to concerns raised by Nithya, we realised that brick-mux is enabled in this setup and 12 bricks are multiplexed into a single process on each node. So, we are wondering whether brick-mux has an impact on performance.
@Karan, Can you please schedule a run with bmux off?
Please always include the output of gluster volume info in the bug.
Reducing severity because this only happens on brick-mux setups. On a regular (non-multiplexed) brick setup there is even an improvement.
The problem does not exist in the default (non-brick-mux) setup, though I have not generated data on 3.5.1 for both scenarios. To my knowledge, performance still suffers while brick-mux is enabled.
To get similar performance with brick-mux we need to use global threading, but the global threading code is currently not stable. We have some issues, so we need some time to stabilize it. Without global threading, it is difficult to get the same performance with brick-mux.