Description of problem:
Our Java application calls java.io.File.length(), which results in a stat() on the Linux FUSE-mounted GlusterFS. After upgrading the clients from 3.10 to 3.12 we saw very bad performance on these stat calls, up to 20-30 seconds. After downgrading the clients from 3.12 back to 3.10 we get normal performance of around 20 ms per stat.

The clients in this case mount 5 different GlusterFS volumes: 4 of them are 3.10 with a distributed config, while the newest one is running 3.12. Stat performance from the 3.12 clients to the 3.10 distribute clusters remained normal. The one problem we saw was these 3.12 clients accessing the 3.12 disperse volume.

After noticing the problem we configured a standalone server as a 3.12 client and mounted the 3.12 disperse volume, thinking the issue could be related to the combination of multiple server-side versions/clusters. The problem remained on this system as well. After a while we downgraded all clients from 3.12 to 3.10 and the problem disappeared. An interesting discovery was that the problem also disappeared on the test system with only the 3.12 mount, which was still running the 3.12 client.

Version-Release number of selected component (if applicable):
3.10 and 3.12

How reproducible:
By upgrading clients to 3.12

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
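For reference, the slow path can be probed from the shell the same way the Java application hits it. The sketch below uses a temporary file so it is self-contained; on the real system you would point it at a file on the glusterfs FUSE mount instead (paths and values here are illustrative, not from the report):

```shell
# Measure the latency of a single stat(), as java.io.File.length() would
# trigger it. A temp file is used so this sketch runs anywhere; replace it
# with a path on the glusterfs FUSE mount to reproduce the reported case.
f=$(mktemp)
printf 'hello' > "$f"
start=$(date +%s%N)            # GNU date, nanosecond resolution
size=$(stat -c '%s' "$f")      # same metadata File.length() reads
end=$(date +%s%N)
echo "size=${size} bytes, stat took $(( (end - start) / 1000000 )) ms"
rm -f "$f"
```

On a healthy mount this should report a few tens of milliseconds at most; the report below shows multi-second times for the same call.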
Created attachment 1397863 [details] tcpdump of slow stat

This dump shows a stat dir/dir2/dir3/filename call from Linux taking a few seconds. It seems the info related to the stat call itself is returned quite fast, but it then hangs waiting for an unlock action?
client side stats from 1x stat /mnt/glusterfs/dir1/dir2/dir3/file

Fop        Call Count   Avg-Latency        Min-Latency        Max-Latency
---        ----------   -----------        -----------        -----------
STAT                1   394130172.00 us    394130172.00 us    394130172.00 us
SETXATTR            1    10501681.00 us     10501681.00 us     10501681.00 us
LOOKUP              4         833.25 us          684.00 us         1098.00 us
Can you upload the output of 'gluster volume info' and 'gluster volume status' for the problematic volume? Is the same volume mounted on more than one directory/server or accessed from gfapi?

It would also be interesting to capture profile information from the test. Please follow these steps:

1. gluster volume profile <volname> start (it can fail if it's already started)
2. gluster volume profile <volname> info clear
3. stat /path/to/file
4. gluster volume profile <volname> info > profile.txt

And upload the profile.txt file.
The dump file dump_02_20_14_16.cap contains data for the following command:

$ date; time stat /mnt/dfs-060/999/mdue/727/34f6bc006ecb4454fa0105f9a4040d65
Tue Feb 20 14:16:40 CET 2018
  File: '/mnt/dfs-060/999/mdue/727/34f6bc006ecb4454fa0105f9a4040d65'
  Size: 239735      Blocks: 472        IO Block: 131072  regular file
Device: 2ah/42d     Inode: 11617233438169611692  Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1000/ jotta)   Gid: ( 1000/ jotta)
Access: 2018-02-15 10:01:19.115894881 +0100
Modify: 2018-02-15 10:01:19.245054499 +0100
Change: 2018-02-15 10:01:21.059905172 +0100
 Birth: -

real    0m5.048s
user    0m0.000s
sys     0m0.000s

client stats after running the cmd: io-stats-pre.txt.dfs-060.1

-------------------------

The dump file dump_02_20_14_22.cap contains data for the following command:

$ date; time stat /mnt/dfs-060/999/mdue/727/34f6bc006ecb4454fa0105f9a4040d65
Tue Feb 20 14:22:39 CET 2018
  File: '/mnt/dfs-060/999/mdue/727/34f6bc006ecb4454fa0105f9a4040d65'
  Size: 239735      Blocks: 472        IO Block: 131072  regular file
Device: 2ah/42d     Inode: 11617233438169611692  Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1000/ jotta)   Gid: ( 1000/ jotta)
Access: 2018-02-15 10:01:19.115894881 +0100
Modify: 2018-02-15 10:01:19.245054499 +0100
Change: 2018-02-15 10:01:21.059905172 +0100
 Birth: -

real    0m7.398s
user    0m0.000s
sys     0m0.000s

client stats after running the cmd: io-stats-pre.txt.dfs-060.2
Created attachment 1398222 [details] tcpdump from slow stat
Created attachment 1398223 [details] tcpdump from slow stat
Created attachment 1398224 [details] client side stats from after cmd1
Created attachment 1398225 [details] client side stats from after cmd2
The only thing I can see is that there seems to be contention from other clients on the file. Are the other clients that are accessing the file the same version (3.12), or are they 3.10?

Just as a test, you can try to disable the 'disperse.eager-lock' option and see if there's any difference.
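Concretely, the suggested test would be something like the following, run on any node of the trusted storage pool (the volume name is a placeholder):

```shell
# Disable eager locking on the disperse volume (substitute your volume name).
gluster volume set myvol disperse.eager-lock disable

# Re-run the stat test; if it makes no difference, restore the default:
gluster volume set myvol disperse.eager-lock enable
```

Note that disabling eager-lock trades some write performance for reduced lock contention, so it is more useful as a diagnostic than as a permanent setting.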
Setting disperse.eager-lock to disable seems to have solved the performance problem, albeit at the cost of slightly slower writes. See attachment.
Created attachment 1398434 [details] before and after disperse.eager-lock:disable

The top graph shows time to flush 4 MB of data. The bottom graph shows time to issue java.io.File.length(). The vertical line marks when I disabled disperse.eager-lock.
I'm still very much in the dark as to why I need to disable disperse.eager-lock when running 3.12 client side, but not when mounting with 3.10 client side.
The write performance reduction is expected when eager-lock is disabled, but it's useful to confirm that this is really the problem. I'll need to check what has changed between 3.10 and 3.12 that could cause this increase in lock contention. I'll update when I find something.
Is there a way to monitor on the server side whether there is lock contention?
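One common way to inspect lock state on the server side is a statedump of the brick processes; the dumps include the inode lock (inodelk) tables. This is a sketch only — the volume name is a placeholder, and the dump location can vary by distribution (it is typically /var/run/gluster):

```shell
# Trigger a statedump of the inode tables (which include lock information)
# for the brick processes of the volume (placeholder name).
gluster volume statedump myvol inode

# Then, on the brick servers, look for inode lock entries in the dumps:
grep -i 'inodelk' /var/run/gluster/*.dump.*
```

Blocked entries that stay in the dump across repeated statedumps would suggest ongoing lock contention on those inodes.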
Is it possible that http://lists.gluster.org/pipermail/gluster-users/2018-February/033432.html http://lists.gluster.org/pipermail/gluster-users/2018-February/033433.html are related to this?
I'm only using replicate, not disperse volumes. Should I be disabling cluster.eager-lock instead?
Moving this to mainline as 3.12 has reached EOL and the bug still needs a root cause. Encourage reporters to reproduce/check the behavior on later releases, to understand if other changes have fixed the issue at hand, and report back on the bug.
(In reply to Artem Russakovskii from comment #23)
> I'm only using replicate, not disperse volumes. Should I be disabling
> cluster.eager-lock instead?

For replicate volumes, cluster.eager-lock is the option. However, in comment #4 you mentioned volume info of a disperse volume.

Please update if you still face the issue. If not, you may close the bug.
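For a replicate volume, the analogous diagnostic to the disperse test suggested earlier would be (volume name is a placeholder):

```shell
# Replicate volumes use cluster.eager-lock rather than disperse.eager-lock.
gluster volume set myvol cluster.eager-lock off

# Restore the default once done testing:
gluster volume set myvol cluster.eager-lock on
```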
(In reply to Ashish Pandey from comment #25)
> (In reply to Artem Russakovskii from comment #23)
> > I'm only using replicate, not disperse volumes. Should I be disabling
> > cluster.eager-lock instead?
>
> For replicate volume cluster.eager-lock is the option.
> However, in comment #4 you mentioned volume info of a disperse volume.
>
> Please update if you still face the issue. If not you may close the bug.

Sorry, but comment #4 does not belong to me.
ingard, are you still experiencing this issue? Have you tested with newer versions?
Considering there have been no updates on the questions posted, closing this as INSUFFICIENT_DATA. Please consider reopening the bug if this is still an issue. Also, consider upgrading to a newer version (glusterfs-6.x or later) before running any further tests, rather than testing with 3.12 again.
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days