Created attachment 1200788 [details]
Vmstat output for duration of random read test

Description of problem:

Hi,

I am running a random-workload test using pbench_fio, passing the direct=1 flag to bypass the cache. On the server side I have performance.strict-o-direct: on. The expectation was that no workload would generate any activity in the server cache. On the contrary, I am seeing the full 48G of server memory being utilized during the random-read workload.

I collected strace output on the brick process and saw no files opened with the O_DIRECT flag. The vmstat data collected on the server likewise shows full utilization of the 48G of server memory.

I have a 6-node server setup and am running the workload against it from 5 clients, on a replica-2 volume. Here is the vol info:

Volume Name: rep2
Type: Distributed-Replicate
Volume ID: 08bc410b-fba4-4a81-b918-fe5239947eef
Status: Started
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: 172.17.40.13:/bricks/b01/g
Brick2: 172.17.40.14:/bricks/b01/g
Brick3: 172.17.40.15:/bricks/b01/g
Brick4: 172.17.40.16:/bricks/b01/g
Brick5: 172.17.40.22:/bricks/b01/g
Brick6: 172.17.40.24:/bricks/b01/g
Options Reconfigured:
performance.quick-read: off
performance.open-behind: off
performance.strict-o-direct: on
transport.address-family: inet
performance.readdir-ahead: on

Version-Release number of selected component (if applicable):

On server side:
rpm -qa | grep gluster
glusterfs-cli-3.8.2-1.el7.x86_64
glusterfs-3.8.2-1.el7.x86_64
glusterfs-api-3.8.2-1.el7.x86_64
glusterfs-libs-3.8.2-1.el7.x86_64
glusterfs-fuse-3.8.2-1.el7.x86_64
glusterfs-client-xlators-3.8.2-1.el7.x86_64
glusterfs-server-3.8.2-1.el7.x86_64

On client side:
rpm -qa | grep gluster
glusterfs-client-xlators-3.8.2-1.el7.x86_64
glusterfs-libs-3.8.2-1.el7.x86_64
glusterfs-3.8.2-1.el7.x86_64
glusterfs-fuse-3.8.2-1.el7.x86_64

How reproducible:

It is reproducible every time.

Steps to Reproduce:
1. Create a volume and set performance.strict-o-direct: on
2. Run a random-read workload using fio or any equivalent workload generator (see the example invocation below)
3. Observe vmstat and collect strace output using:
   strace -ff -T -p <pid> -o <path-where-you-want-the-logs-captured>
   To get the PID: ps aux | grep glusterfsd | grep <volname>

Actual results:

The full 48G of cache memory is being used.

Expected results:

With performance.strict-o-direct: on set, the expectation was that the cache would be bypassed completely.
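For reference, a minimal fio invocation that exercises a comparable direct-I/O random-read path (the job parameters and the mount path /mnt/rep2 are illustrative assumptions, not the exact pbench_fio job used in this report):

# run from a client against an assumed FUSE mount of the rep2 volume
fio --name=randread --directory=/mnt/rep2 \
    --rw=randread --direct=1 --bs=4k --size=4g \
    --numjobs=4 --iodepth=16 --ioengine=libaio \
    --runtime=300 --time_based

With direct=1, fio opens its data files with O_DIRECT; that is the flag expected to reach the bricks when performance.strict-o-direct is on.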
Additional info:

Output of strace (note that no data file is being opened with the O_DIRECT flag):

grep -i "o_direct" strace.log.*
strace.log.13755:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 31 <0.000063>
strace.log.13755:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/dirty", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 31 <0.000013>
strace.log.13755:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/entry-changes", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 31 <0.000013>
strace.log.13764:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/landfill", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 31 <0.000118>
strace.log.13764:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/landfill", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 31 <0.000114>
strace.log.13764:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/landfill", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 31 <0.000142>
strace.log.13764:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/landfill", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 31 <0.000141>
strace.log.13764:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/landfill", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 31 <0.000141>
strace.log.13764:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/landfill", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 31 <0.000141>
strace.log.13764:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/landfill", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 28 <0.000023>
strace.log.13764:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/landfill", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 28 <0.000089>
strace.log.13764:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/landfill", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 28 <0.000058>
strace.log.13764:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/landfill", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 27 <0.000094>
strace.log.13764:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/landfill", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 26 <0.000104>
strace.log.13764:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/landfill", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 27 <0.000028>
strace.log.13764:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/landfill", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 27 <0.000099>
strace.log.13764:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/landfill", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 27 <0.000018>
strace.log.13764:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/landfill", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 26 <0.000106>
strace.log.13764:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/landfill", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 27 <0.000028>
strace.log.13764:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/landfill", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 29 <0.000077>
strace.log.13764:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/landfill", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 29 <0.000074>
strace.log.13764:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/landfill", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 29 <0.000119>
strace.log.13764:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/landfill", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 27 <0.000028>
strace.log.13764:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/landfill", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 28 <0.000036>
strace.log.13764:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/landfill", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 28 <0.000028>
strace.log.13764:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/landfill", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 27 <0.000109>
strace.log.13764:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/landfill", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 27 <0.000028>
strace.log.13789:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 27 <0.000029>
strace.log.13789:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 27 <0.000015>
strace.log.13789:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 26 <0.000015>
strace.log.13789:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 28 <0.000027>
strace.log.13789:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 29 <0.000050>
strace.log.13789:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 30 <0.000034>
strace.log.13789:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 27 <0.000018>
strace.log.13789:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 26 <0.000128>
strace.log.14342:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 27 <0.000014>
strace.log.14342:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 27 <0.000028>
strace.log.14342:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 27 <0.000014>
strace.log.14342:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 27 <0.000038>
strace.log.14342:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 26 <0.000111>
strace.log.14342:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 27 <0.000014>
strace.log.14342:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 27 <0.000017>
strace.log.14342:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 28 <0.000025>
strace.log.14342:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 29 <0.000028>
strace.log.14342:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 27 <0.000026>
strace.log.14342:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 28 <0.000075>
strace.log.14343:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 28 <0.000022>
strace.log.14343:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 29 <0.000013>
strace.log.14343:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 27 <0.000013>
strace.log.14343:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 29 <0.000025>
strace.log.14343:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 29 <0.000021>
strace.log.14343:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 29 <0.000022>
strace.log.14343:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 27 <0.000016>
strace.log.14343:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 26 <0.000080>
strace.log.14343:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 28 <0.000027>
strace.log.14343:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 28 <0.000018>
strace.log.14358:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 27 <0.000022>
strace.log.14358:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 28 <0.000026>
strace.log.14358:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 27 <0.000017>
strace.log.14359:openat(AT_FDCWD, "/bricks/b01/g/.glusterfs/indices/xattrop", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 27 <0.000014>
Shekhar - please change the product of this BZ to GlusterFS as the testing was done with glusterfs 3.8.2 bits.
Shekhar,

You need the following two options for O_DIRECT to be handled properly (see the commands below):

1) performance.strict-o-direct on
2) network.remote-dio off

Pranith
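For concreteness, both settings can be applied with the standard gluster CLI (volume name taken from the report above):

# gluster volume set rep2 performance.strict-o-direct on
# gluster volume set rep2 network.remote-dio off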
Pranith,

To enable O_DIRECT I turned performance.strict-o-direct on, and network.remote-dio was disabled by default.

performance.quick-read: off
performance.open-behind: off

are the two options I tried as part of a troubleshooting exercise with Krutika and Raghavendra.

I am reopening this issue.

--Shekhar
(In reply to Pranith Kumar K from comment #3)
> Shekhar,
> You need the following two options for O_DIRECT to be handled properly:
> 
> 1) performance.strict-o-direct on
> 2) network.remote-dio off
> 
> Pranith

@Pranith - Part of the problem seems to be that the anonymous fds created by open-behind don't inherit the original open() flags. When ob winds reads and writes on these anon fds, posix ends up using the usual GF_ANON_FD_FLAGS (O_RDWR|O_LARGEFILE) value for the open() that precedes the invoked fd operation (mostly read/write).

FWIW, I tried the same test Shekhar ran on my test cluster, and disabling open-behind in addition to the two already-known o-direct options you mention above seemed to fix the issue of the cache growing as the test progresses (although it had no effect when Shekhar tried the same).

I have yet to confirm the theory (and the presence/absence of the O_DIRECT flag at the level of posix) through strace output of the bricks with and without open-behind. I will do that tomorrow and update the bug; a quick way to check is sketched below.

-Krutika
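A sketch of that quick check, for anyone wanting to verify the same theory (the log file names here are illustrative; assumes one brick strace capture taken with open-behind on and one with it off):

# count O_DIRECT occurrences per brick strace log in each capture;
# if the theory holds, the open-behind-on capture should show no
# O_DIRECT opens on the data files at all
grep -c "O_DIRECT" strace-ob-on.log.*
grep -c "O_DIRECT" strace-ob-off.log.*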
(In reply to Shekhar Berry from comment #4)
> Pranith,
> 
> To enable o_direct I turned performance.strict-o-direct on and my
> network.remote-dio was disabled by default.
> 
> performance.quick-read: off
> performance.open-behind: off
> 
> are two options which I tried as part of troubleshooting exercise with
> Krutika and Raghavendra.
> 
> I am reopening this issue.
> 
> --Shekhar

I was so confident that this was not a bug; sorry about closing it before coming to a conclusion :-(
(In reply to Krutika Dhananjay from comment #5)
> (In reply to Pranith Kumar K from comment #3)
> > Shekhar,
> > You need the following two options for O_DIRECT to be handled properly:
> > 
> > 1) performance.strict-o-direct on
> > 2) network.remote-dio off
> > 
> > Pranith
> 
> @Pranith - Part of the problem seems to be that the anonymous fds created by
> open-behind don't inherit the original open() flags. And when ob does wind
> reads and writes on these anon fds, posix winds up using the usual
> GF_ANON_FD_FLAGS (O_RDWR|O_LARGEFILE) value at the time of open() preceding
> the invoked fd operation (mostly read/write).

As soon as the first write comes, an OPEN fop with the same flags that came at the time of open() is sent out from open-behind; from then on, even the reads are served using this fd. Could you confirm whether the test includes only reads and no writes at all? In that case we have a bug we need to look into.

> 
> FWIW, I did try the same test Shekhar ran on my test cluster and disabling
> open-behind in addition to the two already known o-direct options that you
> mentioned above, seemed to fix the issue of growing cache size as the test
> progresses (although it didn't have any effect as per Shekhar when he tried
> the same).
> 
> I am yet to confirm the theory (and the presence/absence of O_DIRECT flag at
> the level of posix) through strace output of the bricks with and without
> open-behind. I will do the same tomorrow and update the bug.

Yes, this would be important to know.

> 
> -Krutika
(In reply to Krutika Dhananjay from comment #5)
> (In reply to Pranith Kumar K from comment #3)
> > Shekhar,
> > You need the following two options for O_DIRECT to be handled properly:
> > 
> > 1) performance.strict-o-direct on
> > 2) network.remote-dio off
> > 
> > Pranith
> 
> @Pranith - Part of the problem seems to be that the anonymous fds created by
> open-behind don't inherit the original open() flags. And when ob does wind
> reads and writes on these anon fds, posix winds up using the usual
> GF_ANON_FD_FLAGS (O_RDWR|O_LARGEFILE) value at the time of open() preceding
> the invoked fd operation (mostly read/write).
> 
> FWIW, I did try the same test Shekhar ran on my test cluster and disabling
> open-behind in addition to the two already known o-direct options that you
> mentioned above, seemed to fix the issue of growing cache size as the test
> progresses (although it didn't have any effect as per Shekhar when he tried
> the same).
> 
> I am yet to confirm the theory (and the presence/absence of O_DIRECT flag at
> the level of posix) through strace output of the bricks with and without
> open-behind. I will do the same tomorrow and update the bug.
> 
> -Krutika

I stand corrected. *Even* with open-behind disabled, as Shekhar rightly said, the issue still exists. I even disabled all perf xlators and the problem still persists.

I found that open()s are being issued on gfid handles with O_RDONLY|O_DIRECT flags in the bricks' strace output with all these xlators disabled. I still need to check whether pread()s are being issued on the same file descriptors, though. There is some more strace analysis to do; I will update the bug again once the issue is root-caused.

-Krutika
So I checked the strace output again. The moment I disable open-behind, I start to see O_DIRECT in the strace logs.

I grepped for other invocations of open() and realised there are all these files opened with the O_WRONLY flag (without O_DIRECT, of course) with a lot of pwrite()s being invoked on their fds. It took me a while to realise that these are filenames with patterns like fio_bw*.log, fio_clat*.log, fio_lat*.log, etc. - in other words, log files created by fio itself, which were ending up on the glusterfs mount point.

Possible dumb question inbound - Shekhar, did you invoke the fio command while your current working directory was the glusterfs mount point itself? I made that mistake today, which is why I saw the same results as you did. :) Could you please confirm this?

I ran the test again now, this time from $HOME. Disabling open-behind caused things to work for me.

The bug report is still valid, however; I will send out a patch to make open-behind handle O_DIRECT. Meanwhile you can work around the issue by disabling open-behind (command below).

-Krutika
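For reference, the workaround can be applied with the standard CLI (volume name taken from this report):

# gluster volume set rep2 performance.open-behind off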
Krutika,

Thanks for looking into the bug. Here are a couple of comments:

1) "Did you invoke the fio command when your current working directory was the glusterfs mount point itself?"

No. I am using pbench_fio for my testing and it stores run logs under /var/lib/pbench-agent. Just FYI, I am not invoking pbench_fio from the gluster mount point itself :)

2) I redid the test with the performance.open-behind: off and performance.strict-o-direct: on options. The results are no different in my setup; I am still seeing full cache utilization. I have not captured strace under this setting; I can capture and share that with you if required.

--Shekhar
Yes, strace output would be great.

1. Could you capture the strace logs of the bricks (the smaller the brick count, the easier it will be for me :) ) for the following cases:

   i. strict-o-direct on, and everything else left at defaults (implying that open-behind will also be on)
   ii. strict-o-direct on and open-behind turned off
   iii. strict-o-direct on and all perf xlators disabled (write-behind, open-behind, quick-read, read-ahead, io-cache, stat-prefetch)

   For capturing the strace output, you can use this (see the note below for finding brick PIDs):
   # strace -ff -f -p <pid-of-brick> -o <path-where-you-want-the-output-captured>

   Note that you may have to hit Ctrl+C on the strace terminal after every run.

2. Along with 1, also start the test script (fio in your case) itself under strace. For this, you need to execute the following:
   # strace -ff -f -o fio-strace.log <your-test-invocation-command-here>

-Krutika
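As an aside, the brick PIDs needed for step 1 can also be read from the volume status output (standard gluster CLI; rep2 is the volume from this report), as an alternative to the ps pipeline given in the original description:

# gluster volume status rep2
(each Brick row lists the brick process's PID in the Pid column)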
I stand corrected. With open-behind off, I am seeing files now being opened with the O_DIRECT flag for both the write and the read test.

I redid the test with the following volume settings:
performance.open-behind: off
performance.strict-o-direct: on

Here's the strace output:

grep -i "o_direct" strace.log.* | egrep -v openat
strace.log.25363:open("/bricks/b01/g/.glusterfs/ab/2e/ab2ea2ca-1f6e-494a-b817-faec8627afe3", O_RDWR|O_DIRECT) = 33 <0.000014>
strace.log.25363:open("/bricks/b01/g/.glusterfs/10/f7/10f7cf7d-bd68-4414-a3c8-cb3b12092452", O_RDWR|O_DIRECT) = 35 <0.000016>
strace.log.25363:open("/bricks/b01/g/.glusterfs/51/ac/51acd630-6518-42f8-a9cf-c936dc5c3b9c", O_RDWR|O_DIRECT) = 37 <0.000011>
strace.log.25363:open("/bricks/b01/g/.glusterfs/a1/41/a141600d-8fbb-41f9-b0b8-34318125839e", O_RDWR|O_DIRECT) = 39 <0.000017>
strace.log.25363:open("/bricks/b01/g/.glusterfs/fa/cd/facda6a3-53c4-484e-a599-2488d0683c31", O_RDWR|O_DIRECT) = 43 <0.000039>
strace.log.25363:open("/bricks/b01/g/.glusterfs/52/02/5202698e-51d2-40d3-9a61-9b40ce2b427b", O_RDWR|O_DIRECT) = 44 <0.000014>
strace.log.25363:open("/bricks/b01/g/.glusterfs/ac/2c/ac2ceda0-d035-4b94-87bc-c12bced59c6c", O_RDONLY|O_DIRECT) = 45 <0.000040>
strace.log.25363:open("/bricks/b01/g/.glusterfs/51/ac/51acd630-6518-42f8-a9cf-c936dc5c3b9c", O_RDONLY|O_DIRECT) = 52 <0.000021>
strace.log.25454:open("/bricks/b01/g/.glusterfs/ca/47/ca475b51-9f61-4838-ba2c-8930b65a0c04", O_RDWR|O_DIRECT) = 34 <0.000019>
strace.log.25454:open("/bricks/b01/g/.glusterfs/11/d7/11d7083f-a4d0-41c2-9b48-efbd899b831a", O_RDWR|O_DIRECT) = 36 <0.000017>
strace.log.25454:open("/bricks/b01/g/.glusterfs/84/63/8463c0d7-3e68-44c7-b360-0a6622070c42", O_RDWR|O_DIRECT) = 38 <0.000017>
strace.log.25454:open("/bricks/b01/g/.glusterfs/9a/4c/9a4cb41f-487d-464f-9238-9db7385ecf53", O_RDWR|O_DIRECT) = 40 <0.000012>
strace.log.25454:open("/bricks/b01/g/.glusterfs/30/f9/30f929ce-9b66-4d3e-ab0f-a73c60d4d817", O_RDWR|O_DIRECT) = 41 <0.000013>
strace.log.25454:open("/bricks/b01/g/.glusterfs/85/50/8550a796-172b-4390-b976-b228eafd9ada", O_RDWR|O_DIRECT) = 42 <0.000016>
strace.log.25454:open("/bricks/b01/g/.glusterfs/9d/b2/9db21c69-39bc-4565-8793-0833cf21887d", O_RDWR|O_DIRECT) = 45 <0.000020>
strace.log.25454:open("/bricks/b01/g/.glusterfs/ac/2c/ac2ceda0-d035-4b94-87bc-c12bced59c6c", O_RDWR|O_DIRECT) = 48 <0.000026>
strace.log.25454:open("/bricks/b01/g/.glusterfs/a5/b6/a5b62db9-0d1a-49cb-b188-a1be216de017", O_RDWR|O_DIRECT) = 50 <0.000021>
strace.log.25454:open("/bricks/b01/g/.glusterfs/c1/fb/c1fb470d-63dc-4d1e-8079-93877bdcb2a5", O_RDONLY|O_DIRECT) = 36 <0.000026>
strace.log.25454:open("/bricks/b01/g/.glusterfs/11/d7/11d7083f-a4d0-41c2-9b48-efbd899b831a", O_RDONLY|O_DIRECT) = 48 <0.000030>
strace.log.25455:open("/bricks/b01/g/.glusterfs/81/04/810408fe-eb40-42e3-b53b-981dbf290af3", O_RDWR|O_DIRECT) = 49 <0.000037>
strace.log.25455:open("/bricks/b01/g/.glusterfs/c1/fb/c1fb470d-63dc-4d1e-8079-93877bdcb2a5", O_RDWR|O_DIRECT) = 52 <0.000013>
strace.log.25455:open("/bricks/b01/g/.glusterfs/30/f9/30f929ce-9b66-4d3e-ab0f-a73c60d4d817", O_RDONLY|O_DIRECT) = 33 <0.000022>
strace.log.25455:open("/bricks/b01/g/.glusterfs/9d/b2/9db21c69-39bc-4565-8793-0833cf21887d", O_RDONLY|O_DIRECT) = 35 <0.000032>
strace.log.25455:open("/bricks/b01/g/.glusterfs/84/63/8463c0d7-3e68-44c7-b360-0a6622070c42", O_RDONLY|O_DIRECT) = 49 <0.000027>
strace.log.25456:open("/bricks/b01/g/.glusterfs/2d/35/2d352297-ab44-4c92-8b1d-cb314f5f693b", O_RDWR|O_DIRECT) = 51 <0.000029>
strace.log.25456:open("/bricks/b01/g/.glusterfs/fa/cd/facda6a3-53c4-484e-a599-2488d0683c31", O_RDONLY|O_DIRECT) = 34 <0.000029>
strace.log.25457:open("/bricks/b01/g/.glusterfs/e5/13/e5134455-eb71-43b8-8662-355c79d5ef1a", O_RDWR|O_DIRECT) = 46 <0.000018>
strace.log.25457:open("/bricks/b01/g/.glusterfs/49/3b/493b7a5b-42f7-4c0a-9a4c-48956a05444f", O_RDWR|O_DIRECT) = 47 <0.000027>
strace.log.25457:open("/bricks/b01/g/.glusterfs/81/04/810408fe-eb40-42e3-b53b-981dbf290af3", O_RDONLY|O_DIRECT) = 47 <0.000031>
strace.log.25458:open("/bricks/b01/g/.glusterfs/52/02/5202698e-51d2-40d3-9a61-9b40ce2b427b", O_RDONLY|O_DIRECT) = 39 <0.000026>
strace.log.25458:open("/bricks/b01/g/.glusterfs/ab/2e/ab2ea2ca-1f6e-494a-b817-faec8627afe3", O_RDONLY|O_DIRECT) = 46 <0.000037>
strace.log.25458:open("/bricks/b01/g/.glusterfs/ca/47/ca475b51-9f61-4838-ba2c-8930b65a0c04", O_RDONLY|O_DIRECT) = 51 <0.000030>
strace.log.25459:open("/bricks/b01/g/.glusterfs/e5/13/e5134455-eb71-43b8-8662-355c79d5ef1a", O_RDONLY|O_DIRECT) = 40 <0.000017>
strace.log.25459:open("/bricks/b01/g/.glusterfs/10/f7/10f7cf7d-bd68-4414-a3c8-cb3b12092452", O_RDONLY|O_DIRECT) = 42 <0.000014>
strace.log.25459:open("/bricks/b01/g/.glusterfs/49/3b/493b7a5b-42f7-4c0a-9a4c-48956a05444f", O_RDONLY|O_DIRECT) = 43 <0.000044>
strace.log.25460:open("/bricks/b01/g/.glusterfs/85/50/8550a796-172b-4390-b976-b228eafd9ada", O_RDONLY|O_DIRECT) = 37 <0.000012>
strace.log.25461:open("/bricks/b01/g/.glusterfs/a5/b6/a5b62db9-0d1a-49cb-b188-a1be216de017", O_RDONLY|O_DIRECT) = 38 <0.000039>
strace.log.25461:open("/bricks/b01/g/.glusterfs/9a/4c/9a4cb41f-487d-464f-9238-9db7385ecf53", O_RDONLY|O_DIRECT) = 44 <0.000027>
strace.log.25461:open("/bricks/b01/g/.glusterfs/a1/41/a141600d-8fbb-41f9-b0b8-34318125839e", O_RDONLY|O_DIRECT) = 50 <0.000015>
strace.log.25462:open("/bricks/b01/g/.glusterfs/2d/35/2d352297-ab44-4c92-8b1d-cb314f5f693b", O_RDONLY|O_DIRECT) = 41 <0.000017>

--Shekhar
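To correlate strace results like these with page-cache behaviour on the servers, standard tools suffice; for example (the interval is illustrative), leave the following running on each brick node for the duration of a run:

# vmstat 5
# free -g

With O_DIRECT honoured end to end, the buff/cache figures should stay roughly flat during the test instead of growing toward the full 48G.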
REVIEW: http://review.gluster.org/15552 (performance/open-behind: Pass O_DIRECT flags for anon fd reads when required) posted (#1) for review on release-3.8 by Krutika Dhananjay (kdhananj)
COMMIT: http://review.gluster.org/15552 committed in release-3.8 by Raghavendra G (rgowdapp)
------
commit 2164d3fbf7301c8db8eaa3a6a37ab06225473664
Author: Krutika Dhananjay <kdhananj>
Date:   Tue Sep 20 12:05:23 2016 +0530

    performance/open-behind: Pass O_DIRECT flags for anon fd reads when required

    Backport of: http://review.gluster.org/15537
    cherry-picked from a412a4f50d8ca2ae68dbfa93b80757889150ce99

    Writes are already passing the correct flags at the time of open().
    Also, make io-cache honor direct-io for anon-fds with O_DIRECT flag
    during reads.

    Change-Id: I9eb89c3bda34f9287861eb3b53c3d6a7b967c105
    BUG: 1375959
    Signed-off-by: Krutika Dhananjay <kdhananj>
    Reviewed-on: http://review.gluster.org/15552
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.5, please open a new bug report.

glusterfs-3.8.5 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://www.gluster.org/pipermail/announce/2016-October/000061.html
[2] https://www.gluster.org/pipermail/gluster-users/