Bug 1393419
| Field | Value |
|---|---|
| Summary | read-ahead not working if open-behind is turned on |
| Product | [Community] GlusterFS |
| Component | read-ahead |
| Version | mainline |
| Status | CLOSED CURRENTRELEASE |
| Severity | medium |
| Priority | medium |
| Hardware | All |
| OS | All |
| Reporter | Mohammed Rafi KC <rkavunga> |
| Assignee | Milind Changire <mchangir> |
| CC | atumball, bengland, bugs, hchen, jbyers, mchangir, mpillai, pgurusid, rgowdapp, sasundar, smohan |
| Keywords | Triaged |
| Type | Bug |
| Clone Of | 1084508 |
| Bug Depends On | 1084508 |
| Bug Blocks | 1426044 (view as bug list) |
| Doc Type | If docs needed, set a value |
| Last Closed | 2019-05-11 13:05:51 UTC |
Description (Mohammed Rafi KC, 2016-11-09 13:57:29 UTC)
REVIEW: http://review.gluster.org/15811 (glusterd/volgen: Changing the order of read-ahead xlator) posted (#1) for review on master by mohammed rafi kc (rkavunga)

I don't know if this is relevant, but just a heads-up: I remember the option "root-squash" was dependent on, or tied to, the option "read-after-open". If you plan to change "read-after-open", please also take a look at the code that handles "server.root-squash".

REVIEW: https://review.gluster.org/15811 (perf/read-ahead: read-ahead before open-behind) posted (#2) for review on master by Milind Changire (mchangir)

REVIEW: https://review.gluster.org/15811 (perf/read-ahead: read-ahead before open-behind) posted (#3) for review on master by Milind Changire (mchangir)

Milind Changire (comment #5):

Here's the output of a single run as per the test case in comment #0, without dropping kernel caches. All nodes are virtual machines on a single laptop with an SSD.

```
16:17:46 [root@f25node1 ~] # gluster volume info dist-rep

Volume Name: dist-rep
Type: Distributed-Replicate
Volume ID: 54b4340d-cb1f-4742-a0ca-3d3ac499d84c
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: f25node1:/glustervols/dist-rep/brick
Brick2: f25node2:/glustervols/dist-rep/brick
Brick3: f25node3:/glustervols/dist-rep/brick
Brick4: f25node4:/glustervols/dist-rep/brick
Brick5: f25node5:/glustervols/dist-rep/brick
Brick6: f25node6:/glustervols/dist-rep/brick
Options Reconfigured:
performance.open-behind: off
performance.read-ahead-page-count: 16
transport.address-family: inet
nfs.disable: on
cluster.brick-multiplex: off

15:11:17 [root@f25client1 ~] # mount -t glusterfs f25node1:/dist-rep /mnt/dist-rep
```

performance.read-ahead-page-count 4:

```
15:13:46 [root@f25client1 ~] # dd if=/dev/zero of=/mnt/dist-rep/1g bs=1M count=1K
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 263.88 s, 4.1 MB/s
```

performance.read-ahead-page-count 1:

```
16:07:16 [root@f25client1 ~] # dd if=/mnt/dist-rep/1g bs=1M of=/dev/null
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 10.1683 s, 106 MB/s
```

performance.read-ahead-page-count 16:

```
16:14:31 [root@f25client1 ~] # dd if=/mnt/dist-rep/1g bs=1M of=/dev/null
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.162729 s, 6.6 GB/s
```

performance.open-behind off, performance.read-ahead-page-count 16:

```
16:18:38 [root@f25client1 ~] # dd if=/mnt/dist-rep/1g bs=1M of=/dev/null
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 6.43123 s, 167 MB/s
```

Raghavendra G (comment #6):

(In reply to Milind Changire from comment #5)
> here's output of a single run as per test case in comment #0
> (without dropping kernel caches):

It's better to drop caches. If we are doing a fresh mount for each test, I guess it's OK.

> performance.open-behind: off

Was open-behind off in all runs of dd?

Milind Changire (comment #7):

(In reply to Raghavendra G from comment #6)
> It's better to drop caches. If we are doing a fresh mount for each test, I
> guess it's OK.

I'll do another round of tests, doing a fresh mount before each run of "dd".

> Was open-behind off in all runs of dd?

open-behind was turned "off" only for the last run of "dd".
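For reference, the per-run procedure being converged on here (set the options, remount fresh, drop kernel caches, then read) can be sketched as a small script. This is a sketch only: the volume name, server, and mount point are taken from the transcripts above, and the drop_caches step is the standard Linux mechanism, not anything GlusterFS-specific.

```
#!/bin/sh
# Sketch of one test iteration against the dist-rep volume used in
# this report (names taken from the transcripts above).
VOL=dist-rep
MNT=/mnt/dist-rep
SRV=f25node1

# Configure the options under test (run on any server node).
gluster volume set $VOL performance.read-ahead-page-count 16
gluster volume set $VOL performance.open-behind on

# Fresh mount so no fds or cached pages survive from the previous run.
umount $MNT 2>/dev/null
mount -t glusterfs $SRV:/$VOL $MNT

# Drop the kernel page/dentry/inode caches on the client, per
# Raghavendra's suggestion; standard Linux, not GlusterFS-specific.
sync
echo 3 > /proc/sys/vm/drop_caches

# Sequential read of the 1 GiB test file written earlier.
dd if=$MNT/1g bs=1M of=/dev/null
```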
Milind Changire:

Test re-run, doing a fresh mount before every "dd" run:

```
09:05:47 [root@f25client1 ~] # mount -t glusterfs f25node1:/dist-rep /mnt/dist-rep
09:05:52 [root@f25client1 ~] # dd if=/dev/zero of=/mnt/dist-rep/1g bs=1M count=1K
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 77.2741 s, 13.9 MB/s
09:09:04 [root@f25client1 ~] # umount /mnt/dist-rep

09:06:12 [root@f25node1 ~] # gluster volume set dist-rep performance.read-ahead-page-count 1
volume set: success
09:06:39 [root@f25node1 ~] # gluster volume set dist-rep performance.open-behind on
volume set: success

09:11:12 [root@f25client1 ~] # mount -t glusterfs f25node1:/dist-rep /mnt/dist-rep
09:11:25 [root@f25client1 ~] # dd if=/mnt/dist-rep/1g bs=1M of=/dev/null
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 8.16453 s, 132 MB/s
09:12:07 [root@f25client1 ~] # umount /mnt/dist-rep

09:06:54 [root@f25node1 ~] # gluster volume set dist-rep performance.read-ahead-page-count 16

09:13:40 [root@f25client1 ~] # mount -t glusterfs f25node1:/dist-rep /mnt/dist-rep
09:13:44 [root@f25client1 ~] # dd if=/mnt/dist-rep/1g bs=1M of=/dev/null
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 20.1027 s, 53.4 MB/s

09:13:21 [root@f25node1 ~] # gluster volume set dist-rep performance.open-behind off
volume set: success

09:16:32 [root@f25client1 ~] # umount /mnt/dist-rep
09:40:07 [root@f25client1 ~] # mount -t glusterfs f25node1:/dist-rep /mnt/dist-rep
09:40:31 [root@f25client1 ~] # dd if=/mnt/dist-rep/1g bs=1M of=/dev/null
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 10.2474 s, 105 MB/s
```

Raghavendra G (comment #9):

@Milind, I am curious to know whether you see the same behavior by changing the read-ahead page-count without applying the fix. Can you try that?

regards,
Raghavendra

Milind Changire:

(In reply to Raghavendra G from comment #9)

The following is the test run *without* the patch:

```
13:50:23 [root@f25client1 ~] # dd if=/dev/zero of=/mnt/dist-rep/1g bs=1M count=1K
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 101.623 s, 10.6 MB/s
13:53:11 [root@f25client1 ~] # umount /mnt/dist-rep
13:54:30 [root@f25client1 ~] # mount -t glusterfs f25node1:/dist-rep /mnt/dist-rep
13:54:36 [root@f25client1 ~] # dd if=/mnt/dist-rep/1g bs=1M of=/dev/null
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 8.44487 s, 127 MB/s

13:50:40 [root@f25node1 ~] # gluster volume set dist-rep performance.read-ahead-page-count 16

13:56:10 [root@f25client1 ~] # umount /mnt/dist-rep
13:56:35 [root@f25client1 ~] # mount -t glusterfs f25node1:/dist-rep /mnt/dist-rep
13:56:59 [root@f25client1 ~] # dd if=/mnt/dist-rep/1g bs=1M of=/dev/null
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 5.23783 s, 205 MB/s

13:56:52 [root@f25node1 ~] # gluster volume set dist-rep performance.open-behind off

13:57:34 [root@f25client1 ~] # umount /mnt/dist-rep
13:57:57 [root@f25client1 ~] # mount -t glusterfs f25node1:/dist-rep /mnt/dist-rep
13:58:00 [root@f25client1 ~] # dd if=/mnt/dist-rep/1g bs=1M of=/dev/null
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 5.52333 s, 194 MB/s
```

Created attachment 1304787 [details]: perf numbers with patch

Created attachment 1304790 [details]: perf numbers without patch
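Since the attached numbers compare client graphs with and without the reordering patch, one quick sanity check (a hedged suggestion, not something from this report) is to confirm the xlator order in the generated FUSE client volfile. The path below is the typical glusterd location for the dist-rep volume, but the exact filename varies across versions, so treat it as an assumption:

```
# On a server node: list the performance xlators in the order they
# appear in the generated client volfile. With the fix, open calls
# should reach read-ahead instead of being absorbed by open-behind
# before read-ahead ever sees them.
grep -n 'performance/' /var/lib/glusterd/vols/dist-rep/dist-rep.tcp-fuse.vol
```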
Raghavendra G (comment #13):

We can see that performance is affected by the read-ahead page-count in both scenarios, with and without the fix. Milind will run the same tests with the client log-level set to TRACE and attach the logs to the bz; hence setting needinfo on him.

Once the logs are attached, csaba has agreed to take ownership of the bz, hence assigning the bz to csaba.

(In reply to Raghavendra G from comment #13)
> We can see that performance gets affected by the number of read-ahead page
> count in both scenarios - with/without fix.

1. If read-ahead is indeed turned off, tweaking the page-count should have no effect on performance.
2. When read-ahead is enabled, performance with page-count 1 seems to be better than with a higher page-count.

Both of the above behaviors need to be explained.

Milind Changire (comment #15):

(In reply to Raghavendra G from comment #13)

Perf numbers for the TRACE-level logging test runs are available as attachments.

TRACE-level logs are available at:
1. https://nofile.io/f/is0XjWnaLFK/bz-1393419-test-run-without-patch-client-TRACE.log.gz
2. https://nofile.io/f/Oh3lkzaxMSe/bz-1393419-test-run-with-patch-client-TRACE.log.gz

REVIEW: https://review.gluster.org/15811 (perf/read-ahead: read-ahead before open-behind) posted (#4) for review on master by mohammed rafi kc (rkavunga)

REVIEW: https://review.gluster.org/15811 (perf/read-ahead: read-ahead before open-behind) posted (#5) for review on master by mohammed rafi kc (rkavunga)

(In reply to Milind Changire from comment #15)

Without the fix, there is some percentage of opens that open-behind actually winds down the stack, so read-ahead does work for those fds; I guess the variance seen when changing the page-count can be attributed to this. Also note that, as found in bz 1214489, reads are interspersed with getattr calls, so the read-ahead cache is thrown away; with a larger page-count more pages are read ahead and then wasted, which is why performance degrades at larger page-count values.

I verified from the attached logs that patch https://review.gluster.org/15811 makes sure that all opens are witnessed by read-ahead, and it is hence a fix for the current bug. To summarize, patch #15811 is a necessary fix and solves the problem at hand. Remaining problems with read-ahead performance are tracked by other bzs, such as bz 1214489.

REVIEW: https://review.gluster.org/15811 (perf/read-ahead: read-ahead before open-behind) posted (#6) for review on master by Milind Changire

Status?

Raghavendra G should be able to answer this aptly. Redirecting needinfo to him.
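For completeness, the client-side TRACE logging referred to above is controlled through a standard volume option; a minimal sketch, assuming the volume name from this report:

```
# Raise the client (mount) log level to TRACE; logs land in the
# client's glusterfs log directory (typically /var/log/glusterfs/).
gluster volume set dist-rep diagnostics.client-log-level TRACE

# Revert after capturing logs; TRACE is extremely verbose.
gluster volume set dist-rep diagnostics.client-log-level INFO
```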