Description of problem:

tcmu-runner has support for tcmalloc, which can improve small-IO performance; see [1]. The related issue and follow-up PR are [2] and [3].

[1] https://github.com/open-iscsi/tcmu-runner/pull/555
[2] https://github.com/open-iscsi/tcmu-runner/issues/561
[3] https://github.com/open-iscsi/tcmu-runner/pull/558

The following are my simple test outputs, from:

fio --filename=/dev/mapper/mpathar -iodepth=64 -ioengine=libaio --direct=1 --rw=rw --bs=512 --size=100M --numjobs=4 --runtime=9099999999999999990 --group_reporting --name=test-read

Without tcmalloc:

bs=512
++++
  read: IOPS=4344, BW=2172KiB/s (2224kB/s)(200MiB/94232msec)
 write: IOPS=4348, BW=2174KiB/s (2227kB/s)(200MiB/94232msec)
-----
  read: IOPS=4286, BW=2143KiB/s (2194kB/s)(200MiB/95517msec)
 write: IOPS=4290, BW=2145KiB/s (2197kB/s)(200MiB/95517msec)
----

bs=1K
++++
  read: IOPS=4104, BW=4104KiB/s (4203kB/s)(200MiB/49869msec)
 write: IOPS=4109, BW=4109KiB/s (4208kB/s)(200MiB/49869msec)
  read: IOPS=3961, BW=3962KiB/s (4057kB/s)(200MiB/51665msec)
 write: IOPS=3966, BW=3966KiB/s (4062kB/s)(200MiB/51665msec)
----

bs=4K
++++
  read: IOPS=2852, BW=11.1MiB/s (11.7MB/s)(199MiB/17871msec)
 write: IOPS=2877, BW=11.2MiB/s (11.8MB/s)(201MiB/17871msec)
  read: IOPS=2886, BW=11.3MiB/s (11.8MB/s)(199MiB/17662msec)
 write: IOPS=2911, BW=11.4MiB/s (11.9MB/s)(201MiB/17662msec)
----

bs=8K
++++
  read: IOPS=2326, BW=18.2MiB/s (19.1MB/s)(198MiB/10920msec)
 write: IOPS=2362, BW=18.5MiB/s (19.3MB/s)(202MiB/10920msec)
  read: IOPS=2276, BW=17.8MiB/s (18.6MB/s)(198MiB/11161msec)
 write: IOPS=2311, BW=18.1MiB/s (18.9MB/s)(202MiB/11161msec)
----

=====================================

With tcmalloc:

bs=512
++++
  read: IOPS=5147, BW=2574KiB/s (2636kB/s)(200MiB/79533msec)
 write: IOPS=5152, BW=2576KiB/s (2638kB/s)(200MiB/79533msec)
  read: IOPS=4767, BW=2384KiB/s (2441kB/s)(200MiB/85876msec)
 write: IOPS=4772, BW=2386KiB/s (2443kB/s)(200MiB/85876msec)
-----

bs=1K
++++
  read: IOPS=4283, BW=4283KiB/s (4386kB/s)(200MiB/47785msec)
 write: IOPS=4288, BW=4288KiB/s (4391kB/s)(200MiB/47785msec)
  read: IOPS=4223, BW=4223KiB/s (4325kB/s)(200MiB/48462msec)
 write: IOPS=4228, BW=4229KiB/s (4330kB/s)(200MiB/48462msec)
----

bs=4K
++++
  read: IOPS=3262, BW=12.7MiB/s (13.4MB/s)(199MiB/15625msec)
 write: IOPS=3290, BW=12.9MiB/s (13.5MB/s)(201MiB/15625msec)
  read: IOPS=2970, BW=11.6MiB/s (12.2MB/s)(199MiB/17160msec)
 write: IOPS=2996, BW=11.7MiB/s (12.3MB/s)(201MiB/17160msec)
----

bs=8K
++++
  read: IOPS=2312, BW=18.1MiB/s (18.9MB/s)(198MiB/10985msec)
 write: IOPS=2348, BW=18.3MiB/s (19.2MB/s)(202MiB/10985msec)
  read: IOPS=2614, BW=20.4MiB/s (21.4MB/s)(198MiB/9716msec)
 write: IOPS=2654, BW=20.7MiB/s (21.7MB/s)(202MiB/9716msec)
----
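Whether a given build actually picked up tcmalloc can be confirmed from its dynamic linkage with ldd. The following is a minimal sketch; the three paths checked assume a standard tcmu-runner RPM layout and may differ on other systems.

```shell
#!/bin/sh
# Report whether a binary or library is dynamically linked against
# tcmalloc (gperftools). The paths checked at the bottom assume a
# standard tcmu-runner RPM install -- adjust as needed.
check_tcmalloc() {
    if [ -e "$1" ] && ldd "$1" | grep -q libtcmalloc; then
        echo "$1: linked against tcmalloc"
    else
        echo "$1: NOT linked against tcmalloc (or file not found)"
    fi
}

check_tcmalloc /usr/bin/tcmu-runner
check_tcmalloc /usr/lib64/libtcmu.so.1
check_tcmalloc /usr/lib64/tcmu-runner/handler_glfs.so
```

An empty grep result (as in the "Without tcmalloc" build) means the binary was compiled without tcmalloc support.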
PR: https://github.com/open-iscsi/tcmu-runner/pull/555
https://github.com/gluster/glusterfs/issues/237 was opened in glusterfs two years ago for the same reason, but it hasn't been taken further since. This is a good time to revisit it, I guess. I am happy if the change is straightforward; I haven't analyzed the effort in glusterfs yet. But it is good to get started somewhere, and I am happy for gluster-block to take the first step.
Performed the steps below to verify the bug:
==============================================

On OCS 3.11.3:
1) Set up OCS 3.11.3
2) Created 4 pods with 4 block PVCs attached
3) Created a file called random-data1.log inside /var/lib/origin/openshift.local.volumes/pods/d90cd57a-b41e-11e9-8c60-005056b2f12a/volumes/kubernetes.io~iscsi/pvc-<name>
4) Created an fio job file, attached here for reference.
5) Ran the job file 4 times, changing the file path to point at the different PVCs, with block sizes 512, 1k, 4k and 8k.

On OCS 3.11.4:
1) Set up OCS 3.11.4
2) Created 4 pods with 4 block PVCs attached
3) Created a file called random-data1.log inside /var/lib/origin/openshift.local.volumes/pods/d90cd57a-b41e-11e9-8c60-005056b2f12a/volumes/kubernetes.io~iscsi/pvc-<name>
4) Created an fio job file, attached here for reference.
5) Ran the job file 4 times, changing the file path to point at the different PVCs, with block sizes 512, 1k, 4k and 8k.

Results for the 3.11.3 and 3.11.4 setups (first block is 3.11.3, second is 3.11.4):
================================================

512
+++++++++++++++++++
   READ: bw=181KiB/s (185kB/s), 181KiB/s-181KiB/s (185kB/s-185kB/s), io=50.0MiB (52.5MB), run=283357-283357msec
  WRITE: bw=181KiB/s (185kB/s), 181KiB/s-181KiB/s (185kB/s-185kB/s), io=49.0MiB (52.4MB), run=283357-283357msec

Run status group 0 (all jobs):
   READ: bw=287KiB/s (294kB/s), 287KiB/s-287KiB/s (294kB/s-294kB/s), io=50.0MiB (52.5MB), run=178424-178424msec
  WRITE: bw=287KiB/s (294kB/s), 287KiB/s-287KiB/s (294kB/s-294kB/s), io=49.0MiB (52.4MB), run=178424-178424msec

Improvement in performance => 58.5 %

1K
+++++++++++++++++
   READ: bw=275KiB/s (282kB/s), 275KiB/s-275KiB/s (282kB/s-282kB/s), io=50.1MiB (52.5MB), run=186220-186220msec
  WRITE: bw=274KiB/s (281kB/s), 274KiB/s-274KiB/s (281kB/s-281kB/s), io=49.9MiB (52.3MB), run=186220-186220msec

   READ: bw=653KiB/s (669kB/s), 653KiB/s-653KiB/s (669kB/s-669kB/s), io=50.1MiB (52.5MB), run=78553-78553msec
  WRITE: bw=651KiB/s (666kB/s), 651KiB/s-651KiB/s (666kB/s-666kB/s), io=49.9MiB (52.3MB), run=78553-78553msec

Improvement in performance => 137 %

4K
++++++++++++++++++++++++
   READ: bw=2642KiB/s (2705kB/s), 2642KiB/s-2642KiB/s (2705kB/s-2705kB/s), io=50.0MiB (52.4MB), run=19381-19381msec
  WRITE: bw=2642KiB/s (2705kB/s), 2642KiB/s-2642KiB/s (2705kB/s-2705kB/s), io=50.0MiB (52.4MB), run=19381-19381msec

   READ: bw=7950KiB/s (8141kB/s), 7950KiB/s-7950KiB/s (8141kB/s-8141kB/s), io=50.0MiB (52.4MB), run=6440-6440msec
  WRITE: bw=7950KiB/s (8141kB/s), 7950KiB/s-7950KiB/s (8141kB/s-8141kB/s), io=50.0MiB (52.4MB), run=6440-6440msec

Improvement in performance => 200 %

8K
++++++++++++++++++++++++++
   READ: bw=2979KiB/s (3050kB/s), 2979KiB/s-2979KiB/s (3050kB/s-3050kB/s), io=49.9MiB (52.3MB), run=17136-17136msec
  WRITE: bw=2997KiB/s (3069kB/s), 2997KiB/s-2997KiB/s (3069kB/s-3069kB/s), io=50.1MiB (52.6MB), run=17136-17136msec

   READ: bw=13.1MiB/s (13.7MB/s), 13.1MiB/s-13.1MiB/s (13.7MB/s-13.7MB/s), io=49.9MiB (52.3MB), run=3812-3812msec
  WRITE: bw=13.2MiB/s (13.8MB/s), 13.2MiB/s-13.2MiB/s (13.8MB/s-13.8MB/s), io=50.1MiB (52.6MB), run=3812-3812msec

Improvement in performance => 351 %

Checking the tcmalloc linkage on both setups:

3.11.3:
===============
[root@dhcp47-16 ~]# oc exec -it glusterfs-storage-cd9f5 bash
[root@dhcp47-58 /]# ldd /usr/bin/tcmu-runner | grep libtcmalloc
[root@dhcp47-58 /]# ldd /usr/lib64/libtcmu.so.1 | grep libtcmalloc
[root@dhcp47-58 /]# ldd /usr/lib64/tcmu-runner/handler_glfs.so | grep libtcmalloc

3.11.4:
=============
[root@dhcp46-78 ~]# oc exec -it glusterfs-storage-9nd5k bash
[root@dhcp47-66 /]# ldd /usr/bin/tcmu-runner | grep libtcmalloc
        libtcmalloc.so.4 => /lib64/libtcmalloc.so.4 (0x00007fef9d23e000)
[root@dhcp47-66 /]# ldd /usr/lib64/libtcmu.so.1 | grep libtcmalloc
        libtcmalloc.so.4 => /lib64/libtcmalloc.so.4 (0x00007facfb81f000)
[root@dhcp47-66 /]# ldd /usr/lib64/tcmu-runner/handler_glfs.so | grep libtcmalloc
        libtcmalloc.so.4 => /lib64/libtcmalloc.so.4 (0x00007fbcc55e6000)

@prasanna, the above are the numbers I see when I run the fio workload here. Should we wait for Elvir's tests as well before moving the bug to the verified state?
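For reference, the improvement percentages above follow from comparing the aggregate bandwidths (or, equivalently, the run times) of the two setups: (new - old) / old * 100. A minimal sketch of that arithmetic, fed with the already-rounded bandwidth figures from this comment (so the results land within rounding of the percentages reported):

```shell
#!/bin/sh
# Percentage improvement between two fio bandwidth figures:
#   improvement = (new - old) / old * 100
improvement() {
    awk -v old="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (new - old) / old * 100 }'
}

improvement 181 287   # 512-byte READ bw: 3.11.3 vs 3.11.4
improvement 275 653   # 1k READ bw: 3.11.3 vs 3.11.4
```

The units cancel, so the same function works for KiB/s or MiB/s inputs as long as both arguments use the same unit.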
The numbers look really interesting and exciting! That is a lot of improvement.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:3256
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days