Description of problem:
On an iozone sequential I/O test, an XFS file system on a gluster-block device performs very poorly compared to gluster-fuse and gluster-nfs.

Version-Release number of selected component (if applicable):
glusterfs-3.8.4-25.el7rhgs.x86_64

How reproducible:
Consistently

Steps to Reproduce:
1. Single-brick gluster volume created on an NVMe SSD.
   In the gluster-block case, create a block device of 100GiB (ha 1). On the client, create an XFS filesystem on the discovered gluster-block device and mount it.
   For the gluster-fuse and gluster-nfs cases, mount the gluster volume on the client.
   Client connected via 10GbE.
2. Run an iozone test with the following parameters:
   iozone -+m <conf_file> -i 0 -w -+n -c -C -e -s 4g -r 1024k -t 8
   <drop caches>
   iozone -+m <conf_file> -i 1 -w -+n -c -C -e -s 4g -r 1024k -t 8
   [This iozone command is adapted from a multi-client test. It is being run on a single client here.]

Actual results:
gluster-block:
  Children see throughput for 8 initial writers = 87671.01 kB/sec
  Children see throughput for 8 readers         = 133351.47 kB/sec
gluster-fuse:
  Children see throughput for 8 initial writers = 939760.11 kB/sec
  Children see throughput for 8 readers         = 791956.57 kB/sec
gluster-nfs:
  Children see throughput for 8 initial writers = 989822.15 kB/sec
  Children see throughput for 8 readers         = 1203338.91 kB/sec

Expected results:
On this test gluster-block should be able to perform close to gluster-fuse and gluster-nfs.

Additional info:
Varying the number of iozone threads does not have any significant impact on gluster-block performance. This could indicate too few "io threads", lock contention, or some throttling. Increasing read-ahead on the discovered block device on the client from the default of 128KB to 4MB did not improve read performance.
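For reference, the client-side setup for the gluster-block case amounts to roughly the following (a sketch only; the volume/block names, portal IP, device name and mount point are placeholders, not the exact ones used in this run):

# on a gluster node: create the block device on the volume
gluster-block create blockvol/block0 ha 1 <server-ip> 100GiB

# on the client: discover and log in to the iSCSI target
iscsiadm -m discovery -t sendtargets -p <server-ip>
iscsiadm -m node -l

# make the filesystem on the discovered LUN and mount it
mkfs.xfs /dev/sdX
mount /dev/sdX /mnt/blockdev

# drop caches between the iozone write and read passes
sync; echo 3 > /proc/sys/vm/drop_caches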
I see poor performance with gluster-block in random I/O tests as well. For the same setup as in comment #0:

> 1. single brick gluster volume created on an NVMe SSD.
> In the gluster-block case, create a block device of 100GiB (ha1). On client,
> create XFS filesystem on discovered gluster-block device and mount.
>
> For gluster-fuse and gluster-nfs case, mount the gluster volume on the
> client.
>
> Client connected via 10GbE.

I'm skipping the gluster-nfs comparison for random I/O.

The fio job keeps the total data set at 80g and the data accessed at 5% of that. Number of threads (jobs) and other parameters vary. Here's the job file for 8 threads, 64k access size, random write:

[global]
rw=randwrite
end_fsync=1
bs=64k
startdelay=0
ioengine=sync
direct=1

[lgf-randwrite]
directory=/mnt/glustervol/${HOSTNAME}
filename_format=f.$jobnum.$filenum
numjobs=8
nrfiles=1
openfiles=1
size=10g
io_size=512m

Results for the above case:
Fuse         : write: io=4096.0MB, bw=792275KB/s
gluster-block: write: io=4096.0MB, bw=85010KB/s

For 8 threads, 8k access size, random write:
Fuse         : write: io=4096.0MB, bw=145499KB/s
gluster-block: write: io=4096.0MB, bw=13301KB/s

For 1 thread, 8k access size, random write:
Fuse         : write: io=4096.0MB, bw=33514KB/s
gluster-block: write: io=4096.0MB, bw=10261KB/s

Also, latency is much higher for gluster-block. Looking at the "clat percentiles (usec)" reported:

For 1 thread, 8k access size, random write:
Fuse         : 50.00th=[  223]  95.00th=[  290]
gluster-block: 50.00th=[  764]  95.00th=[  964]

For 8 threads, 8k access size, random write:
Fuse         : 50.00th=[  434]  95.00th=[  502]
gluster-block: 50.00th=[ 4768]  95.00th=[ 5600]

Latency shoots up for multi-threaded writes with gluster-block. That is consistent with the hypothesis in comment #0 that serialization/contention is the cause of poor block performance.
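To make the sizing explicit: with numjobs=8, size=10g and io_size=512m, fio lays out 8 x 10 GiB = 80 GiB of files and touches 8 x 512 MiB = 4 GiB of data, i.e. 5% of the data set. The other data points come from editing bs (and numjobs, for the single-thread case) in this file and re-running it, along the lines of (the file name is just a placeholder):

fio randwrite.job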
Manoj,

gluster-block is backed by a file on the gluster volume opened with O_DIRECT+O_SYNC. These are the results I get when I do a quick test on my laptop with different options on a plain distribute volume:

root@dhcp35-190 - /mnt/d1 14:23:00 :) ⚡ dd if=/dev/zero of=1 bs=1M count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.141712 s, 740 MB/s

root@dhcp35-190 - /mnt/d1 14:23:26 :) ⚡ dd if=/dev/zero of=1 bs=1M count=100 oflag=sync
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 11.0294 s, 9.5 MB/s

root@dhcp35-190 - /mnt/d1 14:23:42 :) ⚡ dd if=/dev/zero of=1 bs=1M count=100 oflag=direct
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 2.71891 s, 38.6 MB/s

So I think the first thing to do here would be to figure out what flags would satisfy the requirements for block without putting extra pressure on storage. Will update with more details once I find something better.

Pranith
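Since the block file is opened with both flags, the closest dd approximation of the gluster-block I/O pattern would combine them, e.g. (same quick-laptop-test caveat as above; this exact run is not shown in the results):

dd if=/dev/zero of=1 bs=1M count=100 oflag=direct,sync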
Manoj,

I tried the same dd inside the gluster-block device and I get 1.6 GB/s, which is twice the speed I got from the normal fuse mount. I guess I will need to debug on your machine itself. Is that possible?

Pranith
(In reply to Pranith Kumar K from comment #8)
[...]
> So I think the first thing to do here would be to figure out what flags
> would satisfy the requirements for block without putting extra pressure on
> storage.

My main concern in comment #0 and comment #6 is that gluster-block throughput does not increase when I increase the number of threads doing I/O. Repeating some of the results from comment #6 here:

For 8 threads, 8k access size, random write:
Fuse         : write: io=4096.0MB, bw=145499KB/s
gluster-block: write: io=4096.0MB, bw=13301KB/s

For 1 thread, 8k access size, random write:
Fuse         : write: io=4096.0MB, bw=33514KB/s
gluster-block: write: io=4096.0MB, bw=10261KB/s

Fuse throughput jumps almost 5x going from single-threaded to 8 threads; gluster-block barely budges. Why? Do the open flags explain this behavior? I don't see that they do.
(In reply to Pranith Kumar K from comment #9)
> Manoj,
> I tried the same dd inside the gluster-block device and I get 1.6 GB/s,
> which is twice the speed I got from the normal fuse mount. I guess I will
> need to debug on your machine itself. Is that possible?
>
> Pranith

Let's try to get you a separate setup. I'm moving to openshift-based testing to focus on the real workloads we are targeting with this feature, so this is not a great time for me to be sharing my setup. I'll send you the link for bagl lab reservations so you can request a small allocation.
Manoj, So here are some results that Vijay and I found testing it with and without O_SYNC. fio comparison gluster-block with O_SYNC gf-randwrite: (g=0): rw=randwrite, bs=64K-64K/64K-64K/64K-64K, ioengine=sync, iodepth=1 ... fio-2.2.8 Starting 8 processes lgf-randwrite: Laying out IO file(s) (1 file(s) / 10240MB) lgf-randwrite: Laying out IO file(s) (1 file(s) / 10240MB) lgf-randwrite: Laying out IO file(s) (1 file(s) / 10240MB) lgf-randwrite: Laying out IO file(s) (1 file(s) / 10240MB) lgf-randwrite: Laying out IO file(s) (1 file(s) / 10240MB) lgf-randwrite: Laying out IO file(s) (1 file(s) / 10240MB) lgf-randwrite: Laying out IO file(s) (1 file(s) / 10240MB) lgf-randwrite: Laying out IO file(s) (1 file(s) / 10240MB) Jobs: 8 (f=8): [F(8)] [100.0% done] [0KB/512KB/0KB /s] [0/8/0 iops] [eta 00m:00s] lgf-randwrite: (groupid=0, jobs=1): err= 0: pid=12860: Tue Jun 27 07:27:17 2017 write: io=524288KB, bw=737545B/s, iops=11, runt=727916msec clat (msec): min=23, max=416, avg=88.69, stdev=36.49 lat (msec): min=23, max=416, avg=88.69, stdev=36.49 clat percentiles (msec): | 1.00th=[ 34], 5.00th=[ 42], 10.00th=[ 43], 20.00th=[ 68], | 30.00th=[ 76], 40.00th=[ 84], 50.00th=[ 92], 60.00th=[ 100], | 70.00th=[ 100], 80.00th=[ 109], 90.00th=[ 109], 95.00th=[ 117], | 99.00th=[ 293], 99.50th=[ 334], 99.90th=[ 375], 99.95th=[ 383], | 99.99th=[ 416] bw (KB /s): min= 172, max= 1109, per=12.62%, avg=726.91, stdev=132.69 lat (msec) : 50=11.84%, 100=60.82%, 250=25.94%, 500=1.40% cpu : usr=0.01%, sys=0.11%, ctx=8273, majf=0, minf=32 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued : total=r=0/w=8192/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0 latency : target=0, window=0, percentile=100.00%, depth=1 lgf-randwrite: (groupid=0, jobs=1): err= 0: pid=12861: Tue Jun 27 07:27:17 2017 write: io=524288KB, bw=737545B/s, iops=11, runt=727916msec clat (msec): min=24, max=416, avg=88.69, stdev=36.50 lat (msec): min=24, max=416, avg=88.69, stdev=36.50 clat percentiles (msec): | 1.00th=[ 34], 5.00th=[ 42], 10.00th=[ 43], 20.00th=[ 68], | 30.00th=[ 76], 40.00th=[ 84], 50.00th=[ 92], 60.00th=[ 100], | 70.00th=[ 100], 80.00th=[ 109], 90.00th=[ 109], 95.00th=[ 117], | 99.00th=[ 293], 99.50th=[ 334], 99.90th=[ 375], 99.95th=[ 383], | 99.99th=[ 416] bw (KB /s): min= 172, max= 1109, per=12.61%, avg=726.86, stdev=133.47 lat (msec) : 50=11.88%, 100=60.39%, 250=26.34%, 500=1.39% cpu : usr=0.02%, sys=0.11%, ctx=8276, majf=0, minf=31 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued : total=r=0/w=8192/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0 latency : target=0, window=0, percentile=100.00%, depth=1 lgf-randwrite: (groupid=0, jobs=1): err= 0: pid=12862: Tue Jun 27 07:27:17 2017 write: io=524288KB, bw=737545B/s, iops=11, runt=727916msec clat (msec): min=23, max=416, avg=88.69, stdev=36.49 lat (msec): min=23, max=416, avg=88.69, stdev=36.49 clat percentiles (msec): | 1.00th=[ 34], 5.00th=[ 42], 10.00th=[ 43], 20.00th=[ 68], | 30.00th=[ 76], 40.00th=[ 84], 50.00th=[ 92], 60.00th=[ 100], | 70.00th=[ 100], 80.00th=[ 109], 90.00th=[ 109], 95.00th=[ 117], | 99.00th=[ 293], 99.50th=[ 334], 99.90th=[ 375], 99.95th=[ 383], | 99.99th=[ 416] bw (KB /s): min= 172, max= 1109, per=12.61%, avg=726.63, 
stdev=133.45 lat (msec) : 50=11.84%, 100=60.42%, 250=26.33%, 500=1.40% cpu : usr=0.01%, sys=0.11%, ctx=8272, majf=0, minf=30 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued : total=r=0/w=8192/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0 latency : target=0, window=0, percentile=100.00%, depth=1 lgf-randwrite: (groupid=0, jobs=1): err= 0: pid=12863: Tue Jun 27 07:27:17 2017 write: io=524288KB, bw=737545B/s, iops=11, runt=727916msec clat (msec): min=23, max=416, avg=88.69, stdev=36.49 lat (msec): min=23, max=416, avg=88.69, stdev=36.49 clat percentiles (msec): | 1.00th=[ 34], 5.00th=[ 42], 10.00th=[ 43], 20.00th=[ 68], | 30.00th=[ 76], 40.00th=[ 84], 50.00th=[ 92], 60.00th=[ 100], | 70.00th=[ 100], 80.00th=[ 109], 90.00th=[ 109], 95.00th=[ 117], | 99.00th=[ 293], 99.50th=[ 334], 99.90th=[ 375], 99.95th=[ 383], | 99.99th=[ 416] bw (KB /s): min= 172, max= 1116, per=12.61%, avg=726.82, stdev=133.62 lat (msec) : 50=11.84%, 100=60.30%, 250=26.46%, 500=1.39% cpu : usr=0.01%, sys=0.10%, ctx=8275, majf=0, minf=32 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued : total=r=0/w=8192/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0 latency : target=0, window=0, percentile=100.00%, depth=1 lgf-randwrite: (groupid=0, jobs=1): err= 0: pid=12864: Tue Jun 27 07:27:17 2017 write: io=524288KB, bw=737545B/s, iops=11, runt=727916msec clat (msec): min=23, max=416, avg=88.69, stdev=36.49 lat (msec): min=23, max=416, avg=88.69, stdev=36.49 clat percentiles (msec): | 1.00th=[ 34], 5.00th=[ 42], 10.00th=[ 43], 20.00th=[ 68], | 30.00th=[ 76], 40.00th=[ 84], 50.00th=[ 92], 60.00th=[ 100], | 70.00th=[ 100], 80.00th=[ 109], 90.00th=[ 109], 95.00th=[ 117], | 99.00th=[ 293], 99.50th=[ 334], 99.90th=[ 375], 99.95th=[ 383], | 99.99th=[ 416] bw (KB /s): min= 172, max= 1080, per=12.61%, avg=726.64, stdev=134.92 lat (msec) : 50=11.84%, 100=60.67%, 250=26.10%, 500=1.39% cpu : usr=0.01%, sys=0.11%, ctx=8272, majf=0, minf=32 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued : total=r=0/w=8192/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0 latency : target=0, window=0, percentile=100.00%, depth=1 lgf-randwrite: (groupid=0, jobs=1): err= 0: pid=12865: Tue Jun 27 07:27:17 2017 write: io=524288KB, bw=737545B/s, iops=11, runt=727916msec clat (msec): min=24, max=416, avg=88.69, stdev=36.49 lat (msec): min=24, max=416, avg=88.69, stdev=36.49 clat percentiles (msec): | 1.00th=[ 34], 5.00th=[ 42], 10.00th=[ 43], 20.00th=[ 68], | 30.00th=[ 76], 40.00th=[ 84], 50.00th=[ 92], 60.00th=[ 100], | 70.00th=[ 100], 80.00th=[ 109], 90.00th=[ 109], 95.00th=[ 117], | 99.00th=[ 293], 99.50th=[ 334], 99.90th=[ 375], 99.95th=[ 383], | 99.99th=[ 416] bw (KB /s): min= 172, max= 1109, per=12.61%, avg=726.72, stdev=132.65 lat (msec) : 50=11.83%, 100=60.63%, 250=26.15%, 500=1.39% cpu : usr=0.01%, sys=0.10%, ctx=8279, majf=0, minf=30 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% 
issued : total=r=0/w=8192/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0 latency : target=0, window=0, percentile=100.00%, depth=1 lgf-randwrite: (groupid=0, jobs=1): err= 0: pid=12866: Tue Jun 27 07:27:17 2017 write: io=524288KB, bw=737545B/s, iops=11, runt=727916msec clat (msec): min=23, max=416, avg=88.69, stdev=36.49 lat (msec): min=23, max=416, avg=88.69, stdev=36.49 clat percentiles (msec): | 1.00th=[ 34], 5.00th=[ 42], 10.00th=[ 43], 20.00th=[ 68], | 30.00th=[ 76], 40.00th=[ 84], 50.00th=[ 92], 60.00th=[ 100], | 70.00th=[ 100], 80.00th=[ 109], 90.00th=[ 109], 95.00th=[ 117], | 99.00th=[ 293], 99.50th=[ 334], 99.90th=[ 375], 99.95th=[ 383], | 99.99th=[ 416] bw (KB /s): min= 172, max= 1066, per=12.61%, avg=726.62, stdev=132.42 lat (msec) : 50=11.83%, 100=60.44%, 250=26.34%, 500=1.39% cpu : usr=0.01%, sys=0.11%, ctx=8275, majf=0, minf=31 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued : total=r=0/w=8192/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0 latency : target=0, window=0, percentile=100.00%, depth=1 lgf-randwrite: (groupid=0, jobs=1): err= 0: pid=12867: Tue Jun 27 07:27:17 2017 write: io=524288KB, bw=737545B/s, iops=11, runt=727916msec clat (msec): min=23, max=416, avg=88.69, stdev=36.49 lat (msec): min=23, max=416, avg=88.69, stdev=36.49 clat percentiles (msec): | 1.00th=[ 34], 5.00th=[ 42], 10.00th=[ 43], 20.00th=[ 68], | 30.00th=[ 76], 40.00th=[ 84], 50.00th=[ 92], 60.00th=[ 100], | 70.00th=[ 100], 80.00th=[ 109], 90.00th=[ 109], 95.00th=[ 117], | 99.00th=[ 293], 99.50th=[ 334], 99.90th=[ 375], 99.95th=[ 383], | 99.99th=[ 416] bw (KB /s): min= 172, max= 1080, per=12.61%, avg=726.52, stdev=134.76 lat (msec) : 50=11.85%, 100=60.46%, 250=26.29%, 500=1.39% cpu : usr=0.01%, sys=0.10%, ctx=8272, majf=0, minf=31 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued : total=r=0/w=8192/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0 latency : target=0, window=0, percentile=100.00%, depth=1 Run status group 0 (all jobs): WRITE: io=4096.0MB, aggrb=5762KB/s, minb=720KB/s, maxb=720KB/s, mint=727916msec, maxt=727916msec Disk stats (read/write): sdb: ios=0/66577, merge=0/1, ticks=0/6116759, in_queue=6116766, util=100.00% ========================================================================== without O_SYNC and just O_DIRECT fio ./random.cfg lgf-randwrite: (g=0): rw=randwrite, bs=64K-64K/64K-64K/64K-64K, ioengine=sync, iodepth=1 ... 
fio-2.2.8 Starting 8 processes lgf-randwrite: Laying out IO file(s) (1 file(s) / 10240MB) lgf-randwrite: Laying out IO file(s) (1 file(s) / 10240MB) lgf-randwrite: Laying out IO file(s) (1 file(s) / 10240MB) lgf-randwrite: Laying out IO file(s) (1 file(s) / 10240MB) lgf-randwrite: Laying out IO file(s) (1 file(s) / 10240MB) lgf-randwrite: Laying out IO file(s) (1 file(s) / 10240MB) lgf-randwrite: Laying out IO file(s) (1 file(s) / 10240MB) lgf-randwrite: Laying out IO file(s) (1 file(s) / 10240MB) Jobs: 7 (f=7): [w(2),_(1),w(5)] [91.5% done] [0KB/21248KB/0KB /s] [0/332/0 iops] [eta 00m:1Jobs: 7 (f=7): [w(2),_(1),w(5)] [92.0% done] [0KB/19200KB/0KB /s] [0/300/0 iops] [eta 00m:1Jobs: 7 (f=7): [w(2),_(1),w(5)] [92.6% done] [0KB/19136KB/0KB /s] [0/299/0 iops] [eta 00m:1Jobs: 7 (f=7): [w(2),_(1),w(5)] [93.1% done] [0KB/20288KB/0KB /s] [0/317/0 iops] [eta 00m:1Jobs: 7 (f=7): [w(2),_(1),w(5)] [93.6% done] [0KB/20544KB/0KB /s] [0/321/0 iops] [eta 00m:1Jobs: 7 (f=7): [w(2),_(1),w(5)] [94.1% done] [0KB/21824KB/0KB /s] [0/341/0 iops] [eta 00m:1Jobs: 6 (f=6): [w(2),_(1),w(1),_(1),w(3)] [94.7% done] [0KB/21312KB/0KB /s] [0/333/0 iops] Jobs: 6 (f=6): [w(2),_(1),w(1),_(1),w(3)] [95.2% done] [0KB/23552KB/0KB /s] [0/368/0 iops] Jobs: 5 (f=5): [w(2),_(1),w(1),_(1),F(2),_(1)] [96.3% done] [0KB/19968KB/0KB /s] [0/312/0 iJobs: 3 (f=3): [w(1),F(1),_(1),w(1),_(4)] [97.3% done] [0KB/20608KB/0KB /s] [0/322/0 iops] Jobs: 1 (f=1): [_(3),w(1),_(4)] [100.0% done] [0KB/17408KB/0KB /s] [0/272/0 iops] [eta 00m:00s] lgf-randwrite: (groupid=0, jobs=1): err= 0: pid=12902: Tue Jun 27 07:32:49 2017 write: io=524288KB, bw=2905.7KB/s, iops=45, runt=180439msec clat (msec): min=1, max=286, avg=22.01, stdev=14.72 lat (msec): min=1, max=286, avg=22.02, stdev=14.72 clat percentiles (usec): | 1.00th=[ 1208], 5.00th=[ 1528], 10.00th=[ 2608], 20.00th=[12096], | 30.00th=[18048], 40.00th=[21376], 50.00th=[23936], 60.00th=[25984], | 70.00th=[27776], 80.00th=[29824], 90.00th=[33024], 95.00th=[36096], | 99.00th=[43776], 99.50th=[52480], 99.90th=[207872], 99.95th=[264192], | 99.99th=[288768] bw (KB /s): min= 1013, max=10645, per=12.58%, avg=2913.65, stdev=1062.92 lat (msec) : 2=8.07%, 4=4.63%, 10=5.44%, 20=17.05%, 50=64.27% lat (msec) : 100=0.22%, 250=0.23%, 500=0.09% cpu : usr=0.07%, sys=0.44%, ctx=8277, majf=0, minf=32 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued : total=r=0/w=8192/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0 latency : target=0, window=0, percentile=100.00%, depth=1 lgf-randwrite: (groupid=0, jobs=1): err= 0: pid=12903: Tue Jun 27 07:32:49 2017 write: io=524288KB, bw=2911.6KB/s, iops=45, runt=180074msec clat (msec): min=1, max=287, avg=21.96, stdev=14.74 lat (msec): min=1, max=287, avg=21.96, stdev=14.74 clat percentiles (usec): | 1.00th=[ 1208], 5.00th=[ 1480], 10.00th=[ 2352], 20.00th=[11840], | 30.00th=[18048], 40.00th=[21376], 50.00th=[23936], 60.00th=[25984], | 70.00th=[27776], 80.00th=[29824], 90.00th=[33024], 95.00th=[35584], | 99.00th=[43776], 99.50th=[51456], 99.90th=[207872], 99.95th=[264192], | 99.99th=[288768] bw (KB /s): min= 1097, max=10645, per=12.61%, avg=2919.07, stdev=997.21 lat (msec) : 2=8.56%, 4=4.35%, 10=5.43%, 20=16.98%, 50=64.16% lat (msec) : 100=0.21%, 250=0.23%, 500=0.09% cpu : usr=0.07%, sys=0.44%, ctx=8273, majf=0, minf=32 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 
4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued : total=r=0/w=8192/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0 latency : target=0, window=0, percentile=100.00%, depth=1 lgf-randwrite: (groupid=0, jobs=1): err= 0: pid=12904: Tue Jun 27 07:32:49 2017 write: io=524288KB, bw=3073.6KB/s, iops=48, runt=170581msec clat (msec): min=1, max=274, avg=20.78, stdev=15.05 lat (msec): min=1, max=274, avg=20.79, stdev=15.05 clat percentiles (usec): | 1.00th=[ 1224], 5.00th=[ 1384], 10.00th=[ 1640], 20.00th=[ 4384], | 30.00th=[16064], 40.00th=[20352], 50.00th=[23424], 60.00th=[25728], | 70.00th=[27520], 80.00th=[29568], 90.00th=[33024], 95.00th=[35584], | 99.00th=[43776], 99.50th=[50944], 99.90th=[195584], 99.95th=[264192], | 99.99th=[272384] bw (KB /s): min= 1013, max=10645, per=13.33%, avg=3087.23, stdev=985.85 lat (msec) : 2=13.48%, 4=4.96%, 10=4.92%, 20=15.52%, 50=60.62% lat (msec) : 100=0.22%, 250=0.22%, 500=0.07% cpu : usr=0.06%, sys=0.41%, ctx=8269, majf=0, minf=30 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued : total=r=0/w=8192/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0 latency : target=0, window=0, percentile=100.00%, depth=1 lgf-randwrite: (groupid=0, jobs=1): err= 0: pid=12905: Tue Jun 27 07:32:49 2017 write: io=524288KB, bw=2894.7KB/s, iops=45, runt=181124msec clat (msec): min=1, max=287, avg=22.10, stdev=14.76 lat (msec): min=1, max=287, avg=22.10, stdev=14.76 clat percentiles (usec): | 1.00th=[ 1176], 5.00th=[ 1480], 10.00th=[ 2672], 20.00th=[12096], | 30.00th=[18048], 40.00th=[21632], 50.00th=[23936], 60.00th=[25984], | 70.00th=[28032], 80.00th=[30080], 90.00th=[33024], 95.00th=[36096], | 99.00th=[44288], 99.50th=[50944], 99.90th=[207872], 99.95th=[264192], | 99.99th=[288768] bw (KB /s): min= 1043, max=13952, per=12.44%, avg=2881.36, stdev=1222.90 lat (msec) : 2=7.87%, 4=4.00%, 10=6.10%, 20=17.10%, 50=64.37% lat (msec) : 100=0.22%, 250=0.24%, 500=0.09% cpu : usr=0.07%, sys=0.44%, ctx=8276, majf=0, minf=32 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued : total=r=0/w=8192/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0 latency : target=0, window=0, percentile=100.00%, depth=1 lgf-randwrite: (groupid=0, jobs=1): err= 0: pid=12906: Tue Jun 27 07:32:49 2017 write: io=524288KB, bw=2970.1KB/s, iops=46, runt=176473msec clat (usec): min=989, max=287010, avg=21509.37, stdev=15038.56 lat (usec): min=990, max=287014, avg=21512.91, stdev=15038.61 clat percentiles (usec): | 1.00th=[ 1208], 5.00th=[ 1448], 10.00th=[ 1880], 20.00th=[ 9152], | 30.00th=[17536], 40.00th=[21120], 50.00th=[23936], 60.00th=[25728], | 70.00th=[27776], 80.00th=[29824], 90.00th=[33024], 95.00th=[35584], | 99.00th=[43264], 99.50th=[50944], 99.90th=[207872], 99.95th=[264192], | 99.99th=[288768] bw (KB /s): min= 1013, max=10645, per=12.88%, avg=2982.87, stdev=958.42 lat (usec) : 1000=0.01% lat (msec) : 2=10.77%, 4=5.09%, 10=4.50%, 20=15.88%, 50=63.23% lat (msec) : 100=0.20%, 250=0.23%, 500=0.09% cpu : usr=0.06%, sys=0.45%, ctx=8271, majf=0, minf=33 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 
0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued : total=r=0/w=8192/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0 latency : target=0, window=0, percentile=100.00%, depth=1 lgf-randwrite: (groupid=0, jobs=1): err= 0: pid=12907: Tue Jun 27 07:32:49 2017 write: io=524288KB, bw=2928.6KB/s, iops=45, runt=179028msec clat (msec): min=1, max=286, avg=21.84, stdev=14.87 lat (msec): min=1, max=286, avg=21.84, stdev=14.87 clat percentiles (usec): | 1.00th=[ 1208], 5.00th=[ 1480], 10.00th=[ 2096], 20.00th=[11584], | 30.00th=[17792], 40.00th=[21376], 50.00th=[23936], 60.00th=[25984], | 70.00th=[27776], 80.00th=[29824], 90.00th=[33024], 95.00th=[36096], | 99.00th=[43776], 99.50th=[51456], 99.90th=[205824], 99.95th=[264192], | 99.99th=[288768] bw (KB /s): min= 887, max=11025, per=12.68%, avg=2937.45, stdev=1052.21 lat (msec) : 2=9.69%, 4=4.22%, 10=5.00%, 20=16.69%, 50=63.85% lat (msec) : 100=0.22%, 250=0.23%, 500=0.09% cpu : usr=0.07%, sys=0.40%, ctx=8275, majf=0, minf=30 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued : total=r=0/w=8192/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0 latency : target=0, window=0, percentile=100.00%, depth=1 lgf-randwrite: (groupid=0, jobs=1): err= 0: pid=12908: Tue Jun 27 07:32:49 2017 write: io=524288KB, bw=2928.6KB/s, iops=45, runt=179028msec clat (msec): min=1, max=286, avg=21.84, stdev=14.83 lat (msec): min=1, max=286, avg=21.84, stdev=14.83 clat percentiles (usec): | 1.00th=[ 1208], 5.00th=[ 1480], 10.00th=[ 2160], 20.00th=[11712], | 30.00th=[18048], 40.00th=[21376], 50.00th=[23936], 60.00th=[25984], | 70.00th=[27776], 80.00th=[29824], 90.00th=[33024], 95.00th=[35584], | 99.00th=[43776], 99.50th=[51456], 99.90th=[207872], 99.95th=[264192], | 99.99th=[288768] bw (KB /s): min= 1013, max=10518, per=12.68%, avg=2935.97, stdev=943.13 lat (msec) : 2=9.20%, 4=4.87%, 10=4.70%, 20=16.48%, 50=64.21% lat (msec) : 100=0.22%, 250=0.23%, 500=0.09% cpu : usr=0.07%, sys=0.44%, ctx=8276, majf=0, minf=30 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued : total=r=0/w=8192/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0 latency : target=0, window=0, percentile=100.00%, depth=1 lgf-randwrite: (groupid=0, jobs=1): err= 0: pid=12909: Tue Jun 27 07:32:49 2017 write: io=524288KB, bw=2931.2KB/s, iops=45, runt=178876msec clat (msec): min=1, max=286, avg=21.81, stdev=14.86 lat (msec): min=1, max=286, avg=21.81, stdev=14.86 clat percentiles (usec): | 1.00th=[ 1208], 5.00th=[ 1464], 10.00th=[ 2040], 20.00th=[11584], | 30.00th=[18048], 40.00th=[21376], 50.00th=[23936], 60.00th=[25984], | 70.00th=[27776], 80.00th=[29824], 90.00th=[33024], 95.00th=[35584], | 99.00th=[43776], 99.50th=[52480], 99.90th=[207872], 99.95th=[264192], | 99.99th=[288768] bw (KB /s): min= 1013, max=10265, per=12.70%, avg=2940.48, stdev=940.95 lat (msec) : 2=9.86%, 4=4.42%, 10=4.64%, 20=16.60%, 50=63.94% lat (msec) : 100=0.23%, 250=0.22%, 500=0.09% cpu : usr=0.06%, sys=0.44%, ctx=8271, majf=0, minf=32 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued : total=r=0/w=8192/d=0, short=r=0/w=0/d=0, 
drop=r=0/w=0/d=0 latency : target=0, window=0, percentile=100.00%, depth=1 Run status group 0 (all jobs): WRITE: io=4096.0MB, aggrb=23157KB/s, minb=2894KB/s, maxb=3073KB/s, mint=170581msec, maxt=181124msec Disk stats (read/write): sdb: ios=0/66068, merge=0/552, ticks=0/1437760, in_queue=1437743, util=100.00% ===================================================================== fuse lgf-randwrite: (g=0): rw=randwrite, bs=64K-64K/64K-64K/64K-64K, ioengine=sync, iodepth=1 ... fio-2.2.8 Starting 8 processes lgf-randwrite: Laying out IO file(s) (1 file(s) / 10240MB) lgf-randwrite: Laying out IO file(s) (1 file(s) / 10240MB) lgf-randwrite: Laying out IO file(s) (1 file(s) / 10240MB) lgf-randwrite: Laying out IO file(s) (1 file(s) / 10240MB) lgf-randwrite: Laying out IO file(s) (1 file(s) / 10240MB) lgf-randwrite: Laying out IO file(s) (1 file(s) / 10240MB) lgf-randwrite: Laying out IO file(s) (1 file(s) / 10240MB) lgf-randwrite: Laying out IO file(s) (1 file(s) / 10240MB) Jobs: 8 (f=8): [F(1),w(1),F(4),w(1),F(1)] [100.0% done] [0KB/7488KB/0KB /s] [0/117/0 iops] Jobs: 8 (f=8): [F(8)] [100.0% done] [0KB/896KB/0KB /s] [0/14/0 iops] [eta 00m:00s] lgf-randwrite: (groupid=0, jobs=1): err= 0: pid=12943: Tue Jun 27 07:40:16 2017 write: io=524288KB, bw=1839.4KB/s, iops=28, runt=285046msec clat (usec): min=224, max=587842, avg=34633.95, stdev=55296.35 lat (usec): min=227, max=587844, avg=34636.73, stdev=55296.35 clat percentiles (usec): | 1.00th=[ 494], 5.00th=[ 1608], 10.00th=[ 3120], 20.00th=[ 4704], | 30.00th=[ 7264], 40.00th=[ 9536], 50.00th=[13376], 60.00th=[18304], | 70.00th=[32128], 80.00th=[48384], 90.00th=[80384], 95.00th=[183296], | 99.00th=[232448], 99.50th=[272384], 99.90th=[493568], 99.95th=[552960], | 99.99th=[585728] bw (KB /s): min= 584, max= 4434, per=12.68%, avg=1865.61, stdev=317.21 lat (usec) : 250=0.04%, 500=1.04%, 750=0.89%, 1000=1.07% lat (msec) : 2=2.84%, 4=9.27%, 10=26.05%, 20=21.17%, 50=18.41% lat (msec) : 100=11.43%, 250=7.23%, 500=0.48%, 750=0.10% cpu : usr=0.09%, sys=0.20%, ctx=16388, majf=0, minf=29 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued : total=r=0/w=8192/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0 latency : target=0, window=0, percentile=100.00%, depth=1 lgf-randwrite: (groupid=0, jobs=1): err= 0: pid=12944: Tue Jun 27 07:40:16 2017 write: io=524288KB, bw=1839.4KB/s, iops=28, runt=285046msec clat (usec): min=208, max=606969, avg=34673.66, stdev=59059.36 lat (usec): min=211, max=606972, avg=34676.41, stdev=59059.37 clat percentiles (usec): | 1.00th=[ 470], 5.00th=[ 1656], 10.00th=[ 2832], 20.00th=[ 4128], | 30.00th=[ 5152], 40.00th=[ 7520], 50.00th=[ 9664], 60.00th=[15424], | 70.00th=[32384], 80.00th=[48896], 90.00th=[92672], 95.00th=[158720], | 99.00th=[264192], 99.50th=[309248], 99.90th=[428032], 99.95th=[468992], | 99.99th=[610304] bw (KB /s): min= 358, max= 4447, per=12.63%, avg=1858.40, stdev=342.69 lat (usec) : 250=0.01%, 500=1.37%, 750=0.98%, 1000=1.25% lat (msec) : 2=2.48%, 4=12.93%, 10=31.75%, 20=13.60%, 50=16.02% lat (msec) : 100=10.16%, 250=7.59%, 500=1.83%, 750=0.05% cpu : usr=0.13%, sys=0.18%, ctx=16390, majf=0, minf=29 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued : total=r=0/w=8192/d=0, 
short=r=0/w=0/d=0, drop=r=0/w=0/d=0 latency : target=0, window=0, percentile=100.00%, depth=1 lgf-randwrite: (groupid=0, jobs=1): err= 0: pid=12945: Tue Jun 27 07:40:16 2017 write: io=524288KB, bw=1839.4KB/s, iops=28, runt=285046msec clat (usec): min=190, max=629174, avg=34633.51, stdev=64418.51 lat (usec): min=192, max=629177, avg=34636.20, stdev=64418.52 clat percentiles (usec): | 1.00th=[ 470], 5.00th=[ 1432], 10.00th=[ 2864], 20.00th=[ 4384], | 30.00th=[ 6368], 40.00th=[ 8160], 50.00th=[ 9792], 60.00th=[13120], | 70.00th=[18304], 80.00th=[27008], 90.00th=[156672], 95.00th=[201728], | 99.00th=[240640], 99.50th=[309248], 99.90th=[481280], 99.95th=[544768], | 99.99th=[626688] bw (KB /s): min= 125, max= 4722, per=12.66%, avg=1863.43, stdev=345.39 lat (usec) : 250=0.02%, 500=1.28%, 750=0.84%, 1000=1.49% lat (msec) : 2=3.19%, 4=10.14%, 10=33.90%, 20=21.25%, 50=13.78% lat (msec) : 100=1.32%, 250=12.01%, 500=0.67%, 750=0.10% cpu : usr=0.09%, sys=0.20%, ctx=16388, majf=0, minf=27 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued : total=r=0/w=8192/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0 latency : target=0, window=0, percentile=100.00%, depth=1 lgf-randwrite: (groupid=0, jobs=1): err= 0: pid=12946: Tue Jun 27 07:40:16 2017 write: io=524288KB, bw=1839.4KB/s, iops=28, runt=285046msec clat (usec): min=130, max=473626, avg=34632.84, stdev=42395.06 lat (usec): min=134, max=473629, avg=34635.58, stdev=42395.05 clat percentiles (usec): | 1.00th=[ 524], 5.00th=[ 2224], 10.00th=[ 3632], 20.00th=[ 6304], | 30.00th=[ 8512], 40.00th=[12096], 50.00th=[17280], 60.00th=[22400], | 70.00th=[45824], 80.00th=[59648], 90.00th=[90624], 95.00th=[121344], | 99.00th=[187392], 99.50th=[211968], 99.90th=[382976], 99.95th=[395264], | 99.99th=[473088] bw (KB /s): min= 240, max= 5322, per=12.64%, avg=1860.39, stdev=387.64 lat (usec) : 250=0.06%, 500=0.81%, 750=0.61%, 1000=1.12% lat (msec) : 2=1.92%, 4=7.32%, 10=23.99%, 20=19.86%, 50=16.99% lat (msec) : 100=18.41%, 250=8.58%, 500=0.33% cpu : usr=0.08%, sys=0.24%, ctx=16389, majf=0, minf=29 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued : total=r=0/w=8192/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0 latency : target=0, window=0, percentile=100.00%, depth=1 lgf-randwrite: (groupid=0, jobs=1): err= 0: pid=12947: Tue Jun 27 07:40:16 2017 write: io=524288KB, bw=1839.4KB/s, iops=28, runt=285046msec clat (usec): min=116, max=499696, avg=34634.29, stdev=46594.74 lat (usec): min=119, max=499697, avg=34637.10, stdev=46594.74 clat percentiles (usec): | 1.00th=[ 490], 5.00th=[ 2064], 10.00th=[ 3184], 20.00th=[ 4640], | 30.00th=[ 7392], 40.00th=[10304], 50.00th=[15040], 60.00th=[20864], | 70.00th=[34048], 80.00th=[59648], 90.00th=[94720], 95.00th=[136192], | 99.00th=[191488], 99.50th=[222208], 99.90th=[391168], 99.95th=[432128], | 99.99th=[497664] bw (KB /s): min= 245, max= 4481, per=12.64%, avg=1860.13, stdev=379.98 lat (usec) : 250=0.07%, 500=0.99%, 750=0.61%, 1000=0.79% lat (msec) : 2=2.31%, 4=10.21%, 10=24.17%, 20=19.85%, 50=17.02% lat (msec) : 100=14.77%, 250=8.80%, 500=0.42% cpu : usr=0.10%, sys=0.22%, ctx=16389, majf=0, minf=29 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 
8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued : total=r=0/w=8192/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0 latency : target=0, window=0, percentile=100.00%, depth=1 lgf-randwrite: (groupid=0, jobs=1): err= 0: pid=12948: Tue Jun 27 07:40:16 2017 write: io=524288KB, bw=1839.4KB/s, iops=28, runt=285046msec clat (usec): min=162, max=625086, avg=34632.58, stdev=55390.48 lat (usec): min=165, max=625088, avg=34635.37, stdev=55390.46 clat percentiles (usec): | 1.00th=[ 478], 5.00th=[ 1736], 10.00th=[ 3472], 20.00th=[ 6496], | 30.00th=[ 8896], 40.00th=[11456], 50.00th=[14016], 60.00th=[17280], | 70.00th=[22912], 80.00th=[37120], 90.00th=[119296], 95.00th=[166912], | 99.00th=[214016], 99.50th=[305152], 99.90th=[432128], 99.95th=[528384], | 99.99th=[626688] bw (KB /s): min= 120, max= 5417, per=12.67%, avg=1863.67, stdev=401.18 lat (usec) : 250=0.05%, 500=0.99%, 750=0.54%, 1000=0.98% lat (msec) : 2=3.32%, 4=5.96%, 10=22.53%, 20=31.95%, 50=17.72% lat (msec) : 100=3.91%, 250=11.44%, 500=0.56%, 750=0.06% cpu : usr=0.12%, sys=0.24%, ctx=16389, majf=0, minf=54 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued : total=r=0/w=8192/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0 latency : target=0, window=0, percentile=100.00%, depth=1 lgf-randwrite: (groupid=0, jobs=1): err= 0: pid=12949: Tue Jun 27 07:40:16 2017 write: io=524288KB, bw=1839.4KB/s, iops=28, runt=285046msec clat (usec): min=202, max=958358, avg=34672.45, stdev=100428.77 lat (usec): min=204, max=958360, avg=34675.23, stdev=100428.81 clat percentiles (usec): | 1.00th=[ 414], 5.00th=[ 1048], 10.00th=[ 2256], 20.00th=[ 3216], | 30.00th=[ 3920], 40.00th=[ 4512], 50.00th=[ 5344], 60.00th=[ 7200], | 70.00th=[ 9536], 80.00th=[14400], 90.00th=[46848], 95.00th=[350208], | 99.00th=[460800], 99.50th=[518144], 99.90th=[716800], 99.95th=[733184], | 99.99th=[954368] bw (KB /s): min= 117, max= 3952, per=12.76%, avg=1877.50, stdev=321.70 lat (usec) : 250=0.02%, 500=2.33%, 750=1.05%, 1000=1.43% lat (msec) : 2=3.59%, 4=22.52%, 10=40.45%, 20=13.21%, 50=5.88% lat (msec) : 100=2.92%, 250=0.71%, 500=5.32%, 750=0.51%, 1000=0.05% cpu : usr=0.11%, sys=0.24%, ctx=16388, majf=0, minf=58 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued : total=r=0/w=8192/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0 latency : target=0, window=0, percentile=100.00%, depth=1 lgf-randwrite: (groupid=0, jobs=1): err= 0: pid=12950: Tue Jun 27 07:40:16 2017 write: io=524288KB, bw=1839.4KB/s, iops=28, runt=285046msec clat (usec): min=198, max=642362, avg=34632.47, stdev=60796.16 lat (usec): min=201, max=642365, avg=34635.21, stdev=60796.14 clat percentiles (usec): | 1.00th=[ 490], 5.00th=[ 1928], 10.00th=[ 2992], 20.00th=[ 4640], | 30.00th=[ 8384], 40.00th=[12608], 50.00th=[16512], 60.00th=[20352], | 70.00th=[24704], 80.00th=[33536], 90.00th=[73216], 95.00th=[220160], | 99.00th=[248832], 99.50th=[317440], 99.90th=[544768], 99.95th=[585728], | 99.99th=[643072] bw (KB /s): min= 344, max= 4740, per=12.69%, avg=1867.42, stdev=329.03 lat (usec) : 250=0.04%, 500=1.03%, 750=0.77%, 1000=0.73% lat (msec) : 2=2.77%, 4=10.46%, 10=18.64%, 20=25.06%, 50=25.68% lat (msec) : 100=7.64%, 250=6.27%, 
500=0.77%, 750=0.13% cpu : usr=0.12%, sys=0.20%, ctx=16388, majf=0, minf=30 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued : total=r=0/w=8192/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0 latency : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
WRITE: io=4096.0MB, aggrb=14714KB/s, minb=1839KB/s, maxb=1839KB/s, mint=285046msec, maxt=285046msec

==========================================================================

While this supports our theory that O_SYNC causes significant delays, we still want to verify what the bottleneck is. To run the fio job with 1 thread, is it enough to change "numjobs=1" in the job file you gave?
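A note for anyone reproducing the two gluster-block runs above: the exact job files are not shown, but one way to get the O_SYNC variant with fio is the sync=1 option, which makes fio open the files with O_SYNC in addition to direct=1. This is an assumption about how the runs were set up, not a record of the actual job files used:

[global]
rw=randwrite
bs=64k
ioengine=sync
direct=1
; drop the next line for the O_DIRECT-only run
sync=1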
(In reply to Pranith Kumar K from comment #12)
[...]
> While this supports our theory that O_SYNC causes significant delays, we
> still want to verify what the bottleneck is. To run the fio job with 1
> thread, is it enough to change "numjobs=1" in the job file you gave?

I change a few other options as well, to keep the total data-set size and the accessed-data size the same:

numjobs=1
nrfiles=1
openfiles=1
size=80g
io_size=4096m

[A suggestion, admittedly based on personal preference :). Could we paste just the key portion of the output in the bz comments and include the detailed output as attachments? That keeps the bz navigable.]
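Putting those changes together with the job file from comment #6, the single-thread job would look roughly like this (a sketch reconstructed from the parameters listed above; the section name and directory are carried over from the earlier job file):

[global]
rw=randwrite
end_fsync=1
bs=64k
startdelay=0
ioengine=sync
direct=1

[lgf-randwrite]
directory=/mnt/glustervol/${HOSTNAME}
filename_format=f.$jobnum.$filenum
numjobs=1
nrfiles=1
openfiles=1
size=80g
io_size=4096m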
(In reply to Manoj Pillai from comment #13)
> (In reply to Pranith Kumar K from comment #12)
> [...]
> > While this supports our theory that O_SYNC causes significant delays, we
> > still want to verify what the bottleneck is. To run the fio job with 1
> > thread, is it enough to change "numjobs=1" in the job file you gave?
>
> I change a few other options as well, to keep the total data-set size and
> the accessed-data size the same:
>
> numjobs=1
> nrfiles=1
> openfiles=1
> size=80g
> io_size=4096m
>
> [A suggestion, admittedly based on personal preference :). Could we paste
> just the key portion of the output in the bz comments and include the
> detailed output as attachments? That keeps the bz navigable.]

Yeah :-), even Prasanna said the same.
Hmm... some stats here that I don't understand. Looking at the fio run from comment #6, randwrite, with 8 threads, bs=8k:

Bandwidth reported is about 13 MB/s:
gluster-block: write: io=4096.0MB, bw=13301KB/s

When I look at iostat on the block device at the client during this interval, I see:

<excerpt>
06/16/2017 03:05:44 AM
Device:  rrqm/s  wrqm/s  r/s  w/s      rkB/s  wkB/s     avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
<snip>
sdc      0.00    0.00    0.00 1643.80  0.00   13149.40  16.00     7.95      4.84   0.00     4.84     0.61   100.00
</excerpt>

This matches expectations: wkB/s corresponds to what the app is reporting, and avgrq-sz is 16, which is basically the 8k request size expressed in 512b sectors.

But when I look at iostat on the back-end device, I see:

<excerpt>
06/16/2017 03:05:46 AM
Device:        rrqm/s  wrqm/s  r/s  w/s      rkB/s  wkB/s      avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
nvme0n1        0.00    0.00    0.00 5086.00  0.00   445277.33  175.10    0.44      0.09   0.00     0.09     0.05   25.63
<snip>
rhsvg1-rhslv1  0.00    0.00    0.00 6772.33  0.00   445192.00  131.47    0.64      0.09   0.00     0.09     0.06   38.67
</excerpt>

Request sizes are much higher, writes/s are much higher, and a lot of device bandwidth is being used. I need to understand the transformation of the I/O stream between these two points.
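The two iostat excerpts above come from watching the discovered block device on the client and the brick devices on the server in parallel while the fio job runs; something along these lines on each side (the 2-second interval is arbitrary):

# run on both the client and the server for the duration of the fio job
iostat -xt 2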
(In reply to Manoj Pillai from comment #15)
[...]
> Request sizes are much higher, writes/s are much higher, and a lot of
> device bandwidth is being used. I need to understand the transformation of
> the I/O stream between these two points.

By default write-behind still caches writes and merges them into a bigger write. So you may want to do "gluster volume set <volname> performance.strict-o-direct on" to instruct write-behind not to cache/merge the writes.
(In reply to Pranith Kumar K from comment #16)
> By default write-behind still caches writes and merges them into a bigger
> write. So you may want to do "gluster volume set <volname>
> performance.strict-o-direct on" to instruct write-behind not to cache/merge
> the writes.

Yes, it is on. In this case, gluster volume info:

Volume Name: perfvol
Type: Distribute
Volume ID: 7e0e12d4-3d2b-4209-91f8-00788dafc424
Status: Started
Snapshot Count: 0
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: smerf04-10ge:/mnt/rhs_brick1
Options Reconfigured:
network.remote-dio: disable
performance.strict-o-direct: on
cluster.lookup-optimize: on
performance.io-cache: off
performance.write-behind: off
performance.read-ahead: off
transport.address-family: inet
nfs.disable: on
(In reply to Manoj Pillai from comment #17)
[...]
> Options Reconfigured:
> network.remote-dio: disable
> performance.strict-o-direct: on
> cluster.lookup-optimize: on
> performance.io-cache: off
> performance.write-behind: off
> performance.read-ahead: off
> transport.address-family: inet
> nfs.disable: on

Hey, write-behind is off here, so this is not related to write-behind at all.
I am doing a round of testing on the tcmu-runner and gluster-block builds mentioned above to verify the impact on functionality. Manoj, please update the perf results once you have them; after that we can move this bug to its final state.
Is there a date by which this needs to be done? See the previous comment.
Re-evaluating with the following versions:

glusterfs-client-xlators-3.8.4-43.el7rhgs.x86_64
glusterfs-cli-3.8.4-43.el7rhgs.x86_64
glusterfs-libs-3.8.4-43.el7rhgs.x86_64
glusterfs-3.8.4-43.el7rhgs.x86_64
glusterfs-fuse-3.8.4-43.el7rhgs.x86_64
glusterfs-server-3.8.4-43.el7rhgs.x86_64
glusterfs-debuginfo-3.8.4-41.el7rhgs.x86_64
glusterfs-api-3.8.4-43.el7rhgs.x86_64
gluster-block-0.2.1-9.el7rhgs.x86_64
tcmu-runner-1.2.0-12.el7rhgs.x86_64
libtcmu-1.2.0-12.el7rhgs.x86_64
kernel-3.10.0-693.el7.x86_64

Comparing fuse and gluster-block on fio tests with a single client, 3 servers, and a 1x3 volume on NVMe bricks; 8 fio jobs running in parallel. The gluster-block setup uses "gluster v set <vol> group gluster-block" and "gluster-block create ... prealloc full ...".

Seq Write, bs=1024k:
fuse : 398 MB/s
block: 338 MB/s (85% of fuse)

Seq Read, bs=1024k:
fuse : 1173 MB/s
block: 624 MB/s (53% of fuse)

Random Read, bs=8K, direct=1:
fuse : 102 MB/s
block: 80 MB/s (78% of fuse)

Random Write, bs=8K, direct=1:
fuse : 63 MB/s
block: 26 MB/s (41% of fuse)

Analysis is in progress. Some of these results are affected by bz #1487495, so the picture from the above results may not be accurate.
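For reference, the gluster-block configuration referred to above amounts to roughly the following (a sketch; the volume name, block name, host list and size are placeholders):

# apply the gluster-block option group on the volume
gluster volume set blockvol group gluster-block

# create a fully preallocated block device with 3-way HA across the servers
gluster-block create blockvol/block0 ha 3 prealloc full <h1>,<h2>,<h3> <size>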
On an fio random I/O test with libaio and multiple files per job, I see gluster-block at about par with glusterfs-fuse on random read. (glusterfs-fuse is still affected by bz #1487495, so that inaccuracy still exists.)

The job file for random write is below (the random read job is similar):

<quote>
[global]
rw=randwrite
end_fsync=1
startdelay=0
ioengine=libaio
direct=1
bs=8k

[randwrite]
directory=/mnt/glustervol/${HOSTNAME}
filename_format=f.$jobnum.$filenum
iodepth=16
numjobs=2
nrfiles=4
openfiles=4
filesize=10g
size=40g
io_size=2048m
</quote>

Random write performance is still poor with gluster-block:
fuse         : 8434 iops
gluster-block: 4321 iops (51% of fuse)
(In reply to Manoj Pillai from comment #46)
> On an fio random I/O test with libaio and multiple files per job, I see
> gluster-block at about par with glusterfs-fuse on random read.
> (glusterfs-fuse is still affected by bz #1487495, so that inaccuracy still
> exists.)

Here's the job file and numbers for random read on this test:

<quote>
[global]
rw=randread
startdelay=0
ioengine=libaio
direct=1
bs=8k

[randread]
directory=/mnt/glustervol/${HOSTNAME}
filename_format=f.$jobnum.$filenum
iodepth=16
numjobs=2
nrfiles=4
openfiles=4
filesize=10g
size=40g
io_size=2048m
</quote>

fuse         : 14500 iops
gluster-block: 16200 iops
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:2773