+++ This bug was initially created as a clone of Bug #1073763 +++

Description of problem:

I have two volumes configured that are basically identical except for name, path, and network.compression set to on (no other compression-related changes were made). Running test #1 from this page ( http://docs.gz.ro/fio-perf-tool-nutshell.html ) fails on the compression-enabled volume, but the same test succeeds wonderfully on the volume without wire compression. At first, both were mounted with direct-io-mode=enable, but I disabled this on the compressed mount in the hope that it would help. Omitting the direct-io-mode option in /etc/fstab did not help. FYI: btrfs is used everywhere.

fio test #1 on the plain mount (healthy state):

[root@core-n1 ssd-vol-benchmark-n001]# fio --size=20g --bs=64k --rw=write --ioengine=sync --name=fio.write.out.1
fio.write.out.1: (g=0): rw=write, bs=64K-64K/64K-64K/64K-64K, ioengine=sync, iodepth=1
fio-2.0.13
Starting 1 process
fio.write.out.1: Laying out IO file(s) (1 file(s) / 20480MB)
Jobs: 1 (f=1): [W] [100.0% done] [0K/532.3M/0K /s] [0 /8516 /0 iops] [eta 00m:00s]
fio.write.out.1: (groupid=0, jobs=1): err= 0: pid=25928: Fri Mar  7 00:52:17 2014
  write: io=20480MB, bw=546347KB/s, iops=8536 , runt= 38385msec
    clat (usec): min=27 , max=1300.1K, avg=115.08, stdev=2277.05
     lat (usec): min=27 , max=1300.1K, avg=116.23, stdev=2277.05
    clat percentiles (usec):
     |  1.00th=[   32],  5.00th=[   33], 10.00th=[   34], 20.00th=[   35],
     | 30.00th=[   40], 40.00th=[   51], 50.00th=[   95], 60.00th=[  163],
     | 70.00th=[  175], 80.00th=[  187], 90.00th=[  203], 95.00th=[  213],
     | 99.00th=[  235], 99.50th=[  249], 99.90th=[  342], 99.95th=[  350],
     | 99.99th=[  426]
    bw (KB/s)  : min=41277, max=604032, per=100.00%, avg=558639.53, stdev=69053.48
    lat (usec) : 50=37.10%, 100=13.49%, 250=48.91%, 500=0.49%, 750=0.01%
    lat (usec) : 1000=0.01%
    lat (msec) : 2=0.01%, 4=0.01%, 100=0.01%, 2000=0.01%
  cpu          : usr=2.65%, sys=4.15%, ctx=327684, majf=0, minf=166
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=327680/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
  WRITE: io=20480MB, aggrb=546346KB/s, minb=546346KB/s, maxb=546346KB/s, mint=38385msec, maxt=38385msec
[root@core-n1 ssd-vol-benchmark-n001]#

fio test #1 on the network.compression-enabled mount (fail state -- presumably because of wire compression):

[root@core-n1 ssd-vol-benchmark-n002]# fio --size=20g --bs=64k --rw=write --ioengine=sync --name=fio.write.out.1
fio.write.out.1: (g=0): rw=write, bs=64K-64K/64K-64K/64K-64K, ioengine=sync, iodepth=1
fio-2.0.13
Starting 1 process
fio.write.out.1: Laying out IO file(s) (1 file(s) / 20480MB)
fio: pid=26846, err=5/file:engines/sync.c:67, func=xfer, error=Input/output error
fio.write.out.1: (groupid=0, jobs=1): err= 5 (file:engines/sync.c:67, func=xfer, error=Input/output error): pid=26846: Fri Mar  7 01:04:43 2014
  write: io=262144 B, bw=36571KB/s, iops=714 , runt= 7msec
    clat (usec): min=35 , max=990 , avg=293.50, stdev=464.98
     lat (usec): min=36 , max=994 , avg=295.00, stdev=466.62
    clat percentiles (usec):
     |  1.00th=[   35],  5.00th=[   35], 10.00th=[   35], 20.00th=[   35],
     | 30.00th=[   55], 40.00th=[   55], 50.00th=[   55], 60.00th=[   94],
     | 70.00th=[   94], 80.00th=[  988], 90.00th=[  988], 95.00th=[  988],
     | 99.00th=[  988], 99.50th=[  988], 99.90th=[  988], 99.95th=[  988],
     | 99.99th=[  988]
    lat (usec) : 50=20.00%, 100=40.00%, 1000=20.00%
  cpu          : usr=0.00%, sys=0.00%, ctx=7, majf=0, minf=46
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=16.7%, 4=83.3%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=5/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
  WRITE: io=256KB, aggrb=36571KB/s, minb=36571KB/s, maxb=36571KB/s, mint=7msec, maxt=7msec
[root@core-n1 ssd-vol-benchmark-n002]#

[root@core-n1 ~]# gluster pool list
UUID                                    Hostname                        State
3e49dd15-c05f-4ef8-b1e0-c29c59623b45    core-n2.storage-s0.example.vpn  Connected
2a637a4b-b9a9-4dd5-80b4-78474e9e33cb    core-n5.storage-s0.example.vpn  Connected
6cc8c574-a237-4819-a175-e7218c8606d8    core-n6.storage-s0.example.vpn  Connected
36201442-b264-4078-a2d3-ff61a266f9d3    core-n4.storage-s0.example.vpn  Connected
10ac9d8a-ced5-4964-b8f3-802e7ccd2f2f    core-n3.storage-s0.example.vpn  Connected
7e46d31a-cd08-4677-8fd6-a5e7b9d7e7fe    localhost                       Connected
[root@core-n1 ~]#

[root@core-n1 ~]# gluster volume info ssd-vol-benchmark-n001

Volume Name: ssd-vol-benchmark-n001
Type: Distributed-Replicate
Volume ID: 30dde773-bcba-4bd1-8ed9-6865571283db
Status: Started
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: core-n1.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n001
Brick2: core-n2.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n001
Brick3: core-n3.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n001
Brick4: core-n4.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n001
Brick5: core-n5.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n001
Brick6: core-n6.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n001
Options Reconfigured:
cluster.server-quorum-type: server
server.allow-insecure: on
performance.io-thread-count: 12
auth.allow: 10.30.*
cluster.server-quorum-ratio: 51%

[root@core-n1 ssd-vol-benchmark-n002]# gluster volume info ssd-vol-benchmark-n002

Volume Name: ssd-vol-benchmark-n002
Type: Distributed-Replicate
Volume ID: 809535b2-19d1-457c-a9f7-8b66094b358b
Status: Started
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: core-n1.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n002
Brick2: core-n2.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n002
Brick3: core-n3.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n002
Brick4: core-n4.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n002
Brick5: core-n5.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n002
Brick6: core-n6.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n002
Options Reconfigured:
network.compression.mode: server
network.compression: on
cluster.server-quorum-type: server
server.allow-insecure: on
performance.io-thread-count: 12
auth.allow: 10.30.*
cluster.server-quorum-ratio: 51%
[root@core-n1 ssd-vol-benchmark-n002]#
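As the "Options Reconfigured" sections show, the only difference between the two volumes is the pair of network.compression options on -n002. They would have been applied with the usual volume-set commands, roughly along these lines (exact invocation not recorded, so treat this as illustrative):

  gluster volume set ssd-vol-benchmark-n002 network.compression on
  gluster volume set ssd-vol-benchmark-n002 network.compression.mode server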
Relevant /etc/fstab entries:

# sda
UUID=3d3b896d-67ef-444c-9f48-4bef621144b6 /boot ext4 defaults 1 2
UUID=d92d59fe-4e85-4619-8b72-793297d4c076 swap swap defaults 0 0
UUID=3fd8c539-dc9e-482b-8f74-76369bf8763e / btrfs autodefrag,compress=zlib,ssd,thread_pool=12,subvol=root 0 0
UUID=3fd8c539-dc9e-482b-8f74-76369bf8763e /home btrfs autodefrag,compress=zlib,ssd,thread_pool=12,subvol=home 0 0
UUID=3fd8c539-dc9e-482b-8f74-76369bf8763e /export/ssd-brick-n001 btrfs autodefrag,compress=zlib,ssd,thread_pool=12,subvol=ssd-brick-n001 0 0
# sdb
LABEL=hdd-n001 /export/hdd-brick-n001 btrfs autodefrag,compress=zlib,thread_pool=12,noatime,subvol=hdd-brick-n001 0 0
# sdc
LABEL=hdd-n002 /export/hdd-brick-n002 btrfs autodefrag,compress=zlib,thread_pool=12,noatime,subvol=hdd-brick-n002 0 0
# ssd-vol-benchmark-n001
core-n1.storage-s0.example.vpn:/ssd-vol-benchmark-n001 /import/gluster/ssd-vol-benchmark-n001 glusterfs defaults,_netdev,backup-volfile-servers=core-n2.storage-s0.example.vpn:core-n3.storage-s0.example.vpn:core-n4.storage-s0.example.vpn:core-n5.storage-s0.example.vpn:core-n6.storage-s0.example.vpn,direct-io-mode=enable,volume-name=ssd-vol-benchmark-n001 0 0
# ssd-vol-benchmark-n002
core-n1.storage-s0.example.vpn:/ssd-vol-benchmark-n002 /import/gluster/ssd-vol-benchmark-n002 glusterfs defaults,_netdev,backup-volfile-servers=core-n2.storage-s0.example.vpn:core-n3.storage-s0.example.vpn:core-n4.storage-s0.example.vpn:core-n5.storage-s0.example.vpn:core-n6.storage-s0.example.vpn,direct-io-mode=enable,volume-name=ssd-vol-benchmark-n002 0 0
# ssd-vol-ovirt-iops-n001
core-n1.storage-s0.example.vpn:/ssd-vol-ovirt-iops-n001 /import/gluster/ssd-vol-ovirt-iops-n001 glusterfs defaults,_netdev,backup-volfile-servers=core-n2.storage-s0.example.vpn:core-n3.storage-s0.example.vpn:core-n4.storage-s0.example.vpn:core-n5.storage-s0.example.vpn:core-n6.storage-s0.example.vpn,direct-io-mode=enable,volume-name=ssd-vol-ovirt-iops-n001 0 0

Running:

[root@core-n1 ~]# yum list installed | grep gluster
glusterfs.x86_64                    3.5.0-0.5.beta3.fc19  @/glusterfs-3.5.0-0.5.beta3.fc19.x86_64
glusterfs-api.x86_64                3.5.0-0.5.beta3.fc19  @/glusterfs-api-3.5.0-0.5.beta3.fc19.x86_64
glusterfs-api-devel.x86_64          3.5.0-0.5.beta3.fc19  @/glusterfs-api-devel-3.5.0-0.5.beta3.fc19.x86_64
glusterfs-cli.x86_64                3.5.0-0.5.beta3.fc19  @/glusterfs-cli-3.5.0-0.5.beta3.fc19.x86_64
glusterfs-devel.x86_64              3.5.0-0.5.beta3.fc19  @/glusterfs-devel-3.5.0-0.5.beta3.fc19.x86_64
glusterfs-fuse.x86_64               3.5.0-0.5.beta3.fc19  @/glusterfs-fuse-3.5.0-0.5.beta3.fc19.x86_64
glusterfs-geo-replication.x86_64    3.5.0-0.5.beta3.fc19  @/glusterfs-geo-replication-3.5.0-0.5.beta3.fc19.x86_64
glusterfs-libs.x86_64               3.5.0-0.5.beta3.fc19  @/glusterfs-libs-3.5.0-0.5.beta3.fc19.x86_64
glusterfs-rdma.x86_64               3.5.0-0.5.beta3.fc19  @/glusterfs-rdma-3.5.0-0.5.beta3.fc19.x86_64
glusterfs-regression-tests.x86_64   3.5.0-0.5.beta3.fc19  @/glusterfs-regression-tests-3.5.0-0.5.beta3.fc19.x86_64
glusterfs-server.x86_64             3.5.0-0.5.beta3.fc19  @/glusterfs-server-3.5.0-0.5.beta3.fc19.x86_64
[root@core-n1 ~]# uname -a
Linux core-n1.example.com 3.13.5-101.fc19.x86_64 #1 SMP Tue Feb 25 21:25:32 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
[root@core-n1 ~]#

--- Additional comment from josh on 2014-03-07 07:43:17 CET ---

I just ran the same two tests on my HDD bricks (the first two tests were on SSD bricks).
I obtained the same result (volumes ending in -n002 have compression enabled, volumes ending in -n001 do not):

[root@core-n1 hdd-vol-benchmark-n002]# fio --size=20g --bs=64k --rw=write --ioengine=sync --name=fio.write.out.1
fio.write.out.1: (g=0): rw=write, bs=64K-64K/64K-64K/64K-64K, ioengine=sync, iodepth=1
fio-2.0.13
Starting 1 process
fio.write.out.1: Laying out IO file(s) (1 file(s) / 20480MB)
fio: pid=28809, err=5/file:engines/sync.c:67, func=xfer, error=Input/output error
fio.write.out.1: (groupid=0, jobs=1): err= 5 (file:engines/sync.c:67, func=xfer, error=Input/output error): pid=28809: Fri Mar  7 01:40:19 2014
  write: io=262144 B, bw=32000KB/s, iops=625 , runt= 8msec
    clat (usec): min=40 , max=1037 , avg=441.50, stdev=462.21
     lat (usec): min=43 , max=1041 , avg=444.75, stdev=462.63
    clat percentiles (usec):
     |  1.00th=[   40],  5.00th=[   40], 10.00th=[   40], 20.00th=[   40],
     | 30.00th=[  114], 40.00th=[  114], 50.00th=[  114], 60.00th=[  572],
     | 70.00th=[  572], 80.00th=[ 1032], 90.00th=[ 1032], 95.00th=[ 1032],
     | 99.00th=[ 1032], 99.50th=[ 1032], 99.90th=[ 1032], 99.95th=[ 1032],
     | 99.99th=[ 1032]
    lat (usec) : 50=20.00%, 250=20.00%, 750=20.00%
    lat (msec) : 2=20.00%
  cpu          : usr=0.00%, sys=0.00%, ctx=8, majf=0, minf=47
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=16.7%, 4=83.3%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=5/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
  WRITE: io=256KB, aggrb=32000KB/s, minb=32000KB/s, maxb=32000KB/s, mint=8msec, maxt=8msec
[root@core-n1 hdd-vol-benchmark-n002]#
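Condensed, the reproduction boils down to the steps below. This is only a restatement of the runs above (same volume and mount point names, ignoring the backup-volfile-servers options from fstab), and a much smaller --size should be enough since the very first 64k writes already return EIO:

  # compression was enabled on the volume as shown earlier (network.compression on, mode server)
  mount -t glusterfs core-n1.storage-s0.example.vpn:/ssd-vol-benchmark-n002 /import/gluster/ssd-vol-benchmark-n002
  cd /import/gluster/ssd-vol-benchmark-n002
  fio --size=256m --bs=64k --rw=write --ioengine=sync --name=fio.write.out.1
  # fails almost immediately with err=5 (Input/output error) from engines/sync.c:xfer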
Same fate as https://bugzilla.redhat.com/show_bug.cgi?id=1073763 ?
This problem may still exist if it is indeed caused by the network.compression (cdc) xlator. This feature is rarely used and not well tested. There is currently no intention to improve the compression functionality. Of course we'll happily accept patches, but there is no plan to look into this bug any time soon.
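For anyone hitting this in the meantime, the obvious workaround, assuming the cdc xlator really is the culprit, is to switch compression back off on the affected volume (volume name taken from the report above):

  gluster volume reset ssd-vol-benchmark-n002 network.compression
  gluster volume reset ssd-vol-benchmark-n002 network.compression.mode

After remounting the clients, re-running the fio job should confirm the volume behaves like the uncompressed -n001 volume again.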