Bug 1073763 - network.compression fails simple '--ioengine=sync' fio test
Keywords:
Status: CLOSED EOL
Alias: None
Product: GlusterFS
Classification: Community
Component: compression-xlator
Version: 3.5.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On: 1174016
Blocks:
Reported: 2014-03-07 06:20 UTC by josh@wrale.com
Modified: 2016-06-17 15:56 UTC

Fixed In Version:
Clone Of:
Clones: 1174016
Environment:
Last Closed: 2016-06-17 15:56:43 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Description josh@wrale.com 2014-03-07 06:20:41 UTC
Description of problem:

I have two volumes that are essentially identical apart from their name, their path, and network.compression, which is set to on for one of them (no other compression changes were made).

Running test #1 from this page ( http://docs.gz.ro/fio-perf-tool-nutshell.html ) fails on the compression-enabled volume, but the same test succeeds wonderfully on the volume without wire compression.  At first, both mounts were configured with direct-io-mode=enable; I then disabled it on the compressed mount in the hope that this would help.  Omitting the direct-io-mode option in /etc/fstab made no difference.

FYI: btrfs is used everywhere.

fio test #1 on plain mount (healthy state):

[root@core-n1 ssd-vol-benchmark-n001]# fio --size=20g --bs=64k --rw=write --ioengine=sync --name=fio.write.out.1
fio.write.out.1: (g=0): rw=write, bs=64K-64K/64K-64K/64K-64K, ioengine=sync, iodepth=1
fio-2.0.13
Starting 1 process
fio.write.out.1: Laying out IO file(s) (1 file(s) / 20480MB)
Jobs: 1 (f=1): [W] [100.0% done] [0K/532.3M/0K /s] [0 /8516 /0  iops] [eta 00m:00s]
fio.write.out.1: (groupid=0, jobs=1): err= 0: pid=25928: Fri Mar  7 00:52:17 2014
  write: io=20480MB, bw=546347KB/s, iops=8536 , runt= 38385msec
    clat (usec): min=27 , max=1300.1K, avg=115.08, stdev=2277.05
     lat (usec): min=27 , max=1300.1K, avg=116.23, stdev=2277.05
    clat percentiles (usec):
     |  1.00th=[   32],  5.00th=[   33], 10.00th=[   34], 20.00th=[   35],
     | 30.00th=[   40], 40.00th=[   51], 50.00th=[   95], 60.00th=[  163],
     | 70.00th=[  175], 80.00th=[  187], 90.00th=[  203], 95.00th=[  213],
     | 99.00th=[  235], 99.50th=[  249], 99.90th=[  342], 99.95th=[  350],
     | 99.99th=[  426]
    bw (KB/s)  : min=41277, max=604032, per=100.00%, avg=558639.53, stdev=69053.48
    lat (usec) : 50=37.10%, 100=13.49%, 250=48.91%, 500=0.49%, 750=0.01%
    lat (usec) : 1000=0.01%
    lat (msec) : 2=0.01%, 4=0.01%, 100=0.01%, 2000=0.01%
  cpu          : usr=2.65%, sys=4.15%, ctx=327684, majf=0, minf=166
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=327680/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
  WRITE: io=20480MB, aggrb=546346KB/s, minb=546346KB/s, maxb=546346KB/s, mint=38385msec, maxt=38385msec
[root@core-n1 ssd-vol-benchmark-n001]#


fio test #1 on the mount with network.compression on (fail state, presumably because of wire compression):

[root@core-n1 ssd-vol-benchmark-n002]# fio --size=20g --bs=64k --rw=write --ioengine=sync --name=fio.write.out.1
fio.write.out.1: (g=0): rw=write, bs=64K-64K/64K-64K/64K-64K, ioengine=sync, iodepth=1
fio-2.0.13
Starting 1 process
fio.write.out.1: Laying out IO file(s) (1 file(s) / 20480MB)
fio: pid=26846, err=5/file:engines/sync.c:67, func=xfer, error=Input/output error

fio.write.out.1: (groupid=0, jobs=1): err= 5 (file:engines/sync.c:67, func=xfer, error=Input/output error): pid=26846: Fri Mar  7 01:04:43 2014
  write: io=262144 B, bw=36571KB/s, iops=714 , runt=     7msec
    clat (usec): min=35 , max=990 , avg=293.50, stdev=464.98
     lat (usec): min=36 , max=994 , avg=295.00, stdev=466.62
    clat percentiles (usec):
     |  1.00th=[   35],  5.00th=[   35], 10.00th=[   35], 20.00th=[   35],
     | 30.00th=[   55], 40.00th=[   55], 50.00th=[   55], 60.00th=[   94],
     | 70.00th=[   94], 80.00th=[  988], 90.00th=[  988], 95.00th=[  988],
     | 99.00th=[  988], 99.50th=[  988], 99.90th=[  988], 99.95th=[  988],
     | 99.99th=[  988]
    lat (usec) : 50=20.00%, 100=40.00%, 1000=20.00%
  cpu          : usr=0.00%, sys=0.00%, ctx=7, majf=0, minf=46
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=16.7%, 4=83.3%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=5/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
  WRITE: io=256KB, aggrb=36571KB/s, minb=36571KB/s, maxb=36571KB/s, mint=7msec, maxt=7msec
[root@core-n1 ssd-vol-benchmark-n002]# 
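
For anyone who wants to reproduce this without fio: the sync engine issues plain write(2) calls, one block at a time, so the access pattern is easy to imitate. The following is a minimal sketch only, not what fio does internally in every detail. The function name `sequential_write`, the reused random payload, and the return convention are assumptions made for illustration.

```python
import os


def sequential_write(path, total_bytes, block_size=64 * 1024):
    """Write total_bytes to path in block_size chunks using write(2).

    Returns (bytes_written, err): err is None on success, otherwise the
    errno of the first failed write (for example errno.EIO, which is 5).
    """
    # Assumption: one random block reused for every write. fio's own
    # buffer contents may differ, which could matter for compression.
    block = os.urandom(block_size)
    written = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        while written < total_bytes:
            chunk = block[: min(block_size, total_bytes - written)]
            try:
                # os.write may write fewer bytes than requested;
                # the loop simply continues from the new offset.
                written += os.write(fd, chunk)
            except OSError as exc:
                return written, exc.errno
    finally:
        os.close(fd)
    return written, None
```

Calling `sequential_write(path, 20 * 1024**3)` with `path` pointing at a file on the compressed mount approximates the job above. If the same failure occurs, the second element of the result should be the errno of the failing write and the first element shows how many bytes went through before it.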



[root@core-n1 ~]# gluster pool list
UUID					Hostname	State
3e49dd15-c05f-4ef8-b1e0-c29c59623b45	core-n2.storage-s0.example.vpn	Connected
2a637a4b-b9a9-4dd5-80b4-78474e9e33cb	core-n5.storage-s0.example.vpn	Connected
6cc8c574-a237-4819-a175-e7218c8606d8	core-n6.storage-s0.example.vpn	Connected
36201442-b264-4078-a2d3-ff61a266f9d3	core-n4.storage-s0.example.vpn	Connected
10ac9d8a-ced5-4964-b8f3-802e7ccd2f2f	core-n3.storage-s0.example.vpn	Connected
7e46d31a-cd08-4677-8fd6-a5e7b9d7e7fe	localhost	Connected
[root@core-n1 ~]#



[root@core-n1 ~]# gluster volume info ssd-vol-benchmark-n001

Volume Name: ssd-vol-benchmark-n001
Type: Distributed-Replicate
Volume ID: 30dde773-bcba-4bd1-8ed9-6865571283db
Status: Started
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: core-n1.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n001
Brick2: core-n2.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n001
Brick3: core-n3.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n001
Brick4: core-n4.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n001
Brick5: core-n5.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n001
Brick6: core-n6.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n001
Options Reconfigured:
cluster.server-quorum-type: server
server.allow-insecure: on
performance.io-thread-count: 12
auth.allow: 10.30.*
cluster.server-quorum-ratio: 51%
[root@core-n1 ssd-vol-benchmark-n002]# gluster volume info ssd-vol-benchmark-n002

Volume Name: ssd-vol-benchmark-n002
Type: Distributed-Replicate
Volume ID: 809535b2-19d1-457c-a9f7-8b66094b358b
Status: Started
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: core-n1.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n002
Brick2: core-n2.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n002
Brick3: core-n3.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n002
Brick4: core-n4.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n002
Brick5: core-n5.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n002
Brick6: core-n6.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n002
Options Reconfigured:
network.compression.mode: server
network.compression: on
cluster.server-quorum-type: server
server.allow-insecure: on
performance.io-thread-count: 12
auth.allow: 10.30.*
cluster.server-quorum-ratio: 51%
[root@core-n1 ssd-vol-benchmark-n002]#
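
To check that the two volumes differ only in their compression settings, the "Options Reconfigured" sections of the two `gluster volume info` outputs can be compared mechanically. This is a minimal sketch that parses the text as printed above; the function names and the assumption that the section ends at a blank line or a shell prompt are illustrative and are not part of any gluster tooling.

```python
def reconfigured_options(volume_info_text):
    """Return the 'Options Reconfigured' section as {option: value}."""
    options = {}
    in_section = False
    for raw in volume_info_text.splitlines():
        line = raw.strip()
        if line == "Options Reconfigured:":
            in_section = True
            continue
        if not in_section:
            continue
        # Assumption: a blank line or a shell prompt ends the section.
        if not line or line.startswith("["):
            break
        key, sep, value = line.partition(": ")
        if sep:
            options[key] = value
    return options


def option_diff(info_a, info_b):
    """Return {option: (value_in_a, value_in_b)} for options that differ."""
    a = reconfigured_options(info_a)
    b = reconfigured_options(info_b)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# Option sections copied from the two outputs above.
N001 = """Options Reconfigured:
cluster.server-quorum-type: server
server.allow-insecure: on
performance.io-thread-count: 12
auth.allow: 10.30.*
cluster.server-quorum-ratio: 51%
"""

N002 = """Options Reconfigured:
network.compression.mode: server
network.compression: on
cluster.server-quorum-type: server
server.allow-insecure: on
performance.io-thread-count: 12
auth.allow: 10.30.*
cluster.server-quorum-ratio: 51%
"""

print(option_diff(N001, N002))
```

For the two outputs above, the only differing keys are network.compression and network.compression.mode, both absent on -n001.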


Relevant /etc/fstab entries:


# sda
UUID=3d3b896d-67ef-444c-9f48-4bef621144b6  /boot  ext4   defaults  1 2
UUID=d92d59fe-4e85-4619-8b72-793297d4c076  swap   swap   defaults  0 0
UUID=3fd8c539-dc9e-482b-8f74-76369bf8763e  /                       btrfs  autodefrag,compress=zlib,ssd,thread_pool=12,subvol=root            0 0
UUID=3fd8c539-dc9e-482b-8f74-76369bf8763e  /home                   btrfs  autodefrag,compress=zlib,ssd,thread_pool=12,subvol=home            0 0
UUID=3fd8c539-dc9e-482b-8f74-76369bf8763e  /export/ssd-brick-n001  btrfs  autodefrag,compress=zlib,ssd,thread_pool=12,subvol=ssd-brick-n001  0 0

# sdb
LABEL=hdd-n001  /export/hdd-brick-n001  btrfs  autodefrag,compress=zlib,thread_pool=12,noatime,subvol=hdd-brick-n001  0 0

# sdc
LABEL=hdd-n002  /export/hdd-brick-n002  btrfs  autodefrag,compress=zlib,thread_pool=12,noatime,subvol=hdd-brick-n002  0 0

# ssd-vol-benchmark-n001
core-n1.storage-s0.example.vpn:/ssd-vol-benchmark-n001  /import/gluster/ssd-vol-benchmark-n001  glusterfs  defaults,_netdev,backup-volfile-servers=core-n2.storage-s0.example.vpn:core-n3.storage-s0.example.vpn:core-n4.storage-s0.example.vpn:core-n5.storage-s0.example.vpn:core-n6.storage-s0.example.vpn,direct-io-mode=enable,volume-name=ssd-vol-benchmark-n001  0 0

# ssd-vol-benchmark-n002
core-n1.storage-s0.example.vpn:/ssd-vol-benchmark-n002  /import/gluster/ssd-vol-benchmark-n002  glusterfs  defaults,_netdev,backup-volfile-servers=core-n2.storage-s0.example.vpn:core-n3.storage-s0.example.vpn:core-n4.storage-s0.example.vpn:core-n5.storage-s0.example.vpn:core-n6.storage-s0.example.vpn,direct-io-mode=enable,volume-name=ssd-vol-benchmark-n002  0 0

# ssd-vol-ovirt-iops-n001
core-n1.storage-s0.example.vpn:/ssd-vol-ovirt-iops-n001  /import/gluster/ssd-vol-ovirt-iops-n001  glusterfs  defaults,_netdev,backup-volfile-servers=core-n2.storage-s0.example.vpn:core-n3.storage-s0.example.vpn:core-n4.storage-s0.example.vpn:core-n5.storage-s0.example.vpn:core-n6.storage-s0.example.vpn,direct-io-mode=enable,volume-name=ssd-vol-ovirt-iops-n001  0 0





Running versions:

[root@core-n1 ~]# yum list installed|grep gluster
glusterfs.x86_64                     3.5.0-0.5.beta3.fc19              @/glusterfs-3.5.0-0.5.beta3.fc19.x86_64
glusterfs-api.x86_64                 3.5.0-0.5.beta3.fc19              @/glusterfs-api-3.5.0-0.5.beta3.fc19.x86_64
glusterfs-api-devel.x86_64           3.5.0-0.5.beta3.fc19              @/glusterfs-api-devel-3.5.0-0.5.beta3.fc19.x86_64
glusterfs-cli.x86_64                 3.5.0-0.5.beta3.fc19              @/glusterfs-cli-3.5.0-0.5.beta3.fc19.x86_64
glusterfs-devel.x86_64               3.5.0-0.5.beta3.fc19              @/glusterfs-devel-3.5.0-0.5.beta3.fc19.x86_64
glusterfs-fuse.x86_64                3.5.0-0.5.beta3.fc19              @/glusterfs-fuse-3.5.0-0.5.beta3.fc19.x86_64
glusterfs-geo-replication.x86_64     3.5.0-0.5.beta3.fc19              @/glusterfs-geo-replication-3.5.0-0.5.beta3.fc19.x86_64
glusterfs-libs.x86_64                3.5.0-0.5.beta3.fc19              @/glusterfs-libs-3.5.0-0.5.beta3.fc19.x86_64
glusterfs-rdma.x86_64                3.5.0-0.5.beta3.fc19              @/glusterfs-rdma-3.5.0-0.5.beta3.fc19.x86_64
glusterfs-regression-tests.x86_64    3.5.0-0.5.beta3.fc19              @/glusterfs-regression-tests-3.5.0-0.5.beta3.fc19.x86_64
glusterfs-server.x86_64              3.5.0-0.5.beta3.fc19              @/glusterfs-server-3.5.0-0.5.beta3.fc19.x86_64
[root@core-n1 ~]# uname -a
Linux core-n1.example.com 3.13.5-101.fc19.x86_64 #1 SMP Tue Feb 25 21:25:32 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
[root@core-n1 ~]#

Comment 1 josh@wrale.com 2014-03-07 06:43:17 UTC
I just ran the same two tests on my HDD bricks (the first two tests were on SSD bricks).  I obtained the same result (volumes ending in -n002 have compression enabled, whereas volumes ending in -n001 do not):

[root@core-n1 hdd-vol-benchmark-n002]# fio --size=20g --bs=64k --rw=write --ioengine=sync --name=fio.write.out.1
fio.write.out.1: (g=0): rw=write, bs=64K-64K/64K-64K/64K-64K, ioengine=sync, iodepth=1
fio-2.0.13
Starting 1 process
fio.write.out.1: Laying out IO file(s) (1 file(s) / 20480MB)
fio: pid=28809, err=5/file:engines/sync.c:67, func=xfer, error=Input/output error

fio.write.out.1: (groupid=0, jobs=1): err= 5 (file:engines/sync.c:67, func=xfer, error=Input/output error): pid=28809: Fri Mar  7 01:40:19 2014
  write: io=262144 B, bw=32000KB/s, iops=625 , runt=     8msec
    clat (usec): min=40 , max=1037 , avg=441.50, stdev=462.21
     lat (usec): min=43 , max=1041 , avg=444.75, stdev=462.63
    clat percentiles (usec):
     |  1.00th=[   40],  5.00th=[   40], 10.00th=[   40], 20.00th=[   40],
     | 30.00th=[  114], 40.00th=[  114], 50.00th=[  114], 60.00th=[  572],
     | 70.00th=[  572], 80.00th=[ 1032], 90.00th=[ 1032], 95.00th=[ 1032],
     | 99.00th=[ 1032], 99.50th=[ 1032], 99.90th=[ 1032], 99.95th=[ 1032],
     | 99.99th=[ 1032]
    lat (usec) : 50=20.00%, 250=20.00%, 750=20.00%
    lat (msec) : 2=20.00%
  cpu          : usr=0.00%, sys=0.00%, ctx=8, majf=0, minf=47
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=16.7%, 4=83.3%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=5/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
  WRITE: io=256KB, aggrb=32000KB/s, minb=32000KB/s, maxb=32000KB/s, mint=8msec, maxt=8msec
[root@core-n1 hdd-vol-benchmark-n002]#
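
The summary numbers of the SSD and HDD failures describe the same stopping point: io=262144 B with w=5 issued means four 64 KiB blocks completed and the fifth write returned the error. The short check below recomputes the bandwidth and IOPS figures from io, issued and runt. The helper name `fio_rates` is hypothetical, and truncating to integers is an assumption about how fio rounds its output.

```python
def fio_rates(io_bytes, issued_ios, runtime_ms):
    """Return (bandwidth in KB/s, IOPS), truncated to integers."""
    bw_kb_s = (io_bytes * 1000) // (1024 * runtime_ms)
    iops = (issued_ios * 1000) // runtime_ms
    return bw_kb_s, iops


# SSD run: io=262144 B, w=5 issued, runt=7msec
print(fio_rates(262144, 5, 7))   # (36571, 714)
# HDD run: io=262144 B, w=5 issued, runt=8msec
print(fio_rates(262144, 5, 8))   # (32000, 625)
# Completed 64 KiB blocks before the error
print(262144 // (64 * 1024))     # 4
```

The results match the bw and iops values fio printed for both runs, so the two failures differ only in the 1 msec of runtime.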

Comment 2 Niels de Vos 2014-12-14 20:22:21 UTC
Bug 1174016 has been filed to get this fixed in the mainline version. When patches become available, we can backport them to the release-3.5 branch.

Comment 3 Niels de Vos 2016-06-17 15:56:43 UTC
This bug is being closed because the 3.5 release is marked End-Of-Life. There will be no further updates to this version. If you are still facing this issue in a more current release, please open a new bug against a version that still receives bugfixes.

