Bug 1117957

Summary: [perf] Bonnie++ rewrites on glusterfs are much slower compared to NFS mounts.
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Ben Turner <bturner>
Component: glusterfs
Assignee: Bug Updates Notification Mailing List <rhs-bugs>
Status: CLOSED WORKSFORME
QA Contact: Ben Turner <bturner>
Severity: medium
Docs Contact:
Priority: unspecified
Version: rhgs-3.0
CC: amukherj, sander, sankarshan
Target Milestone: ---
Keywords: ZStream
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-02-06 02:42:54 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Ben Turner 2014-07-09 17:23:10 UTC
Description of problem:

The benchmark tool bonnie++ runs several different tests; one of them is rewrite.  During the rewrite test bonnie++ reports:

Rewriting...done

This phase gets a little interesting: bonnie++ actually reads 8K, lseeks back to the start of the block, overwrites the 8K with new data, and loops.

https://blogs.oracle.com/roch/entry/decoding_bonnie

Here is what it looks like in strace:

read(3, "\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250"..., 8192) = 8192
lseek(3, -8192, SEEK_CUR)               = 11262992384
write(3, "\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250"..., 8192) = 8192
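
A minimal C sketch of that read/lseek/write pattern is below; the scratch-file path and the way the block is modified are placeholders for illustration, not bonnie++'s actual code:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define BLOCK 8192

int main(void)
{
    char buf[BLOCK];
    ssize_t n;

    /* Placeholder path for the benchmark scratch file on the mount. */
    int fd = open("/gluster-mount/rewrite-scratch", O_RDWR);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* Read an 8K block, seek back to its start, overwrite it, repeat. */
    while ((n = read(fd, buf, BLOCK)) == BLOCK) {
        buf[0] ^= 0xff;                           /* stand-in for "new data" */
        if (lseek(fd, -(off_t)BLOCK, SEEK_CUR) == (off_t)-1) {
            perror("lseek");
            return 1;
        }
        if (write(fd, buf, BLOCK) != BLOCK) {
            perror("write");
            return 1;
        }
    }

    close(fd);
    return 0;
}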

When I run this test on glusterfs mounts I get:

Using uid:0, gid:0.
Writing intelligently...done
Rewriting...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
TEST            16G           103850  15  4653   6           285507  14 198.4  77
Latency                         524ms   40756ms              1451ms     893ms

1.96,1.96,TEST,1,1404917776,16G,,,,103850,15,4653,6,,,285507,14,198.4,77,,,,,,,,,,,,,,,,,,,524ms,40756ms,,1451ms,893ms,,,,,,

real    64m32.521s
user    0m19.037s
sys     4m26.480s


And on NFS mounts I get:

Using uid:0, gid:0.
Writing intelligently...done
Rewriting...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
TEST            16G           64129   3 83589  13           930327  50 357.4 375
Latency                       32177ms   24486ms             49685us    3258ms

1.96,1.96,TEST,1,1404919200,16G,,,,64129,3,83589,13,,,930327,50,357.4,375,,,,,,,,,,,,,,,,,,,32177ms,24486ms,,49685us,3258ms,,,,,,

real    8m28.655s
user    0m1.705s
sys     1m2.725s

The same group of tests takes ~8 minutes on NFS mounts and ~64 minutes on gluster mounts, and almost all of the time on gluster mounts is spent in the rewrite test.  Rewrites on NFS are seeing ~85 MB/sec whereas glusterfs rewrites are seeing ~4.6 MB/sec.  Now I know that some of this has to do with the small record size bonnie++ uses, but the slowdown seems to be exacerbated in the rewrite test.
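
As a rough sanity check on those numbers: at 4653 KB/sec, rewriting the 16 GB file alone takes roughly 16*1024*1024 KB / 4653 KB/sec ≈ 3600 s, about an hour, which accounts for nearly all of the 64-minute glusterfs run.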

Version-Release number of selected component (if applicable):

glusterfs-3.6.0.24-1.el6_5.x86_64

How reproducible:

Every time.

Steps to Reproduce:
1.  Install bonnie++ from EPEL
2.  bonnie++ -d /gluster-mount -r 16G -s 16G -n 0 -m TEST -f -b -u root

Actual results:

Rewrites on 10G systems are only ~4.6 MB/sec (4653 KB/sec).

Expected results:

Rewrite performance closer to what I see in the read and write tests, which use the same block size.

Additional info:

Comment 3 Sander Hoentjen 2015-05-20 07:50:31 UTC
I see the same behaviour with upstream gluster 3.7