Bug 1117957

Summary: [perf] Bonnie++ rewrites on glusterfs are much slower compared to NFS mounts.
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Ben Turner <bturner>
Component: glusterfs
Assignee: Bug Updates Notification Mailing List <rhs-bugs>
Status: CLOSED WORKSFORME
QA Contact: Ben Turner <bturner>
Severity: medium
Docs Contact:
Priority: unspecified
Version: rhgs-3.0
CC: amukherj, sander, sankarshan
Target Milestone: ---
Keywords: ZStream
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-02-06 02:42:54 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Ben Turner 2014-07-09 17:23:10 UTC
Description of problem:

The benchmark tool bonnie++ runs several different tests; one of them is rewrite.  During the rewrite test bonnie++ reports:

Rewriting...done

This phase gets a little interesting: bonnie++ actually reads 8K, lseeks back to the start of the block, overwrites the 8K with new data, and loops.

https://blogs.oracle.com/roch/entry/decoding_bonnie

Here is what it looks like in strace:

read(3, "\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250"..., 8192) = 8192
lseek(3, -8192, SEEK_CUR)               = 11262992384
write(3, "\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250\250"..., 8192) = 8192
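
A minimal C sketch of that read/lseek/write pattern is below; the scratch-file path and the way the block is modified are placeholders for illustration, not bonnie++'s actual code:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define BLOCK 8192

int main(void)
{
    char buf[BLOCK];
    ssize_t n;

    /* Placeholder path for the benchmark scratch file on the mount. */
    int fd = open("/gluster-mount/rewrite-scratch", O_RDWR);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* Read an 8K block, seek back to its start, overwrite it, repeat. */
    while ((n = read(fd, buf, BLOCK)) == BLOCK) {
        buf[0] ^= 0xff;                           /* stand-in for "new data" */
        if (lseek(fd, -(off_t)BLOCK, SEEK_CUR) == (off_t)-1) {
            perror("lseek");
            return 1;
        }
        if (write(fd, buf, BLOCK) != BLOCK) {
            perror("write");
            return 1;
        }
    }

    close(fd);
    return 0;
}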

When I run this test on glusterfs mounts I get:

Using uid:0, gid:0.
Writing intelligently...done
Rewriting...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
TEST            16G           103850  15  4653   6           285507  14 198.4  77
Latency                         524ms   40756ms              1451ms     893ms

1.96,1.96,TEST,1,1404917776,16G,,,,103850,15,4653,6,,,285507,14,198.4,77,,,,,,,,,,,,,,,,,,,524ms,40756ms,,1451ms,893ms,,,,,,

real    64m32.521s
user    0m19.037s
sys     4m26.480s


And on NFS mounts I get:

Using uid:0, gid:0.
Writing intelligently...done
Rewriting...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
TEST            16G           64129   3 83589  13           930327  50 357.4 375
Latency                       32177ms   24486ms             49685us    3258ms

1.96,1.96,TEST,1,1404919200,16G,,,,64129,3,83589,13,,,930327,50,357.4,375,,,,,,,,,,,,,,,,,,,32177ms,24486ms,,49685us,3258ms,,,,,,

real    8m28.655s
user    0m1.705s
sys     1m2.725s

The same group of tests takes ~8 minutes on NFS mounts and ~64 minutes on gluster mounts, and almost all of the time on gluster mounts is spent in the rewrite test.  Rewrites on NFS are seeing ~85 MB/sec whereas glusterfs rewrites are seeing ~4.6 MB/sec.  Now I know that some of this has to do with the small record size bonnie++ uses, but the slowdown seems to be exacerbated in the rewrite test.
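
As a rough sanity check on those numbers: at 4653 KB/sec, rewriting the 16 GB file alone takes roughly 16*1024*1024 KB / 4653 KB/sec ≈ 3600 s, about an hour, which accounts for nearly all of the 64-minute glusterfs run.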

Version-Release number of selected component (if applicable):

glusterfs-3.6.0.24-1.el6_5.x86_64

How reproducible:

Every time.

Steps to Reproduce:
1.  Install bonnie++ from EPEL
2.  bonnie++ -d /gluster-mount -r 16G -s 16G -n 0 -m TEST -f -b -u root

Actual results:

Rewrites on 10G systems are only ~4.6 MB/sec (4653 KB/sec).

Expected results:

Rewrite performance closer to what I see in the read and write tests, which use the same block size.

Additional info:

Comment 3 Sander Hoentjen 2015-05-20 07:50:31 UTC
I see the same behaviour with upstream gluster 3.7