Bug 1459101

Summary: [GSS] low sequential write performance on distributed dispersed volume on RHGS 3.2
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Prerna Sony <psony>
Component: disperse    Assignee: Ashish Pandey <aspandey>
Status: CLOSED ERRATA QA Contact: Nag Pavan Chilakam <nchilaka>
Severity: medium Docs Contact:
Priority: medium    
Version: rhgs-3.2    CC: amukherj, aspandey, ccalhoun, pkarampu, psony, rhinduja, rhs-bugs, sheggodu, srmukher, storage-qa-internal, ubansal
Target Milestone: ---    Keywords: FutureFeature, Performance
Target Release: RHGS 3.4.0    Flags: aspandey: needinfo-
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: glusterfs-3.12.2-2    Doc Type: Enhancement
Doc Text:
Previously, write FOPs that modified non-overlapping offset ranges of the same file were not dispatched in parallel. This prevented optimum performance, especially when a slow brick caused each FOP to take longer to return. With the implementation of the parallel-writes feature, write FOP performance is significantly improved.
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-09-04 06:32:21 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1408949, 1472361, 1503132    

Description Prerna Sony 2017-06-06 09:52:13 UTC
Description of problem:
     High write response time when performing sequential writes to a distributed dispersed Gluster volume
     
     
Version-Release number of selected component (if applicable):
	RHGS 3.2
     
     
How reproducible:
	The issue is reproducible in the customer environment
	
           
Actual results:
	Write response time is very high.
     
     
Expected results:
	Write response time should be low.
     
     
Additional info:

The customer is using the glusterfs FUSE client to mount the volume.
Sequential writes are being performed on the volume. The number of directories being written to is not high; large files are being written, varying in size from a few MBs to hundreds of GBs.
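
For reference, a minimal sketch of reproducing this kind of workload is shown below. The server name, volume name, mount point, and file size are placeholders for illustration only; they are not taken from the customer environment.

# Hypothetical reproduction sketch: mount the distributed-dispersed volume
# over FUSE and issue a large sequential write, similar to the customer workload.
# "server1:/distdisp" and /mnt/distdisp are placeholder names.
mount -t glusterfs server1:/distdisp /mnt/distdisp

# Sequential write of a single large file (10 GB chosen only as an example size);
# oflag=direct bypasses the client page cache so the write path itself is measured.
dd if=/dev/zero of=/mnt/distdisp/bigfile bs=1M count=10240 oflag=direct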

Comment 30 Nag Pavan Chilakam 2018-04-02 15:30:32 UTC
Checked with different block sizes and noticed that performance improves once the block size is 6k or above (see the transcript below; a scripted sweep sketch follows it).



[root@rhs-client18 zen]# mkdir test
[root@rhs-client18 zen]# cd test
[root@rhs-client18 test]# dd if=/dev/zero of=odirectParallelOFF64KbBlock bs=64k count=10000 oflag=direct
10000+0 records in
10000+0 records out
655360000 bytes (655 MB) copied, 13.3659 s, 49.0 MB/s
[root@rhs-client18 test]# cd 
[root@rhs-client18 ~]# cd /mnt/zen
[root@rhs-client18 zen]# cd test
[root@rhs-client18 test]# ls
odirectParallelOFF64KbBlock
[root@rhs-client18 test]# dd if=/dev/zero of=odirectParallelON64KbBlock bs=64k count=10000 oflag=direct
10000+0 records in
10000+0 records out
655360000 bytes (655 MB) copied, 9.14361 s, 71.7 MB/s
[root@rhs-client18 test]# 
[root@rhs-client18 test]# dd if=/dev/zero of=z{1..10}_odirectParallelON64KbBlock bs=64k count=10000 oflag=direct
10000+0 records in
10000+0 records out
655360000 bytes (655 MB) copied, 8.91855 s, 73.5 MB/s
[root@rhs-client18 test]# dd if=/dev/zero of=z1_odirectParallelON64KbBlock bs=64k count=10000 oflag=direct
10000+0 records in
10000+0 records out
655360000 bytes (655 MB) copied, 9.48154 s, 69.1 MB/s
[root@rhs-client18 test]# dd if=/dev/zero of=z_odirectParallelON64KbBlock_{1..10} bs=64k count=10000 oflag=direct
10000+0 records in
10000+0 records out
655360000 bytes (655 MB) copied, 9.02281 s, 72.6 MB/s
[root@rhs-client18 test]# 
[root@rhs-client18 test]# dd if=/dev/zero of=z_odirectParallelON2048bBlock_{1..10} bs=2048 count=10000 oflag=direct
10000+0 records in
10000+0 records out
20480000 bytes (20 MB) copied, 0.639359 s, 32.0 MB/s
[root@rhs-client18 test]# dd if=/dev/zero of=z_odirectParallelON2048bBlock_{1..10} bs=1k count=10000 oflag=direct
10000+0 records in
10000+0 records out
10240000 bytes (10 MB) copied, 0.594572 s, 17.2 MB/s
[root@rhs-client18 test]# dd if=/dev/zero of=z_odirectParallelON2048bBlock_{1..10} bs=6k count=10000 oflag=direct
10000+0 records in
10000+0 records out
61440000 bytes (61 MB) copied, 0.818316 s, 75.1 MB/s
[root@rhs-client18 test]# dd if=/dev/zero of=z_odirectParallelON2048bBlock_{1..10} bs=4k count=10000 oflag=direct
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 0.699695 s, 58.5 MB/s
[root@rhs-client18 test]# dd if=/dev/zero of=z_odirectParallelON2048bBlock_{1..10} bs=3k count=10000 oflag=direct
10000+0 records in
10000+0 records out
30720000 bytes (31 MB) copied, 0.902583 s, 34.0 MB/s
[root@rhs-client18 test]# dd if=/dev/zero of=z_odirectParallelON2048bBlock_{1..10} bs=6k count=10000 oflag=direct
10000+0 records in
10000+0 records out
61440000 bytes (61 MB) copied, 0.829471 s, 74.1 MB/s
[root@rhs-client18 test]# dd if=/dev/zero of=z_odirectParallelON2048bBlock_{1..10} bs=5k count=10000 oflag=direct
10000+0 records in
10000+0 records out
51200000 bytes (51 MB) copied, 0.777616 s, 65.8 MB/s
[root@rhs-client18 test]# dd if=/dev/zero of=z_odirectParallelON2048bBlock_{1..10} bs=1000k count=10 oflag=direct
10+0 records in
10+0 records out
10240000 bytes (10 MB) copied, 0.161007 s, 63.6 MB/s
[root@rhs-client18 test]# dd if=/dev/zero of=z_odirectParallelON2048bBlock_{1..10} bs=123 count=100 oflag=direct
100+0 records in
100+0 records out
12300 bytes (12 kB) copied, 0.0277637 s, 443 kB/s
[root@rhs-client18 test]# dd if=/dev/zero of=z_odirectParallelON2048bBlock_{1..10} bs=123 count=100000 oflag=direct
100000+0 records in
100000+0 records out
12300000 bytes (12 MB) copied, 8.16425 s, 1.5 MB/s
[root@rhs-client18 test]#
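
A small loop like the one below could automate the block-size sweep above. This is only a sketch, assuming the same /mnt/zen/test mount path and direct I/O as in the transcript.

# Sketch: repeat the block-size sweep above in one loop.
# Assumes the FUSE mount at /mnt/zen/test used in the transcript.
for bs in 1k 2k 3k 4k 5k 6k 64k 1000k; do
    dd if=/dev/zero of=/mnt/zen/test/sweep_${bs} bs=${bs} count=10000 oflag=direct 2>&1 | tail -1
done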

Comment 31 Nag Pavan Chilakam 2018-04-03 13:13:23 UTC
Based on comment#30 and the doc text, performance increases with a bigger block size.
Hence moving to verified (sanity).
If I see any other issues, I will raise a separate bug.
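
For completeness, the parallel-writes behaviour referenced in the doc text is exposed as a volume option in upstream glusterfs. The sketch below assumes the upstream option name disperse.parallel-writes and a volume named "zen" (inferred from the /mnt/zen mount path in comment 30); both names are assumptions.

# Assumption: option name disperse.parallel-writes (upstream name) and
# volume name "zen" inferred from the mount path used above.
gluster volume get zen disperse.parallel-writes     # show the current value
gluster volume set zen disperse.parallel-writes on  # enable parallel writes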

Comment 34 errata-xmlrpc 2018-09-04 06:32:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607