Bug 1397854

Summary: [Perf] 10% and 20% drop in sequential writes on SMB v1 and v3 with RHEL 6.8 with a *2 deployment
Product: Red Hat Gluster Storage
Reporter: Karan Sandha <ksandha>
Component: samba
Assignee: Poornima G <pgurusid>
Status: CLOSED WONTFIX
QA Contact: Karan Sandha <ksandha>
Severity: high
Docs Contact:
Priority: unspecified
Version: rhgs-3.2
CC: amukherj, bturner, ksandha, madam, pgurusid, rcyriac, rhinduja, rhs-smb
Target Milestone: ---
Keywords: Regression, ZStream
Target Release: ---
Flags: pgurusid: needinfo? (ksandha)
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: No Doc Update
Doc Text: undefined
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-11-20 03:56:55 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Attachments: Fuse volume profile (flags: none)

Comment 4 rjoseph 2016-11-25 06:41:44 UTC
Can you please provide some more information on this?

1) You said you saw a 10% and 20% drop in sequential writes on 3.8.4.5 with SMB. Have you seen a similar drop in FUSE performance as well? I am asking because there is no significant SMB change in that build, and I want to make sure that it is indeed an SMB issue and not a Gluster one. It would be great to attach the volume profile info from a FUSE mount as well for comparison (a sketch of the commands to collect it follows these questions).

2) You said the regression is seen in RHEL 6.8. So can I assume other RHEL versions are working fine?
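For reference, a minimal sketch of how the requested volume profile can be collected on the Gluster side; the volume name testvol is a placeholder, and the workload step stands in for whichever sequential-write test is being run:

    # start profiling on the volume (placeholder name "testvol")
    gluster volume profile testvol start

    # ... run the sequential-write workload on the FUSE mount ...

    # dump the accumulated per-brick stats, then stop profiling
    gluster volume profile testvol info > fuse-profile.txt
    gluster volume profile testvol stop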

Comment 5 Karan Sandha 2016-11-25 12:33:55 UTC
rjoseph,

1) I am not seeing any performance drop for FUSE on RHEL 6.8 with respect to the baseline. I have performed the tests both with and without md-cache, and the numbers are pretty much the same for large files. I am attaching the volume profile of the FUSE mount for your reference, taken with md-cache enabled (the usual md-cache settings are sketched below the numbers).

2) The regression is only bound to RHEL 6.8; the numbers with 7.3 are similar to the baseline.

3) I took performance numbers with md-cache enabled for SMB v1 and SMB v3, as asked. Below are the numbers:



Performance numbers, sequential writes:

                 3.1.3         3.8.4.5 with md-cache
    SMB v1       1410909       1152805
    SMB v3       1640096       1397110
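For completeness, a minimal sketch of the usual md-cache tuning referred to above; the volume name testvol is a placeholder and the 600-second timeouts are the commonly quoted values, not necessarily what was used on this setup:

    # enable upcall-based cache invalidation so md-cache can hold entries longer
    gluster volume set testvol features.cache-invalidation on
    gluster volume set testvol features.cache-invalidation-timeout 600

    # turn on md-cache and let it honour the invalidation notifications
    gluster volume set testvol performance.stat-prefetch on
    gluster volume set testvol performance.cache-invalidation on
    gluster volume set testvol performance.md-cache-timeout 600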



Please let me know if you want any further information regarding this. 


Thanks & Regards
Karan Sandha

Comment 6 Karan Sandha 2016-11-25 12:34:29 UTC
Created attachment 1224260 [details]
Fuse volume profile

Comment 8 surabhi 2016-11-29 08:58:33 UTC
As discussed in the bug triage meeting, providing qa_ack.

Comment 13 Poornima G 2016-11-30 11:21:17 UTC
Are both the server and the clients on RHEL 6.8? Also, just to be sure, is the "aio write" option disabled in smb.conf?
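For reference, an illustrative smb.conf share definition of the kind being asked about; the share and volume names are placeholders, and the two aio lines are what would disable asynchronous writes and reads:

    # placeholder share name, exporting a placeholder Gluster volume "testvol"
    [gluster-testvol]
        vfs objects = glusterfs
        glusterfs:volume = testvol
        path = /
        read only = no
        # a value of 0 disables Samba's asynchronous (aio) writes and reads
        aio write size = 0
        aio read size = 0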

Comment 14 Poornima G 2016-11-30 11:22:09 UTC
Also, is client io-threads enabled? Is there any difference if io-threads is disabled?
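A sketch of how that option is usually checked and toggled; testvol is again a placeholder volume name:

    # check the current setting
    gluster volume get testvol performance.client-io-threads

    # disable client-side io-threads, re-run the test, then switch it back on
    gluster volume set testvol performance.client-io-threads off
    gluster volume set testvol performance.client-io-threads on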

Comment 17 Poornima G 2016-12-02 04:36:09 UTC
That's very helpful data, thank you. Clearly, enabling client IO threads has decreased performance in the SMB setup. The strange thing is that client IO threads have nothing specific to any RHEL version, so we need to check why there is a performance difference between RHEL 7.3 and 6.8.

Comment 18 Poornima G 2016-12-05 10:34:30 UTC
So, is the sequential write test run for every downstream build? This bug is for 3.8.4-5; was the test run on 3.8.4-1, 3.8.4-2, and 3.8.4-3? If so, can you share the data? This will help us identify where the regression was introduced.

Comment 19 Karan Sandha 2016-12-07 10:04:06 UTC
Sequential Writes     3.1.3          3.8.4.3        3.8.4.5
SMB v1                1387254.18     1232018        1261230.103
SMB v3                1635201.65     1304528.339    1307121.996

Here are the numbers for 3.8.4.3, which is when we first saw this issue. For 3.8.4.2 we were still using the 3.1.3 Samba bits for the performance runs. Let me know, Poornima, if you need any more data.

Comment 29 Poornima G 2018-11-19 05:28:12 UTC
Can this be retested on the latest RHEL 6, to check whether the issue is still seen?

Comment 30 Poornima G 2018-11-20 03:56:55 UTC
As per the discussion with the team, the CIFS kernel client is not a priority at the moment. Hence, closing the bug.