Bug 1059989 - Large writes performance regression between 3.4.0.57rhs-1.el6rhs and 3.4.0.58rhs-1.el6rhs
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: fuse
Version: 2.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Vivek Agarwal
QA Contact: storage-qa-internal@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-01-31 08:41 UTC by Anush Shetty
Modified: 2023-09-14 02:03 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-10-16 10:44:26 UTC
Embargoed:


Attachments
Regression script (7.70 KB, text/x-python)
2014-01-31 08:42 UTC, Anush Shetty

Description Anush Shetty 2014-01-31 08:41:58 UTC
Description of problem: We see a performance regression in large-file writes between 3.4.0.57rhs-1.el6rhs and 3.4.0.58rhs-1.el6rhs


Version-Release number of selected component (if applicable): 3.4.0.58rhs-1.el6rhs


How reproducible: Consistently


Steps to Reproduce:
1. Create a 2x2 Distributed-Replicate volume and mount it on 2 fuse clients
2. Run iozone in clustered mode with the following options: -w -c -e -i 0 -+n -r 64k -s 10g -t 8
3. Run regression script (Attached)
   python is-regression-v2.py throughput 95 10 10 baseline_sample test_sample

Actual results:

Regression in large file writes



Expected results:


Additional info:

# ./calc_avg 69 71

run69 - glusterfs - 3.4.0.57rhs-1.el6rhs - IOZONE - [-w -c -e -i 0 -+n -r 64k -s 10g -t 8] - distrep - (quota off, gsync off)
run71 - glusterfs - 3.4.0.58rhs-1.el6rhs - IOZONE - [-w -c -e -i 0 -+n -r 64k -s 10g -t 8] - distrep - (quota off, gsync off)

Operations                      RUN69   RUN71
-------------------------       ------- -------
write                           113783  97224
read                            180242  175290

======= Throughput write =========

decision parameters:
  sample type = throughput
  confidence threshold =  95.00 %
  max. pct. deviation =  10.00 %
  regression threshold =  10.00 % 
sample stats for baseline:
  min = 112456.210000
  max = 115952.230000
  mean = 113783.530000
  sd = 1893.800198
  pct.dev. =  1.66 %
sample stats for current:
  min = 97048.210000
  max = 97358.390000
  mean = 97224.083333
  sd = 159.212904
  pct.dev. =  0.16 %
CHANGE -14.55 percent
magnitude of change is at least 12.89%
/usr/lib64/python2.6/site-packages/scipy/stats/stats.py:420: DeprecationWarning: scipy.stats.mean is deprecated; please update your code to use numpy.mean.
Please note that:
    - numpy.mean axis argument defaults to None, not 0
    - numpy.mean has a ddof argument to replace bias in a more general manner.
      scipy.stats.mean(a, bias=True) can be replaced by numpy.mean(x,
axis=0, ddof=1).
  axis=0, ddof=1).""", DeprecationWarning)
t-test t-statistic = 15.091865 probability = 0.000112
t-test says that mean of two sample sets differs with probability  99.99%
probability that sample sets have same mean = 0.0001
declaring a performance regression test FAILURE because of lower throughput
RESULT:10


======= Throughput read =========

decision parameters:
  sample type = throughput
  confidence threshold =  95.00 %
  max. pct. deviation =  10.00 %
  regression threshold =  10.00 % 
sample stats for baseline:
  min = 179548.590000
  max = 181565.140000
  mean = 180242.366667
  sd = 1146.013124
  pct.dev. =  0.64 %
sample stats for current:
  min = 174850.550000
  max = 176092.530000
  mean = 175290.820000
  sd = 695.419108
  pct.dev. =  0.40 %
CHANGE -2.75 percent
magnitude of change is at least  2.11%
/usr/lib64/python2.6/site-packages/scipy/stats/stats.py:420: DeprecationWarning: scipy.stats.mean is deprecated; please update your code to use numpy.mean.
Please note that:
    - numpy.mean axis argument defaults to None, not 0
    - numpy.mean has a ddof argument to replace bias in a more general manner.
      scipy.stats.mean(a, bias=True) can be replaced by numpy.mean(x,
axis=0, ddof=1).
  axis=0, ddof=1).""", DeprecationWarning)
t-test t-statistic = 6.397835 probability = 0.003065
t-test says that mean of two sample sets differs with probability  99.69%
probability that sample sets have same mean = 0.0031
RESULT:0
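
The headline figures in the output above can be re-derived from the printed summary statistics alone. The sketch below is not the attached is-regression-v2.py. It assumes three samples per run (not stated in the report, but consistent with the printed min, max, mean and sd) and a pooled-variance two-sample t statistic. All function and variable names are illustrative.

```python
import math


def pct_deviation(mean, sd):
    """Standard deviation expressed as a percentage of the mean."""
    return 100.0 * sd / mean


def pct_change(baseline_mean, current_mean):
    """Relative change of the current mean against the baseline mean."""
    return 100.0 * (current_mean - baseline_mean) / baseline_mean


def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Two-sample Student's t statistic using a pooled variance estimate."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))


# Assumption: three samples per run; the report does not state the count.
N = 3

# (mean, sd) for the write test, copied from the output above.
write_baseline = (113783.530000, 1893.800198)  # run69, 3.4.0.57rhs-1.el6rhs
write_current = (97224.083333, 159.212904)     # run71, 3.4.0.58rhs-1.el6rhs

write_change = pct_change(write_baseline[0], write_current[0])
write_t = pooled_t(write_baseline[0], write_baseline[1], N,
                   write_current[0], write_current[1], N)

print("baseline pct.dev. = %.2f %%" % pct_deviation(*write_baseline))
print("CHANGE %.2f percent" % write_change)
print("t-statistic = %.6f" % write_t)
```

For the write figures this should land close to the reported CHANGE of -14.55 percent and t-statistic of 15.09. Substituting the read means and standard deviations gives the read comparison in the same way.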

Comment 1 Anush Shetty 2014-01-31 08:42:52 UTC
Created attachment 857751
Regression script

Comment 3 Vivek Agarwal 2014-01-31 10:51:44 UTC
Between 57 and 58 we fixed 4 bugs: 977492, 1026787, 829734 and 1056204. None of these are in the I/O path. Discussed this with Anush; we are planning to re-run these tests.

Comment 4 santosh pradhan 2014-06-18 06:17:48 UTC
The I/O throttling fix (BZ 977492), which went into 3.4.0.58rhs-1.el6, might be affecting performance. Your workload was not very big (-s 10g -t 8, i.e. 8 threads each writing 10g, 80g in total).

To get the same performance as the previous build (3.4.0.57rhs-1.el6, the one you compared against), you can set the following parameters to turn I/O throttling off:

gluster volume set <volname> nfs.outstanding-rpc-limit 0
gluster volume set <volname> server.outstanding-rpc-limit 0

Then restart the I/O workload; ideally you should get the same performance.

It should not be a bug.

Thanks,
Santosh
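
The two settings from Comment 4 could be applied from a test harness roughly as follows. This is a minimal sketch: the option names and the `gluster volume set` form come from the comment, while the helper names and the volume name `testvol` are hypothetical. It defaults to a dry run that only prints the commands.

```python
import subprocess

# Option names as given in Comment 4; a value of 0 turns the throttling off.
THROTTLE_OPTIONS = ("nfs.outstanding-rpc-limit", "server.outstanding-rpc-limit")


def throttle_off_commands(volname):
    """Build one 'gluster volume set' command per throttling option."""
    return [["gluster", "volume", "set", volname, option, "0"]
            for option in THROTTLE_OPTIONS]


def apply_commands(commands, dry_run=True):
    """Print the commands, or execute them when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.check_call(cmd)


# 'testvol' is a placeholder volume name, not one taken from the report.
apply_commands(throttle_off_commands("testvol"))
```

Passing dry_run=False would run the commands on a node where the gluster CLI is available. The I/O workload still has to be restarted afterwards, as the comment notes.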

Comment 5 santosh pradhan 2014-07-22 17:40:15 UTC
No update after a month. If there is no issue, should this be closed?

Comment 7 Vivek Agarwal 2014-10-16 10:44:26 UTC
Closing this for now; please reopen if it recurs.

Comment 8 Red Hat Bugzilla 2023-09-14 02:03:01 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days

