Bug 713754 - dm-multipath default rr_min_io too high?
Summary: dm-multipath default rr_min_io too high?
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: device-mapper-multipath
Version: 6.1
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Assignee: Ben Marzinski
QA Contact: Gris Ge
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-06-16 13:38 UTC by Johan Broman
Modified: 2014-06-26 23:24 UTC (History)
16 users

Fixed In Version: device-mapper-multipath-0.4.9-43.el6
Doc Type: Bug Fix
Doc Text:
device-mapper-multipath now has a new configuration parameter, rr_min_io_rq, available in the defaults, devices, and multipaths sections of multipath.conf. rr_min_io no longer has any effect in RHEL 6; it applies only to older kernels. rr_min_io_rq sets the number of requests to route to a path before switching to the next path in the same path group. It defaults to 1.
Clone Of:
Environment:
Last Closed: 2011-12-06 18:07:46 UTC
Target Upstream Version:


Attachments


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2011:1527 normal SHIPPED_LIVE device-mapper-multipath bug fix and enhancement update 2011-12-06 01:02:12 UTC

Description Johan Broman 2011-06-16 13:38:19 UTC
Description of problem:

dm-multipath sets a high rr_min_io value for many storage arrays (100 to 1000). This configuration parameter value seems way too high in RHEL 6, where request-based (rq) dm-multipath is used.

Should this parameter be removed or set lower?

Benchmarking high-end storage arrays has shown that a low value (1) performs much better.
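(For reference, this kind of comparison can be sketched with fio against the multipath device; /dev/mapper/mpathX is a placeholder and the flags are only a suggested starting point, not the exact benchmark used:)

# fio --name=seqread --filename=/dev/mapper/mpathX --ioengine=libaio \
#     --direct=1 --rw=read --bs=4k --iodepth=32 --runtime=60 --time_based

Run once per rr_min_io setting and compare the reported bandwidth and IOPS.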

Regards
Johan

Comment 1 Alasdair Kergon 2011-06-16 13:49:07 UTC
Well I'd be surprised if we ever have "one size fits all" here.
It's a tunable for a good reason, in the same way that the queue depth can, and often should, be tuned according to the configuration and the workload.

What is needed, I think, is an evidence-based tuning guide.  This bugzilla could be used to gather test results, perhaps.

Comment 2 Mike Snitzer 2011-06-16 13:52:22 UTC
This upstream multipath-tools.git commit should be backported:

commit 2b68b839565e38d8b73f1ec79cc6c84f7f3bade4
Author: Christophe Varoqui <christophe.varoqui@opensvc.com>
Date:   Tue Feb 1 00:21:17 2011 +0100

    Support different 'minio' values for rq and bio based dm-multipath
    
    rq based dm-multipath wants a low minio value for optimal performance
    (1-2?) whereas bio-based dm-multipath wants a greater value (128-1000)
    
    Introduce an internal DEFAULT_MINIO_RQ set to 1, and a new configuration
    parameter named 'rr_min_io_rq' usable in the 'default', 'device' and
    'multipath' sections. The internal hardware table entries also have
    the new 'minio_rq' field.
    
    When the dm-multipath driver version is detected to be >= 1.1.0, only
    rr_min_io_rq (cf), minio_rq (hwe) and DEFAULT_MINIO_RQ (default) are
    used. Otherwise, the legacy behaviour is preserved.
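(Side note: the loaded dm-multipath target version that this detection keys off can be checked with dmsetup; the version string shown below is illustrative:)

# dmsetup targets | grep multipath
multipath        v1.1.0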

Comment 4 Alasdair Kergon 2011-06-16 14:13:28 UTC
I remain unconvinced that a number as low as 1 or 2 is a sensible default, and suspect that if it makes a significant difference in a given scenario, it's a clue that there's a bottleneck somewhere else you need to seek out and address.

Some real-world examples of tuning a stack for different workloads would be helpful.

Comment 5 Alasdair Kergon 2011-06-16 14:24:19 UTC
Should multipath-tools take on responsibility (or provide options) for more of this tuning?  (It could start with 'nr_requests', where I think the default value is generally regarded as too low for 'enterprise' set-ups.)

Comment 15 Ben Marzinski 2011-08-14 20:32:30 UTC
I've back-ported the upstream patch. If we come to a better understanding of this, I'm open to changing things. But until then, this change gives customers improved performance almost universally, so it belongs in.

Additionally, I don't see why a value of 1 doesn't make sense. Since multipath now sits below the elevators, no more merging is going to happen, and there isn't a large overhead in checking which path to use next. So picking the optimum path for every merged request doesn't seem like a bad idea.

Comment 18 Tom Coughlan 2011-08-24 16:37:07 UTC
Hi Ben,

I see you have posted an update in BZ 707638 to describe this change in the dm-multipath manual. This change is significant enough that you should also mention it in the Tech. Notes. Unless I am mistaken, the best way to do that is to set the flag here (done) and add some text to the box, above. Thanks for handling this.

Tom

Comment 19 Ben Marzinski 2011-08-25 16:36:03 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
device-mapper-multipath now has a new configuration parameter, rr_min_io_rq, available in the defaults, devices, and multipaths sections of multipath.conf. rr_min_io no longer has any effect in RHEL 6; it applies only to older kernels. rr_min_io_rq sets the number of requests to route to a path before switching to the next path in the same path group. It defaults to 1.
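As a minimal multipath.conf sketch of the new parameter (the value shown is simply the default described above; tune it for your own hardware):

defaults {
        # requests routed to a path before switching (request-based dm-multipath)
        rr_min_io_rq 1
}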

Comment 20 Gris Ge 2011-10-20 07:58:20 UTC
Performed a simple I/O test on a 4Gb FC HBA via 2x4 paths (target is NetApp ONTAP).

The NetApp LUN's built-in config in the "devices" section is "rr_min_io 128".

After this change, that line is ignored and the default setting "rr_min_io_rq 1" is used.

I saw a slight performance regression (the same command was run 5 times, with almost identical results each time):

====================================================================
============= New device-mapper-multipath-0.4.9-45.el6 =============
====================================================================
[root@storageqe-06 ~]# dd if=/dev/mapper/mpathf of=/dev/null bs=4096
524288+0 records in
524288+0 records out
2147483648 bytes (2.1 GB) copied, 7.57309 s, 284 MB/s
[root@storageqe-06 ~]# dd if=/dev/zero of=/dev/mapper/mpathf  bs=4096
dd: writing `/dev/mapper/mpathf': No space left on device
524289+0 records in
524288+0 records out
2147483648 bytes (2.1 GB) copied, 13.2821 s, 162 MB/s

[root@storageqe-06 ~]# dt disable=eof,pstats oncerr=abort min=b max=256k flags=direct of=/dev/mapper/mpathf
Total Statistics:
     Output device/file name: /dev/mapper/mpathf (device type=block)
     Type of I/O's performed: sequential (forward)
   Data pattern read/written: 0x39c39c39
     Total records processed: 32736 with min=512, max=262144, incr=512
     Total bytes transferred: 4294967296 (4194304.000 Kbytes, 4096.000 Mbytes)
      Average transfer rates: 78246808 bytes/sec, 76412.899 Kbytes/sec
     Number I/O's per second: 596.393
      Total passes completed: 1/1
       Total errors detected: 0/1
          Total elapsed time: 00m54.89s
           Total system time: 00m01.09s
             Total user time: 00m15.47s
               Starting time: Thu Oct 20 03:38:26 2011
                 Ending time: Thu Oct 20 03:39:21 2011

====================================================================
============= Old device-mapper-multipath-0.4.9-31.el6 =============
====================================================================
[root@storageqe-06 ~]# dd if=/dev/mapper/mpathf of=/dev/null bs=4096
524288+0 records in
524288+0 records out
2147483648 bytes (2.1 GB) copied, 7.44613 s, 288 MB/s
[root@storageqe-06 ~]# dd if=/dev/zero of=/dev/mapper/mpathf  bs=4096
524289+0 records in
524288+0 records out
2147483648 bytes (2.1 GB) copied, 13.3511 s, 161 MB/s

[root@storageqe-06 ~]# iostat -k 3 | egrep "sde|sdl|sdab|sdan|sds|dm-5|tps"
Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sde             579.33     74154.67         0.00     222464          0
sdl             597.33     76458.67         0.00     229376          0
sds               0.00         0.00         0.00          0          0
sdab            585.33     74922.67         0.00     224768          0
sdan            554.67     70997.33         0.00     212992          0
dm-5           2317.00    296576.00         0.00     889728          0


[root@storageqe-06 ~]# dt disable=eof,pstats oncerr=abort min=b max=256k flags=direct of=/dev/mapper/mpathf
Total Statistics:
     Output device/file name: /dev/mapper/mpathf (device type=block)
     Type of I/O's performed: sequential (forward)
   Data pattern read/written: 0x39c39c39
     Total records processed: 32736 with min=512, max=262144, incr=512
     Total bytes transferred: 4294967296 (4194304.000 Kbytes, 4096.000 Mbytes)
      Average transfer rates: 79301464 bytes/sec, 77442.836 Kbytes/sec
     Number I/O's per second: 604.431
      Total passes completed: 1/1
       Total errors detected: 0/1
          Total elapsed time: 00m54.16s
           Total system time: 00m01.03s
             Total user time: 00m15.29s
               Starting time: Thu Oct 20 03:25:17 2011
                 Ending time: Thu Oct 20 03:26:11 2011

========================================================================

Ben,

After this change, all of the vendors' built-in configs are ignored. Should we respect the vendors' configs by renaming "rr_min_io" to "rr_min_io_rq" in all of the built-in config entries?

Comment 21 Ben Marzinski 2011-10-21 16:29:32 UTC
For the majority of users, setting this very low seems to give the best results.  Now that this change has been made, it's up to the hardware vendors to give us configs that use this new parameter.
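(A vendor override would look something like the sketch below; the vendor/product strings match the NetApp case in comment 20 but are illustrative:)

devices {
        device {
                vendor  "NETAPP"
                product "LUN"
                rr_min_io_rq 1
        }
}

The effective value can then be confirmed from the merged configuration via the multipathd interactive interface, e.g. multipathd -k"show config".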

Comment 22 Gris Ge 2011-10-26 07:01:15 UTC
If you say so, marking this bug VERIFIED. The documentation under /usr/doc/ has also been updated.

Vendors need to submit new configs if they want the "rr_min_io" behavior to keep working on RHEL 6 for their products.

Comment 23 errata-xmlrpc 2011-12-06 18:07:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1527.html

