Bug 755046 - max_segments in dm is always 128
Summary: max_segments in dm is always 128
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
: ---
Assignee: Mike Snitzer
QA Contact: Storage QE
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-11-18 16:29 UTC by Shuichi Ihara
Modified: 2013-03-27 15:37 UTC
CC List: 12 users

Fixed In Version: kernel-2.6.32-230.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-06-20 08:06:44 UTC
Target Upstream Version:
Embargoed:


Attachments
set dm's max_segments to physical device's max_segments (420 bytes, patch)
2011-11-18 16:32 UTC, Shuichi Ihara


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2012:0862 0 normal SHIPPED_LIVE Moderate: Red Hat Enterprise Linux 6 kernel security, bug fix and enhancement update 2012-06-20 12:55:00 UTC

Description Shuichi Ihara 2011-11-18 16:29:15 UTC
Device limits are set with dm_set_device_limits() in dm-table.c, but max_segments in dm is always 128, even when the device's max_segments is higher than 128.
For example:
# multipath -ll

-- snip --
360001ff0805a7000000000018a1b0001 dm-5 
size=21T features='0' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=20 status=active
| `- 3:0:0:1 sdc 8:32  active ready running
`-+- policy='round-robin 0' prio=100 status=enabled
  `- 4:0:0:1 sdg 8:96  active ready running
-- snip --

# cat /sys/block/sdc/queue/max_segments 
255
# cat /sys/block/sdg/queue/max_segments 
255

# cat /sys/block/dm-5/queue/max_segments 
128

I'm attaching a patch to copy the physical device's max_segments to dm's max_segments in dm_set_device_limits().

Comment 1 Shuichi Ihara 2011-11-18 16:32:42 UTC
Created attachment 534432 [details]
set dm's max_segments to physical device's max_segments

Comment 3 RHEL Program Management 2011-11-18 16:58:52 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unfortunately unable to
address this request at this time. Red Hat invites you to
ask your support representative to propose this request, if
appropriate and relevant, in the next release of Red Hat
Enterprise Linux. If you would like it considered as an
exception in the current release, please ask your support
representative.

Comment 4 Milan Broz 2011-11-18 17:13:53 UTC
This is a kernel patch; IMHO blk_queue_max_segments() is the function which should be used. Or even better, blk_stack_limits() should be used...

Mike, is the max_segments limit really missing there?

Comment 5 Shuichi Ihara 2011-11-18 17:17:12 UTC
Here are benchmark results with XFS, with and without the patch applied. A striped LV (-I 1024 -i 4) is created from 4 multipath devices (8 physical devices).

# multipath -ll
--snip--
360001ff0805a7000000000018a1b0001 dm-5 
size=21T features='0' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=100 status=active
| `- 4:0:0:1 sdg 8:96  active ready running
`-+- policy='round-robin 0' prio=20 status=enabled
  `- 3:0:0:1 sdc 8:32  active ready running
--snip--

# cat /sys/block/sdc/queue/max_segments 
255
# cat /sys/block/sdg/queue/max_segments 
255
# cat /sys/block/dm-5/queue/max_segments 
255


# pvcreate /dev/mapper/*[0123]
# vgcreate vg /dev/mapper/*[0123]
# lvcreate -I 1024 -i 4 -n vol01 -l 100%FREE vg
# mkfs.xfs /dev/vg/vol01

            write(GB/sec)  read(GB/sec)
unpatched       1.53          1.73
patched         1.64          1.74 


Write performance is improved by 7-8%.

Comment 6 Alasdair Kergon 2011-11-18 17:18:01 UTC
What if there are two devices, and the first has 128 and the second has 255?

I think the problem may be that the default value of 128 is always getting considered in the comparisons.

Comment 7 Shuichi Ihara 2011-11-18 17:27:14 UTC
/dev/sdc and /dev/sdg point to the same LUN, via paths of different priority.
max_segments on /dev/sd{c,g} is 255, but when we create a multipath device from these two devices, the new multipath device's max_segments is always 128. This value is hard-coded, which is the problem.

Comment 8 Alasdair Kergon 2011-11-18 17:39:24 UTC
I'm not disputing the problem - I'm saying the attached patch isn't general enough to handle every case correctly.

Comment 9 Mike Snitzer 2011-11-18 20:32:31 UTC
(In reply to comment #4)
> This is kernel patch, IMHO blk_queue_max_segments() is the function which
> should be used. Or even better, blk_stack_limits() should be used...

Right, blk_stack_limits() is the right place to deal with this.

> Mike, is the max_segments limit really missing there?

blk_stack_limits() does deal with max_segments:

t->max_segments = min_not_zero(t->max_segments, b->max_segments);

But it clearly isn't working for me either.  I have an mpath device with max_segments=128 even though all underlying paths have max_segments=2048.

I'll dig deeper.

Comment 10 Mike Snitzer 2011-11-18 20:40:31 UTC
blk_set_default_limits() sets max_segments=BLK_MAX_SEGMENTS (which is 128).

DM first calls blk_set_default_limits() to establish a default.

DM's later call to blk_stack_limits (via bdev_stack_limits) will _not_ override 128 due to blk_stack_limits()'s use of min_not_zero() -- as I shared in comment#9.

This is a generic problem with the standard sequence:
blk_set_default_limits()
then
blk_stack_limits() or bdev_stack_limits()

Not sure on the proper fix yet...

Comment 11 Shuichi Ihara 2011-11-22 17:06:00 UTC
        blk_set_default_limits(limits);

+       limits->max_segments=0;
+
        while (i < dm_table_get_num_targets(table)) {
                blk_set_default_limits(&ti_limits);

Is it safe to just add "limits->max_segments = 0" after blk_set_default_limits(limits) in dm_calculate_queue_limits()? If max_segments is set to "0", it can be overwritten by blk_stack_limits() later.

Comment 12 Mike Snitzer 2011-11-22 19:52:33 UTC
(In reply to comment #11)
>         blk_set_default_limits(limits);
> 
> +       limits->max_segments=0;
> +
>         while (i < dm_table_get_num_targets(table)) {
>                 blk_set_default_limits(&ti_limits);
> 
> Is just adding "limits->max_segments=0" after blk_set_default_limits(limits) in
> dm_calculate_queue_limits(), safe? If max_segments is set to "0", it cen be
> overwriteten by blk_stack_limits() later.

That is an isolated fix for max_segments, but this needs a proper fix for all fields.

Comment 14 RHEL Program Management 2012-01-11 14:59:28 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has 
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed 
products. This request is not yet committed for inclusion in an Update release.

Comment 16 Mike Snitzer 2012-01-11 15:59:08 UTC
The upstream fix that has been backported to RHEL6.3 is here:
http://git.kernel.dk/?p=linux-block.git;a=commit;h=b1bd055d397e09f99dcef9b138ed104ff1812fcb

Comment 17 Aristeu Rozanski 2012-02-10 22:59:58 UTC
Patch(es) available on kernel-2.6.32-230.el6

Comment 19 Shuichi Ihara 2012-02-13 17:14:39 UTC
(In reply to comment #16)
> The upstream fix that has been backported to RHEL6.3 is here:
> http://git.kernel.dk/?p=linux-block.git;a=commit;h=b1bd055d397e09f99dcef9b138ed104ff1812fcb

Can these patches be applied to the current RHEL 6.2 kernel for testing?
Also, will they land in an updated RHEL 6.2 kernel, or not until RHEL 6.3?

Comment 20 Mike Snitzer 2012-02-13 18:26:10 UTC
(In reply to comment #19)
> (In reply to comment #16)
> > The upstream fix that has been backported to RHEL6.3 is here:
> > http://git.kernel.dk/?p=linux-block.git;a=commit;h=b1bd055d397e09f99dcef9b138ed104ff1812fcb
> 
> Can these patches be applied to the current RHEL 6.2 kernel for testing?
> Also, will they land in an updated RHEL 6.2 kernel, or not until RHEL 6.3?

Flagging this for 6.2.z (the fix has already been committed for 6.3)

Comment 26 errata-xmlrpc 2012-06-20 08:06:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-0862.html

