Bug 147679 - kernel dm striped: device has a worse reading speed using a larger disk stripe
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
x86_64 Linux
Priority: medium  Severity: high
Target Milestone: rc
Assigned To: Milan Broz
QA Contact: Cluster QE
Depends On: 245150
Blocks: 430698
Reported: 2005-02-10 10:31 EST by Robert Scheck
Modified: 2013-02-28 23:04 EST (History)
16 users

See Also:
Fixed In Version: kernel-2.6.32-14.el6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Last Closed: 2010-11-15 09:21:21 EST
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

Attachments (Terms of Use)
Proposed patch (1.35 KB, patch)
2007-02-09 11:05 EST, Milan Broz
no flags

Description Robert Scheck 2005-02-10 10:31:09 EST
Description of problem:
LVM2 has very poor read performance with a larger disk stripe. Below is a 
single raw device from an HP disk system (a DS2405, to be exact), connected 
via fibre channel (max. 400 MB/s) to an HP DL145 (Opteron) machine:

--- snipp ---
[root@opteron ~]# time dd if=/dev/zero of=/dev/sdo bs=1024k count=10000
10000+0 records in
10000+0 records out

real    2m36.544s
user    0m0.017s
sys     0m16.983s
[root@opteron ~]#
[root@opteron ~]# time dd if=/dev/sdo of=/dev/null bs=1024k count=10000
10000+0 records in
10000+0 records out

real    2m20.435s
user    0m0.023s
sys     0m19.391s
[root@opteron ~]#
--- snapp ---

This works out to 64 MB/s writing and 71 MB/s reading; that's fine.
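As a sanity check on those figures: dd moved 10000 blocks of 1024 KB, i.e. 10000 MB per run, so the throughput is simply the amount of data divided by dd's "real" (wall-clock) time:

```shell
# Throughput for the raw-device runs above:
# 10000 MB divided by the elapsed "real" time in seconds.
awk 'BEGIN { printf "write: %.0f MB/s\n", 10000 / (2*60 + 36.544) }'
awk 'BEGIN { printf "read:  %.0f MB/s\n", 10000 / (2*60 + 20.435) }'
```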

Below is a disk stripe built from 14 HDDs with manual fibre channel 
multipath via LVM2 (on the same HP disk system):

--- snipp ---
[root@opteron ~]# time dd if=/dev/zero of=/dev/vg01/lv01 bs=1024k count=100000
100000+0 records in
100000+0 records out

real    4m44.290s
user    0m0.255s
sys     3m28.193s
[root@opteron ~]#
[root@opteron ~]# time dd if=/dev/vg01/lv01 of=/dev/null bs=1024k count=100000
100000+0 records in
100000+0 records out

real    9m32.262s
user    0m0.276s
sys     4m19.993s
[root@opteron ~]#
--- snapp ---

This works out to 350 MB/s writing but only 175 MB/s reading - a much worse 
result for reading. With an HP-UX box I get ~ 350/350 MB/s, a good result. So 
the problem is LVM2.

Version-Release number of selected component (if applicable):

How reproducible:
Every time; see above.

Actual results:
Read performance is only 50% of the measured write speed.

Expected results:
The same read speed as for writing, or even better.

Additional info:
This bug should really be filed against Red Hat Enterprise Linux 4, but Red Hat 
Bugzilla currently offers no way to do that.
Comment 1 Robert Scheck 2005-03-01 10:07:59 EST
Reassigned from Fedora Core devel to Red Hat Enterprise Linux 4.
Comment 2 Robert Scheck 2005-03-10 16:10:42 EST
Ping? This problem makes LVM2 mostly unusable for us and our customers in 
production use...
Comment 3 Robert Scheck 2005-03-17 11:35:42 EST
We also get similarly poor results when we use LVM2 for a larger disk stripe 
made of SCSI disks: ~ 480 MB/s writing but only ~ 120 MB/s reading. That means 
we read four times slower than we write... and the disk system is normally able 
to handle ~ 480 MB/s reading and writing!

Ah, and for these results SELinux was disabled. Enabling SELinux doesn't 
change the results in any way...

Come on - please :)
Comment 4 Robert Scheck 2005-04-22 04:06:21 EDT
Folks, what's up?! This is really a serious problem in RHEL 4!!

We currently have a further setup with an HP MSA30 (14 SCSI disks, 2 channels): 
an LVM2 stripe delivers only ~ 175 MB/s reading, while a software RAID (striped) 
provides ~ 350-400 MB/s reading!

Write speed with LVM2 is absolutely fine, but read speed is poor :-(

Could we please get a working update soon?
Comment 5 Robert Scheck 2007-02-05 11:30:51 EST
Milan, are you planning to work on this issue, or will it be as usual, like in 
the last two years: nobody from Red Hat, Inc. cares about this report?
Comment 6 Milan Broz 2007-02-08 12:00:03 EST
Device-mapper doesn't calculate readahead for the stripe target, so the value is
not optimal, which leads to read performance degradation.

As a workaround you can set the readahead for the device-mapper device by running:
blockdev --setra <value in 512b sectors> /dev/mapper/<striped device>

For best performance set "N * drive_readahead" (use --getra to check the current
value), but this may consume too much memory, so you can try setting the readahead
to 2 * stripe_size.

Is that enough to solve the problem with your configuration? I am not able to
simulate such big differences in read/write performance - there may be another
problem.

Anyway, the stripe target should be modified to calculate this automatically.
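The workaround above can be sketched as follows; the device path and the stripe geometry (14 stripes, 64 KiB chunk) are hypothetical placeholders - substitute the values for your own striped device:

```shell
DEV=/dev/mapper/vg01-lv01   # hypothetical striped device name

# Check the current readahead, in 512-byte sectors:
blockdev --getra "$DEV"

# Suggested workaround: readahead = 2 * stripe_size.
# Example geometry (hypothetical): 14 stripes * 64 KiB chunk = 896 KiB stripe,
# so 2 * 896 KiB = 1792 KiB = 3584 sectors of 512 bytes.
blockdev --setra $(( 2 * 14 * 64 * 1024 / 512 )) "$DEV"
```

The value is given in 512-byte sectors because that is the unit blockdev --setra expects.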
Comment 7 Milan Broz 2007-02-08 12:03:13 EST
Changing component to kernel, not related to LVM2 package.
Comment 8 Milan Broz 2007-02-09 11:05:49 EST
Created attachment 147784 [details]
Proposed patch

Add readahead calculation to dm raid0 target.
Comment 9 RHEL Product and Program Management 2007-02-09 11:24:25 EST
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update release.
Comment 11 Milan Broz 2007-07-12 10:12:44 EDT
Comment on attachment 147784 [details]
Proposed patch

The patch is not correct; this cannot be done in the stripe constructor.
Comment 12 Milan Broz 2007-07-12 10:23:46 EDT
There are at least two problems:
 - small readahead; it seems we can handle this from userspace, moved to bug 245150
 - the device-mapper core splits big I/O requests to max. page size (and this
decreases performance). Experimental patches for merging requests exist, but they
are not yet upstream and need kABI changes (RHEL4 backporting could be very
problematic). (See bug 232843 too.)
Comment 15 Alasdair Kergon 2008-02-28 18:16:40 EST
Jens proposes wrapping the parameters being passed into a struct - I'm happy
with that.
Comment 16 Alasdair Kergon 2008-02-29 18:43:42 EST
Do we accept this change to be RHEL6 material because of the kABI impact, or do
we put effort into making a special workaround?  (e.g. stack old values as we
recurse, and replace on return; or add new interface only for dm and add
hard-coding to block layer to use it)
Comment 17 Tom Coughlan 2008-03-25 16:03:44 EDT
The 4.7 deadline has passed. As far as I am aware, the need for this is not
large enough to justify the risk and effort to port this to 4.8. I will move
this to RHEL 6. If there is enough demand, it is possible that we could consider
this for RHEL 5, after it is upstream. Please open a separate BZ for that. 
Comment 18 Bill Nottingham 2008-10-09 15:04:00 EDT
Has this hit upstream yet?
Comment 19 Robert Scheck 2008-10-09 15:11:55 EDT
Sorry, I don't know. I'm just user, not programmer.
Comment 20 Milan Broz 2008-10-09 15:16:30 EDT
If the question is about the "merge" patches, those are in 2.6.27-rc.

The lvm align-to-md-chunk patches were just fixed in recent upstream lvm2 CVS.
Comment 21 Bill Nottingham 2008-10-09 15:21:39 EDT
Considering that RHEL 6 will include recent upstreams of both kernel and LVM, setting this to MODIFIED.

The feature requested has already been accepted into the upstream code base
planned for the next major release of Red Hat Enterprise Linux.

When the next milestone release of Red Hat Enterprise Linux 6 is available,
please verify that the feature requested is present and functioning as expected.
Comment 22 Milan Kerslager 2008-10-10 06:01:03 EDT
Does this mean that backporting to RHEL 5 is not currently on the list?
Comment 23 Milan Broz 2008-10-10 06:35:09 EDT
Backporting the kernel merge patches breaks kABI and a workaround is not straightforward, so it is not currently planned for RHEL 5.

Setting optimal readahead for dm stripe should already be in the RHEL 5.2 lvm2 package (see bug 423391).

(And when using LVM over MD RAID: aligning LVs to the MD chunk size by default will be in the RHEL 5.3 lvm2 package.)
Comment 24 RHEL Product and Program Management 2009-02-05 18:31:37 EST
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for inclusion.
Comment 26 releng-rhel@redhat.com 2010-11-15 09:21:21 EST
Red Hat Enterprise Linux 6.0 is now available and should resolve
the problem described in this bug report. This report is therefore being closed
with a resolution of CURRENTRELEASE. You may reopen this bug report if the
solution does not work for you.
