Bug 456181
Summary: Read speed of /sbin/dump command is critically slow with CFQ I/O scheduler
Product: Red Hat Enterprise Linux 5 | Reporter: Pankaj Saraf <psaraf>
Component: kernel | Assignee: Jeff Moyer <jmoyer>
Status: CLOSED ERRATA | QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: urgent | Docs Contact:
Priority: high
Version: 5.1 | CC: alanm, bmr, cward, d.buggie, dejohnso, dzickus, esandeen, galens, jens.axboe, jlayton, jmoyer, jplans, k.georgiou, marcobillpeter, mmatsuya, moshiro, peterm, pzijlstr, riek, rlerch, rwheeler, sardella, tao, vfalico, vgoyal, yugzhang
Target Milestone: rc | Keywords: Regression
Target Release: ---
Hardware: All
OS: Linux
URL: https://enterprise.redhat.com/issue-tracker/?module=issues&action=view&tid=187283&gid=37
Whiteboard:
Fixed In Version: | Doc Type: Bug Fix
Doc Text:
Some applications (e.g. dump and nfsd) try to improve disk I/O performance by distributing I/O requests to multiple processes or threads. However, when using the Completely Fair Queuing (CFQ) I/O scheduler, this application design negatively affected I/O performance. In Red Hat Enterprise Linux 5.5, the kernel can now detect and merge cooperating queues. Additionally, the kernel can detect when the queues stop cooperating, and split them apart again.
Story Points: ---
Clone Of:
: 533932 (view as bug list) | Environment:
Last Closed: 2010-03-30 07:19:03 UTC | Type: ---
Regression: --- | Mount Type: ---
Documentation: --- | CRM:
Verified Versions: | Category: ---
oVirt Team: --- | RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- | Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 483701, 485920, 499522, 525215, 533192, 533932, 541103, 570814
Attachments:
Description
Pankaj Saraf
2008-07-21 23:34:00 UTC
This bugzilla has Keywords: Regression. Since no regressions are allowed between releases, it is also being proposed as a blocker for this release. Please resolve ASAP.

I'll take a look at this.

I talked with the CFQ author about this (Jens Axboe), and he is aware of the problem and willing to help out. We'll update the bugzilla when we have test patches or packages. A workaround, for the time being, is to set slice_idle to 0 during backups. I would restore it to its default value after backups are complete, though. You can tune this value by echoing numbers to /sys/block/<blockdev>/queue/iosched/slice_idle. For example, if your device is /dev/sdb, you would do the following:

echo 0 > /sys/block/sdb/queue/iosched/slice_idle

User psaraf's account has been closed

Created attachment 319934 [details]
Implement support for interleaving requests between multiple processes
This patch is a backport of some of the close_cooperator changes that were introduced to (and later removed from) the upstream kernel's CFQ I/O scheduler implementation. The intent is to detect multiple processes interleaving sequential file I/O. This patch is still preliminary. I have tested it with good results against both the read-test reproducer and the dump(8) command. I am currently working with Jens Axboe to come up with a similar patch for the upstream sources (so that this will not regress again in RHEL 6).
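To make the access pattern concrete, here is a minimal, hypothetical C sketch of the kind of interleaved sequential reads described above (and simulated by the read-test reproducer). It is not the actual read-test or dump(8) source; the process count, chunk size, and chunk count are arbitrary illustration values.

```c
/*
 * Hypothetical illustration of interleaved sequential reads across
 * multiple processes. Each of NPROC children reads every NPROC-th
 * chunk, so the combined stream is sequential even though no single
 * process's stream is.
 */
#define _GNU_SOURCE
#define _FILE_OFFSET_BITS 64
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

#define NPROC   4                 /* number of cooperating readers */
#define CHUNK   (1024 * 1024)     /* 1 MiB per read */
#define NCHUNKS 1024              /* read the first 1 GiB */

int main(int argc, char **argv)
{
    int i;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <device-or-file>\n", argv[0]);
        return 1;
    }

    for (i = 0; i < NPROC; i++) {
        if (fork() == 0) {                    /* child i */
            int fd = open(argv[1], O_RDONLY);
            char *buf = malloc(CHUNK);
            off_t c;

            if (fd < 0 || buf == NULL)
                _exit(1);
            /* child i reads chunks i, i+NPROC, i+2*NPROC, ... so the
             * children together cover the file sequentially. */
            for (c = i; c < NCHUNKS; c += NPROC)
                if (pread(fd, buf, CHUNK, c * (off_t)CHUNK) < 0)
                    _exit(1);
            _exit(0);
        }
    }
    for (i = 0; i < NPROC; i++)
        wait(NULL);
    return 0;
}
```

Run against a device with CFQ and without close-cooperator detection, each child looks like an independent, seeky reader and the per-process idling breaks up the aggregate sequential pattern, which is roughly the behaviour this bug describes.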
Moving to 5.4

Could you get this customer to try a test package including Jeff's patch?
* https://bugzilla.redhat.com/attachment.cgi?id=319934
* https://bugzilla.redhat.com/show_bug.cgi?id=456181#c19

Created attachment 330700 [details]
read-test2.tar.gz
From FJ:
---
Hi Oshiro-san
I tested CFQ read performance with your kernel
(kernel-PAE-2.6.18-53.1.21) on my machines. The read performance,
measured with a new test program (read-test2.tar.gz) whose output
format differs from the previous one (read-test.tar.gz),
is as follows.
| kernel      | deadline (MB/s) | cfq (MB/s) |
|-------------|-----------------|------------|
| RHEL5.1     | 74.10           | 8 or 24    |
| test kernel | 74.24           | 24 or 74   |
CFQ read performance with the test kernel is faster than on RHEL5.1. However,
the performance is still sometimes critically slow, around 24 MB/s on
your kernel. Therefore, I think this problem has not been
completely fixed yet. Additionally, this performance degradation also occurs
on RHEL5.1. Below is the usage of the new test program.
$ tar zxvf read-test2.tar.gz
read-test2/
read-test2/test.sh
read-test2/Makefile
read-test2/read-test.c
$ cd read-test2
$ make
gcc -g -Wall -lrt -D _GNU_SOURCE -o read-test2 read-test2.c
$ su
# ./test.sh /dev/sda
***Total Ave 24.687157 MB/sec ***
***Total Ave 74.124758 MB/sec ***
***Total Ave 24.377011 MB/sec ***
***Total Ave 73.683626 MB/sec ***
***Total Ave 24.467749 MB/sec ***
***Total Ave 24.414345 MB/sec ***
***Total Ave 24.389293 MB/sec ***
***Total Ave 74.885984 MB/sec ***
***Total Ave 74.780804 MB/sec ***
***Total Ave 24.365709 MB/sec ***
***Total Ave 74.885984 MB/sec ***
***Total Ave 74.780804 MB/sec ***
...
I do not understand why this performance degradation occurred.
Do you have any information related to this performance degradation?
We need to clarify and fix this degradation. Please investigate it.
I'll continue to test, too.
I'll attach the new test program.
---
Updating PM score.

Hello, any news about this bug? Thank you!

The current plan is to backport the iocontext sharing code from upstream and to patch dump to share I/O contexts. This work will not make the 5.4 release.

When the problem was initially reported, I talked to Jens Axboe about it, and he seemed receptive to the idea of adding some code to CFQ to detect processes interleaving I/Os. When I came up with a first patch for this, he suggested that we would be better off solving the problem in the applications themselves, by having the applications explicitly share I/O contexts (using sys_clone and the CLONE_IO flag*). I wrote a patch for dump to do this very thing, and it did solve the problem. However, the list of applications suffering from this kept growing. The applications I know of that perform interleaved reads between multiple processes include:
* dump
* nfsd
* qemu's posix aio backend
* one of the iSCSI target mode implementations
* a third-party volume manager

It is evident that this is not too uncommon a programming paradigm, so Jens decided to take the close cooperator patch set into 2.6.30. However, the implementation he merged was not quite ready, as it can cause some processes to be starved. I've been working with him to fix the problem properly while preserving fairness. In the end, the solution may involve a combination of detecting cooperating processes and sharing I/O contexts between them automatically (see the clone(2)/CLONE_IO sketch after these comments). This issue is my number one priority, and I will keep this bugzilla updated as progress is made.

* Note that shared I/O contexts (and the CLONE_IO flag) are not supported in RHEL 5, otherwise I would have made that fix available for the 5.4 release.

I put together another test kernel that implements the close cooperator detection logic and merges the cfq_queues associated with cooperating processes. The result is a good speedup. In 100 runs of the read-test2 program (written to simulate the I/O pattern of the dump utility), these are the throughput numbers in MB/s:

Deadline: Avg: 101.26907, Std. Dev.: 17.59767
CFQ: Avg: 100.14914, Std. Dev.: 17.42747

Most of the runs saw 105 MB/s, but there were some outliers in the 28-30 MB/s range. I looked into those cases and found that the cause was processes being scheduled in just the wrong order, introducing seeks into the workload. Unfortunately, I haven't come up with a good solution for that particular problem, though I'll note that it affects other I/O schedulers as well. Upstream does not exhibit this behaviour, and I believe it may be due to the rewritten readahead code, but I can't be certain without further investigation. Without the patch set applied, the numbers for CFQ were in the 7-10 MB/s range.

I wasn't able to test NFS server performance as my test lab was experiencing some networking issues. I'll get that testing underway once that problem is resolved.

I've uploaded a test kernel here: http://people.redhat.com/jmoyer/cfq-cc/ Please take it for a spin and report your results. If you'd like to test on an architecture other than x86_64, just let me know and I'll kick off a build for whatever architecture is required.

I've kicked off a build for i686 and will update this bug when that build is complete. In the meantime, I've uploaded the srpm to the location listed above.

An i686 kernel rpm is now available at: http://people.redhat.com/jmoyer/cfq-cc/ Happy testing!
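For reference, here is a minimal, hypothetical sketch of the explicit I/O-context sharing approach mentioned in the plan above: clone(2) with CLONE_IO. CLONE_IO is an upstream kernel feature (2.6.25 and later) and, as noted in the comment, is not supported on RHEL 5; the worker body, stack size, and flag fallback value here are illustration-only assumptions.

```c
/*
 * Hypothetical sketch: explicitly share an I/O context between a parent
 * and a child using clone(2) with CLONE_IO (upstream kernels only).
 */
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#ifndef CLONE_IO
#define CLONE_IO 0x80000000     /* value from <linux/sched.h> */
#endif

#define STACK_SIZE (256 * 1024)

static int worker(void *arg)
{
    /* I/O issued here is accounted to the parent's io_context, so CFQ
     * treats parent and child as a single (sequential) reader. */
    return 0;
}

int main(void)
{
    char *stack = malloc(STACK_SIZE);
    pid_t pid;

    if (stack == NULL)
        return 1;

    /* CLONE_IO makes the child share the caller's io_context. */
    pid = clone(worker, stack + STACK_SIZE, CLONE_IO | SIGCHLD, NULL);
    if (pid < 0) {
        perror("clone");
        return 1;
    }
    waitpid(pid, NULL, 0);
    free(stack);
    return 0;
}
```

With the flag set, cooperating processes appear to CFQ as one queue, so the scheduler does not idle on each of them separately; without kernel support for this flag (as on RHEL 5), the automatic close-cooperator merging described in this bug is the alternative.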
in kernel-2.6.18-173.el5 You can download this test kernel from http://people.redhat.com/dzickus/el5

Please do NOT transition this bugzilla state to VERIFIED until our QE team has sent specific instructions indicating when to do so. However, feel free to provide a comment indicating that this fix has been verified.

I posted one additional patch for this to rhkernel-list for review.

in kernel-2.6.18-177.el5 You can download this test kernel from http://people.redhat.com/dzickus/el5

Please do NOT transition this bugzilla state to VERIFIED until our QE team has sent specific instructions indicating when to do so. However, feel free to provide a comment indicating that this fix has been verified.

Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Some applications (including dump and nfsd) try to improve disk I/O performance by distributing I/O requests to multiple processes or threads. When using the CFQ I/O scheduler, this application design actually hurt performance, as the I/O scheduler would try to provide fairness between the processes or threads. This kernel contains a fix for this problem by detecting cooperating queues and merging them together. If the queues stop issuing requests close to one another, then they are broken apart again.

Technical note updated. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1 +1 @@
-Some applications (including dump and nfsd) try to improve disk I/O performance by distributing I/O requests to multiple processes or threads. When using the CFQ I/O scheduler, this application design actually hurt performance, as the I/O scheduler would try to provide fairness between the processes or threads. This kernel contains a fix for this problem by detecting cooperating queues and merging them together. If the queues stop issuing requests close to one another, then they are broken apart again.
+Some applications (e.g. dump and nfsd) try to improve disk I/O performance by distributing I/O requests to multiple processes or threads. However, when using the Completely Fair Queuing (CFQ) I/O scheduler, this application design negatively affected I/O performance. In Red Hat Enterprise Linux 5.5, the kernel can now detect and merge cooperating queues, Additionally, the kernel can also detect if the queues stop cooperating, and split them apart again.

~~ Attention Customers and Partners - RHEL 5.5 Beta is now available on RHN ~~

RHEL 5.5 Beta has been released! There should be a fix present in this release that addresses your request. Please test and report back results here, by March 3rd 2010 (2010-03-03) or sooner.

Upon successful verification of this request, post your results and update the Verified field in Bugzilla with the appropriate value. If you encounter any issues while testing, please describe them and set this bug into NEED_INFO. If you encounter new defects or have additional patch(es) to request for inclusion, please clone this bug per each request and escalate through your support representative.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0178.html

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days