Bug 239998 - [RHEL4U5] Performance of Sequential-Read is bad on cciss disk
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.5
Hardware: All Linux
Priority: medium, Severity: high
Assigned To: Tomas Henzl
Martin Jenner
: OtherQA
Depends On:
Blocks:
Reported: 2007-05-14 05:29 EDT by Masaki MAENO
Modified: 2009-06-19 13:03 EDT

See Also:
Fixed In Version: RHBA-2007-0791
Doc Type: Bug Fix
Last Closed: 2007-11-15 11:26:56 EST

Attachments
sysctl -a (15.74 KB, application/octet-stream)
2007-05-15 01:36 EDT, Masaki MAENO
/sys/block/cciss!c0d?/queue/* (209 bytes, application/octet-stream)
2007-05-15 01:37 EDT, Masaki MAENO
SmartArray6i (101.84 KB, image/png)
2007-05-15 01:39 EDT, Masaki MAENO
Sets READ_AHEAD back to 1024 (971 bytes, patch)
2007-06-07 16:56 EDT, Mike Miller (OS Dev)

Description Masaki MAENO 2007-05-14 05:29:25 EDT
Description of problem:
Sequential-read performance is poor on cciss disks
because read-ahead is not effective at all.

The default value of READ_AHEAD in the cciss driver is 0 in RHEL4U5.
In RHEL4U4, it is 1024 (/sys/block/cciss!c0d0/queue/read_ahead_kb = 4096).

I can work around the problem with ioctl(BLKRASET).
However, this is not a very friendly method.

I hope the default parameter will be changed to 4096 (or 256, etc.).
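
To see which value a given system is actually using, the sysfs attribute quoted above can be read directly. The following is a minimal Python sketch and not part of the original report: the function name `cciss_read_ahead_kb` and the `sysfs_root` parameter are illustrative, and it assumes the `/sys/block/cciss!cXdY/queue/read_ahead_kb` layout shown above.

```python
import glob
import os


def cciss_read_ahead_kb(sysfs_root="/sys/block"):
    """Return {device_name: read_ahead_kb} for every cciss block device.

    sysfs_root is a parameter only so the function can be pointed at a
    test directory; on a real system the default should be correct.
    """
    result = {}
    # cciss devices appear in sysfs as e.g. "cciss!c0d0" ('/' becomes '!').
    pattern = os.path.join(
        glob.escape(sysfs_root), "cciss!*", "queue", "read_ahead_kb"
    )
    for path in sorted(glob.glob(pattern)):
        device = path.split(os.sep)[-3]  # .../<device>/queue/read_ahead_kb
        with open(path) as f:
            result[device] = int(f.read().strip())
    return result


if __name__ == "__main__":
    for device, kb in cciss_read_ahead_kb().items():
        print(f"{device}: read_ahead_kb = {kb}")
```

On an affected RHEL4U5 system this would be expected to report 0 for each cciss device, and 4096 on RHEL4U4.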

Steps to Reproduce:
1. $ iozone -apzM -i0 -i1 -i2 -y8k -q8k -n64M -g256M \
            -U /mnt/other -f /mnt/other/IOZONE.TEMP

Actual results:
* IOzone benchmark Result
=========================================================
OS            |  RHEL4U4  |        RHEL4U5              |
---------------------------------------------------------
read_ahead_kb |      4096 |         0 |            4096 |
              |(cciss def)|(cciss def)|(change by ioctl)|
---------------------------------------------------------
FileSize      |           |           |                 |
         64MB |     79640 |BAD  23570 |           79948 |
        128MB |     81586 |BAD  23493 |           82047 |
        256MB |     81888 |BAD  23694 |           81746 |
=========================================================

Expected results:

READ_AHEAD is 4096 on cciss driver.
Comment 1 Masaki MAENO 2007-05-15 01:36:24 EDT
Created attachment 154712
sysctl -a
Comment 2 Masaki MAENO 2007-05-15 01:37:23 EDT
Created attachment 154713
/sys/block/cciss!c0d?/queue/*
Comment 3 Masaki MAENO 2007-05-15 01:39:03 EDT
Created attachment 154714
SmartArray6i
Comment 4 Masaki MAENO 2007-06-06 02:40:04 EDT
The result did not change after I updated the HP SmartArray6i firmware
from v2.58 to v2.76 and tested again.

SmartArray6i v2.76 (Firmware Maintenance CD 7.80)
Actual results:
* IOzone benchmark Result
=========================================================
OS            |  RHEL4U4  |        RHEL4U5              |
Kernel (SMP)  |2.6.9-42.EL|        2.6.9-55.EL          |
---------------------------------------------------------
read_ahead_kb |      4096 |         0 |            4096 |
              |(cciss def)|(cciss def)|(change by ioctl)|
---------------------------------------------------------
FileSize      |           |           |                 |
         64MB |     82022 |BAD  25590 |           80088 |
        128MB |     81676 |BAD  25413 |           81671 |
        256MB |     81898 |BAD  25436 |           82411 |
=========================================================

Additional test:

# time dd if=/mnt/other/test.bin of=/dev/zero
(test.bin is a 1 GB file.)

SmartArray6i v2.76 (Firmware Maintenance CD 7.80)
======
* Time of the first read after umount

- RHEL4U4 read_ahead_kb=4096
real    0m11.037s
user    0m0.818s
sys     0m2.697s

- RHEL4U5 read_ahead_kb=0 (RHEL4U5 default value)
real    0m41.190s
user    0m0.906s
sys     0m5.761s

- RHEL4U5 read_ahead_kb=4096 (RHEL4U0 - RHEL4U4 default value)
real    0m11.035s
user    0m0.737s
sys     0m2.671s
======
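
The dd timings above can be turned into average throughput figures. This is a small arithmetic sketch in Python, not something from the original test run; it assumes that "1GB" means 1 GiB (1024 MiB), which the report does not state.

```python
# Wall-clock ("real") times from the dd runs above, in seconds.
REAL_SECONDS = {
    "RHEL4U4, read_ahead_kb=4096": 11.037,
    "RHEL4U5, read_ahead_kb=0": 41.190,
    "RHEL4U5, read_ahead_kb=4096": 11.035,
}

# Assumption: the 1 GB test file is 1 GiB = 1024 MiB.
FILE_SIZE_MIB = 1024


def throughput_mib_per_s(seconds, size_mib=FILE_SIZE_MIB):
    """Average sequential-read throughput for one dd run."""
    return size_mib / seconds


for label, seconds in REAL_SECONDS.items():
    print(f"{label}: {throughput_mib_per_s(seconds):.1f} MiB/s")

# How much longer the read takes with read-ahead disabled.
slowdown = (
    REAL_SECONDS["RHEL4U5, read_ahead_kb=0"]
    / REAL_SECONDS["RHEL4U4, read_ahead_kb=4096"]
)
print(f"slowdown with read_ahead_kb=0: {slowdown:.2f}x")
```

Under that assumption the runs work out to roughly 93, 25 and 93 MiB/s, a slowdown of about 3.7x with read-ahead disabled, which is comparable to the gap in the IOzone results.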
Comment 5 Tom Coughlan 2007-06-07 15:46:13 EDT
To recap some email communication: 

Mike Miller from HP said that, prior to 4.5, they were getting some reports of
bad read performance, at least on certain controllers. The solution suggested by
the firmware engineers was to just let the controller handle the read ahead. To
accomplish this, we changed the cciss driver READ_AHEAD parameter from 1024 to 0
in 4.5. It is possible that this improves performance on certain controllers, or
certain workloads, but reduces the performance on others. HP is currently
investigating this further. 

Depending on the results, we will consider changing the READ_AHEAD parameter in
4.6. In the meantime, you can change the parameter on a running system in sysfs:

# echo 1024 > /sys/block/cciss\!c0d1/queue/read_ahead_kb


 
Comment 6 RHEL Product and Program Management 2007-06-07 16:05:30 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 7 Mike Miller (OS Dev) 2007-06-07 16:56:21 EDT
Created attachment 156509
Sets READ_AHEAD back to 1024

This patch sets read ahead back to 1024. Customers are complaining about poor
sequential read performance with read ahead set to zero.
Comment 8 Masaki MAENO 2007-06-07 21:01:25 EDT
>> Tom Coughlan

That does not work as expected.

# echo 1024 > /sys/block/cciss\!c0d0/queue/read_ahead_kb

Even after this command is executed, read_ahead_kb is only 256,
because sysfs limits it to /sys/block/cciss\!c0d0/queue/max_hw_sectors_kb (= 256).
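
Because the kernel may silently keep a lower value than the one written, it helps to read the attribute back after writing it. The Python sketch below illustrates that check; the helper names `set_read_ahead_kb` and `report` are hypothetical, and the code makes no assumption about what limit the kernel applies.

```python
def set_read_ahead_kb(path, requested_kb):
    """Write requested_kb to a read_ahead_kb sysfs file (needs root) and
    return the value reported afterwards, which may be lower."""
    with open(path, "w") as f:
        f.write(f"{requested_kb}\n")
    with open(path) as f:
        return int(f.read().strip())


def report(path, requested_kb):
    """Describe whether the requested value was accepted as written."""
    effective_kb = set_read_ahead_kb(path, requested_kb)
    if effective_kb != requested_kb:
        return f"requested {requested_kb} KB, kernel kept {effective_kb} KB"
    return f"read_ahead_kb is now {effective_kb} KB"
```

For example, `report("/sys/block/cciss!c0d0/queue/read_ahead_kb", 1024)` run as root would make the limit described above visible. No escaping of `!` is needed inside a Python string.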


> I can work around the problem with ioctl(BLKRASET).
> However, this is not a very friendly method.

As I wrote at the beginning, I know that the value can be changed with ioctl(BLKRASET).
If Red Hat sets READ_AHEAD back to 1024 in a maintenance release, I will be
satisfied.

ex.) # perl -e 'require "sys/ioctl.ph"; open (IN, "/dev/cciss/c0d0"); \
       ioctl(IN, 4706, 8192);'
                 ^^^^  ^^^^
                  |     |
             BLKRASET  read_ahead_kb x 2
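
The same ioctl can be issued from Python, which also makes the origin of the number 4706 visible. This is a sketch rather than a tested workaround: the function name `set_read_ahead_sectors` is illustrative, and the request number is derived assuming an architecture such as x86 or x86_64 where `_IO(0x12, 98)` encodes to 0x1262.

```python
import fcntl
import os

# BLKRASET is defined as _IO(0x12, 98) in <linux/fs.h>. On x86/x86_64
# (assumed here) that encodes to (0x12 << 8) | 98 == 0x1262 == 4706,
# the number used in the Perl one-liner above.
BLKRASET = (0x12 << 8) | 98


def set_read_ahead_sectors(device, sectors):
    """Set the read-ahead of a block device in 512-byte sectors (needs root)."""
    fd = os.open(device, os.O_RDONLY)
    try:
        # The value is passed directly as the ioctl argument, not via a pointer.
        fcntl.ioctl(fd, BLKRASET, sectors)
    finally:
        os.close(fd)
```

Calling `set_read_ahead_sectors("/dev/cciss/c0d0", 8192)` as root would correspond to the Perl example: 8192 sectors of 512 bytes is 4096 KB.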

>> RHEL Manager & Mike Miller

Thanks. Good job.
I hope that Red Hat includes the READ_AHEAD patch in a RHEL4.5 kernel errata release.
Comment 9 Masaki MAENO 2007-06-07 21:11:06 EDT
Supplement: READ_AHEAD = 1024 ---> read_ahead_kb = 4096
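
The relationship between the three units used in this report can be written out explicitly. The Python sketch below assumes that READ_AHEAD is counted in pages and that a page is 4 KiB, which is consistent with 1024 becoming 4096 above but is not stated in the report; the function names are illustrative.

```python
PAGE_SIZE_BYTES = 4096   # assumption: 4 KiB pages, as on x86
SECTOR_SIZE_BYTES = 512  # unit taken by ioctl(BLKRASET)


def read_ahead_kb_from_pages(pages):
    """Driver READ_AHEAD value (pages, assumed) -> sysfs read_ahead_kb."""
    return pages * PAGE_SIZE_BYTES // 1024


def blkraset_sectors_from_kb(kb):
    """sysfs read_ahead_kb -> argument for ioctl(BLKRASET)."""
    return kb * 1024 // SECTOR_SIZE_BYTES


for pages in (0, 1024):
    kb = read_ahead_kb_from_pages(pages)
    sectors = blkraset_sectors_from_kb(kb)
    print(f"READ_AHEAD={pages} -> read_ahead_kb={kb} -> BLKRASET arg={sectors}")
```

With these assumptions READ_AHEAD = 1024 gives read_ahead_kb = 4096 and a BLKRASET argument of 8192, matching the "read_ahead_kb x 2" note in the Perl example.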
Comment 10 Mike Miller (OS Dev) 2007-06-08 11:26:30 EDT
I'm a little confused (as usual), does RH need something else from me? 
Comment 11 Tom Coughlan 2007-06-13 12:41:47 EDT
Mike,

We need to know whether the patch in Comment #7 (Sets READ_AHEAD back to 1024)
is the right solution for 4.6. Are there some controllers where READ_AHEAD = 0
gives better performance? What about the reports of poor performance that
motivated this change in the first place? 

> +#define READ_AHEAD 	1024	/* controller handles the read-ahead */

I assume that we should drop the comment from this. 

Tom
Comment 13 Mike Miller (OS Dev) 2007-06-13 14:12:33 EDT
Yes, this patch does set read_ahead back to 1024. In my haste I did not drop the
comment. Sorry.
The reports I get from our performance types conflict. They seem to run
disparate tools that behave differently in areas such as queue depth. One tool
may only keep a queue depth of one while another may set it much deeper.
Based on my analysis of the numbers I've seen, going to 1024 is beneficial for
all sequential read ops. I don't see any evidence that setting read_ahead to
1024 adversely affects random read ops.
Please include this patch and strike the comment.
Comment 17 Fuchi Hideshi 2007-06-14 19:58:50 EDT
Hello Maeno-san,

Thank you for your cooperation.

> # perl -e 'require "sys/ioctl.ph"; open (IN, "/dev/cciss/c0d0"); ioctl(IN, 4706, 8192);'

Yes, your answer is correct. You can also change the parameter on a
running system through sysfs.

However, from a support standpoint, we would not recommend it.
We now plan to provide the fix in 4.5.z or as a hot fix, so please wait.
Your request is being processed.

If I have any information about the release I will update this ticket, thanks.

Kind regards,
Fuchi
Comment 18 Masaki MAENO 2007-06-14 22:09:50 EDT
>> Fuchi-san

Thank you for your reply.

> However, from a support standpoint, we would not recommend it.

I see.

> We now plan to provide the fix in 4.5.z or as a hot fix, so please wait.
> Your request is being processed.

I understand Red Hat's status well.
I hope that kernel-2.6.9-55.0.x is released promptly.
At the very least, I hope the fix is included in RHEL4.6.
Comment 19 Tomas Henzl 2007-06-15 08:56:07 EDT
tested and posted today as
http://post-office.corp.redhat.com/archives/rhkernel-list/2007-June/msg01527.html
Comment 20 RHEL Product and Program Management 2007-06-15 08:58:17 EDT
This request was evaluated by Red Hat Kernel Team for inclusion in a Red
Hat Enterprise Linux maintenance release, and has moved to bugzilla 
status POST.
Comment 21 Masaki MAENO 2007-06-18 08:25:38 EDT
Thank you.
I hope that the new RHEL4 maintenance kernel is released soon.
Comment 26 Don Howard 2007-07-17 15:53:54 EDT
A patch addressing this issue has been included in kernel-2.6.9-55.19.EL.
Comment 32 John Poelstra 2007-08-29 13:52:43 EDT
A fix for this issue should have been included in the packages contained in the
RHEL4.6 Beta released on RHN (also available at partners.redhat.com).  

Requested action: Please verify that your issue is fixed to ensure that it is
included in this update release.

After you (Red Hat Partner) have verified that this issue has been addressed,
please perform the following:
1) Change the *status* of this bug to VERIFIED.
2) Add *keyword* of PartnerVerified (leaving the existing keywords unmodified)

If this issue is not fixed, please add a comment describing the most recent
symptoms of the problem you are having and change the status of the bug to FAILS_QA.

If you cannot access bugzilla, please reply with a message to Issue Tracker and
I will change the status for you.  If you need assistance accessing
ftp://partners.redhat.com, please contact your Partner Manager.
Comment 33 John Poelstra 2007-09-05 18:26:03 EDT
A fix for this issue should have been included in the packages contained in 
the RHEL4.6-Snapshot1 on partners.redhat.com.  

Requested action: Please verify that your issue is fixed to ensure that it is 
included in this update release.

After you (Red Hat Partner) have verified that this issue has been addressed, 
please perform the following:
1) Change the *status* of this bug to VERIFIED.
2) Add *keyword* of PartnerVerified (leaving the existing keywords unmodified)

If this issue is not fixed, please add a comment describing the most recent 
symptoms of the problem you are having and change the status of the bug to 
FAILS_QA.

If you cannot access bugzilla, please reply with a message about your test 
results to Issue Tracker.  If you need assistance accessing 
ftp://partners.redhat.com, please contact your Partner Manager.
Comment 34 Masaki MAENO 2007-09-10 02:37:17 EDT
I confirmed that READ_AHEAD (drivers/block/cciss.c) has been changed from 0 back
to 1024 in the kernel-2.6.9-56.EL source code of the RHEL4U6 beta.

I then ran the same test and confirmed that sequential-read performance
is also fixed in the RHEL4U6 beta.

Steps to reproduce:
$ iozone -apzM -i0 -i1 -i2 -y8k -q8k -n64M -g256M \
            -U /mnt/other -f /mnt/other/IOZONE.TEMP

SmartArray6i v2.76 (Firmware Maintenance CD 7.80)
Actual results:
* IOzone benchmark Result
=====================================================================
OS            |  RHEL4U4  |        RHEL4U5              |RHEL4U6beta|
Kernel (SMP)  |2.6.9-42.EL|        2.6.9-55.EL          |2.6.9-56.EL|
---------------------------------------------------------------------
read_ahead_kb |      4096 |         0 |            4096 |      4096 |
              |(cciss def)|(cciss def)|(change by ioctl)|(cciss def)|
---------------------------------------------------------------------
FileSize      |           |           |                 |           |
         64MB |     82022 |BAD  25590 |           80088 |     80808 |
        128MB |     81676 |BAD  25413 |           81671 |     81672 |
        256MB |     81898 |BAD  25436 |           82411 |     82483 |
=====================================================================

Thank you very much.

Comment 36 errata-xmlrpc 2007-11-15 11:26:56 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0791.html
