Bug 234297

Summary: RHEL4U4 cciss array performance much better than RHEL5
Product: Red Hat Enterprise Linux 5
Version: 5.2
Component: kernel
Hardware: x86_64
OS: Linux
Status: CLOSED WONTFIX
Severity: medium
Priority: medium
Reporter: Peter Klotz <peter.klotz>
Assignee: Tom Coughlan <coughlan>
QA Contact: Red Hat Kernel QE team <kernel-qe>
CC: aakpinar, coughlan, h.plankl, ito.kazuo, jarod, mike.miller, w.moser
Doc Type: Bug Fix
Last Closed: 2012-07-20 22:02:15 UTC

Description Peter Klotz 2007-03-28 10:19:37 UTC
Description of problem:

We use HP ProLiant DL385 servers (2 dual-core Opteron 280, 2.4GHz, 8GB RAM,
2*72GB RAID1+0 [OS], 3*300GB RAID5 [VMs]) together with VMware Server 1.0.2 to
virtualize most of our infrastructure.

Recently we switched from RHEL4U4 x86_64 to RHEL5 x86_64 and noticed a severe
performance degradation. 

Tasks that compile software in virtual machines are 50% to 100% slower than they
were before. Parallel disk I/O in different virtual machines drives the host's
load average very high (up to 40), which did not occur before the upgrade.
Virtual RHEL3U8 machines run into SCSI timeouts and bus resets and have to be
rebooted.

Version-Release number of selected component (if applicable):
kernel-2.6.18-8.el5

How reproducible:
Always

Steps to Reproduce:
1. Run a command such as "find /" in two different virtual machines in parallel
and watch the host's load average rise (a minimal sketch follows below).
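A minimal sketch (an illustration added here, not part of the original report) of
driving this kind of parallel I/O and watching the host load average; run the
find commands inside two VMs, or directly on the host to take VMware out of the
picture:

find / > /dev/null 2>&1 &
find / > /dev/null 2>&1 &

# watch the load average climb while both runs are active
watch -n 5 uptime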
  
Actual results:
High CPU load on host. SCSI timeouts in virtual machines. Poor performance of
virtual machines.

Expected results:
Behavior of RHEL4U4.

Additional info:

Comment 1 Peter Klotz 2007-04-02 10:36:17 UTC
The machine uses a Smart Array 6i RAID Controller. We can reproduce the I/O
problems without VMware Server and its virtual machines. Simple tests with dd
show that parallel read/write operations on the host (especially when performed
on the RAID5 array) result in poor performance.

RAID5 performance (reading 3GB, writing 1GB):

[root@chip icorac_disk]# dd if=3GBtest of=/dev/null
6291456+0 records in
6291456+0 records out
3221225472 bytes (3.2 GB) copied, 188.196 seconds, 17.1 MB/s
[root@chip machines]# dd if=/dev/zero of=1GBtest bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 130.694 seconds, 8.2 MB/s

The write performance is bad even without a finalizing sync operation.
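A minimal sketch (an addition for illustration, not from the original report) of
timing the same write including an explicit flush, so that the page cache cannot
mask the result; the file name matches the test above:

time ( dd if=/dev/zero of=1GBtest bs=1M count=1024 && sync )

# if the installed dd supports it, direct I/O bypasses the page cache entirely
dd if=/dev/zero of=1GBtest bs=1M count=1024 oflag=direct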


RAID1 performance (reading 3GB, writing 1GB):

[root@chip tmp]# dd if=3GBtest of=/dev/null
6291456+0 records in
6291456+0 records out
3221225472 bytes (3.2 GB) copied, 61.4583 seconds, 52.4 MB/s
[root@chip tmp]# dd if=/dev/zero of=1GBtest bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 4.47527 seconds, 240 MB/s

The drivers for RHEL4 and RHEL5 differ (according to modinfo):

RHEL4 ... cciss 2.6.10.RH1
RHEL5 ... cciss 3.6.14-RH1

Perhaps a change made to this driver between these versions explains our
performance issue.
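For reference, a quick way to confirm the driver version on each host (a sketch,
not part of the original report):

modinfo cciss | grep -i version

# on these kernels the driver also exposes controller details here, if present:
cat /proc/driver/cciss/cciss0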

Comment 2 Jarod Wilson 2007-09-02 04:33:32 UTC
The cciss driver has received significant updates for rhel5.1. If you would, please give the latest rhel5.1 
beta kernel a try and let us know if the performance problems persist.

http://people.redhat.com/dzickus/el5/


Comment 3 Peter Klotz 2007-09-05 09:13:56 UTC
We had to reinstall RHEL4U4 since it is a production machine. The only machine I
have left under RHEL5 has no RAID5 (only RAID1+0) and is therefore not a good
test candidate.

Nevertheless, I will try to perform a comparison between the stock RHEL5 kernel
and the updated kernel you supplied.

To show the performance difference I repeated the measurements from Comment #1
under RHEL4U4 on the RAID5 array:

[root@chip machines]# time dd if=3GBtest of=/dev/null
6291456+0 records in
6291456+0 records out

real    0m48.374s
user    0m1.299s
sys     0m9.015s

[root@chip machines]# time dd if=/dev/zero of=1GBtest bs=1M count=1024
1024+0 records in
1024+0 records out

real    0m4.725s
user    0m0.002s
sys     0m3.164s

Comparison of RAID5 performance:

              RHEL4U4   RHEL5
Reading 3GB       48s    188s 
Writing 1GB        5s    130s

I am aware that write performance in particular is influenced by caching, but
since I used the same hardware this should not be an issue.

It seems that RHEL5 does not use caching at all. The disks are 300GB 10K RPM U320
SCSI HDDs, so writing should be much faster than the measured 8.2 MB/s (see
Comment #1).
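A minimal sketch (an addition for illustration, assuming the sysstat package is
installed) of watching per-device throughput and queue utilization while the dd
test runs, to see whether writes are being absorbed by a cache or going straight
to disk; the cciss/c0d1 device name is an assumption for the RAID5 array:

# terminal 1: the write test from above
dd if=/dev/zero of=1GBtest bs=1M count=1024

# terminal 2: per-device statistics every 2 seconds; read the cciss/c0d1 row
iostat -x 2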

Comment 4 Aldemir Akpinar 2007-09-13 08:06:53 UTC
I have installed the latest kernel (2.6.18-47PAE) to our production machine, on
which we are having same problems. But this did not make any differences at all.
Still having load averages around 1000. The machine I have is: HP DL380 G3 with
3x74GB disks having RAID5. The machine has plenty of RAM for a webserver (8GB).
If you want I can provide more information. But I have to revert to machine back
to RedHat4 (or some other Distribution I must say) soon since this issue makes
our website sluggish and unusable.

Comment 5 Peter Klotz 2007-09-13 08:46:22 UTC
Finally I managed to add a RAID5 (using the already mentioned 300GB HDDs) to the
remaining RHEL5 machine. 

The results are odd: even with the stock RHEL5 kernel (2.6.18-8) I obtained very
good numbers.

[root@brain vmtest]# uname -a
Linux brain.tilak.ibk 2.6.18-8.el5 #1 SMP Fri Jan 26 14:15:14 EST 2007 x86_64
x86_64 x86_64 GNU/Linux
[root@brain vmtest]# dd if=3GBtest of=/dev/null
6291456+0 records in
6291456+0 records out
3221225472 bytes (3.2 GB) copied, 17.692 seconds, 182 MB/s
[root@brain vmtest]# dd if=/dev/zero of=1GBtest bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 5.16115 seconds, 208 MB/s

There are two differences between the two machines I used for testing:

* The slow one has 8GB RAM, the fast one only 3GB
* Different controller firmware

The firmware changelog does not mention any fixes for HDD performance issues.
Could the difference in RAM cause such a phenomenon?

Since the 8GB machine is a production server (and currently running RHEL4), it is
not easy to reduce the amount of RAM or to upgrade the firmware.
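One way to test the RAM hypothesis without physically removing DIMMs (a
suggestion added here, not part of the original report) is to temporarily limit
the memory seen by the kernel via the mem= boot parameter; on RHEL5 this means
appending it to the kernel line in /boot/grub/grub.conf (the root= value below is
only a placeholder):

kernel /vmlinuz-2.6.18-8.el5 ro root=/dev/VolGroup00/LogVol00 mem=3072M

# after rebooting, free -m should report roughly 3GB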

Comment 6 Peter Klotz 2007-09-13 09:42:14 UTC
2.6.18-47 performs more or less the same as 2.6.18-8:

[root@brain vmtest]# uname -a
Linux brain.tilak.ibk 2.6.18-47.el5 #1 SMP Tue Sep 11 17:46:21 EDT 2007 x86_64
x86_64 x86_64 GNU/Linux
[root@brain vmtest]# dd if=3GBtest of=/dev/null
6291456+0 records in
6291456+0 records out
3221225472 bytes (3.2 GB) copied, 19.4392 seconds, 166 MB/s
[root@brain vmtest]# dd if=/dev/zero of=1GBtest bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 7.1683 seconds, 150 MB/s

Tomorrow we will shut down our production machine and test it with 3GB RAM under
RHEL5. This should confirm or rule out the firmware (rather than the amount of
RAM) as the origin of our performance issues.

Comment 7 Herbert L. Plankl 2008-07-07 08:29:23 UTC
Now we've updated our production machine from RHEL4U4 to RHEL5.2. The parallel
I/O performance remains really poor (in comparison to RHEL4U4).

machine: see comment #1
OS: RHEL5.2 x86_64
VMware: VMware-server-1.0.6-91891
kernel: 2.6.18-92.el5

Comment 9 Mike Miller (OS Dev) 2008-11-10 16:16:55 UTC
What controller is being used?

Comment 10 Peter Klotz 2009-01-20 12:39:31 UTC
It is an HP Smart Array 6i RAID Controller (see Comment #1).

Comment 11 Kazuo Ito 2010-06-10 11:58:51 UTC
This might be the same as, or somewhat related to, Bug 237605, which was closed
because we did not pay enough attention to it...

I would like to suggest re-running the test after changing the value of
/sys/block/<device>/queue/nr_requests to its old default of 8192, or to something
lower but still higher than the current default of 128, as shown below.
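A minimal sketch (not part of the original comment) of reading and raising
nr_requests for the RAID5 logical drive; cciss devices appear in sysfs with '!'
in place of '/', and c0d1 is an assumption for the affected array:

cat '/sys/block/cciss!c0d1/queue/nr_requests'
echo 8192 > '/sys/block/cciss!c0d1/queue/nr_requests'

# then repeat the dd tests from Comment #1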

Comment 12 Tom Coughlan 2012-07-20 22:02:15 UTC
No reply since June '10. Assuming this is resolved. Closing.