Red Hat Bugzilla – Bug 42355
Poor disk performance
Last modified: 2007-04-18 12:33:25 EDT
Description of problem:
Running Red Hat Linux 7.1, kernel 2.4.2-2
IBM xSeries x340
IBM ServeRAID 4M
I have run several benchmark utilities, including:
The results are as follows. I see the same performance from a single drive
attached to an Adaptec 78xx controller as from a 6-drive RAID-0 attached
to an IBM ServeRAID 4M. I did the most extensive testing with IOtest, using
8K blocks, 67% read / 33% write, 20000 iterations, and 2 to 30 processes.
Throughput maxed out at around 14 processes, at 1.2 MB/s on both configs.
However, if I started multiple instances of IOtest, each with 14 processes,
each one would report 1.2 MB/s on either the Adaptec or the IBM. So with 3
instances of IOtest I got around 3.6 MB/s in total. The results from other
benchmarks varied as far as the actual throughput numbers, but they always
reported about the same results for the single drive on the Adaptec as for
the 6-drive RAID-0.
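For anyone trying to reproduce the workload described above, here is a minimal sketch of a mixed random read/write loop with the same parameters (8K blocks, 67% read / 33% write). This is not IOtest's actual implementation; the function names, the random-offset access pattern, and the small scratch file are assumptions for illustration. It goes through the page cache, so a meaningful run needs a file much larger than RAM.

```python
import os
import random
import tempfile
import time

BLOCK_SIZE = 8 * 1024   # 8K blocks, as in the report
READ_FRACTION = 0.67    # 67% reads, 33% writes


def run_mixed_io(path, file_size, iterations, seed=0):
    """Issue random 8K reads and writes; return (reads, writes, MB/s)."""
    rng = random.Random(seed)
    blocks = file_size // BLOCK_SIZE
    payload = b"\0" * BLOCK_SIZE
    reads = writes = 0
    fd = os.open(path, os.O_RDWR)
    try:
        start = time.perf_counter()
        for _ in range(iterations):
            offset = rng.randrange(blocks) * BLOCK_SIZE
            if rng.random() < READ_FRACTION:
                os.pread(fd, BLOCK_SIZE, offset)
                reads += 1
            else:
                os.pwrite(fd, payload, offset)
                writes += 1
        os.fsync(fd)  # include the flush to disk in the timing
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    megabytes = iterations * BLOCK_SIZE / 1e6
    rate = megabytes / elapsed if elapsed > 0 else float("inf")
    return reads, writes, rate


def demo():
    # Tiny scratch file (hypothetical size) so the sketch finishes quickly.
    size = BLOCK_SIZE * 256
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\0" * size)
        path = tmp.name
    try:
        return run_mixed_io(path, size, iterations=2000)
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(demo())
```

Running several copies of this in parallel would correspond to the "multiple processes" and "multiple instances" cases in the report.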
Also doing a ps showed the status of the processes as:
[root@test tiobench-0.3.1]# ps -eo pid,%cpu,args,wchan|grep IOtest
4976 0.1 ./IOtest wait4
4979 0.1 ./IOtest wait_on_buffer
4980 0.2 ./IOtest wait_on_buffer
4981 0.2 ./IOtest wait_on_buffer
4982 0.1 ./IOtest wait_on_buffer
4983 0.0 ./IOtest wait_on_buffer
4984 0.0 ./IOtest wait_on_buffer
4985 0.2 ./IOtest wait_on_buffer
4986 0.2 ./IOtest wait_on_buffer
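To summarize the listing above, the following sketch tallies the processes by wait channel. The helper name `tally_wchan` is made up for illustration; the input is the ps output exactly as pasted.

```python
from collections import Counter

# The ps output from the report: pid, %cpu, args, wchan.
PS_OUTPUT = """\
4976 0.1 ./IOtest wait4
4979 0.1 ./IOtest wait_on_buffer
4980 0.2 ./IOtest wait_on_buffer
4981 0.2 ./IOtest wait_on_buffer
4982 0.1 ./IOtest wait_on_buffer
4983 0.0 ./IOtest wait_on_buffer
4984 0.0 ./IOtest wait_on_buffer
4985 0.2 ./IOtest wait_on_buffer
4986 0.2 ./IOtest wait_on_buffer
"""


def tally_wchan(text):
    """Count processes per wait channel (last column of each line)."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 4:
            counts[fields[-1]] += 1
    return counts


if __name__ == "__main__":
    for wchan, n in tally_wchan(PS_OUTPUT).most_common():
        print(wchan, n)
```

Eight of the nine processes are in wait_on_buffer, i.e. blocked waiting for buffer I/O, and the remaining one (presumably the parent) is in wait4 waiting for its children. Together with the near-zero %cpu figures, this suggests the workers are I/O-bound rather than CPU-bound.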
The ServeRAID 4M is capable of much higher throughput. We have checked the
driver to make sure it was not serializing the I/O requests.
Bug #33309 references problems with the elevator. I don't know if this is
related. I also see very poor performance on large dd commands.
I see the same results in 2.4.3 and 2.4.4.
Steps to Reproduce:
1. Run any disk benchmark.
How much memory does the machine in question have?
I'll arrange to get the ServeRAID shipped to me so I can test and see if
things can be improved.
It's an xSeries 340 Model 8656-6RY, 1G RAM, 2x1GHz Processors, ServeRAID 4M.
Don't declare me nuts immediately, but does the same slowness also occur if you
boot with "mem=800M" as a parameter?
(If it doesn't, it's a highmem problem; if it does, well, it isn't.)
I tried it, but it had no effect.
I tried to reproduce this on my test machine with an aic7xxx, and tiobench
happily reports 29.86 Mbyte/second streaming performance (but this is with our
current working kernel).
I can't get the ServeRAID card, as QA needs it for testing and validation and we
have only one.
Was that read perf or write? How large was the file, what block size, how many
The results I was getting from tiobench using a 5G file, 8K block size, and 10
processes were around 24M/s write and 12M/s read. But I have to use 2.4.3 or
2.4.4, or I get terrible write performance and the following in /var/log/messages:
(/var/log/messages, using the 2.4.2-2smp kernel)
May 17 10:58:16 localhost kernel: __alloc_pages: 0-order allocation failed.
May 17 10:58:49 localhost last message repeated 324 times
But I'm getting the same throughput results on both the single drive attached
to the onboard controller and a 3 drive RAID0 attached to the ServeRAID.
Also I'm seeing problems when running tiobench with large files, and when doing
large dd transfers. I run out of memory and it ends up killing off other
processes. It never frees the memory.
Here is the output of free after doing a large dd or tiobench.
[root@test /root]# free
             total       used       free     shared    buffers     cached
Mem:       2059448    2034148      25300          0       3228    1946488
-/+ buffers/cache:      84432    1975016
Swap:            0          0          0
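To make the free output easier to read, here is the arithmetic behind the "-/+ buffers/cache" row, using the numbers from the listing above (all in KB). The variable names are only for illustration.

```python
# Values copied from the "Mem:" row of the free output (KB).
total, used, free_kb = 2059448, 2034148, 25300
shared, buffers, cached = 0, 3228, 1946488

# "-/+ buffers/cache" row: memory used excluding buffers and page cache,
# and memory that would be free if buffers and cache were released.
used_minus_cache = used - buffers - cached
free_plus_cache = free_kb + buffers + cached

# Fraction of total RAM held by the page cache.
cache_share = cached / total

if __name__ == "__main__":
    print(used_minus_cache, free_plus_cache, round(cache_share * 100, 1))
```

This gives 84432 KB and 1975016 KB, matching the second row, with roughly 94.5% of RAM held as page cache. In other words, almost all of the "used" memory is cache from the dd or tiobench run, not memory held by processes.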
While it's running, free memory goes down to about 2K. Once it finishes it comes
back up a little, but it never frees very much memory. If you try to run the
test again it will start killing more processes almost immediately. I also get
the following in /var/log/messages.
May 31 08:23:29 localhost kernel: Out of Memory: Killed process 2846 (xterm).
May 31 08:23:33 localhost kernel: Out of Memory: Killed process 1357 (xfs).
May 31 08:23:37 localhost kernel: Out of Memory: Killed process 1644 (xterm).
We also tried the tests on a similar Compaq config. We end up getting about the
same results. A single drive on the embedded controller performs about the same
as a 3 drive RAID0 on the raid controller.
If the 2.4.7-0.X kernels still have this, it must be a device-driver issue
as my megaraid setup achieves > 28 megabyte/second on a raid5 partition
(ok this is not really super fast but it's limited by the hw raid5 engine)
IBM: Please retest with the 2.4.7-x kernel available via rawhide and provide
Sorry I didn't respond sooner. The problem ended up not being a disk performance
problem (exactly). The test we were running, tiobench, wasn't saturating the
controller, so it appeared that it wasn't getting very good throughput. The
real problem was made clear when we did large sequential read/write ops. If you
do a dd on a really large file (>2G), it runs fine for a little while but it
slowly eats up all your virtual memory. It even starts killing off other
processes because there isn't enough free memory to go around. Also, it doesn't
free up the memory once the process completes. I tried it on 2.4.7. It's not as
bad as earlier versions, but it's still a problem.
It's basically the same problem.
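For anyone who wants to watch the memory behaviour during a large sequential write, here is a rough sketch that writes a file in 1 MB chunks and periodically reports MemFree and Cached from /proc/meminfo. It is a stand-in for the dd test, not the commands actually used in this report; the function names, chunk size, and reporting interval are assumptions. To match the report, the target would need to be a file larger than 2G on the disk under test.

```python
import os
import tempfile


def read_meminfo(text=None):
    """Parse /proc/meminfo-style text into {field: value in kB}."""
    if text is None:
        with open("/proc/meminfo") as f:
            text = f.read()
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def sequential_write(path, total_bytes, chunk=1024 * 1024,
                     report_every=64, report=print):
    """Write total_bytes of zeros to path, reporting memory stats as it goes."""
    block = b"\0" * chunk
    written = 0
    chunks_done = 0
    with open(path, "wb") as f:
        while written < total_bytes:
            n = min(chunk, total_bytes - written)
            f.write(block[:n])
            written += n
            chunks_done += 1
            if chunks_done % report_every == 0:
                mem = read_meminfo()
                report(written, mem.get("MemFree"), mem.get("Cached"))
    return written


def demo():
    # Small scratch write (hypothetical size) just to exercise the code path;
    # report_every is set high so /proc/meminfo is not read here.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        size = 3 * 1024 * 1024 + 5
        written = sequential_write(path, size, report_every=10**9)
        return written, os.path.getsize(path)
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(demo())
```

On an affected kernel one would expect MemFree to fall and Cached to grow steadily as the write proceeds, which is the pattern described above.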
KH 9/14/01 Per discussion with submitter, closing this bug as a duplicate of 33309.
*** This bug has been marked as a duplicate of 33309 ***