From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)
Description of problem:
Running Red Hat Linux 7.1, kernel 2.4.2-2
IBM xSeries x340
IBM ServeRAID 4M
I have run several benchmark utilities, including IOtest and tiobench.
The results are as follows: I see the same performance from a single drive
attached to an Adaptec 78xx controller as from a 6-drive RAID-0 attached
to an IBM ServeRAID 4M. I did the most extensive testing with IOtest,
using 8K blocks, 67% read / 33% write, 20000 iterations, with the process
count varied from 2 to 30. Throughput maxed out around 14 processes at
1.2 MB/s on both configurations. However, if I started multiple instances
of IOtest, each with 14 processes, each instance would report 1.2 MB/s on
either the Adaptec or the IBM controller, so with 3 instances of IOtest I
got around 3.6 MB/s total. The other benchmarks varied in their actual
throughput numbers, but they always reported about the same results for
the single drive on the Adaptec as for the 6-drive RAID-0.
Also doing a ps showed the status of the processes as:
[root@test tiobench-0.3.1]# ps -eo pid,%cpu,args,wchan|grep IOtest
4976 0.1 ./IOtest wait4
4979 0.1 ./IOtest wait_on_buffer
4980 0.2 ./IOtest wait_on_buffer
4981 0.2 ./IOtest wait_on_buffer
4982 0.1 ./IOtest wait_on_buffer
4983 0.0 ./IOtest wait_on_buffer
4984 0.0 ./IOtest wait_on_buffer
4985 0.2 ./IOtest wait_on_buffer
4986 0.2 ./IOtest wait_on_buffer
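(The wchan column is the kernel function each process is sleeping in: one
process is in wait4, waiting on its children, and the rest are blocked in
wait_on_buffer, i.e., waiting for buffer I/O to complete instead of
issuing more requests.)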
The ServeRAID 4M is capable of much faster throughput. We have checked the
driver to make sure it was not serializing the I/O requests.
Bug #33309 references problems with the elevator (the kernel's disk I/O
scheduler); I don't know if this is related. I also see very poor
performance on large dd commands.
I see the same results with 2.4.3 and 2.4.4.
Steps to Reproduce:
1. Run any disk benchmark (for example, the sketch below).
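For example (a minimal sketch: /test as the target filesystem and these
sizes are assumptions; the flags are those of tiobench.pl from the
tiobench 0.3.1 tarball):
[root@test tiobench-0.3.1]# ./tiobench.pl --dir /test --size 5000 --block 8192 --threads 10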
How much memory does the machine in question have?
I'll arrange to get the ServeRAID shipped to me so I can test and see if
things can be improved.
It's an xSeries 340 Model 8656-6RY, 1G RAM, 2x1GHz Processors, ServeRAID 4M.
Don't declare me nuts immediately, but does the same slowness also occur if you
boot with "mem=800M" as a parameter?
(If it doesn't, it's a highmem problem; if it does, well, it isn't.)
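(For reference: on a stock Red Hat 7.1 install with LILO you can pass the
parameter once at the boot prompt, e.g. "linux mem=800M" where "linux" is
your image label, or make it persistent in /etc/lilo.conf:
    image=/boot/vmlinuz-2.4.2-2
        label=linux
        append="mem=800M"
and rerun /sbin/lilo so the change takes effect.)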
I tried it, but it had no effect.
I tried to reproduce this on my test machine with an aic7xxx, and tiobench
happily reports 29.86 MB/second streaming performance (but this is with our
current working kernel).
I can't get the ServeRAID card, as QA needs it for testing and validation and we
have only one.
Was that read performance or write? How large was the file, what block size, and how many
processes?
The results I was getting from tiobench, using a 5G file, 8K block size, and 10
processes, were around 24 MB/s write and 12 MB/s read. But I have to use 2.4.3 or 2.4.4, or
I get terrible write performance and the following in /var/log/messages:
(/var/log/messages, using the 2.4.2-2smp kernel)
May 17 10:58:16 localhost kernel: __alloc_pages: 0-order allocation failed.
May 17 10:58:49 localhost last message repeated 324 times
But I'm getting the same throughput results on both the single drive attached
to the onboard controller and a 3-drive RAID0 attached to the ServeRAID.
Also, I'm seeing problems when running tiobench with large files and when doing
large dd transfers: I run out of memory, and the kernel ends up killing off other
processes. It never frees the memory.
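A transfer along these lines is enough to trigger it (the path and size
are illustrative, not the exact command we ran):
[root@test /root]# dd if=/dev/zero of=/test/bigfile bs=1024k count=3072
[root@test /root]# watch free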
Here is the output of free after doing a large dd or tiobench.
[root@test /root]# free
             total       used       free     shared    buffers     cached
Mem:       2059448    2034148      25300          0       3228    1946488
-/+ buffers/cache:      84432    1975016
Swap:            0          0          0
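(Reading that output: of the 2034148K shown as used, 3228K is buffers and
1946488K is page cache, so only 2034148 - 3228 - 1946488 = 84432K is
actually held by processes; the "-/+ buffers/cache" line does exactly this
arithmetic. The cache should be reclaimable under memory pressure rather
than triggering the OOM killer.)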
When it's running, free goes down to about 2K. Once it finishes, it comes
back up a little, but it never frees very much memory. If you try to run the
test again, it will start killing more processes almost immediately. I also get
the following in /var/log/messages:
May 31 08:23:29 localhost kernel: Out of Memory: Killed process 2846 (xterm).
May 31 08:23:33 localhost kernel: Out of Memory: Killed process 1357 (xfs).
May 31 08:23:37 localhost kernel: Out of Memory: Killed process 1644 (xterm).
We also tried the tests on a similar Compaq configuration and got about the
same results: a single drive on the embedded controller performs about the same
as a 3-drive RAID0 on the RAID controller.
If the 2.4.7-0.X kernels still have this, it must be a device-driver issue,
as my megaraid setup achieves > 28 MB/second on a RAID5 partition
(OK, this is not really super fast, but it's limited by the hw RAID5 engine).
IBM: Please retest with the 2.4.7-x kernel available via Rawhide and provide
the results.
Sorry I didn't respond sooner. The problem ended up not being a disk
performance problem (exactly). The test we were running, tiobench, wasn't
saturating the controller, so it appeared that it wasn't getting very good
throughput. The real problem became clear when we did large sequential
read/write operations. If you do a dd on a really large file (>2G), it
runs fine for a little while, but it slowly eats up all your virtual
memory. It even starts killing off other processes because there isn't
enough free memory to go around. Also, it doesn't free the memory once the
process completes. I tried it on 2.4.7; it's not as bad as in earlier
versions, but it's still a problem.
It's basically the same problem.
KH 9/14/01 Per discussion with submitter, closing this bug as a duplicate of 33309.
*** This bug has been marked as a duplicate of 33309 ***