156437 – Writing large file to 1TB ext3 volume sometimes very slow

Bug 156437 - Writing large file to 1TB ext3 volume sometimes very slow

Summary: Writing large file to 1TB ext3 volume sometimes very slow

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 4
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	4.0
Hardware:	i386
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	Stephen Tweedie
QA Contact:	Brian Brock
Docs Contact:
URL:
Whiteboard:
Duplicates (2):	161203 175140 (view as bug list)
Depends On:
Blocks:	161203 167231 168429
TreeView+	depends on / blocked

Reported:	2005-04-30 05:04 UTC by Tom Sightler
Modified:	2018-10-19 20:52 UTC (History)
CC List:	12 users (show)
Fixed In Version:	RHSA-2006-0132
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2006-03-07 18:56:45 UTC
Target Upstream Version:
Embargoed:

Attachments	(Terms of Use)
Oprofile output from system performing normally (140.46 KB, text/plain) 2005-04-30 05:05 UTC, Tom Sightler	no flags	Details
Oprofile output from system when copy is running slow (78.11 KB, text/plain) 2005-04-30 05:06 UTC, Tom Sightler	no flags	Details
View All

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHSA-2005:808	0	normal	SHIPPED_LIVE	Important: kernel security update	2005-10-27 04:00:00 UTC
Red Hat Product Errata	RHSA-2006:0132	0	qe-ready	SHIPPED_LIVE	Moderate: Updated kernel packages available for Red Hat Enterprise Linux 4 Update 3	2006-03-09 16:31:00 UTC

Description Tom Sightler 2005-04-30 05:04:07 UTC

From Bugzilla Helper:
User-Agent: Opera/7.54 (X11; Linux i686; U)  [en]

Description of problem:
We have a Dell 6450, 4 - PIII Xeon 700Mhz, 8GB RAM, Qlogic 2340 fiber channel 
adapter.  This system has been connected to EMC Clariion array for some time 
mounting abount 15 LUNS of various sizes, two that are approximately 1TB each.

Recently we added an Apple Xserve RAID to the SAN and mapped a total of 6 LUNS 
from this device (~365GB each) to this same system.  We used lvm to create a 
linear stripe set of three volumes into a VG and then create a LV of the entire 
space giving an ext3 filesystem of approximately 1.1TB.

We then began copying the data from the old EMC 1TB volumes to the new 1.1TB LVM 
volume.  Initial performance seemed great, but approximately 15% into the copy 
every thing slowed way down.

The files being consist of only about 24 40GB files for a total of just about 
960GB.  If we start the copy again there is about a 1 in 4 chance that the copy 
will return back to normal speed.  We have tried using various methods to perform 
the copy, from normal cp, to rsync, to a tar pipe but the problem is consistent.

When rsync is being used the system can copy about 40-50MB/s, cp is typically much 
faster, however, when the copy is slow we see only about 250KB/s.  Other 
interesting factors, when rsync is running quickly, the system shows about 25% CPU 
utilization with a single CPU doing most of the work, and about 17% going to 
rsync, a few % to IO wait, and a few to system, however, once the system slows 
down the profile changes with <1% going to rsync, <1% to IO wait, and 25% 
(basically an entire CPU) going to system.

We initially suspected this problem might be an LVM2/DM issue so we created the 
volumes as older style MD linear devices (/dev/md0) and also tried LVM2 striping 
and MD RAID 0 with the same basic result.  We also tried the beta kernel 2.6.9-6.
37 and 2.6.9-6.40 as well as the hugemem kernel.  We also tried the Qlogic 8.00.
03 from their website.  None of this changed the symptom at all. 

We compiled our own kernel from the src RPM package and installed the debuginfo 
package.  When the problem occurs the oprofile output changes significantly.  I 
have attached the oprofile output from a good run and a bad one.  Basically, when 
the copy is progressing normally most of the time is spent in rsync and in copy_to 
and copy_from user.  When the performance slows the profile shows large amounts of 
time in init_ext3_attr and bh_lru_install and other routines mostly in ext3 and 
jbd.

We attempted to recreate this problem using benchmark tools like bonnie and iozone 
but with those it seems we can hammer the array for hours without seeing any bad 
performance, however, simply copying these very large files around we can usually 
reproduce the problem in only a few minutes, although occasionally it will work 
for hours before the problems shows itself.

We don't seem to be able to reproduce this problem with RHEL3.  Also, we 
eventually got the system to copy all of the files by simply mounting the exact 
same volumes as ext2.  So far we have not been able to reproduce the problem 
mounted ext2 rather than ext3.

Please let us know any other information that we can provide as this is a major 
problem for us.

Thanks,
Tom


Version-Release number of selected component (if applicable):
kernel-2.6.9-5.0.5.ELsmp

How reproducible:
Sometimes

Steps to Reproduce:
1. Create 1TB linear or striped volume of three disks 
2. Format and mount ext3
3. Copy/rsync large (40GB +) files around.

  

Actual Results:  Some copies perform well, others run at a VERY slow rate.

Expected Results:  All copies should occur at roughtly the same speed.

Additional info:

Comment 1 Tom Sightler 2005-04-30 05:05:42 UTC

Created attachment 113882 [details]
Oprofile output from system performing normally

Comment 2 Tom Sightler 2005-04-30 05:06:19 UTC

Created attachment 113883 [details]
Oprofile output from system when copy is running slow

Comment 3 Suzanne Hillman 2005-05-02 17:16:55 UTC

As you mention this is a major problem, you would be wise to contact our support
team, either by going to http://www.redhat.com/support and logging in, or by
calling 800-REDHAT1

Bugzilla is a bug tracking system, and not a support system, so is not suited
for urgent/immediate problems.

Comment 4 Tom Sightler 2005-05-03 03:37:59 UTC

Right, we understand that.  We plan to open an official support ticket this week.  
We are working to setup an environment where we are sure we can reproduce the 
problem consistently.

Historically we've not had much luck with Redhat support.  Especially on issues 
like this that are difficult to reproduce 100%.  Actually, the few times we've had 
Redhat actually provide a fix to our problem it was always via a Bugzilla, while 
the support request languished untouched in the ticket system.  As a matter of 
fact, I don't think we've ever had a support ticket succesfully resovled.

Because of this history we usually open the Bugzilla first because we find that 
sometimes other users that are having similar problems chime in and post comments 
that give us clues to our problem right away.  We then open a support ticket and 
reference the Bugzilla entry.

I guess basically, we open the Bugzilla for the community, we open the support 
ticket for us.

Thanks,
Tom

Comment 5 Stephen Tweedie 2005-05-03 15:28:22 UTC

From the oprofile output:

74044    20.7120  vmlinux           vmlinux           default_idle
29079     8.1341  ext3.ko           ext3              init_ext3_xattr

That looks seriously suspect: 

int __init
init_ext3_xattr(void)
{
	ext3_xattr_cache = mb_cache_create("ext3_xattr", NULL,
		sizeof(struct mb_cache_entry) +
		sizeof(struct mb_cache_entry_index), 1, 6);
	if (!ext3_xattr_cache)
		return -ENOMEM;
	return 0;
}

This function is called once, on module __init, and is then thrown away by the
kernel (due to the "__init" attribute.)  But the fact that you're seeing one CPU
pegged at 100% when the performance goes bad indicates that we should indeed be
seeing excessive CPU cycles somewhere in the kernel.  The question is where they
are really being spent.

Comment 6 Tom Sightler 2005-05-03 18:36:39 UTC

I'll admidt to being an oprofile newbie and I'll admidt that looks bogus,
perhaps we did something wrong.  We couldn't find the actual debuginfo package
for the distribution kernel so we recompiled the kernel source RPM to get a new
kernel and kernel-debuginfo RPM.  We then installed our custom RPM's (both
kernel-smp and kernel-debuginfo) and ran the oprofile.

Is there anything that would make oprofile be so wrong?  Could it be a mistake
on our part?

The CPU is pegged at 100% system time, almost no user or io wait time.

We're trying to setup a lab environment so that will allow us to reproduce the
issue and be more valuable in solving the problem.  We happen to have the spare
capacity to setup an identical 1TB array and attempt to copy back and forth.  I
guess at that point we'll find out just how reproducible the issue really is.

Thanks,
Tom

Comment 7 Stephen Tweedie 2005-05-03 22:20:32 UTC

It could be a mis-match between the running kernel and oprofile's lookup for the
symbols: I'd recommend explicitly telling oprofile where to look for the .ko
files, for example

opreport -p /lib/modules/2.6.9-5/kernel/ -l /ext3

If that still looks bogus, we can always add the "-d" option to log every single
oprofile hit on ext3, but that generates very verbose output indeed.

Comment 8 Tom Sightler 2005-05-04 03:31:22 UTC

The command I actually ran was

opreport -l -p /lib/modules/2.6.9-5.0.5.EL.rootsmp

which seems pretty close to the same.  The -l /ext3 appears to simply limit the 
results to only the ext3 module.  I'm quite confident the modules match the 
running kernel as I compiled the RPM's from source myself and have the following 
installed:

kernel-smp-2.6.9-5.0.5.EL.root
kernel-smp-devel-2.6.9-5.0.5.EL.root
kernel-debuginfo-2.6.9-5.0.5.EL.root

And am running 2.6.9-5.0.5.EL.rootsmp which is my custom compiled kernel.  One 
thing that is suspicious, should there be a kernel-smp-debuginfo package?  Also, 
the debuginfo RPM appears to contain a bunch of files like ext3.ko.debug in /usr/
lib/debug/lib/modules/<kernelversion>/kernel.  If I get rid of these I get totally 
different output from oprofile.

Perhaps you could point me to where I could get the actualy kernel debuginfo 
package for the stock kernel and I'll try again without my "compiled from source" 
versions.

We hope to open a support ticket tomorrow because we think we have this setup in a 
reproducible way.

Thanks,
Tom

Comment 9 Stephen Tweedie 2005-05-04 09:17:14 UTC

There should not be a kernel-smp-debuginfo package: the kernel-debuginfo package
contains all of the debug objects for both the UP and the SMP builds.

If you run with the official errata package, what does "opreport -d /ext3" show?
 I should be able to see the hot spots from that pretty easily, and match them
up to the disassembly of the ext3.ko to work out precisely where we're spending
all the time.

Comment 10 starlight 2005-05-08 02:45:59 UTC

Just encountered the same bug.  Kernel 2.6.9-5.0.5.EL on a P4 uniprocesor.

Have a 250GB Maxtor Maxline II+ disk with a boot partition and the rest of the
disk allocated as a single PV on a single VG and with a single LV holding an
'ext3' FS.  Created with 'mke2fs -v -N 256000 -j -m 0 /dev/vg01/lv00'.

Expanding large two 100GB files via a 'rsh remote cat hugefile.gz | gzip -d
>hugefile'.  'iostat' shows things humming along nicely at 30MB/second until the
'ext3' FS reaches the 128GB mark (134217728 per 'df -k').  Suddenly the transfer
rate drops down to 2MB/second and the 'gzip -d' process starts eating 100% of
the CPU.  'top' reports that it is all system time.  'strace -p <pid> --c'
confirms that vast amounts of time are being spent in the 'write' system call,
about 100 times that of 'read'.  A similar 'strace' run before the 128GB mark
shows 'read' using more time than 'write'.

A clue to the nature of the problem may lie in disk fragmentation.  'fsck -f'
shows an astonishing fragmentation level of 20%.  Wow!  I've never seen a disk
that fragmented ever before under any circumstances--much less as the result of
writing two sequential files immediately after 'mkfs' and 'mount'.

And now for the good news:  The problem goes away entirely if the FS is mounted
as 'ext2' instead of 'ext3'.  The performance wall-crash at 128GB and the
hideous fragmentation disappear, though I can't say I'm impressed by the 5.4%
non-contiguous value reported in this case.  Solaris typically reports values
way under 1%.

So I'm using 'ext2' for the big disk while I wait for the fix.

Comment 11 starlight 2005-05-08 03:18:13 UTC

A potentially important detail for someone trying to reproduce the above
scenario is that a custom version of 'gzip' was used.  This 'gzip' is a
modification to 1.3.3 to write data in 256KB increments (though 'strace' reports
32KB 'write' calls).  The normal version of 'gzip' writes in tiny increments,
particularly on compression operations, and as a result performs horribly on
VxFS under Solaris.  I could have re-tested this with regular 'gzip', but I
don't care to spend further hours on the problem now that a work-around has been
discovered.

Comment 12 Tom Sightler 2005-05-08 03:57:28 UTC

I think it is unlikely to be a problem with your modified gzip.  We've reproduced 
the issue with many different tools.  Our problems first started when we were 
using a simple 'cp' to copy several 40GB files.  We later tried using rsync, and 
then eventualy a tar pipe.  All three methods produce the issue.

We also tested on some smaller hardware without LVM and seem to show the same 
issue.  Actually, we think it is trivially reproducible by simply writing any 
large file over and over on any SMP system (possible UP as well but we haven't 
been able to make it fail there yet).

One thing we found is that sometimes if we continually cancelled and then 
restarted the copy eventually performance would return to normal, at least for 
that one file.

We also worked around our issue by mounting ext2.

We got a little busy last week with other issues but we do have a setup that 
seems to be able to reproduce this issue consistently now so we will probably open 
a support case Monday.

Later,
Tom

Comment 13 Kostas Georgiou 2005-07-20 11:56:00 UTC

We are seeing the same problem as well. We have a java "ftp" server and randomly
some threads will get caught by the problem from strace i can see that writes take
a long time to complete (sys cpu goes up to %100) but the disk is idle (from
istat). The partition is 1.7TB and the transfers are all 1GB files at the
moment. I try different elevators but the problem is still there. I'll try to
get some oprofile numbers as soon as possible.

Comment 17 Stephen Tweedie 2005-09-02 13:06:18 UTC

One other thing to try --- it's likely that the ext3 reservation code is
involved here; does "mount -o noreservation" help?

Comment 18 Tom Sightler 2005-09-02 14:18:45 UTC

I will try this today.  I can usually reproduce the issue in a matter of hours.
 It does still occur with 2.6.9-16 from the beta.

Some other things that we seem to see in our observations, which may or may not
be important:

1.  So far I've been unable to reproduce this on a UP system.
2.  For some reason it seems easier to reproduce with more CPU's.  The system
that normally exhibits the problem is a 4-way, however, it took MUCH longer for
us to get the problem to show up on a 2-way box although it did eventually.

Comment 19 Stephen Tweedie 2005-09-02 15:33:17 UTC

OK, then that sounds like a possible SMP livelock issue.  I've got a 4-way
that's busy doing some other ext3 testing right now, but should be able to get
that trying to reproduce this problem shortly.

Comment 20 starlight 2005-09-02 15:47:39 UTC

Please note my entries above.  The problem can be easily
reproduced on a single CPU system.

Comment 21 Tom Sightler 2005-09-02 16:03:01 UTC

You said P4 uniprocessor.  Is that a HT system?  Those behave more like dual
processors.  Of course I'm not saying it's impossible to reproduce on a single
CPU system, but I've probably passed 100TB of data through my system with my
simple test case (copying 40GB files around on two 1TB volumes) and so far
haven't been able to create a failure with a UP kernel on my box.

On the other hand my 4-way system usually fails somewhere around 4-6 hours into
the copies.

Comment 22 starlight 2005-09-02 16:20:38 UTC

It's a cheap Dell with a single 2.4GHz P4.  Perhaps it depends 
on the exact way the system is stressed (maybe the pattern of 
CPU/write activity inherent in our custom 'gzip'), but the 
problem appears consistently at 128GB whenever the files are 
expanded.  Running 'mkfs' beforehand might be relevant.

I can provide any system details of interest.  Don't have time
right now to run further tests though.

$ cat /proc/cpuinfo
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 15
model		: 2
model name	: Intel(R) Pentium(R) 4 CPU 2.40GHz
stepping	: 7
cpu MHz		: 2392.563
cache size	: 512 KB
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 2
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36
clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe cid xtpr
bogomips	: 4718.59

Comment 23 Tom Sightler 2005-09-03 14:01:13 UTC

My tests systems have passed over 5TB of data since yesterday when I remounted 
with the 'noreservation' option.  Normally I would hit the issue well before this 
point, usually around 500-600GB mark.  I will continue the test over the weekend, 
but at this moment things are looking good.

What are the negative impacts of this option?

Comment 24 Stephen Tweedie 2005-09-05 11:36:54 UTC

Reservations are a new feature of ext3 in the 2.6 kernel intended to reduce
fragmentation: basically it's a way of remembering which bits of the disk
already have files growing into them, so that they don't step on each others' space.

The reservations are remembered in memory only --- they are not present on disk,
so there are no on-disk compatibility issues with having reservations enabled or
not.  Without reservations, we are just using the old 2.4-style method where
each file is growing independently without regard for others; so using -o
noreservation will have no downside relative to the old 2.4 (RHEL-3) behaviour.

Comment 25 Tom Sightler 2005-09-06 15:02:56 UTC

My system survived the weekend after copying over 20TB of data so the
noreservation option sure seems to be a good workaround for the issue.  I guess
that also implies there is a small problem in the reservation code.

Thanks for the suggestion!

Comment 29 Stephen Tweedie 2005-09-08 20:55:50 UTC

I've also been trying to reproduce the problem, but without success so far.  I'm
going to try reformatting with artifically-small block groups; I suspect that
there's a scaling problem when the block groups start to fill up and we have to
walk many block groups for each IO.  That will be exacerbated even on smaller
filesystems if we use very small block groups (the default is 128MB, but we can
go much lower.)

There are a couple of improvements to the reservation code that have gone into
upstream kernels.  I haven't marked them as critical so far because they didn't
seem to be important enough, but it is quite possible that their impact becomes
much greater on larger filesystems where per-block-group overhead is very
important.  Would people be willing to test a kernel with those changes applied?

Comment 30 starlight 2005-09-08 21:16:15 UTC

Have no time to test right now.  However I'm willing to make our 
'gzip' source available and make available raw data files known 
to cause the problem.  The data is public-domain stock price 
information, two 15GB .gz data sets that expand to about 80GB each.  
Each data set consists of about 100 128MB segments that can be 
downloaded from an Apache httpd server using 'curl' or 'wget'.
The problem occurs when a 'rsh RMTSRV cat /???/file.gz_??? | gzip -d >file'
is run from a freshly made 'ext3' file system.  It is 100%
reproducible on my machine.  Alternately any large mostly text
file over 128GB in size should produce the problem after it's
been compresses with 'gzip' and is expanded with our 'gzip'
variant.  Perhaps the normal 1.3 'gzip' will also reproduce
the problem.

Comment 32 Tom Sightler 2005-09-08 22:05:05 UTC

I'm not arguing, but really just curious, if it's a scaling problem, wouldn't it
happen more consistently?  In our case it seems totally random, we have about 50
files that are all about 40GB in size.  Just copying these around is enough to
trigger the problem, but there seems to be no predicting which one will trigger
it (although it seems to happen pretty consistently around 500-600GB, so usually
on the 10-12th file).  Sometimes we can just cancel the copy and start over and
it will just return to normal for the next 3 or 4 files and then suddenly start
going slow again.

We've formatted the ext3 volumes with various options, including block sizes and
the largefile and largefile4 options without being able to make significant
differences (I understand that's not exactly what you're talking about, but just
wanted to point this out as well).

I'd be more that happy to test any kernels.  I have a system that's pretty much
dedicated to testing this issue.

Also, would an oprofile output of the problem occurring be any help?  I can
provide that easily.

Comment 33 starlight 2005-09-08 22:20:54 UTC

Yes, you are arguing.

This is my final entry, I am unsubscribing from this bug.

The last time I tried, I could reproduce the problem 100% (not 
99%, not 98%, not even 97%--a perfect 100%) of the time at the 
128GB mark with the approach I have described on an inexpensive
single-CPU Dell.

I do not have the time or interest to work on it further, but am 
willing to provide quality information to those who are.  Simply 
read my entries and you know most of what I know.

If anyone is interested in any further information from me,
contact me directly.

Comment 34 Tom Sightler 2005-09-09 01:28:30 UTC

No seriously, it was a question.  It was "how could it be" not "no way, it can't 
be", and that's also why it was accompanied by a statement of being willing to 
test ANY patches.  I've certainly been unable to fix it and am willing to 
entertain and test any theories, especially from people much smarter than me, but 
that doesn't mean I'm not curious about those theories.

Once again, I'm willing to test ANY patches on any of my test platforms to see if 
they solve my issues.

Comment 40 Stephen Tweedie 2005-09-21 20:09:25 UTC

OK, I have finally been able to reproduce this.  I think the problem has been
that we've been testing with filesystems that have either been too small, or,
paradoxically, too large.  To reproduce, we need a filesystem that has a lot of
block groups, but also one which is nearly full; tests on small filesystems did
not show the problem because they have few block groups, and so CPU time spent
scanning block groups does not add up to much; and tests on very large
filesystems did not get full enough to trigger the problem.

But I can reproduce with a relatively small 15GB filesystem simply by forcing an
artifically large number of block groups to be created, by formatting with

mke2fs -j -O^resize_inode -N 256 -g 256

Once the filesystem gets above about 96% full for normal writes, I see the same
problem of throughput dropping catastrophically and the kernel spending vast
amounts of system mode CPU time in ext3/jbd.  

I have been able to narrow the problem down, and have built a test kernel which
solves the problem in my local testing.  I have placed x86_64 and x86 kernels
with the fix up for testing at

http://people.redhat.com/sct/.private/test-kernels/kernel-2.6.9-22.EL.sct.1/

This eliminates the problem entirely in my local test case.  It is a merge of a
set of fixes from upstream, so has been well tested in that context.  The
kernels have not been properly QAed and are not supported, but have survived an
SMP filesystem stress test for a couple of hours locally.  It would be useful to
know if they solve the problem in your cases.

Thanks,
 Stephen.

Comment 41 Tom Sightler 2005-09-22 14:18:11 UTC

I will set this up on my test system today.  Hopefully we should have good
results in a couple of days.

I'm a little concerned since in my test cases the filesystems were not what I
would consider "near full".  In some cases they were less than 40% full.  Also,
some cases working even when the drive was 90%+ full.  Would there be any reason
for aparent randomness, perhaps due to the size of the stored files on the
volume or something?

Anyway, I will test this diligently over the next few days.  Thanks for all your
work.

Later,
Tom

Comment 42 Stephen Tweedie 2005-09-22 14:55:01 UTC

Yes, it depends on many factors, including existing fragmentation on the
filesystem, number of files being copied at once, etc.  The reproducer I found
was on a freshly formatted filesystem with no other activity; you would expect
to see different patterns on filesystems with existing data or with multiple
writers.

Comment 44 Tom Sightler 2005-09-25 00:51:47 UTC

Just an update.

The system has survived 48 hours with excellent performance so far.  With previous 
kernels the problem would almost always show up in 8 hours, and never worked for 
24 hours.  I will continue the tests until Monday afternoon which will be 96 hours 
of testing but so far these kernels are looking good.

Will report again on Monday afternoon.

Thanks,
Tom

Comment 45 Tom Sightler 2005-09-27 16:36:53 UTC

OK, the system survived our 96 hours stress test without any hint of slowing
down.  We averaged a sustained throughput of about 50MB/s on each thread and
processed a total of almost 16TB of data with no single thread showing more than
a single digit percentage difference from any other.  Previously we would see
some threads that would takes 50-100x longer for their writes to complete.

I've also attempted to reproduce this issue using a few other test cases on
another hardware platform.  That platform didn't show the problem as much, but
did usually show a slow writer at least once in 24 hours.  That platform also
survived the weekend.

I would have to say that these patches appear to have solved the problem in all
of my test cases.

So now I've got both an acceptable workaround (noreservation option) and a
future supported fix.  I'll assume this will eventually make it's way into
RHEL4, maybe U3?

Thanks so much for your help.
Tom

Comment 46 Stephen Tweedie 2005-09-27 17:06:20 UTC

Yes, I'll queue this for U3 (subject to product/engineering review as usual.)

Comment 47 John Shakshober 2005-09-27 17:40:42 UTC

Performance of memory pressure I/O has improved greatly (in KB/sec)
Iozone to 32GB file on a system with 4 GB memory, using 1M transfers;

              KB         reclen   write  rewrite    read   reread
R4 U2         33554432    1024   43507   39022    46831    46811
R4 sct        33554432    1024  106093   81218    55216    56591

x86_64 for both
2.6.9-20.ELsmp
2.6.9-22.EL.sct.1smp

Comment 52 Stephen Tweedie 2005-11-11 19:52:43 UTC

*** Bug 161203 has been marked as a duplicate of this bug. ***

Comment 54 Stephen Tweedie 2005-12-07 12:39:05 UTC

*** Bug 175140 has been marked as a duplicate of this bug. ***

Comment 55 masanari iida 2006-02-18 02:33:48 UTC

I understand what is "reservation" according to comment #24.
Would you mind to fill up "reservation" and "noreservation" section of
Documentation/filesystems/ext3.txt ?  Then send it to LKML,too.

And man for mount(8) also should have an explanation about "reservation"
and "noreservation"

Thanks.

Comment 57 Red Hat Bugzilla 2006-03-07 18:56:45 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0132.html

Comment 60 Simon Matter 2006-03-10 07:25:24 UTC

Just for the archives, I've seen exactly the same problem on a P3 UP machine
doing rsync from my laptop via 100mbps network. I just didn't understand what's
wrong because it has worked for years against my good old RedHat 7.2 server. I
then tried with scp, rsh/rcp, ftp and whatever and it just didn't work as
expected, always the same. Thanks to everybody involved in fixing it.

Note You need to log in before you can comment on or make changes to this bug.