Red Hat Bugzilla – Bug 77564
UDMA causes serious filesystem/hdd lag
Last modified: 2007-04-18 12:48:19 EDT
Description of problem:
Using any UDMA mode causes the HDD/filesystem to lock up for varying lengths
of time when there is heavy disk activity.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. TEST 1: Copy a large file (300+MB) from the network to RedHat's HDD using
FTP or Samba, and watch bandwidth monitor on other computer. This is using a
default RedHat install with no HDD or FS tweaks whatsoever.
2. TEST 2: Copy a large file from the network to RedHat using FTP,
to /dev/null, and watch bandwidth monitor.
3. TEST 3: Turn "unmaskirq" off (unsure if made a difference), and switch
to "mdma2" mode, using hdparm. Repeat the same 300+MB network file transfer
test over Samba or FTP, writing to the HDD.
4. TEST 4: Try to get drive stats using hdparm directly after cancelling
failed laggy transfer of test 1.
5. TEST 5: Switch to any PIO mode using hdparm and retry the 300+MB network
file transfer test, writing to the HDD.
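For anyone repeating tests 3 and 5: hdparm's -X option takes a numeric transfer mode, which the hdparm man page gives as 8 + n for PIO mode n, 32 + n for multiword DMA mode n, and 64 + n for UltraDMA mode n. The Python sketch below only does that arithmetic and builds the command line. The function names, the mode limits, and the choice to pair -d0 with PIO and -d1 with DMA modes are assumptions for illustration, not the exact commands used in the original tests.

```python
# Base offsets for hdparm's numeric -X argument (from the hdparm man page).
XFER_BASE = {"pio": 8, "mdma": 32, "udma": 64}
# Highest mode number accepted per family in this sketch (standard ATA modes;
# whether a given drive or controller supports them is a separate question).
MAX_MODE = {"pio": 4, "mdma": 2, "udma": 6}


def hdparm_xfer_value(mode):
    """Translate a mode name such as 'udma4' into hdparm's numeric -X value."""
    family = mode.rstrip("0123456789").lower()
    digits = mode[len(family):]
    if family not in XFER_BASE or not digits:
        raise ValueError("unrecognised transfer mode: %r" % mode)
    n = int(digits)
    if n > MAX_MODE[family]:
        raise ValueError("mode number out of range: %r" % mode)
    return XFER_BASE[family] + n


def hdparm_command(device, mode, unmaskirq=False):
    """Build (but do not run) an hdparm argument list for the given mode."""
    value = hdparm_xfer_value(mode)
    # Assumption: use DMA (-d1) for mdma/udma modes and disable it (-d0) for PIO.
    dma_flag = "-d0" if value < XFER_BASE["mdma"] else "-d1"
    irq_flag = "-u1" if unmaskirq else "-u0"
    return ["hdparm", irq_flag, dma_flag, "-X%02d" % value, device]


print(hdparm_command("/dev/hda", "mdma2"))  # test 3: unmaskirq off, mdma2
print(hdparm_command("/dev/hda", "udma4"))  # UDMA66 corresponds to UDMA mode 4
print(hdparm_command("/dev/hda", "pio4"))   # test 5: a PIO mode
```

The list is only printed here; actually running it needs root and changes the drive's transfer mode, so check the values against `hdparm -i` output first.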
Actual Results: TEST 1: The transfer goes nice at the start, about ~9MB/s,
then suddenly dips to ZERO throughput after about 5 seconds or less (could be
after writes are committed from cache/journal? - which happens every 5
seconds, right?), and remains at zero throughput for a few seconds, then
starts going at ~9MB/s again for a few seconds, then zero throughput again,
and repeat until the file is done. Sometimes it will hover between ~0.5MB/s
and ~4MB/s after a while, going up and down continuously.
TEST 2: Transfer is a solid ~9MB/s all the way through to the end of the file.
TEST 3: Transfer is a solid ~5MB/s all the way through.
TEST 4: hdparm hangs for up to a few minutes until finally returning to the
prompt. Ctrl-C does not work.
TEST 5: The transfer goes at about 9MB/s for up to 5 seconds, again, and then
suddenly dips to ~2MB/s. After a few seconds it goes back up to 9MB/s. Then
back to 2MB/s. Repeat until file is done. It never dips to zero throughput,
and is consistent in its 2MB/s<->9MB/s switches.
Expected Results: TEST 1: Solid 9MB/s, just like RedHat 7.3 used to do for
me, and just like RedHat 7.3 is doing for me now after i reinstalled it over
the top of the *extremely buggy* RedHat 8.0.
TEST 2: As expected.
TEST 3: As expected. Multi-word DMA isn't as good as Ultra DMA, of course, but
this perhaps proves that there is a problem with the UDMA code in RedHat 8.0,
but not PIO or MDMA.
TEST 4: hdparm should respond immediately and return the command prompt, like
it does normally when there is low disk activity.
TEST 5: I guess i expect the actual results. I'm not too sure on how
journalling works, or how RedHat's write caching works, but it appears that
PIO mode has no problem. 9MB/s into the cache, then while it's
committing/flushing cache, it throttles to about 2MB/s, then fills up the
cache again, then repeat.
Redhat 7.3 works great. RedHat 8.0 doesn't. Yet i can't see much difference in
the kernel versions.
I am using an Asus A7V133-C motherboard, VIA KT133A chipset, UATA66 Quantum
Fireball Plus KA 9.1GB (Primary Master, UDMA66+ cable), UATA100 Maxtor
DiamondMax Plus D740X 80GB (Secondary Master UDMA66+ cable), SMC 1211TX
Realtek NIC, TNT2. All filesystems written to during the tests were ext3.
Note that i didn't do much testing on the Maxtor HDD, and the tests that i did
with multi-word DMA worked fine on the Quantum HDD, but NOT on the Maxtor. The
Maxtor had problems no matter what i did.
I also didn't do any local tests, as for the major part of my testing +
frustration, i was under the impression that the NIC was having problems. I
now have RedHat 7.3 installed, so it's too late to test.
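Since no local tests were run, here is a rough sketch of what one could look like: write a large file in fixed-size chunks and record throughput per interval, so that the stall pattern from test 1 would show up without the NIC involved. Everything here (the function name, its parameters, the optional fsync per chunk) is hypothetical. Calling fsync after each chunk changes the write pattern compared to an FTP or Samba copy, so treat the numbers as indicative only.

```python
import os
import time


def timed_write(path, total_bytes, chunk_bytes=1 << 20, interval=1.0, fsync=True):
    """Write total_bytes of zeros to path; return MB/s measured per interval."""
    chunk = b"\0" * chunk_bytes
    samples = []
    written = 0
    window_bytes = 0
    window_start = time.monotonic()
    with open(path, "wb") as f:
        while written < total_bytes:
            n = min(chunk_bytes, total_bytes - written)
            f.write(chunk[:n])
            if fsync:
                # Push the data to the disk so the page cache cannot hide stalls.
                f.flush()
                os.fsync(f.fileno())
            written += n
            window_bytes += n
            elapsed = time.monotonic() - window_start
            if elapsed >= interval:
                samples.append(window_bytes / max(elapsed, 1e-9) / 1e6)
                window_bytes = 0
                window_start = time.monotonic()
    # Account for whatever was written after the last full interval.
    if window_bytes:
        elapsed = time.monotonic() - window_start
        samples.append(window_bytes / max(elapsed, 1e-9) / 1e6)
    return samples
```

Calling something like `timed_write("/mnt/test/big.bin", 300 * 1024 * 1024)` (the path is made up) on a partition of each drive and looking for samples near zero would show whether the lag is tied to the disk rather than the network.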
These tests were done on a standard RedHat 8.0 install with very little
configuration, and up2date'd (including kernel) as of 8th November 2002.
Also note, i am not sure if this is a *kernel* bug, or a bug of the files
which interact with the kernel. Seeing as the kernel did not change much
between 7.3 and 8.0, i would guess that it might not be a kernel bug, but
where the heck this bug report belongs to, i don't know.
Again, RedHat 7.3, re-installed this morning, up2date'd (including kernel) as
of 9th November 2002, works great, just like RedHat 7.3 has always worked for
me.
This is funny, since the 7.3 update kernel is virtually identical to the 8.0
update kernel.
Yeah, i thought they looked pretty identical, which is why i'm puzzled as to
whether this is even a kernel bug. :(
Well.. I was busy up until today so i haven't been able to test more until
now. Now i find that i have the same problem again in RedHat 7.3 - up2date'd
or with the original rpms (but not a fresh install, just rpm -U --oldpackage
back to the original rpms).
It *was* working fine before i did more post-install configuration, which
leads me to believe i've done something to the BIOS, or RedHat has done
something funky with my configuration after i installed more stuff.
I have encountered a problem in the past with installed programs (rp-pppoe, in
fact) stomping over the kernel, causing performance issues and forcing me to
reboot (even killing the program won't work), so i'm not going to ignore the
possibility that something has fiddled with settings or is running at startup
and causing problems with the kernel.
Not sure if it means much, but some relevant dmesg output is:
Uniform Multi-Platform E-IDE driver Revision: 6.31
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: IDE controller on PCI bus 00 dev 21
VP_IDE: chipset revision 6
VP_IDE: not 100% native mode: will probe irqs later
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: VIA vt82c686b (rev 40) IDE UDMA100 controller on pci00:04.1
ide0: BM-DMA at 0xd800-0xd807, BIOS settings: hda:DMA, hdb:pio
ide1: BM-DMA at 0xd808-0xd80f, BIOS settings: hdc:DMA, hdd:pio
hda: QUANTUM FIREBALLP KA9.1, ATA DISK drive
hdc: MAXTOR 6L080J4, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide1 at 0x170-0x177,0x376 on irq 15
hda: 18041184 sectors (9237 MB) w/371KiB Cache, CHS=1123/255/63, (U)DMA
hdc: 156355584 sectors (80054 MB) w/1819KiB Cache, CHS=155114/16/63, UDMA(100)
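To make the dmesg lines above easier to compare across kernels or installs, here is a small Python sketch that pulls out the per-drive BIOS setting, size and reported transfer mode. The regular expressions are written against the exact lines quoted in this report and are an assumption about the format; other driver revisions may print something different.

```python
import re

# Lines copied from the dmesg output quoted in this report.
SAMPLE = """
ide0: BM-DMA at 0xd800-0xd807, BIOS settings: hda:DMA, hdb:pio
ide1: BM-DMA at 0xd808-0xd80f, BIOS settings: hdc:DMA, hdd:pio
hda: QUANTUM FIREBALLP KA9.1, ATA DISK drive
hdc: MAXTOR 6L080J4, ATA DISK drive
hda: 18041184 sectors (9237 MB) w/371KiB Cache, CHS=1123/255/63, (U)DMA
hdc: 156355584 sectors (80054 MB) w/1819KiB Cache, CHS=155114/16/63, UDMA(100)
"""

BIOS_RE = re.compile(r"BIOS settings:\s*(.*)$")
DISK_RE = re.compile(r"^(hd[a-z]): (\d+) sectors \((\d+) MB\).*,\s*(\S+)\s*$")


def summarize(dmesg_text):
    """Return {drive: {...}} with the BIOS setting, size and transfer mode."""
    drives = {}
    for raw in dmesg_text.splitlines():
        line = raw.strip()
        m = BIOS_RE.search(line)
        if m:
            # e.g. "hda:DMA, hdb:pio" -> one name:mode pair per drive
            for pair in m.group(1).split(","):
                name, mode = pair.strip().split(":")
                drives.setdefault(name, {})["bios"] = mode
            continue
        m = DISK_RE.match(line)
        if m:
            name, _sectors, size_mb, xfer = m.groups()
            drives.setdefault(name, {}).update(size_mb=int(size_mb), xfer=xfer)
    return drives


for name, info in sorted(summarize(SAMPLE).items()):
    print(name, info)
```

On the sample, the one visible difference between the two disks is the last field: hda reports "(U)DMA" while hdc reports "UDMA(100)".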
Does "VP_IDE: not 100% native mode: will probe irqs later" mean anything bad?
I'll continue testing. Please let me know if there's a specific way i can test
or change some useful settings.
Is it worth installing some older or newer kernels (rawhide maybe? or build a
new kernel directly from kernel.org?) and see if they work?
OK, never mind, might as well close this bug. I've narrowed it down even
further and it turns out that it's yet *another* bad Maxtor product. Probably
not a faulty drive, just Maxtor's terrible products. I've had nothing but
problems with them in the past. This drive didn't work properly in Windows to
start with so i moved it to Linux for storage in hope that it'd work well.
I tested the Maxtor and Quantum drives on a KT333 motherboard and they acted
The Maxtor's problems are compounded by ext3's journaling. When i mount the
partitions on both drives as ext2, the Quantum goes a solid 9.5-11.5MB/s in
both directions (read/write) in multiple tests of ~700MB files - absolutely no
problem at all even in udma66. With ext3 though it dips every ~5 seconds a
little when the journal/buffer is flushed (i guess? honestly i don't know much
about journaling) and gives somewhat slower performance (but acceptable and
However, the Maxtor and ext2 start to have problems after about 10 seconds of
writing ~9MB/s. It falls down to about 0.5-2MB/s, hovering up and down between
the two. If i abort the transfer and quickly delete the file, delete takes up
to a minute to complete, whereas on the Quantum drive, deletion is instant. In
ext3, the 9MB/s write spurts end after a few seconds and sink to zero
throughput for a few seconds, then repeats. Reading from Maxtor in ext2 or
ext3 works like writing in ext3, it goes at 9MB/s for a few seconds, then
0MB/s for a few seconds, and so on.
Sorry for the hassle.
The lesson? Don't buy anything Maxtor. :)