Bug 31141

Summary: Poor VM performance (tar, mkisofs)
Product: [Retired] Red Hat Linux
Reporter: Ed McKenzie <eem12>
Component: kernel
Assignee: Stephen Tweedie <sct>
Status: CLOSED RAWHIDE
QA Contact: Brock Organ <borgan>
Severity: medium
Priority: medium
Version: 7.1
Hardware: i386
OS: Linux
Doc Type: Bug Fix
Last Closed: 2001-04-07 20:43:40 UTC

Description Ed McKenzie 2001-03-09 05:31:18 UTC
After a backup cycle (tar, mkisofs, cdrecord...) the system slows to a
crawl. Large amounts of swap are being used -- excessive amounts, actually,
since the total seems to be greater than the combined address spaces of
everything I'm running. Bringing the system down to single-user mode doesn't
improve things.
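
Roughly, the cycle is something like the following (the paths, device, and
speed are only illustrative), after which I compare swap usage against the
total size of everything running:

    # illustrative backup cycle -- actual paths and devices differ
    tar cf /backup/home.tar /home
    mkisofs -r -o /backup/backup.iso /backup
    cdrecord dev=0,0,0 speed=4 /backup/backup.iso

    # then compare swap in use with the combined process address space
    free
    ps -eo vsz= | awk '{sum += $1} END {print sum " kB total VSZ"}'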

Running 2.4.2-0.1.19. Incidentally, overall system performance and
interactivity during large tar/mkisofs operations are really bad compared to
2.2.16-22 or 2.2.17-14 -- the system is basically unusable under 2.4.x.

Comment 1 Michael K. Johnson 2001-03-13 04:10:02 UTC
This is probably a combination of several things, including a need for
more VM tuning, but if you are using SCSI, it could also be a bug that
causes SCSI requests to be issued serially.

You might try the latest kernels from rawhide to see what kinds of
improvements (or not...) you find.  Thanks!

Comment 2 Ed McKenzie 2001-03-13 04:59:18 UTC
This is on IDE (on a Promise ATA66 controller). I tested 2.4.2-0.1.25 today,
and while it does seem slightly less bad than 0.1.19, it's still worse than
2.2.19pre, where the mouse doesn't get choppy at all. This is all subjective,
of course. Things get worse the longer an operation runs, e.g. big tarfiles
really kill the system towards the end.

My feeling from looking at vmstat is that too much paging is going on, and
possibly the elevator is not tuned properly.
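
For example, watching something like this while the tar runs (the interval is
arbitrary):

    # sample memory/swap/block-IO activity every 5 seconds during the run
    vmstat 5
    # columns of interest: si/so (swap-in/out) and bi/bo (blocks in/out);
    # sustained nonzero si/so is what makes this look like excessive paging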

Comment 3 Ed McKenzie 2001-03-13 06:41:39 UTC
s/and possibly/and also possibly/

(the two items of speculation are not necessarily related)

Comment 4 Stephen Tweedie 2001-03-13 10:31:27 UTC
It's definitely the VM: we've added a couple of elevator fixes recently, but
what's left doesn't look like it can be explained by IO performance.  The kernel
simply swaps too much.

2.4.2-ac18 tunes this up slightly but doesn't fix it completely.  We're working
forward from that.

Comment 5 Ed McKenzie 2001-03-26 01:31:51 UTC
With kernel-2.4.2-0.1.28 (in fact, every 2.4 kernel I've tried), large dd's
cause the system to become almost completely unresponsive -- ls -lR takes a
_really_ long time as compared to 2.2.16-22, and netscape takes several minutes
to start.  It seems to me that this is an elevator issue, as vmstat shows
nothing being paged.  IIRC, 2.2.14-5.0 had this problem as well, and that was
before the 2.2 elevator was fixed.
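
A rough way to reproduce it (the size and paths are just what I happen to use):

    # keep the write queue full with a large sequential write
    dd if=/dev/zero of=/tmp/bigfile bs=1024k count=1024

    # in another shell, time an unrelated read-heavy workload
    time ls -lR /usr > /dev/null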

Comment 6 Ed McKenzie 2001-03-26 01:33:13 UTC
s/paged/swapped/, of course.

Comment 7 Ed McKenzie 2001-03-27 00:15:13 UTC
Filing elevator issues as a separate bug, as I'm definitely seeing a regression
relative to recent 2.2 wrt elevator starvation. See bug 33309.
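
In the meantime, the 2.4 elevator latencies can at least be inspected and
lowered with elvtune from util-linux as a stopgap; the values below are only
an example, and the defaults differ between kernels.

    # show the current elevator settings for the disk (substitute the real device)
    elvtune /dev/hda

    # lower the read latency so reads aren't starved behind a long write queue
    elvtune -r 1024 -w 8192 /dev/hda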

Comment 8 Arjan van de Ven 2001-04-07 20:25:42 UTC
Please try kernels 2.4.2-0.1.40 or 2.4.2-0.1.49 from RawHide. We changed the
VM tuning, and a LOT of people are very happy with the result.

Please reopen this bug if the tuning is not working well for you.
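
Something along these lines -- the exact package file name depends on your
architecture, so treat it as approximate:

    # install the rawhide kernel alongside the current one (-i rather than -U,
    # so the old kernel stays available as a fallback)
    rpm -ivh kernel-2.4.2-0.1.49.i686.rpm
    # then add an entry for it in /etc/lilo.conf and rerun /sbin/lilo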


Comment 9 Ed McKenzie 2001-04-07 20:43:36 UTC
I'm extremely happy with overall VM performance! Unfortunately, I can still
create near-total disk starvation by doing a large dd, and it's much easier (and
more effective) than on 2.2.18 :-/

Comment 10 Ed McKenzie 2001-04-07 20:45:10 UTC
Elevator issues followed up in bug 33309