Red Hat Bugzilla – Bug 464792
Disk IO slows down when memory usage is high with vm.overcommit_memory=2
Last modified: 2008-12-13 12:02:42 EST
Created attachment 318083 [details]
lspci -vvv output
Description of problem:
I don't know that I am going to be able to give you enough information to track this down, but I'll describe what I am seeing so as to get the issue on record.
I have a dual core Xeon with 2 GB of real memory and 10 GB of swap, configured not to use the OOM killer (vm.overcommit_memory = 2 and vm.overcommit_ratio = 50). I boot with the nomodeset (to avoid a display problem) and elevator=deadline (to try to get better interactive response) kernel parameters.
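For reference, a minimal sketch of how this configuration can be applied and inspected at runtime; the device name sda is an assumption, substitute your own block device:

```shell
# Apply the overcommit settings described above (requires root).
sysctl -w vm.overcommit_memory=2     # strict accounting, no OOM killer
sysctl -w vm.overcommit_ratio=50     # allow swap + 50% of RAM to be committed

# Verify they took effect.
cat /proc/sys/vm/overcommit_memory
cat /proc/sys/vm/overcommit_ratio

# The active I/O elevator is shown in brackets; elevator=deadline on the
# kernel command line sets it at boot, or it can be changed here per-device.
cat /sys/block/sda/queue/scheduler
```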
Occasionally when I do yum updates, yum's memory usage balloons to over 2 GB (this is supposedly a bug in the rpm libs), and when this happens I often see problems. When I am remote I can get normal ping responses from the machine, but opening new ssh or smtp sessions will hang. When I am local the mouse will move extremely slowly and things will eventually appear to hang. In at least one case I was able to kill off yum and things seemed to return to normal.
The last kernel I saw this happen on was 2.6.27-0.354.rc7.git3.fc10.x86_64 though I am now using 2.6.27-0.372.rc8.fc10.x86_64.
Version-Release number of selected component (if applicable):
It isn't easy to reproduce.
Steps to Reproduce:
I had this happen again with kernel 2.6.27-0.372.rc8.fc10.x86_64. Updating lots of kde components seems to be a good way to trigger the yum/rpmlibs bug which triggers the apparent kernel bug.
In this case my console X session slowly ground to a halt and I needed to reboot the system, as things were dead before I could kill the yum process.
Created attachment 318096 [details]
I forgot to add some relevant information about file system and block devices.
All my filesystems are ext3. Everything (including swap) is using dm-crypt on top of md raid 1.
I have now seen this with the 382 kernel. Pings still got a response, but the ssh sessions I had open (one running yum and another just sitting with a shell prompt) did not respond.
I have not seen this happen with my home system which uses filesystems on raid devices, but not encrypted devices (except for temporarily mounted usb drives).
I will have a good chance to get this to happen again Monday morning when I get access to the machine again. Most likely trying the yum update again will get the lockup to happen again. If you have something that I can do to collect more information about the problem please let me know. I'll be checking this bug report before trying any updates.
This was not happening with f9 and I did see this fairly soon after changing to Rawhide in September. So it is likely a regression in the 2.6.26 or 2.6.27 kernels. I didn't see an applicable bug on the kernel regressions list though.
Sounds like lack of low memory for I/O.
Is there something that would be helpful for me to do to help diagnose this?
(In reply to comment #8)
> I have a dual core xeon with 2GB of real memory and 10Gb of real memory
You mean 2GB real and 10GB of swap?
Yes, 2GB real (ECC) memory and 10GB swap (on an encrypted mirrored partition).
I'm not seeing how this is a kernel bug. You have an app that's eating all your memory (and then swap). As I understand it, overcommit=2 will only allow 50% (1GB in your case) of real memory to be 'committed' at any time, so the allocation that follows once that has been met will hit swap, which is obviously going to impact disk IO throughput.
That it didn't show up in F9 doesn't necessarily mean it's a kernel regression. Yum/rpm may have developed a leak in Rawhide, for example.
Disagree: in no overcommit mode we shouldn't be thrashing that badly and hanging - that means either the overcommit isn't working or the VM has gone to crap.
The underlying bugs are long-standing kernel VM bugs (plural, to be more accurate). Also, overcommit=2 will ensure swap backing exists but will not use it unless we are out of physical RAM; it ensures we have room to swap stuff out nicely. It should constrain the scale of the resulting mess and is worth trying.
There are two key bugs in the general area I know about:
1. The ext3 journal thread has no ioprio set, so the CFQ scheduler everyone seems to like to default to gets confused and tries to punish it as an I/O hog; ditto the writeout threads. Arjan posted fixes for this many months ago but they are still not upstream. They make my I/O far more stable and way faster under load. If we don't apply those fixes we shouldn't ship CFQ, as it's broken out of the box.
2. Under load the VM writeout threads start randomly writing back chunks of memory, which ruins the nice ordered writeout from the fs and trashes your disk performance. There is no easy fix, and it seems some of the VM people have entirely lost the plot here. And to maximise the damage, #1 kicks in the writeback threads, making the effect of #2 even worse.
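As a hedged illustration of the kind of mitigation #1 describes (this is a runtime workaround sketch, not the upstream fix Arjan posted; the kjournald thread name and the choice of the real-time I/O class are assumptions about a typical ext3 system):

```shell
# Give the ext3 journal threads a real-time I/O class so CFQ stops
# treating them as ordinary I/O hogs (requires root).
for pid in $(pgrep kjournald); do
    ionice -c1 -p "$pid"            # -c1 = real-time I/O scheduling class
    ionice -p "$pid"                # print the class to confirm the change
done
```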
On a box without an IOMMU, paging to try and get memory under the 4GB boundary makes the entire thing even worse. I wouldn't be surprised if all the extra allocation and poking by the encryption code makes the mess bigger again.
However, testing with overcommit=2 and deadline scheduling ought to have avoided problem #1 for the most part. The lack of an IOMMU will cause thrashing of some sort with a 32-bit-capable disk controller, but even under high allocations and 2GB of memory usage on a 10GB box it shouldn't be becoming unusable.
Is the kernel in use 32 or 64bit ?
Are the disks on the PATA port or the AHCI SATA ports ?
I am running a x86_64 kernel.
I didn't build the system myself, so am not 100% sure what the disks were plugged in to. Is there a command I can run to get a definitive answer here or do I need to open the box up and look at things?
I am not sure that I used up all of my allocation when this occurred. Swap + 50% of real memory is 11GB. I was seeing a difference when yum was using about 2GB of virtual memory as opposed to about 0.5GB. If I were running so close to the maximum that an extra 1.5GB puts me over, I would expect to be seeing problems from processes not being able to run on a regular basis.
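The commit-limit arithmetic above can be checked directly; a small sketch using this machine's figures (2 GB RAM, 10 GB swap, ratio 50 — adjust for other boxes; on a live system the kernel reports the same numbers as CommitLimit and Committed_AS in /proc/meminfo):

```shell
# Reproduce the overcommit_memory=2 commit limit:
#   CommitLimit = swap + RAM * overcommit_ratio / 100
ram_gb=2
swap_gb=10
overcommit_ratio=50

commit_limit_gb=$(( swap_gb + ram_gb * overcommit_ratio / 100 ))
echo "CommitLimit: ${commit_limit_gb} GB"   # prints: CommitLimit: 11 GB

# On a running system, compare against what the kernel actually enforces:
#   grep -E 'CommitLimit|Committed_AS' /proc/meminfo
```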
The dmesg boot log will tell you which controller has which disk
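A hedged sketch of the inspection described above; the grep patterns are assumptions about the usual driver probe messages and may need widening on other hardware:

```shell
# Driver probe messages name the controller each disk was attached to.
dmesg | grep -i -e ahci -e 'ata[0-9]' -e fusion -e 'scsi [0-9]'

# Cross-check against the storage controllers the PCI bus reports,
# including which kernel driver is bound to each.
lspci -k | grep -i -A 2 -e sata -e sas -e ide -e raid
```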
Created attachment 321182 [details]
It looks like the devices are on a SCSI bus and hence are attached to the AHCI controller, but since I am not an expert at reading the dmesg output, I've attached it.
(In reply to comment #11)
> There are two key bugs in the general area I know about:
> 1. the ext3 journal thread has no ioprio set so the CFQ scheduler everyone
> seems to like to default to gets confused and tries to punish it as an io hog,
> ditto the writeout threads. Arjan posted fixes for this many months ago but
> they are still not upstream. Makes my I/O far more stable and way faster under
> load. If we don't apply those fixes we shouldn't ship CFQ as its broken out of
> the box
AKPM NAKed that approach. See:
Yeah Andrew whines about it and proposes other stuff that doesn't work. Not my problem, my kernels all have that fixed and they flatten Red Hat ones on usability.
Anyway, this system has a SAS MPT Fusion controller for the disks. Fusion is 64-bit, so this workload is showing a real kernel problem and it's not 32-bit unbalancing; something is busted.
I am not sure if this is another aspect of the same problem or a different one.
Recently I have been having stalls while transferring data to an encrypted USB device. However, the system recovers from these when the transfer finishes. Occasionally the X display will freeze up (the cursor doesn't track the mouse and nothing else changes on the screen) for times on the order of a minute, and then things happen suggesting that at least some input was captured during the freeze. After some pauses the X server will restart. I don't know if it is crashing or if something decides it isn't running any more and restarts it.
Created attachment 321555 [details]
After the latest xorg restart I took a look in /var/log/messages and found that there was relevant information. There was plenty of swap space available. There were some references to low and high memory, but I don't know enough to tell if there was a shortage of a particular kind of memory causing the problem.
I haven't seen the machine lock up unrecoverably in a long time, but when copying 100s of megabytes of data to my encrypted USB drive I regularly see processes stall, though not all at the same time. And it isn't unusual for X to restart. When I've checked swap usage it's virtually nothing.
This bug appears to have been reported against 'rawhide' during the Fedora 10 development cycle.
Changing version to '10'.
More information and reason for this action is here:
Recently things seem to be improving, but since the problem didn't always manifest itself when doing a lot of file i/o I am not sure it is really fixed.
I am currently running 220.127.116.11-137.fc10.x86_64.
I think it has been long enough that we can consider this fixed. The worst I have seen in the last week was subsecond delays in music being played and that might just be normal expected disk contention. I haven't had any minute+ periods without response nor have I seen X crashes due to failure to allocate memory nor have I had a yum update lock up the system.