Here's some setup email from a good friend of mine: --------- I've got a Linux question - hoped you could point me at someone who might be able to help. In my latest DE testing I've been able to cause a 2.4.{0,1} production system to get into a state where it is unusable - almost every command you type exits with SEGV - I can't even su - to shut it down! It happens after pounding the box running the deltaedge software (runs as unpriveledged user on a stock kernel, so I don't think there is any way I could have been responsible for mucking the kernel memory). It runs OK for a little ibt and then the server pukes with a SEGV, and then the machine is hosed. When the machine gets into this state, a simple program like: #include <stdio.h> static void *foo[10000]; int main() { int i; for (i = 0; i < 10000; i++) if (foo[i] != NULL) printf("ERROR\n"); } Will start showing error - after i >= 640...really weird considering the C spec says that the entire memory occupied by foo MUST be 0! -------- The application that he's talking about is a big proxy server. He says that he knows that this problem doesn't show up in test9 and he's doing a binary search to figure out exactly where the problem started showing up. Some stats: gcc 2.91.66, roughly 6.2 with a maybe hacked libc ( he'll have to give more information here. ) The machine is a VA box with 2 PIII 750 mhz cpus and a gig of ram. The application makes heavy use of threads, mmap and raw I/O. He's going to see about getting us a solid test case. He'll be added to this bug as soon as I let him know the ID.
Client side test case software (before I forget): http://polygraph.ircache.net/
I'm looking for the kernel rev where the problem appears to be introduced - as the problem doens't occurr at exactly the sampe place every time, it might be a while before I can track it down. So far, 2.4.0-test10 APPEARS to be OK.
2.4.0-test11 and 2.4.0-test12 APPEAR OK too. There is something going on between 2.4.0-test12 and 2.4.0-prerelease which seems to be effecting the performance of the application. I suspect (based on timing information tracked in the app) that the cost of mmap() has increased in situations where process maintains a large number of active mappings (~5000+). It also appears the disk I/O has slowed significantly as well - perhaps related to the above. This behaviour is also shown in production 2.4.0. I'll start looking at the 2.4.1-testX kernels - so far, I can only reproduce regularly on 2.4.1.
It appears that the change causes the behavior went in with 2.4.1-pre1. As noted previously, 2.4.0-prerelease and 2.4.0 both performed horribly compared to 2.4.0-test{9,10,11,12}. 2.4.1-pre1 crashes as intially described and the system becomes unusable - even simple commands such as "ls" and "sync" die with SEGV.
There are two things I'm curious about: could you try booting the kernel with the nofxsr option? Also, does the corruption still occur if you run the machine with no swap?
add dmgrime to the cc list
Tests run under both 2.4.1-pre1 AND 2.4.1: nofxsr && noswap: performance problem as described above, no crash nofxsr : performance problem as described above, crash noswap : performance problem as described above, no crash So, seems like the crash can be prevented by disabling swap, but the performance problem seems to persist from 2.4.0-prerelease through 2.4.1 production. The "performance problem" I keep referring to I will try to dig into - my first instinct points at something with the raw device I/O. I suspect it has to do with concurrent raw requests to mlutiple physical devices, I'm going to rerun some tests with only 1 spindle - the application serializes requests per spindle, so this will rule out a concurrency race.
Can you please test again with swap after applying the following patchball: http://www.kvack.org/~blah/fix-v2.4.1-A.tar.gz Unpack the tarball and apply the patches with for i in fix-v2.4.1-A/*.diff ; do patch -p1 -s -N -E -d linux/ <$i ; done . This has the kiobuf fixes from Stephen, a patch for zeropage COW based on Linus' ideas, and Jens' block fixes. I'm also curious to know which of the patches make a difference (I expect that the zeropage fix is the culprit). -ben
Patched downloaded and applied against stock 2.4.1. The crash symptoms appear to be gone - but the "performance" issue remains. Did something change from 2.4.0-test12 to 2.4.0-prerelease that would effect performance of an application with MANY ( >5000 ) active mmap() segments? There appears to be quite a bit of activity in mm/mmap.c - in particular the removal of "merge_segments()"; perhaps related? I'm going to try stock 2.4.1 with one the zeropage patch next to check stability - update coming soon.
Update: Stock 2.4.1 + 05-zeropage.diff is stable - crash symptoms gone. Performance problems remain. Please see previous note regarding mm/mmap.c.
Ben, I'm assigning this bug to you directly since you are working on it.
Here's a quick update: I was pretty much out of commission last week, but I'm back now and putting together a patch based on the suggestion that the removal of segment merging in the kernel is the source of the problem. I should have it for you later on today, and will update this entry then.
This was fixed for 7.1 final.