Description of problem:
The kernel crashes under heavy load on large (>1TB, md-backed) XFS volumes when
the kernel stack size is only 4k.
How reproducible:
Once or twice per week, under heavy load.
Steps to Reproduce:
1. heavy load from several sources: httpd, nfs, scp, dd
Actual results:
XFS-related kernel panic, or a complete freeze.

Expected results:
Work without flaws, maybe?
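The write side of the load above could be approximated with a sketch like this (hypothetical: only parallel dd writers; the original mix also included httpd, nfs and scp traffic, and the target path is an example — point TARGET at the XFS mount to actually stress it):

```shell
#!/bin/sh
# Several parallel streaming writers, in the spirit of the report.
# TARGET defaults to a throwaway directory so the script is safe to try.
TARGET=${TARGET:-/tmp/xfs-load}
mkdir -p "$TARGET"
for i in 1 2 3 4; do
    dd if=/dev/zero of="$TARGET/load.$i" bs=1M count=16 conv=fsync 2>/dev/null &
done
wait
ls "$TARGET"
```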
I have the impression this is a known issue: XFS and 4k stacks do not mix
anymore. You should decide to either (1) switch to the default 8k stacks or
(2) remove XFS support, since a lot of people are losing data (one of the
crashes required a complete reformat, since xfs_repair was segfaulting on such
a mess)...
We really need to see the traces to be able to do anything with this.
Yep, stacking xfs over MD and nfs and ... is going to push the limits of the 4k
stack. The SGI guys have done a little work to try to trim down stack usage, but
adding IO layers adds it back up again.
Seeing traces would at least identify the particular call chain that you ran into...
A couple of thoughts on making this more robust/safe for Fedora users...
perhaps we could refuse to even load the xfs module into a 4k-stacks kernel
unless loaded with an "i_want_to_live_on_the_edge" module option or some such...
also, Russell has suggested the possibility of making stack size a boot time
parameter; but I bet that will go over like a ton of bricks upstream.
We need more info on how this actually failed, or we can't even begin to address it.
Created attachment 156797 [details]
Panic backtrace of XFS stack overflow
Kernel panic trace from a Hyperthreaded P4 running 2.6.20-1.2316.fc5smp.
The system provides NFS access to a 1.3T XFS volume on software raid 5.
The machine is also used to analyze the data written to the discs.
Under heavy load (two memory-intensive processes) plus a constant (~1 MB/s)
write rate to the discs, the machine falls over after an unpredictable time,
anywhere from a few hours to a few days.
Thanks, I'll look over that stack backtrace. Interestingly, there seem to be
no xfs functions in the stack which caused the initial warning. FWIW, in the
past I think I've seen the stack overflow warning itself cause the *actual*
overflow.
Honestly, xfs + any complex IO stack will have a very hard time fitting into 4k.
SGI has been chipping away at the problem, but the low-hanging fruit has all
been found... If this is critical to your workflow, I would seriously suggest
using an x86_64 box for the purpose, which has 8k stacks (yes, they must share
that 8k with irqs, but in my experience it's been a much more robust combination).
I'll look over those stack traces & see if anything interesting pops out, though.
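Whether a given box is in the risky configuration can be checked quickly along these lines (a sketch; the config file path is the usual Fedora location but may differ on other setups):

```shell
# Architecture first: x86_64 kernels use 8k stacks, so only i686 is at risk.
uname -m
# On i686, CONFIG_4KSTACKS in the running kernel's config is the tell.
grep CONFIG_4KSTACKS "/boot/config-$(uname -r)" 2>/dev/null \
    || echo "no config file found / option not present"
```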
Ah, I think the backtrace has the stack warning intermingled with the subsequent
oops; it will take a bit to untangle. (but yeah, xfs is in there) :)
*** Bug 240077 has been marked as a duplicate of this bug. ***
The other fun thing here is that the stack warning *itself* tends to push the
stack over the edge, if you're close enough to make it go off. Neat, eh?
I've sent something upstream to see about addressing this, but it hasn't been
picked up yet.
I'm going to close this as "UPSTREAM" - I'm never going to fully resolve this
bug, though I do occasionally try to whack down the worst offenders when I can.
Realistically, the problem is known, and it's going to take upstream work (from
SGI on xfs, and maybe core kernel work to ease stack pressure for stacked IO).
For now, as of Fedora devel today, xfs is reasonably functional on 4k stacks,
especially if you don't put it over a volume manager...