I ran two fsx instances on a local 2-disk RAID0 mount, one on an NFS mount, and an fs-stress run on the same local RAID0. After about 10 minutes, the box went bang with the attached oopses/panic.
Created attachment 137603: serial console capture
Dave, my first quick thought is that this might be a dup of bugs #208404 / #207739, in case you want to re-test w/ the patch from those bugs... I'll need to dig up the exact trees that these problems were reported on to make sure it's the same BUG_ON we're hitting.
Ok, it was wishful thinking that this is a dup of those bugs, although the same root cause may still be behind it. The patch which "fixed" those bugs is already in this oopsing kernel, but it still wound up at pretty much the same place.
I'll look into this a bit.
Can't seem to reproduce this on anything but Dave's box... From the attached oops, we went down the path of the last call to journal_do_submit_data in journal_submit_data_buffers, I think.
At long last, I can hit this one now too. The key, I think, is that it needs a block size < page size, although it still took many hours for me to hit it (about 12...). As we gain understanding of the bug I'll try to whip up a better testcase.
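For anyone else trying to reproduce, here is a rough sketch of how one might check whether a given mount meets the block size < page size condition before kicking off a long run. This is only an illustration, not part of any testcase: the helper names are made up, the "/" path is a placeholder for the mount under test, and it assumes that the f_bsize value reported by statvfs matches the filesystem's block size.

```python
import os


def block_smaller_than_page(block_size: int, page_size: int) -> bool:
    """Return True when the suspected trigger condition (block < page) holds."""
    return block_size < page_size


def fs_block_size(path: str) -> int:
    """Block size reported by statvfs for the filesystem containing path.

    Assumption: f_bsize reflects the on-disk block size of the filesystem.
    """
    return os.statvfs(path).f_bsize


def page_size() -> int:
    """System page size in bytes."""
    return os.sysconf("SC_PAGE_SIZE")


if __name__ == "__main__":
    mount = "/"  # placeholder: replace with the mount point under test
    bsize = fs_block_size(mount)
    psize = page_size()
    print(f"{mount}: block size {bsize}, page size {psize}")
    if block_smaller_than_page(bsize, psize):
        print("block size < page size: this mount may be able to hit the bug")
    else:
        print("block size >= page size: probably not a useful mount for this bug")
```

If the check says the mount is not a candidate, recreating the test filesystem with a smaller block size should be the way to get there.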
This should be fixed in today's FC6 kernel.