Red Hat Bugzilla – Bug 148840
Kernel Oops in rpc.mountd on NFS servers with a large number of NFS mounts
Last modified: 2015-01-04 17:17:01 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3)
Description of problem:
In a cluster of 290 dual-Xeon FC3 (2.6.10-1.760) machines we
were experiencing several kernel crashes (Oops) a day until
Neil Brown identified a critical (for us) patch in the RPC
The details of this discussion and a proof of principle patch may
be found starting with,
The proof of principle patch has now been running on our 290 node
cluster for over 6 days without a single crash.
The question here is how to get the cleaned up version of the patch
integrated into FC3 to avoid having to patch all of our systems.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Install FC3 on 290 Linux machines
2. Have each node cross-mount one filesystem from every other node
3. Run data intensive analysis jobs that read from all 290 filesystems
on every node
4. Wait a few hours for a kernel Oops.
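To give a sense of the scale of step 2, here is a minimal sketch that generates the NFS mount commands a single node would run. The report does not say how the mounts were actually configured, so the hostnames (node001 to node290), the export path /data, and the mount root /mnt are all hypothetical.

```python
def cross_mount_commands(local, nodes, export="/data", mount_root="/mnt"):
    """Return the mount commands that node `local` would run to mount one
    filesystem from every other node.

    The export path and mount root are assumptions for illustration only.
    """
    return [
        f"mount -t nfs {host}:{export} {mount_root}/{host}"
        for host in nodes
        if host != local  # a node does not NFS-mount its own export
    ]


# Hypothetical hostnames for the 290-node cluster.
nodes = [f"node{i:03d}" for i in range(1, 291)]
commands = cross_mount_commands("node001", nodes)

print(len(commands))               # mounts per node
print(len(nodes) * len(commands))  # mounts across the whole cluster
print(commands[0])
```

With 290 nodes, each node holds 289 NFS mounts and also serves 289 clients, for 83,810 mounts cluster-wide, which is the load under which rpc.mountd was crashing.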
If you can attach the patch to this bugzilla, I'll take a look at
including it until it gets merged upstream.
Created attachment 111190
This patch is now in 2.6.12-rc1. Any idea when it might be merged into a FC3
An update has been released for Fedora Core 3 (kernel-2.6.12-1.1372_FC3) which
may contain a fix for your problem. Please update to this new kernel, and
report whether or not it fixes your problem.
If you have updated to Fedora Core 4 since this bug was opened, and the problem
still occurs with the latest updates for that release, please change the version
field of this bug to 'fc4'.
Unfortunately, the SMP version of 2.6.12-1.1372_FC3 will not boot on these
I am now able to boot kernel-smp-2.6.12-1.1372_FC3 since
mkinitrd-184.108.40.206-1.i386.rpm was released today in test/update