Red Hat Bugzilla – Bug 232401
rpc.idmapd can be DOS'ed.
Last modified: 2008-04-04 08:42:04 EDT
Description of problem:
An NFS4 server on FC6/x86_64 (dual core) can be DOS'ed by an older 32-bit client
(approximately kernel-smp-2.6.9/nfs-utils-1.0.6-70). rpc.idmapd logs an error
and stops responding, which leaves the entire NFS4 service dead for all
clients. I could trigger the bug reproducibly by running rsync on the client,
syncing a local disk to an initially empty NFS4-mounted directory. Restarting
rpc.idmapd revives the service only until the next execution of rsync.
Version-Release number of selected component (if applicable):
How reproducible:
Always, given the data.
Steps to Reproduce:
1. rsync -av user /net/server/home
The NFS4 server logs:
Mar 14 11:32:21 server rpc.idmapd: nfsdcb: id '-2' too big!
Mar 14 11:32:35 server kernel: nfs4_cb: server another_client_ip/server_ip not
responding, timed out
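The log does not say why rpc.idmapd rejects the id '-2'. One possible explanation, offered purely as an assumption and not confirmed anywhere in this report, is a word-size mismatch: if the daemon parses the decimal string into a 64-bit unsigned long (C's strtoul negates the value modulo 2^64 when given a leading minus sign on x86_64) and then checks that the result fits in 32 bits, '-2' fails the check, whereas on a 32-bit build it would wrap to 4294967294 and pass. The Python sketch below only mimics that arithmetic; parse_unsigned and fits_in_uint32 are hypothetical helper names, not idmapd functions.

```python
def parse_unsigned(text: str, word_bits: int) -> int:
    """Mimic C strtoul(): a leading '-' negates the value modulo 2**word_bits."""
    return int(text, 10) % (1 << word_bits)


def fits_in_uint32(value: int) -> bool:
    """True if the value can be stored in a 32-bit unsigned id field."""
    return 0 <= value <= 0xFFFFFFFF


# Assumption: the server parses with a 64-bit unsigned long, the client with 32 bits.
as_64 = parse_unsigned("-2", 64)
as_32 = parse_unsigned("-2", 32)

print(hex(as_64), fits_in_uint32(as_64))  # 64-bit parse: does not fit in 32 bits
print(hex(as_32), fits_in_uint32(as_32))  # 32-bit parse: fits
```

If this reading is right, the same id string would be accepted on a 32-bit server and rejected on a 64-bit one, which would match the 32-bit client and x86_64 server combination described above.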
The NFS4 client logs:
Mar 14 11:32:52 client kernel: NFS: v4 server returned a bad sequence-id error!
Mar 14 11:32:54 client kernel: decode_getfattr: xdr error 10008!
(The clocks on the client and server might have been a few seconds off.)
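The numeric code in the decode_getfattr line can be looked up in the NFSv4 status table. As a small sketch, the snippet below pulls the code out of the client log lines quoted above and maps it to the names defined in RFC 3530 (10008 is NFS4ERR_DELAY, 10026 is NFS4ERR_BAD_SEQID). The table holds only these two entries and the helper name xdr_error_name is made up for illustration; that the kernel message carries the raw NFSv4 status value is an assumption.

```python
import re

# Partial table of NFSv4 status codes (RFC 3530); only the two relevant here.
NFS4_STATUS = {10008: "NFS4ERR_DELAY", 10026: "NFS4ERR_BAD_SEQID"}

CLIENT_LOG = """\
Mar 14 11:32:52 client kernel: NFS: v4 server returned a bad sequence-id error!
Mar 14 11:32:54 client kernel: decode_getfattr: xdr error 10008!
"""


def xdr_error_name(line):
    """Return the status name for an 'xdr error N!' log line, or None if absent."""
    match = re.search(r"xdr error (\d+)!", line)
    if match is None:
        return None
    code = int(match.group(1))
    return NFS4_STATUS.get(code, "unknown status %d" % code)


for line in CLIENT_LOG.splitlines():
    name = xdr_error_name(line)
    if name is not None:
        print(name)
```

NFS4ERR_DELAY asks the client to retry later, which would be consistent with a server that is stalled waiting on rpc.idmapd.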
The server ultimately hangs after a few restarts of rpc.idmapd with:
kernel: BUG: soft lockup detected on CPU#1!
kernel: Call Trace:
kernel: [<ffffffff8026999a>] show_trace+0x34/0x47
kernel: [<ffffffff802699bf>] dump_stack+0x12/0x17
kernel: [<ffffffff802b6d9b>] softlockup_tick+0xdb/0xf6
kernel: [<ffffffff80293cdd>] update_process_times+0x42/0x68
kernel: [<ffffffff802749e7>] smp_local_timer_interrupt+0x34/0x55
kernel: [<ffffffff8027509b>] smp_apic_timer_interrupt+0x51/0x69
kernel: [<ffffffff8025ccf6>] apic_timer_interrupt+0x66/0x70
kernel: [<ffffffff8826187a>] :sunrpc:svc_close_socket+0xa/0xa9
kernel: [<ffffffff882603bd>] :sunrpc:svc_destroy+0x67/0xc4
kernel: [<ffffffff882ff901>] :nfsd:nfsd+0x29e/0x2b1
kernel: [<ffffffff8025ced8>] child_rip+0xa/0x12
This should never happen.
For the record, I have not seen the problem since I upgraded to
Occasionally, I get the following warnings instead:
nfs4_cb: server 64bit_client_ip/server_ip�����.gnu.linkonce.this_module not
responding, timed out
- Yes, these strange (0xff) characters are there! I think there is some mistake
in the format...
On the 64-bit client, I also once got:
VFS: Busy inodes after unmount. Self-destruct in 5 seconds. Have a nice day...
after automount unmounted the file system.
BTW, I have scanned the bug list for duplicates. Bug 225507 appears to be similar.
FWIW, I haven't seen any of these recently, after a number of kernel upgrades.
Running 2.6.20-1.2962.fc6 now.
Fedora apologizes that these issues have not been resolved yet. We're
sorry it's taken so long for your bug to be properly triaged and acted
on. We appreciate the time you took to report this issue and want to
make sure no important bugs slip through the cracks.
If you're currently running a version of Fedora Core between 1 and 6,
please note that Fedora no longer maintains these releases. We strongly
encourage you to upgrade to a current Fedora release. In order to
refocus our efforts as a project we are flagging all of the open bugs
for releases which are no longer maintained and closing them.
If this bug is still open against Fedora Core 1 through 6, thirty days
from now, it will be closed 'WONTFIX'. If you can reproduce this bug in
the latest Fedora version, please change to the respective version. If
you are unable to do this, please add a comment to this bug requesting
Thanks for your help, and we apologize again that we haven't handled
these issues to this point.
The process we are following is outlined here:
We will be following the process here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this
doesn't happen again.
And if you'd like to join the bug triage team to help make things
better, check out http://fedoraproject.org/wiki/BugZappers