Bug 166152
Summary: | unmounting nfs fs causes badness in interruptible_sleep_on_timeout | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | erikj |
Component: | kernel | Assignee: | Steve Dickson <steved> |
Status: | CLOSED RAWHIDE | QA Contact: | Brian Brock <bbrock> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | rawhide | CC: | bevan, davej, dominik, greenrd, jburgess777, john.ellson, katzj, m.a.young, mesrik, prarit, wtogami |
Hardware: | ia64 | ||
OS: | Linux | ||
Doc Type: | Bug Fix | ||
Last Closed: | 2005-11-23 06:07:10 UTC | Type: | --- |
Bug Blocks: | 163350 |
Description
erikj
2005-08-17 15:37:16 UTC
This is reproducible on the 2.6.13-1.1526_FC4smp kernel in FC4 on an i686:

Sep 30 14:52:55 itspc-1-28 kernel: Badness in interruptible_sleep_on_timeout at kernel/sched.c:3297 (Not tainted)
Sep 30 14:52:55 itspc-1-28 kernel: [<c031732f>] interruptible_sleep_on_timeout+0xf7/0x113
Sep 30 14:52:55 itspc-1-28 kernel: [<c012b9e1>] group_send_sig_info+0x59/0x63
Sep 30 14:52:55 itspc-1-28 kernel: [<c011d046>] default_wake_function+0x0/0xc
Sep 30 14:52:55 itspc-1-28 kernel: [<dfbcd48b>] lockd_down+0xbe/0x120 [lockd]
Sep 30 14:52:55 itspc-1-28 kernel: [<dfc20cbe>] nfs_kill_super+0x5e/0x62 [nfs]
Sep 30 14:52:55 itspc-1-28 kernel: [<c0169fd8>] deactivate_super+0x5d/0x6e
Sep 30 14:52:55 itspc-1-28 kernel: [<c017ef44>] sys_umount+0x33/0x73
Sep 30 14:52:55 itspc-1-28 kernel: [<c017bf6b>] destroy_inode+0x3f/0x4e
Sep 30 14:52:55 itspc-1-28 kernel: [<c0108055>] do_syscall_trace+0xef/0x123
Sep 30 14:52:55 itspc-1-28 kernel: [<c017ef9b>] sys_oldumount+0x17/0x1b
Sep 30 14:52:55 itspc-1-28 kernel: [<c010395d>] syscall_call+0x7/0xb

Perhaps I should have mentioned this was with nfs-utils-1.0.7-11.

Same thing happens on FC4 with

# rpm -q kernel-smp nfs-utils
kernel-smp-2.6.13-1.1526_FC4
nfs-utils-1.0.7-11

It does not happen with kernel-smp-2.6.12-1.1456_FC4, same nfs-utils. Should I file this under FC4 kernel?

Oct 5 21:09:11 lab-s1 kernel: Badness in interruptible_sleep_on_timeout at kernel/sched.c:3297 (Not tainted)
Oct 5 21:09:11 lab-s1 kernel: [<c031732f>] interruptible_sleep_on_timeout+0xf7/0x113
Oct 5 21:09:11 lab-s1 kernel: [<c012b9e1>] group_send_sig_info+0x59/0x63
Oct 5 21:09:11 lab-s1 kernel: [<c011d046>] default_wake_function+0x0/0xc
Oct 5 21:09:11 lab-s1 kernel: [<f89cc48b>] lockd_down+0xbe/0x120 [lockd]
Oct 5 21:09:11 lab-s1 kernel: [<f8c7fcbe>] nfs_kill_super+0x5e/0x62 [nfs]
Oct 5 21:09:11 lab-s1 kernel: [<c0169fd8>] deactivate_super+0x5d/0x6e
Oct 5 21:09:11 lab-s1 kernel: [<c017ef44>] sys_umount+0x33/0x73
Oct 5 21:09:11 lab-s1 kernel: [<c017bf6b>] destroy_inode+0x3f/0x4e
Oct 5 21:09:11 lab-s1 kernel: [<c0108055>] do_syscall_trace+0xef/0x123
Oct 5 21:09:11 lab-s1 kernel: [<c017ef9b>] sys_oldumount+0x17/0x1b
Oct 5 21:09:11 lab-s1 kernel: [<c010395d>] syscall_call+0x7/0xb

Also happens on FC4 x86_64 with

kernel-smp-2.6.13-1.1526_FC4
nfs-utils-1.0.7-11

Oct 5 17:05:07 lxr2 kernel: Badness in interruptible_sleep_on_timeout at kernel/sched.c:3297 (Not tainted)
Oct 5 17:05:07 lxr2 kernel:
Oct 5 17:05:07 lxr2 kernel: Call Trace:
  <ffffffff8033c2c8>{interruptible_sleep_on_timeout+131}
  <ffffffff80131654>{default_wake_function+0}
  <ffffffff88245367>{:lockd:lockd_down+207}
  <ffffffff8825b857>{:nfs:nfs_kill_super+78}
  <ffffffff80186531>{deactivate_super+95}
  <ffffffff8019ca23>{sys_umount+739}
  <ffffffff801107ea>{syscall_trace_enter+217}
  <ffffffff80110827>{syscall_trace_leave+55}
  <ffffffff8010daa2>{tracesys+113}
  <ffffffff8010db02>{tracesys+209}

I see that nfs-utils-1.0.7-12 has just been released. No change though.

# rpm -q kernel-smp-2.6.13 nfs-utils
kernel-smp-2.6.13-1.1526_FC4
nfs-utils-1.0.7-12.FC4

Oct 6 14:02:59 lab-s1 kernel: Badness in interruptible_sleep_on_timeout at kernel/sched.c:3297 (Not tainted)
Oct 6 14:02:59 lab-s1 kernel: [<c031732f>] interruptible_sleep_on_timeout+0xf7/0x113
Oct 6 14:02:59 lab-s1 kernel: [<c012b9e1>] group_send_sig_info+0x59/0x63
Oct 6 14:02:59 lab-s1 kernel: [<c011d046>] default_wake_function+0x0/0xc
Oct 6 14:02:59 lab-s1 kernel: [<f89cc48b>] lockd_down+0xbe/0x120 [lockd]
Oct 6 14:02:59 lab-s1 kernel: [<f8c7fcbe>] nfs_kill_super+0x5e/0x62 [nfs]
Oct 6 14:02:59 lab-s1 kernel: [<c0169fd8>] deactivate_super+0x5d/0x6e
Oct 6 14:02:59 lab-s1 kernel: [<c017ef44>] sys_umount+0x33/0x73
Oct 6 14:02:59 lab-s1 kernel: [<c017bf6b>] destroy_inode+0x3f/0x4e
Oct 6 14:02:59 lab-s1 kernel: [<c0108055>] do_syscall_trace+0xef/0x123
Oct 6 14:02:59 lab-s1 kernel: [<c017ef9b>] sys_oldumount+0x17/0x1b
Oct 6 14:02:59 lab-s1 kernel: [<c010395d>] syscall_call+0x7/0xb

Me too on 2.6.13-1.1601_FC5/x86_64.

Looks to me like a recurrence of the old bug 132726, which was addressed by the patch linux-2.6.8-lockd-racewarn2.patch. That patch was dropped in kernel-2_6_12-1_1396.

Me too on FC4, this time on IA32, so it's not just a 64-bit issue.

kernel-smp-2.6.13-1.1526_FC4
nfs-utils-1.0.7-12.FC4

Oct 20 21:03:41 leto kernel: Badness in interruptible_sleep_on_timeout at kernel/sched.c:3297 (Not tainted)
Oct 20 21:03:41 leto kernel: [<c031732f>] interruptible_sleep_on_timeout+0xf7/0x113
Oct 20 21:03:41 leto kernel: [<c012b9e1>] group_send_sig_info+0x59/0x63
Oct 20 21:03:41 leto kernel: [<c011d046>] default_wake_function+0x0/0xc
Oct 20 21:03:41 leto kernel: [<f8b8448b>] lockd_down+0xbe/0x120 [lockd]
Oct 20 21:03:41 leto kernel: [<f8bffcbe>] nfs_kill_super+0x5e/0x62 [nfs]
Oct 20 21:03:41 leto kernel: [<c0169fd8>] deactivate_super+0x5d/0x6e
Oct 20 21:03:41 leto kernel: [<c017ef44>] sys_umount+0x33/0x73
Oct 20 21:03:41 leto kernel: [<c017a1a0>] dput+0x126/0x258
Oct 20 21:03:41 leto kernel: [<c0165066>] __fput+0x139/0x18d
Oct 20 21:03:41 leto kernel: [<c01638a6>] filp_close+0x3e/0x62
Oct 20 21:03:41 leto kernel: [<c010395d>] syscall_call+0x7/0xb

I can confirm it's still there in kernel-smp-2.6.13-1.1532_FC4, tested on dual Athlon MP.

*** Bug 173144 has been marked as a duplicate of this bug. ***

Howdy,

Completely repeatable here too with up to date patched FC4 on DELL PE-2550 (ia32) dual processor configuration. Doesn't however occur with previous 2.6.13-1.1532_FC4 smp-kernel or with similar single processor PE-2550 configuration running 2.6.14-1.1637_FC4.

HTH,

:-) riku

# uname -a
Linux rudy.cc.jyu.fi 2.6.14-1.1637_FC4smp #1 SMP Wed Nov 9 18:34:11 EST 2005 i686 i686 i386 GNU/Linux
[root@rudy src]# rpm -q nfs-utils
nfs-utils-1.0.7-12.FC4

Badness in interruptible_sleep_on_timeout at kernel/sched.c:3403 (Not tainted)
[<c031d2b2>] interruptible_sleep_on_timeout+0xf7/0x114
[<c012b213>] group_send_sig_info+0x59/0x63
[<c011c4de>] default_wake_function+0x0/0xc
[<f8c1b4fb>] lockd_down+0xbe/0x120 [lockd]
[<f8ccdcf7>] nfs_kill_super+0x5c/0x5e [nfs]
[<c01699ed>] deactivate_super+0x60/0x71
[<c017eb4e>] sys_umount+0x33/0x73
[<c0107dc6>] do_syscall_trace+0x1e4/0x1f6
[<c017eba5>] sys_oldumount+0x17/0x1b
[<c01039e1>] syscall_call+0x7/0xb
--

*** Bug 173730 has been marked as a duplicate of this bug. ***

Here is that patch that will take care of this badness warning...

--- fs/lockd/svc.c.orig 2005-10-27 20:02:08.000000000 -0400
+++ fs/lockd/svc.c 2005-11-17 16:31:48.111289000 -0500
@@ -305,7 +305,7 @@ lockd_down(void)
 	 * the lockd semaphore, we can't wait around forever ...
 	 */
 	clear_thread_flag(TIF_SIGPENDING);
-	interruptible_sleep_on_timeout(&lockd_exit, HZ);
+	wait_event_timeout(lockd_exit, nlmsvc_pid == 0, HZ);
 	if (nlmsvc_pid) {
 		printk(KERN_WARNING
 			"lockd_down: lockd failed to exit, clearing pid\n");

merged in cvs. built for rawhide, should go out in the first post-test1 push (also available from http://people.redhat.com/davej/kernels/Fedora/devel/)

Will there be an errata for fc4 fixing this?

is already fixed in the current fc4 errata released earlier this week.
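
For readers wondering why the one-line change in the patch matters: interruptible_sleep_on_timeout() sleeps without checking any condition, so a wake-up that arrives just before the sleep starts is lost and the caller waits out the full timeout. wait_event_timeout() re-checks its condition (here nlmsvc_pid == 0) around each sleep. The following is only a user-space sketch of that "wait on a predicate with a timeout" pattern using POSIX threads, not kernel code. The struct exit_gate type and the exit_gate_* functions are hypothetical names invented for illustration; the pid field merely stands in for nlmsvc_pid.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical user-space stand-in for lockd_exit plus nlmsvc_pid. */
struct exit_gate {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             pid;    /* nonzero while the worker is still running */
};

void exit_gate_init(struct exit_gate *g, int pid)
{
    assert(g != NULL);
    pthread_mutex_init(&g->lock, NULL);
    pthread_cond_init(&g->cond, NULL);
    g->pid = pid;
}

/* Worker side: record that we exited, then wake any waiter. */
void exit_gate_signal(struct exit_gate *g)
{
    pthread_mutex_lock(&g->lock);
    g->pid = 0;
    pthread_cond_broadcast(&g->cond);
    pthread_mutex_unlock(&g->lock);
}

/*
 * Waiter side: returns true if pid became 0 within timeout_ms.
 * The predicate is tested before sleeping and after every wake-up,
 * so a signal sent before this call is not missed.
 */
bool exit_gate_wait(struct exit_gate *g, long timeout_ms)
{
    struct timespec deadline;
    int rc = 0;
    bool exited;

    /* Condition variables use CLOCK_REALTIME by default. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec  += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&g->lock);
    while (g->pid != 0 && rc != ETIMEDOUT)
        rc = pthread_cond_timedwait(&g->cond, &g->lock, &deadline);
    exited = (g->pid == 0);
    pthread_mutex_unlock(&g->lock);
    return exited;
}
```

In a real program the worker thread would call exit_gate_signal() as its last step, and the thread shutting it down would call exit_gate_wait() and treat a false return the way lockd_down() treats a still-set nlmsvc_pid: log a warning and carry on.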