Bug 171453 - NFS: Badness interruptible_sleep_on_timeout
Summary: NFS: Badness interruptible_sleep_on_timeout
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 4
Hardware: i386
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Steve Dickson
QA Contact: Brian Brock
Duplicates: 172763
Reported: 2005-10-21 18:28 UTC by Alan Bleasby
Modified: 2007-11-30 22:11 UTC (History)
Doc Type: Bug Fix
Last Closed: 2005-12-10 05:10:14 UTC
Type: ---

Description Alan Bleasby 2005-10-21 18:28:37 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050922 Fedora/1.0.7-1.1.fc4 Firefox/1.0.7

Description of problem:
Error log produces Badness in interruptible_sleep_on_timeout at kernel/sched.c:3297 (Not tainted)

The system is fully updated as of the date of submission of this bug.

The machines involved use autofs to NFS-mount the home directory of each user
who logs on. After the user logs off, the error shown under Actual Results
below appears in the log.

This started happening with kernel-2.6.13-1.1526_FC4; kernel-2.6.12-1.1456_FC4
did not have the problem.

Version-Release number of selected component (if applicable):
kernel-2.6.13-1.1532_FC4

How reproducible:
Always

Steps to Reproduce:
1. Log on to a system that uses autofs to NFS-mount the home directory
2. Log off
3. Wait, then log on again and examine /var/log/messages
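
Step 3 can be scripted rather than done by eye. The following is a minimal Python sketch and not part of the original report: the function name find_badness and the regular expression are assumptions, modelled on the "Badness in ... at file:line" header quoted in this bug.

```python
import re

# Header format taken from the log excerpts in this report, e.g.
# "Badness in interruptible_sleep_on_timeout at kernel/sched.c:3297 (Not tainted)"
BADNESS_RE = re.compile(r"Badness in (\w+) at (\S+):(\d+)")


def find_badness(lines):
    """Return a (function, source_file, line_number) tuple for each warning header."""
    hits = []
    for line in lines:
        match = BADNESS_RE.search(line)
        if match:
            hits.append((match.group(1), match.group(2), int(match.group(3))))
    return hits
```

To run it against the log named in step 3, pass an open file object, for example find_badness(open("/var/log/messages", errors="replace")). Reading that file normally requires root.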
  

Actual Results:
Oct 21 17:57:33 emboss16 kernel: Badness in interruptible_sleep_on_timeout at kernel/sched.c:3297 (Not tainted)
Oct 21 17:57:33 emboss16 kernel:  [<c031746f>] interruptible_sleep_on_timeout+0xf7/0x113
Oct 21 17:57:33 emboss16 kernel:  [<c012b9e1>] group_send_sig_info+0x59/0x63
Oct 21 17:57:33 emboss16 kernel:  [<c011d046>] default_wake_function+0x0/0xc
Oct 21 17:57:33 emboss16 kernel:  [<f8a4f48b>] lockd_down+0xbe/0x120 [lockd]
Oct 21 17:57:33 emboss16 kernel:  [<f8b0fcbe>] nfs_kill_super+0x5e/0x62 [nfs]
Oct 21 17:57:33 emboss16 kernel:  [<c0169fd8>] deactivate_super+0x5d/0x6e
Oct 21 17:57:33 emboss16 kernel:  [<c017ef34>] sys_umount+0x33/0x73
Oct 21 17:57:33 emboss16 kernel:  [<c017bf5b>] destroy_inode+0x3f/0x4e
Oct 21 17:57:33 emboss16 kernel:  [<c0108055>] do_syscall_trace+0xef/0x123
Oct 21 17:57:33 emboss16 kernel:  [<c017ef8b>] sys_oldumount+0x17/0x1b
Oct 21 17:57:33 emboss16 kernel:  [<c010395d>] syscall_call+0x7/0xb
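
The trace above shows the warning being reached through sys_umount, deactivate_super, nfs_kill_super and lockd_down. To pull out which frames belong to loadable modules, a small parser helps. This is a sketch only, not a general-purpose oops decoder: the names parse_frames and module_frames are hypothetical, and the regular expression assumes the i386 frame layout seen in this report ([<address>] symbol+0xoffset/0xsize [module]).

```python
import re

# i386 frame layout as quoted in this report:
#   [<f8a4f48b>] lockd_down+0xbe/0x120 [lockd]
FRAME_RE = re.compile(
    r"\[<([0-9a-f]+)>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)(?:\s+\[(\w+)\])?"
)


def parse_frames(lines):
    """Parse trace lines into dicts; lines that are not frames are skipped."""
    frames = []
    for line in lines:
        match = FRAME_RE.search(line)
        if match:
            address, symbol, offset, size, module = match.groups()
            frames.append({
                "address": address,
                "symbol": symbol,
                "offset": int(offset, 16),
                "size": int(size, 16),
                "module": module,  # None for symbols built into the kernel
            })
    return frames


def module_frames(frames):
    """Return (module, symbol) pairs for frames that live in a loadable module."""
    return [(f["module"], f["symbol"]) for f in frames if f["module"]]
```

Applied to the trace above, module_frames picks out the lockd and nfs entries, which is what points this report at the NFS unmount path.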


Expected Results:  No such error in /var/log/messages

Additional info:

Comment 1 Moritz Baumann 2005-10-27 08:31:14 UTC
output from dmesg:

Bluetooth: RFCOMM TTY layer initialized
device eth0 entered promiscuous mode
parport: PnPBIOS parport detected.
parport0: PC-style at 0x378 (0x778), irq 7 [PCSPP,TRISTATE,EPP]
lp0: using parport0 (interrupt-driven).
lp0: console ready
Badness in interruptible_sleep_on_timeout at kernel/sched.c:3297 (Tainted: P  )
 [<c031746f>] interruptible_sleep_on_timeout+0xf7/0x113
 [<c012b9e1>] group_send_sig_info+0x59/0x63
 [<c011d046>] default_wake_function+0x0/0xc
 [<f8ea648b>] lockd_down+0xbe/0x120 [lockd]
 [<f924bcbe>] nfs_kill_super+0x5e/0x62 [nfs]
 [<c0169fd8>] deactivate_super+0x5d/0x6e
 [<c017ef34>] sys_umount+0x33/0x73
 [<c017bf5b>] destroy_inode+0x3f/0x4e
 [<c0108055>] do_syscall_trace+0xef/0x123
 [<c017ef8b>] sys_oldumount+0x17/0x1b
 [<c010395d>] syscall_call+0x7/0xb
application mixer_applet2 uses obsolete OSS audio interface

automount is running, but no filesystem has been mounted so far (home directories are local).


Comment 2 Guy Streeter 2005-11-07 19:13:59 UTC
I see this traceback in my logs a lot with kernel-2.6.13-1.1532_FC4smp.

Comment 3 Dave Jones 2005-11-07 20:13:23 UTC
Does it still happen with the 2.6.14 kernel in updates-testing?


Comment 4 Alan Bleasby 2005-11-07 22:23:15 UTC
Sadly, yes.

Nov  7 22:22:21 emboss16 kernel: Badness in interruptible_sleep_on_timeout at kernel/sched.c:3403 (Not tainted)
Nov  7 22:22:21 emboss16 kernel:  [<c031d1c2>] interruptible_sleep_on_timeout+0xf7/0x114
Nov  7 22:22:21 emboss16 kernel:  [<c012b05c>] group_send_sig_info+0x59/0x63
Nov  7 22:22:21 emboss16 kernel:  [<c011c4de>] default_wake_function+0x0/0xc
Nov  7 22:22:21 emboss16 kernel:  [<f8a524fb>] lockd_down+0xbe/0x120 [lockd]
Nov  7 22:22:21 emboss16 kernel:  [<f8b13cf7>] nfs_kill_super+0x5c/0x5e [nfs]
Nov  7 22:22:21 emboss16 kernel:  [<c016983d>] deactivate_super+0x60/0x71
Nov  7 22:22:21 emboss16 kernel:  [<c017e97e>] sys_umount+0x33/0x73
Nov  7 22:22:21 emboss16 kernel:  [<c0107dc6>] do_syscall_trace+0x1e4/0x1f6
Nov  7 22:22:21 emboss16 kernel:  [<c017e9d5>] sys_oldumount+0x17/0x1b
Nov  7 22:22:21 emboss16 kernel:  [<c01039e1>] syscall_call+0x7/0xb


Comment 5 Dave Jones 2005-11-09 20:04:51 UTC
*** Bug 172763 has been marked as a duplicate of this bug. ***

Comment 6 Erik A. Espinoza 2005-11-09 22:34:39 UTC
I can confirm that this bug happens on x86_64 and with the latest kernel in
testing, 2.6.14-1.1633_FC4smp.

Comment 7 Dave Jones 2005-11-10 20:18:02 UTC
2.6.14-1.1637_FC4 has been released as an update for FC4.
Please retest with this update, as a large amount of code has been changed in
this release, which may have fixed your problem.

Thank you.


Comment 8 Alan Bleasby 2005-11-10 21:52:26 UTC
Thanks for the suggestion: 2.6.14-1.1637_FC4 has the same fault.

Nov 10 21:44:38 emboss16 kernel: Badness in interruptible_sleep_on_timeout at kernel/sched.c:3403 (Not tainted)
Nov 10 21:44:38 emboss16 kernel:  [<c031d2b2>] interruptible_sleep_on_timeout+0xf7/0x114
Nov 10 21:44:38 emboss16 kernel:  [<c012b213>] group_send_sig_info+0x59/0x63
Nov 10 21:44:38 emboss16 kernel:  [<c011c4de>] default_wake_function+0x0/0xc
Nov 10 21:44:38 emboss16 kernel:  [<f8a524fb>] lockd_down+0xbe/0x120 [lockd]
Nov 10 21:44:38 emboss16 kernel:  [<f8b13cf7>] nfs_kill_super+0x5c/0x5e [nfs]
Nov 10 21:44:38 emboss16 kernel:  [<c01699ed>] deactivate_super+0x60/0x71
Nov 10 21:44:38 emboss16 kernel:  [<c017eb4e>] sys_umount+0x33/0x73
Nov 10 21:44:38 emboss16 kernel:  [<c0107dc6>] do_syscall_trace+0x1e4/0x1f6
Nov 10 21:44:38 emboss16 kernel:  [<c017eba5>] sys_oldumount+0x17/0x1b
Nov 10 21:44:38 emboss16 kernel:  [<c01039e1>] syscall_call+0x7/0xb


Comment 9 JM 2005-11-16 15:15:02 UTC
I get the same messages with kernel 2.6.14-1.1637_FC4smp on a dual AMD Opteron
system.

Badness in interruptible_sleep_on_timeout at kernel/sched.c:3403 (Tainted: P     )

Call Trace:<ffffffff80341f10>{interruptible_sleep_on_timeout+131}
       <ffffffff80130981>{default_wake_function+0} <ffffffff885d6377>{:lockd:lockd_down+207}
       <ffffffff885ec845>{:nfs:nfs_kill_super+81} <ffffffff80186979>{deactivate_super+95}
       <ffffffff8019cdb5>{sys_umount+735} <ffffffff801106ba>{syscall_trace_enter+217}
       <ffffffff801106f7>{syscall_trace_leave+55} <ffffffff8010da80>{tracesys+113}
       <ffffffff8010dae0>{tracesys+209}


Comment 10 JM 2005-11-16 15:17:41 UTC
I forgot, it's the 2.6.14-1.1637_FC4smp kernel for x86_64.

Comment 11 Matthew West 2005-11-20 15:41:41 UTC
I'd just like to chime in with a "me too". Kernel 2.6.14-1.1637_FC4smp on a P4 with HT. It happens
intermittently on several machines here, always between 60 and 90 seconds after cron.hourly has run
(which accesses NFS mounts, amongst other things).

Nov 19 12:02:31 aspect kernel: Badness in interruptible_sleep_on_timeout at kernel/sched.c:3403 (Not tainted)
Nov 19 12:02:31 aspect kernel:  [<c031d2b2>] interruptible_sleep_on_timeout+0xf7/0x114
Nov 19 12:02:31 aspect kernel:  [<c012b213>] group_send_sig_info+0x59/0x63
Nov 19 12:02:31 aspect kernel:  [<c011c4de>] default_wake_function+0x0/0xc
Nov 19 12:02:31 aspect kernel:  [<f8bbf4fb>] lockd_down+0xbe/0x120 [lockd]
Nov 19 12:02:31 aspect kernel:  [<f8c12cf7>] nfs_kill_super+0x5c/0x5e [nfs]
Nov 19 12:02:31 aspect kernel:  [<c01699ed>] deactivate_super+0x60/0x71
Nov 19 12:02:31 aspect kernel:  [<c017eb4e>] sys_umount+0x33/0x73
Nov 19 12:02:31 aspect kernel:  [<c0107dc6>] do_syscall_trace+0x1e4/0x1f6
Nov 19 12:02:31 aspect kernel:  [<c017eba5>] sys_oldumount+0x17/0x1b
Nov 19 12:02:31 aspect kernel:  [<c01039e1>] syscall_call+0x7/0xb


Comment 12 Alfons Zitterbacke 2005-11-30 17:26:38 UTC
I've got a P4 with HT, and it always happens on shutdown.

kernel: Badness in interruptible_sleep_on_timeout at kernel/sched.c:3403 (Tainted: P     )
kernel:
kernel: Call Trace:<ffffffff80341f10>{interruptible_sleep_on_timeout+131}
kernel:        <ffffffff80130981>{default_wake_function+0} <ffffffff884a3377>{:lockd:lockd_down+207}
kernel:        <ffffffff884b9845>{:nfs:nfs_kill_super+81} <ffffffff80186979>{deactivate_super+95}
kernel:        <ffffffff8019cdb5>{sys_umount+735} <ffffffff801106ba>{syscall_trace_enter+217}
kernel:        <ffffffff801106f7>{syscall_trace_leave+55} <ffffffff8010da80>{tracesys+113}
kernel:        <ffffffff8010dae0>{tracesys+209}

But I don't use autofs to mount the home directory; it is just a filesystem
listed in /etc/fstab.

Comment 13 Alfons Zitterbacke 2005-11-30 17:28:31 UTC
I forgot, it's the 2.6.14-1.1637_FC4smp kernel for x86_64.

Comment 14 Dave Jones 2005-11-30 18:34:53 UTC
Should be fixed in 1644, which was released a few days ago.
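
For anyone checking several machines, the build number can be read out of the kernel release string. This is a hedged sketch: it assumes "1644" refers to the build field of a Fedora release string of the form 2.6.14-1.1637_FC4smp (the full version string of the fixed kernel is not given in this thread), and the names build_number and at_least_build are hypothetical.

```python
import re

# Release strings quoted in this bug look like "2.6.14-1.1637_FC4smp"
# or "kernel-2.6.13-1.1532_FC4"; the second number after the dash is the build.
RELEASE_RE = re.compile(r"(\d+\.\d+\.\d+)-(\d+)\.(\d+)_FC(\d+)")


def build_number(release):
    """Return the build number from a Fedora kernel release string, or None."""
    match = RELEASE_RE.search(release)
    return int(match.group(3)) if match else None


def at_least_build(release, fixed_build=1644):
    """True if the release string carries a build number >= fixed_build.

    Assumption: 1644 is the build mentioned in the comment above. The check
    says nothing about kernels that predate the regression.
    """
    number = build_number(release)
    return number is not None and number >= fixed_build
```

On a running system the release string can be obtained with platform.release() from the Python standard library, which returns the same value as uname -r.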


Comment 15 Alan Bleasby 2005-11-30 19:07:11 UTC
I'm afraid I cannot tell if that is the case. The machine did not reboot
yesterday after 1644 was installed. The machine involved is in a machine room
to which I may not be able to negotiate access. Perhaps one of the other people
who've replied will be able to tell you whether it worked.


Comment 16 Alan Bleasby 2005-12-08 14:12:46 UTC
Just been able to gain access to the affected machine. The new kernel has
indeed cured the problem. Thanks.


Comment 17 Matthew West 2005-12-08 18:55:48 UTC
The new kernel has fixed the problem for me. None of my machines have logged an
interruptible_sleep_on_timeout warning since the update. Thanks!

