Bug 118413

Summary: (FS AUTOFS4) badness in interruptible sleep
Product: [Fedora] Fedora
Reporter: Thomas J. Baker <tjb>
Component: kernel
Assignee: Jeff Moyer <jmoyer>
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: medium
Version: 2
CC: alan, anvil, bfox, d+redhat, mb/redhat, ndbecker2, notting, oliva, orion, patc, strecken
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Fixed In Version: 2.6.6-1.391
Doc Type: Bug Fix
Last Closed: 2006-04-05 14:07:38 UTC

Description Thomas J. Baker 2004-03-16 15:24:33 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6)
Gecko/20040217 Galeon/1.3.13

Description of problem:

blackstar is writing to the local machine: I've got it doing an rpmbuild of
mozilla, and my home directory is mounted on blackstar from the local
machine. I noticed it was taking forever just to extract the tarball, so I
looked at dmesg on the local machine and saw this:

nfs: server blackstar not responding, timed out
nfs: server blackstar.sr.unh.edu not responding, timed out
nfs: server blackstar not responding, timed out
nfs: server blackstar.sr.unh.edu not responding, timed out
Badness in interruptible_sleep_on at kernel/sched.c:1924
Call Trace:
 [<02124031>] interruptible_sleep_on+0x5a/0x14a
 [<021ae4f2>] avc_has_perm+0x3f/0x49
 [<02123c08>] default_wake_function+0x0/0xc
 [<021ae4f2>] avc_has_perm+0x3f/0x49
 [<42a88a97>] autofs4_wait+0x252/0x301 [autofs4]
 [<021af868>] inode_has_perm+0x57/0x5f
 [<42a87562>] try_to_fill_dentry+0x21/0x17a [autofs4]
 [<42a87801>] autofs4_root_revalidate+0x146/0x18f [autofs4]
 [<0216ec65>] do_lookup+0x54/0x72
 [<0216f46b>] link_path_walk+0x7e8/0x9b6
 [<0216f96e>] path_lookup+0x165/0x195
 [<0216faaa>] __user_walk+0x21/0x51
 [<0217dc00>] sys_umount+0x14/0x64
 [<021610ab>] filp_close+0x59/0x5f
 [<02161151>] sys_close+0xa0/0xd3
 
nfs: server blackstar not responding, timed out
nfs: server blackstar.sr.unh.edu not responding, timed out
nfs: server blackstar not responding, timed out
nfs: server blackstar.sr.unh.edu not responding, timed out
Badness in interruptible_sleep_on at kernel/sched.c:1924
Call Trace:
 [<02124031>] interruptible_sleep_on+0x5a/0x14a
 [<02123c08>] default_wake_function+0x0/0xc
 [<42a88a97>] autofs4_wait+0x252/0x301 [autofs4]
 [<021af868>] inode_has_perm+0x57/0x5f
 [<42a87637>] try_to_fill_dentry+0xf6/0x17a [autofs4]
 [<42a87801>] autofs4_root_revalidate+0x146/0x18f [autofs4]
 [<0216ec65>] do_lookup+0x54/0x72
 [<0216f086>] link_path_walk+0x403/0x9b6
 [<0216f96e>] path_lookup+0x165/0x195
 [<0216faaa>] __user_walk+0x21/0x51
 [<0216acc6>] vfs_lstat+0x11/0x37
 [<0215cace>] free_pages_and_swap_cache+0x5d/0x70
 [<0216b224>] sys_lstat64+0xf/0x23
 [<02155129>] unmap_vma_list+0xe/0x17
 [<021555d7>] do_munmap+0x17e/0x18a
 [<02120203>] do_page_fault+0x0/0x4b5
 
icmpv6_send: addr_any/mcast source
icmpv6_send: addr_any/mcast source

On blackstar at the time, this is what I saw:

blackstar> rpmbuild -ba mozilla.spec
Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.84616
+ umask 022
+ cd /net/home/rcc/tjb/rpm/BUILD
+ LANG=C
+ export LANG
+ unset DISPLAY
+ cd /net/home/rcc/tjb/rpm/BUILD
+ rm -rf mozilla
+ /usr/bin/bzip2 -dc /net/home/rcc/tjb/rpm/SOURCES/mozilla-source-1.4.tar.bz2
+ tar -xf -
tar: mozilla/layout/base/public/nsICanvasFrame.h: Cannot close: Input/output error
tar: mozilla/layout/html/tests/table/marvin/x_caption_align_top.xml: Cannot close: Input/output error
tar: mozilla/mailnews/base/src/nsMsgAccountManagerDS.cpp: Cannot close: Input/output error
 
It's still going but this is probably enough information to get this
bug rolling.

Version-Release number of selected component (if applicable):
kernel-smp-2.6.3-2.1.253

How reproducible:
Didn't try


Additional info:

Comment 1 Arjan van de Ven 2004-03-20 09:44:22 UTC
Steve: this is one of those broken sleep_on uses in NFS we talked
about before.....

Comment 2 Jeff Moyer 2004-05-06 13:10:29 UTC
*** Bug 118589 has been marked as a duplicate of this bug. ***

Comment 3 Jeff Moyer 2004-05-06 13:10:50 UTC
*** Bug 122509 has been marked as a duplicate of this bug. ***

Comment 4 Jeff Moyer 2004-05-06 13:15:04 UTC
Actually, this is a broken interruptible_sleep_on in the autofs code.
Ian Kent's patch set fixes this.  It's in the -mm tree, so hopefully
it will get merged into 2.6 in short order.
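
For readers unfamiliar with why the sleep_on family is considered broken: the caller checks a condition and then goes to sleep in two separate steps, so a wake-up that lands between the two is lost and the sleeper can hang. The sketch below is a user-space analogy in Python, not the autofs4 code and not Ian Kent's patch (this report does not say which pattern the patch uses). The names `PendingRequest`, `wait`, `release` and `demo` are hypothetical. It shows the usual remedy, which is to test the condition under a lock before sleeping and again after every wake-up, the same idea as the kernel's wait_event helpers.

```python
import threading


class PendingRequest:
    """Hypothetical stand-in for one outstanding mount request."""

    def __init__(self):
        self._cond = threading.Condition()
        self._done = False
        self.status = None

    def wait(self, timeout=None):
        """Block until release() has run; return its status, or None on timeout."""
        with self._cond:
            # wait_for() tests the predicate while holding the lock, before
            # sleeping and after every wake-up, so a release() that already
            # happened is seen immediately instead of being missed.
            if self._cond.wait_for(lambda: self._done, timeout=timeout):
                return self.status
            return None

    def release(self, status):
        """Record the result and wake every waiter."""
        with self._cond:
            self._done = True
            self.status = status
            self._cond.notify_all()


def demo():
    # Case 1: the wake-up arrives before anyone sleeps. A bare
    # "sleep until notified" would miss it; the predicate check does not.
    early = PendingRequest()
    early.release(0)
    first = early.wait(timeout=1.0)

    # Case 2: a sleeper thread and a release from the main thread, in
    # whichever order the scheduler picks. Both orders give the same result.
    late = PendingRequest()
    results = []
    sleeper = threading.Thread(
        target=lambda: results.append(late.wait(timeout=5.0))
    )
    sleeper.start()
    late.release(-5)
    sleeper.join()
    return first, results[0]


if __name__ == "__main__":
    print(demo())
```

The status values 0 and -5 are arbitrary placeholders chosen for the demonstration; they are not taken from the traces in this bug.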

Comment 5 Jeff Moyer 2004-05-19 11:19:50 UTC
*** Bug 123550 has been marked as a duplicate of this bug. ***

Comment 6 Jeff Moyer 2004-05-21 23:00:02 UTC
*** Bug 123923 has been marked as a duplicate of this bug. ***

Comment 7 Jeff Moyer 2004-05-24 14:43:42 UTC
*** Bug 124014 has been marked as a duplicate of this bug. ***

Comment 8 Jeff Moyer 2004-05-24 14:44:40 UTC
Additional Comment #1 From Arjan van de Ven (arjanv) on 2004-05-23 02:45 -------

you might want to try the update candidate kernel from
http://people.redhat.com/arjanv/2.6/

Comment 9 Jeff Moyer 2004-05-27 17:08:08 UTC
*** Bug 124573 has been marked as a duplicate of this bug. ***

Comment 10 Orion Poplawski 2004-05-27 20:57:31 UTC
Appears to be fixed with kernel-smp-2.6.6-1.391

Comment 11 Alexandre Oliva 2004-05-27 22:09:18 UTC
Seems to be fixed in the 1.383 FC2 update in testing as well.

Comment 12 Jeff Moyer 2004-06-01 13:06:09 UTC
*** Bug 124119 has been marked as a duplicate of this bug. ***

Comment 13 Jeff Moyer 2004-06-01 17:19:42 UTC
*** Bug 124929 has been marked as a duplicate of this bug. ***

Comment 14 André Johansen 2004-06-04 08:50:34 UTC
I've seen a similar bug as well.  P4 HT on FC2 SMP kernel: 
 
Jun  2 15:40:51 anduin kernel: Badness in interruptible_sleep_on at kernel/sched.c:1927
Jun  2 15:40:51 anduin kernel: Call Trace:
Jun  2 15:40:51 anduin kernel:  [<0229fe72>] interruptible_sleep_on+0x5a/0xc6
Jun  2 15:40:51 anduin kernel:  [<0211b419>] default_wake_function+0x0/0xc
Jun  2 15:40:51 anduin kernel:  [<428a594a>] ext3_read_inode+0x2af/0x2c0 [ext3]
Jun  2 15:40:51 anduin kernel:  [<021620f2>] d_instantiate+0x54/0x57
Jun  2 15:40:51 anduin kernel:  [<02150ba8>] __find_get_block+0xb5/0xbe
Jun  2 15:40:51 anduin kernel:  [<4af8e4f9>] autofs4_wait+0x1fe/0x25e [autofs4]
Jun  2 15:40:51 anduin kernel:  [<4af8d5e5>] try_to_fill_dentry+0xa8/0x100 [autofs4]
Jun  2 15:40:51 anduin kernel:  [<02159950>] do_lookup+0x54/0x72
Jun  2 15:40:51 anduin kernel:  [<02159f70>] link_path_walk+0x602/0x7d0
Jun  2 15:40:51 anduin kernel:  [<0215a42b>] path_lookup+0x13f/0x16f
Jun  2 15:40:51 anduin kernel:  [<0215a567>] __user_walk+0x21/0x51
Jun  2 15:40:51 anduin kernel:  [<021561ae>] vfs_lstat+0x11/0x37
Jun  2 15:40:51 anduin kernel:  [<0215670c>] sys_lstat64+0xf/0x23
Jun  2 15:40:51 anduin kernel: 
 

Comment 15 Jeff Moyer 2004-08-03 14:36:44 UTC
*** Bug 128851 has been marked as a duplicate of this bug. ***