Bug 141757 (IT_56112)

Summary: Infinite loop when syncing over automounted NFS
Product: Red Hat Enterprise Linux 3
Reporter: Bastien Nocera <bnocera>
Component: kernel
Assignee: Steve Dickson <steved>
Status: CLOSED ERRATA
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: 3.0
CC: hgarcia, hooft, kanderso, peterm, petrides, riel, sct, tao
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2005-05-18 13:28:48 UTC
Bug Blocks: 132991
Attachments:
Description      Flags
script.pl        none
altsysrq.txt     none
Proposed Patch   none

Description Bastien Nocera 2004-12-03 16:19:47 UTC
kernel 2.4.21-25.EL

Setup:
- 1 NFS server
- 2 NFS clients that automount a directory in rw from the NFS server

Reproducing the problem:
a) cd to the auto mountpoint on client number 1
b) cd to the auto mountpoint on client number 2
c) execute script.pl on client number 1
d) on client number 2, delete the file being generated by client number 1
e) client number 1 now starts complaining about ESTALE, which is
still acceptable. By contrast, if the file is deleted from client
number 1 itself, it is renamed to ".nfsXXXXXXX" instead: no error is
returned, writes succeed, and there is no hang.
f) on client number 1, run the sync command

Result:
Client number 1 spins in wait_on_locked() (it is inlined, so
only its caller, sync_inodes_sb(), shows up)

Comment 1 Bastien Nocera 2004-12-03 16:22:12 UTC
The modified wait_on_locked() and sync_inodes_sb():

static void wait_on_locked(struct list_head *head)
{
       struct list_head * tmp;
       printk("wait_on_locked begins\n");
       while ((tmp = head->prev) != head) {
               struct inode *inode = list_entry(tmp, struct inode, i_list);
               printk("i_ino: %lu\n", inode->i_ino);
               printk("struct inode address: %p\n", (void *)inode);
               printk("head address: %p\n", (void *)head);
               printk("tmp address: %p\n", (void *)tmp);
               printk("head->prev address: %p\n", (void *)(head->prev));
               __iget(inode);
               spin_unlock(&inode_lock);
               __wait_on_inode(inode);
               iput(inode);
               spin_lock(&inode_lock);
       }
       printk("wait_on_locked ends\n");
}

The above function is originally inline in the kernel source; it is
called by:

void sync_inodes_sb(struct super_block *sb)
{
       spin_lock(&inode_lock);
       while (!list_empty(&sb->s_dirty) || !list_empty(&sb->s_locked_inodes)) {
               sync_list(&sb->s_dirty);
               wait_on_locked(&sb->s_locked_inodes);
       }
       spin_unlock(&inode_lock);
}

The output when reproducing the problem:
head address: ce03f06c
tmp address: cd6a1788
head->prev address: cd6a1788
i_ino: 213089
struct inode address: cd6a1780
head address: ce03f06c
tmp address: cd6a1788
head->prev address: cd6a1788
i_ino: 213089
struct inode address: cd6a1780
head address: ce03f06c
tmp address: cd6a1788
head->prev address: cd6a1788
i_ino: 213089
struct inode address: cd6a1780
head address: ce03f06c
tmp address: cd6a1788
head->prev address: cd6a1788
i_ino: 213089
struct inode address: cd6a1780
head address: ce03f06c
tmp address: cd6a1788
head->prev address: cd6a1788
etc.
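
The repeating output above (same head, same tmp, same inode on every pass) can be illustrated with a small user-space model. This is only a sketch, not kernel code: list_head is reimplemented locally, and model_inode, model_wait(), model_wait_on_locked() and demo() are hypothetical names. It assumes the wait returns while the inode is still linked on the locked list, and it caps the loop with max_iter so the demonstration terminates.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal user-space stand-in for the kernel's circular list_head. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h; h->prev = h; }

static void list_add(struct list_head *n, struct list_head *h)
{
    n->next = h->next;
    n->prev = h;
    h->next->prev = n;
    h->next = n;
}

static void list_del(struct list_head *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
    e->next = e;
    e->prev = e;
}

/* Hypothetical model inode: 'stuck' means it never leaves the list. */
struct model_inode {
    struct list_head i_list;
    int stuck;
};

/* Stand-in for waiting on the inode: a healthy inode is unlinked once
 * its I/O completes, a stuck one stays linked (assumption of the model). */
static void model_wait(struct model_inode *inode)
{
    if (!inode->stuck)
        list_del(&inode->i_list);
}

/* Same loop shape as wait_on_locked(), capped at max_iter passes.
 * Returns the number of passes made. */
static int model_wait_on_locked(struct list_head *head, int max_iter)
{
    struct list_head *tmp;
    int n = 0;

    assert(head != NULL);
    while ((tmp = head->prev) != head && n < max_iter) {
        struct model_inode *inode = (struct model_inode *)
            ((char *)tmp - offsetof(struct model_inode, i_list));
        model_wait(inode);
        n++;
    }
    return n;
}

/* Put one inode on a fresh list and count the passes the loop makes. */
static int demo(int stuck, int max_iter)
{
    struct list_head head;
    struct model_inode inode = { .stuck = stuck };

    list_init(&head);
    list_add(&inode.i_list, &head);
    return model_wait_on_locked(&head, max_iter);
}
```

With a healthy inode, demo(0, 1000) makes a single pass. With a stuck one, demo(1, 1000) uses up the whole cap, which corresponds to head->prev never changing in the log.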

Comment 2 Bastien Nocera 2004-12-03 16:23:21 UTC
Created attachment 107839 [details]
script.pl

Reproducer script

Comment 3 Bastien Nocera 2004-12-03 16:24:15 UTC
Created attachment 107840 [details]
altsysrq.txt

Alt+SysRq+T of the "hang"

Comment 5 Bastien Nocera 2004-12-06 10:10:44 UTC
It's not an autofs loopback mount; the NFS mounts are automounted on the clients
from the NFS server.

Comment 6 Jeff Moyer 2004-12-06 13:54:14 UTC
It seems to me that this can be reproduced without the automounter, yes?  

Comment 7 Bastien Nocera 2004-12-06 14:05:26 UTC
Automount only seems to trigger the hang more quickly; with a normal
NFS mount, the problem still happens after a couple of sync() calls.

Comment 8 Bastien Nocera 2004-12-06 16:09:00 UTC
The NFS options triggering the bug were:
acregmin=1,acregmax=1

Any low value of acregmax would trigger the hang. Steve, please let
me know if this is normal. If it is, the bug can be closed.

Comment 10 Bastien Nocera 2004-12-08 09:53:26 UTC
After more thorough testing, it looks like the default values for
acregmin and acregmax don't fix the issue.

Comment 12 Steve Dickson 2004-12-08 11:29:10 UTC
Ok... Would it be possible to get the raw output of the dump?
Meaning either "tethereal -w /tmp/ethdump.pcap" or
"tcpdump -w /tmp/tcpdump.pcap". Having the raw data makes it
easier to sort out the noise.... Also, you might want to bzip2
any dumps since it makes them easier to download...



Comment 15 Steve Dickson 2004-12-14 19:28:16 UTC
Created attachment 108552 [details]
Proposed Patch

The sync process loops in wait_on_locked(), when called from
sync_inodes_sb(), since the "broken" inode cannot be cleared
from the locked inode list.

This patch sets the NFS_INO_STALE bit in the write path (via
nfs_writeback_done), which marks the inode as broken early enough to
stop it from being added to that list.
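
The attached patch itself is not reproduced in this report, so what follows is only a user-space sketch of the idea described above, not the actual change. All names are hypothetical: MODEL_INO_STALE stands in for NFS_INO_STALE, model_writeback_done() for the write-completion hook, and model_queue_for_sync() for the step that would put the inode on the locked list. The sketch assumes a write that fails with ESTALE flags the inode, and that a flagged inode is then skipped instead of being queued.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_INO_STALE 0x1UL   /* hypothetical stand-in for NFS_INO_STALE */

/* Hypothetical model inode: just the flag word and a list marker. */
struct stale_model_inode {
    unsigned long flags;
    int on_locked_list;
};

/* Write-completion hook of the model: a stale file handle marks the inode. */
static void model_writeback_done(struct stale_model_inode *inode, int status)
{
    assert(inode != NULL);
    if (status == -ESTALE)
        inode->flags |= MODEL_INO_STALE;
}

/* Queue the inode for sync unless it is already marked stale.
 * Returns 1 if it was queued, 0 if it was skipped. */
static int model_queue_for_sync(struct stale_model_inode *inode)
{
    assert(inode != NULL);
    if (inode->flags & MODEL_INO_STALE)
        return 0;
    inode->on_locked_list = 1;
    return 1;
}

/* One write completion followed by one sync attempt. */
static int model_sync_after_write(int write_status)
{
    struct stale_model_inode inode = { 0, 0 };

    model_writeback_done(&inode, write_status);
    return model_queue_for_sync(&inode);
}
```

In this model, model_sync_after_write(0) queues the inode as usual, while model_sync_after_write(-ESTALE) skips it, so a loop like wait_on_locked() would never see the broken inode.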

Comment 20 Ernie Petrides 2005-01-11 23:43:37 UTC
A fix for this problem has just been committed to the RHEL3 U5
patch pool this evening (in kernel version 2.4.21-27.7.EL).


Comment 21 Tim Powers 2005-05-18 13:28:48 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-294.html