Bug 141757 - (IT_56112) Infinite loop when syncing over automounted NFS
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: All Linux
Priority: medium Severity: medium
Assigned To: Steve Dickson
QA Contact: Brian Brock
Depends On:
Blocks: 132991
Reported: 2004-12-03 11:19 EST by Bastien Nocera
Modified: 2007-11-30 17:07 EST

Doc Type: Bug Fix
Last Closed: 2005-05-18 09:28:48 EDT


Attachments
script.pl (232 bytes, text/plain)
2004-12-03 11:23 EST, Bastien Nocera
altsysrq.txt (21.72 KB, text/plain)
2004-12-03 11:24 EST, Bastien Nocera
Proposed Patch (807 bytes, patch)
2004-12-14 14:28 EST, Steve Dickson

Description Bastien Nocera 2004-12-03 11:19:47 EST
kernel 2.4.21-25.EL

Setup:
- 1 NFS server
- 2 NFS clients that automount a directory in rw from the NFS server

Reproducing the problem:
a) cd to the auto mountpoint on client number 1
b) cd to the auto mountpoint on client number 2
c) execute script.pl on client number 1
d) on client number 2, delete the file being generated by client number 1
e) client number 1 now starts complaining about ESTALE, which is
still acceptable. (By contrast, if the file is deleted from client
number 1 itself, it is renamed to ".nfsXXXXXXX" instead: no error is
returned, writes succeed, and there is no hang.)
f) on client number 1, run the sync command

Result:
Client number 1 spins inside wait_on_locked() (it is inlined, so
only its caller, sync_inodes_sb(), shows up)
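
For readers without access to the attachment, here is a rough C stand-in for the writer side of step c). The contents of script.pl are not reproduced in this report, so this is only a sketch that assumes the script keeps appending to a single file on the mount point. The function name append_lines(), its parameters, and the line written are all hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for the reproducer script (NOT the attached
 * script.pl): append `count` lines to `path`, sleeping `delay_s`
 * seconds between writes so that another client has time to delete
 * the file.
 *
 * Returns the number of lines written, or the negated errno of the
 * first failing open/write/close (for example -ESTALE).
 */
int append_lines(const char *path, int count, unsigned int delay_s)
{
    static const char line[] = "test line\n";
    int fd, i;

    assert(path != NULL);

    fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0)
        return -errno;

    for (i = 0; i < count; i++) {
        /* A short write is treated as success; only errors abort. */
        if (write(fd, line, sizeof(line) - 1) < 0) {
            int err = errno;
            close(fd);
            return -err;
        }
        if (delay_s)
            sleep(delay_s);
    }

    /* On NFS, write errors may only be reported at close time. */
    if (close(fd) < 0)
        return -errno;
    return count;
}
```

Calling append_lines() with a path under the automounted directory and a delay of a second or so on client number 1, then deleting the file from client number 2, should approximate steps c) and d). Whether it triggers the hang as reliably as script.pl has not been checked.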
Comment 1 Bastien Nocera 2004-12-03 11:22:12 EST
The modified wait_on_locked() and sync_inodes_sb():

static void wait_on_locked(struct list_head *head)
{
       struct list_head * tmp;

       /* the printk() calls are debug instrumentation for this report */
       printk("wait_on_locked begins\n");
       while ((tmp = head->prev) != head) {
               struct inode *inode = list_entry(tmp, struct inode, i_list);

               printk("i_ino: %lu\n", inode->i_ino);
               printk("struct inode address: %p\n", (void *)inode);
               printk("head address: %p\n", (void *)head);
               printk("tmp address: %p\n", (void *)tmp);
               printk("head->prev address: %p\n", (void *)(head->prev));
               __iget(inode);
               spin_unlock(&inode_lock);
               __wait_on_inode(inode);
               iput(inode);
               spin_lock(&inode_lock);
       }
       printk("wait_on_locked ends\n");
}

The above function is originally inline in the kernel source, and it is
called by:

void sync_inodes_sb(struct super_block *sb)
{
       spin_lock(&inode_lock);
       while (!list_empty(&sb->s_dirty) ||
              !list_empty(&sb->s_locked_inodes)) {
               sync_list(&sb->s_dirty);
               wait_on_locked(&sb->s_locked_inodes);
       }
       spin_unlock(&inode_lock);
}

The output when reproducing the problem:
head address: ce03f06c
tmp address: cd6a1788
head->prev address: cd6a1788
i_ino: 213089
struct inode address: cd6a1780
head address: ce03f06c
tmp address: cd6a1788
head->prev address: cd6a1788
i_ino: 213089
struct inode address: cd6a1780
head address: ce03f06c
tmp address: cd6a1788
head->prev address: cd6a1788
i_ino: 213089
struct inode address: cd6a1780
head address: ce03f06c
tmp address: cd6a1788
head->prev address: cd6a1788
i_ino: 213089
struct inode address: cd6a1780
head address: ce03f06c
tmp address: cd6a1788
head->prev address: cd6a1788
etc.
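
The output above shows head->prev pointing at the same inode on every pass. The following user-space sketch models why that makes the loop spin. It is not kernel code: the list helpers only imitate list_head, and struct spin_inode, its write_completes field, and spin_model_passes() are hypothetical names invented for this illustration. The max_passes cap exists only so that the model terminates.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, imitating the kernel's list_head. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h; h->prev = h; }

static void list_add(struct list_head *n, struct list_head *h)
{
    n->next = h->next;
    n->prev = h;
    h->next->prev = n;
    h->next = n;
}

static void list_del(struct list_head *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
    e->next = e;
    e->prev = e;
}

/* Hypothetical stand-in for struct inode. */
struct spin_inode {
    struct list_head i_list;
    int write_completes; /* 1: waiting on the inode unqueues it */
};

/*
 * Model of the wait_on_locked() loop with one inode on the list.
 * Returns how many passes the loop made before the list emptied,
 * or max_passes if the list tail never changed.
 */
int spin_model_passes(int write_completes, int max_passes)
{
    struct list_head head;
    struct spin_inode inode;
    struct list_head *tmp;
    int passes = 0;

    list_init(&head);
    inode.write_completes = write_completes;
    list_add(&inode.i_list, &head);

    while ((tmp = head.prev) != &head && passes < max_passes) {
        struct spin_inode *cur = (struct spin_inode *)
            ((char *)tmp - offsetof(struct spin_inode, i_list));

        assert(cur == &inode);
        passes++;
        /* Stands in for __wait_on_inode(): a healthy inode leaves
         * the list, a "broken" one stays where it is. */
        if (cur->write_completes)
            list_del(&cur->i_list);
    }
    return passes;
}
```

In this model spin_model_passes(1, 1000) returns 1, while spin_model_passes(0, 1000) returns 1000: if nothing ever removes the inode from the list, only the artificial cap ends the loop.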
Comment 2 Bastien Nocera 2004-12-03 11:23:21 EST
Created attachment 107839
script.pl

Reproducer script
Comment 3 Bastien Nocera 2004-12-03 11:24:15 EST
Created attachment 107840
altsysrq.txt

Alt+SysRq+T of the "hang"
Comment 5 Bastien Nocera 2004-12-06 05:10:44 EST
It's not an autofs loopback mount, the NFS mounts are automounted on the clients
from the NFS server.
Comment 6 Jeffrey Moyer 2004-12-06 08:54:14 EST
It seems to me that this can be reproduced without the automounter, yes?  
Comment 7 Bastien Nocera 2004-12-06 09:05:26 EST
automount only seems to trigger the hang more quickly; with a normal
NFS mount, the problem still happens after a couple of sync() calls.
Comment 8 Bastien Nocera 2004-12-06 11:09:00 EST
The NFS options triggering the bug were:
acregmin=1,acregmax=1

Any low value of acregmax triggers the hang. Steve, please let
me know if this is normal. If it is, the bug can be closed.
Comment 10 Bastien Nocera 2004-12-08 04:53:26 EST
After more thorough testing, it looks like the default values for
acregmin and acregmax don't fix the issue either.
Comment 12 Steve Dickson 2004-12-08 06:29:10 EST
OK... Would it be possible to get the raw output of the dump?
Meaning either "tethereal -w /tmp/ethdump.pcap" or
"tcpdump -w /tmp/tcpdump.pcap". Having the raw data makes it
easier to sort out the noise. Also, you might want to bzip2
any dumps, since that makes them easier to download.

Comment 15 Steve Dickson 2004-12-14 14:28:16 EST
Created attachment 108552
Proposed Patch

The sync process loops in wait_on_locked(), when called from
sync_inodes_sb(), because the "broken" inode cannot be cleared
from the locked inode list.

This patch sets the NFS_INO_STALE bit in the write path (via
nfs_writeback_done), which breaks the inode early enough to
stop it from being added to that list.
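
The idea described above can be pictured with a small user-space model. This is not the attached patch, whose contents are not shown in this report; it is a simplified sketch that assumes a stale flag set at write completion is checked before an inode is queued for sync. STALE_MODEL_FLAG, struct stale_inode, and both function names are hypothetical stand-ins.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define STALE_MODEL_FLAG 0x1 /* hypothetical stand-in for NFS_INO_STALE */

/* Hypothetical stand-in for the NFS inode state relevant here. */
struct stale_inode {
    unsigned int flags;
    int on_locked_list; /* 1 once queued on the locked inode list */
};

/*
 * Plays the role the comment assigns to nfs_writeback_done(): when
 * the server reports ESTALE for a write, remember that on the inode.
 */
void model_writeback_done(struct stale_inode *inode, int status)
{
    assert(inode != NULL);
    if (status == -ESTALE)
        inode->flags |= STALE_MODEL_FLAG;
}

/*
 * Assumed queueing step: an inode already marked stale is never
 * added to the locked list, so sync has nothing to wait on.
 * Returns 1 if the inode was queued, 0 if it was skipped.
 */
int model_queue_for_sync(struct stale_inode *inode)
{
    assert(inode != NULL);
    if (inode->flags & STALE_MODEL_FLAG)
        return 0;
    inode->on_locked_list = 1;
    return 1;
}
```

In this model, an inode whose write completed with -ESTALE is skipped by model_queue_for_sync(), while one whose write completed with status 0 is queued as before. How the real kernel paths consult the bit is not described in this report.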
Comment 20 Ernie Petrides 2005-01-11 18:43:37 EST
A fix for this problem has just been committed to the RHEL3 U5
patch pool this evening (in kernel version 2.4.21-27.7.EL).
Comment 21 Tim Powers 2005-05-18 09:28:48 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-294.html
