Bug 141757 (IT_56112)

Summary:

Infinite loop when syncing over automounted NFS

Product:

Red Hat Enterprise Linux 3

Reporter:

Bastien Nocera <bnocera>

Component:

kernel

Assignee:

Steve Dickson <steved>

Status:

CLOSED ERRATA

QA Contact:

Brian Brock <bbrock>

Severity:

medium

Docs Contact:

Priority:

medium

Version:

3.0

CC:

hgarcia, hooft, kanderso, peterm, petrides, riel, sct, tao

Target Milestone:

---

Target Release:

---

Hardware:

All

OS:

Linux

Whiteboard:

Fixed In Version:

Doc Type:

Bug Fix

Last Closed:

2005-05-18 13:28:48 UTC

Bug Blocks:

132991

Attachments:

Description	Flags
script.pl	none
altsysrq.txt	none
Proposed Patch	none

Description Bastien Nocera 2004-12-03 16:19:47 UTC

kernel 2.4.21-25.EL

Setup:
- 1 NFS server
- 2 NFS clients that automount a directory in rw from the NFS server

Reproducing the problem:
a) cd to the auto mountpoint on client number 1
b) cd to the auto mountpoint on client number 2
c) execute script.pl on client number 1
d) on client number 2, delete the file being generated by client number 1
e) client number 1 now starts reporting ESTALE errors, which is
acceptable so far. (By contrast, if the file is deleted from client
number 1 itself, it is renamed to ".nfsXXXXXXX" instead: no error is
returned, writes succeed and there is no hang.)
f) on client number 1, run the sync command

Result:
Client number 1 spins inside wait_on_locked() (it is inlined, so
only the caller, sync_inodes_sb(), shows up)
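
For readers without access to the attachment, step c) can be pictured with a small C stand-in. This is only a hedged sketch: the attached script.pl is not reproduced in this report, so the function name append_records(), the record format and the optional delay are all assumptions rather than the actual reproducer.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for the writer in step c); NOT the attached
 * script.pl. Appends `count` short records to `path`, sleeping
 * `delay_sec` seconds between writes, and returns how many writes
 * succeeded (-1 if the file cannot be opened). Failed writes, such as
 * ESTALE, are reported on stderr.
 */
static int append_records(const char *path, int count, unsigned delay_sec)
{
    int ok = 0;
    int fd;

    assert(path != NULL);
    fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0) {
        perror("open");
        return -1;
    }
    for (int i = 0; i < count; i++) {
        char buf[64];
        int len = snprintf(buf, sizeof(buf), "record %d\n", i);

        if (write(fd, buf, (size_t)len) == (ssize_t)len)
            ok++;
        else
            fprintf(stderr, "write %d failed: %s\n", i, strerror(errno));
        if (delay_sec > 0)
            sleep(delay_sec);
    }
    if (close(fd) < 0)
        perror("close");
    return ok;
}
```

Called from a small main() on client number 1 with a file under the automounted directory, a long-running loop like this gives client number 2 time to delete the file. Because NFS clients cache writes, the ESTALE error may only surface on a later write or at close().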

Comment 1 Bastien Nocera 2004-12-03 16:22:12 UTC

The modified wait_on_locked() and sync_inodes_sb():

static void wait_on_locked(struct list_head *head)
{
       struct list_head * tmp;

       printk("wait_on_locked begins\n");
       /* Take the tail entry until the list is empty (head->prev == head) */
       while ((tmp = head->prev) != head) {
               struct inode *inode = list_entry(tmp, struct inode, i_list);
               printk("i_ino: %lu\n", inode->i_ino);
               printk("struct inode address: %p\n", (void *)inode);
               printk("head address: %p\n", (void *)head);
               printk("tmp address: %p\n", (void *)tmp);
               printk("head->prev address: %p\n", (void *)(head->prev));
               __iget(inode);
               spin_unlock(&inode_lock);
               __wait_on_inode(inode);
               iput(inode);
               spin_lock(&inode_lock);
       }
       printk("wait_on_locked ends\n");
}

The above function is inline in the original kernel source, and it is
called by:

void sync_inodes_sb(struct super_block *sb)
{
       spin_lock(&inode_lock);
       while (!list_empty(&sb->s_dirty) || !list_empty(&sb->s_locked_inodes)) {
               sync_list(&sb->s_dirty);
               wait_on_locked(&sb->s_locked_inodes);
       }
       spin_unlock(&inode_lock);
}

The output when reproducing the problem:
head address: ce03f06c
tmp address: cd6a1788
head->prev address: cd6a1788
i_ino: 213089
struct inode address: cd6a1780
head address: ce03f06c
tmp address: cd6a1788
head->prev address: cd6a1788
i_ino: 213089
struct inode address: cd6a1780
head address: ce03f06c
tmp address: cd6a1788
head->prev address: cd6a1788
i_ino: 213089
struct inode address: cd6a1780
head address: ce03f06c
tmp address: cd6a1788
head->prev address: cd6a1788
i_ino: 213089
struct inode address: cd6a1780
head address: ce03f06c
tmp address: cd6a1788
head->prev address: cd6a1788
etc.
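
The repeated output above (same tmp, same inode, every iteration) is what one would expect if the entry at head->prev never leaves the list. The following userspace model is a sketch of that loop shape only, not kernel code: struct node, struct model_inode, the unlocks flag and the iteration cap are assumptions introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal userspace model of a circular doubly linked list. */
struct node {
    struct node *next, *prev;
};

static void list_init(struct node *head)
{
    head->next = head->prev = head;
}

static void list_add(struct node *entry, struct node *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

static void list_del(struct node *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    entry->next = entry->prev = entry;
}

/* Model inode: `unlocks` says whether waiting on it ever removes it. */
struct model_inode {
    struct node i_list;  /* kept first so the cast below is valid */
    bool unlocks;
};

/*
 * Same shape as wait_on_locked(): keep taking head->prev until the list
 * is empty. Returns the number of iterations, capped at `limit` so the
 * model terminates even where the real loop would not.
 */
static unsigned wait_on_locked_model(struct node *head, unsigned limit)
{
    unsigned spins = 0;
    struct node *tmp;

    assert(head != NULL);
    while ((tmp = head->prev) != head && spins < limit) {
        struct model_inode *inode = (struct model_inode *)tmp;

        /* Stands in for the wait completing and the inode being moved
         * off the locked list; a "broken" inode never gets here. */
        if (inode->unlocks)
            list_del(&inode->i_list);
        spins++;
    }
    return spins;
}
```

With one well-behaved entry the model returns after a single iteration. With an entry that is never removed it runs until the cap, which corresponds to the endless printk output in the real kernel.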

Comment 2 Bastien Nocera 2004-12-03 16:23:21 UTC

Created attachment 107839
script.pl

Reproducer script

Comment 3 Bastien Nocera 2004-12-03 16:24:15 UTC

Created attachment 107840
altsysrq.txt

Alt+SysRq+T of the "hang"

Comment 5 Bastien Nocera 2004-12-06 10:10:44 UTC

It's not an autofs loopback mount; the NFS mounts are automounted on the
clients from the NFS server.

Comment 6 Jeff Moyer 2004-12-06 13:54:14 UTC

It seems to me that this can be reproduced without the automounter, yes?

Comment 7 Bastien Nocera 2004-12-06 14:05:26 UTC

automount only seems to trigger the hang more quickly; with a normal
NFS mount, the problem still happens after a couple of sync() calls.

Comment 8 Bastien Nocera 2004-12-06 16:09:00 UTC

The NFS options triggering the bug were:
acregmin=1,acregmax=1

Any low value of acregmax triggers the hang. Steve, please let
me know if this is expected behaviour. If it is, the bug can be closed.

Comment 10 Bastien Nocera 2004-12-08 09:53:26 UTC

After more thorough testing, it looks like the default values for
acregmin and acregmax don't fix the issue.

Comment 12 Steve Dickson 2004-12-08 11:29:10 UTC

Ok... Would it be possible to get the raw output of the dump?
Meaning either "tethereal -w /tmp/ethdump.pcap" or
"tcpdump -w /tmp/tcpdump.pcap". Having the raw data makes it
easier to sort out the noise.... Also, you might want to bzip2
any dumps since it makes them easier to download...

Comment 15 Steve Dickson 2004-12-14 19:28:16 UTC

Created attachment 108552
Proposed Patch

The sync process loops in wait_on_locked(), when called from
sync_inodes_sb(), since the "broken" inode cannot be cleared
from the locked inode list.

This patch sets the NFS_INO_STALE bit in the write path (via
nfs_writeback_done), which marks the inode as broken early enough
to stop it from being added to that list.
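
The idea described above can be modelled in userspace as follows. This is a hedged sketch, not the attached patch (its contents are not shown in this report): MODEL_INO_STALE, struct stale_model_inode, model_writeback_done() and model_queue_for_sync() are hypothetical names that only mirror the description.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Stands in for NFS_INO_STALE; the bit value is arbitrary here. */
#define MODEL_INO_STALE 0x1u

struct stale_model_inode {
    unsigned flags;
    bool on_locked_list;
};

/* Write-completion hook: record staleness as soon as a write
 * completes with ESTALE. */
static void model_writeback_done(struct stale_model_inode *inode, int status)
{
    assert(inode != NULL);
    if (status == -ESTALE)
        inode->flags |= MODEL_INO_STALE;
}

/* Sync path: refuse to queue an inode already known to be stale, so it
 * can never become the entry that the wait loop spins on. */
static bool model_queue_for_sync(struct stale_model_inode *inode)
{
    assert(inode != NULL);
    if (inode->flags & MODEL_INO_STALE)
        return false;
    inode->on_locked_list = true;
    return true;
}
```

The design point is ordering: the stale flag is set when the write completes, before sync ever looks at the inode, so the check in the queuing step is enough to keep it off the list.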

Comment 20 Ernie Petrides 2005-01-11 23:43:37 UTC

A fix for this problem has just been committed to the RHEL3 U5
patch pool this evening (in kernel version 2.4.21-27.7.EL).

Comment 21 Tim Powers 2005-05-18 13:28:48 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-294.html