Escalated to Bugzilla from IssueTracker
Description of problem:

System panic with the below panic message:

------------[ cut here ]------------
kernel BUG at fs/nfs/nfs4state.c:135!
invalid operand: 0000 [#1]
SMP
Modules linked in: loop netconsole netdump sg ppp_async ppp_generic slhc crc_ccitt mptctl mptbase ipmi_devintf ipmi_si ipmi_msghandler dell_rbu nfs lockd nfs_acl autofs4 sunrpc ipt_limit ipt_state ip_conntrack iptable_filter ip_tables dm_mirror dm_mod button battery ac ohci_hcd e1000 tg3 floppy ext3 jbd aacraid sd_mod scsi_mod
CPU:    1
EIP:    0060:[<f8b46ea4>]    Not tainted VLI
EFLAGS: 00010202   (2.6.9-67.ELsmp)
EIP is at nfs4_free_client+0x35/0x69 [nfs]
eax: c305ee40   ebx: c305ee00   ecx: c305ee48   edx: c1000000
esi: c3191600   edi: f8b5c860   ebp: cc66d000   esp: cc66df34
ds: 007b   es: 007b   ss: 0068
Process umount (pid: 15237, threadinfo=cc66d000 task=e5781630)
Stack: c3191600 f8b46ce8 c3191c00 f8b2faab c3191c40 c3191c00 c016103c 00000000
       bff3ebf0 08a2f11d c0175392 d039c17c f5d9aa00 c01640ba 00000202 00000000
       00000001 00000001 00000000 c01512da f6f470c4 e822fd84 c015160a b7ce8000
Call Trace:
 [<f8b46ce8>] destroy_nfsv4_state+0x2b/0x37 [nfs]
 [<f8b2faab>] nfs4_kill_super+0x3b/0x5c [nfs]
 [<c016103c>] deactivate_super+0x5b/0x70
 [<c0175392>] sys_umount+0x65/0x6c
 [<c01640ba>] sys_stat64+0xf/0x23
 [<c01512da>] unmap_vma_list+0xe/0x17
 [<c015160a>] do_munmap+0x108/0x116
 [<c01753a4>] sys_oldumount+0xb/0xe
 [<c02d8607>] syscall_call+0x7/0xb
Code: c1 74 20 8b 41 04 8b 11 89 42 04 89 10 89 c8 c7 01 00 01 10 00 c7 41 04 00 02 20 00 e8 95 16 60 c7 eb d6 8d 43 40 39 43 40 74 08 <0f> 0b 87 00 a2 eb b4 f8 8b 43 64 85 c0 74 05 e8 0e f6 f6 ff 89

How reproducible:
Random

Steps to Reproduce:
No real steps, but we have a complete vmcore and I have attached the CAS report.

Actual results:
System panics at random.

Expected results:
System should not panic.
Additional info:

Your corefile is ready for you. You may view it at core-i386.gsslab.rdu.redhat.com. Login with kerberos name/password.

$ cd /cores/20080331031722/work
/cores/20080331031722/work$ ./crash

Sosreport attached.

This event sent from IssueTracker by fleite [Support Engineering Group] issue 173813
Found a similar bugzilla:
http://devresources.linux-foundation.org/dev/nfsv4/bugzilla/show_bug.cgi?id=113

One more bugzilla with similar call traces:
https://bugzilla.redhat.com/show_bug.cgi?id=228292#c61

Regards,
Nitin

Issue escalated to Support Engineering Group by: nbansal.
nbansal assigned to issue for Production Support (Pune).
Internal Status set to 'Waiting on SEG'
Status set to: Waiting on Tech
This event sent from IssueTracker by fleite [Support Engineering Group] issue 173813
Note: Looks related to https://bugzilla.redhat.com/show_bug.cgi?id=433249 This event sent from IssueTracker by fleite [Support Engineering Group] issue 173813
Note: looks slightly related https://bugzilla.redhat.com/show_bug.cgi?id=402581 This event sent from IssueTracker by fleite [Support Engineering Group] issue 173813
I'm not very familiar with the NFSv4 structures, so I'm having a hard time navigating the vmcore on crash. I'll escalate this to Engineering to get some extra help. Issue escalated to RHEL 4 Storage by: fleite. Internal Status set to 'Waiting on Engineering' This event sent from IssueTracker by fleite [Support Engineering Group] issue 173813
Ugh... busy inodes after umount problem. We probably won't be able to tell much from the core. Typically, the conditions that cause this situation are long gone by the time the box crashes.

In this case, we crashed because of this:

    BUG_ON(!list_empty(&clp->cl_state_owners));

This list should be empty, but it's not (likely because we have busy inodes). Looking at the core, the nfs4_client address is in %ebx. Some interesting fields:

crash> struct nfs4_client c305ee00
...
  cl_state_owners = {
    next = 0xf7d1e880,
    prev = 0xf7d1e880
  },
...

...which has only one entry on the list:

struct nfs4_state_owner {
  so_list = {
    next = 0xc305ee40,
    prev = 0xc305ee40
  },
  so_client = 0xc305ee00,
  so_id = 0x0,
  so_sema = {
    count = {
      counter = 0x1
    },
    sleepers = 0x0,
    wait = {
      lock = {
        lock = 0x1,
        magic = 0xdead4ead
      },
      task_list = {
        next = 0xf7d1e8a0,
        prev = 0xf7d1e8a0
      }
    }
  },
  so_seqid = 0x500b9,
  so_count = {
    counter = 0x1
  },
  so_cred = 0xf7d1e900,
  so_states = {
    next = 0xf7d56600,
    prev = 0xf7d56600
  },
  so_delegations = {
    next = 0xf7d1e8bc,
    prev = 0xf7d1e8bc
  }
}

...I'll have to work back from here and see if I can track down the inode.
For reference, I got the nfs4_state_owner like this:

crash> list nfs4_state_owner.so_list -s nfs4_state_owner 0xf7d1e880
f7d1e880

...the so_states list is non-empty:

crash> list nfs4_state.open_states -s nfs4_state 0xf7d56600
f7d56600
struct nfs4_state {
...
  inc_open = {
    next = 0xf7054240,
    prev = 0xf7054240
  },
...
  inode = 0xc5dcf94c,
...
}

...the inc_open list also seems to be non-empty, so it looks like an open attempt failed at some point and wasn't properly cleaned up:

crash> list nfs4_inc_open.state -s nfs4_inc_open 0xf7054240
f7054240
struct nfs4_inc_open {
  state = {
    next = 0xf7d56618,
    prev = 0xf7d56618
  },
  task = 0xe5008e30,
  flags = 0x11
}

...the task here seems to be long gone. Some interesting stuff from the inode:

  i_ino = 63361
  i_mode = 100644 (regular file)
  i_sb = 0xc3190a00,

...the superblock here is also gone from the mount list.
Ok, my suspicion is that this is another manifestation of this bug:

https://bugzilla.redhat.com/show_bug.cgi?id=234587

...with that patch I had NFSv4 track incomplete opens and clean them up if a setattr failed (i.e. O_TRUNC opens). Obviously, this doesn't seem to be sufficient. We need to make sure that we clean up these incomplete opens whenever open_namei returns an error.

I think this means we need a fs-specific "open_cleanup" inode operation. For most filesystems, this would be a no-op, but for nfsv4 we'd call this. This does mean that the fix isn't confined to NFSv4, though I think we can make sure that the impact is negligible for other filesystems.

The big question is how to not break kabi with this. If we add a new inode op, then we'll also have to add a flag of some sort to make sure that we don't try to dereference it on any filesystems that don't have the op.
This turns out to be rather tricky, actually... There are a lot of special cases in this codepath and we need to make sure that we hit the right ones.

On a good note, the work already done by Peter Staubach and David Howells gives us an extended inode_ops struct that nfs4 already uses. We should be able to just tack an open_cleanup function onto that struct and add a new SB flag for filesystems that define it (just nfsv4 here). The tricky part is knowing how to call the new operation. There are several reasons that open_namei can return an error, and I'm not sure whether the nameidata will be properly filled out in all cases.
Created attachment 309745 [details]
proposed patch -- handle nfs4 incomplete opens more comprehensively

A possible patch for this problem... This uses some of the infrastructure that Peter S. added to do 64-bit inode support. It does the following:

1) renames the INO64 flags to something more generic (since the new op doesn't have anything to do with 64-bit inodes)
2) adds a new lookup_cleanup extended inode operation, and defines such an operation for nfsv4
3) removes the place in nfs4_proc_setattr that would clean up incomplete opens
4) has open_namei() call this lookup_cleanup operation whenever the path_lookup succeeds, but it eventually returns an error

This seems to work and still fixes the original issue that the inc_open stuff is intended to fix. It probably needs more testing. If the customer is able to reproduce the oops in this case, then having them test this patch would be helpful. I'll plan to add this to my next set of test kernels and will post here when I have them built.
I have some test kernels built with this patch (plus other ones that I have queued up for 4.8): http://people.redhat.com/jlayton/ ...if the customer is able to reproduce this problem and can test these somewhere non-critical, then that would be helpful...
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Updating PM score.
Created attachment 315673 [details]
patch -- handle nfs4 incomplete opens more comprehensively

Respun patch. Since I did the original patch, someone reported bug 457407. That bug demonstrated that the earlier patch still didn't cover enough. In particular, it was possible for an open to happen in the d_revalidate codepath. An incomplete open in that case would not be tracked or cleaned up.

With the new lookup_cleanup operation though, we don't really need to keep track of incomplete opens in such a complicated fashion. Since we're always calling that when open_namei errors out, it's sufficient to just do an extra nfs4_close_state on the nfs4_state and not bother with the extra tracking. This patch implements that. It fixes the panic in bug 457407 and should also fix the problem here.

This patch is in the test kernels on my people page if anyone wishes to test it:

http://people.redhat.com/jlayton/
*** Bug 457407 has been marked as a duplicate of this bug. ***
Hello Jeff-san, Fujitsu confirmed that the patch 27-bz-446396-nfs4-handle-incomple.patch worked! Thanks! Best Regards, M Oshiro Internal Status set to 'Waiting on Engineering' Status set to: Waiting on Tech This event sent from IssueTracker by moshiro issue 191939
Created attachment 317001 [details]
patch -- handle nfs4 incomplete opens more comprehensively

Hopefully final patch. Adds calls to do_lookup_cleanup in other codepaths that do lookups with open intents. Checks for missing extended inode and file ops (since I'm using the same flag). Other than that, everything should pretty much be the same...
Created attachment 317484 [details]
updated patch -- handle nfs4 incomplete opens more comprehensively

Updated patch, some small changes and extra sanity checks for the new undo op.
Committed in 78.28.EL . RPMS are available at http://people.redhat.com/vgoyal/rhel4/
*** Bug 493709 has been marked as a duplicate of this bug. ***
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1024.html