Bug 697032

Summary: NFSv4 server issue during recovery with running client I/O
Product: Red Hat Enterprise Linux 6
Reporter: Corey Marthaler <cmarthal>
Component: kernel
Assignee: Red Hat Kernel Manager <kernel-mgr>
Status: CLOSED DUPLICATE
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: high
Docs Contact:
Priority: high
Version: 6.1
CC: bfields, rwheeler
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2011-04-15 16:55:20 UTC

Description Corey Marthaler 2011-04-15 16:35:44 UTC
Description of problem:
This is a new bug for the issue mentioned in comment #22 of bug 633540.

This has been seen on multiple clusters running HA NFS services.

Apr 14 16:52:24 grant-03 rgmanager[2237]: Service service:nfs1 started
------------[ cut here ]------------
kernel BUG at fs/nfsd/nfs4state.c:390!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:08.0/0000:01:00.0/net/eth0/broadcast
CPU 0
Modules linked in: nfsd lockd nfs_acl auth_rpcgss exportfs ext3 jbd dlm configfs sunrpc c]

Modules linked in: nfsd lockd nfs_acl auth_rpcgss exportfs ext3 jbd dlm configfs sunrpc c]
Pid: 3555, comm: nfsd Not tainted 2.6.32-130.el6.x86_64 #1 PowerEdge SC1435
RIP: 0010:[<ffffffffa04d32b5>]  [<ffffffffa04d32b5>] free_generic_stateid+0x35/0xb0 [nfsd]
RSP: 0018:ffff8801198d9b00  EFLAGS: 00010297
RAX: 0000000000000000 RBX: ffff880119f85f38 RCX: ffff8801198d9ae8
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8801198d9b0c
RBP: ffff8801198d9b20 R08: ffff880119f85f58 R09: 0000000000000000
R10: 0000000000000010 R11: 0000000000000000 R12: ffff8801198d7dd8
R13: ffff8801198d7e10 R14: ffff8801198d7dd8 R15: ffff880119f85740
FS:  00007f6ce4131700(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00007f6ce413c000 CR3: 0000000218e85000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process nfsd (pid: 3555, threadinfo ffff8801198d8000, task ffff88011a580a80)
Stack:
 ffff8801198d9b20 00000000a04cee88 ffff880119f85f38 ffff8801198d7dd8
<0> ffff8801198d9b50 ffffffffa04d3389 0000000000000000 ffff88021a1591a0
<0> 000000001d270000 ffff88021a15a040 ffff8801198d9d80 ffffffffa04d3a5d
Call Trace:
 [<ffffffffa04d3389>] release_lockowner+0x59/0xb0 [nfsd]
 [<ffffffffa04d3a5d>] nfsd4_lock+0x4cd/0x7e0 [nfsd]
 [<ffffffffa04bda06>] ? nfsd_setuser+0x126/0x2c0 [nfsd]
 [<ffffffffa04b5852>] ? nfsd_setuser_and_check_port+0x62/0xb0 [nfsd]
 [<ffffffffa04b5a07>] ? fh_verify+0x167/0x650 [nfsd]
 [<ffffffffa04c4f01>] nfsd4_proc_compound+0x3d1/0x490 [nfsd]
 [<ffffffffa04b243e>] nfsd_dispatch+0xfe/0x240 [nfsd]
 [<ffffffffa03bf4d4>] svc_process_common+0x344/0x640 [sunrpc]
 [<ffffffff8105d710>] ? default_wake_function+0x0/0x20
 [<ffffffffa03bfb10>] svc_process+0x110/0x160 [sunrpc]
 [<ffffffffa04b2b62>] nfsd+0xc2/0x160 [nfsd]
 [<ffffffffa04b2aa0>] ? nfsd+0x0/0x160 [nfsd]
 [<ffffffff8108de16>] kthread+0x96/0xa0
 [<ffffffff8100c1ca>] child_rip+0xa/0x20
 [<ffffffff8108dd80>] ? kthread+0x0/0xa0
 [<ffffffff8100c1c0>] ? child_rip+0x0/0x20
Code: 10 0f 1f 44 00 00 48 8b 77 60 48 89 fb 48 8d 7d ec e8 b0 c1 ff ff 8b 45 ec 83 e0 03
RIP  [<ffffffffa04d32b5>] free_generic_stateid+0x35/0xb0 [nfsd]
 RSP <ffff8801198d9b00>
---[ end trace 536ae40f35a0eb0d ]---
Apr 14 16:52:27 grant-03 kernel: ------------[ cut here ]------------
Kernel panic - not syncing: Fatal exception
Pid: 3555, comm: nfsd Tainted: G      D    ----------------   2.6.32-130.1
Call Trace:
 [<ffffffff814da981>] ? panic+0x78/0x143
 [<ffffffff814de9c4>] ? oops_end+0xe4/0x100
 [<ffffffff8100f2fb>] ? die+0x5b/0x90
 [<ffffffff814de294>] ? do_trap+0xc4/0x160
 [<ffffffff8100ceb5>] ? do_invalid_op+0x95/0xb0
 [<ffffffffa04d32b5>] ? free_generic_stateid+0x35/0xb0 [nfsd]
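
Triage note: in the traces above, frames marked with "?" are addresses the kernel found on the stack but could not confirm as part of the call chain, so the unmarked frames (release_lockowner, nfsd4_lock, nfsd4_proc_compound, ...) are the ones to read first. The following is a minimal sketch, not part of the original report, of how those frames could be pulled out of a pasted oops. The names `FRAME_RE`, `parse_frames`, and `SAMPLE` are hypothetical, and the pattern assumes the 2.6.32-era frame layout shown in this bug.

```python
import re

# Matches one frame of a "Call Trace:" section in the layout seen above, e.g.
#  [<ffffffffa04d3389>] release_lockowner+0x59/0xb0 [nfsd]
# Assumption: address, optional "?", symbol+0xoffset/0xsize, optional [module].
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<guess>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(text):
    """Return one dict per stack frame found in the given oops text."""
    frames = []
    for m in FRAME_RE.finditer(text):
        frames.append({
            "symbol": m.group("symbol"),
            "offset": int(m.group("offset"), 16),
            "size": int(m.group("size"), 16),
            "module": m.group("module"),          # None when no [module] tag is printed
            "reliable": m.group("guess") is None,  # False for "?" frames
        })
    return frames


# A few frames copied from the trace in this report.
SAMPLE = """\
 [<ffffffffa04d3389>] release_lockowner+0x59/0xb0 [nfsd]
 [<ffffffffa04d3a5d>] nfsd4_lock+0x4cd/0x7e0 [nfsd]
 [<ffffffffa04bda06>] ? nfsd_setuser+0x126/0x2c0 [nfsd]
 [<ffffffff8108de16>] kthread+0x96/0xa0
"""

if __name__ == "__main__":
    for frame in parse_frames(SAMPLE):
        if frame["reliable"]:
            print(frame["symbol"], hex(frame["offset"]), frame["module"] or "(no module tag)")
```

Filtering on `reliable` leaves the LOCK path through nfsd (nfsd4_lock calling release_lockowner) that precedes the BUG in free_generic_stateid.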

Comment 1 J. Bruce Fields 2011-04-15 16:55:20 UTC
Reopen if the upstream patch referenced in bug 696376 doesn't fix the problem.

*** This bug has been marked as a duplicate of bug 696376 ***