Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 696376

Summary: server BUG() on receipt of bad NFSv4 lock request
Product: Red Hat Enterprise Linux 6
Component: kernel
Kernel sub component: NFS
Version: 6.1
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: urgent
Priority: unspecified
Keywords: Regression
Target Milestone: rc
Reporter: J. Bruce Fields <bfields>
Assignee: J. Bruce Fields <bfields>
QA Contact: Filesystem QE <fs-qe>
CC: cmarthal, klaus.steinberger, kzhang, liko, mzywusko, rwheeler, syeghiay
Fixed In Version: kernel-2.6.32-131.0.5.el6
Doc Type: Bug Fix
Last Closed: 2011-05-19 12:42:50 UTC

Description J. Bruce Fields 2011-04-13 22:49:47 UTC

Certain lock failures (e.g. a lock request received during the grace period) cause the server to hit a BUG() like:

kernel BUG at fs/nfsd/nfs4state.c:390!
invalid opcode: 0000 [#1] SMP 
last sysfs file: /sys/devices/pci0000:00/0000:00:03.0/0000:02:00.1/irq
CPU 12 
Modules linked in: nfs fscache(T) nfsd lockd nfs_acl auth_rpcgss exportfs ext4 ]
Pid: 3309, comm: nfsd Tainted: G           ---------------- T 2.6.32-130.el6.x85
RIP: 0010:[<ffffffffa038b2b5>]  [<ffffffffa038b2b5>] free_generic_stateid+0x35/]
RSP: 0018:ffff8802305a3b00  EFLAGS: 00010297
RAX: 0000000000000000 RBX: ffff88043fc91740 RCX: ffff8802305a3ae8
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8802305a3b0c
RBP: ffff8802305a3b20 R08: ffff88043fc91760 R09: 0000000000000000
R10: 000000000000003c R11: 0000000000000000 R12: ffff8804365f6280
R13: ffff8804365f62b8 R14: ffff8804365f6280 R15: ffff88043fc917b8
FS:  00007f6e57d44700(0000) GS:ffff880247440000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00007f6e573f4550 CR3: 0000000001a25000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process nfsd (pid: 3309, threadinfo ffff8802305a2000, task ffff88023061a080)
Stack:
 ffff8802305a3b20 00000000a0386e88 ffff88043fc91740 ffff8804365f6280
<0> ffff8802305a3b50 ffffffffa038b389 0000000000000000 ffff88082f3af1a0
<0> 000000001d270000 ffff88082f3b0040 ffff8802305a3d80 ffffffffa038ba5d
Call Trace:
 [<ffffffffa038b389>] release_lockowner+0x59/0xb0 [nfsd]
 [<ffffffffa038ba5d>] nfsd4_lock+0x4cd/0x7e0 [nfsd]
 [<ffffffffa0375a06>] ? nfsd_setuser+0x126/0x2c0 [nfsd]
 [<ffffffffa036d852>] ? nfsd_setuser_and_check_port+0x62/0xb0 [nfsd]
 [<ffffffffa036da07>] ? fh_verify+0x167/0x650 [nfsd]
 [<ffffffffa037cf01>] nfsd4_proc_compound+0x3d1/0x490 [nfsd]
 [<ffffffffa036a43e>] nfsd_dispatch+0xfe/0x240 [nfsd]
 [<ffffffffa02634d4>] svc_process_common+0x344/0x640 [sunrpc]
 [<ffffffff8105d710>] ? default_wake_function+0x0/0x20
 [<ffffffffa0263b10>] svc_process+0x110/0x160 [sunrpc]
 [<ffffffffa036ab62>] nfsd+0xc2/0x160 [nfsd]
 [<ffffffffa036aaa0>] ? nfsd+0x0/0x160 [nfsd]
 [<ffffffff8108de16>] kthread+0x96/0xa0
 [<ffffffff8100c1ca>] child_rip+0xa/0x20
 [<ffffffff8108dd80>] ? kthread+0x0/0xa0
 [<ffffffff8100c1c0>] ? child_rip+0x0/0x20
Code: 10 0f 1f 44 00 00 48 8b 77 60 48 89 fb 48 8d 7d ec e8 b0 c1 ff ff 8b 45 e 
RIP  [<ffffffffa038b2b5>] free_generic_stateid+0x35/0xb0 [nfsd]
 RSP <ffff8802305a3b00>

Comment 1 J. Bruce Fields 2011-04-13 22:51:50 UTC

Fix committed upstream as commit 23fcf2ec93fb8573a653408316af599939ff9a8e.

Comment 3 J. Bruce Fields 2011-04-13 23:02:00 UTC

The simplest reproducer I've found is:

a) open a file

b) restart the server

c) get a lock on the open file descriptor while the server is still in its grace period (so within 90 seconds of step b).
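
The client side of steps a) to c) could be sketched roughly as follows. This is a minimal illustration, not the script used for this report: the function name open_then_lock and the wait_for_restart callback are hypothetical, and the path is assumed to live on an NFSv4 mount of the affected server. The report does not say what the client observes, so the sketch only reports whether the lock call succeeded.

```python
import errno
import fcntl
import os


def open_then_lock(path, wait_for_restart):
    """Open `path`, pause while the server is restarted, then request a lock.

    Returns "locked" if the lock was granted, otherwise "error: <ERRNO>".
    `wait_for_restart` is a hypothetical callback that returns once the
    NFS server has been restarted (step b).
    """
    # Step a: open a file (on the NFSv4 mount when reproducing for real).
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Step b: the server is restarted while the file stays open.
        wait_for_restart()
        # Step c: request a lock on the already-open descriptor. To hit the
        # bug this must reach the server within its grace period.
        try:
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            name = errno.errorcode.get(exc.errno, str(exc.errno))
            return "error: " + name
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return "locked"
    finally:
        os.close(fd)
```

One way to drive it by hand is open_then_lock("/mnt/nfs4/testfile", lambda: input("Restart the server, then press Enter: ")), where the mount point is again only an example. Press Enter within the 90-second grace period, then check the server console for the BUG().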

Comment 4 RHEL Program Management 2011-04-14 16:19:29 UTC

This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has 
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed 
products. This request is not yet committed for inclusion in an Update release.

Comment 6 J. Bruce Fields 2011-04-15 16:55:20 UTC

*** Bug 697032 has been marked as a duplicate of this bug. ***

Comment 7 Aristeu Rozanski 2011-04-20 22:06:00 UTC

Patch(es) available on kernel-2.6.32-131.0.5.el6

Comment 10 Nate Straz 2011-04-26 13:29:39 UTC

I ran into this during cluster relocation tests while running a -132-based kernel. I re-ran the tests with kernel-2.6.32-131.0.5.el6.x86_64 and made it through the relocation tests.

Is there a clone to make sure this patch makes it into 6.2?

Comment 11 J. Bruce Fields 2011-04-26 15:28:52 UTC

It's in as of kernel-2.6.32-134.el6.  As I understand it, that means it should be headed for both 6.1 and 6.2 already.
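
Since the fix landed in two places (the 131.0.5 build and mainline builds from -134 on), a plain "newer than" comparison is not enough: the -132-based kernel in comment 10 still hit the bug. A rough check based only on the builds named in this report might look like the sketch below. The helper names are hypothetical, and builds this report does not mention (for example -133, or later z-stream branches) are conservatively treated as unfixed, which may be wrong.

```python
import re


def release_tuple(release):
    """Return the build fields after '2.6.32-' as a tuple of ints.

    For example, '2.6.32-131.0.5.el6.x86_64' gives (131, 0, 5).
    """
    match = re.search(r"2\.6\.32-(\d+(?:\.\d+)*)\.el6", release)
    if match is None:
        raise ValueError("not a RHEL 6 2.6.32 kernel release: %r" % release)
    return tuple(int(part) for part in match.group(1).split("."))


def has_fix(release):
    """Guess whether `release` carries the fix, using this report only."""
    build = release_tuple(release)
    if build[0] == 131:
        # The -131 branch: fixed as of kernel-2.6.32-131.0.5.el6.
        return build >= (131, 0, 5)
    # Otherwise: reported as fixed as of kernel-2.6.32-134.el6.
    return build >= (134,)
```

On a running system the release string could come from os.uname().release, for example has_fix(os.uname().release).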

Comment 12 Nate Straz 2011-04-26 18:49:48 UTC

Sounds good, I'll mark this verified for 6.1.

Comment 13 errata-xmlrpc 2011-05-19 12:42:50 UTC

An advisory has been issued which should address the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0542.html

Comment 14 Yongcheng Yang 2019-11-27 06:19:44 UTC

(In reply to J. Bruce Fields from comment #3)
> Simplest reproducer I've found is:
> a) open a file
> b) restart the server
> c) get a lock on the open file descriptor while the server is still in its grace period (so within 90 seconds of step b).

This scenario has been covered by many tests under /kernel/filesystems/nfs/function/nfslock/ already.