Bug 176848

Summary: NLM: Fix Oops in nlmclnt_mark_reclaim()
Product: Red Hat Enterprise Linux 4 Reporter: Steve Dickson <steved>
Component: kernelAssignee: Jeff Layton <jlayton>
Status: CLOSED ERRATA QA Contact: Petr Beňas <pbenas>
Severity: high Docs Contact:
Priority: medium    
Version: 4.0CC: dhoward, emcnabb, hallstein, herbert.van.den.bergh, james.brown, jbaron, pbenas, pstehlik, riek, rwheeler, sprabhu, steved, tao, vgoyal
Target Milestone: rcKeywords: Reopened, ZStream
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-02-16 15:59:06 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 537017    
Attachments:
Description Flags
Test program used to lock files.
none
patch -- skip reclaiming locks for inodes on -o nolock mounts none

Description Steve Dickson 2006-01-03 19:05:11 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050921 Red Hat/1.7.12-1.4.1

Description of problem:
When mixing -olock and -onolock mounts on the same client, we have to
check that fl->fl_u.nfs_fl.owner is set before dereferencing it.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1.mixing -olock and -onolock mounts
2.
3.
  

Additional info:

tree 2f7f341d38d6a8ae71ff1fefa9e48a467b34f5a1
parent 48e49187753ec3b4fa84a7165c9b7a59f3875b56
author Trond Myklebust <Trond.Myklebust> Tue, 20 Dec 2005 03:11:25 -0500
committer Trond Myklebust <Trond.Myklebust> Tue, 20 Dec 2005 09:12:31 -0500

NLM: Fix Oops in nlmclnt_mark_reclaim()

 When mixing -olock and -onolock mounts on the same client, we have to
 check that fl->fl_u.nfs_fl.owner is set before dereferencing it.

 Signed-off-by: Trond Myklebust <Trond.Myklebust>

 fs/lockd/clntlock.c |    4 ++++
 1 files changed, 4 insertions(+)

diff --git a/fs/lockd/clntlock.c b/fs/lockd/clntlock.c
index 006bb9e..3eaf6e7 100644
--- a/fs/lockd/clntlock.c
+++ b/fs/lockd/clntlock.c
@@ -157,6 +157,8 @@ void nlmclnt_mark_reclaim(struct nlm_hos
 		inode = fl->fl_file->f_dentry->d_inode;
 		if (inode->i_sb->s_magic != NFS_SUPER_MAGIC)
 			continue;
+		if (fl->fl_u.nfs_fl.owner == NULL)
+			continue;
 		if (fl->fl_u.nfs_fl.owner->host != host)
 			continue;
 		if (!(fl->fl_u.nfs_fl.flags & NFS_LCK_GRANTED))
@@ -226,6 +228,8 @@ restart:
 		inode = fl->fl_file->f_dentry->d_inode;
 		if (inode->i_sb->s_magic != NFS_SUPER_MAGIC)
 			continue;
+		if (fl->fl_u.nfs_fl.owner == NULL)
+			continue;
 		if (fl->fl_u.nfs_fl.owner->host != host)
 			continue;
 		if (!(fl->fl_u.nfs_fl.flags & NFS_LCK_RECLAIM))

Comment 3 Tom Coughlan 2006-03-17 14:59:45 UTC
I will assume that "closed deferred" means deferred indefinitely, and remove
this from the RHEL4U4Proposed list. 

Comment 6 Jeff Layton 2007-01-16 15:47:05 UTC
*** Bug 179545 has been marked as a duplicate of this bug. ***

Comment 10 Jeff Layton 2007-01-17 14:55:20 UTC

*** This bug has been marked as a duplicate of 210128 ***

Comment 12 Sachin Prabhu 2009-06-26 14:05:49 UTC
Re-opening this issue since this was reported by a user running 2.6.9-67.0.7.ELsmp which contains a fix for bz 210128.

The problem seen here is caused by a rebooting server causing a client which uses a mixture of mounts mounted with both -olock and -onolock options.

Comment 13 Sachin Prabhu 2009-06-26 14:12:55 UTC
Created attachment 349555 [details]
Test program used to lock files.

Simple program to lock files used in reproducer.

Comment 14 Sachin Prabhu 2009-06-26 14:20:51 UTC
To reproduce
1) Compile locking program
#gcc -o /tmp/lock fcntl_lock-b.c

2) 
Mount share /m1 with option -nolock
#mount -onolock vm22:/test1 /m1
Mount share /m2 with default -olock option 
#mount -olock vm22:/test2 /m2

3) lock files on both shares
# touch /m1/a;/tmp/lock /m1/a
Lock obtained
 Press any key to exit...

On another terminal, lock the file on the share with -olock
# touch /m2/a; /tmp/lock /m2/a2
Lock obtained
 Press any key to exit...

4) restart the nfslock on server to simulate a server reboot. A notify message will be sent to the clients asking them to reclaim their locks.

# service nfslock restart

This will cause the client to crash.

Comment 15 Sachin Prabhu 2009-06-26 14:23:53 UTC
Note: In the previous reproducer, we need a mixture of shares mounted with -onolock and -olock. 

The -olock is required because this ensures that the client is monitored on the server and a notify message will be sent when the server reboots. 

The -onolock will create file_locks with fl->fl_u.nfs_fl.owner set to NULL. This will eventually cause the crash.

The proposed patch for this problem is provided in the summary.

Comment 17 Jeff Layton 2009-07-01 12:08:05 UTC
(In reply to comment #15)

> The -onolock will create file_locks with fl->fl_u.nfs_fl.owner set to NULL.
> This will eventually cause the crash.
> 

Where exactly does "owner" get set to NULL. As best I can tell, it ends up uninitialized. I have no problem with adding a NULL pointer check there if we can verify that it always gets set to NULL in the -onolock case.

Otherwise, we might be better off adding a different sort of check.

A safer bet might be to add a check for this, like the one in do_setlk:

     NFS_SERVER(inode)->flags & NFS_MOUNT_NONLM

...and skip reclaming the lock in that case. I'll see about rolling up a patch to do that.

Comment 18 Sachin Prabhu 2009-07-01 12:27:55 UTC
( reply to c#17)

You are correct.

In any case, checking to see if this particular lock was created on a share which was mounted with the -onolock is a much better option.

Comment 19 Jeff Layton 2009-07-01 14:03:32 UTC
Created attachment 350119 [details]
patch -- skip reclaiming locks for inodes on -o nolock mounts

This patch looks like it'll fix it, but it's untested so far.

Could you test it out with your reproducer and see if it fixes the problem?

Comment 20 Sachin Prabhu 2009-07-01 16:28:16 UTC
Patch tested successfully against reproducer from c#14. No crashes were seen.

Comment 26 Vivek Goyal 2009-08-25 19:22:26 UTC
Committed in 89.11.EL . RPMS are available at http://people.redhat.com/vgoyal/rhel4/

Comment 38 Petr Beňas 2011-01-13 13:27:27 UTC
Reproduced in 2.6.9.89.10.EL and verified in 2.6.9.89.11.EL.

Comment 39 errata-xmlrpc 2011-02-16 15:59:06 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0263.html