Bug 176848
Summary: | NLM: Fix Oops in nlmclnt_mark_reclaim() | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 4 | Reporter: | Steve Dickson <steved>
Component: | kernel | Assignee: | Jeff Layton <jlayton>
Status: | CLOSED ERRATA | QA Contact: | Petr Beňas <pbenas>
Severity: | high | Priority: | medium
Version: | 4.0 | CC: | dhoward, emcnabb, hallstein, herbert.van.den.bergh, james.brown, jbaron, pbenas, pstehlik, riek, rwheeler, sprabhu, steved, tao, vgoyal
Target Milestone: | rc | Keywords: | Reopened, ZStream
Target Release: | --- | Doc Type: | Bug Fix
Hardware: | All | OS: | Linux
Last Closed: | 2011-02-16 15:59:06 UTC | Bug Blocks: | 537017
Description
Steve Dickson
2006-01-03 19:05:11 UTC
I will assume that "closed deferred" means deferred indefinitely, and remove this from the RHEL4U4Proposed list.

*** Bug 179545 has been marked as a duplicate of this bug. ***

*** This bug has been marked as a duplicate of 210128 ***

Re-opening this issue since this was reported by a user running 2.6.9-67.0.7.ELsmp, which contains a fix for bz 210128.

The problem seen here occurs when the server reboots while the client has a mixture of mounts, some mounted with -olock and some with -onolock.

Created attachment 349555 [details]
Test program used to lock files.
Simple program to lock files used in reproducer.
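
The attached fcntl_lock-b.c is not reproduced in this report. As a rough illustration only, a program of this kind might take a whole-file POSIX lock along the following lines. The function name lock_whole_file and every detail of the body are assumptions, not the contents of the attachment.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for the attached test program: take an exclusive
 * POSIX (fcntl) lock on the whole file and return the open descriptor,
 * or -1 on failure. The lock is held until the descriptor is closed.
 */
int lock_whole_file(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    struct flock fl = {0};
    fl.l_type = F_WRLCK;     /* exclusive (write) lock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;            /* 0 means "through end of file" */

    /* F_SETLK is non-blocking: it fails instead of waiting on a conflict. */
    if (fcntl(fd, F_SETLK, &fl) < 0) {
        perror("fcntl(F_SETLK)");
        close(fd);
        return -1;
    }
    return fd;
}
```

A main() built on this would call lock_whole_file(argv[1]), print "Lock obtained", wait on getchar(), and then close the descriptor, which matches the output shown in the reproducer.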
To reproduce:

1) Compile the locking program.
   # gcc -o /tmp/lock fcntl_lock-b.c
2) Mount share /m1 with the -onolock option.
   # mount -onolock vm22:/test1 /m1
   Mount share /m2 with the default -olock option.
   # mount -olock vm22:/test2 /m2
3) Lock files on both shares.
   # touch /m1/a; /tmp/lock /m1/a
   Lock obtained
   Press any key to exit...
   On another terminal, lock the file on the share mounted with -olock.
   # touch /m2/a; /tmp/lock /m2/a2
   Lock obtained
   Press any key to exit...
4) Restart nfslock on the server to simulate a server reboot. A notify message will be sent to the clients asking them to reclaim their locks.
   # service nfslock restart
   This will cause the client to crash.

Note: The reproducer above needs a mixture of shares mounted with -onolock and -olock. The -olock mount is required because it ensures that the client is monitored by the server, so a notify message is sent when the server reboots. The -onolock mount creates file_locks with fl->fl_u.nfs_fl.owner set to NULL. This will eventually cause the crash.

The proposed patch for this problem is provided in the summary.

(In reply to comment #15)
> The -onolock will create file_locks with fl->fl_u.nfs_fl.owner set to NULL.
> This will eventually cause the crash.

Where exactly does "owner" get set to NULL? As best I can tell, it ends up uninitialized. I have no problem with adding a NULL pointer check there if we can verify that it always gets set to NULL in the -onolock case. Otherwise, we might be better off adding a different sort of check.

A safer bet might be to add a check for this, like the one in do_setlk:

   NFS_SERVER(inode)->flags & NFS_MOUNT_NONLM

...and skip reclaiming the lock in that case. I'll see about rolling up a patch to do that.

(In reply to comment #17)
You are correct. In any case, checking whether this particular lock was created on a share mounted with the -onolock option is a much better option.

Created attachment 350119 [details]
patch -- skip reclaiming locks for inodes on -o nolock mounts
This patch looks like it'll fix it, but it's untested so far.
Could you test it out with your reproducer and see if it fixes the problem?
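
The attached patch itself is not shown in this report. The idea discussed above (test the mount's NFS_MOUNT_NONLM flag before looking at the lock's owner) can be modeled in user space as a minimal sketch. All mock_ and MOCK_ names are hypothetical stand-ins, and the sketch assumes the reclaim path dereferences owner to compare hosts; it is not the kernel code or the committed fix.

```c
#include <assert.h>
#include <stddef.h>

/* Mock flag; stands in for the kernel's NFS_MOUNT_NONLM mount flag. */
#define MOCK_NFS_MOUNT_NONLM 0x0200

struct mock_host { int id; };

struct mock_owner { struct mock_host *host; };

struct mock_lock {
    unsigned int mount_flags;   /* stands in for NFS_SERVER(inode)->flags */
    struct mock_owner *owner;   /* stands in for fl->fl_u.nfs_fl.owner */
    int reclaim;                /* set when the lock is marked for reclaim */
};

/*
 * Mark every lock that belongs to `host` for reclaim. Locks on mounts
 * flagged NONLM are skipped before `owner` is touched, so a NULL or
 * uninitialized owner on such a lock is never dereferenced.
 * Returns the number of locks marked.
 */
int mock_mark_reclaim(struct mock_lock *locks, size_t n,
                      const struct mock_host *host)
{
    assert(locks != NULL || n == 0);

    int marked = 0;
    for (size_t i = 0; i < n; i++) {
        struct mock_lock *fl = &locks[i];

        if (fl->mount_flags & MOCK_NFS_MOUNT_NONLM)
            continue;           /* -o nolock mount: nothing to reclaim */
        if (fl->owner->host != host)
            continue;           /* lock belongs to a different server */

        fl->reclaim = 1;
        marked++;
    }
    return marked;
}
```

Checking the mount flag rather than testing owner against NULL sidesteps the question raised in the thread of whether owner is reliably NULL or merely uninitialized on -onolock mounts.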
Patch tested successfully against reproducer from c#14. No crashes were seen.

Committed in 89.11.EL. RPMS are available at http://people.redhat.com/vgoyal/rhel4/

Reproduced in 2.6.9.89.10.EL and verified in 2.6.9.89.11.EL.

An advisory has been issued which should address the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0263.html