Bug 432626 - nlm_blocked refcount leak
Summary: nlm_blocked refcount leak
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.1
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Jeff Layton
QA Contact: Martin Jenner
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-02-13 13:56 UTC by Jeff Layton
Modified: 2012-01-04 07:13 UTC
CC List: 5 users

Fixed In Version: RHBA-2008-0314
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-05-21 15:09:23 UTC
Target Upstream Version:
Embargoed:


Attachments
patch -- remove extra kref_get() from nlmsvc_grant_blocked() (912 bytes, patch)
2008-02-13 19:19 UTC, Jeff Layton


Links
System: Red Hat Product Errata
ID: RHBA-2008:0314
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Updated kernel packages for Red Hat Enterprise Linux 5.2
Last Updated: 2008-05-20 18:43:34 UTC

Description Jeff Layton 2008-02-13 13:56:37 UTC
Found while working on the use after free problem in lockd...

Do the following:

On an NFS server, take an fcntl lock on a file
On an NFSv3 client, request an fcntl lock on the same file with F_SETLKW (the request blocks)
On the server, release the lock -- the client now holds the lock
Release the lock on the client -- at this point the nlm_block should be freed
On the server, run "service nfs stop"

lockd will throw this error when coming down:

    lockd: couldn't shutdown host module!

...the above set of steps somehow causes a b_count leak for the nlm_block, which
keeps the nlm_host refcount high. More block-callback-lock attempts seem to
cause the refcount to be even higher.

This certainly leaks memory when lockd goes down, and could also be leaking
memory in other situations (that needs to be confirmed).
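
For anyone reproducing this, the lock and unlock steps above can be sketched as a small helper. This is only an illustration, not the program used in the original report: the file path is hypothetical, and fcntl.lockf() without LOCK_NB is used because it waits for the lock the same way fcntl(fd, F_SETLKW, ...) does in C.

```python
import fcntl
import os
import tempfile


def take_blocking_lock(path):
    """Open path and take an exclusive whole-file POSIX lock, waiting if needed.

    fcntl.lockf() without LOCK_NB blocks until the lock is granted,
    which corresponds to fcntl(fd, F_SETLKW, ...) in C.
    """
    # The descriptor must be open for writing to take an exclusive lock.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    fcntl.lockf(fd, fcntl.LOCK_EX)
    return fd


def release_lock(fd):
    """Drop the lock and close the descriptor."""
    fcntl.lockf(fd, fcntl.LOCK_UN)
    os.close(fd)


if __name__ == "__main__":
    # Hypothetical path: on the server this would be a file inside the
    # export; on the client, the same file seen through the NFSv3 mount.
    path = os.path.join(tempfile.gettempdir(), "nlm-lock-test")
    fd = take_blocking_lock(path)
    print("lock held")
    release_lock(fd)
    print("lock released")
```

Run on the server first and keep the lock held, then run on the client so that its request blocks; releasing on the server and then on the client completes the sequence before "service nfs stop".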

Comment 1 Jeff Layton 2008-02-13 19:15:59 UTC
It looks like this is a regression that was introduced by this patch:

RHBZ 196318: NFS byte-range locking support for cluster file systems.

...that patch added a kref_get() call to the top of nlmsvc_grant_blocked(), but
did not remove the existing kref_get() at the bottom of that function, just before
nlm_async_call() is called.

I think just removing the old kref_get() will be sufficient to fix this.
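
The effect of the doubled kref_get() can be illustrated with a toy reference-count model. This is not kernel code: ToyBlock, grant_blocked() and run() are hypothetical names, and the model assumes that exactly one reference is dropped when the asynchronous call completes, so each grant with the extra get leaves one reference behind.

```python
class ToyBlock:
    """Toy stand-in for a refcounted object such as nlm_block (not kernel code)."""

    def __init__(self):
        self.refs = 1        # reference held by whoever created the block
        self.freed = False

    def get(self):
        self.refs += 1

    def put(self):
        self.refs -= 1
        if self.refs == 0:
            self.freed = True


def grant_blocked(block, extra_get):
    """Model one grant attempt."""
    block.get()              # get taken at the top of the function
    if extra_get:
        block.get()          # older get left in place before the async call
    block.put()              # assumed: async call completion drops one reference


def run(grants, extra_get):
    """Perform some grants, then drop the creator's reference."""
    block = ToyBlock()
    for _ in range(grants):
        grant_blocked(block, extra_get)
    block.put()
    return block


if __name__ == "__main__":
    for extra in (False, True):
        b = run(3, extra)
        print("extra_get=%s refs=%d freed=%s" % (extra, b.refs, b.freed))
```

With a single get per grant the count returns to zero and the block is freed; with the extra get the count grows by one per grant, which is consistent with the refcount rising further on repeated block/callback/lock attempts.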


Comment 2 Jeff Layton 2008-02-13 19:19:41 UTC
Created attachment 294822
patch -- remove extra kref_get() from nlmsvc_grant_blocked()

Proposed patch...

Comment 3 RHEL Program Management 2008-02-13 19:59:03 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 6 Don Zickus 2008-03-12 19:41:33 UTC
in kernel-$NEW_VER
You can download this test kernel from http://people.redhat.com/dzickus/el5

Comment 7 Don Zickus 2008-03-12 20:00:11 UTC
in kernel-2.6.18-85.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Comment 10 errata-xmlrpc 2008-05-21 15:09:23 UTC
An advisory has been issued which should address the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0314.html


