Bug 432867 - memory corruption due to portmap call succeeding after parent rpc_clnt has been freed
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
5.1
All Linux
low Severity low
: rc
: ---
Assigned To: Jeff Layton
Martin Jenner
:
Depends On:
Blocks: 448076
Reported: 2008-02-14 15:37 EST by Jeff Layton
Modified: 2010-03-14 17:30 EDT (History)
2 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2009-01-20 15:23:46 EST
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
patch -- add new reference counted response container for portmap calls (5.55 KB, patch)
2008-04-04 14:56 EDT, Jeff Layton
no flags Details | Diff
SUNRPC: prevent memory corruption from successful portmap call with dead parent task (5.45 KB, patch)
2008-05-22 07:26 EDT, Jeff Layton
no flags Details | Diff
patch -- SUNRPC: prevent memory corruption from successful portmap call with dead parent task (4.57 KB, patch)
2008-05-22 11:33 EDT, Jeff Layton
no flags Details | Diff
SUNRPC: prevent memory corruption from successful portmap call with dead parent task (5.47 KB, patch)
2008-05-23 06:38 EDT, Jeff Layton
no flags Details | Diff
small test program to set a blocking lock on a file (521 bytes, text/plain)
2008-05-23 06:59 EDT, Jeff Layton
no flags Details
patch -- SUNRPC: prevent memory corruption from successful portmap call with dead parent task (5.70 KB, patch)
2008-07-22 13:29 EDT, Jeff Layton
no flags Details | Diff

Description Jeff Layton 2008-02-14 15:37:53 EST
While working on bug 254195 yesterday, I hit the following on a RHEL5 debug kernel:

Feb 13 15:14:26 dhcp231-179 kernel: Slab corruption: (Tainted: GF    ) start=ffff81000eebf2b0, len=512
Feb 13 15:14:26 dhcp231-179 kernel: Redzone: 0x5a2cf071/0x5a2cf071.
Feb 13 15:14:26 dhcp231-179 kernel: Last user: [<ffffffff8830cca4>](rpc_destroy_client+0xc6/0xcc [sunrpc])
Feb 13 15:14:26 dhcp231-179 kernel: 
Feb 13 15:14:26 dhcp231-179 kernel: Call Trace:
Feb 13 15:14:26 dhcp231-179 kernel:  <IRQ>  [<ffffffff80007137>] check_poison_obj+0x76/0x1d1
Feb 13 15:14:26 dhcp231-179 kernel:  [<ffffffff880e1bc4>] :8139too:rtl8139_poll+0x1f6/0x424
Feb 13 15:14:26 dhcp231-179 kernel:  [<ffffffff880e1bc4>] :8139too:rtl8139_poll+0x1f6/0x424
Feb 13 15:14:26 dhcp231-179 kernel:  [<ffffffff8000cae2>] cache_alloc_debugcheck_after+0x30/0x1c1
Feb 13 15:14:26 dhcp231-179 kernel:  [<ffffffff80011374>] __kmalloc_track_caller+0x11b/0x127
Feb 13 15:14:26 dhcp231-179 kernel:  [<ffffffff8002fbbe>] __alloc_skb+0x61/0x128
Feb 13 15:14:26 dhcp231-179 kernel:  [<ffffffff880e1a13>] :8139too:rtl8139_poll+0x45/0x424
Feb 13 15:14:26 dhcp231-179 kernel:  [<ffffffff880e1bc4>] :8139too:rtl8139_poll+0x1f6/0x424
Feb 13 15:14:26 dhcp231-179 kernel:  [<ffffffff8000cdcc>] net_rx_action+0xa9/0x1bb
Feb 13 15:14:26 dhcp231-179 kernel:  [<ffffffff80012983>] __do_softirq+0x67/0xf3
Feb 13 15:14:26 dhcp231-179 kernel:  [<ffffffff8006839f>] _spin_unlock+0x17/0x20
Feb 13 15:14:26 dhcp231-179 kernel:  [<ffffffff800613d0>] call_softirq+0x1c/0x28
Feb 13 15:14:26 dhcp231-179 kernel:  [<ffffffff800704dc>] do_softirq+0x35/0xa0
Feb 13 15:14:26 dhcp231-179 kernel:  [<ffffffff80070642>] do_IRQ+0xfb/0x104
Feb 13 15:14:26 dhcp231-179 kernel:  [<ffffffff8006ecef>] default_idle+0x0/0x69
Feb 13 15:14:26 dhcp231-179 kernel:  [<ffffffff80060652>] ret_from_intr+0x0/0xf
Feb 13 15:14:26 dhcp231-179 kernel:  <EOI>  [<ffffffff8006ed2b>] default_idle+0x3c/0x69
Feb 13 15:14:26 dhcp231-179 kernel:  [<ffffffff8006ed29>] default_idle+0x3a/0x69
Feb 13 15:14:26 dhcp231-179 kernel:  [<ffffffff8004b278>] cpu_idle+0x9a/0xbd
Feb 13 15:14:26 dhcp231-179 kernel:  [<ffffffff80446824>] start_kernel+0x243/0x248
Feb 13 15:14:26 dhcp231-179 kernel:  [<ffffffff8044622f>] _sinittext+0x22f/0x236
Feb 13 15:14:26 dhcp231-179 kernel: 
Feb 13 15:14:26 dhcp231-179 kernel: 130: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 00 00 6b 6b
Feb 13 15:14:26 dhcp231-179 kernel: Prev obj: start=ffff81000eebf098, len=512
Feb 13 15:14:26 dhcp231-179 kernel: Redzone: 0x170fc2a5/0x170fc2a5.
Feb 13 15:14:26 dhcp231-179 kernel: Last user: [<ffffffff883f1905>](nlm_lookup_file+0x115/0x210 [lockd])
Feb 13 15:14:26 dhcp231-179 kernel: 000: 00 00 00 00 00 00 00 00 10 00 01 00 01 01 00 00
Feb 13 15:14:26 dhcp231-179 kernel: 010: 00 00 dd cb 1a 00 c2 4c 12 9e 00 00 00 00 00 00
Feb 13 15:14:26 dhcp231-179 kernel: Next obj: start=ffff81000eebf4c8, len=512
Feb 13 15:14:26 dhcp231-179 kernel: Redzone: 0x5a2cf071/0x5a2cf071.
Feb 13 15:14:26 dhcp231-179 kernel: Last user: [<ffffffff80040e85>](do_execve+0x233/0x243)
Feb 13 15:14:26 dhcp231-179 kernel: 000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Feb 13 15:14:26 dhcp231-179 kernel: 010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b

It's hard to confirm since this is memory corruption, but I suspect the problem
is this:

...the patch that I have for bug 254195 calls rpc_shutdown_client on the NLM
rpc_client when taking down lockd. If however, there is a portmap call still
pending then that is *not* taken down. When the pmap call completes, it can end
up setting the cl_port on the already freed clnt. If that memory has already
been reallocated then this is a memory corruptor.

There were some patches that went upstream to fix this by making the portmap
call take a reference to the xprt (and maybe also moving the cl_port field into
that structure). That should fix this problem, so I'll plan to dig those up and
also to see if we can get a consistent reproducer somehow.
Comment 1 Jeff Layton 2008-02-14 15:42:51 EST
Hmmm...except that those patches look to be fairly invasive and probably not
kabi-friendly. We may have to fix this by doing something different than how
upstream did it...
Comment 2 Jeff Layton 2008-02-15 09:57:38 EST
A quick check indicates that RHEL4 has the same problem. Fixing this isn't going
to be trivial. The best method I can see is that when we spawn a new portmap
call, we allocate a small memory region with a refcounter (maybe even a kref)
and a field for the port number. The main rpc task and the portmap task will
each hold a reference. The portmap task will release the reference when it gets
the port (or an error), and the main task will release the ref when the portmap
call completes (or the task is torn down).
Comment 3 Jeff Layton 2008-04-04 14:56:57 EDT
Created attachment 301324 [details]
patch -- add new reference counted response container for portmap calls

First pass at a patch to fix this problem. It seems to work, but I'm not sure
I'm cleaning up everything correctly if the child RPC task exits abnormally
(times out, etc).

The patch adds a new struct that's managed with a kref. Both the parent and
child RPC clients will have a reference to it. The intent is to drop this
reference after the child task copies the port number into the container from
the RPC response. The parent will drop the reference when it copies the result
from the container into the cl_port field. We'll also need to drop the
reference if the tasks are killed.

It's this last part that I'm not sure I have correct, but I think I'm pretty
close.

I've also done my best to preserve kabi, but it may need more massaging to
prevent breakage.
Comment 4 Jeff Layton 2008-05-22 07:26:16 EDT
Created attachment 306361 [details]
SUNRPC: prevent memory corruption from successful portmap call with dead parent task

Updated patch. I believe this one gets the refcounting right when the task is
killed.
Comment 5 Jeff Layton 2008-05-22 11:33:02 EDT
Created attachment 306393 [details]
patch -- SUNRPC: prevent memory corruption from successful portmap call with dead parent task

Oof -- the last patch was wrongwrongwrong...

This patch does the right thing. It's also smaller.
Comment 6 Jeff Layton 2008-05-22 12:35:20 EDT
Actually, the original patch was right. We have to do it this way (hook the put
in rpc_destroy_client), so that the child task does a put regardless of how the
RPC task exits.
Comment 7 Jeff Layton 2008-05-23 06:38:27 EDT
Created attachment 306475 [details]
SUNRPC: prevent memory corruption from successful portmap call with dead parent task

Small revision. Since there can be several portmap calls in flight at the same
time, it's possible that the cl_port will be non-zero when pmap_getport_done
runs. If a portmap call succeeds, and then a following one fails we don't want
to clobber the port we got from the earlier call, so check that cl_port is
zero before copying the port from the container.
Comment 8 Jeff Layton 2008-05-23 06:59:04 EDT
Created attachment 306479 [details]
small test program to set a blocking lock on a file

To reproduce the memory corruption:

1) set up an NFS server on a machine booted to a kernel with memory poisoning
enabled. The kernel-debug packages have this enabled already. It'll also be
easier to reproduce this if the server's kernel has the patches for bug 254195,
since that helps guarantee that the parent RPC task is killed when NFS is
stopped. After starting the nfs service, wait 90s to make sure you're past the
server's grace period.

2) mount an export from this server on another host (any RHEL client will do). 


3) use the test program to set a lock on a file on the server:

    server# lockit /export/testfile
    hit return to exit

4) do a blocking lock of the same file on the client (make sure you're past the
server's lock grace period):

    client# lockit /mnt/nfs/testfile

(you'll get no output since the program is blocked waiting on the lock)

5) on the client, set up iptables rules to drop all traffic to and from the
server to simulate a network partition:

    client# iptables -A INPUT -s <server addr> -j DROP 
    client# iptables -A OUTPUT -d <server addr> -j DROP 

6) release the lock on the server (hit return), and then run:

    server# service nfs stop

...on the server to bring down nfs and kill off lockd.

7) unpartition the network:

    client# service iptables stop

...the last 2 steps need to be timed right so that the portmap calls are still
being reattempted, but the parent RPC callback is dead.

When you remove the network partition, the portmap call succeeds and the port
is copied into the parent RPC task. Since the parent RPC task should be dead
at this point, however, the result is copied into an rpc_clnt struct that has
already been freed.

The results of this are not always evident right away. The corruption will not
be noticed until the memory is reused for other purposes.
Comment 10 RHEL Product and Program Management 2008-05-23 08:48:24 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 12 Jeff Layton 2008-07-22 13:29:36 EDT
Created attachment 312372 [details]
patch -- SUNRPC: prevent memory corruption from successful portmap call with dead parent task 

Updated patch -- make sure we zero out new->cl_pmap_result in rpc_clone_client.
Comment 13 Don Zickus 2008-07-23 14:54:44 EDT
in kernel-2.6.18-99.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5
Comment 15 Ryan Lerch 2008-11-06 01:41:19 EST
This bug has been marked for inclusion in the Red Hat Enterprise Linux 5.3
Release Notes.

To aid in the development of relevant and accurate release notes, please fill
out the "Release Notes" field above with the following 4 pieces of information:


Cause:   What actions or circumstances cause this bug to present.

Consequence:  What happens when the bug presents.

Fix:   What was done to fix the bug.

Result:  What now happens when the actions or circumstances above occur. (NB:
this is not the same as 'the bug doesn't present anymore')
Comment 19 errata-xmlrpc 2009-01-20 15:23:46 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-0225.html
