Bug 423521
| Field | Value |
|---|---|
| Summary | memory leak on size-8192 buckets with NFSV4 |
| Product | Red Hat Enterprise Linux 5 |
| Component | kernel |
| Version | 5.1 |
| Hardware | x86_64 |
| OS | Linux |
| Status | CLOSED ERRATA |
| Severity | medium |
| Priority | low |
| Reporter | Aime Le Rouzic <aime.le-rouzic> |
| Assignee | Jeff Layton <jlayton> |
| QA Contact | Martin Jenner <mjenner> |
| CC | cward, potluri, staubach, steved |
| Doc Type | Bug Fix |
| Bug Blocks | 432393 |
| Last Closed | 2009-01-20 20:23:38 UTC |
Description: Aime Le Rouzic, 2007-12-13 15:14:30 UTC
Finally got a chance to look over this today. I see a similar memory leak when
just mounting and unmounting an NFSv4 share in a loop:
# for i in `seq 1 100`; do mount /mnt/rhel4; umount /mnt/rhel4; done
...after this, the size-8192 slab gains ~100 more active objects. I'll have to
ponder how we can best track down what's actually doing these kmallocs. Maybe a
systemtap script that traps on kmalloc and does a dump_stack for any allocation
that's >4096 and <8192?
Created attachment 294150 [details]
stap script for looking at size 8192 kmallocs and their corresponding kfrees
Systemtap script to try and track this down...
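The attached script itself isn't reproduced here. Purely as an illustration of the same idea, a kretprobe module could log a backtrace for every allocation that lands in the size-8192 bucket. This is a sketch only, not the attached tool: it is written against the current kprobes API (the 2.6.18 kernel this bug was filed against uses different pt_regs field names), and the symbol name, the x86_64 argument register, and the size thresholds are assumptions.

```c
/*
 * Sketch of a kretprobe module roughly equivalent to the attached stap
 * script: log a stack trace for every __kmalloc whose requested size
 * falls in the size-8192 bucket (> 4096 and <= 8192).
 */
#include <linux/module.h>
#include <linux/kprobes.h>
#include <linux/ptrace.h>

struct alloc_info {
	size_t size;			/* requested size, captured on entry */
};

/* entry handler: remember the size argument (first arg, %rdi on x86_64) */
static int kmalloc_entry(struct kretprobe_instance *ri, struct pt_regs *regs)
{
	struct alloc_info *ai = (struct alloc_info *)ri->data;

	ai->size = regs->di;
	return 0;
}

/* return handler: if the size falls in the size-8192 bucket, dump a trace */
static int kmalloc_ret(struct kretprobe_instance *ri, struct pt_regs *regs)
{
	struct alloc_info *ai = (struct alloc_info *)ri->data;

	if (ai->size > 4096 && ai->size <= 8192) {
		printk(KERN_INFO "size = %zu, addr = 0x%lx\n",
		       ai->size, regs_return_value(regs));
		dump_stack();
	}
	return 0;
}

static struct kretprobe kmalloc_probe = {
	.kp.symbol_name	= "__kmalloc",
	.entry_handler	= kmalloc_entry,
	.handler	= kmalloc_ret,
	.data_size	= sizeof(struct alloc_info),
	.maxactive	= 64,
};

static int __init kmalloc_trace_init(void)
{
	return register_kretprobe(&kmalloc_probe);
}

static void __exit kmalloc_trace_exit(void)
{
	unregister_kretprobe(&kmalloc_probe);
}

module_init(kmalloc_trace_init);
module_exit(kmalloc_trace_exit);
MODULE_LICENSE("GPL");
```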
Created attachment 294151 [details]
output from stap script
Output from stap script. It looks like we have 6 kmallocs and 5 kfrees. The
lingering kmalloc seems to be the second one that returned 0xffff8800025b8000.
Stack trace from print_backtrace is:
size = 4120, addr = 0xffff8800025b8000
Returning from: 0xffffffff802c9899 : __kmalloc+0x0/0x9f []
Returning to : 0xffffffff802bd91a : __kzalloc+0x9/0x21 []
0xffffffff802774b8 : kretprobe_trampoline_holder+0x0/0x2 []
0xffffffff80404f07 : reqsk_queue_alloc+0x21/0x99 []
0xffffffff8042bc7b : inet_csk_listen_start+0x1a/0x135 []
0xffffffff8043c59b : inet_listen+0x42/0x68 []
0xffffffff881e2464 : svc_makesock+0x127/0x183 [sunrpc]
0xffffffff881e18b7 : svc_create+0xee/0xf8 [sunrpc]
0xffffffff883711bf : nfs_callback_up+0x9c/0x14d [nfs]
0xffffffff8834fe2f : nfs_get_client+0xfd/0x3df [nfs]
0xffffffff88350158 : nfs4_set_client+0x47/0x173 [nfs]
0xffffffff88350909 : nfs4_create_server+0x7a/0x393 [nfs]
0xffffffff8025e823 : error_exit+0x0/0x6e []
0xffffffff883573b4 : nfs_copy_user_string+0x3c/0x89 [nfs]
0xffffffff88357cdc : nfs4_get_sb+0x1fc/0x323 [nfs]
0xffffffff8020adff : get_page_from_freelist+0x32e/0x3bc []
0xffffffff802cee21 : vfs_kern_mount+0x93/0x11a []
0xffffffff802ceeea : do_kern_mount+0x36/0x4d []
0xffffffff802d855b : do_mount+0x68c/0x6fc []
0xffffffff80418c8b : __qdisc_run+0x36/0x1bb []
0xffffffff8022bf6b : local_bh_enable+0x9/0xa5 []
0xffffffff80230ebb : dev_queue_xmit+0x2f2/0x313 []
0xffffffff80233001 : ip_output+0x29a/0x2dd []
0xffffffff802628b1 : _spin_lock_irqsave+0x9/0x14 []
0xffffffff802229d4 : __up_read+0x19/0x7f []
0xffffffff802d72eb : copy_mount_options+0xce/0x127 []
0xffffffff80297b70 : search_exception_tables+0x1d/0x2d []
0xffffffff802654f6 : do_page_fault+0x10e7/0x12cc []
0xffffffff80263786 : do_debug+0x70/0x151 []
0xffffffff8020b663 : kfree+0x0/0xc5 []
0xffffffff80263f9d : kprobe_handler+0x1ac/0x1c8 []
0xffffffff80263ff4 : kprobe_exceptions_notify+0x3b/0x75 []
0xffffffff802656fb : notifier_call_chain+0x20/0x32 []
0xffffffff8020acc5 : get_page_from_freelist+0x1f4/0x3bc []
0xffffffff8020f2d0 : __alloc_pages+0x65/0x2ce []
0xffffffff8024c3cf : sys_mount+0x8a/0xcd []
0xffffffff8025e2f1 : tracesys+0xa7/0xb2 []
I think there's a lot of garbage in that backtrace, but it gives me some idea
of where to look...
It looks like we're calling svc_setup_socket to create a socket for the nfs4 callback thread, but I don't see where that gets torn down. I suspect that's where the problem is, but I need to look a bit more closely.

The problem seems to be with the sv_nrthreads count for the callback thread. It's at 2 when we do the umount:

RPC: svc_destroy(NFSv4 callback, 2)

...so it doesn't actually tear down the socket or the svc_serv. nfsd and lockd also use those functions, and when they go down their refcounts seem to be OK...

This appears to be an upstream bug too. On a rawhide machine after mounting and unmounting:

svc: svc_destroy(NFSv4 callback, 2)

...and I don't see where the socket got torn down.

I think I see the problem: svc_create starts the service with sv_nrthreads == 1, and svc_create_thread then increments that count. nfs_callback_up() isn't handling this correctly. It should be calling svc_destroy() on both success and failure, but it isn't. As an example, lockd_up_proto() handles this correctly. I'll post a patch here soon that I can propose upstream to fix this.

Created attachment 294563 [details]
patch -- fix reference counting for NFS4 callback thread
This patch seems to fix the problem on rawhide. Backporting to RHEL5 and RHEL4
should be trivial. I'm assuming RHEL4 has this problem too, though I need to
check. I'll clone this BZ if so.
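For illustration, here is a rough sketch of the startup pattern described above, modeled on the sunrpc API of that era. It is not the attached patch, and the error handling and surrounding locking are assumptions; the point is only that svc_create() hands back the serv with sv_nrthreads == 1 and svc_create_thread() takes its own reference, so nfs_callback_up() must drop its initial reference with svc_destroy() on both the success and failure paths, the way lockd does.

```c
/*
 * Illustrative sketch only (not the attached patch), as it might sit in
 * fs/nfs/callback.c: balance the reference handed out by svc_create().
 */
int nfs_callback_up(void)
{
	struct svc_serv *serv;
	int ret;

	serv = svc_create(&nfs4_callback_program, NFS4_CALLBACK_BUFSIZE, NULL);
	if (!serv)
		return -ENOMEM;

	ret = svc_makesock(serv, IPPROTO_TCP, nfs_callback_set_tcpport);
	if (ret < 0)
		goto out;

	ret = svc_create_thread(nfs_callback_svc, serv); /* takes its own ref */
out:
	/*
	 * Whether or not the thread started, drop the reference taken by
	 * svc_create().  On success the thread's own reference keeps the
	 * serv (and its callback socket) alive; on failure this frees
	 * them, so nothing is left behind in the size-8192 slab.
	 */
	svc_destroy(serv);
	return ret;
}
```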
Patch posted upstream. Awaiting comment...

Looks like Trond applied the patch, so I'll plan to propose a similar one for RHEL5. It's a bit too late for RHEL 5.2, but I'll try to make sure we get something into 5.3. I'll also pull this patch into my test kernels; once I do, I'll post a note here so that you can test them.

Created attachment 295296 [details]
patch -- flush signals before taking down callback thread
Peter noticed that this seems to expose another problem with the callback
thread: it doesn't flush signals on shutdown, which makes the portmap
unregistration fail and throw an error. This patch seems to fix it, but he's
currently chasing an NFS-related deadlock and I'd like to understand that
before I send this upstream.
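As a rough sketch of the fix being described (not the attached patch; the exact placement in the callback thread is an assumption): the thread is brought down with a signal, and if that signal is left pending, the synchronous RPC that unregisters the callback service with the portmapper is aborted immediately. Flushing the signal before tearing the thread down lets the unregistration go through.

```c
	/* sketch: tail of the callback thread, after its svc_recv() loop exits */

	/*
	 * The thread was told to exit via a signal.  Flush it before
	 * svc_exit_thread(); otherwise the portmap unregistration done
	 * during the final svc_destroy() is aborted by the pending
	 * signal and logs an error.
	 */
	flush_signals(current);
	svc_exit_thread(rqstp);
```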
Created attachment 302874 [details]
unified patch
This incorporates both of the patches that went upstream.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

in kernel-2.6.18-98.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Partners, this bug should be fixed in the latest RHEL 5.3 Snapshot. We believe that you have some interest in its correct functionality, so we're making a friendly request to send us some testing feedback. If you have a chance to test it, please share your findings with us. If you have successfully VERIFIED the fix, please add PartnerVerified to the Bugzilla keywords, along with a description of the results. Thanks!

~~ Snapshot 6 is out ~~
Partners, please test and let us know if this bug has been fixed. Add the PartnerVerified keyword if everything works as expected. For any new issues encountered, CLONE this bug and report the issues in the new bug.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-0225.html

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days.