Bug 243657 - [PATCH] Fix memory leak of dma_alloc_coherent() on x86_64
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.5
Hardware: x86_64 Linux
Priority: urgent
Severity: urgent
Assigned To: Prarit Bhargava
QA Contact: Martin Jenner
Keywords: ZStream
Depends On:
Blocks: 282351
Reported: 2007-06-11 06:15 EDT by Masaki MAENO
Modified: 2010-10-22 11:34 EDT (History)
Fixed In Version: RHBA-2007-0791
Doc Type: Bug Fix
Last Closed: 2007-11-15 11:28:28 EST


Attachments
dma_alloc_coherent memleak fix patch (704 bytes, patch)
2007-06-11 06:15 EDT, Masaki MAENO
memleak example (19.76 KB, image/png)
2007-06-11 06:18 EDT, Masaki MAENO
debug code (MMDEBUG) to get evidence (1.50 KB, application/octet-stream)
2007-06-11 06:35 EDT, Masaki MAENO
evidence of memleak by debug code (MMDEBUG line) (26.19 KB, application/octet-stream)
2007-06-11 06:37 EDT, Masaki MAENO
RHEL4.6 Fix for this issue (655 bytes, patch)
2007-06-13 10:26 EDT, Prarit Bhargava

Description Masaki MAENO 2007-06-11 06:15:51 EDT
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=

[PATCH] Fix memory leak of dma_alloc_coherent() on x86_64

Description of problem:

dma_alloc_coherent() leaks memory on x86_64.

The problem is especially severe on machines with the HP ProLiant
Support Pack (PSP) installed, because cmaidad and cmaeventd, which run
when PSP is installed, frequently hit the leaking path through
cciss_ioctl().
In one HP PSP environment the leak grew at a rate of 100KB/h to
15MB/h. The kernel eventually stopped responding and rebooted.


This problem is already fixed in vanilla kernel 2.6.10.
  - ChangeLog         : http://www.kernel.org/pub/linux/kernel/v2.6/ChangeLog-2.6.10
  - VanillaKernelPatch:
http://git.kernel.org/?p=linux/kernel/git/torvalds/old-2.6-bkcvs.git;a=commitdiff;h=792b87d770df447f3e4190d2b4732a3a36800adb
  > <ak@suse.de>
  > 	[PATCH] x86_64: Fallback to swiotlb for dma_alloc_coherent
  > 	
  > 	From: Suresh B Siddha
  > 	
  > 	Coresponding change to IA64 code is in, so this can be merged too.
  > 	
  > 	- fallback to swiotlb for consistent DMA mappings
  > 	- fix a memory leak in dma_alloc_coherent


I would like Red Hat to include the attached patch in a RHEL4.5
kernel errata release.
  >diff -urN kernel-2.6.9-55.EL.org/arch/x86_64/kernel/pci-gart.c kernel-2.6.9-55.EL/arch/x86_64/kernel/pci-gart.c
  >--- kernel-2.6.9-55.EL.org/arch/x86_64/kernel/pci-gart.c      2007-06-11 17:58:08.000000000 +0900
  >+++ kernel-2.6.9-55.EL/arch/x86_64/kernel/pci-gart.c      2007-06-11 17:58:38.000000000 +0900
  >@@ -238,6 +238,7 @@
  >                        if (high) {
  >                                if (!(gfp & GFP_DMA)) {
  >                                        gfp |= GFP_DMA;
  >+                                       free_pages((unsigned long)memory, get_order(size));
  >                                        goto again;
  >                                }
  >                                goto free;
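
The leak pattern that the patch addresses can be illustrated outside the kernel. The following is only a user-space sketch, not the pci-gart.c code: sim_alloc(), sim_free(), sim_live() and sim_dma_alloc() are hypothetical names, a small static slot table stands in for the page allocator, and the "address too high" condition is simply assumed to be true for the first allocation.

```c
#include <stddef.h>

/* Hypothetical stand-in for the page allocator: a tiny static slot table. */
#define SIM_SLOTS 8
static unsigned char sim_pool[SIM_SLOTS][16];
static int sim_used[SIM_SLOTS];

/* Hand out the first unused slot, or NULL if all slots are taken. */
static void *sim_alloc(void)
{
    for (int i = 0; i < SIM_SLOTS; i++) {
        if (!sim_used[i]) {
            sim_used[i] = 1;
            return sim_pool[i];
        }
    }
    return NULL;
}

/* Mark the slot that p points to as unused again. */
static void sim_free(void *p)
{
    for (int i = 0; i < SIM_SLOTS; i++) {
        if ((void *)sim_pool[i] == p)
            sim_used[i] = 0;
    }
}

/* Number of slots currently allocated. */
static int sim_live(void)
{
    int n = 0;
    for (int i = 0; i < SIM_SLOTS; i++)
        n += sim_used[i];
    return n;
}

/*
 * Simulated retry loop. first_is_high models "the first allocation landed
 * above the device's DMA mask" (an assumption of this sketch).
 * free_before_retry == 0 reproduces the leak, 1 mirrors the patch.
 */
static void *sim_dma_alloc(int first_is_high, int free_before_retry)
{
    int use_dma_zone = 0;   /* plays the role of the GFP_DMA flag */
    void *memory;

again:
    memory = sim_alloc();
    if (memory == NULL)
        return NULL;

    if (first_is_high && !use_dma_zone) {
        use_dma_zone = 1;
        if (free_before_retry)
            sim_free(memory);   /* the line the patch adds */
        goto again;             /* otherwise 'memory' is overwritten and lost */
    }
    return memory;
}
```

Calling sim_dma_alloc(1, 0) and then freeing the returned pointer leaves one slot allocated, which is the leaked block; with sim_dma_alloc(1, 1) nothing stays behind.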


Steps to Reproduce & Actual results:

1. Install the HP ProLiant Support Pack.
2. The hpasm service runs. (cmaidad and cmaeventd are active.)
3. The hidden memory (*1) keeps increasing.
   (example: memleak.png (vertical axis: leaked memory [KB], horizontal axis: time [min]))
   (*1): The hidden memory is the value of "MemTotal - MemFree - MemUsage(*2)"
         from /proc/meminfo.
   (*2): "MemUsage" is the value of "Active + Inactive + Slab + PageTables
         + VmallocUsed" from /proc/meminfo.
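
The hidden-memory formula in (*1) and (*2) can be computed from the text of /proc/meminfo. This is a minimal sketch under stated assumptions: meminfo_field() and hidden_memory_kb() are hypothetical helper names, the caller has already read /proc/meminfo into a NUL-terminated buffer, and every field is reported in kB.

```c
#include <stdio.h>
#include <string.h>

/* Return the kB value of one /proc/meminfo field, or 0 if it is absent. */
static long meminfo_field(const char *text, const char *name)
{
    size_t len = strlen(name);
    const char *line = text;

    while (line != NULL && *line != '\0') {
        /* Require "name:" exactly, so "Active" does not match "Active(anon)". */
        if (strncmp(line, name, len) == 0 && line[len] == ':') {
            long kb = 0;
            if (sscanf(line + len + 1, "%ld", &kb) == 1)
                return kb;
            return 0;
        }
        line = strchr(line, '\n');
        if (line != NULL)
            line++;             /* step past the newline to the next line */
    }
    return 0;
}

/* Hidden memory = MemTotal - MemFree - MemUsage, all in kB. */
static long hidden_memory_kb(const char *text)
{
    long usage = meminfo_field(text, "Active")
               + meminfo_field(text, "Inactive")
               + meminfo_field(text, "Slab")
               + meminfo_field(text, "PageTables")
               + meminfo_field(text, "VmallocUsed");

    return meminfo_field(text, "MemTotal")
         - meminfo_field(text, "MemFree")
         - usage;
}
```

Sampling hidden_memory_kb() at regular intervals and plotting the result against time should give a curve comparable to the one in memleak.png.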

Expected results:

dma_alloc_coherent() does not leak memory.
Comment 1 Masaki MAENO 2007-06-11 06:15:51 EDT
Created attachment 156692
dma_alloc_coherent memleak fix patch
Comment 2 Masaki MAENO 2007-06-11 06:18:29 EDT
Created attachment 156693
memleak example
Comment 3 Masaki MAENO 2007-06-11 06:35:08 EDT
Created attachment 156694
debug code (MMDEBUG) to get evidence
Comment 4 Masaki MAENO 2007-06-11 06:37:18 EDT
Created attachment 156696
evidence of memleak by debug code (MMDEBUG line)
Comment 5 Masaki MAENO 2007-06-11 22:32:06 EDT
For reference, these are the conditions under which the leak occurs.

* Condition:
- Arch: x86_64
- Memory: larger than 4GB (if cciss)
- Function: arch/x86_64/kernel/pci-gart.c:dma_alloc_coherent()
- Detail:
    4KB is leaked each time the bus address of the allocated memory
    is 4GB or above. (if cciss)
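
As a rough cross-check of these conditions against the leak rates given in the description, the sketch below tests the 4GB boundary and converts a leak rate into a number of leaked allocations per hour. It assumes 4KB per leaked allocation, as stated above; bus_addr_is_high() and leaks_per_hour() are hypothetical helpers, not kernel functions.

```c
#include <stdint.h>

#define FOUR_GB (1ULL << 32)

/* Nonzero when a bus address is at or above 4GB, the trigger named above. */
static int bus_addr_is_high(uint64_t bus_addr)
{
    return bus_addr >= FOUR_GB;
}

/* Leaked allocations per hour implied by a leak rate, given the size of one leak. */
static unsigned long leaks_per_hour(unsigned long rate_kb_per_hour,
                                    unsigned long kb_per_leak)
{
    if (kb_per_leak == 0)
        return 0;               /* avoid dividing by zero on bad input */
    return rate_kb_per_hour / kb_per_leak;
}
```

Under that assumption, 100KB/h corresponds to 25 leaked allocations per hour and 15MB/h (15360KB/h) to 3840 per hour.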
Comment 6 Prarit Bhargava 2007-06-13 10:19:17 EDT
I took Masaki's testcode and ran it on an AMD box in Westford.  Sure enough,
there is a memory leak.  I patched the kernel with the patch above and the leak
was solved.

I'm redoing the patch and will submit it to rhkernel-list.

P.
Comment 7 Prarit Bhargava 2007-06-13 10:26:23 EDT
Created attachment 156880
RHEL4.6 Fix for this issue
Comment 8 RHEL Product and Program Management 2007-06-13 10:32:09 EDT
This request was evaluated by Red Hat Kernel Team for inclusion in a Red
Hat Enterprise Linux maintenance release, and has moved to bugzilla 
status POST.
Comment 9 Masaki MAENO 2007-06-14 04:16:14 EDT
OK. Thank you.
I hope the new RHEL4 maintenance kernel is released soon.
Comment 10 Jason Baron 2007-06-19 10:09:38 EDT
committed in stream U6 build 55.9. A test kernel with this patch is available
from http://people.redhat.com/~jbaron/rhel4/
Comment 12 Masaki MAENO 2007-06-19 22:54:01 EDT
I got your kernel-2.6.9-55.9 tree and confirmed that it includes this patch.
I also confirmed that it boots and works well.
Thank you.
Comment 29 Zhang Kexin 2007-10-31 09:41:14 EDT
I don't have suitable hardware to run the test (it needs a system with a device
that calls dma_alloc_coherent; an SB600 system should be OK), so I only did a
code review.
Comment 31 errata-xmlrpc 2007-11-15 11:28:28 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0791.html
