Bug 243657 - [PATCH] Fix memory leak of dma_alloc_coherent() on x86_64
Summary: [PATCH] Fix memory leak of dma_alloc_coherent() on x86_64
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.5
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Assignee: Prarit Bhargava
QA Contact: Martin Jenner
URL:
Whiteboard:
Depends On:
Blocks: 282351
Reported: 2007-06-11 10:15 UTC by Masaki MAENO
Modified: 2018-10-19 22:36 UTC (History)

Fixed In Version: RHBA-2007-0791
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-11-15 16:28:28 UTC
Target Upstream Version:
Embargoed:


Attachments
dma_alloc_coherent memleak fix patch (704 bytes, patch)
2007-06-11 10:15 UTC, Masaki MAENO
memleak example (19.76 KB, image/png)
2007-06-11 10:18 UTC, Masaki MAENO
debug code (MMDEBUG) to get evidence (1.50 KB, application/octet-stream)
2007-06-11 10:35 UTC, Masaki MAENO
evidence of memleak by debug code (MMDEBUG line) (26.19 KB, application/octet-stream)
2007-06-11 10:37 UTC, Masaki MAENO
RHEL4.6 Fix for this issue (655 bytes, patch)
2007-06-13 14:26 UTC, Prarit Bhargava


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2007:0791 | 0 | normal | SHIPPED_LIVE | Updated kernel packages available for Red Hat Enterprise Linux 4 Update 6 | 2007-11-14 18:25:55 UTC

Description Masaki MAENO 2007-06-11 10:15:51 UTC
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=

[PATCH] Fix memory leak of dma_alloc_coherent() on x86_64

Description of problem:

dma_alloc_coherent() leaks memory on x86_64.

The problem is especially widespread on machines that have the HP ProLiant
Support Pack (PSP) installed, because the leaking code path is hit
frequently through cciss_ioctl(), which is called by cmaidad and cmaeventd,
two daemons that run when PSP is installed.
In one HP PSP environment, memory leaked at a rate of 100 KB/h to 15 MB/h.
The kernel eventually stopped responding and the machine rebooted.


This problem has already been fixed upstream in vanilla kernel 2.6.10.
  - ChangeLog         : http://www.kernel.org/pub/linux/kernel/v2.6/ChangeLog-2.6.10
  - VanillaKernelPatch:
http://git.kernel.org/?p=linux/kernel/git/torvalds/old-2.6-bkcvs.git;a=commitdiff;h=792b87d770df447f3e4190d2b4732a3a36800adb
  > <ak>
  > 	[PATCH] x86_64: Fallback to swiotlb for dma_alloc_coherent
  > 	
  > 	From: Suresh B Siddha
  > 	
  > 	Coresponding change to IA64 code is in, so this can be merged too.
  > 	
  > 	- fallback to swiotlb for consistent DMA mappings
  > 	- fix a memory leak in dma_alloc_coherent


I hope that Red Hat will include the attached patch in a RHEL4.5
kernel errata release.
  >diff -urN kernel-2.6.9-55.EL.org/arch/x86_64/kernel/pci-gart.c kernel-2.6.9-55.EL/arch/x86_64/kernel/pci-gart.c
  >--- kernel-2.6.9-55.EL.org/arch/x86_64/kernel/pci-gart.c      2007-06-11 17:58:08.000000000 +0900
  >+++ kernel-2.6.9-55.EL/arch/x86_64/kernel/pci-gart.c      2007-06-11 17:58:38.000000000 +0900
  >@@ -238,6 +238,7 @@
  >                        if (high) {
  >                                if (!(gfp & GFP_DMA)) {
  >                                        gfp |= GFP_DMA;
  >+                                       free_pages((unsigned long)memory, get_order(size));
  >                                        goto again;
  >                                }
  >                                goto free;


Steps to Reproduce & Actual results:

1. Install the HP ProLiant Support Pack.
2. The hpasm service starts (cmaidad and cmaeventd are running).
3. The hidden memory (*1) keeps increasing.
   (example: memleak.png (vertical axis: leaked amount [KB], horizontal axis: time [min]))
   (*1): The hidden memory is the value of "MemTotal - MemFree - MemUsage(*2)",
         computed from /proc/meminfo.
   (*2): "MemUsage" is the value of "Active + Inactive + Slab + PageTables
         + VmallocUsed" in /proc/meminfo.
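
The hidden-memory figure defined in (*1) and (*2) can be computed with a few lines of C. The following is only a minimal sketch and is not one of the attachments to this report: the function names meminfo_field_kb() and hidden_memory_kb() are hypothetical, and the code assumes the usual "Key:   value kB" line layout of /proc/meminfo.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Find one "Key:   value kB" line in /proc/meminfo-style text.
 * Returns the value in kB, or -1 if the key is missing or malformed.
 * The key must match a whole field name at the start of a line. */
long meminfo_field_kb(const char *text, const char *key)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line != NULL && *line != '\0') {
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            long value;
            if (sscanf(line + klen + 1, "%ld", &value) == 1)
                return value;
            return -1;
        }
        line = strchr(line, '\n');   /* advance to the next line */
        if (line != NULL)
            line++;
    }
    return -1;
}

/* hidden = MemTotal - MemFree
 *          - (Active + Inactive + Slab + PageTables + VmallocUsed)
 * Stores the result in *hidden and returns 0, or returns -1 if any
 * field is missing from the text. */
int hidden_memory_kb(const char *text, long *hidden)
{
    static const char *const used_keys[] = {
        "Active", "Inactive", "Slab", "PageTables", "VmallocUsed"
    };
    long total, free_kb, used = 0;
    size_t i;

    assert(text != NULL && hidden != NULL);

    total = meminfo_field_kb(text, "MemTotal");
    free_kb = meminfo_field_kb(text, "MemFree");
    if (total < 0 || free_kb < 0)
        return -1;

    for (i = 0; i < sizeof(used_keys) / sizeof(used_keys[0]); i++) {
        long v = meminfo_field_kb(text, used_keys[i]);
        if (v < 0)
            return -1;
        used += v;
    }

    *hidden = total - free_kb - used;
    return 0;
}
```

To watch for the leak, one would read /proc/meminfo into a buffer at regular intervals (for example with fopen() and fread()), pass the buffer to hidden_memory_kb(), and plot the result over time. A value that only ever grows while the system is otherwise idle is consistent with the behavior described in step 3.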

Expected results:

dma_alloc_coherent() does not leak memory.

Comment 1 Masaki MAENO 2007-06-11 10:15:51 UTC
Created attachment 156692
dma_alloc_coherent memleak fix patch

Comment 2 Masaki MAENO 2007-06-11 10:18:29 UTC
Created attachment 156693
memleak example

Comment 3 Masaki MAENO 2007-06-11 10:35:08 UTC
Created attachment 156694
debug code (MMDEBUG) to get evidence

Comment 4 Masaki MAENO 2007-06-11 10:37:18 UTC
Created attachment 156696
evidence of memleak by debug code (MMDEBUG line)

Comment 5 Masaki MAENO 2007-06-12 02:32:06 UTC
For reference, here are the conditions under which the leak occurs.

* Condition:
- Arch: x86_64
- Memory: larger than 4GB (in the cciss case)
- Function: arch/x86_64/kernel/pci-gart.c:dma_alloc_coherent()
- Detail:
    About 4KB is leaked each time the bus address of the allocated
    memory is at or above 4GB. (in the cciss case)
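
The retry path behind this condition can be modeled in userspace. The sketch below is not the kernel code: every sim_* name is hypothetical, the allocator is just a counter, and it assumes that the first attempt always lands at or above 4GB while the retry with the DMA flag lands below it. It only mirrors the control flow of the patched hunk (allocate, detect a high address, set the flag, goto again) to show why each call loses one block unless the first block is freed before retrying.

```c
#include <assert.h>
#include <stdint.h>

#define SIM_GFP_DMA  0x01u           /* hypothetical stand-in for GFP_DMA */
#define SIM_DMA_MASK 0xffffffffULL   /* device can only reach below 4GB */

struct sim_allocator {
    int outstanding;                 /* blocks allocated and not yet freed */
};

struct sim_block {
    uint64_t bus_addr;
};

/* Assumption: without SIM_GFP_DMA the block lands at 4GB, with it below. */
static void sim_alloc(struct sim_allocator *a, unsigned gfp,
                      struct sim_block *b)
{
    b->bus_addr = (gfp & SIM_GFP_DMA) ? 0x00100000ULL : 0x100000000ULL;
    a->outstanding++;
}

static void sim_free(struct sim_allocator *a, struct sim_block *b)
{
    b->bus_addr = 0;
    a->outstanding--;
}

/* Model of the retry logic. With free_before_retry == 0 the first block
 * is forgotten when jumping back to "again", as in the unpatched code. */
int sim_dma_alloc(struct sim_allocator *a, int free_before_retry,
                  struct sim_block *out)
{
    unsigned gfp = 0;

again:
    sim_alloc(a, gfp, out);
    if (out->bus_addr > SIM_DMA_MASK) {          /* "high" address */
        if (!(gfp & SIM_GFP_DMA)) {
            gfp |= SIM_GFP_DMA;
            if (free_before_retry)
                sim_free(a, out);                /* the one-line fix */
            goto again;
        }
        sim_free(a, out);
        return -1;
    }
    return 0;
}

/* Run `calls` allocate/free cycles and report how many blocks leaked. */
int sim_leaked_blocks(int free_before_retry, int calls)
{
    struct sim_allocator a = { 0 };
    int i;

    assert(calls >= 0);
    for (i = 0; i < calls; i++) {
        struct sim_block b;
        if (sim_dma_alloc(&a, free_before_retry, &b) == 0)
            sim_free(&a, &b);                    /* caller frees its block */
    }
    return a.outstanding;
}
```

In this model, sim_leaked_blocks(0, n) returns n and sim_leaked_blocks(1, n) returns 0: the caller only ever sees, and frees, the second block, so the first one is lost unless it is released before the retry.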


Comment 6 Prarit Bhargava 2007-06-13 14:19:17 UTC
I took Masaki's test code and ran it on an AMD box in Westford.  Sure enough,
there is a memory leak.  I patched the kernel with the patch above and the leak
went away.

I'm redoing the patch and will submit it to rhkernel-list.

P.

Comment 7 Prarit Bhargava 2007-06-13 14:26:23 UTC
Created attachment 156880
RHEL4.6 Fix for this issue

Comment 8 RHEL Program Management 2007-06-13 14:32:09 UTC
This request was evaluated by Red Hat Kernel Team for inclusion in a Red
Hat Enterprise Linux maintenance release, and has moved to bugzilla 
status POST.

Comment 9 Masaki MAENO 2007-06-14 08:16:14 UTC
OK. Thank you.
I hope that a new RHEL4 maintenance kernel is released soon.
 

Comment 10 Jason Baron 2007-06-19 14:09:38 UTC
Committed in stream U6 build 55.9. A test kernel with this patch is available
from http://people.redhat.com/~jbaron/rhel4/


Comment 12 Masaki MAENO 2007-06-20 02:54:01 UTC
I got your kernel-2.6.9-55.9 tree and confirmed that this patch is included.
I also confirmed that it boots and works well.
Thank you.


Comment 29 Zhang Kexin 2007-10-31 13:41:14 UTC
I don't have the right hardware to run the test (it needs a system with a
device whose driver calls dma_alloc_coherent(); an SB600 system should be OK),
so I only did a code review.

Comment 31 errata-xmlrpc 2007-11-15 16:28:28 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0791.html


