Bug 146789 - Implement a better solution to the dma memory allocation done in the kernel
Summary: Implement a better solution to the dma memory allocation done in the kernel
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: ia32e
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Larry Woodman
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks: RHEL3U8CanFix
Reported: 2005-02-01 17:16 UTC by Joshua Giles
Modified: 2007-11-30 22:07 UTC (History)

Fixed In Version: RHSA-2006-0437
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-07-20 13:18:23 UTC
Target Upstream Version:
Embargoed:


Attachments
This is the x86_64-swiotlb.patch that is being referred to here. (2.83 KB, patch)
2005-07-25 20:15 UTC, Larry Woodman


Links
System: Red Hat Product Errata
ID: RHSA-2006:0437
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Updated kernel packages for Red Hat Enterprise Linux 3 Update 8
Last Updated: 2006-07-20 13:11:00 UTC

Description Joshua Giles 2005-02-01 17:16:27 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.1.4322)

Description of problem:
Implement a better solution to the DMA memory allocation done in the 
kernel when a 32-bit device is used on a 64-bit extended architecture 
OS.  Basically, pci_alloc_consistent will have to change.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1.  Put a 32-bit device (a SCSI or RAID device, for example) on RHEL 3 x86_64
2.  Add stress and wait for a kernel panic

Additional info:
We would like to see this fixed for U5.

Comment 1 Joshua Giles 2005-02-01 17:20:21 UTC
A proposed fix from Matt Domsch:
"linux-2.6.9-5.EL/arch/x86_64/kernel/pci-nommu.c:

/*
 * Dummy IO MMU functions
 */

void *dma_alloc_coherent(struct device *hwdev, size_t size,
                         dma_addr_t *dma_handle, unsigned gfp)
{
        void *ret;
        u64 mask;
        int order = get_order(size);

        if (hwdev)
                mask = hwdev->coherent_dma_mask & *hwdev->dma_mask;
        else
                mask = 0xffffffff;
        for (;;) {
                ret = (void *)__get_free_pages(gfp, order);
                if (ret == NULL)
                        return NULL;
                *dma_handle = virt_to_bus(ret);
                if ((*dma_handle & ~mask) == 0)
                        break;
                free_pages((unsigned long)ret, order);
                if (gfp & GFP_DMA)
                        return NULL;
                gfp |= GFP_DMA;
        }

        memset(ret, 0, size);
        return ret;
}


So RHEL4 doesn't have quite the same restriction.  When it allocates 
a page, if that page isn't DMA-able by the device, then it frees it 
up and tries again, using the GFP_DMA flag this time.  Because 
generally there are *some* pages available in ZONE_NORMAL that are 
below the 4GB address limit, this works quite well in practice.

The same idea could/should be backported to RHEL3, it's certainly not 
been done yet for the RHEL3 U4 or earlier kernels."
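
For readers who want to experiment with the retry logic quoted above outside the kernel, here is a minimal user-space sketch in C. It is an illustration only, not the RHEL3 or RHEL4 code: the sim_* names, the SIM_GFP_* flag values, and the toy two-zone allocator are all hypothetical stand-ins invented for this example. It models only the control flow described in the quote: allocate, test the address against the device mask, free and retry once from the DMA zone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values for this sketch; not the kernel's GFP flags. */
#define SIM_GFP_KERNEL 0x1u
#define SIM_GFP_DMA    0x2u

/* Toy model of two memory zones. Each field is the fake bus address the
 * next allocation from that zone returns; 0 means the zone is exhausted. */
struct sim_zones {
    uint64_t normal_addr; /* may lie above 4GB */
    uint64_t dma_addr;    /* assumed to lie low in memory */
    int frees;            /* how many allocations were handed back */
};

/* Stand-in for __get_free_pages(): pick a zone based on the flags. */
static uint64_t sim_get_pages(const struct sim_zones *z, unsigned flags)
{
    return (flags & SIM_GFP_DMA) ? z->dma_addr : z->normal_addr;
}

/* Stand-in for free_pages(): just count the call. */
static void sim_free_pages(struct sim_zones *z, uint64_t addr)
{
    (void)addr;
    z->frees++;
}

/* Same control flow as the quoted loop: accept the first address the
 * device can reach; otherwise free it and retry once with the DMA flag.
 * Returns 0 and stores the address on success, -1 on failure. */
static int sim_alloc_coherent(struct sim_zones *z, uint64_t mask,
                              unsigned flags, uint64_t *dma_handle)
{
    assert(z != NULL && dma_handle != NULL);
    for (;;) {
        uint64_t addr = sim_get_pages(z, flags);
        if (addr == 0)
            return -1;               /* allocator is out of pages */
        if ((addr & ~mask) == 0) {   /* no address bits above the mask */
            *dma_handle = addr;
            return 0;
        }
        sim_free_pages(z, addr);     /* device cannot reach this page */
        if (flags & SIM_GFP_DMA)
            return -1;               /* already tried the DMA zone */
        flags |= SIM_GFP_DMA;
    }
}
```

With a 32-bit mask (0xffffffff), a normal-zone address such as 0x180000000 fails the mask test, is freed, and the second pass takes the DMA-zone address instead. A normal-zone address below 4GB is accepted on the first pass with no retry.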

Comment 6 Marty Wesley 2005-05-26 06:46:19 UTC
PM ACK for U6

Comment 10 Larry Woodman 2005-06-10 20:00:28 UTC
Development ACK.  We are waiting for Emulex to verify this works OK for them.

Larry


Comment 23 Larry Woodman 2005-07-25 20:15:11 UTC
Created attachment 117134
This is the x86_64-swiotlb.patch that is being referred to here.

Comment 27 Ernie Petrides 2005-07-27 23:23:01 UTC
Larry Troan, regarding comment #25, this is a RHEL3 bug.  Why is building
the patch into a RHEL4 kernel relevant?

Comment 28 Issue Tracker 2005-07-28 18:05:07 UTC
From User-Agent: XML-RPC

Per Matt....

Finger check.... kernel-2.4.21-32.EL.smp   RHEL3....


This event sent from IssueTracker by ltroan
 issue 73055

Comment 32 Ernie Petrides 2005-08-09 18:00:46 UTC
U6 is closed (and in beta already).

Comment 35 Larry Troan 2005-08-31 14:37:57 UTC
This bug apparently is not a DUP of Bug 146954 (Engineering's call) but is tied
to it. 

It is believed that there is a common patch which will resolve both the problems
described here and those described in bug 146954. Both bugs are now public.

Comment 38 Larry Troan 2005-09-19 19:53:56 UTC
Joshua, per Matt's email, can we close this bug since Dell no longer requires a
solution to this problem? 

Comment 39 Larry Troan 2005-09-19 20:13:10 UTC
Clarifying: Dell's solution is for customers experiencing this problem to
migrate to RHEL4. 

Comment 40 Ernie Petrides 2006-04-20 01:20:18 UTC
A fix for this problem has just been committed to the RHEL3 U8
patch pool this evening (in kernel version 2.4.21-40.7.EL).


Comment 42 Joshua Giles 2006-05-30 15:56:56 UTC
Closing per customer feedback.
*A patch was included that adds the "maxdma" option, which works around this
problem.

Comment 43 Ernie Petrides 2006-05-30 20:23:02 UTC
Reverting to ON_QA after reopening.

Comment 45 Red Hat Bugzilla 2006-07-20 13:18:23 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0437.html


