Bug 455813

Summary: Under heavy memory usage dma_alloc_coherent does not return aligned address
Product: Red Hat Enterprise Linux 5
Reporter: Prarit Bhargava <prarit>
Component: kernel
Assignee: Prarit Bhargava <prarit>
Status: CLOSED ERRATA
QA Contact: Martin Jenner <mjenner>
Severity: medium
Priority: high
Version: 5.3
CC: dzickus, mgahagan, reich
Target Milestone: rc
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2009-01-20 20:03:20 UTC
Attachments (description, flags):
312088: RHEL5 fix for this issue (flags: none)
312089: Upstream bug fix related to this issue (flags: none)
312135: Additional Upstream bug fix related to this issue (flags: none)
312136: RHEL5 fix for this issue (with DMA short-circuit code and debug) (flags: none)
312164: RHEL5 fix for this issue (flags: none)
312463: Upstream patch that fixes this issue (flags: none)
313699: RHEL5 fix for this issue (flags: none)

Description Prarit Bhargava 2008-07-17 23:22:30 UTC
Description of problem:

Under heavy load, pci_alloc_consistent()/dma_alloc_coherent() does not return an
address aligned to the size argument passed in.  The documentation is clear on this:

"pci_alloc_consistent returns two values: the virtual address which you
can use to access it from the CPU and dma_handle which you pass to the
card.

The cpu return address and the DMA bus master address are both
guaranteed to be aligned to the smallest PAGE_SIZE order which
is greater than or equal to the requested size.  This invariant
exists (for example) to guarantee that if you allocate a chunk
which is smaller than or equal to 64 kilobytes, the extent of the
buffer you receive will not cross a 64K boundary."
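
The alignment rule quoted above can be written down as a small check. This is only a userspace sketch for illustration, not kernel code: the sketch_ names are hypothetical, and the 4 KiB page size is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12u  /* assumption: 4 KiB pages */
#define SKETCH_PAGE_SIZE  ((uint64_t)1 << SKETCH_PAGE_SHIFT)

/* Smallest order such that (PAGE_SIZE << order) >= size. */
static unsigned int sketch_get_order(size_t size)
{
    unsigned int order = 0;

    assert(size > 0);
    while ((SKETCH_PAGE_SIZE << order) < size)
        order++;
    return order;
}

/*
 * Returns 1 if addr is aligned to the smallest PAGE_SIZE order that is
 * greater than or equal to size, 0 otherwise.
 */
static int sketch_is_coherent_aligned(uint64_t addr, size_t size)
{
    uint64_t align = SKETCH_PAGE_SIZE << sketch_get_order(size);

    return (addr & (align - 1)) == 0;
}
```

Under these assumptions a 64 KiB request maps to order 4, so an address such as 0x10000 passes the check while 0x11000 fails it. The failing case is the kind of address this bug is about.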

Version-Release number of selected component (if applicable): 97.el5


How reproducible: 100%


Steps to Reproduce: 
1. Reserve all of DMA'able memory below 4G.
2. Use a 32-bit DMA mask and attempt to reserve memory.  The dma memory
allocation code will attempt to get iommu pages.  alloc_iommu() is broken -- it
does not return aligned addresses.
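
To illustrate the difference between an unaligned and an aligned page search, here is a minimal userspace sketch. It is not the RHEL5 alloc_iommu() code and not the attached patch: the byte-per-page map, the 64-page aperture, and the sketch_ names are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NPAGES 64u  /* hypothetical aperture size, in pages */

/* One byte per IOMMU page: 0 = free, 1 = in use. */
static unsigned char sketch_map[SKETCH_NPAGES];

/*
 * Find count contiguous free pages whose first index is a multiple of
 * (align_mask + 1). align_mask must be 2^n - 1; passing 0 disables
 * alignment, which models the unaligned behaviour described above.
 * Returns the first page index, or -1 if no range fits.
 */
static long sketch_alloc_pages(size_t count, size_t align_mask)
{
    size_t start = 0;

    assert(count > 0 && ((align_mask + 1) & align_mask) == 0);
    while (1) {
        size_t i;

        /* Round the candidate start up to the requested alignment. */
        start = (start + align_mask) & ~align_mask;
        if (start + count > SKETCH_NPAGES)
            return -1;

        /* Count how many pages from start are free. */
        for (i = 0; i < count && !sketch_map[start + i]; i++)
            ;
        if (i == count) {
            for (i = 0; i < count; i++)
                sketch_map[start + i] = 1;
            return (long)start;
        }
        start += i + 1;  /* skip past the busy page and retry */
    }
}
```

With page 0 already taken, a 4-page request with align_mask 0 lands on page 1, which is not a multiple of 4. The same request with align_mask 3 skips ahead to a page index that is a multiple of 4.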
  
Actual results:

Unaligned DMA addresses are returned from pci_alloc_consistent()/dma_alloc_coherent().

Expected results:

The addresses should be aligned to the size argument passed in.

Additional info: This is a little tricky to reproduce.  Load one module that
exhausts DMA memory, then load a second that requests more DMA memory with a
32-bit mask.  Alternatively, you can short-circuit the code to do a map_single
request (which calls the IOMMU code).  The latter became my preferred option.

The solution was to backport some code from upstream (see attached patch).
Additionally, a bug was found in that upstream code, and a patch was sent
upstream to fix it.

Comment 1 Prarit Bhargava 2008-07-17 23:23:22 UTC
This work was based on the report in bug 298811.  This allocation method is also
broken in RHEL4.

Comment 2 Prarit Bhargava 2008-07-17 23:24:11 UTC
Created attachment 312088 [details]
RHEL5 fix for this issue

Initial rough patch.

Comment 3 Prarit Bhargava 2008-07-17 23:25:33 UTC
Created attachment 312089 [details]
Upstream bug fix related to this issue

Sent to LKML & Jesse Barnes.

Comment 4 Prarit Bhargava 2008-07-18 13:41:20 UTC
Created attachment 312135 [details]
Additional Upstream bug fix related to this issue

Comment 5 Prarit Bhargava 2008-07-18 13:41:53 UTC
Patch in comment #4 sent to LKML and Jesse Barnes.

Comment 6 Prarit Bhargava 2008-07-18 13:47:32 UTC
Created attachment 312136 [details]
RHEL5 fix for this issue (with DMA short-circuit code and debug)

Comment 7 Prarit Bhargava 2008-07-18 18:30:43 UTC
Created attachment 312164 [details]
RHEL5 fix for this issue

Comment 8 Prarit Bhargava 2008-07-21 14:30:58 UTC
Links to the upstream submissions for this issue:

http://marc.info/?l=linux-kernel&m=121664984730778&w=2

http://marc.info/?l=linux-kernel&m=121664984830791&w=2

P.

Comment 9 Prarit Bhargava 2008-07-23 11:22:46 UTC
Created attachment 312463 [details]
Upstream patch that fixes this issue

Submitted to LKML.

Comment 10 Prarit Bhargava 2008-07-23 11:24:50 UTC
Patch upstream here:

http://marc.info/?l=linux-kernel&m=121681201313560&w=2

P.

Comment 11 RHEL Program Management 2008-07-25 17:01:35 UTC
This request was evaluated by Red Hat Product Management for
inclusion, but this component is not scheduled to be updated in
the current Red Hat Enterprise Linux release. If you would like
this request to be reviewed for the next minor release, ask your
support representative to set the next rhel-x.y flag to "?".

Comment 13 Prarit Bhargava 2008-08-07 15:03:07 UTC
Created attachment 313699 [details]
RHEL5 fix for this issue

Posted patch.

Comment 14 RHEL Program Management 2008-08-07 15:03:38 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 15 Don Zickus 2008-09-10 20:14:15 UTC
Included in kernel-2.6.18-110.el5.
You can download this test kernel from http://people.redhat.com/dzickus/el5

Comment 19 errata-xmlrpc 2009-01-20 20:03:20 UTC
An advisory has been issued which should resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-0225.html