Bug 455813 - Under heavy memory usage dma_alloc_coherent does not return aligned address
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.3
Hardware: All
OS: Linux
Priority: high
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Prarit Bhargava
QA Contact: Martin Jenner
Depends On:
Blocks:
Reported: 2008-07-17 19:22 EDT by Prarit Bhargava
Modified: 2009-01-20 15:03 EST

Doc Type: Bug Fix
Last Closed: 2009-01-20 15:03:20 EST

Attachments
RHEL5 fix for this issue (9.59 KB, patch)
2008-07-17 19:24 EDT, Prarit Bhargava
Upstream bug fix related to this issue (819 bytes, patch)
2008-07-17 19:25 EDT, Prarit Bhargava
Additional Upstream bug fix related to this issue (1.92 KB, patch)
2008-07-18 09:41 EDT, Prarit Bhargava
RHEL5 fix for this issue (with DMA short-circuit code and debug) (14.48 KB, patch)
2008-07-18 09:47 EDT, Prarit Bhargava
RHEL5 fix for this issue (9.62 KB, patch)
2008-07-18 14:30 EDT, Prarit Bhargava
Upstream patch that fixes this issue (4.91 KB, patch)
2008-07-23 07:22 EDT, Prarit Bhargava
RHEL5 fix for this issue (10.50 KB, patch)
2008-08-07 11:03 EDT, Prarit Bhargava

Description Prarit Bhargava 2008-07-17 19:22:30 EDT
Description of problem:

Under heavy load, pci_alloc_consistent()/dma_alloc_coherent() does not return an
address aligned to the size argument passed in.  The documentation is clear on this:

"pci_alloc_consistent returns two values: the virtual address which you
can use to access it from the CPU and dma_handle which you pass to the
card.

The cpu return address and the DMA bus master address are both
guaranteed to be aligned to the smallest PAGE_SIZE order which
is greater than or equal to the requested size.  This invariant
exists (for example) to guarantee that if you allocate a chunk
which is smaller than or equal to 64 kilobytes, the extent of the
buffer you receive will not cross a 64K boundary."
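
The guarantee quoted above can be expressed as a small check. What follows is only a userspace sketch for illustration, not kernel code: the names SKETCH_PAGE_SIZE, required_alignment(), dma_addr_is_aligned() and crosses_boundary() are hypothetical, and a 4096-byte page size is assumed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL /* assumed page size; hypothetical name */

/*
 * Smallest power-of-two multiple of the page size that is >= size.
 * Assumes size is small enough that the shift cannot overflow
 * (true for any realistic DMA buffer).
 */
static uint64_t required_alignment(size_t size)
{
    uint64_t align = SKETCH_PAGE_SIZE;

    while (align < size)
        align <<= 1;
    return align;
}

/* True if addr meets the documented alignment for a buffer of `size` bytes. */
static bool dma_addr_is_aligned(uint64_t addr, size_t size)
{
    return (addr & (required_alignment(size) - 1)) == 0;
}

/*
 * True if the buffer [addr, addr + size) spans a multiple of `boundary`.
 * Assumes size > 0 and boundary > 0.
 */
static bool crosses_boundary(uint64_t addr, size_t size, uint64_t boundary)
{
    return (addr / boundary) != ((addr + size - 1) / boundary);
}
```

Under these assumptions, a 64K request returned at 0x11000 fails dma_addr_is_aligned(), and an 8K buffer at 0x1f000 crosses a 64K boundary, which is the kind of result the documentation rules out.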

Version-Release number of selected component (if applicable): 97.el5


How reproducible: 100%


Steps to Reproduce: 
1. Reserve all DMA-able memory below 4G.
2. Use a 32-bit DMA mask and attempt to reserve memory.  The DMA memory
allocation code will attempt to get iommu pages.  alloc_iommu() is broken -- it
does not return aligned addresses.
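
For illustration only, here is a simplified userspace sketch of what an alignment-aware search over a map of free slots could look like. It is not the attached RHEL5 or upstream patch and does not reproduce alloc_iommu(); the function name alloc_aligned_run() and the bool-array representation of the free map are assumptions made for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Find `npages` consecutive free slots in used[0..total) whose start index
 * is a multiple of `align` (a power of two, counted in pages). On success
 * the slots are marked used and the start index is returned; -1 is returned
 * if no aligned run exists or the arguments are invalid.
 */
static long alloc_aligned_run(bool *used, size_t total, size_t npages,
                              size_t align)
{
    if (npages == 0 || npages > total)
        return -1;
    if (align == 0 || (align & (align - 1)) != 0)
        return -1;

    /* Only aligned start positions are ever considered. */
    for (size_t start = 0; start + npages <= total; start += align) {
        size_t i = 0;

        while (i < npages && !used[start + i])
            i++;
        if (i == npages) {
            for (i = 0; i < npages; i++)
                used[start + i] = true;
            return (long)start;
        }
    }
    return -1;
}
```

Stepping the candidate start index by the alignment, rather than by one slot, is what keeps every returned run aligned. A search that accepts the first free run regardless of its start index can return the unaligned results described in this report.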
  
Actual results:

Unaligned DMA addresses are returned from pci_alloc_consistent()/dma_alloc_coherent().

Expected results:

The addresses should be aligned to the size argument passed in.

Additional info: This is a little tricky to reproduce.  Load one module that
exhausts DMA memory, and then load another that requests more DMA memory with a
32-bit mask.  Alternatively, you can short-circuit the code to do a map_single
request (which calls the iommu code).  The latter option became my preferred choice.

The solution was to backport some code from upstream (see attached patch).
Additionally, a bug was found in that code and a patch was sent upstream to fix it.
Comment 1 Prarit Bhargava 2008-07-17 19:23:22 EDT
This work was based on the report from 298811.  This allocation method is also
broken in RHEL4.
Comment 2 Prarit Bhargava 2008-07-17 19:24:11 EDT
Created attachment 312088 [details]
RHEL5 fix for this issue

Initial rough patch.
Comment 3 Prarit Bhargava 2008-07-17 19:25:33 EDT
Created attachment 312089 [details]
Upstream bug fix related to this issue

Sent to LKML & Jesse Barnes.
Comment 4 Prarit Bhargava 2008-07-18 09:41:20 EDT
Created attachment 312135 [details]
Additional Upstream bug fix related to this issue
Comment 5 Prarit Bhargava 2008-07-18 09:41:53 EDT
Patch in comment #4 sent to LKML and Jesse Barnes.
Comment 6 Prarit Bhargava 2008-07-18 09:47:32 EDT
Created attachment 312136 [details]
RHEL5 fix for this issue (with DMA short-circuit code and debug)
Comment 7 Prarit Bhargava 2008-07-18 14:30:43 EDT
Created attachment 312164 [details]
RHEL5 fix for this issue
Comment 8 Prarit Bhargava 2008-07-21 10:30:58 EDT
Links to upstream submits for this issue:

http://marc.info/?l=linux-kernel&m=121664984730778&w=2

http://marc.info/?l=linux-kernel&m=121664984830791&w=2

P.
Comment 9 Prarit Bhargava 2008-07-23 07:22:46 EDT
Created attachment 312463 [details]
Upstream patch that fixes this issue

Submitted to LKML.
Comment 10 Prarit Bhargava 2008-07-23 07:24:50 EDT
Patch upstream here:

http://marc.info/?l=linux-kernel&m=121681201313560&w=2

P.
Comment 11 RHEL Product and Program Management 2008-07-25 13:01:35 EDT
This request was evaluated by Red Hat Product Management for
inclusion, but this component is not scheduled to be updated in
the current Red Hat Enterprise Linux release. If you would like
this request to be reviewed for the next minor release, ask your
support representative to set the next rhel-x.y flag to "?".
Comment 13 Prarit Bhargava 2008-08-07 11:03:07 EDT
Created attachment 313699 [details]
RHEL5 fix for this issue

Posted patch.
Comment 14 RHEL Product and Program Management 2008-08-07 11:03:38 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 15 Don Zickus 2008-09-10 16:14:15 EDT
in kernel-2.6.18-110.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5
Comment 19 errata-xmlrpc 2009-01-20 15:03:20 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-0225.html
