Bug 1085124 - [Doc] In some rare circumstances on a few systems, data corruption can occur when using NET_DMA (used by qpid) on Intel HW
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: doc-Getting_Started_Guide
Version: 4.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 5.0 (RHEL 7)
Assignee: RHOS Documentation Team
QA Contact: RHOS Documentation Team
URL:
Whiteboard:
Depends On:
Blocks: 1040649
 
Reported: 2014-04-07 21:58 UTC by arkady kanevsky
Modified: 2015-04-20 11:18 UTC (History)

Fixed In Version:
Doc Type: Known Issue
Doc Text:
In some rare and specific circumstances on a few systems, data corruption can occur when using NET_DMA. In these cases, a call to dma_skb_copy_datagram_iovec() to perform the DMA does not deliver the data to the expected location. As a workaround, the chip manufacturer recommends that NET_DMA be disabled on the latest upstream kernels. This can be done by blacklisting the ioatdma module. A kbase article is available at https://access.redhat.com/articles/879293. The problem will not occur with NET_DMA disabled. Recent hardware optimizations have effectively obviated the advantages of using the ioatdma driver on modern platforms. Therefore, Red Hat also recommends disabling ioatdma on all platforms.
Clone Of:
Environment:
Last Closed: 2015-04-20 11:18:20 UTC
Target Upstream Version:
Embargoed:
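
The workaround in the Doc Text above (blacklisting the ioatdma module) is normally done with a `blacklist ioatdma` line in a file under /etc/modprobe.d/. The following is a minimal sketch, not the procedure from the kbase article, for checking whether a host still has the module loaded and whether such a line is present. The helper names are hypothetical, and the sketch assumes the standard /proc/modules and modprobe.d text formats; follow the kbase article for the supported steps.

```python
from pathlib import Path

MODULE = "ioatdma"  # driver named in the Doc Text


def loaded_modules(proc_modules_text):
    """Return the module names listed in /proc/modules-style text."""
    # The first whitespace-separated field of each line is the module name.
    return {line.split()[0] for line in proc_modules_text.splitlines() if line.strip()}


def is_blacklisted(modprobe_conf_text, module=MODULE):
    """Return True if the text has an uncommented 'blacklist <module>' line."""
    for line in modprobe_conf_text.splitlines():
        fields = line.split("#", 1)[0].split()  # ignore comments
        if fields[:2] == ["blacklist", module]:
            return True
    return False


def read_or_empty(path):
    """Read a text file, returning an empty string if it cannot be read."""
    try:
        return path.read_text()
    except OSError:
        return ""


if __name__ == "__main__":
    loaded = MODULE in loaded_modules(read_or_empty(Path("/proc/modules")))
    confs = sorted(Path("/etc/modprobe.d").glob("*.conf"))
    blacklisted = any(is_blacklisted(read_or_empty(p)) for p in confs)
    print(f"{MODULE} loaded: {loaded}")
    print(f"{MODULE} blacklisted: {blacklisted}")
```

A host that reports the module as loaded even though it is blacklisted has probably not been rebooted since the change, or loads the driver by some other path (an assumption worth verifying against the kbase article).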



Description arkady kanevsky 2014-04-07 21:58:59 UTC
Description of problem:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/dma/Kconfig?id=77873803363c9e831fc1d1e6895c084279090c22

Version-Release number of selected component (if applicable):
OSP4 A3

How reproducible:
There is also a set of patches implementing a debug facility that traps this data corruption condition:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/lib/dma-debug.c?id=0abdd7a81b7e3fd781d7fabcca49501852bba17e
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/lib/dma-debug.c?id=59f2e7df574c78e952d79435de3f4867349403aa
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/lib/dma-debug.c?id=3b7a6418c7494b8bf0bf0537ddee1dedbca10f51


Steps to Reproduce:
1.
2.
3.

Actual results:
Known RHEL 6 problem, fixed in RHEL 6.6. However, OSP4 A3 is deployed on RHEL 6.5.

Expected results:
Needs to be documented in the OSP4 deployment guide, and specifically in the deployment instructions for the Dell/Red Hat joint solution.

Additional info:

Comment 9 Andrew Dahms 2015-04-20 11:18:20 UTC
This bug was documented as a known issue, and will now be closed.

Please raise a new bug or let me know if you have any further concerns regarding this bug, and we will follow up for you.

Kind regards,

Andrew

