Bug 1085124

Summary: [Doc] In some rare circumstances on a few systems, data corruption can occur when using NET_DMA (used by qpid) on Intel HW
Product: Red Hat OpenStack
Reporter: arkady kanevsky <arkady_kanevsky>
Component: doc-Getting_Started_Guide
Assignee: RHOS Documentation Team <rhos-docs>
Status: CLOSED CURRENTRELEASE
QA Contact: RHOS Documentation Team <rhos-docs>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 4.0
CC: adahms, ddomingo, dprince, morazi, rlandman, tross, yeylon
Target Milestone: ---
Keywords: Documentation, Triaged
Target Release: 5.0 (RHEL 7)
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Known Issue
Doc Text:
In some rare and specific circumstances on a few systems, data corruption can occur when using NET_DMA. In these cases, a call to dma_skb_copy_datagram_iovec() to perform the DMA does not deliver the data to the expected location. As a workaround, the chip manufacturer recommends that NET_DMA be disabled on the latest upstream kernels. This can be done by blacklisting the ioatdma module. A kbase article is available at https://access.redhat.com/articles/879293. The problem will not occur with NET_DMA disabled. Recent hardware optimizations have effectively obviated the advantages of using the ioatdma driver on modern platforms. Therefore, Red Hat also recommends disabling ioatdma on all platforms.
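As a rough aid for verifying the workaround described above, the following Python sketch checks whether the ioatdma module is currently loaded and whether a blacklist entry for it exists. It assumes the usual Linux /proc/modules format (module name as the first field on each line) and the modprobe.d `blacklist <module>` directive. The function names are hypothetical, and this is not the procedure from the kbase article.

```python
MODULE = "ioatdma"


def module_loaded(proc_modules_text: str, module: str = MODULE) -> bool:
    """Return True if `module` is listed as loaded in /proc/modules-style text.

    Assumption: the module name is the first whitespace-separated field
    on each line, as in /proc/modules.
    """
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == module:
            return True
    return False


def module_blacklisted(conf_text: str, module: str = MODULE) -> bool:
    """Return True if modprobe.d-style text contains `blacklist <module>`.

    Lines starting with '#' are treated as comments and skipped.
    """
    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue
        fields = stripped.split()
        if len(fields) >= 2 and fields[0] == "blacklist" and fields[1] == module:
            return True
    return False
```

On a live system, the first function could be given the contents of /proc/modules and the second the contents of each .conf file under /etc/modprobe.d. A module that is blacklisted but still reported as loaded may have been loaded before the blacklist entry was added.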
Last Closed: 2015-04-20 11:18:20 UTC
Type: Bug
Bug Blocks: 1040649

Description arkady kanevsky 2014-04-07 21:58:59 UTC
Description of problem:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/dma/Kconfig?id=77873803363c9e831fc1d1e6895c084279090c22

Version-Release number of selected component (if applicable):
OSP4 A3

How reproducible:
There is also a set of patches implementing a debug facility that traps this data corruption condition:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/lib/dma-debug.c?id=0abdd7a81b7e3fd781d7fabcca49501852bba17e
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/lib/dma-debug.c?id=59f2e7df574c78e952d79435de3f4867349403aa
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/lib/dma-debug.c?id=3b7a6418c7494b8bf0bf0537ddee1dedbca10f51


Steps to Reproduce:

Actual results:
This is a known RHEL 6 problem that is fixed in RHEL 6.6. However, OSP4 A3 is deployed on RHEL 6.5.

Expected results:
This needs to be documented in the OSP4 deployment guide, and specifically in the deployment instructions for the Dell/Red Hat joint solution.

Additional info:

Comment 9 Andrew Dahms 2015-04-20 11:18:20 UTC
This bug was documented as a known issue, and will now be closed.

Please raise a new bug or let me know if you have any further concerns regarding this bug, and we will follow up for you.

Kind regards,

Andrew