Bug 688410

Summary: NUMA problems in transparent hugepages
Product: Red Hat Enterprise Linux 6
Component: kernel
Version: 6.0
Status: CLOSED ERRATA
Severity: unspecified
Priority: low
Target Milestone: rc
Hardware: Unspecified
OS: Unspecified
Reporter: Andi Kleen <andi.kleen>
Assignee: Larry Woodman <lwoodman>
QA Contact: Chao Ye <cye>
CC: aarcange, arozansk, czhang, dnelson, kzhang, lwoodman, peterm, prarit, qcai, syeghiay
Keywords: TestOnly
Doc Type: Bug Fix
Last Closed: 2011-12-06 12:45:25 UTC
Bug Depends On: 741979

Description Andi Kleen 2011-03-16 23:47:26 UTC
We found in some internal workloads that khugepaged may move pages
allocated with MPOL_DEFAULT (the default local, first-touch NUMA policy) to
different nodes when coalescing pages. This causes performance problems.

In addition, THP copies could also move data to the wrong node.

THP also would not interleave correctly under the NUMA interleave policy.
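
One way to observe the placement problems described above is to tally the per-node page counts that the kernel reports in /proc/<pid>/numa_maps. The following is a minimal sketch, not part of the original report: the helper name pages_per_node and the sample lines are made up for illustration, and it only assumes the usual N<node>=<pages> fields in numa_maps.

```python
import re
from collections import Counter

# Matches per-node page counts such as "N0=512" in a numa_maps line.
NODE_FIELD = re.compile(r"^N(\d+)=(\d+)$")


def pages_per_node(numa_maps_text):
    """Sum the N<node>=<pages> fields over all mappings in numa_maps text."""
    totals = Counter()
    for line in numa_maps_text.splitlines():
        # The first two columns are the start address and the memory policy.
        for field in line.split()[2:]:
            match = NODE_FIELD.match(field)
            if match:
                totals[int(match.group(1))] += int(match.group(2))
    return dict(totals)


# Illustrative sample only; these lines are not taken from the bug report.
SAMPLE = """\
7f2c40000000 default anon=1024 dirty=1024 N0=1024
7f2c80000000 interleave:0-1 anon=512 dirty=512 N0=256 N1=256
"""

if __name__ == "__main__":
    print(pages_per_node(SAMPLE))
```

On a live system the text would come from reading /proc/<pid>/numa_maps for the process of interest. A first-touch workload pinned to one node that shows a growing share of pages on another node after khugepaged runs would match the symptom reported here.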

This was fixed upstream with:

5c4b4be3b6b937256103a5ae49177e0c3a17cb8f
19ee151e140daa5183c4984981801e542e0544fb
c5bd99c36043e58f14de507dba3f1e016cc52a91
885e87a190bbb1ebb83cea1bf7a62b13b5dd38d3
24b3ba8dd78cf6d7d2a8fea3ab4797654dd456a6

Comment 2 Larry Woodman 2011-03-18 18:52:29 UTC
The last 3 commits are not in Linus's latest upstream kernel. Where and when should I find them officially?

[root@dhcp47-183 linux-2.6]# git show 24b3ba8dd78cf6d7d2a8fea3ab4797654dd456a6
fatal: bad object 24b3ba8dd78cf6d7d2a8fea3ab4797654dd456a6
[root@dhcp47-183 linux-2.6]# git show 885e87a190bbb1ebb83cea1bf7a62b13b5dd38d3
fatal: bad object 885e87a190bbb1ebb83cea1bf7a62b13b5dd38d3
[root@dhcp47-183 linux-2.6]# git show c5bd99c36043e58f14de507dba3f1e016cc52a91
fatal: bad object c5bd99c36043e58f14de507dba3f1e016cc52a91


Larry Woodman
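
The manual git show checks above can be scripted when there are many commit IDs to verify. This is a rough sketch, assuming a git new enough to support the -C option; the function names commit_exists and missing_commits and the injectable run parameter are hypothetical helpers, not anything used in this bug.

```python
import subprocess


def commit_exists(repo_path, sha, run=subprocess.run):
    """Return True if sha names a commit object in the repository at repo_path."""
    # "git cat-file -e" exits with status 0 only when the object exists;
    # the "^{commit}" suffix additionally requires it to resolve to a commit.
    result = run(
        ["git", "-C", repo_path, "cat-file", "-e", sha + "^{commit}"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def missing_commits(repo_path, shas, run=subprocess.run):
    """Return the subset of shas that the repository does not contain."""
    return [sha for sha in shas if not commit_exists(repo_path, sha, run)]
```

Calling missing_commits("linux-2.6", [...]) with the five IDs from the description would list the ones the local tree lacks. The run parameter exists only so the check can be exercised without a real repository.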

Comment 3 Larry Woodman 2011-03-18 19:02:12 UTC

Andi, I see these 4 commits in Linus's tree:

2f5f9486f8c12e3aa40fe3775a18cb14efc5cea2
236344d6b417d05a3080477639234fd9ca97568d
19ee151e140daa5183c4984981801e542e0544fb
5c4b4be3b6b937256103a5ae49177e0c3a17cb8f


Is this everything we need?

Larry

Comment 4 Andi Kleen 2011-03-18 19:59:35 UTC
Hmm, sorry, I probably confused git trees. Yes Larry, your list is fine
and should fix this. Thanks.

Comment 5 Larry Woodman 2011-03-22 16:24:41 UTC
Every patch hunk fails to apply to RHEL6. I'll have to do a bit of manual work here. This will be pushed off to RHEL6.2.

Larry

Comment 6 Andrea Arcangeli 2011-03-22 21:48:42 UTC
I think the patch errors occur because the fixes are already included in RHEL6.1. We only skipped the vmstat patch because it is a new feature, visible in /proc/vmstat, and not NUMA related (that one will be deferred to 6.2).
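
To tell whether a given kernel carries the vmstat patch, one could look for THP counters in /proc/vmstat. The sketch below assumes the counters use a thp_ prefix, as upstream kernels do; which counters the deferred patch actually adds is not stated in this thread, so the names in the sample are illustrative and thp_counters is a hypothetical helper.

```python
def thp_counters(vmstat_text):
    """Extract "thp_*" counters from /proc/vmstat-style "name value" lines."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].startswith("thp_"):
            counters[parts[0]] = int(parts[1])
    return counters


# Illustrative sample only; the values are not taken from the bug report.
SAMPLE = """\
nr_free_pages 123456
nr_anon_transparent_hugepages 42
thp_fault_alloc 100
thp_collapse_alloc 7
"""

if __name__ == "__main__":
    print(thp_counters(SAMPLE))
```

An empty result on a running system would suggest the counters are not exposed by that kernel build.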

Comment 7 Larry Woodman 2011-03-23 18:09:51 UTC
All of these patches are in RHEL6.1, kernel-2.6.32-124:

2f5f9486f8c12e3aa40fe3775a18cb14efc5cea2
236344d6b417d05a3080477639234fd9ca97568d
19ee151e140daa5183c4984981801e542e0544fb
5c4b4be3b6b937256103a5ae49177e0c3a17cb8f


Larry Woodman

Comment 8 Andi Kleen 2011-03-23 18:17:33 UTC
Thanks!

Comment 9 Larry Woodman 2011-04-04 17:33:59 UTC

The following Red Hat commits:

9a3918236b06f4197cf24b9098ba841042f9a669
184c0dbf14aa866312e633688ba6d501d7eabc26
c81ba68e900146b7c9e1eb37b395ee045406433e
72590e5f9859d91d2dea87d689085a88432ba9a3

correspond to the following upstream commits:

2f5f9486f8c12e3aa40fe3775a18cb14efc5cea2
236344d6b417d05a3080477639234fd9ca97568d
19ee151e140daa5183c4984981801e542e0544fb
5c4b4be3b6b937256103a5ae49177e0c3a17cb8f

Not sure how we missed the upstream commit IDs in the git log, but it happened.

Larry

Comment 11 Andrea Arcangeli 2011-04-19 15:55:22 UTC
*** Bug 679999 has been marked as a duplicate of this bug. ***

Comment 12 RHEL Program Management 2011-06-16 12:50:17 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has 
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed 
products. This request is not yet committed for inclusion in an Update release.

Comment 19 errata-xmlrpc 2011-12-06 12:45:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2011-1530.html