Bug 870326
Summary: migrate_pages() reports success, but pages are not moved to desired node
Product: Red Hat Enterprise Linux 6
Component: kernel
Version: 6.4
Status: CLOSED NOTABUG
Severity: unspecified
Priority: unspecified
Reporter: Jan Stancek <jstancek>
Assignee: Larry Woodman <lwoodman>
QA Contact: Kernel General QE <kernel-general-qe>
CC: aarcange, aquini, atomlin, jburke, lwoodman, nobody+295318, pbunyan, riel, wgomerin
Target Milestone: rc
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Type: Bug
Last Closed: 2015-10-14 18:30:41 UTC
Bug Blocks: 1270638

Description
Jan Stancek
2012-10-26 07:46:49 UTC
Created attachment 633710: reproducer v1
On a host with 4 nodes (0-3):
# gcc mpages.c -lnuma
# cat bigfile > /dev/null
# ./a.out
1. shared mem is on node: 1
2. shared mem is on node: 1
3. shared mem is on node: 2
4. shared mem is on node: 1
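
For reference, migrate_pages() takes its old_nodes and new_nodes arguments as bit masks stored in arrays of unsigned long, as the man page quoted later in this report describes. The following is a minimal sketch of how such masks might be built. It is not the attached reproducer; the sketch_ names and the 64-node limit are assumptions made for illustration and are not part of libnuma or the kernel.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Number of bits in one mask word (masks are arrays of unsigned long). */
#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)
/* Hypothetical upper bound on node numbers, chosen for this sketch only. */
#define SKETCH_MAX_NODES 64
#define SKETCH_MASK_WORDS \
    ((SKETCH_MAX_NODES + BITS_PER_ULONG - 1) / BITS_PER_ULONG)

/* Hypothetical container for one node mask. */
struct sketch_nodemask {
    unsigned long bits[SKETCH_MASK_WORDS];
};

/* Clear every bit so that no node is selected. */
static inline void sketch_mask_clear(struct sketch_nodemask *m)
{
    memset(m, 0, sizeof(*m));
}

/* Select one node by setting its bit in the right word. */
static inline void sketch_mask_set(struct sketch_nodemask *m, unsigned int node)
{
    assert(node < SKETCH_MAX_NODES);
    m->bits[node / BITS_PER_ULONG] |= 1UL << (node % BITS_PER_ULONG);
}

/* Return 1 if the node is selected in the mask, 0 otherwise. */
static inline int sketch_mask_test(const struct sketch_nodemask *m,
                                   unsigned int node)
{
    assert(node < SKETCH_MAX_NODES);
    return (int)((m->bits[node / BITS_PER_ULONG] >> (node % BITS_PER_ULONG)) & 1UL);
}
```

On a NUMA host with libnuma installed, the bits arrays of two such masks could be passed as old_nodes and new_nodes, with maxnode chosen as the man page describes.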
The problem is that alloc_pages_exact_node() is called by new_node_page() in mm/mempolicy.c with just GFP_HIGHUSER_MOVABLE rather than GFP_HIGHUSER_MOVABLE|GFP_THISNODE, as new_page_node() does in mm/migrate.c.

The problem is that when I make this change:

-----------------------------------------------------------------------
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 060437d..93cab05 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -920,7 +920,7 @@ static void migrate_page_add(struct page *page, struct list_head *pagelist,
 
 static struct page *new_node_page(struct page *page, unsigned long node, int **x)
 {
-	return alloc_pages_exact_node(node, GFP_HIGHUSER_MOVABLE, 0);
+	return alloc_pages_exact_node(node, GFP_HIGHUSER_MOVABLE|GFP_THISNODE, 0);
 }
 
 /*
-------------------------------------------------------------------------

the whole migrate_pages() system call fails with -ENOMEM, which isn't what the man page says.

[root@hp-dl580g7-01 lwoodman]# ./a.out
migrate_pages failed: -1 Cannot allocate memory

Actually, the man page says it might move the pages to another node...

----------------------------------------------------------------------------
MIGRATE_PAGES(2)          Linux Programmer's Manual          MIGRATE_PAGES(2)

NAME
       migrate_pages - move all pages in a process to another set of nodes

SYNOPSIS
       #include <numaif.h>

       long migrate_pages(int pid, unsigned long maxnode,
                          const unsigned long *old_nodes,
                          const unsigned long *new_nodes);

       Link with -lnuma.

DESCRIPTION
       migrate_pages() moves all pages of the process pid that are in
       memory nodes old_nodes to the memory nodes in new_nodes. Pages not
       located in any node in old_nodes will not be migrated. As far as
       possible, the kernel maintains the relative topology relationship
       inside old_nodes during the migration to new_nodes.

       The old_nodes and new_nodes arguments are pointers to bit masks of
       node numbers, with up to maxnode bits in each mask. These masks are
       maintained as arrays of unsigned long integers (in the last long
       integer, the bits beyond those specified by maxnode are ignored).
       The maxnode argument is the maximum node number in the bit mask plus
       one (this is the same as in mbind(2), but different from select(2)).

       The pid argument is the ID of the process whose pages are to be
       moved. To move pages in another process, the caller must be
       privileged (CAP_SYS_NICE) or the real or effective user ID of the
       calling process must match the real or saved-set user ID of the
       target process. If pid is 0, then migrate_pages() moves pages of the
       calling process.

       Pages shared with another process will only be moved if the
       initiating process has the CAP_SYS_NICE privilege.

RETURN VALUE
       On success migrate_pages() returns zero. On error, it returns -1,
       and sets errno to indicate the error.

ERRORS
       EPERM  Insufficient privilege (CAP_SYS_NICE) to move pages of the
              process specified by pid, or insufficient privilege
              (CAP_SYS_NICE) to access the specified target nodes.

       ESRCH  No process matching pid could be found.

VERSIONS
       The migrate_pages() system call first appeared on Linux in version
       2.6.16.

CONFORMING TO
       This system call is Linux-specific.

NOTES
       For information on library support, see numa(7).

       Use get_mempolicy(2) with the MPOL_F_MEMS_ALLOWED flag to obtain the
       set of nodes that are allowed by the calling process's cpuset. Note
       that this information is subject to change at any time by manual or
       automatic reconfiguration of the cpuset.

       Use of migrate_pages() may result in pages whose location (node)
       violates the memory policy established for the specified addresses
       (see mbind(2)) and/or the specified process (see set_mempolicy(2)).
       That is, memory policy does not constrain the destination nodes used
       by migrate_pages().
----------------------------------------------------------------------------

After further investigation: the move_pages() syscall already calls alloc_pages_exact_node() with GFP_THISNODE, so it will return -ENOMEM if pages cannot be allocated on the desired node.

At this point I'd say migrate_pages() is supposed to silently fail when it can't get memory on the target node, and move_pages() is supposed to actually fail when it can't get memory on the target node. I'll check upstream with Christoph Lameter, the author of both system calls.

Larry

This request was not resolved in time for the current release. Red Hat invites you to ask your support representative to propose this request, if still desired, for consideration in the next release of Red Hat Enterprise Linux.

I checked with Christoph Lameter and he did say migrate_pages() is supposed to silently fail when it can't get memory on the target node, and move_pages() is supposed to actually fail when it can't get memory on the target node.

Are you OK with this, or should we try to convince him otherwise?

Larry Woodman

(In reply to comment #6)
> I checked with Christoph Lameter and he did say migrate_pages() is supposed
> to silently fail when it cant get memory on the target node and move_pages()
> is supposed actually fail when it cant get memory on the target node.
>
> Are you OK with this or should we try to convince him otherwise???

I can accept that 'silent fail' is the way it's supposed to work, but it would be nice to also mention that in the documentation [1].

migrate_pages(2) currently says:

RETURN VALUE
       ... a return of zero means that all pages were successfully moved

[1] http://git.kernel.org/pub/scm/docs/man-pages/man-pages
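
The difference discussed in this thread, where one call succeeds while pages land on another node and the other fails with -ENOMEM, can be pictured with a toy userspace model. This is only a sketch under stated assumptions, not kernel code: the toy_ names, the 8-node array, and the simple wrap-around fallback order are invented for illustration. The real allocator chooses fallback nodes differently.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_MAX_NODES 8

/* Toy model of free memory: a count of free pages on each node. */
struct toy_nodes {
    size_t nr_nodes;
    long free_pages[TOY_MAX_NODES];
};

/*
 * Strict placement, loosely analogous to allocating with GFP_THISNODE:
 * take a page from the requested node or fail with -ENOMEM.
 */
static inline int toy_alloc_strict(struct toy_nodes *t, size_t node)
{
    assert(t->nr_nodes <= TOY_MAX_NODES && node < t->nr_nodes);
    if (t->free_pages[node] <= 0)
        return -ENOMEM;
    t->free_pages[node]--;
    return (int)node;
}

/*
 * Preferred placement, loosely analogous to allocating without
 * GFP_THISNODE: try the requested node first, then fall back to any
 * other node that has free pages. Returns the node actually used.
 */
static inline int toy_alloc_preferred(struct toy_nodes *t, size_t node)
{
    assert(t->nr_nodes <= TOY_MAX_NODES && node < t->nr_nodes);
    for (size_t i = 0; i < t->nr_nodes; i++) {
        size_t candidate = (node + i) % t->nr_nodes;
        if (t->free_pages[candidate] > 0) {
            t->free_pages[candidate]--;
            return (int)candidate;
        }
    }
    return -ENOMEM;
}
```

In this model, a caller using toy_alloc_preferred() sees success even when the page ends up on a node other than the one requested, which matches the 'silent fail' behaviour described for migrate_pages(). A caller using toy_alloc_strict() gets -ENOMEM instead, as described for move_pages().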