Bug 845268

Summary: mbind reports success, but has no effect
Product: Red Hat Enterprise Linux 7
Reporter: Jan Stancek <jstancek>
Component: kernel
Assignee: Red Hat Kernel Manager <kernel-mgr>
Status: CLOSED CURRENTRELEASE
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: unspecified
Priority: unspecified
Version: 7.0
CC: czhang, jburke, kmcmartin, lwoodman, mkosaki, zliu
Target Milestone: rc
Hardware: Unspecified
OS: Unspecified
Fixed In Version: kernel-3.7.0-0.30.el7
Doc Type: Bug Fix
Last Closed: 2013-01-10 10:14:27 UTC
Type: Bug

Description Jan Stancek 2012-08-02 14:02:53 UTC
Description of problem:
LTP testcases that rely on mbind() are currently failing. Starting with 3.5.0-0.23.el7, the mbind syscall appears to have no effect: it returns 0, but the requested policy is not applied.

For example:
---------------------- mbind_test.c --------------------
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>
#include <numaif.h>
#include <errno.h>

/* gcc mbind_test.c -lnuma -o mbind_test -Wall */
#define MAXNODE 4096

int main()
{
        int ret;
        int len;
        int policy = -1;
        unsigned char *p;
        unsigned long mask[MAXNODE] = { 0 };
        unsigned long retmask[MAXNODE] = { 0 };

        len = getpagesize();
        p = mmap(NULL, len, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS,
                    0, 0);
        if (p == MAP_FAILED) {
                /* Without a mapping there is nothing to bind, so stop here. */
                printf("mmap err: %d\n", errno);
                return 1;
        }

        mask[0] = 1;
        ret = mbind(p, len, MPOL_BIND, mask, MAXNODE, 0);
        if (ret < 0)
                printf("mbind err: %d %d\n", ret, errno);
        ret = get_mempolicy(&policy, retmask, MAXNODE, p, MPOL_F_ADDR);
        if (ret < 0)
                printf("get_mempolicy err: %d %d\n", ret, errno);

        if (policy == MPOL_BIND)
                printf("OK\n");
        else
                printf("ERROR: policy is %d\n", policy);

        return 0;
}
---------------------- /mbind_test.c -------------------
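
A note on the reproducer's mask sizing: the maxnode argument of mbind() and get_mempolicy() counts bits in the nodemask, not array elements, so the 4096-element arrays above are larger than the calls strictly need (which is harmless). A minimal sketch of sizing and filling a mask for a given bit count might look like this; the helper names nodemask_words and nodemask_set are hypothetical and are not part of libnuma.

```c
#include <limits.h>
#include <stddef.h>

/* Number of node bits held by one unsigned long word of a nodemask. */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: words needed to hold maxnode bits, rounded up. */
size_t nodemask_words(unsigned long maxnode)
{
        return (maxnode + BITS_PER_WORD - 1) / BITS_PER_WORD;
}

/* Hypothetical helper: set the bit for one node.
 * The caller must ensure the mask has room for that node. */
void nodemask_set(unsigned long *mask, unsigned long node)
{
        mask[node / BITS_PER_WORD] |= 1UL << (node % BITS_PER_WORD);
}
```

With these helpers, the reproducer's `mask[0] = 1;` corresponds to `nodemask_set(mask, 0)`, and on a system with 64-bit unsigned long, `nodemask_words(4096)` is 64.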

strace snippet with 3.3.0-0.20.el7.x86_64 -> OK
...
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, 0, 0) = 0x7f033809e000
mbind(0x7f033809e000, 4096, MPOL_BIND, {0x0000000000000001, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, ...}, 4096, 0) = 0
get_mempolicy(MPOL_BIND, {0x0000000000000001, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, ...}, 4096, 0x7f033809e000, MPOL_F_ADDR) = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f033809d000
write(1, "OK\n", 3OK

strace snippet with 3.5.0-0.23.el7.x86_64 -> PROBLEM
...
mbind(0x7fb85ee4e000, 4096, MPOL_BIND, {0x0000000000000001, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, ...}, 4096, 0) = 0
get_mempolicy(MPOL_DEFAULT, {000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, ...}, 4096, 0x7fb85ee4e000, MPOL_F_ADDR) = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb85ee4d000
write(1, "ERROR: policy is 0\n", 19ERROR: policy is 0
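
To read the numeric value in the failing output: policy 0 is MPOL_DEFAULT, which matches what strace decodes in the get_mempolicy() line, while the working kernel reports MPOL_BIND. A small sketch of a decoder for the reproducer's message follows. The function name mpol_name and the POL_* constants are hypothetical; the numeric values mirror those in the kernel's linux/mempolicy.h and are redefined locally only so that the sketch builds without the libnuma headers.

```c
#include <stddef.h>

/* Local copies of the policy numbers from linux/mempolicy.h
 * (assumption: only the four policies listed here are of interest). */
enum {
        POL_DEFAULT    = 0,
        POL_PREFERRED  = 1,
        POL_BIND       = 2,
        POL_INTERLEAVE = 3
};

/* Hypothetical helper: map a policy number, as returned through the
 * first argument of get_mempolicy(), to a printable name. */
const char *mpol_name(int policy)
{
        switch (policy) {
        case POL_DEFAULT:    return "MPOL_DEFAULT";
        case POL_PREFERRED:  return "MPOL_PREFERRED";
        case POL_BIND:       return "MPOL_BIND";
        case POL_INTERLEAVE: return "MPOL_INTERLEAVE";
        default:             return "unknown";
        }
}
```

In the reproducer, the final message could then print `mpol_name(policy)` next to the number, so that "policy is 0" reads as MPOL_DEFAULT without consulting the header.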

Version-Release number of selected component (if applicable):
3.5.0-0.23.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Compile and run mbind_test.c from the description.
  
Actual results:
get_mempolicy() returns a different policy (MPOL_DEFAULT) than the one mbind() set (MPOL_BIND), and the returned nodemask is empty.

Expected results:
get_mempolicy() returns policy MPOL_BIND, and the returned nodemask matches the one passed to mbind().
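
The reproducer checks only the policy. The full expected-results condition also covers the nodemask, and it could be sketched as below. The helper names nodemask_equal and binding_took_effect are hypothetical, and the comparison assumes that only the first maxnode bits of each mask are meaningful.

```c
#include <limits.h>
#include <stddef.h>

/* Number of node bits held by one unsigned long word of a nodemask. */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: return 1 if the first maxnode bits of a and b
 * are identical, 0 otherwise. */
int nodemask_equal(const unsigned long *a, const unsigned long *b,
                   unsigned long maxnode)
{
        size_t full = maxnode / BITS_PER_WORD;  /* complete words */
        size_t rest = maxnode % BITS_PER_WORD;  /* bits in the last partial word */
        size_t i;

        for (i = 0; i < full; i++)
                if (a[i] != b[i])
                        return 0;
        if (rest) {
                unsigned long keep = (1UL << rest) - 1;

                if ((a[full] & keep) != (b[full] & keep))
                        return 0;
        }
        return 1;
}

/* Hypothetical helper: the policy and the nodemask reported by
 * get_mempolicy() must both match what was passed to mbind(). */
int binding_took_effect(int want_policy, int got_policy,
                        const unsigned long *want_mask,
                        const unsigned long *got_mask,
                        unsigned long maxnode)
{
        return got_policy == want_policy &&
               nodemask_equal(want_mask, got_mask, maxnode);
}
```

In mbind_test.c this would be called as `binding_took_effect(MPOL_BIND, policy, mask, retmask, MAXNODE)` after the get_mempolicy() call. On the failing kernel it returns 0, because the policy comes back as MPOL_DEFAULT and the returned mask is empty.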

Additional info:
Failing LTP testcases include mbind01, get_mempolicy01, and move_pages01.

Comment 1 Jan Stancek 2012-08-03 13:15:35 UTC
Also reproducible with upstream 3.6.0-rc1.

It works correctly if I revert the following commit:
commit 05f144a0d5c2207a0349348127f996e104ad7404
Author: Mel Gorman <mgorman>
Date:   Wed May 23 12:48:13 2012 +0100

    mm: mempolicy: Let vma_merge and vma_split handle vma->vm_policy linkages

    Dave Jones' system call fuzz testing tool "trinity" triggered the
    following bug error with slab debugging enabled

Comment 2 KOSAKI Motohiro 2012-08-06 19:05:55 UTC
I have a patch for upstream. I'll handle this.

thanks.

Comment 3 Jan Stancek 2012-10-03 09:21:45 UTC
After the rebase to 3.6, this issue is still reproducible:

mbind(0x7f2bf165e000, 4096, MPOL_BIND, {0x0000000000000001, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, ...}, 4096, 0) = 0
get_mempolicy(MPOL_DEFAULT, {000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, 000000000000000000, ...}, 4096, 0x7f2bf165e000, MPOL_F_ADDR) = 0

# uname -r
3.6.0-0.27.el7.x86_64

Comment 4 KOSAKI Motohiro 2012-10-09 21:21:29 UTC
The fix is now merged upstream (3.7-rc1).

Comment 5 Jan Stancek 2013-01-10 10:14:27 UTC
# uname -r
3.7.0-0.30.el7.x86_64
# ./mbind_test 
OK

The attached reproducer and the LTP testcases (mbind01, get_mempolicy01, move_pages01) pass; closing as CURRENTRELEASE.