Bug 1365266 - [RHELSA-7.3][ltp-lite] move_pages04 1 TFAIL : move_pages04.c:131: status[1] is -14
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: kernel-aarch64
Version: 7.3
Hardware: aarch64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: 7.4
Assignee: Kernel Drivers
QA Contact: Jeff Bastian
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-08-08 18:14 UTC by PaulB
Modified: 2016-10-27 21:20 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-10-27 21:20:48 UTC
Target Upstream Version:
Embargoed:



Description PaulB 2016-08-08 18:14:37 UTC
Description of problem:
 Running KT1 on target system listed in comment#1,
/kernel/distribution/ltp/lite had the following failure:
 move_pages04    1  TFAIL  :  move_pages04.c:131: status[1] is -14

Version-Release number of selected component (if applicable):
 distro: RHEL-7.3-20160707.2 Server aarch64
 kernel: 4.5.0-3.el7
 task: /kernel/distribution/ltp/lite 20160510-7 

How reproducible:
 unknown

Steps to Reproduce:
1. Install target system listed in comment#1 with 
   RHEL-7.3-20160707.2 Server aarch64
2. Install kernel-4.5.0-3.el7
3. run /kernel/distribution/ltp/lite 20160510-7 

Actual results:
https://beaker.engineering.redhat.com/jobs/1435778
https://beaker.engineering.redhat.com/recipes/2939557#task44065244
http://beaker-archive.app.eng.bos.redhat.com/beaker-logs/2016/08/14357/1435778/2939557/44065244/217420387/resultoutputfile.log
---<-snip->---
move_pages04                   FAIL       1   
---<-snip->---

http://beaker-archive.app.eng.bos.redhat.com/beaker-logs/2016/08/14357/1435778/2939557/44065244/RHELKT1LITE.FILTERED.run.log
---<-snip->---
<<<test_start>>>
tag=move_pages04 stime=1470443314
cmdline="move_pages.sh 04"
contacts=""
analysis=exit
<<<test_output>>>
move_pages04    1  TFAIL  :  move_pages04.c:131: status[1] is -14
<<<execution_status>>>
initiation_status="ok"
duration=0 termination_type=exited termination_id=1 corefile=no
cutime=1 cstime=0
<<<test_end>>>
---<-snip->---

Expected results:
 move_pages04 subtask completes successfully

Additional info:

Comment 3 PaulB 2016-08-24 14:54:39 UTC
All,
Issue persists and is isolated to cavium-thunderx2 hosts.
distro: RHEL-7.3-20160817.1 Server aarch64
kernel: 4.5.0-5.el7
host: cavium-thunderx2

https://beaker.engineering.redhat.com/recipes/2993875#task44806350
http://beaker-archive.app.eng.bos.redhat.com/beaker-logs/2016/08/14624/1462439/2993875/44806350/RHELKT1LITE.FILTERED.run.log
---<-snip->---
<<<test_start>>>
tag=move_pages04 stime=1471986926
cmdline="move_pages.sh 04"
contacts=""
analysis=exit
<<<test_output>>>
move_pages04    1  TFAIL  :  move_pages04.c:131: status[1] is -14
<<<execution_status>>>
initiation_status="ok"
duration=0 termination_type=exited termination_id=1 corefile=no
cutime=1 cstime=0
<<<test_end>>>
---<-snip->---

Best,
-pbunyan

Comment 4 Jon Masters 2016-08-29 20:10:37 UTC
Yeah - it'll be specific to NUMA systems, such as ThunderX. Following up.

Comment 6 Jan Stancek 2016-09-21 12:35:14 UTC
Looks like the test needs an update after the following kernel commit:

commit d899844e9c98c9c74b4d9926fd3bd66a225f6978
Author: Kirill A. Shutemov <kirill.shutemov.com>
Date:   Fri Sep 4 15:47:53 2015 -0700

    mm: fix status code which move_pages() returns for zero page
    
    The manpage for move_pages(2) specifies that status code for zero page is
    supposed to be -EFAULT.  Currently kernel return -ENOENT in this case.
    
    follow_page() can do it for us, if we would ask for FOLL_DUMP.  The use of
    FOLL_DUMP also means that the upper layer page tables pages are no longer
    allocated.
    
    Signed-off-by: Kirill A. Shutemov <kirill.shutemov.com>
    Reviewed-by: Christoph Lameter <cl>
    Cc: Hugh Dickins <hughd>
    Signed-off-by: Andrew Morton <akpm>
    Signed-off-by: Linus Torvalds <torvalds>
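
For reference, -14 is -EFAULT on Linux, so the value in the TFAIL line is the status the commit above says the kernel now reports for the zero page, where it previously reported -ENOENT. A minimal sketch of a check that tolerates both behaviours follows. The helper name untouched_page_status_ok and its index parameter are hypothetical, and this is not necessarily how the LTP test itself was updated.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not taken from LTP): decide whether the status
 * entry for a page that was never touched is acceptable.
 *
 * Kernels without commit d899844e9c98 report -ENOENT for such a page;
 * kernels with it report -EFAULT, as documented in move_pages(2).
 */
bool untouched_page_status_ok(const int *status, size_t index)
{
    assert(status != NULL);
    return status[index] == -ENOENT || status[index] == -EFAULT;
}
```

With the values from the failing run, status[1] is -14, so untouched_page_status_ok(status, 1) returns true, while a page that was actually placed on a node (status >= 0) is rejected.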

Comment 7 Jan Stancek 2016-10-03 13:48:09 UTC
(In reply to Jan Stancek from comment #6)
> Looks like the test needs an update after the following kernel commit:

Posted: http://lists.linux.it/pipermail/ltp/2016-October/002728.html

# uname -r
4.8.0-1.el7.test.x86_64

# numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3
node 0 size: 5933 MB
node 0 free: 3952 MB
node 1 cpus: 4 5 6 7
node 1 size: 6047 MB
node 1 free: 3759 MB
node distances:
node   0   1 
  0:  10  20 
  1:  20  10

# ./move_pages04 
move_pages04    1  TPASS  :  status[1] has expected value
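
Since the failure only shows up on NUMA hosts, it can help to confirm the node count before reading too much into a result. The sketch below pulls the node count out of the first line of the numactl -H output shown above. The function name parse_available_nodes is hypothetical, and the line format is assumed to match the sample output in this comment.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: extract N from a line of the form
 * "available: N nodes (...)", as printed by "numactl -H".
 * Returns -1 if the line does not have that shape.
 */
int parse_available_nodes(const char *line)
{
    int nodes;

    assert(line != NULL);
    if (sscanf(line, "available: %d nodes", &nodes) != 1 || nodes < 0)
        return -1;
    return nodes;
}
```

For the host above, parse_available_nodes("available: 2 nodes (0-1)") yields 2. A caller could compare that against whatever minimum node count the test is assumed to need.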

Comment 8 Jan Stancek 2016-10-06 07:26:49 UTC
(In reply to Jan Stancek from comment #7)
> (In reply to Jan Stancek from comment #6)
> > Looks like the test needs an update after the following kernel commit:
> 
> Posted: http://lists.linux.it/pipermail/ltp/2016-October/002728.html

https://github.com/linux-test-project/ltp/commit/d539a004dde3b760f610ef7cae90a96de8489ec8

Comment 9 John Feeney 2016-10-27 21:09:21 UTC
So is this a test problem that only is exhibited on NUMA enabled systems?

If so, the bz should be moved somewhere so it can be addressed properly.

Thanks.

Comment 10 Jan Stancek 2016-10-27 21:20:48 UTC
(In reply to John Feeney from comment #9)
> So is this a test problem that only is exhibited on NUMA enabled systems?

Yes.

> If so, the bz should be moved somewhere so it can be addressed properly.

It has already been addressed, both upstream (comment 8) and internally (tests/kernel commit 970e332d14a).

Closing as NOTABUG.

