Bug 670063

Summary: pages stuck in ksm pages_volatile
Product: Red Hat Enterprise Linux 6
Component: kernel
Version: 6.1
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Target Milestone: rc
Target Release: ---
Keywords: Reopened
Reporter: Qian Cai <qcai>
Assignee: Andrea Arcangeli <aarcange>
QA Contact: Caspar Zhang <czhang>
CC: aarcange, czhang, qcai, tburke
Fixed In Version: kernel-2.6.32-121.el6
Doc Type: Bug Fix
Last Closed: 2011-05-23 20:37:48 UTC

Description Qian Cai 2011-01-17 03:14:31 UTC
Description of problem:
LTP ksm01 failed due to incorrect pages_volatile values.

# ./ksm01 -s 512
ksm01       0  TINFO  :  child 0 stops.
ksm01       0  TINFO  :  KSM merging...
ksm01       0  TINFO  :  child 1 stops.
ksm01       0  TINFO  :  wait for all children to stop.
ksm01       0  TINFO  :  child 2 stops.
ksm01       0  TINFO  :  resume all children.
ksm01       0  TINFO  :  child 0 continues...
ksm01       0  TINFO  :  child 1 continues...
ksm01       0  TINFO  :  child 2 continues...
ksm01       0  TINFO  :  child 0 allocates 512 MB filled with 'c'.
ksm01       0  TINFO  :  child 1 allocates 512 MB filled with 'a'.
ksm01       0  TINFO  :  child 2 allocates 512 MB filled with 'a'.
ksm01       0  TINFO  :  child 1 stops.
ksm01       0  TINFO  :  child 2 stops.
ksm01       0  TINFO  :  child 0 stops.
ksm01       0  TINFO  :  check!
ksm01       0  TINFO  :  run is 1.
ksm01       0  TINFO  :  pages_shared is 2.
ksm01       0  TINFO  :  pages_sharing is 393214.
ksm01       0  TINFO  :  pages_volatile is 0.
ksm01       0  TINFO  :  pages_unshared is 0.
ksm01       0  TINFO  :  sleep_millisecs is 0.
ksm01       0  TINFO  :  pages_to_scan is 393216.
ksm01       0  TINFO  :  wait for child 1 to stop.
ksm01       0  TINFO  :  resume child 1.
ksm01       0  TINFO  :  child 1 continues...
ksm01       0  TINFO  :  child 1 verifies memory content.
ksm01       0  TINFO  :  child 1 changes memory content to 'b'.
ksm01       0  TINFO  :  child 1 stops.
ksm01       0  TINFO  :  check!
ksm01       0  TINFO  :  run is 1.
ksm01       0  TINFO  :  pages_shared is 3.
ksm01       0  TINFO  :  pages_sharing is 393213.
ksm01       0  TINFO  :  pages_volatile is 0.
ksm01       0  TINFO  :  pages_unshared is 0.
ksm01       0  TINFO  :  sleep_millisecs is 0.
ksm01       0  TINFO  :  pages_to_scan is 393216.
ksm01       0  TINFO  :  wait for child 1 to stop.
ksm01       0  TINFO  :  resume all children.
ksm01       0  TINFO  :  child 1 continues...
ksm01       0  TINFO  :  child 0 continues...
ksm01       0  TINFO  :  child 2 continues...
ksm01       0  TINFO  :  child 1 verifies memory content.
ksm01       0  TINFO  :  child 2 changes memory content to 'd'
ksm01       0  TINFO  :  child 0 verifies memory content.
ksm01       0  TINFO  :  child 0 changes memory content to 'd'.
ksm01       0  TINFO  :  child 1 changes memory content to 'd'
ksm01       0  TINFO  :  child 2 stops.
ksm01       0  TINFO  :  check!
ksm01       0  TINFO  :  run is 1.
ksm01       0  TINFO  :  pages_shared is 3.
ksm01       1  TFAIL  :  pages_shared is not 1.
ksm01       0  TINFO  :  pages_sharing is 393192.
ksm01       2  TFAIL  :  pages_sharing is not 393215.
ksm01       0  TINFO  :  pages_volatile is 22.
ksm01       3  TFAIL  :  pages_volatile is not 0.
ksm01       0  TINFO  :  pages_unshared is 0.
ksm01       0  TINFO  :  sleep_millisecs is 0.
ksm01       0  TINFO  :  pages_to_scan is 393216.
ksm01       0  TINFO  :  wait for all children to stop.
ksm01       0  TINFO  :  child 0 stops.
ksm01       0  TINFO  :  resume child 1.
ksm01       0  TINFO  :  child 1 continues...
ksm01       0  TINFO  :  child 1 verifies memory content.
ksm01       0  TINFO  :  child 1 changes one page to 'e'.
ksm01       0  TINFO  :  child 1 stops.
ksm01       0  TINFO  :  check!
ksm01       0  TINFO  :  run is 1.
ksm01       0  TINFO  :  pages_shared is 1.
ksm01       0  TINFO  :  pages_sharing is 393214.
ksm01       0  TINFO  :  pages_volatile is 0.
ksm01       0  TINFO  :  pages_unshared is 1.
ksm01       0  TINFO  :  sleep_millisecs is 0.
ksm01       0  TINFO  :  pages_to_scan is 393216.
ksm01       0  TINFO  :  wait for child 1 to stop.
ksm01       0  TINFO  :  resume all children.
ksm01       0  TINFO  :  KSM unmerging...
ksm01       0  TINFO  :  child 0 continues...
ksm01       0  TINFO  :  child 1 continues...
ksm01       0  TINFO  :  child 2 continues...
ksm01       0  TINFO  :  child 1 verifies memory content.
ksm01       0  TINFO  :  child 2 stops.
ksm01       0  TINFO  :  child 1 verifies memory content.
ksm01       0  TINFO  :  child 0 verifies memory content.
ksm01       0  TINFO  :  child 1 stops.
ksm01       0  TINFO  :  child 0 stops.
ksm01       0  TINFO  :  check!
ksm01       0  TINFO  :  run is 2.
ksm01       0  TINFO  :  pages_shared is 0.
ksm01       0  TINFO  :  pages_sharing is 0.
ksm01       0  TINFO  :  pages_volatile is 0.
ksm01       0  TINFO  :  pages_unshared is 0.
ksm01       0  TINFO  :  sleep_millisecs is 0.
ksm01       0  TINFO  :  pages_to_scan is 393216.
ksm01       0  TINFO  :  wait for all children to stop.
ksm01       0  TINFO  :  resume all children.
ksm01       0  TINFO  :  stop KSM.
ksm01       0  TINFO  :  child 0 continues...
ksm01       0  TINFO  :  child 2 continues...
ksm01       0  TINFO  :  child 1 continues...
ksm01       0  TINFO  :  check!
ksm01       0  TINFO  :  run is 0.
ksm01       0  TINFO  :  pages_shared is 0.
ksm01       0  TINFO  :  pages_sharing is 0.
ksm01       0  TINFO  :  pages_volatile is 0.
ksm01       0  TINFO  :  pages_unshared is 0.
ksm01       0  TINFO  :  sleep_millisecs is 0.
ksm01       0  TINFO  :  pages_to_scan is 393216.
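
The expected counter values in the log above appear to follow from simple page arithmetic. The sketch below is not part of ksm01; it assumes a 4 KiB base page size (no huge pages) and three children, and the function name and parameters are made up for illustration. It reproduces the numbers the test checks against once KSM has finished merging.

```python
# Sketch only: expected KSM counters for the ksm01 scenario.
PAGE_SIZE = 4096  # assumption: x86_64 base pages, no huge pages


def expected_counters(size_mb, children=3, distinct_contents=1, odd_pages=0):
    """Return (pages_shared, pages_sharing, pages_unshared, pages_volatile).

    size_mb           -- allocation size per child, as passed to `ksm01 -s`
    distinct_contents -- number of different fill patterns that can be merged
    odd_pages         -- pages whose content is unique and cannot be merged
    """
    total = children * size_mb * 1024 * 1024 // PAGE_SIZE
    pages_shared = distinct_contents      # one KSM page per fill pattern
    pages_unshared = odd_pages            # unique pages stay unmerged
    # Every remaining page is mapped onto one of the shared KSM pages.
    pages_sharing = total - pages_shared - pages_unshared
    pages_volatile = 0                    # nothing left in flux after merging
    return pages_shared, pages_sharing, pages_unshared, pages_volatile


print(expected_counters(512, distinct_contents=2))  # (2, 393214, 0, 0)
print(expected_counters(512))                       # (1, 393215, 0, 0)
print(expected_counters(512, odd_pages=1))          # (1, 393214, 1, 0)
```

With all three children filled with 'd', the test expects pages_shared 1, pages_sharing 393215 and pages_volatile 0; the failing check reported 3, 393192 and 22 instead.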

# numactl --hardware
available: 4 nodes (0-3)
node 0 cpus: 0 1 2 3 4 5
node 0 size: 2047 MB
node 0 free: 1652 MB
node 1 cpus: 6 7 8 9 10 11
node 1 size: 2045 MB
node 1 free: 1791 MB
node 2 cpus: 18 19 20 21 22 23
node 2 size: 2048 MB
node 2 free: 1929 MB
node 3 cpus: 12 13 14 15 16 17
node 3 size: 2048 MB
node 3 free: 1743 MB
node distances:
node   0   1   2   3 
  0:  10  16  16  16 
  1:  16  10  16  16 
  2:  16  16  10  16 
  3:  16  16  16  10 

Version-Release number of selected component (if applicable):
kernel in bug 647334#c7

How reproducible:
always

Steps to Reproduce:
1. git clone git://git.engineering.redhat.com/users/qcai/ltp.git
2. cd ltp; make autotools; ./configure; make
3. cd testcases/kernel/mem/ksm/
4. ./ksm01 -s 512 (depends on memory size)
  
Actual results:
The test failed

Expected results:
The test passed

Additional info:
The test starts to fail once the memory allocation size passes a system-dependent threshold. On the AMD Magny-Cours system above, it failed when the allocation size was 512M and beyond, and passed when it was 384M and below.

On an Intel Nehalem-EX system, it failed when the allocation size was 640M and beyond.

# numactl --hardware
available: 4 nodes (0-3)
node 0 cpus: 0 1 2 3 4 5 6 7 32 33 34 35 36 37 38 39
node 0 size: 16265 MB
node 0 free: 15234 MB
node 1 cpus: 8 9 10 11 12 13 14 15 40 41 42 43 44 45 46 47
node 1 size: 16384 MB
node 1 free: 15718 MB
node 2 cpus: 16 17 18 19 20 21 22 23 48 49 50 51 52 53 54 55
node 2 size: 16384 MB
node 2 free: 15781 MB
node 3 cpus: 24 25 26 27 28 29 30 31 56 57 58 59 60 61 62 63
node 3 size: 16384 MB
node 3 free: 15726 MB
node distances:
node   0   1   2   3 
  0:  10  21  21  21 
  1:  21  10  21  21 
  2:  21  21  10  21 
  3:  21  21  21  10 

# ./ksm01 -s 640
ksm01       0  TINFO  :  child 0 stops.
ksm01       0  TINFO  :  KSM merging...
ksm01       0  TINFO  :  wait for all children to stop.
ksm01       0  TINFO  :  child 1 stops.
ksm01       0  TINFO  :  child 2 stops.
ksm01       0  TINFO  :  resume all children.
ksm01       0  TINFO  :  child 1 continues...
ksm01       0  TINFO  :  child 1 allocates 640 MB filled with 'a'.
ksm01       0  TINFO  :  child 2 continues...
ksm01       0  TINFO  :  child 2 allocates 640 MB filled with 'a'.
ksm01       0  TINFO  :  child 0 continues...
ksm01       0  TINFO  :  child 0 allocates 640 MB filled with 'c'.
ksm01       0  TINFO  :  child 2 stops.
ksm01       0  TINFO  :  child 0 stops.
ksm01       0  TINFO  :  child 1 stops.
ksm01       0  TINFO  :  check!
ksm01       0  TINFO  :  run is 1.
ksm01       0  TINFO  :  pages_shared is 2.
ksm01       0  TINFO  :  pages_sharing is 491518.
ksm01       0  TINFO  :  pages_volatile is 0.
ksm01       0  TINFO  :  pages_unshared is 0.
ksm01       0  TINFO  :  sleep_millisecs is 0.
ksm01       0  TINFO  :  pages_to_scan is 491520.
ksm01       0  TINFO  :  wait for child 1 to stop.
ksm01       0  TINFO  :  resume child 1.
ksm01       0  TINFO  :  child 1 continues...
ksm01       0  TINFO  :  child 1 verifies memory content.
ksm01       0  TINFO  :  child 1 changes memory content to 'b'.
ksm01       0  TINFO  :  child 1 stops.
ksm01       0  TINFO  :  check!
ksm01       0  TINFO  :  run is 1.
ksm01       0  TINFO  :  pages_shared is 3.
ksm01       0  TINFO  :  pages_sharing is 491517.
ksm01       0  TINFO  :  pages_volatile is 0.
ksm01       0  TINFO  :  pages_unshared is 0.
ksm01       0  TINFO  :  sleep_millisecs is 0.
ksm01       0  TINFO  :  pages_to_scan is 491520.
ksm01       0  TINFO  :  wait for child 1 to stop.
ksm01       0  TINFO  :  resume all children.
ksm01       0  TINFO  :  child 2 continues...
ksm01       0  TINFO  :  child 2 changes memory content to 'd'
ksm01       0  TINFO  :  child 0 continues...
ksm01       0  TINFO  :  child 1 continues...
ksm01       0  TINFO  :  child 1 verifies memory content.
ksm01       0  TINFO  :  child 0 verifies memory content.
ksm01       0  TINFO  :  child 2 stops.
ksm01       0  TINFO  :  child 0 changes memory content to 'd'.
ksm01       0  TINFO  :  child 1 changes memory content to 'd'
ksm01       0  TINFO  :  child 0 stops.
ksm01       0  TINFO  :  check!
ksm01       0  TINFO  :  run is 1.
ksm01       0  TINFO  :  pages_shared is 1.
ksm01       0  TINFO  :  pages_sharing is 438415.
ksm01       1  TFAIL  :  pages_sharing is not 491519.
ksm01       0  TINFO  :  pages_volatile is 53099.
ksm01       2  TFAIL  :  pages_volatile is not 0.
ksm01       0  TINFO  :  pages_unshared is 0.
ksm01       0  TINFO  :  sleep_millisecs is 0.
ksm01       0  TINFO  :  pages_to_scan is 491520.
ksm01       0  TINFO  :  wait for all children to stop.
ksm01       0  TINFO  :  resume child 1.
ksm01       0  TINFO  :  child 1 continues...
ksm01       0  TINFO  :  child 1 verifies memory content.
ksm01       0  TINFO  :  child 1 changes one page to 'e'.
ksm01       0  TINFO  :  child 1 stops.
ksm01       0  TINFO  :  check!
ksm01       0  TINFO  :  run is 1.
ksm01       0  TINFO  :  pages_shared is 1.
ksm01       0  TINFO  :  pages_sharing is 491518.
ksm01       0  TINFO  :  pages_volatile is 0.
ksm01       0  TINFO  :  pages_unshared is 1.
ksm01       0  TINFO  :  sleep_millisecs is 0.
ksm01       0  TINFO  :  pages_to_scan is 491520.
ksm01       0  TINFO  :  wait for child 1 to stop.
ksm01       0  TINFO  :  resume all children.
ksm01       0  TINFO  :  KSM unmerging...
ksm01       0  TINFO  :  child 2 continues...
ksm01       0  TINFO  :  child 0 continues...
ksm01       0  TINFO  :  child 2 stops.
ksm01       0  TINFO  :  child 1 continues...
ksm01       0  TINFO  :  child 1 verifies memory content.
ksm01       0  TINFO  :  child 1 verifies memory content.
ksm01       0  TINFO  :  child 0 verifies memory content.
ksm01       0  TINFO  :  child 0 stops.
ksm01       0  TINFO  :  child 1 stops.
ksm01       0  TINFO  :  check!
ksm01       0  TINFO  :  run is 2.
ksm01       0  TINFO  :  pages_shared is 0.
ksm01       0  TINFO  :  pages_sharing is 0.
ksm01       0  TINFO  :  pages_volatile is 0.
ksm01       0  TINFO  :  pages_unshared is 0.
ksm01       0  TINFO  :  sleep_millisecs is 0.
ksm01       0  TINFO  :  pages_to_scan is 491520.
ksm01       0  TINFO  :  wait for all children to stop.
ksm01       0  TINFO  :  resume all children.
ksm01       0  TINFO  :  stop KSM.
ksm01       0  TINFO  :  child 2 continues...
ksm01       0  TINFO  :  child 0 continues...
ksm01       0  TINFO  :  child 1 continues...
ksm01       0  TINFO  :  check!
ksm01       0  TINFO  :  run is 0.
ksm01       0  TINFO  :  pages_shared is 0.
ksm01       0  TINFO  :  pages_sharing is 0.
ksm01       0  TINFO  :  pages_volatile is 0.
ksm01       0  TINFO  :  pages_unshared is 0.
ksm01       0  TINFO  :  sleep_millisecs is 0.
ksm01       0  TINFO  :  pages_to_scan is 491520.

Comment 1 Andrea Arcangeli 2011-01-20 01:34:34 UTC
I assume you're using my latest kernel build, which should include the fix for this.

The lru_add_drain_all happens only at the end of a full ksm scan. That means the bigger the memory size to scan, the less frequently the flush will happen.

Before reading pages_volatile, can you wait for /sys/kernel/mm/ksm/full_scans to increase twice? That may fix it if it was a timing issue, because we only drain the LRU at the end of the scan.
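
A minimal sketch of the wait suggested above, assuming a simple polling loop is acceptable. This is not the change that went into the LTP test; the helper names, the timeout and the poll interval are made up for illustration, and only the sysfs file names come from this report.

```python
import os
import time

KSM_DIR = "/sys/kernel/mm/ksm"


def read_counter(name, ksm_dir=KSM_DIR):
    """Read one integer KSM counter, e.g. full_scans or pages_volatile."""
    with open(os.path.join(ksm_dir, name)) as f:
        return int(f.read().strip())


def wait_for_full_scans(increment=2, ksm_dir=KSM_DIR, timeout=300.0, poll=0.1):
    """Block until full_scans has grown by `increment`; return its new value.

    Two completed scans give ksmd a chance to drain the LRU at the end of a
    scan and then rescan, so pages_volatile is read in a settled state.
    """
    target = read_counter("full_scans", ksm_dir) + increment
    deadline = time.monotonic() + timeout  # hypothetical safety limit
    while True:
        current = read_counter("full_scans", ksm_dir)
        if current >= target:
            return current
        if time.monotonic() > deadline:
            raise TimeoutError("full_scans did not reach %d" % target)
        time.sleep(poll)


if __name__ == "__main__":
    wait_for_full_scans()
    print("pages_volatile is", read_counter("pages_volatile"))
```

The `ksm_dir` parameter exists only so the helper can be exercised against a scratch directory; on a real system the default path is used and reading the counters requires KSM support in the running kernel.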

Comment 2 Qian Cai 2011-01-20 03:42:01 UTC
Yes, you are right. I'll fix the test cases.

Comment 3 Andrea Arcangeli 2011-02-14 16:39:06 UTC
fix posted to rhkernel-list Message-ID: <20110214163829.GF6494>

Comment 4 RHEL Program Management 2011-02-14 20:59:55 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has 
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed 
products. This request is not yet committed for inclusion in an Update release.

Comment 5 Aristeu Rozanski 2011-03-10 17:57:24 UTC
Patch(es) available on kernel-2.6.32-121.el6

Comment 9 errata-xmlrpc 2011-05-23 20:37:48 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0542.html