Bug 620364

Summary: Parallel dd in tmpfs fails when total memory use passes 1/2 of total memory
Product: Red Hat Enterprise Linux 6 Reporter: Joy Pu <ypu>
Component: kernel Assignee: mm-maint-bot <mm-maint>
kernel sub component: Memory Management QA Contact: Chao Ye <cye>
Status: CLOSED NOTABUG Docs Contact:
Severity: medium    
Priority: low CC: agk, cye, jmarchan, wgomerin
Version: 6.1 Keywords: RHELNAK
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-04-11 11:24:22 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 846704, 1270638    

Description Joy Pu 2010-08-02 10:40:55 UTC
Description:
On a RHEL 6.0 host, parallel dd always fails with an "out of memory" warning when it tries to allocate more memory than 1/2 of physical memory plus swap. As a result, dd writes much less data than expected. The same test passes on RHEL 5.5. The failure occurs with transparent hugepages both off and on.

Version-Release number of selected component (if applicable):2.6.32-54.el6.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Mount tmpfs (using 8G, as swap is about 6G and memory is 4G)
mount -t tmpfs -o size=8G none /mnt
2. Run parallel dd
for i in `seq 5`; do dd if=/dev/zero of=/mnt/$i bs=1G count=1 & done
Since swap plus memory is about 10G on my test machine, this tries to bring the total memory size to 5G.
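
The "swap plus memory" figure used in the steps above can be checked with a small helper like the one below. This is only a sketch: the function name mem_plus_swap_kb is hypothetical, and the sample input uses the MemTotal and SwapTotal values from the /proc/meminfo output attached to this report.

```shell
# mem_plus_swap_kb [FILE]: sum MemTotal and SwapTotal (in kB) from
# meminfo-style input. Reads FILE if given, otherwise standard input.
mem_plus_swap_kb() {
    awk '$1 == "MemTotal:" || $1 == "SwapTotal:" { sum += $2 }
         END { printf "%d\n", sum }' "$@"
}

# Sample input: the two relevant lines from this report's /proc/meminfo.
printf 'MemTotal:        3858740 kB\nSwapTotal:       6094840 kB\n' |
    mem_plus_swap_kb
```

On a live Linux host the same helper can be pointed at the real file with "mem_plus_swap_kb /proc/meminfo".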

Actual results:
Parallel dd always fails with an "out of memory" warning

Expected results:
Parallel dd succeeds, as in RHEL 5.5

Additional info:
1. Memory info:
# cat /proc/meminfo 
MemTotal:        3858740 kB
MemFree:         3516900 kB
Buffers:           13640 kB
Cached:           132660 kB
SwapCached:            0 kB
Active:            86200 kB
Inactive:          74396 kB
Active(anon):      18216 kB
Inactive(anon):        0 kB
Active(file):      67984 kB
Inactive(file):    74396 kB
Unevictable:        5308 kB
Mlocked:            5308 kB
SwapTotal:       6094840 kB
SwapFree:        6094840 kB
Dirty:             15908 kB
Writeback:             0 kB
AnonPages:         19612 kB
Mapped:            15088 kB
Shmem:               260 kB
Slab:             135960 kB
SReclaimable:      19844 kB
SUnreclaim:       116116 kB
KernelStack:        1248 kB
PageTables:         3992 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     8024208 kB
Committed_AS:     168780 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      123012 kB
VmallocChunk:   34359607440 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:        6976 kB
DirectMap2M:     2023424 kB
DirectMap1G:     2097152 kB


2. Host cpuinfo
processor       : 2
vendor_id       : AuthenticAMD
cpu family      : 16
model           : 2
model name      : AMD Phenom(tm) 8750 Triple-Core Processor
stepping        : 3
cpu MHz         : 1200.000
cache size      : 512 KB
physical id     : 0
siblings        : 3
core id         : 2
cpu cores       : 3
apicid          : 2
initial apicid  : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs
bogomips        : 4809.89
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate

3. Out of memory information
Total swap = 6094840kB
1048575 pages RAM
84252 pages reserved
7489 pages shared
921283 pages non-shared
Out of memory: kill process 9135 (bash) score 46759 or a child
Killed process 11284 (dd) vsz:1153672kB, anon-rss:626228kB, file-rss:388kB
dd: page allocation failure. order:0, mode:0x200da
Pid: 11284, comm: dd Not tainted 2.6.32-54.el6.x86_64 #1
Call Trace:
 [<ffffffff8111d63f>] __alloc_pages_nodemask+0x65f/0x7e0
 [<ffffffff8114ea94>] alloc_pages_vma+0x84/0x110
 [<ffffffff81143259>] read_swap_cache_async+0xe9/0x140
 [<ffffffff81143c69>] ? valid_swaphandles+0x69/0x160
 [<ffffffff81143337>] swapin_readahead+0x87/0xc0
 [<ffffffff8113465c>] handle_pte_fault+0x6cc/0xa40
 [<ffffffff81134bbd>] handle_mm_fault+0x1ed/0x2b0
 [<ffffffff814d1453>] do_page_fault+0x123/0x3a0
 [<ffffffff814ceec5>] page_fault+0x25/0x30
 [<ffffffff8110be5f>] ? iov_iter_fault_in_readable+0x2f/0x60
 [<ffffffff811264a8>] ? shmem_write_end+0x48/0x60
 [<ffffffff8110d4fe>] generic_file_buffered_write+0xde/0x2a0
 [<ffffffff81071e27>] ? current_fs_time+0x27/0x30
 [<ffffffff8110ee80>] __generic_file_aio_write+0x250/0x480
 [<ffffffff8110f11f>] generic_file_aio_write+0x6f/0xe0
 [<ffffffff8116ab7a>] do_sync_write+0xfa/0x140
 [<ffffffff81091980>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff81207f2b>] ? selinux_file_permission+0xfb/0x150
 [<ffffffff811fb2c6>] ? security_file_permission+0x16/0x20
 [<ffffffff8116ae78>] vfs_write+0xb8/0x1a0
 [<ffffffff814d1484>] ? do_page_fault+0x154/0x3a0
 [<ffffffff8116b8b1>] sys_write+0x51/0x90
 [<ffffffff81013172>] system_call_fastpath+0x16/0x1b
Mem-Info:
Node 0 DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
CPU    1: hi:    0, btch:   1 usd:   0
CPU    2: hi:    0, btch:   1 usd:   0
Node 0 DMA32 per-cpu:
CPU    0: hi:  186, btch:  31 usd:   0
CPU    1: hi:  186, btch:  31 usd:   0
CPU    2: hi:  186, btch:  31 usd: 159
Node 0 Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd:   7
CPU    1: hi:  186, btch:  31 usd:  92
CPU    2: hi:  186, btch:  31 usd:  93
active_anon:730104 inactive_anon:184789 isolated_anon:0
 active_file:45 inactive_file:0 isolated_file:0
 unevictable:1330 dirty:0 writeback:40 unstable:0
 free:2 slab_reclaimable:4103 slab_unreclaimable:29329
 mapped:924 shmem:111229 pagetables:3621 bounce:0
Node 0 DMA free:8kB min:252kB low:312kB high:376kB active_anon:0kB inactive_anon:15712kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15356kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:6112kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
lowmem_reserve[]: 0 3447 3952 3952
Node 0 DMA32 free:0kB min:58724kB low:73404kB high:88084kB active_anon:2747812kB inactive_anon:550252kB active_file:92kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3529760kB mlocked:0kB dirty:0kB writeback:160kB mapped:0kB shmem:351024kB slab_reclaimable:1000kB slab_unreclaimable:908kB kernel_stack:24kB pagetables:9328kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:87232 all_unreclaimable? yes
lowmem_reserve[]: 0 0 505 505
Node 0 Normal free:0kB min:8600kB low:10748kB high:12900kB active_anon:172604kB inactive_anon:173192kB active_file:88kB inactive_file:0kB unevictable:5320kB isolated(anon):0kB isolated(file):0kB present:517120kB mlocked:5320kB dirty:0kB writeback:0kB mapped:3696kB shmem:87780kB slab_reclaimable:15412kB slab_unreclaimable:116408kB kernel_stack:1216kB pagetables:5156kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:2708 all_unreclaimable? yes
lowmem_reserve[]: 0 0 0 0
Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 0kB
Node 0 DMA32: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 0kB
Node 0 Normal: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 0kB
114439 total pagecache pages
2178 pages in swap cache
Swap cache stats: add 5986252, delete 5984074, find 1891436/2198961
Free swap  = 420kB
Total swap = 6094840kB
1048575 pages RAM
84252 pages reserved
7492 pages shared
942918 pages non-shared
dd: page allocation failure. order:0, mode:0x200da
Pid: 11284, comm: dd Not tainted 2.6.32-54.el6.x86_64 #1
Call Trace:
 [<ffffffff8111d63f>] __alloc_pages_nodemask+0x65f/0x7e0
 [<ffffffff8114ea94>] alloc_pages_vma+0x84/0x110
 [<ffffffff8112660c>] shmem_alloc_page+0x4c/0x60
 [<ffffffff81138ebb>] ? __vm_enough_memory+0x3b/0x190
 [<ffffffff811294f7>] shmem_getpage+0x157/0x900
 [<ffffffff8110be5f>] ? iov_iter_fault_in_readable+0x2f/0x60
 [<ffffffff81129d3d>] shmem_write_begin+0x2d/0x30
 [<ffffffff8110d52e>] generic_file_buffered_write+0x10e/0x2a0
 [<ffffffff81071e27>] ? current_fs_time+0x27/0x30
 [<ffffffff8110ee80>] __generic_file_aio_write+0x250/0x480
 [<ffffffff8110f11f>] generic_file_aio_write+0x6f/0xe0
 [<ffffffff8116ab7a>] do_sync_write+0xfa/0x140
 [<ffffffff81091980>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff81207f2b>] ? selinux_file_permission+0xfb/0x150
 [<ffffffff811fb2c6>] ? security_file_permission+0x16/0x20
 [<ffffffff8116ae78>] vfs_write+0xb8/0x1a0
 [<ffffffff814d1484>] ? do_page_fault+0x154/0x3a0
 [<ffffffff8116b8b1>] sys_write+0x51/0x90
 [<ffffffff81013172>] system_call_fastpath+0x16/0x1b
Mem-Info:
Node 0 DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
CPU    1: hi:    0, btch:   1 usd:   0
CPU    2: hi:    0, btch:   1 usd:   0
Node 0 DMA32 per-cpu:
CPU    0: hi:  186, btch:  31 usd:   0
CPU    1: hi:  186, btch:  31 usd:  77
CPU    2: hi:  186, btch:  31 usd: 166
Node 0 Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd:   7
CPU    1: hi:  186, btch:  31 usd:  92
CPU    2: hi:  186, btch:  31 usd:  93
active_anon:730057 inactive_anon:184581 isolated_anon:96
 active_file:45 inactive_file:0 isolated_file:0
 unevictable:1330 dirty:0 writeback:3 unstable:0
 free:64 slab_reclaimable:4103 slab_unreclaimable:29329
 mapped:924 shmem:111192 pagetables:3621 bounce:0
Node 0 DMA free:8kB min:252kB low:312kB high:376kB active_anon:0kB inactive_anon:15740kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15356kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:6112kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
lowmem_reserve[]: 0 3447 3952 3952
Node 0 DMA32 free:248kB min:58724kB low:73404kB high:88084kB active_anon:2747624kB inactive_anon:549648kB active_file:92kB inactive_file:0kB unevictable:0kB isolated(anon):128kB isolated(file):0kB present:3529760kB mlocked:0kB dirty:0kB writeback:12kB mapped:0kB shmem:350876kB slab_reclaimable:1000kB slab_unreclaimable:908kB kernel_stack:24kB pagetables:9328kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:64 all_unreclaimable? no
lowmem_reserve[]: 0 0 505 505
Node 0 Normal free:0kB min:8600kB low:10748kB high:12900kB active_anon:172680kB inactive_anon:172940kB active_file:88kB inactive_file:0kB unevictable:5320kB isolated(anon):128kB isolated(file):0kB present:517120kB mlocked:5320kB dirty:0kB writeback:0kB mapped:3696kB shmem:87780kB slab_reclaimable:15412kB slab_unreclaimable:116408kB kernel_stack:1216kB pagetables:5156kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:2772 all_unreclaimable? yes
lowmem_reserve[]: 0 0 0 0
Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 0kB
Node 0 DMA32: 58*4kB 2*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 248kB
Node 0 Normal: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 0kB
114328 total pagecache pages
2137 pages in swap cache
Swap cache stats: add 5986358, delete 5984221, find 1891436/2198961
Free swap  = 0kB
Total swap = 6094840kB
1048575 pages RAM
84252 pages reserved
7483 pages shared
942798 pages non-shared

Comment 2 RHEL Program Management 2010-08-02 11:08:03 UTC
This issue has been proposed when we are only considering blocker
issues in the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current
release, ask your support representative to file as a blocker on
your behalf. Otherwise ask that it be considered for the next
Red Hat Enterprise Linux release. **

Comment 3 RHEL Program Management 2010-08-18 21:25:52 UTC
Thank you for your bug report. This issue was evaluated for inclusion
in the current release of Red Hat Enterprise Linux. Unfortunately, we
are unable to address this request in the current release. Because we
are in the final stage of Red Hat Enterprise Linux 6 development, only
significant, release-blocking issues involving serious regressions and
data corruption can be considered.

If you believe this issue meets the release blocking criteria as
defined and communicated to you by your Red Hat Support representative,
please ask your representative to file this issue as a blocker for the
current release. Otherwise, ask that it be evaluated for inclusion in
the next minor release of Red Hat Enterprise Linux.

Comment 4 RHEL Program Management 2011-10-07 15:09:41 UTC
Since RHEL 6.2 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 5 Jerome Marchand 2016-04-11 11:11:19 UTC
(In reply to Joy Pu from comment #0)
> Description:
> In the RHEL6.0 host, Parallel dd always failed with "out of memory" warning
> when try to allocate a piece of memory which is bigger than 1/2 of physical
> memory plus swap. This makes the result of dd is much smaller than expect.
> The same test will pass in RHEL 5.5. And this will happen both with
> transparent hugepage off and on.
> 
> Version-Release number of selected component (if
> applicable):2.6.32-54.el6.x86_64
> 
> How reproducible:
> Always
> 
> Steps to Reproduce:
> 1. mount tmpfs(Using 8G as the swap is about 6G and memory is 4G)
> mount -t tmpfs -o size=8G none /mnt
> 2. Parallel dd
> for i in `seq 5`; do dd if=/dev/zero of=/mnt/$i bs=1G count=1 & done

This allocates 5 buffers of 1 GB each and writes 5 GB to a filesystem residing in memory. It is not surprising that one can reach an OOM condition under these circumstances. I don't know what changed between RHEL 5 and 6 that would explain the different behavior, but I would not consider this a bug.

An obvious workaround is to use a reasonable block size, e.g.:
for i in `seq 5`; do dd if=/dev/zero of=/mnt/$i bs=1M count=1024 & done
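
The arithmetic behind this comment can be sketched roughly as follows. It is an estimate under simple assumptions, not a model of kernel accounting: each dd holds one buffer of bs bytes, and every byte written stays in tmpfs (in RAM or swap). The helper name demand_kb is hypothetical; the RAM and swap figures are MemTotal and SwapTotal from comment 0.

```shell
# demand_kb JOBS BS_KB COUNT: rough peak demand in kB for JOBS parallel
# dd runs, each holding one BS_KB buffer and writing BS_KB * COUNT of
# data into tmpfs. Assumption: nothing else competes for memory.
demand_kb() {
    n=$1; bs_kb=$2; count=$3
    echo $(( n * bs_kb + n * bs_kb * count ))
}

# RAM plus swap on the reporter's machine, in kB (from comment 0).
avail_kb=$(( 3858740 + 6094840 ))

echo "RAM + swap:       $avail_kb kB"
echo "bs=1G count=1:    $(demand_kb 5 1048576 1) kB"
echo "bs=1M count=1024: $(demand_kb 5 1024 1024) kB"
```

Under these assumptions, bs=1G needs about 10485760 kB against 9953580 kB of RAM plus swap, while bs=1M needs about 5248000 kB for the same 5 GB of data.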