Bug 699589

Summary: mm: check zone->all_unreclaimable in all_unreclaimable()
Product: Red Hat Enterprise Linux 6 Reporter: Qian Cai <qcai>
Component: kernel Assignee: Larry Woodman <lwoodman>
Status: CLOSED CURRENTRELEASE QA Contact: Madper Xie <cxie>
Severity: medium Docs Contact:
Priority: medium    
Version: 6.1 CC: aquini, ccui, jiajyang, liwan
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 682632 Environment:
Last Closed: 2014-02-26 14:19:46 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On: 645770, 682632    
Bug Blocks: 767187, 846704    
Attachments:
Description Flags
testing log on RHEL-6.5 none

Description Qian Cai 2011-04-26 05:33:07 UTC
Clone for RHEL6.

+++ This bug was initially created as a clone of Bug #682632 +++

Description of problem:
https://lkml.org/lkml/2011/3/5/121

Version-Release number of selected component (if applicable):
kernel-2.6.18-245.el5

How reproducible:
always

Steps to Reproduce:
https://lkml.org/lkml/2011/3/5/111
  
Actual results:
System memory was exhausted but no OOM was triggered.

Expected results:
Triggered OOM.

--- Additional comment from caiqian on 2011-03-07 00:19:06 EST ---

(In reply to comment #0)
Correction.

> Actual results:
> System memory was exhausted but no OOM was triggered.
The system hung in a deadlock.

> Expected results:
> Triggered OOM.
No deadlock or hang.

--- Additional comment from caiqian on 2011-04-11 10:46:11 EDT ---

Now, there is a new patchset.
http://marc.info/?l=linux-mm&m=130249983913965&w=2

Comment 1 Larry Woodman 2011-07-25 13:56:44 UTC
RHEL6 does not have the code that this BZ refers to or that the patch modifies.  I can't seem to get the system to hang as described here; can you?

Larry Woodman

Comment 2 Qian Cai 2011-07-25 14:15:56 UTC
Yes, we did reproduce this hang on RHEL6 before. We also tested a little upstream, but found that more patches are needed to fix another OOM issue:

http://marc.info/?l=linux-kernel&m=130587844107425&w=2

Currently, we are waiting for the OOM refresh to go into RHEL6.2 so that we can re-validate this problem.

Comment 9 Suzanne Logcher 2012-05-18 20:48:06 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 10 Larry Woodman 2014-01-23 20:30:16 UTC
It's been a long time since this BZ was opened and LOTS of upstream mm patches have been backported to RHEL6.  Can someone see if this is still a problem with RHEL6.5?

Larry Woodman

Comment 11 Cui Chun 2014-01-24 02:54:17 UTC
(In reply to Larry Woodman from comment #10)
> It's been a long time since this BZ was opened and LOTS of upstream mm patches
> have been backported to RHEL6.  Can someone see if this is still a problem
> with RHEL6.5?
> 
> Larry Woodman

Yes, Larry. QE will check it with RHEL6.5 and see if this is still a problem.

Thanks,
Chun

Comment 12 Li Wang 2014-01-24 07:14:56 UTC
Created attachment 854811 [details]
testing log on RHEL-6.5

Comment 13 Li Wang 2014-01-24 07:28:09 UTC
(In reply to Larry Woodman from comment #10)
> It's been a long time since this BZ was opened and LOTS of upstream mm patches
> have been backported to RHEL6.  Can someone see if this is still a problem
> with RHEL6.5?
> 
> Larry Woodman

Tested on RHEL-6.5 kernel 2.6.32-431.el6.x86_64: system memory was exhausted and the OOM killer was triggered. It seems the issue no longer exists on RHEL-6.5.

The full testing log is in the attachment; an excerpt follows.
---snip---
[ 7744]     0  7744    33567      131   7       0             0 python 
[ 7745]     0  7745    33567      130   4       0             0 python 
[ 7746]     0  7746    33567      130   7       0             0 python 
[ 7747]     0  7747    33567      132   0       0             0 python 
[ 7748]     0  7748    33567      132   4       0             0 python 
[ 7749]     0  7749    33567      130   4       0             0 python 
[ 7750]     0  7750    33567      130   5       0             0 python 
[ 7751]     0  7751    33567      132   3       0             0 python 
[ 7752]     0  7752    33567      130   1       0             0 python 
[ 7753]     0  7753    33567      116   6       0             0 python 
[ 7754]     0  7754    33567      119   3       0             0 python 
Out of memory: Kill process 1480 (rsyslogd) score 1 or sacrifice child 
Killed process 1480, UID 0, (rsyslogd) total-vm:249088kB, anon-rss:76kB, file-rss:24kB 
rs:main Q:Reg invoked oom-killer: gfp_mask=0xd0, order=0, oom_adj=0, oom_score_adj=0 
rs:main Q:Reg cpuset=/ mems_allowed=0 
Pid: 1481, comm: rs:main Q:Reg Not tainted 2.6.32-431.el6.x86_64 #1 
Call Trace: 
 [<ffffffff810d05b1>] ? cpuset_print_task_mems_allowed+0x91/0xb0 
 [<ffffffff81122960>] ? dump_header+0x90/0x1b0 
 [<ffffffff8122798c>] ? security_real_capable_noaudit+0x3c/0x70 
 [<ffffffff81122de2>] ? oom_kill_process+0x82/0x2a0 
 [<ffffffff81122d21>] ? select_bad_process+0xe1/0x120 
 [<ffffffff81123220>] ? out_of_memory+0x220/0x3c0 
 [<ffffffff8112fb3c>] ? __alloc_pages_nodemask+0x8ac/0x8d0 
 [<ffffffff8116e482>] ? kmem_getpages+0x62/0x170 
 [<ffffffff8116f09a>] ? fallback_alloc+0x1ba/0x270 
 [<ffffffff8116eaef>] ? cache_grow+0x2cf/0x320 
 [<ffffffff8116ee19>] ? ____cache_alloc_node+0x99/0x160 
 [<ffffffff8116fd9b>] ? kmem_cache_alloc+0x11b/0x190 
 [<ffffffff810efcf5>] ? taskstats_exit+0x305/0x390 
 [<ffffffff81076f17>] ? do_exit+0x157/0x870 
 [<ffffffff81077688>] ? do_group_exit+0x58/0xd0 
 [<ffffffff8108d046>] ? get_signal_to_deliver+0x1f6/0x460 
 [<ffffffff8100a265>] ? do_signal+0x75/0x800 
 [<ffffffff8118e7b4>] ? cp_new_stat+0xe4/0x100 
 [<ffffffff812334eb>] ? selinux_file_permission+0xfb/0x150 
 [<ffffffff810b1c0b>] ? sys_futex+0x7b/0x170 
 [<ffffffff8100aa80>] ? do_notify_resume+0x90/0xc0 
 [<ffffffff8100b341>] ? int_signal+0x12/0x17
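
To check which task the OOM killer actually picked in a log like the one above, the excerpt can be parsed with a short script. The following is only a minimal sketch: the regular expressions, the `parse_oom_log` helper, and the column names for the task rows (`uid`, `tgid`, `total_vm`, `rss`, `cpu`, `oom_adj`, `oom_score_adj`, `name`) are assumptions inferred from the lines shown here, since the excerpt does not include the table header. Other kernels may print these lines differently.

```python
import re

# Sample lines copied from the log excerpt above.
SAMPLE_LOG = """\
[ 7753]     0  7753    33567      116   6       0             0 python
[ 7754]     0  7754    33567      119   3       0             0 python
Out of memory: Kill process 1480 (rsyslogd) score 1 or sacrifice child
Killed process 1480, UID 0, (rsyslogd) total-vm:249088kB, anon-rss:76kB, file-rss:24kB
"""

# Assumed layout of a task-dump row; the column names are a guess,
# because the header line is not part of the excerpt.
TASK_ROW = re.compile(
    r"^\[\s*(?P<pid>\d+)\]\s+(?P<uid>\d+)\s+(?P<tgid>\d+)\s+(?P<total_vm>\d+)"
    r"\s+(?P<rss>\d+)\s+(?P<cpu>\d+)\s+(?P<oom_adj>-?\d+)"
    r"\s+(?P<oom_score_adj>-?\d+)\s+(?P<name>.+?)\s*$"
)

# Layout of the "Killed process" line exactly as it appears in the excerpt.
KILLED = re.compile(
    r"^Killed process (?P<pid>\d+), UID (?P<uid>\d+), \((?P<name>[^)]*)\) "
    r"total-vm:(?P<total_vm_kb>\d+)kB, anon-rss:(?P<anon_rss_kb>\d+)kB, "
    r"file-rss:(?P<file_rss_kb>\d+)kB"
)


def _to_record(match):
    """Convert a regex match to a dict, turning numeric fields into ints."""
    return {k: (v if k == "name" else int(v)) for k, v in match.groupdict().items()}


def parse_oom_log(text):
    """Return (tasks, killed): parsed task-dump rows and kill records."""
    tasks, killed = [], []
    for raw in text.splitlines():
        line = raw.strip()  # the log lines carry trailing spaces
        m = TASK_ROW.match(line)
        if m:
            tasks.append(_to_record(m))
            continue
        m = KILLED.match(line)
        if m:
            killed.append(_to_record(m))
    return tasks, killed


if __name__ == "__main__":
    tasks, killed = parse_oom_log(SAMPLE_LOG)
    for victim in killed:
        rss_kb = victim["anon_rss_kb"] + victim["file_rss_kb"]
        print(victim["name"], victim["pid"], rss_kb)
```

Running the parser over the full attachment, rather than this excerpt, would make it possible to compare the victim's resident memory against the other tasks in the dump.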

Comment 14 Li Wang 2014-01-24 07:32:59 UTC
(In reply to Li Wang from comment #13)
> (In reply to Larry Woodman from comment #10)
> > It's been a long time since this BZ was opened and LOTS of upstream mm patches
> > have been backported to RHEL6.  Can someone see if this is still a problem
> > with RHEL6.5?
> > 
> > Larry Woodman
> 
> Tested on RHEL-6.5 kernel 2.6.32-431.el6.x86_64: system memory was
> exhausted and the OOM killer was triggered. It seems the issue no longer
> exists on RHEL-6.5.
> 
> The full testing log is in the attachment; an excerpt follows.
> ---snip---
> [ 7744]     0  7744    33567      131   7       0             0 python 
> [ 7745]     0  7745    33567      130   4       0             0 python 
[...]

However, I found that the OOM killer was not killing the right task. The problem looks like this one:
https://bugzilla.redhat.com/show_bug.cgi?id=822790
This new OOM problem deserves attention.

Comment 15 Linda Wang 2014-02-26 14:19:46 UTC
*nod* Since bug 822790 addresses some of the cases where the OOM killer
kills the wrong tasks, closing this issue per comment #13.