Created attachment 362820 [details]
PID range checker

Description of problem:
After the fix for BZ 479182, a program (attached) that checks the PID range will trigger the OOM killer pretty quickly. So far this has only been observed on a particular machine with kernel-smp installed. The UP kernel and kernel-hugemem are not affected. The problem exists for both RHEL4.7.z and RHEL4.8.

Red Hat Enterprise Linux AS release 4 (Nahant Update 7)
Kernel 2.6.9-78.0.14.ELsmp on an i686

gs-bl460cg1-01.rhts.bos.redhat.com login: oom-killer: gfp_mask=0xd0
Mem-info:
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
cpu 1 hot: low 2, high 6, batch 1
cpu 1 cold: low 0, high 2, batch 1
cpu 2 hot: low 2, high 6, batch 1
cpu 2 cold: low 0, high 2, batch 1
cpu 3 hot: low 2, high 6, batch 1
cpu 3 cold: low 0, high 2, batch 1
cpu 4 hot: low 2, high 6, batch 1
cpu 4 cold: low 0, high 2, batch 1
cpu 5 hot: low 2, high 6, batch 1
cpu 5 cold: low 0, high 2, batch 1
cpu 6 hot: low 2, high 6, batch 1
cpu 6 cold: low 0, high 2, batch 1
cpu 7 hot: low 2, high 6, batch 1
cpu 7 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
cpu 2 hot: low 32, high 96, batch 16
cpu 2 cold: low 0, high 32, batch 16
cpu 3 hot: low 32, high 96, batch 16
cpu 3 cold: low 0, high 32, batch 16
cpu 4 hot: low 32, high 96, batch 16
cpu 4 cold: low 0, high 32, batch 16
cpu 5 hot: low 32, high 96, batch 16
cpu 5 cold: low 0, high 32, batch 16
cpu 6 hot: low 32, high 96, batch 16
cpu 6 cold: low 0, high 32, batch 16
cpu 7 hot: low 32, high 96, batch 16
cpu 7 cold: low 0, high 32, batch 16
HighMem per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
cpu 2 hot: low 32, high 96, batch 16
cpu 2 cold: low 0, high 32, batch 16
cpu 3 hot: low 32, high 96, batch 16
cpu 3 cold: low 0, high 32, batch 16
cpu 4 hot: low 32, high 96, batch 16
cpu 4 cold: low 0, high 32, batch 16
cpu 5 hot: low 32, high 96, batch 16
cpu 5 cold: low 0, high 32, batch 16
cpu 6 hot: low 32, high 96, batch 16
cpu 6 cold: low 0, high 32, batch 16
cpu 7 hot: low 32, high 96, batch 16
cpu 7 cold: low 0, high 32, batch 16
Free pages: 65117304kB (65101120kB HighMem)
Active:5996 inactive:2021 dirty:0 writeback:0 unstable:0 free:16279326 slab:88868 mapped:4435 pagetables:294
DMA free:12456kB min:64kB low:128kB high:192kB active:0kB inactive:0kB present:16384kB pages_scanned:0 all_unreclaimable? yes
protections[]: 0 0 0
Normal free:3728kB min:3728kB low:7456kB high:11184kB active:56kB inactive:316kB present:901120kB pages_scanned:1155 all_unreclaimable? yes
protections[]: 0 0 0
HighMem free:65101120kB min:512kB low:1024kB high:1536kB active:23928kB inactive:7768kB present:65929212kB pages_scanned:0 all_unreclaimable? no
protections[]: 0 0 0
DMA: 4*4kB 3*8kB 2*16kB 3*32kB 4*64kB 2*128kB 2*256kB 0*512kB 1*1024kB 1*2048kB 2*4096kB = 12456kB
Normal: 0*4kB 0*8kB 1*16kB 0*32kB 0*64kB 1*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3728kB
HighMem: 1034*4kB 549*8kB 261*16kB 191*32kB 149*64kB 69*128kB 32*256kB 12*512kB 3*1024kB 1*2048kB 15880*4096kB = 65101120kB
5330 pagecache pages
Swap cache: add 0, delete 0, find 0/0, race 0+0
0 bounce buffer pages
Free swap: 5406712kB
16711679 pages of RAM
16285268 pages of HIGHMEM
329568 reserved pages
6191 pages shared
0 pages swap cached
Kernel panic - not syncing: out of memory. panic_on_oom is selected

Version-Release number of selected component (if applicable):
kernel-2.6.9-78.0.14.EL
kernel-2.6.9-89.11.EL

How reproducible:
always

Steps to Reproduce:
1. reserve gs-bl460cg1-01.rhts.bos.redhat.com with RHEL4.7 i386 installed, via RHTS.
2. echo 99999 >/proc/sys/kernel/pid_max
3. gcc pidseqchk.c -o pidseqchk
4. ./pidseqchk

Actual results:
OOM.

Expected results:
No OOM.
After pulling this patch out of the 2.6.9-78.0.14.EL kernel, the OOM is gone:

linux-2.6.9-pidhashing-fix-alloc_pidmap.patch
Does this problem also contribute to this bug?

Bug 510371 - task_struct (and related slab caches) grow continuously in RHEL 4
https://bugzilla.redhat.com/show_bug.cgi?id=510371
I was trying to reproduce this on intel-s5000phb-01.rhts.eng.bos.redhat.com (x86_64) with no luck. However, on gs-bl460cg1-01.rhts.bos.redhat.com (i686) with the 2.6.9-78.0.14.ELsmp kernel it is triggered in a moment. I tried to take 2.6.9-78 and add only linux-2.6.9-pidhashing-fix-alloc_pidmap.patch; that kernel works just fine. It is on gs-bl460cg1-01.rhts.bos.redhat.com at the moment, named 2.6.9-78.EL.testsmp. I also looked into the code, and the patched alloc_pidmap() is almost identical to the ones in RHEL5 and upstream. Therefore I think the regression is brought in by a different patch, and my patch only uncovers the issue via the pidseqchk app. Thoughts?
Very true; I also didn't find any divergence from upstream in the patch.
Jiri, I have tried the 2.6.9-78.EL.testsmp kernel you mentioned, but it does not look like the patch is applied, since BZ #479182 is still reproducible there:

# echo 99999 >/proc/sys/kernel/pid_max
# ./pidseqchk
...
sequence break: 13868 - 32767 (new 65536)
sequence break: 65536 - 70188 (new 70190)
sequence break: 70190 - 98303 (new 131072)
sequence break: 131072 - 131072 (new 300)
...
I have rebuilt the kernels and confirmed that:

* 2.6.9-78.EL.smp + linux-2.6.9-pidhashing-fix-alloc_pidmap.patch = OK
* 2.6.9-78.0.5.EL.smp + linux-2.6.9-pidhashing-fix-alloc_pidmap.patch = OOM

I am continuing to bisect.
Update: Indeed, my 2.6.9-78.EL.testsmp didn't include linux-2.6.9-pidhashing-fix-alloc_pidmap.patch. I put it there as linux-kernel-test.patch, but it looks like it was ignored - Cai did it the same way, which is why he got a negative result with 2.6.9-78.EL.smp in comment #8. I did another build of 2.6.9-78.EL.smp which does include my patch (2.6.9-78.EL.test2smp), and I hit the OOM killer there too. So it looks like there really is something wrong with this patch. Investigation continues...
According to slabtop, size-4096 usage goes through the roof... Therefore the suspect is this line from the patch:

+ void *page = kzalloc(PAGE_SIZE, GFP_KERNEL);
During the pidseqchk run, slab-4096 usage grows to ~300MB, all of it allocated in lowmem. Because gs-bl460cg1-01 has 63GB of memory, lowmem under 2.6.9-78.EL.test2smp is small, and the pidseqchk run fills it completely and eventually triggers the OOM killer. This doesn't happen on systems with e.g. 1GB of RAM (intel-s5000phb-01), where lowmem is proportionally much bigger. With the hugemem kernel, lowmem is significantly bigger and the OOM doesn't appear. In any case, booting the 2.6.9-78.EL.test2smp kernel on gs-bl460cg1-01 gives you a warning that you are using >16GB of RAM.

Here are the relevant parts of /proc/meminfo for the hosts/kernels:

gs-bl460cg1-01.rhts.bos.redhat.com 2.6.9-78.EL.test2hugemem
MemTotal:     65526128 kB
MemFree:      65400908 kB
LowTotal:      2873716 kB
LowFree:       2829260 kB
-------------------------------------------------------------
gs-bl460cg1-01.rhts.bos.redhat.com 2.6.9-78.EL.test2smp
MemTotal:     65528440 kB
MemFree:      65406300 kB
LowTotal:       387368 kB
LowFree:        343580 kB
-------------------------------------------------------------
intel-s5000phb-01.rhts.eng.bos.redhat.com 2.6.9-78.EL.test2smp
MemTotal:      1028768 kB
MemFree:        909168 kB
LowTotal:       903516 kB
LowFree:        866468 kB

Hope this clears it up. Closing this as NOTABUG. Feel free to reopen.
Created attachment 363184 [details] Free Low memory watcher This application shows the low memory level and how it fluctuates.
I did some research on RHEL5 and found that kernel-PAE also has around 300M of low memory on this machine. However, the OOM cannot be reproduced there, because RHEL5 does not allow pid_max to be set above 32768 on 32-bit:

# echo -n 32769 >/proc/sys/kernel/pid_max
-bash: echo: write error: Invalid argument

Therefore, how about bringing similar behaviour into RHEL4 as well -- disallow setting pid_max above 32768 on 32-bit kernels other than hugemem?
Created attachment 363384 [details] untested patch
a.d. c#14: Okay Cai, that seems reasonable. The following two commits solve the problem:

http://linux.bkbits.net:8080/linux-2.6/?PAGE=gnupatch&REV=1.1938.166.68
http://linux.bkbits.net:8080/linux-2.6/?PAGE=gnupatch&REV=1.1938.166.69

I've already backported these to RHEL4 and I'm going to test them.
Jiri, one question though: is it no longer allowed to set pid_max above 0x8000 even with the hugemem kernel? Since hugemem has a 4G/4G split, I am not sure whether any existing RHEL4 customers expect a larger pid_max in that setup.
(In reply to comment #18) > Jiri, one question though. So, it is not allowed to set the max_pid bigger than > 0x8000 even with hugemem kernel anymore? Since it has 4G/4G split there, so I > am not sure if there is any existing RHEL4 customers expect larger max_pid in > that setup. Correct. This is the way it's done in upstream kernel and also in RHEL5 kernel.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Committed in 89.43.EL. RPMS are available at http://people.redhat.com/vgoyal/rhel4/
Reproduced in 89.42.ELsmp and verified in 89.43.ELsmp.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0263.html