Bug 134829

Summary: kernel not "un-caching" and thinks it is running out of memory
Product: Red Hat Enterprise Linux 3
Component: kernel
Version: 3.0
Hardware: All
OS: Linux
Status: CLOSED NOTABUG
Severity: medium
Priority: medium
Reporter: Joshua Jensen <joshua>
Assignee: Larry Woodman <lwoodman>
QA Contact: Brian Brock <bbrock>
CC: petrides, riel
Last Closed: 2004-10-14 17:44:00 UTC
Attachments:
  vmstat taken every 2 seconds
  iostat taken every 2 seconds
  slabinfo taken every 2 seconds
  OOM messages

Description Joshua Jensen 2004-10-06 15:45:25 UTC
Description of problem:

Starting with a freshly rebooted machine, I write a BIG file to SAN
attached storage:

dd if=/dev/zero of=/esite001/testfile bs=1G count=10
                                                                     
         
I watch the cached memory go from something like 17 megs to 12 gigs:

[jjensen@foxhound jjensen]$ free -m
             total       used       free     shared    buffers  cached
Mem:         32025      10494      21531          0         16   10044
-/+ buffers/cache:        433      31592
Swap:         4094          0       4094

All is usually stable at this point... but when I issue:
dd if=/dev/zero of=/esite001/testfile2 bs=1G count=10

Everything falls apart.  In dmesg I see this:
ENOMEM in journal_alloc_journal_head, retrying.
Mem-info:
Zone:DMA freepages:   424 min:     0 low:     0 high:     0
Zone:Normal freepages:  1190 min:  1279 low:  4544 high:  6304
Zone:HighMem freepages:5075753 min:   255 low:127488 high:191232
Free pages:      5077366 (5075752 HighMem)
( Active: 41477/2289682, inactive_laundry: 343654, inactive_clean:
343432, free: 5077365 )
  aa:0 ac:0 id:0 il:0 ic:0 fr:424
  aa:0 ac:3395 id:310 il:375 ic:0 fr:1190
  aa:12375 ac:25707 id:2289250 il:343405 ic:343432 fr:5075747
0*4kB 0*8kB 0*16kB 1*32kB 0*64kB 1*128kB 0*256kB 1*512kB 1*1024kB
0*2048kB 0*4096kB = 1696kB)
478*4kB 34*8kB 5*16kB 0*32kB 1*64kB 1*128kB 1*256kB 0*512kB 0*1024kB
1*2048kB 0*4096kB = 4760kB)
1*4kB 0*8kB 1*16kB 0*32kB 1*64kB 0*128kB 0*256kB 0*512kB 1*1024kB
1*2048kB 4956*4096kB = 20302932kB)
Swap cache: add 0, delete 0, find 0/0, race 0+0
89320 pages of slabcache
488 pages of kernel stacks
488 pages of kernel stacks
0 lowmem pagetables, 831 highmem pagetables
Free swap:       4192956kB
8388608 pages of RAM
8093680 pages of HIGHMEM
190101 reserved pages
2317494 pages shared
0 pages swap cached
Out of Memory: Killed process 31670 (nscd).
Out of Memory: Killed process 31685 (nscd).
Out of Memory: Killed process 31686 (nscd).
Out of Memory: Killed process 31687 (nscd).
Out of Memory: Killed process 31688 (nscd).
Out of Memory: Killed process 31689 (nscd).

The process that is killed seems random... sometimes it is bash,
sometimes auditd, etc.

It looks to me, in my simple understanding of the kernel's VM, that
the kernel is feeling memory starved and killing processes, even
though it has 12 GIGS of cache it could easily get rid of.  Setting
vm.pagecache to 1 20 50 doesn't help.
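
(For completeness, this is roughly how I set that tunable; just a
sketch, assuming the RHEL3 vm.pagecache sysctl takes the three
percentages min/borrow/max:)

sysctl -w vm.pagecache="1 20 50"          # 1% min, 20% borrow, 50% max
echo "1 20 50" > /proc/sys/vm/pagecache   # equivalent, written directly via /proc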


Version-Release number of selected component (if applicable):

RHEL3 U2, with this as the kernel:

Linux machinename 2.4.21-15.0.4.ELsmp #1 SMP Sat Jul 31 01:25:25 EDT
2004 i686 i686 i386 GNU/Linux

Comment 1 Larry Woodman 2004-10-06 18:15:05 UTC
Joshua, the problem here is that the normal memory zone is exhausted.
Of the ~225000 pages the system started with there are only ~5000 left
and there are ~90000 in the slabcache, leaving ~120000 lowmem pages
unaccounted for.  This is typically due to a buggy driver that
allocates lowmem and never frees it or a huge number of bounce buffers
outstanding.  Can you get vmstat, iostat and "cat /proc/slabinfo"
outputs while the system is running so I can see where the memory is
going?
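
Something along these lines is enough (a sketch; assumes sysstat is
installed for iostat, and the 2-second interval is only a suggestion):

vmstat 2 > vmstat.out &
iostat 2 > iostat.out &
while true; do cat /proc/slabinfo; sleep 2; done > slabinfo.out &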

Also, please use the latest RHEL3-U4 candidate kernel to get this
info; it is located here:

>>>http://people.redhat.com/~lwoodman/.RHEL3/


Larry


Comment 2 Joshua Jensen 2004-10-07 14:41:31 UTC
Ok then... just loaded the latest kernel and it doesn't solve this
problem by itself.  I'll attach the vmstat, iostat, and cat
/proc/slabinfo output.  These readings were taken for 10 minutes or
so, up until the box essentially locked up.

Comment 3 Joshua Jensen 2004-10-07 14:57:01 UTC
Created attachment 104896 [details]
vmstat taken every 2 seconds

Comment 4 Joshua Jensen 2004-10-07 14:57:41 UTC
Created attachment 104897 [details]
iostat taken every 2 seconds

Comment 5 Joshua Jensen 2004-10-07 14:58:45 UTC
Created attachment 104898 [details]
slabinfo taken every 2 seconds

Comment 6 Larry Woodman 2004-10-07 18:15:42 UTC
And did you still get OOM kills?  If so, please attach the dmesg
output as well.

Larry


Comment 7 Joshua Jensen 2004-10-07 19:36:03 UTC
No OOM kill messages, though I was watching dmesg.  The box just
locked up this time after I saw:

ENOMEM in journal_alloc_journal_head, retrying.

However, from /var/log/messages I see this after a power reset:

Oct  6 19:02:51 foxhound kernel: ENOMEM in journal_alloc_journal_head,
retrying.
Oct  6 19:03:02 foxhound kernel: Mem-info:
Oct  6 19:03:02 foxhound kernel: Zone:DMA freepages:  1293 min:     0
low: 0 high:     0
Oct  6 19:03:02 foxhound kernel: Zone:Normal freepages:   638 min: 
1279 low: 4544 high:  6304



Comment 8 Joshua Jensen 2004-10-12 14:18:03 UTC
Thoughts?

Comment 9 Larry Woodman 2004-10-12 20:26:15 UTC
Joshua, can you retry it and see if you get more data from AltSysrq-M?
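
If the console key chord is awkward to get at, a sketch of doing the
same thing from any shell (assuming sysrq support and
/proc/sysrq-trigger are available on this 2.4 kernel):

echo 1 > /proc/sys/kernel/sysrq    # make sure sysrq is enabled
echo m > /proc/sysrq-trigger       # same effect as Alt-SysRq-M: dump memory info to the kernel log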

Larry
 

Comment 10 Joshua Jensen 2004-10-13 19:28:57 UTC
I have access to a serial console that goes to a concentrator that I
can telnet to, an HP ILO "console", and an ssh session.  I can't get
any of them to work with Alt-SysRq-M (yes, I enabled it).  Are they
supposed to work, or do I have to have a direct serial or keyboard
connection?

Comment 11 Larry Woodman 2004-10-13 19:36:40 UTC
Joshua, this latest hang that you saw was my bug.  Can you grab the
appropriate kernel from here and give it a try?

>>>http://people.redhat.com/coughlan/RHEL3-perf-test/


Larry



Comment 12 Joshua Jensen 2004-10-13 22:07:33 UTC
When booting from this kernel, I see from dmesg this:

scsi::resize_dma_pool: WARNING, dma_sectors=19632, wanted=26160, scaling
      WARNING, not enough memory, pool not expanded

Not sure if this has anything to do with anything though

Comment 13 Joshua Jensen 2004-10-13 22:12:12 UTC
Hmmm... just before the  scsi::resize_dma_pool message I see about 100
of these lines:

Oct 13 18:06:23 foxhound kernel: Unable to attach sg device <3, 0, 0,
229> type=0, minor number exceed 255
Oct 13 18:06:23 foxhound kernel: Unable to attach sg device <3, 0, 0,
230> type=0, minor number exceed 255
Oct 13 18:06:23 foxhound kernel: Unable to attach sg device <3, 0, 0,
231> type=0, minor number exceed 255
Oct 13 18:06:23 foxhound kernel: Unable to attach sg device <3, 0, 0,
232> type=0, minor number exceed 255


Yes, this is attached to a massive SAN setup.  Don't know if this is
related though.

Comment 14 Joshua Jensen 2004-10-13 22:20:12 UTC
Created attachment 105166 [details]
OOM messages

Comment 15 Joshua Jensen 2004-10-13 22:21:30 UTC
Ok... with the new perf-test kernel, I can still get dd to crash
things... and this time it gave lots of OOM messages.  See the
attached file I just created in Comment #14

Comment 16 Larry Woodman 2004-10-14 14:26:08 UTC
Wait a minute here, you are running an SMP kernel on a 32GB system!

We don't support that; more than half of lowmem is consumed by the
mem_map at boot time.  Please grab the hugemem kernel and run with
that, and let me know ASAP if it runs without problems.
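
Rough arithmetic (a sketch, assuming ~60 bytes per struct page on
this 2.4/i686 kernel):

echo $(( 32 * 1024 * 1024 / 4 ))        # 4kB pages in 32GB of RAM -> 8388608
echo $(( 8388608 * 60 / 1024 / 1024 ))  # approximate mem_map size in MB -> ~480MB
# ZONE_NORMAL (lowmem) is only ~896MB on i686, so the mem_map alone eats over half of it.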


Larry


Comment 17 Joshua Jensen 2004-10-14 17:24:22 UTC
Back to the original kernel we started off with, 15.0.4, in hugemem form.
Everything is working fine:

             total       used       free     shared    buffers    cached
Mem:         32277      23090       9186          0        113     20161
-/+ buffers/cache:       2815      29462
Swap:         4094          0       4094


What is the line between the smp and hugemem kernel... is it 16 gigs
or more of memory?

Comment 18 Larry Woodman 2004-10-14 17:44:00 UTC
Yes, up to 16GB the smp kernel should be used, and above 16GB the
hugemem kernel should be used.
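
A quick way to sanity-check a box (just standard commands):

uname -r                       # a hugemem kernel shows a "hugemem" suffix, e.g. 2.4.21-15.0.4.ELhugemem
grep MemTotal /proc/meminfo    # total RAM, to see which side of the 16GB line you are on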

BTW, the installation procedure should follow these rules; if it does
not, please open a bug for that.

Thanks for your help, Joshua.

Larry Woodman