140861 – (VM) Kernel prefers swapping instead of releasing cache memory

Bug 140861 - (VM) Kernel prefers swapping instead of releasing cache memory

Summary: (VM) Kernel prefers swapping instead of releasing cache memory

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Red Hat Enterprise Linux 3
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	3.0
Hardware:	i386
OS:	Linux
Priority:	medium
Severity:	high
Target Milestone:	---
Assignee:	Larry Woodman
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+	depends on / blocked

Reported:	2004-11-25 18:40 UTC by Marc-Christian Petersen
Modified:	2007-11-30 22:07 UTC (History)
CC List:	10 users (show)
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2007-10-19 19:13:15 UTC
Target Upstream Version:
Embargoed:

Attachments	(Terms of Use)
experimental patch to page_anon (503 bytes, patch) 2004-11-26 17:53 UTC, Rik van Riel	no flags	Details \| Diff
better zone balancing (1.61 KB, patch) 2004-12-06 20:09 UTC, Marc-Christian Petersen	no flags	Details \| Diff
info requested during swapping (17.87 KB, text/plain) 2005-09-21 04:44 UTC, Rob	no flags	Details
View All

Description Marc-Christian Petersen 2004-11-25 18:40:42 UTC

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; rv:1.7.3) Gecko/20041001
Fuckraccoon/0.10.1

Description of problem:
Hi,

this is simply a re-open of
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=89226 as it
_also_ and _still_ applies to RHEL3 kernels.

[root@scalix root]# cat /proc/meminfo 
        total:    used:    free:  shared: buffers:  cached:
Mem:  1977856000 617140224 1360715776        0 111403008 362024960
Swap: 534634496   532480 534102016
MemTotal:      1931500 kB
MemFree:       1328824 kB
MemShared:           0 kB
Buffers:        108792 kB
Cached:         353020 kB
SwapCached:        520 kB
Active:         190836 kB
ActiveAnon:      19332 kB
ActiveCache:    171504 kB
Inact_dirty:    283300 kB
Inact_laundry:   58996 kB
Inact_clean:     26736 kB
Inact_target:   111972 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:      1931500 kB
LowFree:       1328824 kB
SwapTotal:      522104 kB
SwapFree:       521584 kB
[root@scalix root]# uptime
 19:22:08  up 23:21,  2 users,  load average: 0.00, 0.00, 0.00
[root@scalix root]# rpm -q kernel
kernel-2.4.21-20.EL
[root@scalix root]#

For what on earth does the kernel swap if there is 1,3GB memory
_available_? .. Guys, you've been ignoring this bug for over a year
now. I've supplied vm_anon_lru for latest RHEL3 kernel in #89226 which
prevents rmap from braindead swapping instead releasing cache. There's
only one bug vs. rmap: if one application allocates more RAM than the
system has, it oom kills the application instead of starting to swap.
This has something to do with active/inactive. As I am _not_ a VM
guru, I am unable to fix this 100% on my own. I really cry for help
now that you, Rik, Larry, take a deep look into this issue and fix
this long outstanding bug up. I am here to test whatever you come up
with (I offered this also some months ago via private email, you may
remember) ... Maybe a look at vm_mapped_ratio in 2.4 mainline / 2.4-aa
may help here too.

ciao, Marc

Version-Release number of selected component (if applicable):
kernel-source-2.4.21-20.EL

How reproducible:
Always

Steps to Reproduce:
1. run updatedb and/or start some applications
2. watch cache growing and growing, see the VM swapping and cache
growing and growing w/o stopping to grow.
3.
    

Actual Results:  Silly swapping instead releasing cache.

Expected Results:  Release cache and if it's almost empty, swap. Or a
knob which controls when and how to swap.

Additional info:

Comment 1 Rik van Riel 2004-11-25 18:54:13 UTC

Marc,

quite a few of the issues you mentioned have been fixed already, in kernel
2.4.21-25.EL.  Please retest with the RHEL3 Update 4 beta kernel and let us know
if things are still misbehaving.  Chances are there are still some corner cases
left, but we can only change things very slowly in RHEL, otherwise we end up
introducing regressions...

Comment 2 Marc-Christian Petersen 2004-11-25 21:42:28 UTC

Hi Rik,

hmm, before I submitted this bugzilla entry, I searched the redhat ftp
and only found 20.EL. I also did up2date -u and nothing newer was
available. Funny :) ... Now searching google I found 2 hits.
Downloading it currently and retest things.

Thank you.

ciao, Marc

Comment 3 Rik van Riel 2004-11-25 22:31:33 UTC

I suspect they're in the beta channel, which is why up2date won't see
them by default.

Comment 4 Marc-Christian Petersen 2004-11-26 12:32:06 UTC

Hi Rik,

ok, for me nothing has changed in any direction. It still prefers to
swap braindeadly. I guess this is a never-ending story with rmap ;(

ciao, Marc

Comment 5 Marc-Christian Petersen 2004-11-26 16:12:20 UTC

Hi again,

a fix for all this issues is really vm_anon_lru set to 0 (in 2.4-aa,
SLES and since yesterday also 2.4 mainline). With this I can see cache
shrinking and if it's almost empty, the VM starts to swap. The only
problem is, I wrote that above, that with vm_anon_lru 0 (means feature
enabled) and a program wants to allocate more memory than memtotal,
the process is killed via OOM instead using available swap space. For
you it might be a trivial joke to fix vm_anon_lru up to work properly
with rmap, but for me it's not possible, as I don't understand the VM.
It's still a mistery for me :(

If you have interest in fixing this, send me whatever you may create
then, I test it.

http://linux.bkbits.net:8080/linux-2.4/cset@1.1540?nav=index.html|ChangeSet@-4d

ciao, Marc

Comment 6 Rik van Riel 2004-11-26 17:53:23 UTC

Created attachment 107488 [details]
experimental patch to page_anon

Marc, I suspect this patch (by Larry Woodman) can be considered the -rmap
equivalent of the upstream patch you quoted.  The main reason we have not added
it into RHEL3 yet is that we do not know whether it will cause regressions and
are still testing it to make sure it does the right thing.

If you feel like testing it, please do. Your test results can help us decide
whether or not the patch is safe to add to RHEL3.

Comment 7 Marc-Christian Petersen 2004-11-26 18:02:46 UTC

Hey Rik,

coolio =) ... Recompiling now.

Thanks alot.

ciao, Marc

Comment 8 Marc-Christian Petersen 2004-11-26 18:06:05 UTC

Hi again,

sorry, my fingers and brain were too fast. I already tested this
patch, it's this:

http://linuxvm.bkbits.net:8080/linux-2.4-rmap/gnupatch@40928fe12mmfR1jtsu8yWoGP2Yn3ig

and it did not help at all too.

Well, it's almost that one. So should I only test that one you
attached? I doubt that it'll make a difference, but anyway, I'll test.

ciao, Marc

Comment 9 Rik van Riel 2004-11-26 18:11:09 UTC

The patch didn't use to help, because other parts of the VM were still
buggy.  However, now that Larry has fixed a lot of other parts of the
VM,  I suspect the patch might actually work as advertised...

Comment 10 Larry Woodman 2004-11-27 13:42:34 UTC

Marc, please try 2.4.21-26.EL first.  If that doesnt work for you
apply the page_anon patch and if that still doesnt work get us several
AltSysrq-M outputs so we can debug the problem you are having.

Larry Woodman

Comment 11 Marc-Christian Petersen 2004-11-28 01:57:52 UTC

Hi Larry,

ok, but sorry, where do I get it? I found .25 on google but it seems
.26 is too new ...

ciao, Marc

Comment 12 Marc-Christian Petersen 2004-11-28 14:13:57 UTC

bleh, I mean, it isn't available in the RHEL3 beta channels. Maybe .26
was a typo and you meant .25? ;)

ciao, Marc

Comment 13 Larry Woodman 2004-11-29 19:14:23 UTC

Marc, I copied the i686 smp and hugemem kernels here:

>>>http://people.redhat.com/~lwoodman/RHEL3/


Larry

Comment 14 Marc-Christian Petersen 2004-11-29 20:13:35 UTC

Hi Larry,

cool. Thanks. Trying now ... May I have the source rpm too?

ciao, Marc

Comment 15 Larry Woodman 2004-11-29 20:53:14 UTC

OK Marc, the src.rpm is at the same location.

Larry

Comment 16 Marc-Christian Petersen 2004-11-29 21:00:14 UTC

Hi Larry,

thanks alot :) ... Ok, just for the record:

I've tested -25.EL with page-anon.patch for the past days and I have
to say, that lots of the braindamage from earlier kernels is gone. The
whole system is alot smoother while doing heavy things (I/O, swapping
and such) and the VM really starts to free cache before starting to
swap. Note: this is -25.EL with page-anon (w/o page-anon it's still
kind of braindamaged (the VM I mean)) - It's still not 100% perfect
but you are almost there. Thanks for all the work you've done to the
VM. I am now playing with -26.EL and give it a shot too and let you
know how things are there.

Please stay tuned.

ciao, Marc

Comment 17 Marc-Christian Petersen 2004-11-29 21:22:40 UTC

Hi again,

ok, I diff'ed -25.EL and -26.EL to find differences and the only
difference is in fs/binfmt_elf.c so I doubt anything will change with
.26 from .25 in VM direction.

ciao, Marc

Comment 18 Larry Woodman 2004-11-29 21:35:54 UTC

You are right Marc, there is no VM diffs between .25 and .26!  As far
as the page_anon patch is concerned it is and will not be part of
RHEL3-U4 because it hasnt received the testing it needs.  Your imput
will help determin if its the correct thing to do for U5.  

Please try experimenting with /proc/sys/vm/pagecache.maxpercnent(third
value) in without the page_anon patch and see if it helps your system.
 The default value in U4 is 30%, increasing it will cause more
swapping and decreasing it will cause less.  Obviously without the
page_anon patch we will reactivate mapped pagecache pages as though
they were anonymous and that will cause inconsistencies.

Larry Woodman

Comment 19 Marc-Christian Petersen 2004-11-29 22:57:25 UTC

Hi Larry,

well, my experience (we talked about it via private email some time
ago) is that /proc/sys/vm/pagecache is a pseudo tweak tunable for me
as it does not change anything, well, almost not anything. Remember
when I said that pagecache = 1 10 10 is the only way to go to fix up
some of the braindamage of the VM? :) I've played with pagecache again
and it still does not make a difference. I tried for example 1 15 16
or 1 15 20 or 1 5 10 and such but with all these different values the
VM starts to swap very very soon. I can easily trigger it on my p4 1gb
ram: start vmware with winxp (256MB ram), start Quake3 and use a small
map -> voila: ~20mb in swap where cache is ~700 megabyte and growing
and growing and swap usage is also growing on a per minute basis
gaming quake3. Example: I play quake3 for 15 minutes and I am
150-200mb in swap. For my usage it seems page_anon is the only way to
go for now. _With_ page_anon patch applied and default pagecache 1 15
30 (U4) I can easily start vmware with winxp inside, start quake3,
play it, even start complete KDE, Mozilla Firefox, xchat and whatnot
and I am not in swap, instead I see cache shrinkage (as I expected to
see :)

I use that RHEL3 kernel on my desktop (beside the scalix groupware
machine where it was bought for) as it's easier to trash^Wtest the VM
there.

You don't seem to like page_anon ;) so I assume it's time now for some
sysrq-M things? :)

Another note: It _seems to me_ that if there is a program that
allocates alot of ram at once (like vmware does, like quake3 does),
the VM is too slow to catch that up and starts to swap as there isn't
lots of free memory available.

ciao, Marc

Comment 20 Larry Woodman 2004-11-30 14:52:09 UTC

Marc, first of all, I wrote the page_anon patch so yes I do like it. 
However, we need to make sure that it does ahve benefit and does not
have negative side effects that over-shadow the positive side effects.

What the system does without the page_anon patch is to move pagecache
pages to the active anon list when they are re-activated if they a
mapped.  This means that applications that mmap lots of file pages and
push the system into heavy page reclamation will swap rather than
reclaiming pagecache pages.  With the page_anon patch, those mapped
pagecache pages will be reactivated back onto the active cache list
rather than the active anon list and this will prevent the swapping.
However, the pages of critical mapped files such as libc, libx, etc,
will also be reactivated to the active cache list and that has the
potential of causing an overall performance degradation.  This is what
needs to be tested and if I dont find any degradation I will make sure
it goes into U5.

Evdently vmware, mmap()s lots of file pages and that is why it runs so
much better with the page_anon patch.  If this

Larry Woodman

Comment 21 Marc-Christian Petersen 2004-12-02 11:20:02 UTC

Hi Larry,

I am very sorry but I had by accident the vm_anon_lru patch I've
supplied in #89226 applied to the kernel source and were using that
one enabled and disabled it when there was almost no free memory
(page_anon was applied too). Now disabled vm_anon_lru at all and only
using page_anon the system is still swapping like an idiot. I am able
to get a good working kernel when the following is done:

1. boot with vm_anon_lru to 0 (feature enabled)
2. Fillup memory (i.e. start X, kmail, firefox, vmware etc.)
3. When there is no swap usage, set vm_anon_lru to 1 (feature
   disabled)

Now the kernel needs a very long time to start to swap.

Overall: page_anon alone does nothing for me. pagecache tweak does
nothing for me too.

ciao, Marc

Comment 22 Larry Woodman 2004-12-02 14:34:36 UTC

Marc, pardon my ignorance about this but I dont know what vm_anon_lru
patch you are talking about.  There is no vm_anon_lru variable in
RHEL3, it was removed as part of the rmap changes.  Can you attach
that patch so I can see it.  Also, please explain the workload you are
running.

Larry

Comment 23 Marc-Christian Petersen 2004-12-02 16:04:18 UTC

Hi Larry,

there has never been vm_anon_lru in RHEL because is was just merged in
2.4.29-pre1 ;) ... I've told about vm_anon_lru at the very first
beginning of this bugreport. vm_anon_lru comes from 2.4-aa tree from
Andrea Arcangeli and it is in all SLES kernels.

ciao, Marc

Comment 24 Marc-Christian Petersen 2004-12-02 16:57:08 UTC

Hi again,

we just got a report from a customer that they also see this kind of
braindamage with RHEL3 kernels. 8GB memtotal, 2GB swap partition which
is 100% _used_ with ~6GB _free_ memory.

The machine is running Oracle.

ciao, Marc

Comment 25 Larry Woodman 2004-12-02 18:13:40 UTC

That(swapping with lots of free memory) is typically caused by lowmem
exhaustion.  Lowmem is constrained to ~1GB on smp kernels and ~4GB on
hughmem kernels and it can be exhausted by the kernel before
highmem(the remaining ~5GB on an smp kernel) and therefore the system
can swapout lowmem in order to reclaim it.

Please ask the customer to open up a bug and get me an AltSysrq-M and
a /proc/slabinfo output when his system is in that state.

Larry Woodman

Comment 26 Marc-Christian Petersen 2004-12-02 18:42:46 UTC

Hi Larry,

I've asked the customer to do so and he will fill out the needed
things in this bug since it perfectly fits in here.

My personal opinion: I've search RH bugzilla and found uncountable
bugreports of the same thing I've reported here, tons of users
supplied meminfo dump, slabinfo, process listing and whatnot and it
did not help, or at least it did not help much to get rid of the
problems so much customers have. I doubt anything our customer will
provide here will help to fix things, b/c it'll be almost kind of the
same things tons of others already supplied to bugzilla.

Anyway, I may be silly or dumb in VM related things but vm_anon_lru
might really help you|us to get rid of the problems. There's only one
problem left (I've described it at the beginning of this bug). Did you
take a look at that one or should I attach it here so you can have a look?

ciao, Marc

Comment 27 Rik van Riel 2004-12-02 18:49:33 UTC

Marc,

adding the vm_anon_lru patch to -rmap would result in the system being
completely unable to swap out any anonymous memory, leading to an OOM
kill whenever the anonymous memory is using up all of a memory zone.

This is because rmap (and upstream 2.6) walk the LRU lists to find
freeable pages, and never walk the process page tables.

Comment 28 Marc-Christian Petersen 2004-12-02 18:53:16 UTC

Hi Rik,

that's what I said in the very first beginning (ok, not so detailed as
you now but ... ;)

Anyway, I thought I share my experience with it so you and Larry might
get an idea what might be wrong.

ciao, Marc

Comment 29 Larry Woodman 2004-12-03 01:25:44 UTC

Marc, sorry but I cant seem to find any bugs that show 8GB total, 6GB
free and the 2GB swapfile totally used.  Like I said, this is
indicitive of lowmem exhaustion but it is very unusual.  I'd really
appreciate gettting an AltSysrq-M from a system in such a strang
state, I've never seen something that odd.

Thanks, Larry

Comment 30 Marc-Christian Petersen 2004-12-06 20:09:39 UTC

Created attachment 107985 [details]
better zone balancing

Comment 31 Marc-Christian Petersen 2004-12-06 20:10:21 UTC

Hi Larry, Rik,

sorry, but my customer still did not do the things I've asked for, so
we all have to wait now :(

Beside that, I now have a perfectly working VM, and I mean perfectly
:) (for my workload) ... It's awesome. The only things I've
incorporated into the current RHEL3 -26.EL kernel are:

1. page_anon
2. attached patch above.
3. pagecache set to 1 10 10

and for interactivity (plain RHEL3 kernels are just too slow like a
dog when there is cpu load):

https://bugzilla.redhat.com/bugzilla/attachment.cgi?id=103064&action=view
https://bugzilla.redhat.com/bugzilla/attachment.cgi?id=103065&action=view


I want to introduce a "tuning" boot parameter thing, or a /proc value
to change the behaviour. Imagine, you boot with it and the VM will be
tuned to use the above, if you ommit the parameter, everything is
default again like it is now in -26.EL. The parameter name is just
pulled out of my &$&($ ;) ... We can for sure name it different. What
do you think? If you agree, I'll cook up something.

So page_anon can be safely integrated, even in U4 which defaults to
off and if you boot with the special parameter (or tune a /proc value)
page_anon becomes active along the other things)

ciao, Marc

Comment 32 Larry Woodman 2005-01-11 20:33:51 UTC

Marc, the page_anon change has already been committed into what will
become RHEL3-U5.  However, the fixup_freespace patch that you included
looks questionable.  The only caller of fixup_freespace(__alloc_pages)
will never make the call unless direct_reclaim is set.  This makes the
check for direct_reclaim inside the fixup_freespace() routine a big
no-op anyway!
------------------- __alloc_pages() -------------------------------
       if (direct_reclaim) {
                for (;;) {
                        zone_t *z = *(zone++);
                        if (!z)
                                break;
                        if (!z->size)
                                continue;
                        if (z->free_pages < z->pages_min)
                                fixup_freespace(z, direct_reclaim);
                }
        }
------------------------------------------------------------------

Can you run the pre-RHEL3-U5 kernel with just the anon_page change and
see if this alone corrects the problem you are seeing?

Thanks, Larry Woodman

Comment 33 Marc-Christian Petersen 2005-01-12 14:29:34 UTC

Hi Larry,

where do I find that pre-RHEL3-U5 kernel? Or in other words, what's
the version number of it?

ciao, Marc

Comment 34 LDB 2005-07-13 15:07:47 UTC

My enterprise clients are experiencing this bug sporadically. I am not sure
where this bug was OFFICIALLY fixed. Can you please let me know which RHEL U this
was officially fixed? I thought it was 2.4.21-27.0.4 but that is just a guess.

Thank you,

LDB

Comment 35 Ernie Petrides 2005-07-22 00:23:10 UTC

This bug has not been fixed in any officially released RHEL3 kernel,
which is why this bugzilla report is still in ASSIGNED state.

Comment 36 Rob 2005-07-28 20:19:13 UTC

We also see this on our RHE 3.5 server, where we have 1 GB of physical memory, 
of which there is always 650+ MB of buffers cached (i.e. plenty of memory 
available), but Swap goes up to 75-125 MB of swap used throughout the day with 
no decrease in available physical memory. For us, it seems the culrpits are 
background processes such as spamd from spamassassin and clamd from clamav, 
which do not spike throughout the day, but seem to consume large amounts of 
swap space. We do not see this on our RHE 4.x servers, or older 2.2.x kernels.

Restarting spamd and clamd frees up 95% of used swap space, but it shouldn't go 
to this level in the first place. If you need us to get any data, just let me 
know.

Thanks.

Rob

Comment 37 rodrigo 2005-08-02 15:02:04 UTC

Hi, we have exactly the same bug on RH3 U5. 

When it will be solved ?

Comment 38 rodrigo 2005-08-02 15:09:56 UTC

Hi, we have exactly the same bug on RH3 U5. 

When it will be solved ?

Comment 39 Rob 2005-08-26 14:42:03 UTC

Does anyone know if RH3 Beta U6 solves this bug in swap space being consumed 
prematurely?

Rob

Comment 40 Ernie Petrides 2005-08-26 18:35:20 UTC

I have no record of this being fixed in U6 (which is why it's still
in ASSIGNED state).

Comment 41 Rob 2005-08-26 20:57:45 UTC

Ernie, thanks for the followup. Do you know if this will be fixed soon? Does 
Red Hat need more data? For an enterprise OS to have a large kernel memory 
problem where swap space gets used heavily when plenty of cached buffer memory 
is available seems to be a serious issue, especially for those of us who invest 
lots of $$ into the OS and use it extensively on many production servers.

Watching loads rise to 1+ because of swap usage when a GB or more of memory is 
free is not the most comforting thing.

Rob

Comment 42 Ernie Petrides 2005-08-26 21:33:36 UTC

I don't know.

Larry, please follow up on Rob's questions.

Comment 43 Rik van Riel 2005-08-26 23:01:54 UTC

This bug is not a black and white issue, where it is fixed or not.

Performance is a continuum with many shades of grey, and I believe that the
situation has been improved spectacularly in U5 and U6.  However, having said
that I realise that in some situations things still do not behave quite as they
should.

I have a question for Rob, in response to comment #36: is there significant
swapin IO during the day, or is the data that was swapped out also in the swap
cache, and is there no swap IO happening?

If the data that's in swap is also resident in memory, there should not be a
performance degradation (and the problem is mostly cosmetic).  If there is a
performance degradation, we should analyze it and fix it.

Comment 44 Rob 2005-08-26 23:15:22 UTC

Snapshot of what is currently happening:
drum:~# free
             total       used       free     shared    buffers     cached
Mem:       1023612     998676      24936          0     130340     510708
-/+ buffers/cache:     357628     665984
Swap:      2096472      90936    2005536

drum:~# vmstat
procs                      memory      swap          io     system         cpu
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 0  0  90936  24936 130340 510716    1    2     7     2    3     3  3  1  5  8

drum:~# cat /proc/meminfo
        total:    used:    free:  shared: buffers:  cached:
Mem:  1048178688 1022152704 26025984        0 134098944 579231744
Swap: 2146787328 93118464 2053668864
MemTotal:      1023612 kB
MemFree:         25416 kB
MemShared:           0 kB
Buffers:        130956 kB
Cached:         502904 kB
SwapCached:      62752 kB
Active:         705200 kB
ActiveAnon:     245264 kB
ActiveCache:    459936 kB
Inact_dirty:    133676 kB
Inact_laundry:   21576 kB
Inact_clean:     18508 kB
Inact_target:   175792 kB
HighTotal:      129216 kB
HighFree:         2160 kB
LowTotal:       894396 kB
LowFree:         23256 kB
SwapTotal:     2096472 kB
SwapFree:      2005536 kB
CommitLimit:   2608276 kB
Committed_AS:   549848 kB
HugePages_Total:     0
HugePages_Free:      0
Hugepagesize:     2048 kB

The "si" and "so" blocks under vmstat never go down, only up. Even if I 
restart almost every single program, I can get "Swap used" down to 1 MB or 
less, but si and so still show blocks being swapped. If you need me to get 
more specific info, please let me know what commands to run, and I'll do so.

We'll notice sometimes that the load average will sit around 1 to 1.5, with 
nothing running really, and stay that way for a few hours, and then drop back 
to zero. The system doesn't seem sluggish during these times, but it's hard to 
monitor actual system performance when you're not sure what is driving the 
load, or eating up swap.

Rob

Comment 45 LDB 2005-08-26 23:27:19 UTC

Hey Rob, we have this problem as well, and I was wondering could I trouble you
to post /proc/slabinfo as well? I am just curious from a comparision perspective.

Thank you,

LDB

Comment 46 Rob 2005-08-26 23:32:49 UTC

as requested:
drum:~# cat /proc/slabinfo
slabinfo - version: 1.1 (SMP)
kmem_cache            96     96    244    6    6    1 : 1008  252
ip_conntrack        2055   2890    384  287  289    1 :  496  124
ip_fib_hash           95    224     32    2    2    1 : 1008  252
ext3_xattr             0      0     44    0    0    1 : 1008  252
journal_head        1839   9548     48   52  124    1 : 1008  252
revoke_table           6    500     12    2    2    1 : 1008  252
revoke_record        448    448     32    4    4    1 : 1008  252
clip_arp_cache         0      0    256    0    0    1 : 1008  252
ip_mrt_cache           0      0    128    0    0    1 : 1008  252
tcp_tw_bucket       1115   1500    128   38   50    1 : 1008  252
tcp_bind_bucket      846   1008     32    8    9    1 : 1008  252
tcp_open_request    1110   1110    128   37   37    1 : 1008  252
inet_peer_cache      116    116     64    2    2    1 : 1008  252
secpath_cache          0      0    128    0    0    1 : 1008  252
xfrm_dst_cache         0      0    256    0    0    1 : 1008  252
ip_dst_cache        1682   2325    256  117  155    1 : 1008  252
arp_cache            255    360    256   17   24    1 : 1008  252
flow_cache             0      0    128    0    0    1 : 1008  252
blkdev_requests     4096   4110    128  137  137    1 : 1008  252
kioctx                 0      0    128    0    0    1 : 1008  252
kiocb                  0      0    128    0    0    1 : 1008  252
dnotify_cache          0      0     20    0    0    1 : 1008  252
file_lock_cache      360    360     96    9    9    1 : 1008  252
async_poll_table       0      0    140    0    0    1 : 1008  252
fasync_cache           0      0     16    0    0    1 : 1008  252
uid_cache            260   1120     32    4   10    1 : 1008  252
skbuff_head_cache   1203   3634    168   64  158    1 : 1008  252
sock                 458    718   1408  229  359    1 :  240   60
sigqueue            1008   1015    132   35   35    1 : 1008  252
kiobuf                 0      0    128    0    0    1 : 1008  252
cdev_cache            13    116     64    2    2    1 : 1008  252
bdev_cache           116    116     64    2    2    1 : 1008  252
mnt_cache             18    116     64    2    2    1 : 1008  252
inode_cache        73128  88592    512 12592 12656    1 :  496  124
dentry_cache       76813 152400    128 5072 5080    1 : 1008  252
dquot                750    750    128   25   25    1 : 1008  252
filp                7652   7680    128  256  256    1 : 1008  252
names_cache          120    136   4096  120  136    1 :  240   60
buffer_head        83231 108395    108 2928 3097    1 : 1008  252
mm_struct            585    640    384   59   64    1 :  496  124
vm_area_struct      6855  21672     68  205  387    1 : 1008  252
fs_cache             845   1102     64   16   19    1 : 1008  252
files_cache          586    616    512   84   88    1 :  496  124
signal_cache         621   1160     64   19   20    1 : 1008  252
sighand_cache        341    422   1408  171  211    1 :  240   60
pte_chain           7636  19470    128  437  649    1 : 1008  252
pae_pgd              845   1160     64   15   20    1 : 1008  252
size-131072(DMA)       0      0 131072    0    0   32 :    0    0
size-131072            0      0 131072    0    0   32 :    0    0
size-65536(DMA)        0      0  65536    0    0   16 :    0    0
size-65536             0      0  65536    0    0   16 :    0    0
size-32768(DMA)        0      0  32768    0    0    8 :    0    0
size-32768             0      1  32768    0    1    8 :    0    0
size-16384(DMA)        0      0  16384    0    0    4 :    0    0
size-16384            21     44  16384   21   44    4 :    0    0
size-8192(DMA)         0      0   8192    0    0    2 :    0    0
size-8192              4     86   8192    4   86    2 :    0    0
size-4096(DMA)         0      0   4096    0    0    1 :  240   60
size-4096            396    456   4096  396  456    1 :  240   60
size-2048(DMA)         0      0   2048    0    0    1 :  240   60
size-2048            349    616   2048  196  308    1 :  240   60
size-1024(DMA)         0      0   1024    0    0    1 :  496  124
size-1024            554    748   1024  158  187    1 :  496  124
size-512(DMA)          0      0    512    0    0    1 :  496  124
size-512             688    688    512   86   86    1 :  496  124
size-256(DMA)          0      0    256    0    0    1 : 1008  252
size-256            1065   1065    256   71   71    1 : 1008  252
size-128(DMA)          0      0    128    0    0    1 : 1008  252
size-128            3098   4140    128  133  138    1 : 1008  252
size-64(DMA)           0      0    128    0    0    1 : 1008  252
size-64             6540  12990    128  256  433    1 : 1008  252
size-32(DMA)           0      0     64    0    0    1 : 1008  252
size-32             4634  12006     64  194  207    1 : 1008  252

Comment 47 Rik van Riel 2005-08-26 23:36:35 UTC

You may want to check out "vmstat 5".  The first line of vmstat is simply the
average number of swapins/outs a second over the system lifetime, and does not
reflect the current state of the system.

Comment 48 Rob 2005-08-26 23:48:21 UTC

Rik, thanks for the "new" command I learned today. Here is the output:
drum:~# vmstat 5
procs                      memory      swap          io     system         cpu
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 0  0  90936  30744 131852 503244    1    2     7     2    3     3  3  1  5  8
 0  0  90936  30724 131860 503244    0    0     0    50  142    36  0  0 98  2
 0  0  90936  32068 131880 503252    0    0     0   174  149  1265  7  2 88  2
 0  1  90936  32560 131908 503320    0    0    13   138  188    88  0  0 97  3
 0  0  90936  32600 131920 503420    0    0    10   214  322   159  1  0 95  4
 0  0  90936  32068 131932 503572    0    0    28   108  227    73  0  0 94  6
 0  0  90936  31508 131952 503600    0    0     5   247  266   135  1  1 92  7
 1  0  90936  31708 131964 503604    0    0     2    31  157    41  0  0 99  1
 0  0  90936  31052 131980 503644    0    0     6   194  235   129  2  1 93  5
 0  0  90936  31048 132004 503816    0    0    34    88  230    71  0  0 92  7
---

So it appears although free lists 90 MB of swap and meminfo showing swap 
cached, actual swap-in/out are zero...showing a "cosmetic" memory usage it 
seems (I think). I'll try and capture data when the load stays sustained 
around 1.

Comment 49 Rob 2005-08-26 23:55:14 UTC

Just an addendum to my last post. I watched "vmstat 2" over the last few 
minutes, and I see si and so spikes of 2 to 10 blocks every 15-20 cycles, 
which I assume is some swapping occurring. Nothing destroying the server, but 
with 600+ MB of free memory, no swap should be utilized I would think...

Rob

Comment 50 Rik van Riel 2005-08-27 00:22:18 UTC

I suspect these are coming from the highmem zone.  We can check to see if we
happen to have any patches in the current RHEL3 tree that might disturb the
balancing of allocations between the highmem zone and the low memory zones.

On the other hand, one small spike every 15-20 seconds should not hurt
performance, when you have several hundred kB/second in filesystem IO. It
means that far less than 1% of your disk IO is swap IO.

Earlier RHEL3 kernels used to have actual performance problems, because of
which the severity of this bug was "high".  If nobody has actual performance
problems any more, I suspect we can drop the severity of this bug to "normal",
or even "low".

Comment 51 LDB 2005-08-27 01:07:35 UTC

Rik:

When this happens to our servers it is quite impactful. I do not think that
dropping the severity level on an enterprise distribution is wise at this
juncture. This is only because we have potential financial impacts if this
happens on our servers. Moreover, I would like it solved because some of
our customers using RHEL 3 MIGHT not have the luxury of upgrading to RHEL
4 easily.

The 2.6 kernel introduces something called "swappiness". I know RH backports
many 2.6 functionalities in its RHEL distributions. Is there a similar
parameter to vm.swappiness in RHEL 3 that can be user adjusted?

Comment 52 LDB 2005-08-27 01:08:21 UTC

Rik:

When this happens to our servers it is quite impactful. I do not think that
dropping the severity level on an enterprise distribution is wise at this
juncture. This is only because we have potential financial impacts if this
happens on our servers. Moreover, I would like it solved because some of
our customers using RHEL 3 MIGHT not have the luxury of upgrading to RHEL
4 easily.

The 2.6 kernel introduces something called "swappiness". I know RH backports
many 2.6 functionalities in its RHEL distributions. Is there a similar
parameter to vm.swappiness in RHEL 3 that can be user adjusted?


Thank you,

Lawrence Bowie

Comment 53 Rob 2005-08-27 01:12:29 UTC

Lawrence,

I'll second your opinion on this being severe, as the data I posted earlier 
was at a load of 0.05 or so. We have had this swap issue drive load to over 1 
sustained over a few hours on a system that is very lightly loaded at all 
times, so it can impact performance. If I can catch it doing this again, I'll 
post more stats during an "impact event".

Rob

Comment 54 Rik van Riel 2005-08-27 04:18:36 UTC

RHEL3 has the /proc/sys/vm/pagecache.  You can reduce the percentage to which
the page cache (within a memory zone) needs to be reduced before swapping can
start to 15% with the following command:

# echo "2 10 15" > /proc/sys/vm/pagecache

This will also set the cache memory target to 10%, and the minimum to 2%.  This
may reduce, or even eliminate, swapping on systems with a very cache heavy workload.

Rob, stats during an "impact event" could be very helpful in tracking down the
problem, as well as establishing its severity (to determine the magnitude by
which things might need to be adjusted).

Comment 55 Rob 2005-08-28 13:23:35 UTC

ok, another rhel 3 server, nothing running out of the ordinary, only 
connection is ssh by me, load sitting around 1 for no reason, lots of swap 
space taken. Details below:

w
 09:19:44  up 3 days, 16:18,  1 user,  load average: 1.09, 0.75, 0.56
USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU  WHAT

 09:22:34  up 3 days, 16:21,  1 user,  load average: 0.64, 0.76, 0.59
84 processes: 82 sleeping, 2 running, 0 zombie, 0 stopped
CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle
           total    0.5%    0.0%    1.5%   0.0%     0.0%    0.5%   97.5%
Mem:   510160k av,  281284k used,  228876k free,       0k shrd,   60200k buff
                    236268k actv,   18372k in_d,     368k in_c
Swap: 1052216k av,  155332k used,  896884k free                   64704k cached

vmstat 1
procs                      memory      swap          io     system         cpu
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 0  0 155332 230488  58820  62976    2    3    41    77   28    58  5  1 92  1
 0  0 155332 230488  58820  62976    0    0     0     0  113    23  1  0 99  0
 0  0 155332 230488  58844  62976    0    0     0   132  129    52  0  0 97  3
 0  0 155332 230488  58844  62976    0    0     0     0  121    22  0  0 100  0
 0  0 155332 230488  58844  62980    0    0     0     0  304    86  6  0 94  0
 0  0 155332 230480  58844  62980    0    0     0     0  151    59  2  0 98  0
 0  0 155332 230360  58844  62980    0    0     0     0  154    73  2  1 97  0
 0  0 155332 230352  58860  62980    0    0     0   340  170    68  0  0 99  1
 0  0 155332 230352  58860  62988    0    0     0     0  139    51  0  0 100  0
 1  0 155332 225196  58860  62988    0    0     0     0  147    56 37  5 58  0
 0  0 155332 230368  58860  63064    0    0     0     0  212    68 34  1 65  0
 0  0 155332 230364  58860  63064    0    0     0     0  137    46  2  0 98  0
 0  0 155332 230364  58880  63068    0    0     0   512  149    65  0  0 94  6
 0  0 155332 230364  58880  63068    0    0     0     0  130    31  0  0 100  0
 0  0 155332 230372  58880  63068    0    0     0     0  118    30  0  0 100  0
 0  0 155332 230372  58880  63108    0    0    40     0  146    28  0  0 99  1
 0  0 155332 230372  58880  63108    0    0     0     0  127    21  0  0 100  0
 0  0 155332 230372  58884  63112    0    0     0   224  150    38  0  0 86 14
 0  0 155332 230372  58884  63112    0    0     0     0  115    27  2  0 98  0
 0  0 155332 230372  58884  63112    0    0     0     0  132    30  0  0 100  0
 0  0 155332 230372  58884  63112    0    0     0     0  114    20  0  0 100  0
 0  0 155332 230372  58884  63112    0    0     0     0  115    19  0  0 100  0
 0  0 155332 230372  58904  63112    0    0     0   248  238    56  0  1 96  3
 0  0 155332 230372  58904  63112    0    0     0     0  126    25  2  0 98  0
 0  0 155332 230372  58904  63116    0    0     0     0  111    21  0  0 100  0
 0  0 155332 230372  58904  63116    0    0     0     0  180    52  1  0 99  0
 0  0 155332 230372  58904  63120    0    0     0     0  246    95  0  0 100  0
 0  0 155332 230372  58932  63124    0    0     0   328  212    88  6  1 79 14
 0  0 155332 230372  58932  63124    0    0     0     0  131    34  0  0 100  0
 1  0 155332 228232  58932  63124    0    0     0     0  118    27 16  0 84  0
 0  0 155332 230244  58932  63132    0    0     0     0  277   109 15  2 83  0
 0  0 155332 230244  58932  63132    0    0     0     0  115    17  0  0 100  0

free
             total       used       free     shared    buffers     cached
Mem:        510160     279892     230268          0      58964      63132
-/+ buffers/cache:     157796     352364
Swap:      1052216     155332     896884

cat /proc/meminfo
        total:    used:    free:  shared: buffers:  cached:
Mem:  522403840 286609408 235794432        0 60424192 183779328
Swap: 1077469184 159059968 918409216
MemTotal:       510160 kB
MemFree:        230268 kB
MemShared:           0 kB
Buffers:         59008 kB
Cached:          63216 kB
SwapCached:     116256 kB
Active:         234208 kB
ActiveAnon:     136348 kB
ActiveCache:     97860 kB
Inact_dirty:     18084 kB
Inact_laundry:    5732 kB
Inact_clean:       368 kB
Inact_target:    51676 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       510160 kB
LowFree:        230268 kB
SwapTotal:     1052216 kB
SwapFree:       896884 kB
CommitLimit:   1307296 kB
Committed_AS:   400036 kB
HugePages_Total:     0
HugePages_Free:      0
Hugepagesize:     4096 kB

cat /proc/slabinfo 
slabinfo - version: 1.1
kmem_cache            72    105    108    3    3    1
ip_conntrack        1622   1790    384  167  179    1
ip_fib_hash           15    112     32    1    1    1
urb_priv               0      0     64    0    0    1
ext3_xattr             0      0     44    0    0    1
journal_head        1084   1771     48   17   23    1
revoke_table           7    250     12    1    1    1
revoke_record          0    112     32    0    1    1
clip_arp_cache         0      0    256    0    0    1
ip_mrt_cache           0      0    128    0    0    1
tcp_tw_bucket          5     90    128    1    3    1
tcp_bind_bucket       16    112     32    1    1    1
tcp_open_request       0     90    128    0    3    1
inet_peer_cache        4     58     64    1    1    1
secpath_cache          0      0    128    0    0    1
xfrm_dst_cache         0      0    256    0    0    1
ip_dst_cache         102    225    256    7   15    1
arp_cache              3     15    256    1    1    1
flow_cache             0      0    128    0    0    1
blkdev_requests     2976   3000    128  100  100    1
kioctx                 0      0    128    0    0    1
kiocb                  0      0    128    0    0    1
dnotify_cache          0      0     20    0    0    1
file_lock_cache        2     41     92    1    1    1
async_poll_table       0      0    140    0    0    1
fasync_cache           0      0     16    0    0    1
uid_cache              7    112     32    1    1    1
skbuff_head_cache    191    276    168   12   12    1
sock                  52    138   1408   28   69    1
sigqueue               0     29    132    0    1    1
kiobuf                 0      0    128    0    0    1
cdev_cache            14    116     64    2    2    1
bdev_cache             9     58     64    1    1    1
mnt_cache             19     58     64    1    1    1
inode_cache         2930   2940    512  419  420    1
dentry_cache        5759   5790    128  193  193    1
dquot                 30     90    128    3    3    1
filp                 791    810    128   27   27    1
names_cache            0      5   4096    0    5    1
buffer_head        47442  47484    104 1319 1319    1
mm_struct             57    120    384    9   12    1
vm_area_struct      3058  11368     68   61  203    1
fs_cache              56    174     64    2    3    1
files_cache           57    119    512   10   17    1
signal_cache          80    174     64    2    3    1
sighand_cache         69    132   1408   37   66    1
pte_chain           7019  10380    128  235  346    1
size-131072(DMA)       0      0 131072    0    0   32
size-131072            0      0 131072    0    0   32
size-65536(DMA)        0      0  65536    0    0   16
size-65536             0      0  65536    0    0   16
size-32768(DMA)        0      0  32768    0    0    8
size-32768             0      1  32768    0    1    8
size-16384(DMA)        0      0  16384    0    0    4
size-16384            21     22  16384   21   22    4
size-8192(DMA)         0      0   8192    0    0    2
size-8192              4     13   8192    4   13    2
size-4096(DMA)         0      0   4096    0    0    1
size-4096             41     53   4096   41   53    1
size-2048(DMA)         0      0   2048    0    0    1
size-2048             84    172   2048   44   86    1
size-1024(DMA)         0      0   1024    0    0    1
size-1024             64    112   1024   16   28    1
size-512(DMA)          0      0    512    0    0    1
size-512              77     88    512   10   11    1
size-256(DMA)          0      0    256    0    0    1
size-256              54     75    256    4    5    1
size-128(DMA)          4     30    128    1    1    1
size-128             814    900    128   30   30    1
size-64(DMA)           0      0    128    0    0    1
size-64              230    300    128   10   10    1
size-32(DMA)          68    116     64    2    2    1
size-32              462    638     64   11   11    1

Thoughts?

Comment 56 Rik van Riel 2005-08-28 13:38:24 UTC

- no processes are running or waiting on IO
- the CPU is over 90% idle
- there is no swapin or swapout IO
- of the 150MB that got swapped out, 110MB is also resident in memory (see the
SwapCached line in /proc/meminfo)
- the other 40MB that got swapped out is probably not needed, otherwise it would
have been swapped in
- almost half of memory is free

I wonder if this means your server periodically runs into a load spike.  Say,
the server program that runs on this system (a mail server?) periodically
processing a whole bunch of email at once in a lot of processes and driving the
load average up for a small period of time.

It would be useful to capture that exact time and not the aftermath - the system
appears mostly idle in comment #55

Comment 57 Rob 2005-08-28 14:14:19 UTC

Rik,

That's my point though. One hour later and the load is still around 1 with no 
load and everything idle. It's like the "cosmetic" swap being used is 
artifically driving the load:

top d 2
 10:10:48  up 3 days, 17:10,  1 user,  load average: 1.36, 1.38, 1.16
81 processes: 80 sleeping, 1 running, 0 zombie, 0 stopped
CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle
           total    1.0%    0.0%    0.0%   0.0%     0.0%    0.0%   99.0%
Mem:   510160k av,  318248k used,  191912k free,       0k shrd,   76316k buff
                    267868k actv,   17716k in_d,     368k in_c
Swap: 1052216k av,  155324k used,  896892k free                   82908k cached
---

Nothing is really running, no mail spikes, no web spikes, only 1 low traffic 
account on this server, and the load has hovered around 1 for 3+ hours now. Is 
there anything else you want me to capture right now? This is not a temporary 
load spike (We manage around 100 redhat servers, so we know what load spikes 
look like), but a seemingly "fake" load elevation for hours with the only 
thing being a high level of swap memory being shown, whether it's being 
actively used or not. Server still seems quite responsive, but load averages 
stay around 1. Only our RHEL 3.x with 2.4.x kernels exhibit this behavior.

Rob

Comment 58 Rik van Riel 2005-08-28 14:22:35 UTC

If you press the "i" key in top, are there still no running or blocked processes
showing up?

Comment 59 Rob 2005-08-28 14:32:27 UTC

Non idle processes show:
  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
 1820 root      15   0  1892 1836  1492 R     0.0  0.3   0:00   0 sshd
 9862 root      15   0  1012 1012   784 R     0.0  0.1   0:00   0 top
---

So, doesn't appear to be anything driving the load. I 
checked /var/log/messages, maillog, etc... and nothing that should cause any 
load at all, lsof, port checks, etc...

Now load mysteriously drops off after 3 hours of being around 1, and nothing 
has changed memory or program wise:
 10:27:08  up 3 days, 17:26,  1 user,  load average: 0.01, 0.26, 0.68

Non idle processes still shows same 2 ones, top and ssh.

        total:    used:    free:  shared: buffers:  cached:
Mem:  522403840 336461824 185942016        0 82403328 210784256
Swap: 1077469184 159010816 918458368
MemTotal:       510160 kB
MemFree:        181584 kB
MemShared:           0 kB
Buffers:         80472 kB
Cached:          89148 kB
SwapCached:     116696 kB
Active:         277120 kB
ActiveAnon:     136220 kB
ActiveCache:    140900 kB
Inact_dirty:     18620 kB
Inact_laundry:    9552 kB
Inact_clean:       368 kB
Inact_target:    61132 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       510160 kB
LowFree:        181584 kB
SwapTotal:     1052216 kB
SwapFree:       896932 kB
CommitLimit:   1307296 kB
Committed_AS:   399588 kB
HugePages_Total:     0
HugePages_Free:      0
Hugepagesize:     4096 kB

There has to be something in the memory management of the 2.4.x kernel that 
causes this condition, although it seems to be quite hard to pin down... 
hopefully others can provide more data to back up this condition. I'll try and 
capture more data when it happens again on any of our RHEL 3 servers.

Comment 60 Rik van Riel 2005-08-28 15:07:30 UTC

Since there are no processes waiting on memory, or in blocked state waiting for
anything else, I suspect this is something else.

Do you by any chance have mysql or another threaded program running on this
server?  Maybe there is a thread waiting on a lock (in D state - but not visible
by default in top)?

Comment 61 Rob 2005-08-28 15:30:09 UTC

Rik,

We have many things running on this server such as MySQL, Apache, 
spamassassin, clamav, etc... Process output of MySQL showed no running 
processes previously. Also, we see this beahvior on almost all of our RHEL 3 
servers, large amounts of swap not being released, load hovering around 1 with 
nothing running for a few hours at a time randomly. None of our RHEL4 servers 
or older RH 6,7 servers exhibit this behavior.

I am by no means an expert, nor do I understand why this happens, but we have 
been running Red Hat Linux servers since 1998 and this RHEL 3 release is the 
only one that does unexplained things with swap and load out of 100+ servers. 
I can only tell you what we see and provide data as requested, as we are more 
of an "end user" of Red Hat rather than developers / programmers.

Rob

Comment 62 Larry Woodman 2005-08-29 14:50:17 UTC

Rob, sorry but I dont see what the actual problem is here.  There appears to be
a couple of systems that swapped at one time when you dont think they should
have.  Is this correct?  Can you catch the system swapping when you dont think
it should be?  I dont see any swap activity in any of the vmstat or top outputs
you have attached, although there is evidence that the system did swap earlier.
 Sorry I didnt see the activity on this bug late last week, I was out.

Thanks, Larry Woodman

Comment 63 Rob 2005-08-29 15:11:58 UTC

Larry,

Almost all of our RHEL 3 servers seem to hold large amounts of swap space, even 
machines with very little on them and 2 GB of RAM (with most of it free in 
cached buffers). I know this has been said to be "cosmetic only", but many of 
our RHEL 3 servers will experience a few hours of the load sitting around 1, 
with no high usage processes, no mysql processes, nothing that should drive the 
load except for a large swap space being used/reported by "free". Then, the 
load suddenly drops although nothing has changed on the server.

It seems that the RHEL 3 kernel uses swap instead of free memory for programs 
such as spmassassin, or running an slocate update, etc... Whether it sustains 
this swap usage or it is just reflected in the "free" command is tough to tell, 
but as mentioned above, we'll see load issues on only our rhel 3 servers that 
cannot be explained except for the swap space being "used". RHEL 4 or older RH 
6.x servers do not exhibit this behavior.

Rob

Comment 64 rodrigo 2005-08-29 15:16:37 UTC

Hi Larry, we are having the same issue.  We have RH3 U5, with oracle 9irac. 
We have many people from Oracle Rac Team involved on this issue.

Comment 65 Larry Woodman 2005-09-01 18:46:42 UTC

I was just talking to the Oracle folks about this issue.  There appears to be
some confusion about what is going on here.  Can someone please reproduce this
problem and *while* the system is swapping rather than reclaiming pagecache
memory get me the following:

1.) "vmstat 1" output

2.) AltSysrq-M output

3.) AltSysrq-P and W outputs

4.) /proc/cpuinfo output

5.) /proc/meminfo output

6.) "top 1" output

7.) a "ps aux" output.


Thanks, Larry Woodman

I would do all of this myself but I acn not reproduce this problem you are
describing internally here at Red Hat.

Comment 66 Rob 2005-09-21 04:31:44 UTC

Larry, is there an easy way to get AltSysrq data remotely? We are not local to 
the server, but believe we can reproduce the swapping running a disk intensive 
script.

Rob

Comment 67 Rob 2005-09-21 04:44:50 UTC

Created attachment 119064 [details]
info requested during swapping

Not the best example, but it shows some blocks swapping in and out during cgi
script writing a few thousand files to the server, with 800+ MB RAM free. I can
try and get better data with our backup script running...

Comment 68 Saar 2005-09-24 00:46:33 UTC

to remotely generate an alt-sysreq-M as root:
# echo m > /proc/sysrq-trigger

if the system is not dead, if it's dead then just a serial cable can help.
so.. alt-sysrq-XXX just do:
# echo XXX > /proc/sysrq-trigger

Comment 70 RHEL Program Management 2007-10-19 19:13:15 UTC

This bug is filed against RHEL 3, which is in maintenance phase.
During the maintenance phase, only security errata and select mission
critical bug fixes will be released for enterprise products. Since
this bug does not meet that criteria, it is now being closed.
 
For more information of the RHEL errata support policy, please visit:
http://www.redhat.com/security/updates/errata/
 
If you feel this bug is indeed mission critical, please contact your
support representative. You may be asked to provide detailed
information on how this bug is affecting you.

Note You need to log in before you can comment on or make changes to this bug.