Bug 141150

Summary: Spamd has huge memory consumption
Product: [Fedora] Fedora
Component: spamassassin
Version: 3
Hardware: i386
OS: Linux
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: medium
Reporter: Steve Bergman <sbergman>
Assignee: Chip Turner <cturner>
CC: felicity, jm, parkerm, reg+redhat, sbergman, wtogami
Doc Type: Bug Fix
Last Closed: 2004-11-29 21:23:32 UTC

Description Steve Bergman 2004-11-29 17:28:38 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5)
Gecko/20041111 Firefox/1.0

Description of problem:
I upgraded a 256M mail server from FC1 to FC3.  SpamAssassin was
working fine before, but now I see it causing swap storms.  There are
several spamd processes running, each taking up about 80M virtual and
50M-60M resident, with only 6M shared.  Swap is at 285M of 512M.  (This
is a very low-volume mail server that runs nothing else, with no GUI, so
that's a lot of memory.)  The machine is barely usable enough to bring
up vmstat, which shows about 750k/sec being swapped in and swapped
out.

SpamAssassin, and all other packages, are up to date.

May be related to:

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=139491

Version-Release number of selected component (if applicable):
spamassassin-3.0.0-3

How reproducible:
Sometimes

Steps to Reproduce:
1. service spamassassin start
2. Wait several hours for symptoms to occur.

    

Actual Results:  Swap storms.

Expected Results:  No swap storms.

Additional info:

Comment 1 Warren Togami 2004-11-29 21:23:32 UTC
Already pushed 3.0.1 to FC3 update.

Comment 2 Steve Bergman 2004-11-29 22:23:24 UTC
I updated to 3.0.1 and restarted spamassassin.  Within minutes the
server was on its knees with 255M of swap used.  I can't even ssh into
it now, until the OOM killer decides to act.

Comment 3 Warren Togami 2004-11-30 00:47:04 UTC
Ok, then this is an upstream issue.  I am not able to reproduce this
kind of behavior myself.  Go to spamassassin.org's bugzilla.

Comment 4 Steve Bergman 2004-11-30 01:18:13 UTC
OK.  I'll report it upstream.  Were you testing on a ~256M system,
and was there real mail coming through?  I think it may take real
(spam) activity to trigger the problem.

A workaround seems to be to add "-m1" or "-m1 --max-conn-per-child=1"
options in /etc/sysconfig/spamassassin to allow only one child
process, and optionally to kill the child and refork after every
connection.  It then behaves very nicely.
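
The workaround described above might look like the following in /etc/sysconfig/spamassassin. This is only a sketch: it assumes the init script takes spamd's flags from a SPAMDOPTIONS variable and that the packaged default is something like "-d -c -m5 -H". Neither detail is stated in this report, so check the existing file before editing it.

```shell
# /etc/sysconfig/spamassassin (sketch; variable name and other flags are assumptions)
# -m1                      allow only one spamd child process
# --max-conn-per-child=1   kill the child and refork after every connection
SPAMDOPTIONS="-d -c -m1 --max-conn-per-child=1 -H"
```

spamd has to be restarted (for example with "service spamassassin restart") before the new options take effect.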





Comment 5 Warren Togami 2004-11-30 01:20:58 UTC
1GB RAM, 40-user system, roughly 2 incoming mails per second processed
by spamd.

Comment 6 Sidney Markowitz 2004-11-30 01:57:14 UTC
Could you please look at this upstream bug to see if it appears to be
the same thing:

http://bugzilla.spamassassin.org/show_bug.cgi?id=3981

And also look at the patch that is attached to

http://bugzilla.spamassassin.org/show_bug.cgi?id=3983

to see if it helps?



Comment 7 Steve Bergman 2004-11-30 02:55:29 UTC
The output of "top" when the system is fine and not using any swap at
all, and when it is swapping like hell and unusable, is pretty much
the same.  6 processes (1 parent and 5 children) are all showing about
80M/process virt, 66M/process res. (Large, but not *that* large...)
When the system starts swapping excessively, the numbers don't change
much at all.

This does sound like the same problem.





Comment 8 Warren Togami 2004-11-30 03:00:18 UTC
What kernel?

Comment 9 Steve Bergman 2004-11-30 03:03:07 UTC
kernel-2.6.9-1.681_FC3

Comment 10 Sidney Markowitz 2004-11-30 03:07:10 UTC
> This does sound like the same problem

Yes, best to take the discussion over there, then, and do try out the
patch I mentioned.


Comment 11 Justin Mason 2004-11-30 23:44:47 UTC
yep, please do; the patch on bug 3983 is almost definitely what you need.

Comment 12 Steve Bergman 2004-12-01 00:12:35 UTC
Yes, I'll be trying that out, although I am realizing that on a low
volume/low memory server like this one it is really most efficient
just to set --max-children=1 and --max-conn-per-child=1.  As the
minimum system requirements for FC3 are about what I have, and spamd
seems to be such a memory hog even when it is working "right", perhaps
the init script should tune the children based on system memory at
runtime?
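
A minimal sketch of what such runtime tuning could look like. Everything here is hypothetical: the helper name, the choice to give the children at most half of RAM, the 66M per-child figure (taken from the numbers in this report), the cap of five children, and the SPAMDOPTIONS variable are all assumptions, not anything the packaged init script does.

```shell
# Hypothetical helper, not part of the spamassassin package.
# Prints a spamd child count for a machine with $1 kB of RAM.
children_for_mem_kb() {
    mem_kb=${1:-0}
    per_child_kb=${2:-67584}   # assumed footprint: ~66M resident per child, as seen in this report
    max_children=5             # assumed cap: the five children seen in this report
    # Assumption: let the spamd children use at most half of physical RAM.
    n=$(( mem_kb / 2 / per_child_kb ))
    if [ "$n" -lt 1 ]; then n=1; fi
    if [ "$n" -gt "$max_children" ]; then n=$max_children; fi
    echo "$n"
}

# Example use inside an init script (Linux only; MemTotal is reported in kB).
if [ -r /proc/meminfo ]; then
    total_kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
    SPAMDOPTIONS="-d -c -m$(children_for_mem_kb "$total_kb") -H"
    echo "$SPAMDOPTIONS"
fi
```

With these assumed numbers a 256M machine gets one child and a 1GB machine gets the full five. The per-child figure would need revisiting if rulesets change the footprint.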

One thing that puzzles me, though, is that spamassassin 2.x seemed to
use about 20M virt memory/process.  The thread on the upstream bug
mentions 20m virt for 3.0.  What I am seeing is 80M+ virt and 66M+
res.  If I don't set --max-conn-per-child=1, the numbers for the child
start out there and then climb with each successive connection.  My
other FC3 machine, which was a fresh install with 1GB RAM, shows
similar (80m) numbers.  Why do my particular installations have so
much more memory mapped? 

Comment 13 Justin Mason 2004-12-01 00:44:30 UTC
are you using third-party rulesets?  some of those greatly increase
memory consumption in the children, it seems.  try without *any* third
party rulesets and see what the RAM usage is like.

in 3.0.x, we added preforking -- which (as discussed in the upstream
bug) means that large memory consumption of each child becomes a big
deal.  in 2.6x it wasn't so serious, but with 3.0.x, you now have N
children *always* running -- so thrashing starts a lot earlier.

btw also note the upstream comments about "top" output being incorrect
regarding how much of the memory is shared.
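
If top's shared column is not trustworthy here, one alternative is to read the kernel's proportional set size (Pss), which divides each shared page among the processes that map it. The sketch below assumes a kernel that reports Pss lines in /proc/<pid>/smaps, which is a later kernel than the 2.6.9 one in this report, and it assumes the daemon's process name is exactly "spamd". The function name is made up for illustration.

```shell
# Sum the "Pss:" lines of an smaps file and print the total in kB.
pss_kb() {
    awk '/^Pss:/ { total += $2 } END { print total + 0 }' "$1"
}

# Print the Pss of every process named spamd (usually needs root).
for pid in $(pgrep -x spamd 2>/dev/null); do
    if [ -r "/proc/$pid/smaps" ]; then
        echo "$pid $(pss_kb "/proc/$pid/smaps") kB"
    fi
done
```

Adding up the per-process Pss values gives a rough estimate of what the whole set of spamd processes actually costs in RAM, without counting shared pages once per child.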

Comment 14 A. Folger 2004-12-26 20:58:03 UTC
I seem to have the same problem at the office. I just tried the same
setup at home, and the problem doesn't exist here. Since one of the
most significant differences between the two setups is the spam
filter, which only runs at the office, your suggestion makes sense. I
will continue checking this tomorrow at the office. In the meantime,
I take issue with your suggestion to avoid third-party rulesets.

I use a third party ruleset from ... the kmail binary shipped with
fc3. Could that be the culprit? If so, could you look into the
kmail-generated rulesets?