From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.8) Gecko/20050524 Fedora/1.0.4-4 Firefox/1.0.4

Description of problem:
When I try to 'restart' spamassassin (SA) it fails with:

service spamassassin restart
Shutting down spamd: [ OK ]
Starting spamd: Could not create INET socket on 127.0.0.1:783: Address already in use (IO::Socket::INET: Address already in use)
[FAILED]

The problem seems to be that whilst a 'restart' does a 'stop' and then a 'start', the 'stop' calls 'killproc' to find the SA pids. 'killproc' is a function in /etc/init.d/functions. It first looks for a pid file, '/var/run/spamd.pid', but this does not exist. So it then calls the 'pidof' command. This returns the pids of the spamd processes, but it seems to return the child pids first and the parent pid last. As such, while the children are being killed off, the parent sees this and starts a replacement child process.

Testing by running 'ps auxww|grep -i spamd' during the restart process shows that a new spamd child has appeared before the 'start' executes. When the 'start' does execute, it fails because that child process still holds the socket.

I have modified /etc/init.d/functions to reverse the order of pids returned by 'pidof'. As such the parent gets killed off first. Testing this, it works fine every time now.

In /etc/init.d/functions I modified at line 185:
===========================================================
if [ -z "$pid" ]; then
    pid=`pidof -c -o $$ -o $PPID -o %PPID -x $1 || \
         pidof -c -o $$ -o $PPID -o %PPID -x $base`
    # JH - fix for SA 'restart'
    pid=`echo $pid | tac -s ' '`
    pid=`echo -n $pid`
fi
===========================================================

I used 'tac' because it seemed the easiest way to reverse the list of pids.

Note: SA could write a pid file. But 'killproc' only looks in /var/run, and we start SA as a non-root user. As such that user cannot write a pid file into /var/run. So we must rely on pidof to get the pids.

We start SA with the '-m 15' option because we run busy mail servers. As such there are a lot of child processes, and this in itself probably contributes to the problem. However, the problem still exists if we reduce the number to 10, or even use the default of 5.

Our /etc/sysconfig/spamassassin file contains:

SPAMDOPTIONS="-d -x -m 15 -s daemon -u mail --max-conn-per-child=100"

John.

Version-Release number of selected component (if applicable):
initscripts-8.11.1-1

How reproducible:
Always

Steps to Reproduce:
1. Start spamassassin
2. Issue the command 'service spamassassin restart' on a busy server.
3.

Actual Results:
The error mentioned above appears - SA fails to restart.

Expected Results:
SA should have restarted with no errors.

Additional info:
tac is in /usr/bin - you can't use that in this context. I'd suspect if pid wraparound has happened since the parent started, this wouldn't do what you'd want.
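For reference, the list can be reversed without any external command, which avoids the /usr/bin dependency raised here. This is only a sketch: the function name reverse_words and the pids are made up for illustration, and it does nothing about the pid wraparound concern.

```shell
#!/bin/sh
# reverse_words: print the arguments in reverse order, space-separated,
# using only shell builtins (no tac or other external commands).
# Hypothetical helper, not part of /etc/init.d/functions.
reverse_words() {
    reversed=""
    for word in "$@"; do
        # Prepend each word, so the last argument ends up first.
        reversed="$word${reversed:+ $reversed}"
    done
    echo "$reversed"
}

# Made-up pids in the order described in the report: children first, parent last.
pid="2101 2102 2103 2100"
pid=`reverse_words $pid`
echo "$pid"    # prints: 2100 2103 2102 2101
```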
I admit 'tac' is not ideal; it was the only thing I could find that would reverse 'something' for me. My second thought was perhaps modifying 'pidof' to accept something like 'pidof -r', where the '-r' reverses the pid order. Likewise, I agree that pid wraparound would probably cause this to fail. I can, of course, get SA to write the parent pid out to a file, but then 'functions' would need to know where to find it - so, again, back to modifying 'functions'. Unfortunately SA itself has some comments/warnings about writing the pid out before changing to a non-root user. As such this option didn't seem practicable, despite being the most obvious.
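An approach that does not depend on pid ordering at all, and so is not affected by wraparound, would be to pick the process whose parent is not itself a spamd. The following is a rough sketch of that idea, not something from initscripts: find_parents is a hypothetical name, and the input is assumed to be "PID PPID" pairs, one per line.

```shell
#!/bin/sh
# find_parents: read "PID PPID" pairs on stdin and print every PID whose
# parent is not itself in the list, i.e. the top-level process(es).
# Hypothetical helper for illustration only.
find_parents() {
    pairs=""
    pids=" "
    while read -r p pp; do
        [ -n "$p" ] || continue        # skip blank lines
        pairs="$pairs $p:$pp"
        pids="$pids$p "
    done
    for pair in $pairs; do
        p=${pair%%:*}
        pp=${pair#*:}
        case "$pids" in
            *" $pp "*) ;;              # parent is in the list: this is a child
            *) echo "$p" ;;            # parent is outside the list: top-level
        esac
    done
}

# Made-up sample: 2100 is the parent (started by init), the rest are children.
find_parents <<EOF
2101 2100
2102 2100
2100 1
EOF
# prints: 2100
```

On a live system the pairs might come from something like 'ps -o pid= -o ppid= -C spamd', assuming the processes show up under the command name spamd; that has not been checked here.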
I have thought more about this and have to admit that my first idea was somewhat nonsense :-)

The problem in trying to create a generic solution is that SA can be run as any user the sysadmin wishes, and can write the pid file wherever he/she wants to. However, the ISC BIND 'named' process can similarly be run as a non-root user and write a pid file out.

To that extent I have scrapped the 'functions' changes mentioned initially. I have modified /etc/sysconfig/spamassassin and /etc/init.d/spamassassin to recognise the environment variable 'SPAMD_PID'. I created, in our case, /var/run/spamassassin and ran 'chown mail:mail /var/run/spamassassin' to let SA write the pid file into there.

The /etc/sysconfig/spamassassin file becomes:
===============================================================
# Options to spamd
# Set SPAMD_PID to the PID file path if the SpamAssassin '-r' option is used.
SPAMD_PID=/var/run/spamassassin/spamd.pid
#
#SPAMDOPTIONS="-d -c -m5 -H"
SPAMDOPTIONS="-d -x -m 15 -s daemon -u mail --max-conn-per-child=100 -r $SPAMD_PID"
===============================================================

The /etc/init.d/spamassassin file becomes (relevant bits):
===============================================================
  start)
        # Start daemon.
        echo -n "Starting spamd: "
        daemon $NICELEVEL spamd $SPAMDOPTIONS
        RETVAL=$?
        echo
        # [ $RETVAL = 0 ] && touch /var/lock/subsys/spamassassin
        if [ $RETVAL = 0 ]; then
            [ -n "$SPAMD_PID" ] && ln -s $SPAMD_PID /var/run/spamd.pid
            touch /var/lock/subsys/spamassassin
        fi
        ;;
  stop)
        # Stop daemons.
        echo -n "Shutting down spamd: "
        killproc spamd
        RETVAL=$?
        echo
        # [ $RETVAL = 0 ] && rm -f /var/lock/subsys/spamassassin
        if [ $RETVAL = 0 ]; then
            rm -f /var/lock/subsys/spamassassin
            rm -f /var/run/spamd.pid
        fi
        ;;
===============================================================

This is similar to how named is dealt with. As with named, SA 'start' creates a soft link at /var/run/spamd.pid, and when stopping or restarting this link is used (by killproc in /etc/init.d/functions) to find the parent pid.

Upon testing, SA restarts work fine.

John.
OK, assigning to spamassassin.
*** Bug 141323 has been marked as a duplicate of this bug. ***
%config(noreplace) %{_sysconfdir}/sysconfig/spamassassin

The specific implementation suggested in Comment #3 is not good because during package upgrades, if the /etc/sysconfig/spamassassin file had been previously modified it will not be replaced and your init.d script would fail.

Given this problem, hardcoding the pid path in the script may be the only supportable solution. Any objections?
See http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4655 for the solution that I am currently testing for future spamassassin packages.
http://people.redhat.com/wtogami/temp/spamassassin/fc3/
http://people.redhat.com/wtogami/temp/spamassassin/fc4/

Please help me to test this package that will soon go to FC3 and FC4 updates. It is essentially an upstream 3.0.5 release candidate.
Re comment 8: I have installed onto one of our FC4 mailhubs spamassassin-3.0.4-2.fc4 from the fedora-updates repo. This seems to work fine; restarting spamassassin repeatedly using 'service spamassassin restart' worked every time. Previously this would have failed pretty much immediately. Many thanks. John.
FWIW, killproc and other initscript functions now support -p to specify the PID file location, so the symlink hack should no longer be necessary.
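To illustrate why an explicit pid file removes the need for both the pidof ordering and the symlink, here is a minimal stand-alone sketch of a pid-file-based stop. It is not the real killproc: stop_by_pidfile is a hypothetical helper, and the path in the example is simply the one used in the sysconfig file from the earlier comment. With the real function the call would presumably be along the lines of 'killproc -p $SPAMD_PID spamd', but the exact syntax should be checked against /etc/init.d/functions.

```shell
#!/bin/sh
# stop_by_pidfile: hypothetical helper, NOT the real killproc.
# Reads a single pid from the given file, sends it TERM, and removes the file.
# Only the recorded (parent) process is signalled, so child ordering is moot.
stop_by_pidfile() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1       # no readable pid file: do nothing
    read -r pid < "$pidfile"
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;        # refuse anything that is not a number
    esac
    kill -TERM "$pid" 2>/dev/null || return 1
    rm -f "$pidfile"
}

# Example call (path as used earlier in this report):
# stop_by_pidfile /var/run/spamassassin/spamd.pid
```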
This is essentially fixed in FC5 and RHEL4U3, but I want to do a little more cleaning up of this before closing this bug.
I'm not so sure this is "fixed" in RHEL4U3. We recently ran up2date on our RHEL4 system, to bring it up to U3. As part of the upgrade, we got a new version of spamassassin:

spamassassin-3.0.5-3.el4 Thu 16 Mar 2006 05:34:40 PM CST

The install of other packages lasted until 05:49:10 PM, and then up2date restarted all the daemons. But our boot.log indicates spamd didn't restart properly:

Mar 16 17:51:43 zeus spamassassin: spamd shutdown succeeded
Mar 16 17:51:44 zeus spamd: Could not create INET socket on 127.0.0.1:783: Address already in use (IO::Socket::INET: Address already in use)
Mar 16 17:51:44 zeus spamassassin: spamd startup failed

The result was that everything worked fine for an hour, until the last child exited due to the default --max-conn-per-child=200. And then we got the mess of

Mar 16 18:46:30 zeus spamc[20565]: connect(AF_INET) to spamd at 127.0.0.1 failed, retrying (#1 of 3): Connection refused

At the time, ps output indicated no spamd processes running, and a '/etc/init.d/spamassassin restart' worked fine to fix it.

I'm assuming that the restart would have used the newly-installed init script, so that suggests that this fix was incomplete. Or does the fix only work if spamd was started using it (due to treatment of a pid file)?
Unfortunately there is nothing we can do to ensure that this works when upgrading because the old init script didn't generate the pid file. This fix will only prevent failures in future upgrades, and regular restarts.