Bug 561568
Summary: | (amavisd noisy?) sa-update.cron generating errors | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Muzi <muzammel.linux> |
Component: | amavisd-new | Assignee: | Steven Pritchard <steve> |
Status: | CLOSED NOTABUG | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | high | Docs Contact: | |
Priority: | low | ||
Version: | 11 | CC: | bugzilla-redhat, perl-devel, ppisar, steve, xxlray |
Target Milestone: | --- | Keywords: | Reopened |
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2010-02-15 12:56:53 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Muzi
2010-02-03 22:41:59 UTC
I can't reproduce this here. Is your shell bash or something else? Does a 'rpm -V spamassassin' show the sa-update.cron has been modified any? (In reply to comment #1) > I can't reproduce this here. > > Is your shell bash or something else? > > Does a 'rpm -V spamassassin' show the sa-update.cron has been modified any? No i am using a default bash shell of fedora. rpm -V spamassassin S.5....T. c /etc/mail/spamassassin/local.cf Note when i manually run sa-update -D --checkonly, its working f9, so i think some problem in cron script only. Please suggest. I have found the problem where its is, script is sleep and then timeout. File --> /usr/share/spamassassin/sa-update.cron # Sleep random amount of time before proceeding to avoid overwhelming the servers sleep $(expr $RANDOM % 7200) I just hash above sleep command and script is completed and not ouput produce , if i unhash the above sleep line, then script runs for a time and produce below error. Waiting for the process [3822] to terminate Daemon [3822] terminated by SIGTERM So i think need to remove this line from cron script or modify its time. In my senses its has no advantage of using sleep because sa-update operation will do at end of script. Kindly suggest. Thanks As the comment says, the sleep is needed to prevent all the fedora machines in the world running spamassassin from hitting the update servers at the same time. I have no idea why the sleep doesn't work for you. It works fine here... is 'sleep' or 'expr' somehow overrided by some local alias? Ok i understand the logic of sleep but can you tell me one thing more or verify, can this output means script is working or not working properly ? Waiting for the process [3822] to terminate Daemon [3822] terminated by SIGTERM how much time of sleep,when script runs at cron time its gives this output after 2-3 hours or sometimes 4 hours, not fix time to terminate the script, Kindly just sure about above cron output, its means cron is working ok or have problem ? So much thanks. Muzi I don't see that output at all here. ;( Can you do: sh -x /usr/share/spamassassin/sa-update.cron and show us the results here?
> sh -x /usr/share/spamassassin/sa-update.cron
>
> and show us the results here?
One interesting which know you easily understand the problem. Today cron did not give any output like above. So after debugging , i come to know that sleep command depends upon the $RANDOM variable, as this var value is not fix, come diff values on each run, so if value is low the script is working ok and exit on normal time, and if sleep value is larger then script terminate some times after 5-6 hours or its process is terminated,
sh -x /usr/share/spamassassin/sa-update.cron
+ unset SAUPDATE
+ '[' -f /etc/sysconfig/sa-update ']'
+ . /etc/sysconfig/sa-update
+ for daemon in spamd amavisd
+ /sbin/pidof spamd
+ '[' 0 -eq 0 ']'
+ SAUPDATE=yes
+ for daemon in spamd amavisd
+ /sbin/pidof amavisd
+ '[' 0 -eq 0 ']'
+ SAUPDATE=yes
+ '[' -f /var/run/mimedefang.pid ']'
+ '[' -z yes ']'
+ '[' '!' -d /etc/mail/spamassassin/sa-update-keys ']'
+ CHANNELLIST=
+ KEYLIST=
++ run_parts_list /etc/mail/spamassassin/channel.d/ .conf
++ '[' 2 -lt 1 ']'
++ '[' '!' -d /etc/mail/spamassassin/channel.d/ ']'
++ '[' -d /etc/mail/spamassassin/channel.d/ ']'
++ '[' -n .conf ']'
++ find_opts='-name *.conf'
++ find -L /etc/mail/spamassassin/channel.d/ -mindepth 1 -maxdepth 1 -type f -name '*.conf'
++ sort -n
+ for file in '$(run_parts_list /etc/mail/spamassassin/channel.d/ .conf)'
+ PREFIXES='CHANNELURL KEYID BEGIN'
+ for prefix in '$PREFIXES'
+ grep -q CHANNELURL /etc/mail/spamassassin/channel.d/sought.conf
+ for prefix in '$PREFIXES'
+ grep -q KEYID /etc/mail/spamassassin/channel.d/sought.conf
+ for prefix in '$PREFIXES'
+ grep -q BEGIN /etc/mail/spamassassin/channel.d/sought.conf
+ . /etc/mail/spamassassin/channel.d/sought.conf
++ CHANNELURL=sought.rules.yerp.org
++ KEYID=6C6191E3
++ return 0
+ CHANNELLIST=' sought.rules.yerp.org'
+ KEYLIST=' 6C6191E3'
+ sa-update --import /etc/mail/spamassassin/channel.d/sought.conf
+ for file in '$(run_parts_list /etc/mail/spamassassin/channel.d/ .conf)'
+ PREFIXES='CHANNELURL KEYID BEGIN'
+ for prefix in '$PREFIXES'
+ grep -q CHANNELURL /etc/mail/spamassassin/channel.d/spamassassin-official.conf
+ for prefix in '$PREFIXES'
+ grep -q KEYID /etc/mail/spamassassin/channel.d/spamassassin-official.conf
+ for prefix in '$PREFIXES'
+ grep -q BEGIN /etc/mail/spamassassin/channel.d/spamassassin-official.conf
+ . /etc/mail/spamassassin/channel.d/spamassassin-official.conf
++ CHANNELURL=updates.spamassassin.org
++ KEYID=5244EC45
++ return 0
+ CHANNELLIST=' sought.rules.yerp.org updates.spamassassin.org'
+ KEYLIST=' 6C6191E3 5244EC45'
+ sa-update --import /etc/mail/spamassassin/channel.d/spamassassin-official.conf
++ expr 27941 % 7200
+ sleep 6341
i check when above sleep is 601 then script terminate fine, but if sleep is very large like 6-7 hours then cron is terminated and that's why its gives output of "watiting for process to terminated",
Now i hope you got the root casue for it, problem is for sleep delay.
[root@muzi~]# echo $(expr $RANDOM % 7200)
1845
[root@muzi ~]# echo $(expr $RANDOM % 7200)
5677
[root@muzi ~]# echo $(expr $RANDOM % 7200)
5512
If sleep time is low, then script complete terminate f9
If sleep time is very hight (6-7 hour) approx then script daemon is terminated and give below output.
Waiting for the process [3822] to terminate
Daemon [3822] terminated by SIGTERM
So problem is for sleep only, I think fix for this is define cron in cron.daily instead in cron.d with specific time. as of cron.daily has not a fix time to run, so its overcome this very large sleep delay problem.
How does 1 to 7200 seconds become 5-6 hours? It appears that something else is going on here. (In reply to comment #8) > How does 1 to 7200 seconds become 5-6 hours? > > It appears that something else is going on here. 5-6 hours are not fix, as i stated above, its depends upon the $RANDOM variable value, so above when u see on script run debug. /etc/mail/spamassassin/channel.d/spamassassin-official.conf ++ expr 27941 % 7200 + sleep 6341 So total sleep is 6341,so in one hour 3600 seconds, I hope now you got it. some times 3-4 hours etc. 1865 2490 3023 5865 4670 2024 3309 4349 4544 5533 6809 7174 808 3135 3666 6257 3765 1312 1790 2969 271 7185 6460 1341 5512 3940 7101 5015 5853 4624 0 6059 6055 2335 6287 4960 These are random numbers between 0 and 7199 inclusive. 7200 seconds is two hours. How is it possible that this could become 3-4 hours? Yes very right, i also just examine this time period, bascially cron errors came after 3-4 hours from the cron running time period. from now two days cron is running f9 and give no errors like above mentioned, so its seems that sleep command not the caused of errors, may be time out problem after sleep to connect with update channel servers or some other thing, but not confirm, Thanks for all of your help & comments, If next time error will come :) so i will try to diagnose. Today cron generated errors again, last three days no errors, so i am 100% sure at the end of script after sleep command, script connects with the channels list and if any channel is timeout or not responding then cron is hang and terminate with error Waiting for the process [15118] to terminate Daemon [15118] terminated by SIGTERM Suggestion for Fix. Add some channel alive checking before run sa-update to connect with channels. like ping -c2 etc & if ping is ok then do run sa-update and if not then echo xxx channel is down or again put some sleep function to verify channel after some time. BTW, are you using spamd, or something else like amavisd-new or mimedefang? (In reply to comment #13) > BTW, are you using spamd, or something else like amavisd-new or mimedefang? yes i am using spamd and amavisd daemon, and during cron job both runs. Why are you using both? amavis doesn't use spamd. You can turn off the spamd.
> Waiting for the process [15118] to terminate
> Daemon [15118] terminated by SIGTERM
I'm guessing the error message is amavis and not spamassassin. I don't think this is spamassassin's fault.
Reassigning to amavisd-new. If the problem isn't there, then I'm closing this bug NOTABUG because there isn't anything more I can do here.
I not understand your logic, cron is related to spamd its not concern with the amavisd , just after sa-update its condrestart both, so you please concentrate on spamd. I also verify that no updates available but still cron gives above output, so it means cron not goes to the part of condrestart , its before step to verify updates available or not. Kindly think about below. Timeout Check i think required before run sa-update. Add some channel alive checking before run sa-update to connect with channels. like ping -c2 etc & if ping is ok then do run sa-update and if not then echo xxx channel is down or again put some sleep function to verify channel after some time. Kindly look here below, today cron is terminated with below error http: GET http://yerp.org/rules/stage/330908747.tar.gz request failed: 500 Can't connect to yerp.org:80 (connect: timeout): 500 Can't connect to yerp.org:80 (connect: timeout) channel: could not find working mirror, channel failed cron start time 4:10 AM Cron give above output 9:36 AM So i think this is the problem which i told you above, need some host alive checking before sa-update run. Please suggest. sa-update will occasionally fail. This is not a big deal. Go complain to upstream. This is not a bug in our package. (In reply to comment #18) > sa-update will occasionally fail. This is not a big deal. Go complain to > upstream. This is not a bug in our package. Ok, but i am sure now below error come for the reason of channel timeout. Waiting for the process [15118] to terminate Daemon [15118] terminated by SIGTERM Please sure, and suggest solution for timeout issue. Now come to the final point, where problem occur at end of cron script.if sa-updates founds updates, then condrestart spamd & amavisd /usr/bin/sa-update $arglist if [ $? -eq 0 ]; then /etc/init.d/spamassassin condrestart > /dev/null [ -f /etc/init.d/amavisd ] && /etc/init.d/amavisd condrestart > /dev/null [ -f /var/run/mimedefang.pid ] && /etc/init.d/mimedefang reload > /dev/null Now if you said its not bug then why script failed on condrestart of amavisd ? As a cheap, utterly foul, best of bad practices fix, I did this: if [ $? -eq 0 ]; then /sbin/service spamassassin condrestart > /dev/null # [ -f /etc/init.d/amavisd ] && /sbin/service amavisd condrestart > /dev/null [ -f /etc/init.d/amavisd ] && /sbin/service amavisd condrestart 2>&1 |grep -v "Waiting for the process\|terminated by SIGTERM\|OK\|amavisd stopped\|Starting amavisd\|^$" [ -f /var/run/mimedefang.pid ] && /sbin/service mimedefang reload > /dev/null fi It is a problem with the /etc/init.d/amavisd script and should be sent to that maintainer as the script doesn't follow best practices for silencing non-error output. (listen to me. hahahaha ) I recently migrated a server from RHEL4 to RHEL6 an now have a problem when I restart the amavisd inside a script via a cronjob. The output should be redirected to (a variable and then to) a file but it isn't: /etc/init.d/amavisd restart 2>&1>> $TEXT The cronjob always sends me a mail containing "Daemon [4343] terminated by SIGTERM". This is very(!) noisy and worked properly on RHEL4 as far as I can remember. (In reply to comment #22) > I recently migrated a server from RHEL4 to RHEL6 an now have a problem when > I restart the amavisd inside a script via a cronjob. The output should be > redirected to (a variable and then to) a file but it isn't: > > /etc/init.d/amavisd restart 2>&1>> $TEXT > This means to redirect error output to descriptor of original standard output and append standard output to a file. You have to reverse the redirects >>$TEXT 2>&1. See bash(1). Also you can write &>>$TEXT. Oh ... Thanks for the correction. But the behavior still is noisy for anyone who does not (want to) redirect errors to a log file. |