Red Hat Bugzilla – Bug 247228
cron jobs fail semi-randomly if sendmail incapacitated
Last modified: 2007-11-30 17:12:09 EST
Description of problem:
Version-Release number of selected component (if applicable):
How reproducible:
Easy if you set up the right conditions
Steps to Reproduce:
1. incapacitate sendmail by moving aside /usr/sbin/sendmail, e.g.:
# mv /usr/sbin/sendmail /usr/sbin/sendmail.hide
2. run a cron job which is a shell script that produces more output than
the default stdio buffer can hold. This demonstrates the problem well (also attached):
# cat /tmp/cronjob-sigpipe.sh; echo ====
mkdir -p /tmp/cronjob-sigpipe
gubbish="`dd if=/etc/termcap bs=4k count=1 2>/dev/null`"
while [ $loop -lt $num ]; do
3. Start this as a cron job. Do this as a regular user, one who doesn't
currently have any cron jobs (it will blow them away):
$ echo '* * * * * /tmp/cronjob-sigpipe.sh' | crontab
Allow it to run for 20 minutes or so.
Actual results:
Files appear in /tmp/cronjob-sigpipe, but not as many as it says it's
going to make; the terminating "succeeded" file is sometimes missing.
Nota bene: the files it's creating are empty. The output that causes
the SIGPIPE and cron job failure is going to the job's stdout, to be
collected for sending as mail to the job owner. Creating files is a
way of recording the job's progress through a channel other than the
failing mail channel that one normally uses to observe cron job
Expected results:
All the files the script is supposed to create should be created.
Additional info:
Of course this is just an example script. It produces a random amount
of output, thus demonstrating that scripts which produce more than a
certain amount of output suddenly stop running.
This is because `crond` pipes the job's stdout/stderr to a `sendmail`
process in order to mail the output to the initiating user. It uses an
internal function, cron_popen(). This function calls execvp() and assumes
success, when in fact it can easily fail if /usr/sbin/sendmail is missing.
In that case, what is essentially a poisoned stdio file pointer is created.
The cron job blithely continues for a while, even producing output, until
the in-memory stdio buffer becomes full. Then stdio tries to flush to the
pipe, gets SIGPIPE, and the process(es) of the cron job are killed.
There are several workarounds, starting with the most obvious:
1. make sure /usr/sbin/sendmail is not incapacitated
2. use `MAILTO=""' in all crontabs (don't forget to cover both user &
3. arrange for `crond` to be run with a "-m /some/other/program" flag,
specifying something that disposes of the attempted mail output one
way or another
But these are all workarounds available to the person who has already
discovered _why_ his cron jobs sometimes work and sometimes mysteriously
die. Before he can use them, he must suffer through the discovery
Suggested fix: cron_popen() must _notice_ if execvp() fails. It must
inform the parent process (I'm not sure what's the best way). The jobs
must then behave deterministically. Either they should succeed (while
e.g. sending the mail output to /dev/null); or they must fail every time.
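One common way for a parent to notice an exec failure is a close-on-exec status pipe. The sketch below is only an illustration of that technique, not a patch for cron_popen(); spawn_checked() and its parameters are names made up here. If execvp() succeeds, the kernel closes the write end and the parent reads EOF. If it fails, the child sends its errno back before exiting.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not the cron_popen() API.
 * Returns the child's pid if the exec succeeded. Returns -1 and stores
 * the reason in *exec_errno if pipe/fork/exec failed.
 */
pid_t spawn_checked(char *const argv[], int *exec_errno)
{
    int status_pipe[2];
    *exec_errno = 0;

    if (pipe(status_pipe) == -1) {
        *exec_errno = errno;
        return -1;
    }
    /* The write end vanishes automatically on a successful exec. */
    if (fcntl(status_pipe[1], F_SETFD, FD_CLOEXEC) == -1) {
        *exec_errno = errno;
        close(status_pipe[0]);
        close(status_pipe[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        *exec_errno = errno;
        close(status_pipe[0]);
        close(status_pipe[1]);
        return -1;
    }
    if (pid == 0) {
        close(status_pipe[0]);
        execvp(argv[0], argv);
        int err = errno;         /* exec failed: tell the parent why */
        if (write(status_pipe[1], &err, sizeof err) == -1) {
            /* nothing more the child can do */
        }
        _exit(127);
    }

    close(status_pipe[1]);
    int err = 0;
    ssize_t n;
    do {
        n = read(status_pipe[0], &err, sizeof err);
    } while (n == -1 && errno == EINTR);
    close(status_pipe[0]);

    if (n == (ssize_t)sizeof err) {  /* child reported an exec failure */
        waitpid(pid, NULL, 0);       /* reap it so no zombie is left */
        *exec_errno = err;
        return -1;
    }
    return pid;                      /* EOF: the exec went through */
}
```

With something along these lines the caller gets a definite answer before any job output is written, so it can log strerror() of the reported errno and then decide, the same way every time, whether to run the job without mail or not at all.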
`crond` could also check, at startup time, whether the binary it intends
to use as `sendmail` exists and is executable. Such a startup-time
check is only a partial fix (`sendmail` could disappear during `crond`'s
uptime), but it affords an opportunity to print a useful error message;
it will explain yesterday's peculiar behavior during today's reboot. Even
better, check each time it's going to run a job, and log a warning in syslog
if necessary. On a system which deliberately has no MTA, the admin can
disable the warnings by using "-m /bin/fake-sendmail-dump-to-dev-null"
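A check of this kind could look roughly like the following C sketch. The function names and the wording of the log message are assumptions made for illustration, not what crond does or prints. Note that access() tests against the real uid, which is fine for a daemon running as root but is still only a hint: the binary can disappear between the check and the exec.

```c
#include <errno.h>
#include <string.h>
#include <sys/stat.h>
#include <syslog.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 0 if `path` looks like a runnable
 * mailer binary, otherwise an errno value describing the problem.
 */
int mailer_check(const char *path)
{
    struct stat st;

    if (stat(path, &st) == -1)
        return errno;            /* e.g. ENOENT when it is missing */
    if (!S_ISREG(st.st_mode))
        return EACCES;           /* a directory, device, etc. */
    if (access(path, X_OK) == -1)
        return errno;            /* present but not executable */
    return 0;
}

/* Log a warning (message text is illustrative only) if the check fails. */
void warn_if_mailer_unusable(const char *path)
{
    int err = mailer_check(path);

    if (err != 0)
        syslog(LOG_WARNING, "mailer %s is not usable: %s",
               path, strerror(err));
}
```

Calling warn_if_mailer_unusable() once at startup gives the useful error message, and calling it before each job covers a mailer that disappears while the daemon is up.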
Original context: VMware ESX Server 2.5.4, with RHEL2.1-based Console OS,
with vixie-cron-4.1-11.EL3. But I am now reproducing exactly the same
problem in an FC7 live CD boot (in an ESX VM...). `sendmail` is
"incapacitated" on ESX, in that it isn't installed at all. Much the same
thing could happen in any sort of small embedded environment. (Ref:
VMware PR 144651)
Created attachment 158639
Shell script, to be run as a repeating cron job
Thank you for the report.
I've added a syslog report (a warning about the possible problem), but I'm thinking
about a better fix. Checking for sendmail or another mail service could
solve this issue.
I'd like to see the text of the "possible problem" syslog warning.
For the full fix, remember a system may deliberately omit mailers for enhanced
My analysis shows that the root cause is crond's cron_popen() not noticing
execvp() failure. I recommend fixing by:
1. fix cron_popen() to notice execvp() failure, return failure to its caller.
2. cron_popen()'s caller in cron shouldn't exit on failure, just syslog a
message [including errno or other specifics of _why_ it failed], then run the
command without logging -- as if `mailto' was empty.
This makes cron jobs on my hypothetical no-mailer system somewhat noisy. I
think that's acceptable: a system designer/operator who wants to avoid the noise
can rebuild cron without it, force mailto="" for all cron jobs, or supply a
The important thing is that they'll actually _notice_ the issue and be able to
deal with it. Which is much much better than having some random subset of cron
jobs mysteriously die in mid-operation.
The first problem is solved with the message:
CRON: Exec of (/usr/lib/sendmail) had failed because: (No such file or directory)
The solution to the second problem is in progress.
"had failed" should be "failed".
Adding relevant ISC engineers.
Ok, not adding them -- apparently can't add arbitrary email addresses.
I would like to add Evan Hunt & Paul Vixie.
The fix is complete. I added it in F-8 (updates). I'm not sure, when will be
The fix is now also in devel of vixie-cron. If you have any thoughts about it,
please let me know.