Bug 649674

Summary: Mail from cron most days full of "gzip: stdout: Broken pipe" entries
Product: [Fedora] Fedora Reporter: Paul Howarth <paul>
Component: cronie Assignee: Marcela Mašláňová <mmaslano>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium Docs Contact:
Priority: low    
Version: 14 CC: cjwatson, daveg, ejsheldrake, extras-orphan, mmaslano, ondrejj, pertusus, tmraz, varekova
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: cronie-1.4.7-1.fc14 Doc Type: Bug Fix
Last Closed: 2011-03-16 13:45:07 UTC Type: ---

Description Paul Howarth 2010-11-04 09:32:19 UTC
Description of problem:
Most mornings since installing F14 (5-6 days now) I get an email from cron like this:

From: Anacron <it>
To: root <root>
Date: Thu, 4 Nov 2010 03:34:42 +0000
Subject: Anacron job 'cron.daily' on roary.uk.virtensys.com
Message-ID: <201011040334.oA43Ygne003286.virtensys.com>
Accept-Language: en-GB, en-US
Content-Language: en-US
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0

/etc/cron.daily/man-db.cron:


gzip: stdout: Broken pipe

gzip: stdout: Broken pipe

gzip: stdout: Broken pipe



The number of "gzip" lines varies from day to day. I suspect it may be related to the number of updates installed the previous day but that's just a hunch.

Version-Release number of selected component (if applicable):
man-db-2.5.7-2.fc14

How reproducible:
Not very. Running the cron script manually doesn't seem to trigger this.

Comment 1 Ivana Varekova 2010-11-05 11:22:44 UTC
Hello,
could you please edit /etc/cron.daily/man-db.cron to add the --debug option
to the mandb call, as in this patch, and paste the resulting cron output here:
----------------------------
--- ./man-db.crondaily.pom	2010-10-18 12:12:28.000000000 +0200
+++ ./man-db.crondaily	2010-11-05 12:20:25.000000000 +0100
@@ -20,6 +20,6 @@ LOCKFILE=/var/lock/man-db.lock
 trap "{ rm -f $LOCKFILE ; exit 255; }" EXIT
 touch $LOCKFILE
 # create/update the mandb database
-mandb $OPTS
+mandb --debug $OPTS
 
 exit 0
---------------------------

Comment 2 Paul Howarth 2010-11-15 13:59:37 UTC
I tried a few things over the last week:

1. Setting OPTS=--debug in /etc/sysconfig/man-db

I got lots of output from cron as a result but none of the reported errors, despite gzip use being evident from the output.

2. Editing /etc/cron.daily/man-db.cron to redirect stdout and stderr to a temp file

Similar result to the first effort (output in temp file)

I then reverted to the out of the box configuration and the errors are back.

Strangely, I'm only seeing this on one of the 4 F14 boxes (3 real, one virtual - the problem box is real) I have.

Comment 3 Edward Sheldrake 2010-11-16 20:14:36 UTC
If man-db was rebuilt to use zlib, it then wouldn't run gzip, so probably wouldn't have this exact issue.

I was going to request adding BuildRequires: zlib-devel to make man-db faster, but in my testing it didn't seem to be significantly faster. The Debian and Ubuntu man-db packages appear to be built with zlib support.

Comment 4 Paul Howarth 2010-11-18 09:14:34 UTC
This issue looks similar to Bug #64836 and Bug #170402; maybe a similar fix/workaround could be applied?

Comment 5 Colin Watson 2010-12-02 12:48:26 UTC
While I'm not entirely unwilling to work around this in man-db upstream by forcing SIGPIPE back to its default disposition, the root cause is bug #475106.  Perhaps somebody could reopen that and put it somewhere where it might get fixed rather than triaged into oblivion?
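
For illustration, the effect of an inherited SIGPIPE disposition on a writer whose reader goes away can be sketched in Python. This is a hypothetical stand-in, not mandb's or cron's code: it assumes a Unix system with the "yes" utility on PATH, uses "yes" in place of gzip, and run_writer_into_closed_pipe is a name made up for this sketch.

```python
import signal
import subprocess


def run_writer_into_closed_pipe(ignore_sigpipe):
    """Start `yes`, close the read end of its stdout pipe, and report how it ended.

    Returns (returncode, stderr_bytes). `yes` is only a stand-in for gzip.
    """
    disposition = signal.SIG_IGN if ignore_sigpipe else signal.SIG_DFL

    def set_disposition():
        # Runs in the child between fork() and exec(); the disposition
        # chosen here is what the writer inherits.
        signal.signal(signal.SIGPIPE, disposition)

    proc = subprocess.Popen(
        ["yes"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        restore_signals=False,  # leave signal setup entirely to set_disposition
        preexec_fn=set_disposition,
    )
    proc.stdout.read(1)   # make sure the writer is running and writing
    proc.stdout.close()   # the reader goes away: the next write hits a broken pipe
    stderr = proc.stderr.read()
    proc.stderr.close()
    return proc.wait(timeout=10), stderr


if __name__ == "__main__":
    for ignore in (False, True):
        code, err = run_writer_into_closed_pipe(ignore)
        print("SIGPIPE ignored:", ignore, "returncode:", code, "stderr:", err)
```

With SIGPIPE at its default disposition the writer is killed silently by the signal (a negative return code in Python). With SIGPIPE ignored, write() fails with EPIPE instead and the writer reports the error itself, which is the kind of message that ends up in the cron mail.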

Comment 6 Ivana Varekova 2011-01-03 11:29:31 UTC
Hello, the output could be redirected to /dev/null
(see https://bugzilla.redhat.com/show_bug.cgi?id=64836), but it would be helpful to have the debug output described in comment 1.
I will ask the vixie-cron maintainer about bug 475106.

Comment 7 Ivana Varekova 2011-01-03 11:38:53 UTC
From my point of view, the best solution in this case is to patch the vixie-cron package (see https://bugzilla.redhat.com/show_bug.cgi?id=475106 for the bug description). Marcela, could you please look at it?

Comment 8 DaveG 2011-03-01 13:10:36 UTC
I have been seeing the same issue and have done a little digging: The problem may not be with cronie itself but whatever is used to run the batch scripts.

I added the line 'cat /proc/self/status' to /etc/cron.daily/man-db.cron and saw "SigIgn: 0000000001001000". Looks like SigPipe is ignored at that point.

Then I added a job to /etc/crontab - '* * * * * root cat /proc/self/status' and saw "SigIgn: 0000000000000000". So, no signals ignored by cronie or its job execution code.
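
For anyone who wants to read these masks without counting bits by hand, here is a minimal Python sketch. It assumes the usual Linux /proc convention that bit N-1 of the hexadecimal SigIgn value stands for signal number N; ignored_signals is a helper name made up for this example.

```python
def ignored_signals(sigign_hex):
    """Return the signal numbers whose bits are set in a SigIgn hex mask.

    Assumes the Linux /proc/<pid>/status layout: bit N-1 <-> signal N.
    """
    mask = int(sigign_hex, 16)
    return [bit + 1 for bit in range(mask.bit_length()) if (mask >> bit) & 1]


if __name__ == "__main__":
    # Masks quoted in this report.
    print(ignored_signals("0000000001001000"))  # [13, 25]
    print(ignored_signals("0000000000000000"))  # []
```

Signal 13 is SIGPIPE on Linux, so the first mask does show SIGPIPE ignored and the second shows nothing ignored.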

I run batches using a Python script that handles pre/daily/weekly/monthly/post etc. and found a cure for my issues: The Python subprocess.Popen() call that I use to execute jobs ignores SigPipe, presumably to keep the noise from subprocesses down. I added a "preexec_fn" to the call that resets SigPipe: "signal.signal(signal.SIGPIPE, signal.SIG_DFL)". The result is no more 'gzip: stdout: Broken pipe' messages from the batch runs.
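
A minimal sketch of that preexec_fn approach follows. It is not the actual batch script: reset_sigpipe and child_sigign_mask are names made up here, and the check assumes Linux, since it reads /proc/self/status from the child.

```python
import signal
import subprocess

# Bit for SIGPIPE in a SigIgn mask, assuming bit N-1 <-> signal N.
SIGPIPE_BIT = 1 << (signal.SIGPIPE - 1)


def reset_sigpipe():
    # Runs in the child between fork() and exec(): put SIGPIPE back to its
    # default so the job does not inherit an ignored SIGPIPE.
    signal.signal(signal.SIGPIPE, signal.SIG_DFL)


def child_sigign_mask():
    """Run `cat /proc/self/status` with SIGPIPE reset and return its SigIgn mask."""
    result = subprocess.run(
        ["cat", "/proc/self/status"],
        preexec_fn=reset_sigpipe,
        stdout=subprocess.PIPE,
        check=True,
    )
    for line in result.stdout.decode().splitlines():
        if line.startswith("SigIgn:"):
            return int(line.split()[1], 16)
    raise RuntimeError("no SigIgn line found in /proc/self/status")


if __name__ == "__main__":
    mask = child_sigign_mask()
    print("SigIgn: %016x" % mask)
    print("SIGPIPE ignored in child:", bool(mask & SIGPIPE_BIT))
```

The same preexec_fn can be passed to subprocess.Popen() for the real jobs; the /proc check is only there to confirm that the child starts with SIGPIPE at its default.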

So, does the standard configuration (/usr/bin/run-parts) do the same kind of thing? Apparently not...

mkdir /etc/cron.test
cat >/etc/cron.test/status <<EOF
#!/bin/sh
cat /proc/self/status
EOF
chmod +x /etc/cron.test/status
/usr/bin/run-parts /etc/cron.test
...
SigIgn:	0000000000000000
...

P.S. cronie-1.4.5-4.fc14.x86_64

Comment 9 Colin Watson 2011-03-01 13:50:36 UTC
DaveG, regarding Python's handling of SIGPIPE, you may find it useful to read http://bugs.python.org/issue1652.  In short, the fact that you had to use that preexec_fn was a bug in Python, fixed in Python 3.2.

Comment 10 DaveG 2011-03-02 01:20:53 UTC
Thanks for the heads-up, Colin. Still using Python 2.7.
Just wondering if the same kind of issue is in play for this bug? cronie itself looks OK but what about anacron+run-parts? I see exactly the same symptoms and resetting SigPipe fixed it for me. My scripts replace anacron so I can't easily check.

Comment 11 Tomas Mraz 2011-03-02 08:26:47 UTC
It does not seem to happen here. I updated cronie with yum on my F14 VM, and SIGPIPE is not ignored in the crond process:

ps axu | grep crond
root     12922  5.1  0.1 112540  1312 ?        Ss   09:17   0:00 crond

cat /proc/12922/status | grep SigIgn
SigIgn:	0000000001000000

I am not sure why the 25th signal is ignored but SIGPIPE is not.

So I do not think it is related to bug 475106.
Also, neither the crond nor the anacron sources in cronie modify the SIGPIPE ignore status.

Comment 12 Tomas Mraz 2011-03-02 10:08:34 UTC
Reporter, can you still reproduce it on the fully updated F14 system?

Comment 13 Paul Howarth 2011-03-02 10:11:26 UTC
I haven't seen this issue for a few months now.

Comment 14 Colin Watson 2011-03-02 10:59:17 UTC
Tomas, it may depend on the context in which cron was last restarted.  For instance, gnome-session still (over ten years after mailing list discussions I've found on the subject!) leaves SIGPIPE ignored for its child processes.  These might include terminal windows where you might for example run an upgrade that causes cron to be restarted.

Comment 15 Colin Watson 2011-03-02 11:22:23 UTC
Also, on noticing this gnome-session behaviour, I got fed up and decided to just work around this in man-db upstream:

Wed Mar  2 11:12:13 GMT 2011  Colin Watson  <cjwatson>

        * src/mandb.c (main): Reset SIGPIPE to SIG_DFL on startup, to avoid
          noisy output in the event that mandb was started from a context
          where SIGPIPE was ignored (e.g. Fedora bug #649674).
        * NEWS: Document this.

Of course, that doesn't mean it stops being a bug anywhere else.  Ignored-SIGPIPE bugs are pernicious.

Comment 16 Tomas Mraz 2011-03-02 11:30:06 UTC
OK, but then the question is who should clear the SIGPIPE ignore action when it is set.

By the way, bash does not seem to have SIGPIPE set to ignore when run from either gnome-terminal or lxterminal, although both gnome-terminal and lxterminal themselves have it set. So apparently lxterminal and gnome-terminal reset the SIGPIPE action.

Comment 17 Tomas Mraz 2011-03-02 11:52:11 UTC
I also tried upgrading cronie with PackageKit: although packagekitd had SIGPIPE ignored, the crond that was started during the upgrade did not. So I am really out of ideas as to where the ignored SIGPIPE is coming from.

To me it would seem logical to clear it in /sbin/service or in the initscript daemon function; however, bash does not allow scripts to reset signals that were already ignored when bash started.

So perhaps cron really should clear the SIGPIPE action when it daemonizes itself, just to be sure that the signal is not ignored in its children.

Comment 18 Colin Watson 2011-03-02 12:54:18 UTC
Yes, gnome-terminal certainly does.  I noticed it because pterm doesn't.

Comment 19 Marcela Mašláňová 2011-03-16 13:45:07 UTC
Fixed upstream in commit ee4cbe7659ede3f61db18cc922ee0e27268a8579. It will be in cronie-1.4.7.

Comment 20 Fedora Update System 2011-03-16 15:16:46 UTC
cronie-1.4.7-1.fc14 has been submitted as an update for Fedora 14.
https://admin.fedoraproject.org/updates/cronie-1.4.7-1.fc14

Comment 21 Fedora Update System 2011-03-30 19:58:03 UTC
cronie-1.4.7-1.fc14 has been pushed to the Fedora 14 stable repository.  If problems still persist, please make note of it in this bug report.