Red Hat Bugzilla – Bug 241434
Default logrotate.d/syslog is reckless
Last modified: 2010-10-22 11:20:43 EDT
Description of problem:
The default 'postrotate' script used in sysklogd's logrotate.d/syslog script
seems to go out of its way to delivery a bullet to its foot:
/bin/kill -HUP `cat /var/run/syslogd.pid 2> /dev/null` 2> /dev/null || true
...ensuring that any errors w/ the sighup are masked. This flaw ensures that log
data will get thrown away if compression is enabled or the failed HUP persists
through a full rotation cycle.
Version-Release number of selected component (if applicable):
sysklogd-1.4.1-26_EL and others
Steps to Reproduce:
1. Cause the syslog.pid file to be incorrect or missing (perhaps incorrect
SELinux context, fat fingers, initscripts killproc() bug, etc).
Option 1: Let logrotate run through a full rotation
Option 2: Add 'compress' to logrotate.conf and let logrotate run once.
The kill in the postrotate script will silently fail. Logrotate would normally
observe the error and avoid destroying the still-open on-disk inode. Instead,
sysklogd's logrotate configuration bypasses this safety measure needlessly
jeopardizing system logs.
That sysklogd's default configuration doesn't go out of its way to prevent
logrotate handling errors the way it was designed to.
While step #1 could be dismissed with a self-righteous "stupid mistakes result
in stupid behavior", the possibility of this condition is the only apparent
justification for trying to mask the kill's return value in the first place.
The "|| true" part is in the script for the case when the logging process is not
running (which is OK) and we do not want to see any errors about not restarting
it. But we still want its previous logs to be rotated.
comment 2 seems like quite an edge case ("we don't use syslog but want logs to
be rotated") to risk the integrity of logs that are actually being used. If this
were deferred to initscripts 'reload', there would be no reason for sloppy
workarounds in the postrotate.
Tomas -- could you expand on why Redhat prefers a quiet logrotate over integrity
of syslog data?
Tomas, what if we get rid of "|| true". syslog should run anyway.
I've tried to replace the postrotate script with
/sbin/service syslog condrestart
and it looks to be working fine: not starting the service when it's stopped,
failing on error... And we have no "|| true" thing.
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release. Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products. This request is not yet committed for inclusion in an Update
I'm not sure this "bug" is fixable at all. In the comment #7 I only suggested a way how to get rid of the "|| true" clause but the actual result would of course depend on the implementation of condrestart in the initscript which in our case depends on the lockfile (i think) so the root user with fat fingers can now corrupt lockfile instead of pid file and we're back at the beginning.
And one thing I forgot of entirely -- the downtime of restart is significantly higher than of reload or SIHUPping the daemon and in case of syslog it may be a reason to keep the logrotate config file as is.
I'm not the maintainer, but I would recommend to reconsider the devel_ack.
Wrabco, what do you think?
Based on my experience as a former initscripts comaintainer, using pidof is anything but robust. pidof can return user's processes instead of the system daemon. If root needs to run an additional syslogd instance (perhaps to store messages sent over UDP), pidof can't tell which process is the "system" instance and which process is the other. If there is a chroot set up, pidof can return the chrooted instance instead of the main one, or vice versa.
Protecting against any random system misconfiguration is futile. The syslogd binary could be corrupted, resulting in incorrect SIGHUP handling. The logrotate.d/sysklogd file could have an incorrect SELinux context. The system environment could have been corrupted to prevent shell operation in very many ways. Some of these events are more likely than others, but they're all about as likely that syslogd will remain running and its pid file will vanish or become inaccessible.
If the pid file, which is specifically intended for the purpose of identifying the running syslogd, can not be relied upon as an authoritative source of information, there's no reason to trust anything else in the system more.
Perhaps it makes some sense to report incorrect pid files (by returning true if the pid file does not exist, and using it without any "|| true" otherwise), but I don't think there is any reasonable way to detect whether syslogd was never started or whether the pid file is missing.
This is not a bug, I agree with Miloslav Trmac, that:
"Protecting against any random system misconfiguration is futile. The syslogd
binary could be corrupted, resulting in incorrect SIGHUP handling. The
logrotate.d/sysklogd file could have an incorrect SELinux context. The system
environment could have been corrupted to prevent shell operation in very many
ways. Some of these events are more likely than others, but they're all about
as likely that syslogd will remain running and its pid file will vanish or
If the pid file, which is specifically intended for the purpose of identifying
the running syslogd, can not be relied upon as an authoritative source of
information, there's no reason to trust anything else in the system more."
re comment 10 / comment 14 -- if we're resigned that initscripts will only ever be as good as pidof, comment 20 seems reasonable and perhaps /etc/init.d/syslog's reload() function should be collapsed to this one line. Otherwise, the initscripts are the place for this logic to be encapsulated and it's only sensible to utilize them whenever possible.
We rely on /var/run/syslogd.pid. It's intended for the purpose of identifying
the running syslogd. pidof might be also considered but according to comment 14 it's not the robust solution.
I don't consider this issue as a bug.