Bug 241434

Summary:	Default logrotate.d/syslog is reckless
Product:	Red Hat Enterprise Linux 4	Reporter:	Kevin Graham <kgraham>
Component:	sysklogd	Assignee:	Peter Vrabec <pvrabec>
Status:	CLOSED NOTABUG	QA Contact:	Brian Brock <bbrock>
Severity:	medium	Docs Contact:
Priority:	medium
Version:	4.0	CC:	dmair, mriddle, ndoane, sgrubb, tao, tsmetana
Target Milestone:	---	Keywords:	Reopened
Target Release:	---
Hardware:	All
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2009-03-16 10:23:43 UTC	Type:	---
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:	232914
Bug Blocks:	391511

Description Kevin Graham 2007-05-25 22:40:12 UTC

Description of problem:

The default 'postrotate' script used in sysklogd's logrotate.d/syslog script
seems to go out of its way to delivery a bullet to its foot:

postrotate
    /bin/kill -HUP `cat /var/run/syslogd.pid 2> /dev/null` 2> /dev/null || true
endscript

...ensuring that any errors w/ the sighup are masked. This flaw ensures that log
data will get thrown away if compression is enabled or the failed HUP persists
through a full rotation cycle.

Version-Release number of selected component (if applicable):

sysklogd-1.4.1-26_EL and others

How reproducible:

Steps to Reproduce:
1. Cause the syslog.pid file to be incorrect or missing (perhaps incorrect
SELinux context, fat fingers, initscripts killproc() bug, etc).
Option 1: Let logrotate run through a full rotation
Option 2: Add 'compress' to logrotate.conf and let logrotate run once.
  
Actual results:

The kill in the postrotate script will silently fail. Logrotate would normally
observe the error and avoid destroying the still-open on-disk inode. Instead,
sysklogd's logrotate configuration bypasses this safety measure needlessly
jeopardizing system logs.

Expected results:

That sysklogd's default configuration doesn't go out of its way to prevent
logrotate handling errors the way it was designed to.

Additional info:

While step #1 could be dismissed with a self-righteous "stupid mistakes result
in stupid behavior", the possibility of this condition is the only apparent
justification for trying to mask the kill's return value in the first place.

Comment 1 Kevin Graham 2007-06-05 16:01:11 UTC

Ping?

Comment 2 Tomas Smetana 2007-06-07 12:05:07 UTC

The "|| true" part is in the script for the case when the logging process is not
running (which is OK) and we do not want to see any errors about not restarting
it.  But we still want its previous logs to be rotated.

Comment 3 Kevin Graham 2007-06-07 15:27:03 UTC

comment 2 seems like quite an edge case ("we don't use syslog but want logs to
be rotated") to risk the integrity of logs that are actually being used. If this
were deferred to initscripts 'reload', there would be no reason for sloppy
workarounds in the postrotate.

Comment 4 Kevin Graham 2007-07-17 16:15:43 UTC

Tomas -- could you expand on why Redhat prefers a quiet logrotate over integrity
of syslog data?

Comment 6 Peter Vrabec 2008-02-08 11:27:52 UTC

Tomas, what if we get rid of "|| true". syslog should run anyway.

Comment 7 Tomas Smetana 2008-02-11 14:41:04 UTC

I've tried to replace the postrotate script with

/sbin/service syslog condrestart

and it looks to be working fine: not starting the service when it's stopped,
failing on error...  And we have no "|| true" thing.

Comment 8 RHEL Program Management 2008-09-05 17:17:27 UTC

This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 10 Tomas Smetana 2009-01-21 16:22:43 UTC

I'm not sure this "bug" is fixable at all.  In the comment #7 I only suggested a way how to get rid of the "|| true" clause but the actual result would of course depend on the implementation of condrestart in the initscript which in our case depends on the lockfile (i think) so the root user with fat fingers can now corrupt lockfile instead of pid file and we're back at the beginning.

And one thing I forgot of entirely -- the downtime of restart is significantly higher than of reload or SIHUPping the daemon and in case of syslog it may be a reason to keep the logrotate config file as is.

I'm not the maintainer, but I would recommend to reconsider the devel_ack.

Wrabco, what do you think?

Comment 14 Miloslav Trmač 2009-01-21 22:25:09 UTC

Based on my experience as a former initscripts comaintainer, using pidof is anything but robust.  pidof can return user's processes instead of the system daemon.  If root needs to run an additional syslogd instance (perhaps to store messages sent over UDP), pidof can't tell which process is the "system" instance and which process is the other.  If there is a chroot set up, pidof can return the chrooted instance instead of the main one, or vice versa.

Protecting against any random system misconfiguration is futile.  The syslogd binary could be corrupted, resulting in incorrect SIGHUP handling.  The logrotate.d/sysklogd file could have an incorrect SELinux context.  The system environment could have been corrupted to prevent shell operation in very many ways.  Some of these events are more likely than others, but they're all about as likely that syslogd will remain running and its pid file will vanish or become inaccessible. 

If the pid file, which is specifically intended for the purpose of identifying the running syslogd, can not be relied upon as an authoritative source of information, there's no reason to trust anything else in the system more.

Perhaps it makes some sense to report incorrect pid files (by returning true if the pid file does not exist, and using it without any "|| true" otherwise), but I don't think there is any reasonable way to detect whether syslogd was never started or whether the pid file is missing.

Comment 20 Peter Vrabec 2009-01-23 10:42:57 UTC

This is not a bug, I agree with Miloslav Trmac, that: 

"Protecting against any random system misconfiguration is futile.  The syslogd
binary could be corrupted, resulting in incorrect SIGHUP handling.  The
logrotate.d/sysklogd file could have an incorrect SELinux context.  The system
environment could have been corrupted to prevent shell operation in very many
ways.  Some of these events are more likely than others, but they're all about
as likely that syslogd will remain running and its pid file will vanish or
become inaccessible. 

If the pid file, which is specifically intended for the purpose of identifying
the running syslogd, can not be relied upon as an authoritative source of
information, there's no reason to trust anything else in the system more."

Comment 21 Kevin Graham 2009-01-30 21:20:52 UTC

re comment 10 / comment 14 -- if we're resigned that initscripts will only ever be as good as pidof, comment 20 seems reasonable and perhaps /etc/init.d/syslog's reload() function should be collapsed to this one line. Otherwise, the initscripts are the place for this logic to be encapsulated and it's only sensible to utilize them whenever possible.

Comment 25 Peter Vrabec 2009-03-16 10:23:43 UTC

We rely on /var/run/syslogd.pid. It's intended for the purpose of identifying
the running syslogd. pidof might be also considered but according to comment 14 it's not the robust solution.

I don't consider this issue as a bug.