Bug 45647 - log rotation race condition (2 copies of logrotate are running) (cron is running 2 (two) instances of some entries in the crontab)
log rotation race condition (2 copies of logrotate are running) (cron is runn...
Status: CLOSED DUPLICATE of bug 29868
Product: Red Hat Linux
Classification: Retired
Component: vixie-cron (Show other bugs)
7.0
i386 Linux
medium Severity medium
: ---
: ---
Assigned To: Jens Petersen
Aaron Brown
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2001-06-24 11:11 EDT by Need Real Name
Modified: 2007-04-18 12:33 EDT (History)
4 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2002-08-28 14:40:03 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Need Real Name 2001-06-24 11:11:40 EDT
From root  Sun Jun 24 04:02:07 2001
Return-Path: <root@localhost.localdomain>
Received: (from root@localhost)
        by localhost.localdomain (8.11.0/8.11.0) id f5O826503328
        for root; Sun, 24 Jun 2001 04:02:06 -0400
Date: Sun, 24 Jun 2001 04:02:06 -0400
From: root <root@localhost.localdomain>
Message-Id: <200106240802.f5O826503328@localhost.localdomain>
To: root@localhost.localdomain
Subject: errors rotating logs

errors occured while rotating /var/log/messages {

Failed to remove old log /var/log/messages.5: No such file or directory
Comment 1 Preston Brown 2001-07-02 14:09:03 EDT
have you modified /etc/logrotate.d/syslog?  How about /etc/logrotate.conf?  What files 
do you have in /var/log?

No one else can duplicate this bug and it has never been reported before so I tend to 
believe there is something specifically wrong with your computer/configuration.
Comment 2 Need Real Name 2001-07-02 14:30:20 EDT
>have you modified /etc/logrotate.d/syslog? 
No. rpm --verify sysklogd-1.3.33-8 prints nothing

> How about /etc/logrotate.conf?
I did not modify it.

>What files do you have in /var/log?

ls -ls /var/log
total 1352
 112 -rw-r--r--    1 root     root       109477 Jul  2 14:32 XFree86.0.log
  44 -rw-rw-r--    1 root     mal         43455 Feb 23 23:30 XFree86.1.log
  44 -rw-r--r--    1 root     root        44496 Jan 16 00:40 XFree86.9.log
   4 -rw-------    1 root     root          105 Jul  2 04:12 boot.log
   8 -rw-------    1 root     root         5368 Jun 29 13:37 boot.log.1
   0 -rw-------    1 root     root            0 Jun 24 04:02 boot.log.2
   0 -rw-------    1 root     root            0 Jun 17 04:02 boot.log.3
   8 -rw-------    1 root     root         4643 Jun 11 22:51 boot.log.4
  20 -rw-------    1 root     root        16777 Jul  2 14:30 cron
  88 -rw-------    1 root     root        82178 Jul  1 04:02 cron.1
   0 -rw-------    1 root     root            0 Jun 24 04:02 cron.2
  88 -rw-------    1 root     root        83099 Jun 24 04:02 cron.3
  88 -rw-------    1 root     root        82638 Jun 17 04:02 cron.4
   4 -rw-r--r--    1 root     root         2757 Jun 27 10:11 dmesg
   0 -rw-------    1 root     root            0 Nov  6  2000 htmlaccess.log
   4 drwxrwxr-x    2 nobody   apache       4096 Jul  1 04:02 httpd
  24 -rw-r--r--    1 root     root       146584 Jul  2 14:26 lastlog
   0 -rw-------    1 root     root            0 Jul  1 04:02 maillog
   8 -rw-------    1 root     root         5453 Jun 30 16:42 maillog.1
   4 -rw-------    1 root     root          169 Jun 24 04:02 maillog.2
   8 -rw-------    1 root     root         8012 Jun 23 18:30 maillog.3
   8 -rw-------    1 root     root         7937 Jun 16 18:32 maillog.4
  44 -rw-------    1 root     root        41125 Jul  2 14:30 messages
 224 -rw-------    1 root     root       222368 Jul  1 04:02 messages.1
   0 -rw-------    1 root     root            0 Jun 24 04:02 messages.2
 204 -rw-------    1 root     root       203686 Jun 24 04:02 messages.3
 232 -rw-------    1 root     root       230335 Jun 17 04:02 messages.4
   0 -rw-------    1 root     root            0 Jul  1 04:02 netconf.log
   0 -rw-------    1 root     root            0 Jun  1 04:02 netconf.log.1
   0 -rw-------    1 root     root            0 May  1 04:02 netconf.log.2
   0 -rw-------    1 root     root            0 Apr  1 04:02 netconf.log.3
   0 -rw-------    1 root     root            0 Mar  1 04:02 netconf.log.4
   4 drwx------    2 root     root         4096 Jun 23 18:52 samba
   4 -rw-------    1 root     root           97 Jul  2 14:26 secure
   4 -rw-------    1 root     root          679 Jun 27 16:25 secure.1
   0 -rw-------    1 root     root            0 Jun 24 04:02 secure.2
   0 -rw-------    1 root     root            0 Jun 17 04:02 secure.3
   4 -rw-------    1 root     root          988 Jun 14 12:05 secure.4
   0 -rw-------    1 root     root           0 Jul  1 04:02 spooler
   0 -rw-------    1 root     root            0 Jun 24 04:02 spooler.1
   0 -rw-------    1 root     root            0 Jun 24 04:02 spooler.2
   0 -rw-------    1 root     root            0 Jun 17 04:02 spooler.3
   0 -rw-------    1 root     root            0 Jun 10 04:02 spooler.4
   0 -rw-r--r--    1 root     root            0 Aug 22  2000 statistics
   0 -rw-r--r--    1 root     root            0 Jul  1 04:02 up2date
   0 -rw-r--r--    1 root     root            0 Jun 24 04:02 up2date.1
   0 -rw-r--r--    1 root     root            0 Jun 24 04:02 up2date.2
   0 -rw-r--r--    1 root     root            0 Jun 17 04:02 up2date.3
   0 -rw-r--r--    1 root     root            0 Jun 10 04:02 up2date.4
   4 drwxr-xr-x    2 root     root         4096 Aug  3  2000 vbox
   4 -rw-rw-r--    1 root     utmp          384 Jul  2 14:26 wtmp
  60 -rw-rw-r--    1 root     utmp        55680 Jun 30 01:48 wtmp.1

>No one else can duplicate this bug and it has never been reported before so I
tend to 
> believe there is something specifically wrong with your
computer/configuration.

I do not have this problem in most cases.
It happens seldomly.
Last time I had it it was on Sun Jun 24 04:02 2001
Also, note it mentioned /var/log/messages.5
The /etc/logrotate.conf has 
rotate 4
So /var/log/messages.5 SHOULD NEVER HAPPEN
And in the list of files above there is no /var/log/messages.5

tail  -15  /var/log/messages.3
Jun 24 03:30:00 cmpq CROND[3259]: (root) CMD (   /sbin/rmmod -as) 
Jun 24 03:40:00 cmpq CROND[3262]: (root) CMD (   /sbin/rmmod -as) 
Jun 24 03:50:00 cmpq CROND[3265]: (root) CMD (   /sbin/rmmod -as) 
Jun 24 03:50:54 cmpq named[404]: Cleaned cache of 4 RRsets
Jun 24 03:50:55 cmpq named[404]: USAGE 993369055 992314252 CPU=42.5u/13.06s
CHILDCPU=0u/0s
Jun 24 03:50:55 cmpq named[404]: NSTATS 993369055 992314252 A=5298 PTR=165 MX=34
ANY=46
Jun 24 03:50:55 cmpq named[404]: XSTATS 993369055 992314252 RR=7406 RNXD=84
RFwdR=5108 RDupR=22 RFail=0 RFErr=0 RErr=0 RAXFR=0 RLame=5 ROpts=0 SSysQ=1676
SAns=5391 SFwdQ=3783 SDupQ=421 SErr=1 RQ=5543 RIQ=0 RFwdQ=3783 RDupQ=30 RTCP=0
SFwdR=5108 SFail=0 SFErr=0 SNaAns=2261 SNXD=150 RUQ=0 RURQ=0 RUXFR=0 RUUpd=0
Jun 24 04:00:00 cmpq CROND[3269]: (root) CMD (   /sbin/rmmod -as) 
Jun 24 04:01:00 cmpq CROND[3271]: (root) CMD (run-parts /etc/cron.hourly) 
Jun 24 04:01:01 cmpq PAM_unix[3274]: (su) session opened for user nobody by
(uid=0)
Jun 24 04:01:03 cmpq PAM_unix[3274]: (su) session closed for user nobody
Jun 24 04:01:59 cmpq CROND[3292]: (root) CMD (run-parts /etc/cron.daily) 
Jun 24 04:02:01 cmpq CROND[3296]: (root) CMD (run-parts /etc/cron.daily) 
Jun 24 04:02:01 cmpq anacron[3298]: Updated timestamp for job `cron.daily' to
2001-06-24
Jun 24 04:02:02 cmpq anacron[3308]: Updated timestamp for job `cron.daily' to
2001-06-24

head  -20  /var/log/messages.1
Jun 24 04:02:03 cmpq syslogd 1.3-3: restart.
Jun 24 04:02:03 cmpq syslogd 1.3-3: restart.
Jun 24 04:02:04 cmpq syslogd 1.3-3: restart.
Jun 24 04:02:04 cmpq syslogd 1.3-3: restart.
Jun 24 04:02:05 cmpq syslogd 1.3-3: restart.
Jun 24 04:02:05 cmpq syslogd 1.3-3: restart.
Jun 24 04:02:06 cmpq syslogd 1.3-3: restart.
Jun 24 04:02:07 cmpq syslogd 1.3-3: restart.
Jun 24 04:02:08 cmpq syslogd 1.3-3: restart.
Jun 24 04:02:11 cmpq syslogd 1.3-3: restart.
Jun 24 04:02:12 cmpq syslogd 1.3-3: restart.
Jun 24 04:02:13 cmpq syslogd 1.3-3: restart.
Jun 24 04:02:17 cmpq kernel: smb_get_length: recv error = 5
Jun 24 04:02:17 cmpq kernel: smb_trans2_request: result=-5, setting invalid
Jun 24 04:10:00 cmpq CROND[3520]: (root) CMD (   /sbin/rmmod -as) 
Jun 24 04:20:00 cmpq CROND[3523]: (root) CMD (   /sbin/rmmod -as) 
Jun 24 04:20:01 cmpq kernel: hdd: ATAPI 4X CD-ROM drive, 128kB Cache
Jun 24 04:22:00 cmpq CROND[3526]: (root) CMD (run-parts /etc/cron.weekly) 

Comment 3 Need Real Name 2001-07-02 14:45:51 EDT
Is it possible that there is a race, called by starting two instances of
logrotate
The log above shows 12 restarts of sysklogd
---------------
Jun 24 04:02:03 cmpq syslogd 1.3-3: restart.
 Jun 24 04:02:03 cmpq syslogd 1.3-3: restart.
 Jun 24 04:02:04 cmpq syslogd 1.3-3: restart.
 Jun 24 04:02:04 cmpq syslogd 1.3-3: restart.
 Jun 24 04:02:05 cmpq syslogd 1.3-3: restart.
 Jun 24 04:02:05 cmpq syslogd 1.3-3: restart.
 Jun 24 04:02:06 cmpq syslogd 1.3-3: restart.
 Jun 24 04:02:07 cmpq syslogd 1.3-3: restart.
 Jun 24 04:02:08 cmpq syslogd 1.3-3: restart.
 Jun 24 04:02:11 cmpq syslogd 1.3-3: restart.
 Jun 24 04:02:12 cmpq syslogd 1.3-3: restart.
 Jun 24 04:02:13 cmpq syslogd 1.3-3: restart.
-------------

The file /etc/logrotate.d/syslog has 6 files to rotate
and 6 does restarts.
Seems the logrotate was somehow started twice.
On a separate issue: instaead of 6 restarts I would recommend to have 1,
see https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=47020
but this is a different issue not related to this bug.
Comment 4 Need Real Name 2001-07-02 14:52:25 EDT
On a regular basis I have 6 restarts of sysklogd

Jul  1 04:02:02 cmpq syslogd 1.3-3: restart.
Jul  1 04:02:02 cmpq syslogd 1.3-3: restart.
Jul  1 04:02:02 cmpq syslogd 1.3-3: restart.
Jul  1 04:02:02 cmpq syslogd 1.3-3: restart.
Jul  1 04:02:03 cmpq syslogd 1.3-3: restart.
Jul  1 04:02:03 cmpq syslogd 1.3-3: restart.

In the example above there was 12 restarts. In this one (the normal one) there
6.
Seems there is some kind of race/problem which, in some seldom situation,
which causes logrotate being started twice.
It is hard for me to reproduce it.
The only thing I can tell - it is definitelly much more likely
to happen on a slow computer than on a fast one.
Comment 5 Need Real Name 2001-07-02 15:00:46 EDT
The /etc/crontab file is a standard one

 cat /etc/crontab
SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
HOME=/

# run-parts
01 * * * * root run-parts /etc/cron.hourly
02 4 * * * root run-parts /etc/cron.daily
22 4 * * 0 root run-parts /etc/cron.weekly
42 4 1 * * root run-parts /etc/cron.monthly
[root@cmpq /root]# ls -lstr /etc/logrotate.d/
total 24
   4 -rw-r--r--    1 root     root          145 Aug 23  2000 linuxconf
   4 -rw-r--r--    1 root     root          763 Sep 14  2000 syslog
   4 -rw-r--r--    1 root     root          590 Oct 18  2000 apache
   4 -rw-r--r--    1 root     root          152 Jan 27 05:11 named
   4 -rw-r--r--    1 root     root           32 Apr 17 21:32 up2date
   4 -rw-r--r--    1 root     root          196 Jun 23 18:52 samba
ls -lstr /etc/cron.daily
total 24
   4 -rwxr-xr-x    1 root     root          103 Jul 19  2000 tetex.cron
   4 -rwxr-xr-x    1 root     root          276 Aug  4  2000 0anacron
   4 -rwxr-xr-x    1 root     root           51 Aug 15  2000 logrotate
   4 -rwxr-xr-x    1 root     root          402 Aug 23  2000 makewhatis.cron
   4 -rwxr-xr-x    1 root     root          221 Oct  5  2000 tmpwatch
   4 -rwxr-xr-x    1 root     root           99 Dec 18  2000 slocate.cron
Comment 6 Need Real Name 2001-07-02 22:30:58 EDT
I check several other RedHat 7.0 computers.

head /var/log/messages*

on some of them show twice started logrotate.
Is may be
1. A problem with cron.
2. A problem with log rotation when a signal is sent to cron,
so it gets restarted and started over.

From the data I have it is hard to distinguish 1  nad 2, 
because cron does not print to syslog when it gets -HUP,
it only prints when it gets started, so it hard to say whether 
it got -HUP or not.
For consistency whith other daemons which always print a message to syslog
when they receive -HUP it may make sense to add such message
printing to crond.
Comment 7 Preston Brown 2001-08-28 16:37:08 EDT
cleaned up via sysklogd modifications.
Comment 8 Need Real Name 2001-08-28 16:52:43 EDT
But the problem is not sysklogd related,
it seems cron related.
Cron starts multiple logrotate.

The multiple -HUP to sysklogd are NOT RELATED to this issue. 
For multiple -HUPs a single logrotate sends several -HUP.
and this is what was fixed by modifying /etc/logrotate.d/syslog from sysklogd
package.

This bug, when several logrote running is different, and I a believe is crond
related.
Comment 9 Preston Brown 2002-01-07 11:05:17 EST
Adrian, Elliot:  Can the two of you (owners of logrotate and cron) investigate 
and solve this issue?
Comment 10 Need Real Name 2002-01-08 13:07:11 EST
I also have similar problem with tmpwatch.
This is an error message in my mailbox:
It looks that the first process of tmpwatch was started by cron, 
and was deleting the files while the second process was just started.
Because of two simultaneously running tmpwatch processes
I get such errors.
From my point of view this is the problem wit cron
(I have vixie-cron-3.0.1-61)
which seldomly (once every 2-3 months) starts
the same job twice.


Message 4:
From root@test.gromco.com Mon Jan 07 04:05:41 2002
Envelope-to: root@test.gromco.com
Delivery-date: Mon, 07 Jan 2002 04:05:41 -0500
From: root@test.gromco.com (Cron Daemon)
To: root@test.gromco.com
Subject: Cron <root@localhost> run-parts /etc/cron.daily
X-Cron-Env: <SHELL=/bin/bash>
X-Cron-Env: <PATH=/sbin:/bin:/usr/sbin:/usr/bin>
X-Cron-Env: <MAILTO=root>
X-Cron-Env: <HOME=/>
X-Cron-Env: <LOGNAME=root>
Date: Mon, 07 Jan 2002 04:05:41 -0500

/etc/cron.daily/tmpwatch:

error: failed to lstat /tmp/jjj: No such file or directory
error: failed to lstat /tmp/lopp: No such file or directory
error: failed to lstat /tmp/ll1: No such file or directory
error: failed to rmdir /tmp/nscomm40-mal/950: No such file or directoryerror:
failed to rmdir /tmp/nscomm40-mal/1267: No such file or directoryerror: failed
to lstat /tmp/hh_slava.txt: No such file or directory

Comment 11 Elliot Lee 2002-01-29 15:08:11 EST
I can't imagine how this could be a logrotate bug. Putting myself in Cc: and
assigning to Adrian to see if he knows what is going on.
Comment 12 Need Real Name 2002-07-14 02:18:21 EDT
The problem seems still exists in RedHat 7.2
Comment 13 Christopher Blizzard 2002-08-28 10:13:33 EDT
Almost every night I have entries in my crontab that are run and cron is
definitely running more than one instance of them on a regular basis.  The
really strange thing is that it doesn't happen every time.  About half the time
a job will be run more than once.

This is really bad.
Comment 14 Need Real Name 2002-08-28 14:39:56 EDT
Same thing here, but I have it much more seldom (once every 2-3 months)
a cron is running two instances of a job. (on RedHat 7.3)
Comment 15 Christopher Blizzard 2002-08-28 16:45:31 EDT
This was fixed in Red Hat 7.1 with the timeaftertime patch.

*** This bug has been marked as a duplicate of 29868 ***
Comment 16 Christopher Blizzard 2002-08-28 16:46:33 EDT
(The machine where I was seeing it is a really old 7.0 box.)

Note You need to log in before you can comment on or make changes to this bug.