Bug 45647
Summary: | log rotation race condition (2 copies of logrotate are running) (cron is running 2 (two) instances of some entries in the crontab) | ||
---|---|---|---|
Product: | [Retired] Red Hat Linux | Reporter: | Need Real Name <mal> |
Component: | vixie-cron | Assignee: | Jens Petersen <petersen> |
Status: | CLOSED DUPLICATE | QA Contact: | Aaron Brown <abrown> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 7.0 | CC: | blizzard, havill, rogerc, sopwith |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | i386 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2002-08-28 18:40:03 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Need Real Name
2001-06-24 15:11:40 UTC
have you modified /etc/logrotate.d/syslog? How about /etc/logrotate.conf? What files do you have in /var/log? No one else can duplicate this bug and it has never been reported before so I tend to believe there is something specifically wrong with your computer/configuration. >have you modified /etc/logrotate.d/syslog? No. rpm --verify sysklogd-1.3.33-8 prints nothing > How about /etc/logrotate.conf? I did not modify it. >What files do you have in /var/log? ls -ls /var/log total 1352 112 -rw-r--r-- 1 root root 109477 Jul 2 14:32 XFree86.0.log 44 -rw-rw-r-- 1 root mal 43455 Feb 23 23:30 XFree86.1.log 44 -rw-r--r-- 1 root root 44496 Jan 16 00:40 XFree86.9.log 4 -rw------- 1 root root 105 Jul 2 04:12 boot.log 8 -rw------- 1 root root 5368 Jun 29 13:37 boot.log.1 0 -rw------- 1 root root 0 Jun 24 04:02 boot.log.2 0 -rw------- 1 root root 0 Jun 17 04:02 boot.log.3 8 -rw------- 1 root root 4643 Jun 11 22:51 boot.log.4 20 -rw------- 1 root root 16777 Jul 2 14:30 cron 88 -rw------- 1 root root 82178 Jul 1 04:02 cron.1 0 -rw------- 1 root root 0 Jun 24 04:02 cron.2 88 -rw------- 1 root root 83099 Jun 24 04:02 cron.3 88 -rw------- 1 root root 82638 Jun 17 04:02 cron.4 4 -rw-r--r-- 1 root root 2757 Jun 27 10:11 dmesg 0 -rw------- 1 root root 0 Nov 6 2000 htmlaccess.log 4 drwxrwxr-x 2 nobody apache 4096 Jul 1 04:02 httpd 24 -rw-r--r-- 1 root root 146584 Jul 2 14:26 lastlog 0 -rw------- 1 root root 0 Jul 1 04:02 maillog 8 -rw------- 1 root root 5453 Jun 30 16:42 maillog.1 4 -rw------- 1 root root 169 Jun 24 04:02 maillog.2 8 -rw------- 1 root root 8012 Jun 23 18:30 maillog.3 8 -rw------- 1 root root 7937 Jun 16 18:32 maillog.4 44 -rw------- 1 root root 41125 Jul 2 14:30 messages 224 -rw------- 1 root root 222368 Jul 1 04:02 messages.1 0 -rw------- 1 root root 0 Jun 24 04:02 messages.2 204 -rw------- 1 root root 203686 Jun 24 04:02 messages.3 232 -rw------- 1 root root 230335 Jun 17 04:02 messages.4 0 -rw------- 1 root root 0 Jul 1 04:02 netconf.log 0 -rw------- 1 root root 0 Jun 1 04:02 netconf.log.1 0 -rw------- 1 root root 0 May 1 04:02 netconf.log.2 0 -rw------- 1 root root 0 Apr 1 04:02 netconf.log.3 0 -rw------- 1 root root 0 Mar 1 04:02 netconf.log.4 4 drwx------ 2 root root 4096 Jun 23 18:52 samba 4 -rw------- 1 root root 97 Jul 2 14:26 secure 4 -rw------- 1 root root 679 Jun 27 16:25 secure.1 0 -rw------- 1 root root 0 Jun 24 04:02 secure.2 0 -rw------- 1 root root 0 Jun 17 04:02 secure.3 4 -rw------- 1 root root 988 Jun 14 12:05 secure.4 0 -rw------- 1 root root 0 Jul 1 04:02 spooler 0 -rw------- 1 root root 0 Jun 24 04:02 spooler.1 0 -rw------- 1 root root 0 Jun 24 04:02 spooler.2 0 -rw------- 1 root root 0 Jun 17 04:02 spooler.3 0 -rw------- 1 root root 0 Jun 10 04:02 spooler.4 0 -rw-r--r-- 1 root root 0 Aug 22 2000 statistics 0 -rw-r--r-- 1 root root 0 Jul 1 04:02 up2date 0 -rw-r--r-- 1 root root 0 Jun 24 04:02 up2date.1 0 -rw-r--r-- 1 root root 0 Jun 24 04:02 up2date.2 0 -rw-r--r-- 1 root root 0 Jun 17 04:02 up2date.3 0 -rw-r--r-- 1 root root 0 Jun 10 04:02 up2date.4 4 drwxr-xr-x 2 root root 4096 Aug 3 2000 vbox 4 -rw-rw-r-- 1 root utmp 384 Jul 2 14:26 wtmp 60 -rw-rw-r-- 1 root utmp 55680 Jun 30 01:48 wtmp.1 >No one else can duplicate this bug and it has never been reported before so I tend to > believe there is something specifically wrong with your computer/configuration. I do not have this problem in most cases. It happens seldomly. Last time I had it it was on Sun Jun 24 04:02 2001 Also, note it mentioned /var/log/messages.5 The /etc/logrotate.conf has rotate 4 So /var/log/messages.5 SHOULD NEVER HAPPEN And in the list of files above there is no /var/log/messages.5 tail -15 /var/log/messages.3 Jun 24 03:30:00 cmpq CROND[3259]: (root) CMD ( /sbin/rmmod -as) Jun 24 03:40:00 cmpq CROND[3262]: (root) CMD ( /sbin/rmmod -as) Jun 24 03:50:00 cmpq CROND[3265]: (root) CMD ( /sbin/rmmod -as) Jun 24 03:50:54 cmpq named[404]: Cleaned cache of 4 RRsets Jun 24 03:50:55 cmpq named[404]: USAGE 993369055 992314252 CPU=42.5u/13.06s CHILDCPU=0u/0s Jun 24 03:50:55 cmpq named[404]: NSTATS 993369055 992314252 A=5298 PTR=165 MX=34 ANY=46 Jun 24 03:50:55 cmpq named[404]: XSTATS 993369055 992314252 RR=7406 RNXD=84 RFwdR=5108 RDupR=22 RFail=0 RFErr=0 RErr=0 RAXFR=0 RLame=5 ROpts=0 SSysQ=1676 SAns=5391 SFwdQ=3783 SDupQ=421 SErr=1 RQ=5543 RIQ=0 RFwdQ=3783 RDupQ=30 RTCP=0 SFwdR=5108 SFail=0 SFErr=0 SNaAns=2261 SNXD=150 RUQ=0 RURQ=0 RUXFR=0 RUUpd=0 Jun 24 04:00:00 cmpq CROND[3269]: (root) CMD ( /sbin/rmmod -as) Jun 24 04:01:00 cmpq CROND[3271]: (root) CMD (run-parts /etc/cron.hourly) Jun 24 04:01:01 cmpq PAM_unix[3274]: (su) session opened for user nobody by (uid=0) Jun 24 04:01:03 cmpq PAM_unix[3274]: (su) session closed for user nobody Jun 24 04:01:59 cmpq CROND[3292]: (root) CMD (run-parts /etc/cron.daily) Jun 24 04:02:01 cmpq CROND[3296]: (root) CMD (run-parts /etc/cron.daily) Jun 24 04:02:01 cmpq anacron[3298]: Updated timestamp for job `cron.daily' to 2001-06-24 Jun 24 04:02:02 cmpq anacron[3308]: Updated timestamp for job `cron.daily' to 2001-06-24 head -20 /var/log/messages.1 Jun 24 04:02:03 cmpq syslogd 1.3-3: restart. Jun 24 04:02:03 cmpq syslogd 1.3-3: restart. Jun 24 04:02:04 cmpq syslogd 1.3-3: restart. Jun 24 04:02:04 cmpq syslogd 1.3-3: restart. Jun 24 04:02:05 cmpq syslogd 1.3-3: restart. Jun 24 04:02:05 cmpq syslogd 1.3-3: restart. Jun 24 04:02:06 cmpq syslogd 1.3-3: restart. Jun 24 04:02:07 cmpq syslogd 1.3-3: restart. Jun 24 04:02:08 cmpq syslogd 1.3-3: restart. Jun 24 04:02:11 cmpq syslogd 1.3-3: restart. Jun 24 04:02:12 cmpq syslogd 1.3-3: restart. Jun 24 04:02:13 cmpq syslogd 1.3-3: restart. Jun 24 04:02:17 cmpq kernel: smb_get_length: recv error = 5 Jun 24 04:02:17 cmpq kernel: smb_trans2_request: result=-5, setting invalid Jun 24 04:10:00 cmpq CROND[3520]: (root) CMD ( /sbin/rmmod -as) Jun 24 04:20:00 cmpq CROND[3523]: (root) CMD ( /sbin/rmmod -as) Jun 24 04:20:01 cmpq kernel: hdd: ATAPI 4X CD-ROM drive, 128kB Cache Jun 24 04:22:00 cmpq CROND[3526]: (root) CMD (run-parts /etc/cron.weekly) Is it possible that there is a race, called by starting two instances of logrotate The log above shows 12 restarts of sysklogd --------------- Jun 24 04:02:03 cmpq syslogd 1.3-3: restart. Jun 24 04:02:03 cmpq syslogd 1.3-3: restart. Jun 24 04:02:04 cmpq syslogd 1.3-3: restart. Jun 24 04:02:04 cmpq syslogd 1.3-3: restart. Jun 24 04:02:05 cmpq syslogd 1.3-3: restart. Jun 24 04:02:05 cmpq syslogd 1.3-3: restart. Jun 24 04:02:06 cmpq syslogd 1.3-3: restart. Jun 24 04:02:07 cmpq syslogd 1.3-3: restart. Jun 24 04:02:08 cmpq syslogd 1.3-3: restart. Jun 24 04:02:11 cmpq syslogd 1.3-3: restart. Jun 24 04:02:12 cmpq syslogd 1.3-3: restart. Jun 24 04:02:13 cmpq syslogd 1.3-3: restart. ------------- The file /etc/logrotate.d/syslog has 6 files to rotate and 6 does restarts. Seems the logrotate was somehow started twice. On a separate issue: instaead of 6 restarts I would recommend to have 1, see https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=47020 but this is a different issue not related to this bug. On a regular basis I have 6 restarts of sysklogd Jul 1 04:02:02 cmpq syslogd 1.3-3: restart. Jul 1 04:02:02 cmpq syslogd 1.3-3: restart. Jul 1 04:02:02 cmpq syslogd 1.3-3: restart. Jul 1 04:02:02 cmpq syslogd 1.3-3: restart. Jul 1 04:02:03 cmpq syslogd 1.3-3: restart. Jul 1 04:02:03 cmpq syslogd 1.3-3: restart. In the example above there was 12 restarts. In this one (the normal one) there 6. Seems there is some kind of race/problem which, in some seldom situation, which causes logrotate being started twice. It is hard for me to reproduce it. The only thing I can tell - it is definitelly much more likely to happen on a slow computer than on a fast one. The /etc/crontab file is a standard one cat /etc/crontab SHELL=/bin/bash PATH=/sbin:/bin:/usr/sbin:/usr/bin MAILTO=root HOME=/ # run-parts 01 * * * * root run-parts /etc/cron.hourly 02 4 * * * root run-parts /etc/cron.daily 22 4 * * 0 root run-parts /etc/cron.weekly 42 4 1 * * root run-parts /etc/cron.monthly [root@cmpq /root]# ls -lstr /etc/logrotate.d/ total 24 4 -rw-r--r-- 1 root root 145 Aug 23 2000 linuxconf 4 -rw-r--r-- 1 root root 763 Sep 14 2000 syslog 4 -rw-r--r-- 1 root root 590 Oct 18 2000 apache 4 -rw-r--r-- 1 root root 152 Jan 27 05:11 named 4 -rw-r--r-- 1 root root 32 Apr 17 21:32 up2date 4 -rw-r--r-- 1 root root 196 Jun 23 18:52 samba ls -lstr /etc/cron.daily total 24 4 -rwxr-xr-x 1 root root 103 Jul 19 2000 tetex.cron 4 -rwxr-xr-x 1 root root 276 Aug 4 2000 0anacron 4 -rwxr-xr-x 1 root root 51 Aug 15 2000 logrotate 4 -rwxr-xr-x 1 root root 402 Aug 23 2000 makewhatis.cron 4 -rwxr-xr-x 1 root root 221 Oct 5 2000 tmpwatch 4 -rwxr-xr-x 1 root root 99 Dec 18 2000 slocate.cron I check several other RedHat 7.0 computers. head /var/log/messages* on some of them show twice started logrotate. Is may be 1. A problem with cron. 2. A problem with log rotation when a signal is sent to cron, so it gets restarted and started over. From the data I have it is hard to distinguish 1 nad 2, because cron does not print to syslog when it gets -HUP, it only prints when it gets started, so it hard to say whether it got -HUP or not. For consistency whith other daemons which always print a message to syslog when they receive -HUP it may make sense to add such message printing to crond. cleaned up via sysklogd modifications. But the problem is not sysklogd related, it seems cron related. Cron starts multiple logrotate. The multiple -HUP to sysklogd are NOT RELATED to this issue. For multiple -HUPs a single logrotate sends several -HUP. and this is what was fixed by modifying /etc/logrotate.d/syslog from sysklogd package. This bug, when several logrote running is different, and I a believe is crond related. Adrian, Elliot: Can the two of you (owners of logrotate and cron) investigate and solve this issue? I also have similar problem with tmpwatch. This is an error message in my mailbox: It looks that the first process of tmpwatch was started by cron, and was deleting the files while the second process was just started. Because of two simultaneously running tmpwatch processes I get such errors. From my point of view this is the problem wit cron (I have vixie-cron-3.0.1-61) which seldomly (once every 2-3 months) starts the same job twice. Message 4: From root.com Mon Jan 07 04:05:41 2002 Envelope-to: root.com Delivery-date: Mon, 07 Jan 2002 04:05:41 -0500 From: root.com (Cron Daemon) To: root.com Subject: Cron <root@localhost> run-parts /etc/cron.daily X-Cron-Env: <SHELL=/bin/bash> X-Cron-Env: <PATH=/sbin:/bin:/usr/sbin:/usr/bin> X-Cron-Env: <MAILTO=root> X-Cron-Env: <HOME=/> X-Cron-Env: <LOGNAME=root> Date: Mon, 07 Jan 2002 04:05:41 -0500 /etc/cron.daily/tmpwatch: error: failed to lstat /tmp/jjj: No such file or directory error: failed to lstat /tmp/lopp: No such file or directory error: failed to lstat /tmp/ll1: No such file or directory error: failed to rmdir /tmp/nscomm40-mal/950: No such file or directoryerror: failed to rmdir /tmp/nscomm40-mal/1267: No such file or directoryerror: failed to lstat /tmp/hh_slava.txt: No such file or directory I can't imagine how this could be a logrotate bug. Putting myself in Cc: and assigning to Adrian to see if he knows what is going on. The problem seems still exists in RedHat 7.2 Almost every night I have entries in my crontab that are run and cron is definitely running more than one instance of them on a regular basis. The really strange thing is that it doesn't happen every time. About half the time a job will be run more than once. This is really bad. Same thing here, but I have it much more seldom (once every 2-3 months) a cron is running two instances of a job. (on RedHat 7.3) This was fixed in Red Hat 7.1 with the timeaftertime patch. *** This bug has been marked as a duplicate of 29868 *** (The machine where I was seeing it is a really old 7.0 box.) |