Bug 185168 - logs not being rotated when machine is not powered on at sunday 0400 anacron dead but subsys locked
Summary: logs not being rotated when machine is not powered on at sunday 0400 anacron ...
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: anacron
Version: 5
Hardware: i686
OS: Linux
medium
medium
Target Milestone: ---
Assignee: Marcela Mašláňová
QA Contact: Brock Organ
URL:
Whiteboard:
Depends On:
Blocks: 185538 185540
TreeView+ depends on / blocked
 
Reported: 2006-03-11 01:51 UTC by David Timms
Modified: 2007-11-30 22:11 UTC (History)
4 users (show)

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-12-12 17:08:11 UTC
Type: ---
Embargoed:


Attachments (Terms of Use)
/var/log/messages showing logrotate error messages (2.44 KB, text/plain)
2006-03-11 01:51 UTC, David Timms
no flags Details
anacron strace log (1.84 KB, text/plain)
2006-03-14 21:04 UTC, David Timms
no flags Details
terminal process and responses (2.02 KB, text/plain)
2006-03-14 21:07 UTC, David Timms
no flags Details
/var/log/anacron (3.45 KB, text/plain)
2006-03-14 23:09 UTC, David Timms
no flags Details
grep -B 3 -A 2 logrotate /var/log/messages* (3.36 KB, text/plain)
2006-03-14 23:30 UTC, David Timms
no flags Details
/etc/logrotate.conf (505 bytes, text/plain)
2006-03-14 23:37 UTC, David Timms
no flags Details
cat /etc/logrotate.d/* (3.11 KB, text/plain)
2006-03-14 23:40 UTC, David Timms
no flags Details
# strace -fvo logrotate.strace /usr/sbin/logrotate /etc/logrotate.conf (5.33 KB, text/plain)
2006-03-14 23:41 UTC, David Timms
no flags Details

Description David Timms 2006-03-11 01:51:13 UTC
Description of problem:
In FC3/4 the /var/log/messages etc where rotated through .1/.2/.3 on sundays at
0400. This keeps the logs to a reasonable size.

Version-Release number of selected component (if applicable):
FC5T2 install + devel pupdates
anacron-2.3-36.1
crontabs-1.10-7.1
logrotate-3.7.3-2.2.1
vixie-cron-4.1-54.FC5
:~) not sure which component this bug belongs to ?

How reproducible:
Another machine rolls over the logs OK, when it is powered off during the
scheduled time, and it is turned on later.

Steps to Reproduce:
1. FC5T3 (now I'm not 100% sure: may be FC5T2)+ updates
2. keep updating daily
3. have machine off overnight so that scheduled time is missed
  
Actual results:
# ls -l /var/log/me*
-rw------- 1 root root 2929009 Mar 11 11:48 /var/log/messages
-rw------- 1 root root  143817 Feb 12 10:08 /var/log/messages.1

sy-co-se
  anacron dead but subsys locked
# service anacron status
anacron dead but subsys locked

Expected results:
logs rotated each week into .1/.2/.3/.4

Additional info:
# rpm -V anacron crontabs logrotate vixie-cron
# (no changes / errors)
# service anacron status
anacron dead but subsys locked
# service anacron stop 
# service anacron start
Starting anacron:                                          [  OK  ]
# service anacron status
anacron dead but subsys locked
# cat /etc/fedora-release
Fedora Core release 4.91 (Pre-FC5)
# ls -l /var/log/cron* 
-rw------- 1 root root 49693 Mar 11 12:32 /var/log/cron
-rw------- 1 root root  2835 Feb 12 10:08 /var/log/cron.1
# ls -l /var/log/boot.log*
-rw------- 1 root root 0 Feb 12 10:08 /var/log/boot.log
-rw------- 1 root root 0 Feb 11 22:05 /var/log/boot.log.1

Note: I hadn't noticed this before, but am trying to review logs to check out
suspend / resume problems, and noticed the /var/log/messages was huge and
started a month ago. I just noticed in the messages that logrotate is tried to
run, but fails:
Feb 15 08:25:23 davidthome logrotate: ALERT exited abnormally with [1]

The other machine has rotated (delayed when off) logs, and 
# service anacron stop / start / status are normal.

Comment 1 David Timms 2006-03-11 01:51:13 UTC
Created attachment 125976 [details]
/var/log/messages showing logrotate error messages

Comment 2 Jason Vas Dias 2006-03-14 17:45:48 UTC
It looks like your anacron process is exiting / crashing for some reason :
> # service anacron start
> Starting anacron:                                          [  OK  ]
> # service anacron status
> anacron dead but subsys locked
I cannot reproduce this problem here.

You are running on an i386/i686 platform, or x86_64 ?

Please can you gather some further data:

1. Ensure anacron is stopped, and the lockfile is removed:
 # service anacron stop
 # rm -f /var/lock/subsys/anacron
2. Try getting anacron to exit / crash on startup - as root:
 # ulimit -c unlimited
 # cd /var/spool/anacron
 # strace -vfo /tmp/anacron.strace.log anacron -s
The strace should exit when anacron exits. 
3. Please compress and send the log file and any core dump files:
 # tar -cpf /tmp/anacron.strace.log /var/spool/anacron/core* | gzip >
/tmp/anacron-dbg.tar.gz
and send the /tmp/anacron-dbg.tar.gz file to me (jvdias) or
append it to this bug report - thank you.

Comment 3 David Timms 2006-03-14 21:04:09 UTC
Created attachment 126119 [details]
anacron strace log

Note, no crash dump was made in the folder you suggested (and I couldn't make
the tar command work as given, so did the tar and gzip separately).

Comment 4 David Timms 2006-03-14 21:07:31 UTC
Created attachment 126120 [details]
terminal process and responses

Comment 5 David Timms 2006-03-14 21:16:50 UTC
oops: forgot the architecture: i386/i686 AMX Athlon Thunderbird
# uname -a
Linux davidthome 2.6.15-1.2051_FC5 #1 Mon Mar 13 20:06:52 EST 2006 i686 athlon
i386 GNU/Linux


Comment 6 Jason Vas Dias 2006-03-14 21:59:50 UTC
Thank you - I think I can see what is happening now.

When anacron has actually run all the jobs that could be scheduled for a given
day, it exits, with the message 'Normal exit (0 jobs run)' written to 
/var/log/cron - if you do:
  # grep anacron /var/log/cron
you should see the anacron messages - if they are different, please append them
to this bug report.

So, unlike most services run from initscripts, anacron does not stay running
all the time; I guess once it has determined that no jobs can be scheduled
that day, it assumes it will run again the next day the job is to scheduled,
and has no more work to do, so it exits.

anacron is only meant to run jobs that have not been run for a certain time;
not run jobs at certain times, which is crond's job. 

Yes, this whole situation is far from optimal, and will be rectified by 
replacement of anacron with crond; crond will be able to parse anacrontab files,
and record the last time anacron jobs were run, and so will behave like anacron 
and run anacron jobs that have not been run for a certain time - this is work
in progress, and should be done shortly (a few weeks). 

But your real problem is with logrotate:

Feb 15 08:25:23 davidthome logrotate: ALERT exited abnormally with [1]

For some reason, logrotate, whether run by anacron OR crond, is getting an
error which prevents log files being rotated.

Are there any other logrotate messages in /var/log/messages ? 
# grep logrotate /var/log/messages

Does logrotate produce any output messages ? They should appear as mail 
sent to the root or anacrontab MAILTO= user with subject 
"Anacron job 'cron.daily'" . Did you get any such messages ?

If there are no messages from logrotate / cron.daily, then please try this
command as root:
  # strace -fvo /tmp/logrotate.strace /usr/sbin/logrotate /etc/logrotate.conf
and append the /tmp/logrotate.strace file (compressed) and command output to 
this bug report - thanks.


Comment 7 David Timms 2006-03-14 23:09:35 UTC
Created attachment 126125 [details]
/var/log/anacron

There is one problem: has trouble with sendmail.
# ls -l /usr/sbin/send*
ls: /usr/sbin/send*: No such file or directory

# rpm -qa|grep mail
procmail-3.22-16.2.1
mailx-8.1.1-44.2.1
mailcap-2.1.20-1
ie. sendmail is not installed. Is this supposed to get installed automatically
?

Comment 8 David Timms 2006-03-14 23:30:03 UTC
Created attachment 126126 [details]
grep -B 3 -A 2 logrotate /var/log/messages*

An selinux denied occurs at the same second that logrotate logs, but it doesn't
give me enough hints to say whether it is related.
I have manually renamed /var/log/messages, so the attach is a (possibly)
non-time-linear, listing of the last months messages* files (its a bit ugly).

# mail
Mail version 8.1 6/6/93.  Type ? for help.
"/var/spool/mail/root": 2 messages 2 new
>N  1 root  Sat Feb 11 07:20  19/771   "Anacron job 'cron.dai"
 N  2 root  Sun Feb 12 10:08 802/25844 "LogWatch for davidtho"
&1
From root  Sat Feb 11 07:20:07 2006
Date: Sat, 11 Feb 2006 07:20:06 -0500
From: root (Anacron)
To: root
Content-Type: text/plain; charset="ANSI_X3.4-1968"
Subject: Anacron job 'cron.daily'
/etc/cron.daily/0logwatch:
/usr/sbin/sendmail: Permission denied
&2
...yum updated:
    procmail.i386 3.22-16.2
    sendmail.i386 8.13.5-2.2
    fetchmail.i386 6.3.2.1-1.1
    mailx.i386 8.1.1-44.2
# grep mail /var/log/yum.log*  (reordered by date)
/var/log/yum.log.1:Feb 11 08:52:26 Updated: procmail.i386 3.22-16.2
/var/log/yum.log.1:Feb 11 08:52:54 Updated: mailx.i386 8.1.1-44.2
/var/log/yum.log.1:Feb 11 09:05:53 Updated: sendmail.i386 8.13.5-2.2
/var/log/yum.log.1:Feb 11 09:10:03 Updated: fetchmail.i386 6.3.2.1-1.1
/var/log/yum.log:Feb 20 00:41:38 Updated: mailcap.noarch 2.1.20-1
/var/log/yum.log:Feb 20 00:45:58 Updated: procmail.i386 3.22-16.2.1
/var/log/yum.log:Feb 20 00:46:06 Updated: mailx.i386 8.1.1-44.2.1

I didn't forcefully remove sendmail, but both rpm and the lack of binary on
disk show it isn't installed anymore. Perhaps another occurence like my bug
185177 ?
===
# strace -fvo logrotate.strace /usr/sbin/logrotate /etc/logrotate.conf
error: Ignoring yum.rpmorig, because of .rpmorig ending
error: mysqld:3 unknown user 'mysql'

# rpm -V logrotate
--nada!

Comment 9 David Timms 2006-03-14 23:37:03 UTC
Created attachment 126127 [details]
/etc/logrotate.conf

Comment 10 David Timms 2006-03-14 23:40:31 UTC
Created attachment 126128 [details]
cat /etc/logrotate.d/*

Just in case something relevant: I do notice a heap of ctrl chars in the mgetty
part, but I don't know if this is normal.

Comment 11 David Timms 2006-03-14 23:41:43 UTC
Created attachment 126129 [details]
# strace -fvo logrotate.strace /usr/sbin/logrotate /etc/logrotate.conf

Comment 12 Jason Vas Dias 2006-03-15 00:36:43 UTC
Thanks for the log files - I may have found the problem:

It appears that the 'mysqld' file precedes the 'syslog' file in 
/etc/logrotate.d, you have an existing /var/log/mysqld.log, 
and the 'mysql' user does not exist, causing logrotate to exit with an error:

10372 lstat64("/var/log/mysqld.log", {st_dev=makedev(253, 0), st_ino=951360,
st_mode=S_IFREG|0640, st_nlink=1,
 st_uid=0, st_gid=27, st_blksize=4096, st_blocks=8, st_size=0,
st_atime=2006/03/05-13:49:24, st_mtime=2006/03/
05-13:49:24, st_ctime=2006/03/05-13:49:24}) = 0
... ( so the /var/log/mysqld.log file exists, with gid 27 (which should be 
...   the mysql uid/gid ) ... 
10372 write(2, "error: ", 7)            = 7
10372 write(2, "mysqld:3 unknown user \'mysql\'\n", 30) = 30
...
10372 exit_group(1)                     = ?


Now, probably logrotate should not exit with an error if the owning user
of a log file to be rotated does not exist; it should not attempt to 
rotate that log file, and should emit an error message, but it should proceed
to the next log file, not exit with an error (I've CC'ed the logrotate
maintainer on this bug).

Do you still have the mysql package installed ?
Did you manually remove the mysql user ?

If you still have the mysql package installed, then if you did not
remove the mysql user manually, it is a mysql bug that the mysql user 
does not exist - please open a mysql bug. If you removed the mysql package,
then the mysql package %postun script should not remove the mysql user if 
there are still existing log files, or should remove such log files - it 
is also Red Hat packaging policy not to remove any system userid - so 
that is also a mysql bug.

And anacron will still run the job if sendmail is not installed,
but will be unable to send output mail, but will not complain about
it (an anacron bug that will be fixed when it is replaced by new crond).

Very sorry for this mess. A workaround is to recreate the mysql user:
  # useradd -u 27 -g mysql -n -r -M -s /sbin/nologin -d /var/lib/mysql mysql
or, if you have removed the mysql package, remove the /var/log/mysql* files.


Comment 13 David Timms 2006-03-15 02:24:32 UTC
Hi Jason,
*** note: user and group gui tool won't start (error): shows no groups nor users.
*** I didn't manually remove the mysql user
# grep sql /etc/passwd
hsqldb:x:100:101::/var/lib/hsqldb:/sbin/nologin
/\ missing mysql user.
# grep sql /etc/shadow
hsqldb:!!:13190:0:99999:7:::
mysql:!!:13190::::::
/\ mysql is present.
# grep sql /etc/group
hsqldb:x:101:
mysql:x:27:
# grep sql /etc/gshadow
hsqldb:x::
mysql:!::
# rpm -qa|grep mysql|sort
libdbi-dbd-mysql-0.8.1a-1.2.1
mysql-5.0.18-2.1
mysql-connector-odbc-3.51.12-1.2.1
mysql-server-5.0.18-2.1
[root@davidthome ~]# service mysqld status
mysqld is stopped
[root@davidthome ~]# service mysqld start
chown: `mysql:mysql': invalid user
chown: `mysql:mysql': invalid user
*** as you have found!
[root@davidthome ~]# grep mysql /var/log/yum.log*
/var/log/yum.log:Feb 20 00:47:22 Updated: mysql.i386 5.0.18-2.1
/var/log/yum.log:Feb 20 00:52:39 Updated: mysql-server.i386 5.0.18-2.1
/var/log/yum.log:Feb 20 00:52:47 Updated: mysql-connector-odbc.i386 3.51.12-1.2.1
/var/log/yum.log:Feb 20 00:52:47 Updated: libdbi-dbd-mysql.i386 0.8.1a-1.2.1
/var/log/yum.log:Mar 05 13:47:41 Erased: mysql-server
/var/log/yum.log:Mar 05 13:49:24 Installed: mysql-server.i386 5.0.18-2.1

*** I did notice the mysql start user error during startup, and uninstalled /
reinstalled to try to fix it. As you say that process didn't actually fix a
login / user problem.

# useradd -u 27 -g mysql -n -r -M -s /sbin/nologin -d /var/lib/mysql mysql
# grep sql /etc/passwd
hsqldb:x:100:101::/var/lib/hsqldb:/sbin/nologin
mysql:x:27:27::/var/lib/mysql:/sbin/nologin
# grep sql /etc/shadow 
hsqldb:!!:13190:0:99999:7:::
mysql:!!:13222::::::
[root@davidthome ~]# service mysqld status
mysqld is stopped
[root@davidthome ~]# service mysqld start

** after 15 seconds, prompt not returned.
# service mysqld status
mysqld is stopped
#sy-co-se mysql start waits 30 seconds then says:
  service mysqld started, but updates the status to mysqld dead but subsys locked.
# ls -l /var/lock/subsys/mysqld
-rw-r--r-- 1 root root 0 Mar 15 12:17 /var/lock/subsys/mysqld
[root@davidthome ~]# rm /var/lock/subsys/mysqld
rm: remove regular empty file `/var/lock/subsys/mysqld'? y

* retry the service start fails identically.
* pirut remove / install mysql-server (with the fixed mysql user)
* mysql now start OK.
* # /usr/sbin/logrotate /etc/logrotate.conf
error: Ignoring yum.rpmorig, because of .rpmorig ending
[root@davidthome anacron]# ls -l /var/log/me*
-rw------- 1 root root     594 Mar 15 13:28 /var/log/messages
-rw------- 1 root root  371462 Mar 15 13:28 /var/log/messages.1
-rw------- 1 root root  143817 Feb 12 10:08 /var/log/messages.2
-rw------- 1 root root 2932210 Mar 11 13:02 /var/log/messages.2006-03
-rw------- 1 root root   42661 Mar 11 13:13
/var/log/messages.2006-03-11.bad-iso.mount.txt
-rw------- 1 root root  741484 Mar 13 00:36
/var/log/messages.2006-03-12.before-suspend.tt
-rw------- 1 root root  182387 Mar 13 11:38 /var/log/messages.2006-03-13.txt

So, after fixing the username for mysql, reinstalling mysqld, I can now
successfully run Logrotate manually, and it does it's thing.

I am happy if you want to close this bug in which ever way you see fit, or I can
provide more info if you think it is relevant/useful.

Comment 14 Jason Vas Dias 2006-03-17 23:32:35 UTC
OK, well several issues came out of this bug:
 o  anacron should not be exit-ing if it thinks it has no more work to do;
    this will be fixed with the new cron rpm anacron replacement.
 o  Bug 185538 - logrotate should continue to process logrotate.d files after
    it gets an error for a single logrotate.d file.

I'm changing this bug into an enhancement request for the anacron issue.

Comment 15 Marcela Mašláňová 2006-12-12 17:08:11 UTC
Changing in notabug - I don't see in point to make cron work with anacron.


Note You need to log in before you can comment on or make changes to this bug.