Bug 974691

Summary: Since cleaning up from #974132, nothing gets written to /var/log/messages
Product: [Fedora] Fedora Reporter: Adam Williamson <awilliam>
Component: rsyslogAssignee: Tomas Heinrich <theinric>
Status: CLOSED INSUFFICIENT_DATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high Docs Contact:
Priority: unspecified    
Version: 19CC: mah.darade, pvrabec, robatino, theinric
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: All   
Whiteboard: RejectedBlocker RejectedFreezeException
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-06-25 09:35:11 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Adam Williamson 2013-06-14 20:11:38 UTC
I cleaned up quite aggressively from #974132 - I wound up doing 'rm -f /var/log/messages' and wiping almost all of /var/log/journal. I then updated to systemd-204-8.fc19.x86_64 and rsyslog-7.4.0-1.fc19.x86_64 .

journalctl is working, rsyslog isn't sucking CPU time, and I don't have spam in any logs, but in fact *nothing* is now being written to /var/log/messages at all. After every few messages are written to the journal, I get a line like this:

Jun 14 13:06:01 adam.localdomain systemd-journal[3084]: Forwarding to syslog missed 3 messages.

e.g.:

Jun 14 12:54:15 adam.localdomain systemd-journal[3084]: Forwarding to syslog missed 2 messages.
Jun 14 13:01:01 adam.localdomain CROND[28086]: (root) CMD (run-parts /etc/cron.hourly)
Jun 14 13:01:01 adam.localdomain run-parts(/etc/cron.hourly)[28090]: starting 0anacron
Jun 14 13:01:01 adam.localdomain run-parts(/etc/cron.hourly)[28097]: finished 0anacron
Jun 14 13:01:01 adam.localdomain run-parts(/etc/cron.hourly)[28099]: starting 0yum-hourly.cron
Jun 14 13:01:01 adam.localdomain systemd-journal[3084]: Forwarding to syslog missed 4 messages.
Jun 14 13:01:01 adam.localdomain run-parts(/etc/cron.hourly)[28103]: finished 0yum-hourly.cron
Jun 14 13:01:01 adam.localdomain run-parts(/etc/cron.hourly)[28106]: starting mcelog.cron
Jun 14 13:01:01 adam.localdomain run-parts(/etc/cron.hourly)[28110]: finished mcelog.cron
Jun 14 13:06:01 adam.localdomain systemd-journal[3084]: Forwarding to syslog missed 3 messages.

but no indication as to why. I can't see any immediately obvious reason why at a glance.

Comment 1 Adam Williamson 2013-06-14 20:13:34 UTC
Proposing as a final blocker. It may turn out that this is some idiosyncracy I've caused with my cleanup, but if it affects all systems with systemd -8 and rsyslog 7.4.0, it'd violate:

https://fedoraproject.org/wiki/Fedora_19_Alpha_Release_Criteria#System_logging

"A system logging infrastructure must be available, enabled by default, and working."

note one of the notes reads "This must be done in accordance with relevant standards accepted by the Fedora Project." - I believe we still ought to be writing /var/log/messages to comply with this.

Comment 2 Tomas Heinrich 2013-06-15 13:05:03 UTC
Two guesses:
- the state file holding the journal cursor might be completely off after erasing /var/log/journal/*
- the lines about missed messages refer to the systemd provided syslog.service + syslog.socket

Try erasing the state file - /var/lib/rsyslog/imjournal.state
Try getting the state of the syslog.socket ; maybe it got started and the socket is now full.
Inspect the condition of the (r)syslog service file ; maybe it got modified and wasn't updated.

Comment 3 Adam Williamson 2013-06-17 17:39:40 UTC
Discussed at 2013-06-17 blocker review meeting: http://meetbot.fedoraproject.org/fedora-blocker-review/2013-06-17/f19final-blocker-review-6.2013-06-17-16.01.log.txt . As per http://refspecs.linuxfoundation.org/FHS_2.3/fhs-2.3.html#VARLOGLOGFILESANDDIRECTORIES , FHS does require /var/log/messages be present (and so our criteria imply that it should be correctly populated) if syslogd is installed, so if this truly affected all installations, there is an argument for it to be a blocker bug. For now, we don't know that it does. So we are delaying the decision and we will test with fresh TC4 installations once TC4 is available, to determine if this is a general problem or not.

Tomas, thanks for the tips, I'll look into my system ASAP.

Comment 4 Adam Williamson 2013-06-19 17:12:40 UTC
Discussed at 2013-06-19 blocker review meeting: http://meetbot.fedoraproject.org/fedora-blocker-review/2013-06-19/f19final-blocker-review-7.2013-06-19-16.01.log.txt . I don't have the constant failures any more since Jun 17 - looks like it somehow cleared up after that date, but not sure why yet. Others report that they have a single "Forwarding to syslog missed XX messages." line per boot, indicating there is some kind of problem with forwarding some messages early in each boot, but no-one else sees the more serious issue of it constantly failing to forward almost anything.

Given this data we agreed to reject this as a blocker and freeze exception issue for now. I'll still try and find some time to look into it a bit further and check it's not more serious than it sounds. Sorry I didn't yet get the data requested, Tomas.

Comment 5 Tomas Heinrich 2013-06-20 10:12:09 UTC
(In reply to Adam Williamson from comment #4)
> I don't have the constant
> failures any more since Jun 17 - looks like it somehow cleared up after that
> date, but not sure why yet. Others report that they have a single
> "Forwarding to syslog missed XX messages." line per boot, indicating there
> is some kind of problem with forwarding some messages early in each boot,
> but no-one else sees the more serious issue of it constantly failing to
> forward almost anything.

For quite some time, systemd forced rsyslog to read from /run/systemd/journal/syslog instead of /dev/log. rsyslog doesn't use this socket since rsyslog-7.4.0-1 and doesn't pose as a syslog.service.

I'd expect the messages to be related to this socket interface; maybe the socket filled up and journal has started to drop messages. Since rsyslog doesn't use it anymore, this might be a fallout of the recent carnage or a problem in journal's flow control or an issue in the rsyslog-systemd plumbing.

Comment 6 Tomas Heinrich 2014-06-25 09:35:11 UTC
I think this can be closed after a year of silence.

Comment 7 Adam Williamson 2014-06-26 06:02:06 UTC
sorry about that, I never quite got around to poking this again, and then things moved on :/

Comment 8 Tomas Heinrich 2014-06-26 10:31:27 UTC
Don't worry about it. It might have been fixed in the meantime and if not, somebody will let us know. :]