Bug 1105828 - rsyslog /var/log/secure empty
Summary: rsyslog /var/log/secure empty
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: rsyslog
Version: 20
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Tomas Heinrich
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2014-06-07 23:33 UTC by brian
Modified: 2016-09-20 04:52 UTC (History)
6 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-06-29 21:00:04 UTC
Type: Bug


Attachments (Terms of Use)

Description brian 2014-06-07 23:33:33 UTC
Description of problem:
/var/log/secure is empty and not being updated by rsyslogd.
fuser /var/log/secure shows the file is not open.

Version-Release number of selected component (if applicable):
Fedora 20 x686 (32bit), all current patches.
rsyslog-7.4.8-1.fc20.1.i686

I have 7 machines with similar Fedora 20 installs, all with default (identical) /etc/rsyslog.conf and /etc/rsyslog.d/ contents.  Of the 7 machines, 6 of them are x86_64 64bit,, the problem one is 32bit; significant I don't know.

I've performed:
yum reinstall rsyslog
rpm -V rsyslog-7.4.8-1.fc20.1.i686
No changes found.

Tried rebooting (numerous times), autorelabel, 
systemctl rsyslog restart

rsyslog is running, but doesn't open file.

How reproducible:
Unable on any other machine except this one.

Steps to Reproduce:
To produce expected /var/log/secure content, I'm attempting to ssh to machine with bogus userid/passwords, nothing.

I'm also running openswan on machine which produces large volumns of info to /var/log/secure.  Nothing.


Actual results:
/var/log/secure is empty, not open for writing by rsyslog.

Expected results:
Lots of content.

Additional info:
I've seen no selinux errors indicating problems in /var/log/audit/audit.log
ls -laZ /var/log/secure
-rw-------. root root system_u:object_r:var_log_t:s0   /var/log/secure

========================================================
cat /etc/rsyslog.conf 
# rsyslog configuration file

# For more information see /usr/share/doc/rsyslog-*/rsyslog_conf.html
# If you experience problems, see http://www.rsyslog.com/doc/troubleshoot.html

#### MODULES ####

# The imjournal module bellow is now used as a message source instead of imuxsock.
$ModLoad imuxsock # provides support for local system logging (e.g. via logger command)
$ModLoad imjournal # provides access to the systemd journal
$ModLoad imklog   # provides kernel logging support (previously done by rklogd)
#$ModLoad immark  # provides --MARK-- message capability

# Provides UDP syslog reception
#$ModLoad imudp
#$UDPServerRun 514

# Provides TCP syslog reception
#$ModLoad imtcp
#$InputTCPServerRun 514


#### GLOBAL DIRECTIVES ####

# Where to place auxiliary files
$WorkDirectory /var/lib/rsyslog

# Use default timestamp format
$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat

# File syncing capability is disabled by default. This feature is usually not required,
# not useful and an extreme performance hit
#$ActionFileEnableSync on

# Include all config files in /etc/rsyslog.d/
$IncludeConfig /etc/rsyslog.d/*.conf

# Turn off message reception via local log socket;
# local messages are retrieved through imjournal now.
$OmitLocalLogging on

# File to store the position in the journal
$IMJournalStateFile imjournal.state


#### RULES ####

# Log all kernel messages to the console.
# Logging much else clutters up the screen.
#kern.*                                                 /dev/console

# Log anything (except mail) of level info or higher.
# Don't log private authentication messages!
*.info;mail.none;authpriv.none;cron.none                /var/log/messages

# The authpriv file has restricted access.
authpriv.*                                              /var/log/secure

# Log all the mail messages in one place.
mail.*                                                  -/var/log/maillog


# Log cron stuff
cron.*                                                  /var/log/cron

# Everybody gets emergency messages
*.emerg                                                 :omusrmsg:*

# Save news errors of level crit and higher in a special file.
uucp,news.crit                                          /var/log/spooler

# Save boot messages also to boot.log
local7.*                                                /var/log/boot.log


# ### begin forwarding rule ###
# The statement between the begin ... end define a SINGLE forwarding
# rule. They belong together, do NOT split them. If you create multiple
# forwarding rules, duplicate the whole block!
# Remote Logging (we use TCP for reliable delivery)
#
# An on-disk queue is created for this action. If the remote host is
# down, messages are spooled to disk and sent when it is up again.
#$ActionQueueFileName fwdRule1 # unique name prefix for spool files
#$ActionQueueMaxDiskSpace 1g   # 1gb space limit (use as much as possible)
#$ActionQueueSaveOnShutdown on # save messages to disk on shutdown
#$ActionQueueType LinkedList   # run asynchronously
#$ActionResumeRetryCount -1    # infinite retries if host is down
# remote host is: name/ip:port, e.g. 192.168.0.1:514, port optional
#*.* @@remote-host:514
# ### end of the forwarding rule ###
======================================================

Comment 1 Tomas Heinrich 2014-06-18 13:30:13 UTC
Hello Brian,

given your extensive research, it looks the issue is in journald or the transport between journald and rsyslog. There are couple of things you can try.

I didn't notice it in your description, but besides /var/log/secure being empty, other files (e.g. messages) are being logged to, right?

Do the authpriv.* logs reach journald? You can check with
$ journalctl SYSLOG_FACILITY=10

You can try do remove the state file - /var/lib/rsyslog/imjournal.state - and restart rsyslog. That would create a flood of logs and keep the daemon busy for a moment but should stop eventually.

I don't know how this would help in your situation, but you can try removing journald's logs from /var/log/journald/ and restarting it.

Comment 2 brian 2014-06-19 01:41:01 UTC
/var/log/messages is correctly being written.

journalctl SYSLOG_FACILITY=10 does show the full expected assortment of content that would have gone to /var/log/secure from the install date up to today.  Since this is a internet firewall, that's a lot of content.

removing the imjournal.state file and restarting rsyslog produced some interesting results.  First root session that performed the restart returned the following messages:

systemctl restart rsyslog

============================================================================
[root@windward rsyslog]# 
Message from syslogd@windward at May  4 01:26:14 ...
 kernel:BUG: soft lockup - CPU#0 stuck for 23s! [kwin:1241]

Message from syslogd@windward at May  4 01:26:30 ...
 kernel:CPU: 0 PID: 1241 Comm: kwin Not tainted 3.14.2-200.fc20.i686+PAE #1

Message from syslogd@windward at May  4 01:26:31 ...
 kernel:Hardware name: ASUSTeK Computer INC. A7N8X-E/A7N8X-E, BIOS ASUS A7N8X-E Deluxe ACPI BIOS Rev 1008  12/05/2003

Message from syslogd@windward at May  4 01:26:31 ...
 kernel:task: e3924f90 ti: e3960000 task.ti: e3960000

Message from syslogd@windward at May  4 01:26:31 ...
 kernel:Stack:

Message from syslogd@windward at May  4 01:26:31 ...
 kernel:Call Trace:

Message from syslogd@windward at May  4 01:26:32 ...
 kernel:Code: 3d 00 00 01 00 76 0a 0f b7 d0 ed c3 90 8d 74 26 00 55 ba de b4 be c0 89 e5 e8 0b ff ff ff b8 ff ff ff ff 5d c3 8d 74 26 00 8b 00 <c3> 8d b6 00 00 00 00 8d bc 27 00 00 00 00 81 fa ff ff 03 00 89

===================================================================

No other ill effects have been noticed.  Like you said, it caused a lot of traffic to be re-applied to all the log files.  So I'll go on.

Interestingly /var/log/secure did now have content up to May 4th.  I had known it used to be logging and stopped some time ago, now I know exactly when.  The last row written was:
May  4 13:38:25 windward su: pam_unix(su-l:session): session opened for user root by xxxxxx(uid=nnnn)  <I changed the userid/uid for security sake>

Given that we now know the process was/is still working, but not with current records, and your journalctl command, here may be a bad record in the journal that is hanging this process.  So I went back through that log to see what else happened at that same time; including the portion below, but to me nothing stands out significant except that its within a few minutes of a system reboot.

Here's the next few lines of the journalctl output, again changing some identifying parts like my userid and hostname.

May 04 13:38:25 fqdn.home su[3522]: pam_unix(su-l:session): session opened for user root by xxxxxx(uid=nnnn)
May 04 13:50:05 fqdn.home sshd[3483]: pam_unix(sshd:session): session closed for user xxxxxx
May 04 13:50:05 fqdn.home su[3522]: pam_unix(su-l:session): session closed for user root
May 04 13:50:05 fqdn.home sshd[2316]: pam_unix(sshd:session): session closed for user xxxxxx
May 04 13:50:05 fqdn.home su[2359]: pam_unix(su-l:session): session closed for user root
May 04 13:50:05 fqdn.home sshd[1534]: Exiting on signal 15
May 04 13:50:05 fqdn.home sshd[1534]: pam_unix(sshd:session): session closed for user root
May 04 13:50:05 fqdn.home sshd[2310]: Received signal 15; terminating.
May 04 13:50:05 fqdn.home systemd[4030]: pam_unix(systemd-user:session): session opened for user root by (uid=0)
May 04 13:50:05 fqdn.home systemd[4027]: pam_unix(systemd-user:session): session opened for user xxxxxx by (uid=0)
May 04 13:50:05 fqdn.home systemd[4043]: pam_unix(systemd-user:session): session closed for user root
May 04 13:50:05 fqdn.home systemd[4045]: pam_unix(systemd-user:session): session closed for user xxxxxx
May 04 13:50:05 fqdn.home login[698]: pam_systemd(login:session): Failed to release session: Connection was disconnected before a reply was re
May 04 13:50:05 fqdn.home systemd[712]: pam_unix(systemd-user:session): session closed for user root
May 04 13:50:05 fqdn.home systemd[2324]: pam_unix(systemd-user:session): session closed for user xxxxxx
May 04 13:50:05 fqdn.home login[698]: pam_unix(login:session): session closed for user root
-- Reboot --


I don't see anything significant, but that would be my bet is that somewhere in the log is a bad record rsyslog can't swallow.

Any more suggestions to try?
Thanks for your help so far, at least I'm further along than before.

Comment 3 brian 2014-06-19 01:45:54 UTC
Actually I just noticed that the /var/log/secure is receiving new content.  A huge gap between May 4th 13:38:25 and June 18 20:36, but any new expected content is being correctly logged.

So what do you make of that?

Comment 4 Tomas Heinrich 2014-06-23 13:56:00 UTC
(In reply to brian from comment #2)
> Interestingly /var/log/secure did now have content up to May 4th.

It's interesting how that collides with the kernel messages above:

> Message from syslogd@windward at May  4 01:26:31 ...
> kernel:Call Trace:
<...>

But it might be just a coincidence...

> Given that we now know the process was/is still working, but not with
> current records, and your journalctl command, here may be a bad record in
> the journal that is hanging this process.

Yeah, I agree. It wouldn't be the first time a change/bug in a journal message caused consumers to choke on it.

The thing is, even if this was the case, the behavior seems odd.
If there was an incompatible (unparsable) format change, there should be 0 new messages since.
If there was a bogus message that jammed the parser in imjournal, it should have done the same again after re-read.

One more thing to note; there have been unresolved issues with changing the system time back and forth (don't have the bugs at hand right now) that seem to affect the mechanism by which the journal stores the current log cursor (rsyslog's state file). But in those cases, logging stopped altogether. This selective message loss is really weird.

> So I went back through that log
> to see what else happened at that same time; including the portion below,
> but to me nothing stands out significant except that its within a few
> minutes of a system reboot.

You can get more data out of each message using:
journalctl -o json
The output might need a bit of anonymization again.

Another thing to try would be to delete the state file again, start rsyslog in the debug mode and store the output for inspection. There might be something useful regarding processing the messages from the 4th of May. But I'm afraid that would produce too much data to be practical.

Comment 5 JM 2015-02-23 17:10:38 UTC
I have this problem with three Fedora 21 systems, looks like the bug - or whatever it is - still exists.

Comment 6 JM 2015-02-23 17:23:53 UTC
Looks like this problem is related to bug #1088021 for RHEL 7(?).

Comment 7 Fedora End Of Life 2015-05-29 12:03:12 UTC
This message is a reminder that Fedora 20 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 20. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '20'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 20 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged  change the 'version' to a later Fedora 
version prior this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 8 Fedora End Of Life 2015-06-29 21:00:04 UTC
Fedora 20 changed to end-of-life (EOL) status on 2015-06-23. Fedora 20 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.


Note You need to log in before you can comment on or make changes to this bug.