Bug 1459633 - rsyslog sporadically fails to log message
rsyslog sporadically fails to log message
Status: NEW
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: rsyslog (Show other bugs)
7.4
Unspecified Unspecified
medium Severity medium
: rc
: ---
Assigned To: Radovan Sroka
BaseOS QE Security Team
: Triaged
: 1473809 (view as bug list)
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2017-06-07 12:14 EDT by Jakub Raček
Modified: 2018-06-30 17:31 EDT (History)
7 users (show)

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
rsyslog patch and debug log 1 (28.10 KB, text/plain)
2017-07-26 04:16 EDT, Jan Stancek
no flags Details

  None (edit)
Comment 3 Jan Stancek 2017-06-08 02:28:46 EDT
rsyslog config:
---
$ModLoad imjournal # provides access to the systemd journal
$WorkDirectory /var/lib/rsyslog
$IMJournalStateFile imjournal.state
mail.debug      /var/log/maillog
---

rsyslog is restarted and 2 seconds later application writes "syslogtst: mail crit test" to /dev/log. Message appears in journal, but not it /var/log/maillog.

@Radovan: Can you think of any debugging/extra options, or any ideas that would help to identify why this is failing? We can't reproduce it on demand by hand, but we can modify LTP testcase, which runs ~daily, and collect some logs that way when it fails next time.
Comment 4 Radovan Sroka 2017-06-08 03:23:40 EDT
(In reply to Jan Stancek from comment #3)
> rsyslog config:
> ---
> $ModLoad imjournal # provides access to the systemd journal
> $WorkDirectory /var/lib/rsyslog
> $IMJournalStateFile imjournal.state
> mail.debug      /var/log/maillog
> ---
> 

Is this whole config ?

> rsyslog is restarted and 2 seconds later application writes "syslogtst: mail
> crit test" to /dev/log. Message appears in journal, but not it
> /var/log/maillog.
> 
> @Radovan: Can you think of any debugging/extra options, or any ideas that
> would help to identify why this is failing? We can't reproduce it on demand
> by hand, but we can modify LTP testcase, which runs ~daily, and collect some
> logs that way when it fails next time.

You can run rsyslog in debug mode e.g.:

# rsyslog -dn
Comment 5 Jan Stancek 2017-06-08 04:23:48 EDT
(In reply to Radovan Sroka from comment #4)
> (In reply to Jan Stancek from comment #3)
> > rsyslog config:
> > ---
> > $ModLoad imjournal # provides access to the systemd journal
> > $WorkDirectory /var/lib/rsyslog
> > $IMJournalStateFile imjournal.state
> > mail.debug      /var/log/maillog
> > ---
> > 
> 
> Is this whole config ?

Yes, the test replaces original with what you see above.
Comment 6 Radovan Sroka 2017-06-08 04:37:48 EDT
(In reply to Jan Stancek from comment #5)
> (In reply to Radovan Sroka from comment #4)
> > (In reply to Jan Stancek from comment #3)
> > > rsyslog config:
> > > ---
> > > $ModLoad imjournal # provides access to the systemd journal
> > > $WorkDirectory /var/lib/rsyslog
> > > $IMJournalStateFile imjournal.state
> > > mail.debug      /var/log/maillog
> > > ---
> > > 
> > 
> > Is this whole config ?
> 
> Yes, the test replaces original with what you see above.


hmm This is not standart usecase for imjournal. I would suggest to use something like in default .conf. 

#--------------------------------------
$ModLoad imuxsock 
$ModLoad imjournal
$WorkDirectory /var/lib/rsyslog
$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat
$SystemLogSocketName /run/systemd/journal/syslog
$OmitLocalLogging on
$IMJournalStateFile imjournal.state

mail.debug      /var/log/maillog
#---------------------------------------

Try that.
Comment 7 Jan Stancek 2017-06-08 05:08:56 EDT
Can you elaborate on that? Is there some interaction between imuxsock and imjournal? The doc [1] says these are alternatives, so I'd expect they can work independently.

If we use config above, which plugin will take precedence?

[1] https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/System_Administrators_Guide/s1-interaction_of_rsyslog_and_journal.html
Comment 8 Jakub Raček 2017-06-08 06:27:19 EDT
> hmm This is not standart usecase for imjournal. I would suggest to use
> something like in default .conf. 
> 

Original config file says:

# Turn off message reception via local log socket;                              
# local messages are retrieved through imjournal now.                           
$OmitLocalLogging on                

As far as I can tell, only when this is set to off, then $SystemLogSocketName becomes relevant. If it's not and both modules are loaded, the message gets logged twice. 

$ActionFileDefaultTemplate just sets format.

Are you sure that the configuration you suggested is effectively different from the one we have in tests?
Comment 9 Radovan Sroka 2017-06-08 06:55:40 EDT
(In reply to Jan Stancek from comment #7)
> Can you elaborate on that? Is there some interaction between imuxsock and
> imjournal? The doc [1] says these are alternatives, so I'd expect they can
> work independently.
> 

They should.

> If we use config above, which plugin will take precedence?
> 
> [1]
> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/
> html/System_Administrators_Guide/s1-interaction_of_rsyslog_and_journal.html



(In reply to Jakub Raček from comment #8)
> > hmm This is not standart usecase for imjournal. I would suggest to use
> > something like in default .conf. 
> > 
> 
> Original config file says:
> 
> # Turn off message reception via local log socket;                          
> 
> # local messages are retrieved through imjournal now.                       
> 
> $OmitLocalLogging on                
> 
> As far as I can tell, only when this is set to off, then
> $SystemLogSocketName becomes relevant. If it's not and both modules are
> loaded, the message gets logged twice. 

Yes. Imjournal reads systemd socket "SystemLogSocketName" and Imuxsock reads /dev/log byt default but that is symlink to systemd socket(at least on fedora and I'm not sure how is this provided in rhel7)

So this is why "OmitLocalLogging" is used.
> 
> $ActionFileDefaultTemplate just sets format.
> 
> Are you sure that the configuration you suggested is effectively different
> from the one we have in tests?

It shoud be ok. My comment about configuration is irrelevant.
Comment 10 Radovan Sroka 2017-06-08 07:05:27 EDT
You have such rule in configuration.

mail.debug      /var/log/maillog

I would left rule for /var/log/messages in your .conf. You can see whether the message arrived at least there.

Or create *.* this rule instead.

Good debug helper is debug template.

see http://www.rsyslog.com/doc/v8-stable/configuration/templates.html

Try it and you will see. I'm just guessing that the message has different priority or severity as you were expecting.
Comment 12 Jan Stancek 2017-07-26 04:16 EDT
Created attachment 1304638 [details]
rsyslog patch and debug log 1

I started running failing test with modified rsyslog, hoping it would sched more light on where things go wrong (imjournal, systemd or syslog02 test).

I added some extra logs to plugins/imjournal/imjournal.c, attached is rsyslog debug log. Looking at the end of log I see pollJournal()/sd_journal_process returned 4 times "1" (SD_JOURNAL_APPEND), but trace in readjournal() just stops appearing after "Starting System Logging Service..." message.
Comment 13 John Shortt 2017-07-27 10:22:15 EDT
*** Bug 1473809 has been marked as a duplicate of this bug. ***
Comment 14 Radovan Sroka 2017-08-30 08:57:19 EDT
Hi,

you should try to change rsyslog config to use imuxsock instead of imjournal.

Here is the config:

$ModLoad imuxsock
$WorkDirectory /var/lib/rsyslog
$SystemLogSocketName /run/systemd/journal/syslog
$OmitLocalLogging off

... #some rules like "mail.debug      /var/log/maillog"

If problem persist, there is probably something wrong in your test suite. If not, there is some race condition in imjournal module and we are closer to make this right. 

@Jan Stancek:
I would like to debug rsyslog but I need some ssh access to machine with failing test.
BTW that code was slightly refactored between 7.3 and 7.4 (bacause of rebase).
Comment 16 Jan Stancek 2017-08-31 02:51:38 EDT
(In reply to Radovan Sroka from comment #14)
> Hi,
> 
> you should try to change rsyslog config to use imuxsock instead of imjournal.
> 
> Here is the config:
> 
> $ModLoad imuxsock
> $WorkDirectory /var/lib/rsyslog
> $SystemLogSocketName /run/systemd/journal/syslog
> $OmitLocalLogging off
> 
> ... #some rules like "mail.debug      /var/log/maillog"

This is how test worked until July 2015. When imjournal became default in RHEL, LTP switched to that. First mention of syslog tests failing I can find is from May 2016.

> 
> If problem persist, there is probably something wrong in your test suite. If
> not, there is some race condition in imjournal module and we are closer to
> make this right. 

This is possible, but we'll likely know only many months later with relative certainty.

> 
> @Jan Stancek:
> I would like to debug rsyslog but I need some ssh access to machine with
> failing test.

In other words reliable reproducer. Unfortunately nobody has figured out yet how to reproduce this on demand. I think a debug patch for test/rsyslog has best chance of moving us forward - if/when we can reproduce it again with the extra logs.
Comment 17 Radovan Sroka 2017-08-31 04:01:34 EDT
(In reply to Jan Stancek from comment #16)
> (In reply to Radovan Sroka from comment #14)
> > Hi,
> > 
> > you should try to change rsyslog config to use imuxsock instead of imjournal.
> > 
> > Here is the config:
> > 
> > $ModLoad imuxsock
> > $WorkDirectory /var/lib/rsyslog
> > $SystemLogSocketName /run/systemd/journal/syslog
> > $OmitLocalLogging off
> > 
> > ... #some rules like "mail.debug      /var/log/maillog"
> 
> This is how test worked until July 2015. When imjournal became default in
> RHEL, LTP switched to that. First mention of syslog tests failing I can find
> is from May 2016.
> 
> > 
> > If problem persist, there is probably something wrong in your test suite. If
> > not, there is some race condition in imjournal module and we are closer to
> > make this right. 
> 
> This is possible, but we'll likely know only many months later with relative
> certainty.
> 
> > 
> > @Jan Stancek:
> > I would like to debug rsyslog but I need some ssh access to machine with
> > failing test.
> 
> In other words reliable reproducer. Unfortunately nobody has figured out yet
> how to reproduce this on demand. I think a debug patch for test/rsyslog has
> best chance of moving us forward - if/when we can reproduce it again with
> the extra logs.

So what are next steps? I think we can and we should schedule run for test suite with imuxsock and also another with patched rsyslog as many times as possible e.g. 100. We can use some compose that was problematic before.
Comment 18 Jan Stancek 2017-08-31 04:20:50 EDT
(In reply to Radovan Sroka from comment #17)
> So what are next steps? I think we can and we should schedule run for test
> suite with imuxsock and also another with patched rsyslog as many times as
> possible e.g. 100. We can use some compose that was problematic before.

I'm running patched rsyslog for a while now, but it didn't reproduce yet with my latest round of extra logs after 1000+ recipes. The one instance you see in comment 12 took couple hundred recipes.
Comment 19 PaulB 2018-01-26 09:04:55 EST
All,
Here is a reproducer:
 arch: s390x
 host: ibm-z-21
 distro: RHEL-ALT-7.5-20180123.n.0 Server s390x
 kernel-alt: 4.14.0-32.el7a debug
 task: /kernel/distribution/ltp/lite 20180118-1

https://beaker.engineering.redhat.com/recipes/4724953#task66995856
http://beaker-archive.app.eng.bos.redhat.com/beaker-logs/2018/01/22692/2269256/4724953/66995856/329696347/resultoutputfile.log
---<-snip->---
syslog02                       FAIL       1    
---<-snip->---

http://beaker-archive.app.eng.bos.redhat.com/beaker-logs/2018/01/22692/2269256/4724953/66995856/RHELKT1LITE.FILTERED.run.log
---<-snip->---
<<<test_start>>>
tag=syslog02 stime=1516926043
cmdline="syslog02"
contacts=""
analysis=exit
<<<test_output>>>
syslog02    0  TINFO  :  Test if messages of all levels are logged.
syslog02    0  TINFO  :  For each level, a separate configuration file is
syslog02    0  TINFO  :  created and that will be used as syslog.conf file.
syslog02    0  TINFO  :  testing whether messages are logged into log file
syslog02    0  TINFO  :  Doing level: emerg...
syslog02    0  TINFO  :  restarting syslog daemon
syslog02    1  TPASS  :  ***** Level emerg passed *****
syslog02    0  TINFO  :  Doing level: alert...
syslog02    0  TINFO  :  restarting syslog daemon
syslog02    2  TPASS  :  ***** Level alert passed *****
syslog02    0  TINFO  :  Doing level: crit...
syslog02    0  TINFO  :  restarting syslog daemon
syslog02    3  TPASS  :  ***** Level crit passed *****
syslog02    0  TINFO  :  Doing level: err...
syslog02    0  TINFO  :  restarting syslog daemon
syslog02    4  TPASS  :  ***** Level err passed *****
syslog02    0  TINFO  :  Doing level: warning...
syslog02    0  TINFO  :  restarting syslog daemon
syslog02    5  TPASS  :  ***** Level warning passed *****
syslog02    0  TINFO  :  Doing level: notice...
syslog02    0  TINFO  :  restarting syslog daemon
syslog02    6  TFAIL  :  ltpapicmd.c:190: ***** Level notice failed *****
syslog02    0  TINFO  :  Doing level: info...
syslog02    0  TINFO  :  restarting syslog daemon
syslog02    7  TPASS  :  ***** Level info passed *****
syslog02    0  TINFO  :  Doing level: debug...
syslog02    0  TINFO  :  restarting syslog daemon
syslog02    8  TPASS  :  ***** Level debug passed *****
syslog02    0  TINFO  :  restarting syslog daemon
<<<execution_status>>>
initiation_status="ok"
duration=34 termination_type=exited termination_id=1 corefile=no
cutime=3 cstime=16
<<<test_end>>>
---<-snip->---


Best,
-pbunyan
Comment 20 PaulB 2018-01-30 08:46:24 EST
All,
Here is a reproducer:
 distro: RHEL-ALT-7.5-20180128.n.0 Server s390x
 kernel-alt: 4.14.0-34.el7a debug
 task: /kernel/distribution/ltp/lite 20180118-1



https://beaker.engineering.redhat.com/recipes/4742759#task67179975
http://beaker-archive.app.eng.bos.redhat.com/beaker-logs/2018/01/22766/2276621/4742759/67179975/330495809/resultoutputfile.log
---<-snip->---
syslog04                       FAIL       1    
---<-snip->---

http://beaker-archive.app.eng.bos.redhat.com/beaker-logs/2018/01/22766/2276621/4742759/67179975/RHELKT1LITE.FILTERED.run.log
---<-snip->---
<<<test_start>>>
tag=syslog04 stime=1517292148
cmdline="syslog04"
contacts=""
analysis=exit
<<<test_output>>>
syslog04    0  TINFO  :  case4: Test the logging option: LOG_PID
syslog04    0  TINFO  :  Do openlog() with LOG_PID option and see whether pid
syslog04    0  TINFO  :  is logged with message.
syslog04    0  TINFO  :  Testing the log option: LOG_PID...
syslog04    0  TINFO  :  restarting syslog daemon
syslog04    0  TINFO  :  restarting syslog daemon
<<<execution_status>>>
initiation_status="ok"
duration=6 termination_type=exited termination_id=1 corefile=no
cutime=1 cstime=5
<<<test_end>>>
<<<test_start>>>
tag=syslog05 stime=1517292154
cmdline="syslog05"
contacts=""
analysis=exit
<<<test_output>>>
syslog05    0  TINFO  :   case5: Test the logging option: LOG_CONS
syslog05    0  TINFO  :   o Do openlog() with LOG_CONS option.
syslog05    0  TINFO  :   o Disable /dev/syslog by moving it to a temporary file name.
syslog05    0  TINFO  :   o Send the syslog message.
syslog05    0  TINFO  :   o Check whether this is written to the console i.e to
syslog05    0  TINFO  :     the file /usr/adm/ktlog/<this year>/<this month>/<to_day>
syslog05    0  TINFO  :  syslog: Testing the log option: LOG_CONS...
syslog05    0  TINFO  :  restarting syslog daemon
syslog05    0  TINFO  :  restarting syslog daemon
<<<execution_status>>>
initiation_status="ok"
duration=26 termination_type=exited termination_id=0 corefile=no
cutime=1 cstime=5
<<<test_end>>>
---<-snip->---




Best,
-pbunyan

Note You need to log in before you can comment on or make changes to this bug.