Bug 1739558

Summary: rsyslog underperforming compared to fluentd
Product: OpenShift Container Platform
Reporter: Eric Matysek <ematysek>
Component: Logging
Assignee: Rich Megginson <rmeggins>
Status: CLOSED DEFERRED
QA Contact: Anping Li <anli>
Severity: medium
Docs Contact:
Priority: medium
Version: 4.2.0
CC: aos-bugs, jcantril, mburke, mfiedler, nhosoi, pasik, rmeggins
Target Milestone: ---
Target Release: 4.3.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-11-05 14:08:40 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Eric Matysek 2019-08-09 13:57:04 UTC
Description of problem:
rsyslog is currently dropping logs at lower logging rates than fluentd

Version-Release number of selected component (if applicable):
4.2

How reproducible:
100%

Steps to Reproduce:
1. Generate logs on one node
2. Verify the ElasticSearch index contains the expected number of logs (see the example count query below)
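
A minimal sketch of step 2, run against the cluster's Elasticsearch (the pod name, index pattern, and certificate paths are assumptions; adjust them to the deployment):

  oc -n openshift-logging exec <es-pod> -- curl -s \
    --cert /etc/elasticsearch/secret/admin-cert \
    --key /etc/elasticsearch/secret/admin-key \
    --cacert /etc/elasticsearch/secret/admin-ca \
    'https://localhost:9200/project.<test-namespace>.*/_count?pretty'
  # compare "count" in the response with the number of log lines generated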

Actual results:
rsyslog starts dropping logs at a rate around 500/s (10 pods on one node producing 50/s each)
fluentd is able to handle 2000/s or more

Expected results:
rsyslog should handle at least the same rate as fluentd

Additional info:
Related to bug 1722944

Comment 1 Rich Megginson 2019-08-09 14:05:12 UTC
> rsyslog starts dropping logs at a rate around 500/s (10 pods on one node producing 50/s each)

So rsyslog is still dropping logs?  So https://bugzilla.redhat.com/show_bug.cgi?id=1722944 isn't fixed?  I don't understand what you mean by "dropping logs".

Comment 2 Eric Matysek 2019-08-09 16:03:53 UTC
dropping logs = logs are not getting indexed into ElasticSearch

Comment 3 Rich Megginson 2019-08-09 16:17:48 UTC
(In reply to Eric Matysek from comment #2)
> dropping logs = logs are not getting indexed into ElasticSearch

So how is this BZ different than https://bugzilla.redhat.com/show_bug.cgi?id=1722944?

Comment 4 Eric Matysek 2019-08-09 16:52:15 UTC
It isn't; I was told to open a new one, though, since that one is a lot to read through.

Comment 5 Noriko Hosoi 2019-08-23 15:22:13 UTC
Per scrum, this bug is considered a blocker.

Comment 6 Noriko Hosoi 2019-08-23 21:02:18 UTC
Hi Eric, may I ask you to run the same test with the following rsyslog config changes?  The config can be updated with "oc edit cm rsyslog -c rsyslog".

case 1: increasing PersistStateInterval from 0 to 1000
--- /etc/rsyslog.d/60-mmk8s.conf	2019-08-03 22:41:43.416405303 -0700
+++ 60-mmk8s.conf	2019-08-23 13:30:44.492208637 -0700
@@ -8,8 +8,9 @@
       file="/var/log/containers/*.log"
       tag="kubernetes" addmetadata="on" reopenOnTruncate="on"
       discardTruncatedMsg="on" msgDiscardingError="off"
       freshStartTail=`echo $RSYSLOG_FILE_READ_FROM_TAIL`
+      PersistStateInterval="1000"
       endmsg.regex="(^[^ ]+ (stdout|stderr) F )|(\\n\"}$)|(\\n\",\")") # "
 
 if ((strlen($!CONTAINER_NAME) > 0) and (strlen($!CONTAINER_ID_FULL) > 0)) or


If we do not see any difference with case 1, can we then try the polling mode?

case 2: using polling mode
--- /etc/rsyslog.d/10-viaq-modules.conf	2019-08-03 09:46:59.160801385 -0700
+++ 10-viaq-modules.conf	2019-08-23 13:42:22.098584866 -0700
@@ -1,6 +1,6 @@
 # Read from file with inotify enabled
-module(load="imfile" mode="inotify")
+module(load="imfile" mode="polling" PollingInterval="1")
 
 # Read from journal
 module(load="imjournal" StateFile=`echo $RSYSLOG_IMJOURNAL_STATE` UsePid="both"
        RateLimit.Burst="1000000" RateLimit.Interval="10" PersistStateInterval="1000"

The imfile doc recommends using inotify mode, but there have been cases where similar issues were solved by switching to polling mode, e.g., https://github.com/rsyslog/rsyslog/issues/1605.
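
For reference, a minimal sketch of applying case 2 (the namespace and pod label are assumptions, and the cluster-logging instance may need to be set to Unmanaged so the operator does not revert the edit):

  oc -n openshift-logging edit cm rsyslog
  # in 10-viaq-modules.conf, change the module load line to:
  #   module(load="imfile" mode="polling" PollingInterval="1")
  oc -n openshift-logging delete pods -l component=rsyslog   # assumed label; recreate the pods so they pick up the new config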

Comment 7 Eric Matysek 2019-08-26 22:17:16 UTC
Based on my testing today, neither config change looked like it made a significant impact on rsyslog performance

Comment 8 Noriko Hosoi 2019-08-26 22:41:28 UTC
(In reply to Eric Matysek from comment #7)
> Based on my testing today, neither config change looked like it made a
> significant impact on rsyslog performance

Thank you very much for your testing, Eric.  I had expected the case 1 change to improve rsyslog performance and that, when a log rotation occurs, there would be little or no backlog left behind; that backlog likely accounts for the missing logs.

Also, there was a hope that the inotify mode was the cause of the problem.  The inotify mode is rather new, and the combination of inotify and reading from symlinks is even newer, i.e., it has not been widely used so far.  We are afraid this might fall under the Caveats/Known Bugs entry "symlink may not always be properly processed" in https://www.rsyslog.com/doc/v8-stable/configuration/modules/imfile.html.

Continue investigating...

Comment 10 Rich Megginson 2019-08-30 01:06:10 UTC
https://github.com/openshift/cluster-logging-operator/pull/236 doesn't fix the problem, but it helps - moving back to ASSIGNED

Comment 11 Rich Megginson 2019-09-03 17:16:02 UTC
current status - rsyslog is still not working correctly

I'm using https://github.com/richm/origin-aggregated-logging/commit/f8be57baa76f6b2ac969f0601d0d5cf0c217dc74 for performance testing in the CI environment

There is definitely a problem with rsyslog correctly handling cri-o/kubelet log file rotation.  I can see that rsyslog misses the last several records in the file.  kubelet renames the file e.g. 0.log -> 0.log.timestamp - this means the file keeps the same inode and remains open.

ls -ali /var/log/pods/perfbasic1/logging-load-driver

total 216884
185544759 drwxr-xr-x. 3 root root      109 Sep  3 16:39 ../
    19466 -rw-r--r--. 1 root root 44766710 Sep  3 16:44 0.log.20190903-164321.gz
    19471 -rw-r--r--. 1 root root 40145884 Sep  3 16:44 0.log.20190903-164404.gz
    19472 -rw-r--r--. 1 root root 40778012 Sep  3 16:45 0.log.20190903-164447.gz
    19464 -rw-------. 1 root root 62901287 Sep  3 16:45 0.log.20190903-164530
    19462 drwxr-xr-x. 2 root root      144 Sep  3 16:45 ./
    19465 -rw-------. 1 root root 20338907 Sep  3 16:45 0.log

rsyslog is notified immediately when the file rotates.  rsyslog then tries to read from the file until EOF (until a read() system call returns 0 bytes).  When that happens, it completely stops reading from the file.  But if cri-o is logging at a high rate, it may continue writing to the file even after kubelet has renamed it.  fluentd, by contrast, continues reading from the file _after_ it has been rotated, for up to rotate_wait seconds https://github.com/openshift/origin-aggregated-logging/blob/master/fluentd/lib/generate_throttle_configs/lib/generate_throttle_configs.rb#L71.  We may need to add this functionality to rsyslog - it currently has no such setting https://www.rsyslog.com/doc/v8-stable/configuration/modules/imfile.html
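
For comparison, a minimal sketch of the relevant fluentd in_tail settings (illustrative values, not the actual OpenShift collector config):

  <source>
    @type tail
    path /var/log/containers/*.log
    pos_file /var/lib/fluentd/pos/containers.pos
    read_from_head true
    rotate_wait 5          # keep reading the renamed file for up to 5 seconds after rotation
    tag kubernetes.*
    <parse>
      @type none
    </parse>
  </source>

rotate_wait defaults to 5 seconds in in_tail; that window is what lets fluentd pick up records written between the rename and the writer closing the file.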

I've also tried setting reopenOnTruncate="off" (we set it to "on") but that just causes a lot of errors like {"@timestamp":"2019-09-03T16:47:40.973981+00:00", "host":"rsyslog-jx9rc", "level":"warning", "facility":"syslog", "syslog-tag":"rsyslogd-2175:", "source":"rsyslogd-2175", "procid":"-", "msgid":"-", "message":"imfile: internal error? inotify provided watch descriptor 106 which we could not find in our tables - ignored [v8.37.0-9.el7 try http:\/\/www.rsyslog.com\/e\/2175 ]"}

Comment 12 Rich Megginson 2019-09-03 17:31:08 UTC
> I've also tried setting reopenOnTruncate="off" (we set it to "on") but that just causes a lot of errors like {"@timestamp":"2019-09-03T16:47:40.973981+00:00", "host":"rsyslog-jx9rc", "level":"warning", "facility":"syslog", "syslog-tag":"rsyslogd-2175:", "source":"rsyslogd-2175", "procid":"-", "msgid":"-", "message":"imfile: internal error? inotify provided watch descriptor 106 which we could not find in our tables - ignored [v8.37.0-9.el7 try http:\/\/www.rsyslog.com\/e\/2175 ]"}

I get this error with reopenOnTruncate="on" as well, but it doesn't seem to completely stop rsyslog from sending logs to Elasticsearch; I can still see logs being added.

Comment 13 Rich Megginson 2019-09-03 18:34:01 UTC
Another strategy (perhaps fluentd doesn't use this because it doesn't use inotify): keep reading from the file until it is closed.

man inotify:

           IN_CLOSE_WRITE (+)
                  File opened for writing was closed.

I think there is only one process actually writing to the file, so when we get IN_CLOSE_WRITE on the rotated file, we know it has been closed.  Using inotifywait during the run seems to support this theory:

/var/log/pods/perfbasic1_perfbasic1_944f6464-ce76-11e9-b9ee-0ac699f7f852/logging-load-driver/ MOVED_FROM 0.log
/var/log/pods/perfbasic1_perfbasic1_944f6464-ce76-11e9-b9ee-0ac699f7f852/logging-load-driver/ MOVED_TO 0.log.20190903-181831
/var/log/pods/perfbasic1_perfbasic1_944f6464-ce76-11e9-b9ee-0ac699f7f852/logging-load-driver/ MODIFY 0.log.20190903-181831
...
/var/log/pods/perfbasic1_perfbasic1_944f6464-ce76-11e9-b9ee-0ac699f7f852/logging-load-driver/ MODIFY 0.log.20190903-181831
/var/log/containers/perfbasic1_perfbasic1_logging-load-driver-54786addf92ec1337a62f5ba98330b7fa5588509f884db9b78049994265cd5ed.log MODIFY
/var/log/pods/perfbasic1_perfbasic1_944f6464-ce76-11e9-b9ee-0ac699f7f852/logging-load-driver/ CLOSE_WRITE,CLOSE 0.log.20190903-181831
/var/log/containers/perfbasic1_perfbasic1_logging-load-driver-54786addf92ec1337a62f5ba98330b7fa5588509f884db9b78049994265cd5ed.log CLOSE_WRITE,CLOSE

MOVED_FROM/MOVED_TO is the rename - this is rotating the file by renaming it, so the inode stays the same.  cri-o/conmon/kubelet continues writing to the file (MODIFY).  Then the file is closed (CLOSE_WRITE,CLOSE).

/var/log/pods/perfbasic1_perfbasic1_944f6464-ce76-11e9-b9ee-0ac699f7f852/logging-load-driver/ CLOSE_NOWRITE,CLOSE 0.log.20190903-181525

I believe this is rsyslog closing the file - NOWRITE means the file was not open for writing, and the process that had it open read-only is probably rsyslog.

I think we can use this inotify behavior to make rsyslog wait until it gets a CLOSE_WRITE,CLOSE on the file before it stops reading from it.
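
As a rough shell sketch of that idea (illustration only, not the rsyslog implementation; the placeholders stand for the rotated file in the listing above):

  ROTATED=/var/log/pods/<pod-dir>/logging-load-driver/0.log.<timestamp>
  tail -f "$ROTATED" & TAIL_PID=$!            # keep reading past EOF after the rename
  inotifywait -q -e close_write "$ROTATED"    # returns once the writer (conmon) closes the file
  kill "$TAIL_PID"                            # only now is it safe to stop reading the rotated file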

Comment 15 Rich Megginson 2019-10-22 17:12:53 UTC
I think this bz is related to https://bugzilla.redhat.com/show_bug.cgi?id=1763746

Comment 16 Jeff Cantrill 2019-11-05 14:08:40 UTC
Closing as deferred since it is uncertain if/when rsyslog may be released as a collector

Comment 17 Rich Megginson 2019-11-14 20:59:16 UTC
this is likely related to https://bugzilla.redhat.com/show_bug.cgi?id=1772672