Description of problem: The log rotation for rsyslog doesn't work. $ oc exec rsyslog-gkqvh env |grep LOGGING_FILE Defaulting container name to rsyslog. Use 'oc describe pod/rsyslog-gkqvh -n openshift-logging' to see all of the containers in this pod. LOGGING_FILE_SIZE=1024000 LOGGING_FILE_AGE=5 $ oc exec rsyslog-gkqvh -- ls /var/log/rsyslog -l Defaulting container name to rsyslog. Use 'oc describe pod/rsyslog-gkqvh -n openshift-logging' to see all of the containers in this pod. total 65472 -rw-r--r--. 1 root root 35792065 Jul 16 07:56 rsyslog.log I checked logrotate container, no log. Version-Release number of selected component (if applicable): ose-logging-rsyslog-v4.2.0-201907141446 ose-cluster-logging-operator-v4.2.0-201907121819 How reproducible: Always Steps to Reproduce: 1. Deploy logging via OLM, set log collector to rsyslog 2. oc set env ds/rsyslog LOGGING_FILE_SIZE=1024000 LOGGING_FILE_AGE=5 3. wait for a while, check files in /var/log/rsyslog in rsyslog pods Actual results: The log rotation for rsyslog doesn't work. Expected results: The log rotation should work. Additional info:
Note that rsyslog does not have log rotation built-in to it - there is no way we can ever make it rotate exactly at the LOGGING_FILE_SIZE - rsyslog size based log rotation is "best effort" or "approximate". rsyslog logrotation runs every day: https://github.com/richm/cluster-logging-operator/blob/master/files/logrotate/logrotate # run logrotate (/var/log/rsyslog/*.log) at 2am everyday # run logrotate_pod (/var/lib/rsyslog.pod) at 3am everyday 0 2 * * * root /usr/bin/bash /opt/app-root/bin/logrotate.sh 0 3 * * * root /usr/bin/bash /opt/app-root/bin/logrotate_pod.sh With this configuration, the file can exceed 10M in size before logrotate runs. If this is not acceptable, we can change logrotate to run every hour or every 10 minutes. Or, leave this configuration as the default, and document for customers how it works and how to change it. The configuration is in cron format. See `man 5 crontab`.
Thanks for the note. I tried to set LOGGING_FILE_AGE=5, LOGGING_FILE_PATH and LOGGING_FILE_SIZE were using the default value, and set `*/5 * * * * root /usr/bin/bash /opt/app-root/bin/logrotate.sh` in cm/logrotate-crontab, here are what I found: 1. after the rotate finished, the log file rsyslog.log was renamed to rsyslog.log-$timestamp, and a new file rsyslog.log was created at the same time, but the logs were still written to rsyslog.log-$timestamp, not in the rsyslog.log. $ oc rsh rsyslog-wztfl Defaulting container name to rsyslog. Use 'oc describe pod/rsyslog-wztfl -n openshift-logging' to see all of the containers in this pod. sh-4.2# ls /var/log/rsyslog/ -l total 75072 -rw-r--r--. 1 root root 467 Jul 17 03:40 logrotate.log -rw-r--r--. 1 root root 1007 Jul 17 03:25 rsyslog.log -rw-r--r--. 1 root root 10382396 Jul 17 02:49 rsyslog.log-20190717-1563331501 -rw-r--r--. 1 root root 12649621 Jul 17 03:02 rsyslog.log-20190717-1563332101 -rw-r--r--. 1 root root 11345789 Jul 17 03:11 rsyslog.log-20190717-1563332701 -rw-r--r--. 1 root root 12271294 Jul 17 03:21 rsyslog.log-20190717-1563333302 -rw-r--r--. 1 root root 23122678 Jul 17 03:40 rsyslog.log-20190717-1563333901 If delete rsyslog pods, then wait until new pods start, the logs were written to rsyslog.log. I checked the content in logrotate.log, it will only rotate file "*.log", it won't rotate file "rsyslog.log-$timestamp". If the rsyslog pods were not deleted, the log files rsyslog.log-$timestamp will never be deleted, and it can be very large. # cat logrotate.log reading config file /tmp/logrotate.conf Allocating hash table for state file, size 15360 B Handling 1 logs rotating pattern: "/var/log/rsyslog/*.log" 1024000 bytes (5 rotations) empty log files are not rotated, old logs are removed considering log /var/log/rsyslog/logrotate.log log does not need rotating (log size is below the 'size' threshold) considering log /var/log/rsyslog/rsyslog.log log does not need rotating (log size is below the 'size' threshold) I thought the rsyslog pod's log should always be written to rsyslog.log. 2. I have set LOGGING_FILE_AGE=5, there always has 5 files named as rsyslog.log-$timestamp Use 'oc describe pod/rsyslog-wztfl -n openshift-logging' to see all of the containers in this pod. total 198984 -rw-r--r--. 1 root root 467 Jul 17 03:20 logrotate.log -rw-r--r--. 1 root root 2142615 Jul 17 03:22 rsyslog.log -rw-r--r--. 1 root root 152907148 Jul 17 02:41 rsyslog.log-20190717-1563330901 -rw-r--r--. 1 root root 10382396 Jul 17 02:49 rsyslog.log-20190717-1563331501 -rw-r--r--. 1 root root 12649621 Jul 17 03:02 rsyslog.log-20190717-1563332101 -rw-r--r--. 1 root root 11345789 Jul 17 03:11 rsyslog.log-20190717-1563332701 -rw-r--r--. 1 root root 12271294 Jul 17 03:21 rsyslog.log-20190717-1563333302 Use 'oc describe pod/rsyslog-wztfl -n openshift-logging' to see all of the containers in this pod. total 58688 -rw-r--r--. 1 root root 921 Jul 17 03:25 logrotate.log -rw-r--r--. 1 root root 1007 Jul 17 03:25 rsyslog.log -rw-r--r--. 1 root root 10382396 Jul 17 02:49 rsyslog.log-20190717-1563331501 -rw-r--r--. 1 root root 12649621 Jul 17 03:02 rsyslog.log-20190717-1563332101 -rw-r--r--. 1 root root 11345789 Jul 17 03:11 rsyslog.log-20190717-1563332701 -rw-r--r--. 1 root root 12271294 Jul 17 03:21 rsyslog.log-20190717-1563333302 -rw-r--r--. 1 root root 6091556 Jul 17 03:25 rsyslog.log-20190717-1563333901
(In reply to Qiaoling Tang from comment #2) > 1. after the rotate finished, the log file rsyslog.log was renamed to > rsyslog.log-$timestamp, and a new file rsyslog.log was created at the same > time, but the logs were still written to rsyslog.log-$timestamp, not in the > rsyslog.log. Currently, rsyslog.log is shared between the main rsyslogd process and its mmexternal plugin undefined_field. Obviously, undefined_field is not handling SIGHUP. I'm afraid we need to include and use this kind of library... https://github.com/client9/reopen To make sure my guess is right, could you please find out these? 1) Once logrotation happens, the size of rsyslog.log remains the same and nothing is added? Or something is added even if it is very small? 2) What rsyslog.log-$timestamp is getting? Could you share some sample lines? Thanks.
(In reply to Noriko Hosoi from comment #3) > To make sure my guess is right, could you please find out these? > 1) Once logrotation happens, the size of rsyslog.log remains the same and > nothing is added? Or something is added even if it is very small? sh-4.2# ls -lh total 218M -rw-r--r--. 1 root root 467 Jul 18 00:15 logrotate.log -rw-r--r--. 1 root root 1010 Jul 17 23:46 rsyslog.log -rw-r--r--. 1 root root 4.1M Jul 17 23:10 rsyslog.log-20190717-1563405001 -rw-r--r--. 1 root root 3.9M Jul 17 23:15 rsyslog.log-20190717-1563405301 -rw-r--r--. 1 root root 7.9M Jul 17 23:23 rsyslog.log-20190717-1563405601 -rw-r--r--. 1 root root 1.1M Jul 17 23:24 rsyslog.log-20190717-1563405901 -rw-r--r--. 1 root root 144M Jul 18 00:18 rsyslog.log-20190717-1563406201 sh-4.2# cat rsyslog.log {"@timestamp":"2019-07-17T23:30:01.789081+00:00", "host":"rsyslog-5lkwz", "level":"info", "facility":"syslog", "syslog-tag":"rsyslogd:", "source":"rsyslogd", "procid":"-", "msgid":"-", "message":" [origin software=\"rsyslogd\" swVersion=\"8.37.0-9.el7\" x-pid=\"80546\" x-info=\"http:\/\/www.rsyslog.com\"] rsyslogd was HUPed"} {"@timestamp":"2019-07-17T23:30:01.789280+00:00", "host":"rsyslog-5lkwz", "level":"info", "facility":"syslog", "syslog-tag":"rsyslogd0:", "source":"rsyslogd0", "procid":"-", "msgid":"-", "message":"lookup table 'prio_to_level' reloaded from file '\/etc\/rsyslog.d\/prio_to_level.json' [v8.37.0-9.el7 try http:\/\/www.rsyslog.com\/e\/0 ]"} {"@timestamp":"2019-07-17T23:30:01.789296+00:00", "host":"rsyslog-5lkwz", "level":"info", "facility":"syslog", "syslog-tag":"rsyslogd0:", "source":"rsyslogd0", "procid":"-", "msgid":"-", "message":"lookup table 'normalize_level' reloaded from file '\/etc\/rsyslog.d\/normalize_level.json' [v8.37.0-9.el7 try http:\/\/www.rsyslog.com\/e\/0 ]"} > 2) What rsyslog.log-$timestamp is getting? Could you share some sample > lines? The whole file is in the attachment. and I deleted lines contain "No Action Needed for xxxxxxxxx". sh-4.2# cat rsyslog.log-20190717-1563406201 mmexternal: merge_json_log: false mmexternal: use_undefined: false mmexternal: default_keep_fields: CEE,time,@timestamp,aushape,ci_job,collectd,docker,fedora-ci,file,foreman,geoip,hostname,ipaddr4,ipaddr6,kubernetes,level,message,namespace_name,namespace_uuid,offset,openstack,ovirt,pid,pipeline_metadata,rsyslog,service,systemd,tags,testcase,tlog,viaq_msg_id mmexternal: extra_keep_fields: mmexternal: undefined_name: undefined mmexternal: keep_empty_fields: mmexternal: undefined_to_string: false mmexternal: undefined_dot_replace_char: UNUSED mmexternal: undefined_max_num_fields: 9223372036854775807 mmexternal: noaction: true {"@timestamp":"2019-07-17T23:05:33.140008+00:00", "host":"rsyslog-5lkwz", "level":"err", "facility":"syslog", "syslog-tag":"rsyslogd:", "source":"rsyslogd", "procid":"-", "msgid":"-", "message":"program '\/usr\/local\/bin\/undefined_field' exited normally, state 0 [v8.37.0-9.el7]"} {"@timestamp":"2019-07-17T23:05:33.186499+00:00", "host":"rsyslog-5lkwz", "level":"err", "facility":"syslog", "syslog-tag":"rsyslogd:", "source":"rsyslogd", "procid":"-", "msgid":"-", "message":"program '\/usr\/local\/bin\/undefined_field' exited normally, state 0 [v8.37.0-9.el7]"} {"@timestamp":"2019-07-17T23:05:33.232430+00:00", "host":"rsyslog-5lkwz", "level":"err", "facility":"syslog", "syslog-tag":"rsyslogd:", "source":"rsyslogd", "procid":"-", "msgid":"-", "message":"program '\/usr\/local\/bin\/undefined_field' exited normally, state 0 [v8.37.0-9.el7]"} {"@timestamp":"2019-07-17T23:05:33.281407+00:00", "host":"rsyslog-5lkwz", "level":"err", "facility":"syslog", "syslog-tag":"rsyslogd:", "source":"rsyslogd", "procid":"-", "msgid":"-", "message":"program '\/usr\/local\/bin\/undefined_field' exited normally, state 0 [v8.37.0-9.el7]"} {"@timestamp":"2019-07-17T23:05:33.329929+00:00", "host":"rsyslog-5lkwz", "level":"err", "facility":"syslog", "syslog-tag":"rsyslogd:", "source":"rsyslogd", "procid":"-", "msgid":"-", "message":"program '\/usr\/local\/bin\/undefined_field' exited normally, state 0 [v8.37.0-9.el7]"} {"@timestamp":"2019-07-17T23:05:33.376303+00:00", "host":"rsyslog-5lkwz", "level":"err", "facility":"syslog", "syslog-tag":"rsyslogd:", "source":"rsyslogd", "procid":"-", "msgid":"-", "message":"program '\/usr\/local\/bin\/undefined_field' exited normally, state 0 [v8.37.0-9.el7]"}
Created attachment 1591650 [details] rsyslog.log-20190717-1563406201
Thank you, @Qialling! Your data proved my theory.
https://github.com/openshift/origin-aggregated-logging/pull/1685
Verified with ose-logging-rsyslog-v4.2.0-201907251819.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:2922