RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 1763746 - rsyslog doesn't read lines from imfile after rotation of logs
Summary: rsyslog doesn't read lines from imfile after rotation of logs
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: rsyslog
Version: 7.7
Hardware: All
OS: Linux
urgent
urgent
Target Milestone: rc
: ---
Assignee: Jiří Vymazal
QA Contact: Dalibor Pospíšil
URL:
Whiteboard:
: 1762035 1772672 (view as bug list)
Depends On:
Blocks: 1805675
TreeView+ depends on / blocked
 
Reported: 2019-10-21 14:02 UTC by Renaud Métrich
Modified: 2023-10-06 18:41 UTC (History)
16 users (show)

Fixed In Version: rsyslog-8.24.0-52.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1805675 (view as bug list)
Environment:
Last Closed: 2020-03-31 19:10:11 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Github rsyslog rsyslog issues 3960 0 'None' closed Imfile rename to hash-style state file limited 2021-02-17 13:18:43 UTC
Github rsyslog rsyslog pull 2953 0 'None' closed omfile: implement file-id, used in state file 2021-02-17 13:18:42 UTC
Github rsyslog rsyslog pull 3521 0 'None' closed imfile bugfix: file reader could get stuck 2021-02-17 13:18:43 UTC
Red Hat Knowledge Base (Solution) 4546511 0 None None None 2019-11-01 13:03:47 UTC
Red Hat Product Errata RHSA-2020:1000 0 None None None 2020-03-31 19:10:20 UTC

Internal Links: 1772672

Description Renaud Métrich 2019-10-21 14:02:19 UTC
Description of problem:

imfile module makes use of a state file /var/lib/rsyslog/imfile-state:<INODE> where <INODE> is the inode of the file being watched, e.g. "/var/opt/rh/rh-nginx18/log/nginx/access.log", e.g.

  # stat -c %i /var/opt/rh/rh-nginx18/log/nginx/access.log
  17625800

  There is now 1 state file: /var/lib/rsyslog/imfile-state:17625800

Upon rotating once, rsyslog doesn't delete the state file corresponding to the inode prior to rotation, e.g.

  # logrotate -f /etc/logrotate.d//etc/logrotate.d/rh-nginx18-nginx
  # stat -c %i /var/opt/rh/rh-nginx18/log/nginx/access.log
  17625781

  There are now 2 state files:
  - /var/lib/rsyslog/imfile-state:17625800
  - /var/lib/rsyslog/imfile-state:17625781

Upon rotating a second time, the "/var/opt/rh/rh-nginx18/log/nginx/access.log" file inode is prior to first rotation is "reused", causing logs to be lost until the offset saved in the /var/lib/rsyslog/imfile-state:<INODE> state file is reached, e.g.

  # logrotate -f /etc/logrotate.d//etc/logrotate.d/rh-nginx18-nginx
  # stat -c %i /var/opt/rh/rh-nginx18/log/nginx/access.log
  17625800


Version-Release number of selected component (if applicable):

rsyslog-8.24.0-41.el7_7.2.x86_64


How reproducible:

Always

Steps to Reproduce:
0. Install rh-nginx18-nginx (RHSCL), start it, make rsyslog watch the log file

  # yum -y install rh-nginx18-nginx
  # systemctl start rh-nginx18-nginx

  # cat > /etc/rsyslog.d/nginx.conf << EOF
\$InputFileName /var/opt/rh/rh-nginx18/log/nginx/access.log
\$InputFileTag nginx_tag
\$InputFileSeverity notice
\$InputFileFacility local6
\$InputFilePersistStateInterval 0
\$InputRunFileMonitor
EOF

  # systemctl restart rsyslog

1. Create some nginx logs with curl

  # for i in $(seq 1 100); do curl http://localhost/step1 >/dev/null 2>&1; done

2. Check for inode, rotate and check for inode again

  # stat -c %i /var/opt/rh/rh-nginx18/log/nginx/access.log; logrotate -f /etc/logrotate.d/rh-nginx18-nginx; stat -c %i /var/opt/rh/rh-nginx18/log/nginx/access.log
  XXX
  YYY

3. Create some nginx logs with curl again

  # for i in $(seq 1 100); do curl http://localhost/step3 >/dev/null 2>&1; done

4. Check for inode, rotate and check for inode again

  # stat -c %i /var/opt/rh/rh-nginx18/log/nginx/access.log; logrotate -f /etc/logrotate.d/rh-nginx18-nginx; stat -c %i /var/opt/rh/rh-nginx18/log/nginx/access.log
  YYY
  XXX

  If inode XXX is not the one in step 2, create some more logs again and rotate again.

5. Create a new nginx log

  # curl http://localhost/step5 >/dev/null 2>&1
  # tail /var/opt/rh/rh-nginx18/log/nginx/access.log


Actual results:

The log is in /var/opt/rh/rh-nginx18/log/nginx/access.log, but never reaches /var/log/messages because offset in state file /var/lib/rsyslog/imfile-state:XXX is high

Expected results:

Log sent to /var/log/messages

Additional info:

This doesn't happen on Upstream (Fedora 30 rsyslog) because the state file is now different: it has a hash in postfix.

I believe backporting ee77f27db0391126f7210dc3f49ab7f01037a700 and friends (the hash part in the state file) would help, but I'm not completely sure yet.

For sure, the issue is due to keeping the state file for old inode which then gets reused, but offset in the state file is not zero.

Comment 2 Renaud Métrich 2019-10-21 14:37:17 UTC
An alternate reproducer is to have logrotate use the "copytruncate" option for the log file.
In such case, because the inode never changes, logs are lost upon rotation (rsyslog doesn't detect the file content changed at all).

Reproducer:

1. Create many logs (1000)

  # for i in $(seq 1 1000); do curl http://localhost/step1 >/dev/null 2>&1; done

2. Rotate

  # stat -c %i /var/opt/rh/rh-nginx18/log/nginx/access.log; logrotate -f /etc/logrotate.d/rh-nginx18-nginx; stat -c %i /var/opt/rh/rh-nginx18/log/nginx/access.log
  <INODE>
  <INODE>

  (Same inodes above due to using "copytruncate")

3. Create some logs

  # curl http://localhost/step3 >/dev/null 2>&1

--> isn't read my imfile due to offset being large (rsyslog didn't detect the rotation at all).

Comment 3 Rich Megginson 2019-10-22 17:13:03 UTC
I think this bz is related to https://bugzilla.redhat.com/show_bug.cgi?id=1739558 - note that description says "rsyslog underperforming compared to fluentd" - that is misleading - it is really that rsyslog is dropping logs due to cri-o/kubelet log rotation.

Comment 4 Rich Megginson 2019-10-22 17:14:59 UTC
Also - have you tried to use the reopenontruncate setting: https://www.rsyslog.com/doc/v8-stable/configuration/modules/imfile.html#reopenontruncate ?  Or have tried to use a later version of rsyslog to see if that fixes the problem?

Comment 5 Renaud Métrich 2019-10-23 06:56:28 UTC
My comment #2 is misleading, this is another scenario of when the issue happens, but for other reasons (not using "reopnontruncate").

In the description logrotate doesn't truncate the log file, it creates a new inode.
The issue is with the older state files which do not get deleted, causing the issue when some inode was already used for the log file some time ago (typically I see this reuse after one or 2 rotations).

All this has been fixed upstream by the proposed fix + some recent fixes.

I'm proposing the following workaround, which consists in deleting rsyslog state files that will be used for the current logs after rotation:

/etc/logrotate.d/rh-nginx18-nginx:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
/var/opt/rh/rh-nginx18/log/nginx/*log {
    create 0644 nginx nginx
    daily
    rotate 10
    missingok
    notifempty
    compress
    sharedscripts
    postrotate
        for inode in $(stat -c %i $1); do
            /bin/rm /var/lib/rsyslog/imfile-state:$inode 2>/dev/null || true
        done
        /bin/kill -USR1 `cat /var/opt/rh/rh-nginx18/run/nginx/nginx.pid 2>/dev/null` 2>/dev/null || true
    endscript
}
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

The modification is in "postrotate", the first 3 lines.

In theory, this should work well because at the "postrotate" step here, the new logs have been created by logrotate (through "create" statement) but nginx doesn't use them yet since it didn't receive the USR1 signal, hence rsyslog is still processing the older file.

I'm still waiting for my customer to confirm it works, I will then create a KCS with this.

Comment 6 Rich Megginson 2019-10-23 13:37:23 UTC
(In reply to Renaud Métrich from comment #5)
> My comment #2 is misleading, this is another scenario of when the issue
> happens, but for other reasons (not using "reopnontruncate").
> 
> In the description logrotate doesn't truncate the log file, it creates a new
> inode.
> The issue is with the older state files which do not get deleted, causing
> the issue when some inode was already used for the log file some time ago
> (typically I see this reuse after one or 2 rotations).
> 
> All this has been fixed upstream by the proposed fix + some recent fixes.

So if we rebase rsyslog in rhel 8.2 to the latest upstream, we can solve this issue for rhel8?

> 
> I'm proposing the following workaround, which consists in deleting rsyslog
> state files that will be used for the current logs after rotation:
> 
> /etc/logrotate.d/rh-nginx18-nginx:
> -------- 8< ---------------- 8< ---------------- 8< ---------------- 8<
> --------
> /var/opt/rh/rh-nginx18/log/nginx/*log {
>     create 0644 nginx nginx
>     daily
>     rotate 10
>     missingok
>     notifempty
>     compress
>     sharedscripts
>     postrotate
>         for inode in $(stat -c %i $1); do
>             /bin/rm /var/lib/rsyslog/imfile-state:$inode 2>/dev/null || true
>         done
>         /bin/kill -USR1 `cat /var/opt/rh/rh-nginx18/run/nginx/nginx.pid
> 2>/dev/null` 2>/dev/null || true
>     endscript
> }
> -------- 8< ---------------- 8< ---------------- 8< ---------------- 8<
> --------
> 
> The modification is in "postrotate", the first 3 lines.
> 
> In theory, this should work well because at the "postrotate" step here, the
> new logs have been created by logrotate (through "create" statement) but
> nginx doesn't use them yet since it didn't receive the USR1 signal, hence
> rsyslog is still processing the older file.
> 
> I'm still waiting for my customer to confirm it works, I will then create a
> KCS with this.

Ok, but note that if the customer is using imfile with startmsg.regex or endmsg.regex, there is a chance that the log will end with a partial line, which will be saved in the imfile-state:$inode file.  The timing has to be just right, but if you restart rsyslog, it will lose the copy of the partial line in memory, and will not be able to reconstruct the partial line from the information saved in the imfile-state:$inode file.

Comment 7 Renaud Métrich 2019-10-23 13:55:59 UTC
(In reply to Rich Megginson from comment #6)
> So if we rebase rsyslog in rhel 8.2 to the latest upstream, we can solve
> this issue for rhel8?
> 

It looks like it works with very latest from Upstream (tried with 8.1911.0.master and didn't reproduce).

> > 
> > I'm proposing the following workaround, which consists in deleting rsyslog
> > state files that will be used for the current logs after rotation:
> > 
> > /etc/logrotate.d/rh-nginx18-nginx:
> > -------- 8< ---------------- 8< ---------------- 8< ---------------- 8<
> > --------
> > /var/opt/rh/rh-nginx18/log/nginx/*log {
> >     create 0644 nginx nginx
> >     daily
> >     rotate 10
> >     missingok
> >     notifempty
> >     compress
> >     sharedscripts
> >     postrotate
> >         for inode in $(stat -c %i $1); do
> >             /bin/rm /var/lib/rsyslog/imfile-state:$inode 2>/dev/null || true
> >         done
> >         /bin/kill -USR1 `cat /var/opt/rh/rh-nginx18/run/nginx/nginx.pid
> > 2>/dev/null` 2>/dev/null || true
> >     endscript
> > }
> > -------- 8< ---------------- 8< ---------------- 8< ---------------- 8<
> > --------
> > 
> > The modification is in "postrotate", the first 3 lines.
> > 
> > In theory, this should work well because at the "postrotate" step here, the
> > new logs have been created by logrotate (through "create" statement) but
> > nginx doesn't use them yet since it didn't receive the USR1 signal, hence
> > rsyslog is still processing the older file.
> > 
> > I'm still waiting for my customer to confirm it works, I will then create a
> > KCS with this.
> 
> Ok, but note that if the customer is using imfile with startmsg.regex or
> endmsg.regex, there is a chance that the log will end with a partial line,
> which will be saved in the imfile-state:$inode file.  The timing has to be
> just right, but if you restart rsyslog, it will lose the copy of the partial
> line in memory, and will not be able to reconstruct the partial line from
> the information saved in the imfile-state:$inode file.

I understand it's not optimal.

Comment 9 Renaud Métrich 2019-11-01 10:39:37 UTC
*** Bug 1762035 has been marked as a duplicate of this bug. ***

Comment 10 Renaud Métrich 2019-11-01 10:43:33 UTC
Note that workaround in Comment #5 may not be sufficient. It has been reported that this creates an AVC (logrotate is not authorized to delete the old state files).
Unfortunately I didn't find the time to confirm myself.

Comment 11 Renaud Métrich 2019-11-01 12:19:39 UTC
I can confirm some additional SELinux rules are needed to be able to use the workaround:

logrotate_rsyslog_state_files.te:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
module logrotate_rsyslog_state_files 1.0;

require {
	type syslogd_var_lib_t;
	type logrotate_t;
	class file { getattr unlink };
	class dir { remove_name write };
}

#============= logrotate_t ==============
allow logrotate_t syslogd_var_lib_t:dir { remove_name write };
allow logrotate_t syslogd_var_lib_t:file { getattr unlink };
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

Compile and install the module:

-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
# yum -y install selinux-policy-devel
# make -f /usr/share/selinux/devel/Makefile logrotate_rsyslog_state_files.pp
# semodule -i logrotate_rsyslog_state_files.pp
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

Comment 13 Jiří Vymazal 2019-11-05 10:48:09 UTC
adding relevant upstream PRs

Comment 32 Jiří Vymazal 2019-11-26 15:16:31 UTC
*** Bug 1772672 has been marked as a duplicate of this bug. ***

Comment 42 Ugo Bellavance 2020-02-06 11:29:37 UTC
What is taking so long?

Comment 43 Dalibor Pospíšil 2020-02-10 16:26:50 UTC
The bug is now verified. It just waits for a release of rhel-7.8.

Comment 55 errata-xmlrpc 2020-03-31 19:10:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:1000


Note You need to log in before you can comment on or make changes to this bug.