Bug 1560358

Summary: Looks like fluentd is keeping open deleted journal files that have been rotated
Product: OpenShift Container Platform
Component: Logging
Version: 3.9.0
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: high
Target Milestone: ---
Target Release: 3.10.z
Reporter: Peter Portante <pportant>
Assignee: ewolinet
QA Contact: Anping Li <anli>
Docs Contact:
CC: anli, aos-bugs, dmoessne, ecrosby, ewolinet, florian.herzog, jcantril, jlee, jupierce, mharri, mifiedle, nhosoi, pportant, rmeggins, vwalek
Keywords: OpsBlocker, Reopened
Flags: jcantril: needinfo-
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The underlying library that fluentd uses for reading journal files did not properly handle rolled-over files.
Consequence: When a journal file was rolled over, fluentd would hold the file open even after it was done reading from it.
Fix: Updated the code in the underlying fluent-plugin-systemd plugin and updated the plugin version in the fluentd Dockerfile.
Result: Fluentd no longer holds rolled-over journal files open.
Story Points: ---
Clone Of:
Clones: 1664744 (view as bug list)
Environment:
Last Closed: 2019-01-09 14:56:01 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1664744
Attachments: The fluentd openfile watch logs (flags: none)

Description Peter Portante 2018-03-26 02:15:24 UTC
88 deleted journal files kept open by fluentd on one node of starter-ca-central-1. We'll likely need to fix the fluentd journal plugin the same way docker and rsyslog were fixed.


$ lsof -p $(pidof fluentd) | grep deleted | grep -F "/var/log" | wc -l
88

Comment 1 Peter Portante 2018-03-28 01:52:37 UTC
*** Bug 1561121 has been marked as a duplicate of this bug. ***

Comment 2 Noriko Hosoi 2018-03-29 20:04:14 UTC
If the issue is observed with log-driver=journald: fluent-plugin-systemd depends on the native C library for opening/closing journal files.  It might be an RFE for the C library?
https://www.freedesktop.org/software/systemd/man/sd_journal_open.html
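
For reference, a minimal sketch of the open/read/close cycle that the native library performs (my own illustration against the public sd-journal C API, not the plugin's actual code; link with -lsystemd):

#include <stdio.h>
#include <systemd/sd-journal.h>

int main(void) {
    sd_journal *j;
    /* sd_journal_open() is where the library acquires file
     * descriptors for the on-disk journal files. */
    int r = sd_journal_open(&j, SD_JOURNAL_LOCAL_ONLY);
    if (r < 0) {
        fprintf(stderr, "sd_journal_open failed: %d\n", r);
        return 1;
    }

    const void *data;
    size_t length;
    /* Iterate entries; rotated files stay referenced by the library
     * until it notices the rotation and releases them. */
    while (sd_journal_next(j) > 0) {
        if (sd_journal_get_data(j, "MESSAGE", &data, &length) >= 0)
            printf("%.*s\n", (int)length, (const char *)data);  /* prints "MESSAGE=..." */
    }

    /* sd_journal_close() releases every journal-file descriptor. */
    sd_journal_close(j);
    return 0;
}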

It looks like bz1561121 is facing the problem with the json file input.
The input plugin in_tail uses cool.io for managing the journal files.
I wonder if any errors from cool.io could be found in the fluentd log?

In_tail/cool.io refreshes the watch list every refresh_interval, by default 60 seconds, with a rotate_wait of 5 seconds.  If that many files are left open despite being deleted (for longer than the refresh_interval), there must be some problem preventing the refresh...

Comment 3 Peter Portante 2018-04-08 12:15:53 UTC
It does not appear to be a problem with json files, as the deleted files still held open by fluentd are all under /var/log/journal.

Next time I see this, I'll capture the list.

The partial fix is to make sure that any use of the sd_journal client API fetches the inotify file descriptor immediately after calling sd_journal_open().

The final fix is really in the sd_journal APIs themselves.
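
As a hedged sketch of that pattern (my illustration, not the plugin's code; link with -lsystemd), the fd fetch should follow the open with nothing in between:

#include <stdio.h>
#include <systemd/sd-journal.h>

int main(void) {
    sd_journal *j;
    if (sd_journal_open(&j, SD_JOURNAL_LOCAL_ONLY) < 0)
        return 1;

    /* Partial fix: request the inotify fd immediately.  The watch is
     * only created on the first sd_journal_get_fd() call, so any
     * rotation between the open and a late first call can be missed;
     * calling it right away shrinks (but does not eliminate) that
     * window. */
    int fd = sd_journal_get_fd(j);
    if (fd < 0) {
        sd_journal_close(j);
        return 1;
    }
    printf("inotify fd: %d\n", fd);

    /* ... poll() on fd and call sd_journal_process(j) in the event
     * loop so the library can react to rotation ... */

    sd_journal_close(j);
    return 0;
}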

Comment 6 Peter Portante 2018-04-09 22:37:22 UTC
My ruby fu is not good, but from what I can tell the ruby systemd code does not call sd_journal_get_fd() immediately after sd_journal_open().

If a client calls sd_journal_get_fd() as soon as possible after calling sd_journal_open(), the window for leaking FDs is closed significantly, but not entirely.  The only way to close that gap is with a change to the journal APIs so that sd_journal_open() creates the inotify FD at the time of the open.

Can somebody double check me here?

Comment 11 Peter Portante 2018-04-11 14:33:51 UTC
Closing and re-opening the journal is not the right fix.  This led to duplicate messages at a major customer, as that is what rsyslog used to do.  See https://github.com/rsyslog/rsyslog/commit/34b2d733699954e2ac770e8607c1694aacb51862
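
To illustrate why (a sketch under my own assumptions, not rsyslog's or fluentd's actual code): a naive close/re-open restarts reading at the head of the journal and replays every entry unless a cursor is saved and restored.  The cursor handling below is one way to avoid the replay, not the fix that was adopted:

#include <stdlib.h>
#include <systemd/sd-journal.h>

/* Re-open the journal without replaying already-read entries.
 * Without the saved cursor, the caller's read loop would start
 * over from the beginning -- the duplicate-message problem. */
static int reopen_at(sd_journal **j, const char *cursor) {
    sd_journal_close(*j);
    if (sd_journal_open(j, SD_JOURNAL_LOCAL_ONLY) < 0)
        return -1;
    if (cursor && sd_journal_seek_cursor(*j, cursor) >= 0)
        sd_journal_next(*j);  /* land on the last-read entry; the
                                 caller's loop then moves past it */
    return 0;
}

int main(void) {
    sd_journal *j;
    char *cursor = NULL;
    if (sd_journal_open(&j, SD_JOURNAL_LOCAL_ONLY) < 0)
        return 1;
    while (sd_journal_next(j) > 0) {
        free(cursor);
        sd_journal_get_cursor(j, &cursor);  /* remember our position */
    }
    reopen_at(&j, cursor);  /* without the cursor, everything replays */
    free(cursor);
    sd_journal_close(j);
    return 0;
}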

Comment 15 Anping Li 2018-05-21 09:00:01 UTC
The fix wasn't merged into the latest openshift3/logging-fluentd/images/v3.10.0-0.47.0.0.

Comment 16 ewolinet 2018-05-22 20:09:37 UTC
@Anping

It looks like the fix should be available in openshift3/logging-fluentd/images/v3.10.0-0.50.0.0

Comment 17 Anping Li 2018-05-30 03:12:43 UTC
It seems the fix wasn't merged, or doesn't work: 299 deleted journal files are being kept open.

openshift3/logging-fluentd/images/v3.10.0-0.53.0.0

[root@shared-ug-wmeng-master-etcd-zone2-1 fluentd]# lsof -p 95570 | grep deleted | grep -F "/var/log"  | wc -l
299

Comment 18 Anping Li 2018-05-30 03:16:57 UTC
systemd-219-57.el7.x86_64

[Journal]
 Storage=persistent
 Compress=True
 SyncIntervalSec=1s
 RateLimitInterval=1s
 RateLimitBurst=10000
 SystemMaxUse=8G
 SystemMaxFileSize=10M
 MaxRetentionSec=1month
 ForwardToSyslog=False
 ForwardToWall=False

Comment 22 Peter Portante 2018-06-05 13:59:35 UTC
Filed PR for this upstream: https://github.com/reevoo/fluent-plugin-systemd/pull/65

Comment 23 Jeff Cantrill 2018-06-07 18:43:12 UTC
Adding documentation:

# docker ps | grep fluent

and then to find the correct pid:

# ps -ef | grep <container hash of the one running "sh run.sh">

Comment 25 Anping Li 2018-06-08 08:19:19 UTC
Created attachment 1448991 [details]
The fluentd openfile watch logs

Both openshift3/logging-fluentd:v3.10.0-0.64.0.0 and logging-fluentd:journalpatch closed the rotated journal files. It took about 2 minutes to close 18 deleted files.

Steps: 
1. Watch the journal file and the fluentd opened deleted file
while true; do
  echo "##1 ls -lR /var/log/journal |wc -l"
  ls -lR /var/log/journal | wc -l
  echo "##2 ls -lR /var/log/journal"
  ls -lR /var/log/journal
  # the [f] keeps grep from matching its own command line
  fluent_id=$(ps -ef | grep '[f]luentd --no-supervisor' | awk 'NR==1 { print $2 }')
  echo "##3 suck journal logs"
  lsof -p "$fluent_id" | grep deleted
  echo done
done
2. Reduce the journal files
journalctl --vacuum-size=1

Comment 26 Jeff Cantrill 2018-06-08 12:31:53 UTC
Moving to ON_QA per #c25, with no intention of applying the supplemental change proposed in #c24.

Comment 27 Peter Portante 2018-06-08 17:35:18 UTC
Anping Li, can you use the mechanism in #c25 to show that it does occur with the currently released image?  If you can, great.  If you can't, we'll likely need to fix the reproducer.

Comment 28 Anping Li 2018-06-11 10:54:12 UTC
@peter,
The deleted files are closed when using the release image logging-fluentd/images/v3.7.52. I tried both #c25 and rotating the journal logs under load testing.

Any suggestions on how to reproduce it 100% of the time? Maybe I need to generate more complicated records to keep fluentd busy.


##1 ls -l /var/log/journal/5ee9c09efa004272a03befb58bf25779 |wc -l
232
##2 ls -lh /var/log/journal/5ee9c09efa004272a03befb58bf25779
total 1.9G
...............
................
##3 suck journal logs
fluentd 26488 root  441r      REG  253,0  8388608  8635532 /var/log/journal/5ee9c09efa004272a03befb58bf25779/system (deleted)
fluentd 26488 root  442r      REG  253,0  8388608  8635533 /var/log/journal/5ee9c09efa004272a03befb58bf25779/system (deleted)
fluentd 26488 root  443r      REG  253,0  8388608  8635534 /var/log/journal/5ee9c09efa004272a03befb58bf25779/system (deleted)

Comment 29 Anping Li 2018-06-12 06:35:15 UTC
@mike, can you watch for that in performance testing?

Comment 31 Mike Fiedler 2018-06-25 14:04:11 UTC
I have not seen this issue. I will continue to monitor for it in a new logging long run starting this week.

Comment 32 Anping Li 2018-07-02 03:02:07 UTC
With /logging-fluentd/images/v3.10.9-1, the deleted files are still being held open.

[root@starter-ca-central-1-master-692e9 ~]# lsof -p 39279 | grep deleted | grep -F "/var/log" 
fluentd 39279 root   97r      REG  253,3 10485760  16801226 /var/log/journal/d52c597d0f1a42aeb01b5a7d71e63f24/system (deleted)
fluentd 39279 root   98r      REG  253,3 10485760  16800995 /var/log/journal/d52c597d0f1a42aeb01b5a7d71e63f24/system (deleted)
fluentd 39279 root   99r      REG  253,3 10485760  16800319 /var/log/journal/d52c597d0f1a42aeb01b5a7d71e63f24/system (deleted)
fluentd 39279 root  101r      REG  253,3 10485760  16799243 /var/log/journal/d52c597d0f1a42aeb01b5a7d71e63f24/system (deleted)
fluentd 39279 root  102r      REG  253,3 10485760  16799244 /var/log/journal/d52c597d0f1a42aeb01b5a7d71e63f24/system (deleted)
fluentd 39279 root  103r      REG  253,3 10485760  16799245 /var/log/journal/d52c597d0f1a42aeb01b5a7d71e63f24/system (deleted)
fluentd 39279 root  104r      REG  253,3 10485760  16807680 /var/log/journal/d52c597d0f1a42aeb01b5a7d71e63f24/system (deleted)

Comment 35 Mike Fiedler 2018-07-16 17:34:25 UTC
@anli logging-fluentd 3.10.18 contains the gems in comment 34.  I am unable to reproduce the issue with that version of logging-fluentd.

Comment 37 Jeff Cantrill 2018-07-17 13:18:36 UTC
Based on #c35 and your observation in #c32, doesn't that mean you have verified the issue is resolved and we should move to close it out?

Comment 38 Mike Fiedler 2018-07-17 16:55:12 UTC
I think @anli is suggesting a final test in Online. If so, go ahead and move to ON_QA; if not, we can close it out and re-open if it occurs in Online.

Comment 39 Anping Li 2018-07-18 02:27:21 UTC
@Jeff, @Mike: yes, a final test in Online is expected.

Comment 42 Anping Li 2018-07-19 06:45:58 UTC
Moved to verified per comments 25, 34, and 41.
If you still hit this issue, feel free to reopen this bug.

Comment 44 errata-xmlrpc 2018-07-30 19:11:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1816

Comment 45 Justin Pierce 2019-01-08 20:45:50 UTC
Reopening since this issue has been observed on clusters running 3.11.44. I don't believe it has been fully resolved. 

[root@starter-us-east-1-node-compute-7edda log]# df -h /var
Filesystem              Size  Used Avail Use% Mounted on
/dev/mapper/rootvg-var   16G   14G  2.5G  85% /var
[root@starter-us-east-1-node-compute-7edda log]# du -shx /var
5.6G    /var

The next place I looked was at *open* files that are in a *deleted* state. Fluentd is holding onto 803 of these files.

[root@starter-us-east-1-node-compute-7edda log]# ps -ef | grep /usr/bin/fluentd
root     13551 13530  0 Jan03 ?        00:02:19 /usr/bin/ruby /usr/bin/fluentd --no-supervisor
root     29585  9099  0 16:57 pts/1    00:00:00 grep --color=auto /usr/bin/fluentd
[root@starter-us-east-1-node-compute-7edda log]# lsof -p 13551 | grep deleted | wc -l
803

Comment 46 Rich Megginson 2019-01-08 21:13:08 UTC
(In reply to Justin Pierce from comment #45)
> Reopening since this issue has been observed on clusters running 3.11.44. I
> don't believe it has been fully resolved. 
> 
> [root@starter-us-east-1-node-compute-7edda log]# df -h /var
> Filesystem              Size  Used Avail Use% Mounted on
> /dev/mapper/rootvg-var   16G   14G  2.5G  85% /var
> [root@starter-us-east-1-node-compute-7edda log]# du -shx /var
> 5.6G    /var
> 
> The next place I looked was at *open* files, that are in a *deleted* state. 
> Fluentd is holding onto 803 of these files.
> 
> [root@starter-us-east-1-node-compute-7edda log]# ps -ef | grep
> /usr/bin/fluentd
> root     13551 13530  0 Jan03 ?        00:02:19 /usr/bin/ruby
> /usr/bin/fluentd --no-supervisor
> root     29585  9099  0 16:57 pts/1    00:00:00 grep --color=auto
> /usr/bin/fluentd
> [root@starter-us-east-1-node-compute-7edda log]# lsof -p 13551 | grep
> deleted | wc -l
> 803

I thought we could not REOPEN bzs that have been CLOSED via an errata... if this is correct, please close this bz and clone it

Comment 47 Jeff Cantrill 2020-03-25 15:58:33 UTC
*** Bug 1817059 has been marked as a duplicate of this bug. ***