Journald is set to limit its disk usage to 8G. It believes it has done so, but services running in containers keep the deleted journal files open, so the space is never released.

[root@ocp-compute-004 ~]# journalctl --disk-usage
Archived and active journals take up 8.0G on disk.

[root@ocp-compute-004 ~]# df -h /var/log
Filesystem                Size  Used Avail Use% Mounted on
/dev/mapper/rhel-var_log   15G   15G  698M  96% /var/log

[root@ocp-compute-004 ~]# du -a /var/log |sort -nr |grep -v journal|head -5
8628792 /var/log
89088   /var/log/messages
38992   /var/log/audit
8196    /var/log/audit/audit.log.4
8196    /var/log/audit/audit.log.3

[root@ocp-compute-004 ~]# journalctl --disk-usage
Archived and active journals take up 8.0G on disk.

[root@ocp-compute-004 ~]# grep deleted /tmp/lsof.out |grep /var/log |grep -v fluentd |head -2
ruby-time 22524 22634 root 58r REG 253,4 8388608 4230832 /var/log/journal/ee906aa861254910b7717373743348e3/system (deleted)
ruby-time 22524 22634 root 67r REG 253,4 8388608 4194391 /var/log/journal/ee906aa861254910b7717373743348e3/system (deleted)

[root@ocp-compute-004 ~]# grep deleted /tmp/lsof.out |grep /var/log |head -2
fluentd 22524 root 58r REG 253,4 8388608 4230832 /var/log/journal/ee906aa861254910b7717373743348e3/system (deleted)
fluentd 22524 root 67r REG 253,4 8388608 4194391 /var/log/journal/ee906aa861254910b7717373743348e3/system (deleted)

[root@ocp-compute-004 ~]# grep deleted /tmp/lsof.out |grep /var/log |grep -v fluentd |grep -v ruby

[ecrosby@ecrosby-localdomain Domino]$ oc get pods --all-namespaces -o wide |grep compute-004 |grep fluent
openshift-logging logging-fluentd-cfcd8 1/1 Running 0 15d 10.130.3.48 ocp-compute-004.fqdn <none>

[ecrosby@ecrosby-localdomain Domino]$ oc delete pod logging-fluentd-cfcd8 -n openshift-logging
pod "logging-fluentd-cfcd8" deleted

[ecrosby@ecrosby-localdomain Domino]$ oc get pods --all-namespaces -o wide |grep compute-004 |grep fluent
openshift-logging logging-fluentd-dl29g 1/1 Running 0 17s 10.130.2.211 ocp-compute-004.fqdn <none>

[root@ocp-compute-004 ~]# df -h /var/log
Filesystem                Size  Used Avail Use% Mounted on
/dev/mapper/rhel-var_log   15G  8.3G  6.8G  56% /var/log

[root@ocp-compute-004 ~]# lsof > /tmp/lsof.good.out
[root@ocp-compute-004 ~]# grep deleted /tmp/lsof.good.out |grep /var/log
rsyslogd 128442 root 29r REG 253,4 8388608 4251471 /var/log/journal/ee906aa861254910b7717373743348e3/system (deleted)
rsyslogd 128442 root 251r REG 253,4 8388608 4251470 /var/log/journal/ee906aa861254910b7717373743348e3/system (deleted)
in:imjour 128442 128455 root 29r REG 253,4 8388608 4251471 /var/log/journal/ee906aa861254910b7717373743348e3/system (deleted)
in:imjour 128442 128455 root 251r REG 253,4 8388608 4251470 /var/log/journal/ee906aa861254910b7717373743348e3/system (deleted)
rs:main 128442 128456 root 29r REG 253,4 8388608 4251471 /var/log/journal/ee906aa861254910b7717373743348e3/system (deleted)
rs:main 128442 128456 root 251r REG 253,4 8388608 4251470 /var/log/journal/ee906aa861254910b7717373743348e3/system (deleted)

[root@ocp-compute-004 ~]# rpm -qa |grep systemd
systemd-libs-219-67.el7_7.3.x86_64
oci-systemd-hook-0.2.0-1.git05e6923.el7_6.x86_64
systemd-219-67.el7_7.3.x86_64
systemd-sysv-219-67.el7_7.3.x86_64

[root@ocp-compute-004 ~]# rpm -qa |grep openshift-ansible
openshift-ansible-docs-3.11.154-2.git.0.1640c49.el7.noarch
openshift-ansible-roles-3.11.154-2.git.0.1640c49.el7.noarch
openshift-ansible-playbooks-3.11.154-2.git.0.1640c49.el7.noarch
openshift-ansible-3.11.154-2.git.0.1640c49.el7.noarch
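
The space held by deleted journal files can be estimated from a saved lsof listing such as the /tmp/lsof.out used in the report. The following is a minimal sketch, not part of the original report: the function name deleted_journal_bytes is hypothetical, and it assumes lsof's default column layout, where the last four fields of a deleted file's line are size, inode, path and "(deleted)". Fields are read from the end of each line because the thread rows in the report carry an extra TID column, and each device/inode pair is counted once because the same file shows up once per thread.

```shell
# Hypothetical helper: total bytes of deleted-but-still-open files under
# /var/log/journal, read from saved lsof output (file arguments or stdin).
deleted_journal_bytes() {
    awk '
        # Match lines such as:
        #   fluentd 22524 root 58r REG 253,4 8388608 4230832 /var/log/journal/<id>/system (deleted)
        NF >= 5 && $NF == "(deleted)" && $(NF-1) ~ /^\/var\/log\/journal\// {
            key = $(NF-4) SUBSEP $(NF-2)   # device + inode identifies the file
            if (!(key in seen)) {
                seen[key] = 1
                total += $(NF-3)           # size in bytes
            }
        }
        END { printf "%.0f\n", total }
    ' "$@"
}
```

For example, `deleted_journal_bytes /tmp/lsof.out` prints a single byte count, which can be compared with the gap between the `df` and `journalctl --disk-usage` figures.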
Similar issue. I have only identified this issue with fluentd so far. https://bugzilla.redhat.com/show_bug.cgi?id=1560358
Closing as this is a duplicate, but it ultimately also relies on the fix proposed in https://bugzilla.redhat.com/show_bug.cgi?id=1812889. The workaround is to periodically cycle fluentd [1]. I believe there is an associated kbase article that essentially documents [1].

[1] https://github.com/openshift/origin-aggregated-logging/blob/master/docs/troubleshooting.md#fluentd-is-holding-onto-deleted-journald-files-that-have-been-rotated

*** This bug has been marked as a duplicate of bug 1560358 ***
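
The cycling workaround can be sketched as a small shell function that repeats the manual steps from the original report for one node. This is only an illustration under stated assumptions: the function name cycle_fluentd_on_node is hypothetical, the namespace openshift-logging and the pod name prefix logging-fluentd are taken from the transcript in the report and may differ on other clusters, and it relies on the pod being recreated automatically after deletion, as happened in that transcript.

```shell
# Hypothetical helper: delete the fluentd pod(s) on one node so that they are
# recreated and release any deleted journal files they were holding open.
# Assumptions: namespace and pod name prefix match the report's transcript.
cycle_fluentd_on_node() {
    node="$1"                        # node name or a unique substring of it
    ns="${2:-openshift-logging}"     # logging namespace (assumed default)
    oc get pods -n "$ns" -o wide --no-headers |
        awk -v node="$node" '$1 ~ /^logging-fluentd/ && index($0, node) { print $1 }' |
        while read -r pod; do
            # stdin is redirected so oc cannot consume the remaining pod names
            oc delete pod "$pod" -n "$ns" </dev/null
        done
}
```

Calling `cycle_fluentd_on_node compute-004` would mirror the `oc delete pod logging-fluentd-cfcd8 -n openshift-logging` step from the report. Running it on a schedule, for instance from cron, is one way to make the cycling periodic.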
The bug this was dup-ed against was fixed in 3.10.z. It looks like this bug is in 3.11. Did the fix make it to 3.11?
I am still having the same problem in 3.11.98. Any updates?