Bug 1331235 - deleted /var/log/messages occupied the disk space /var
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Pod
Version: 3.1.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assigned To: Seth Jennings
QA Contact: DeShuai Ma
Duplicates: 1333663
Depends On:
Blocks: OSOPS_V3
Reported: 2016-04-28 00:33 EDT by Zhiwu Liu
Modified: 2017-03-08 13 EST
CC List: 10 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-09-27 05:31:32 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


External Trackers
Tracker: Red Hat Product Errata
ID: RHBA-2016:1933
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat OpenShift Container Platform 3.3 Release Advisory
Last Updated: 2016-09-27 09:24:36 EDT

Description Zhiwu Liu 2016-04-28 00:33:26 EDT
Description of problem:
A number of deleted /var/log/messages files are still held open and continue to occupy disk space on /var.

Restarting the atomic-openshift-node service releases the occupied disk space.

Version-Release number of selected component (if applicable):
openshift v3.1.1.6-26-g9549be3
kubernetes v1.1.0-origin-1107-g4c8e6f4
etcd 2.1.2

How reproducible:
100% reproducible; almost all online clusters have this issue.

Steps to Reproduce:
1.
2.
3.

Actual results:
[root@vm1 ~]# lsof 2>/dev/null  | grep deleted | sort -k7 -n | tail -20
tuned        993  22369              root    6u      REG              202,2      4096   16818306 /tmp/ffie1MOIC (deleted)
openshift  65774 114440              root   21r      REG              202,3  56708073   25167127 /var/log/messages-20160405 (deleted)
openshift  65774  17723              root   21r      REG              202,3  56708073   25167127 /var/log/messages-20160405 (deleted)
openshift  65774  42340              root   21r      REG              202,3  56708073   25167127 /var/log/messages-20160405 (deleted)
openshift  65774  65686              root   21r      REG              202,3  56708073   25167127 /var/log/messages-20160405 (deleted)
openshift  65774  65775              root   21r      REG              202,3  56708073   25167127 /var/log/messages-20160405 (deleted)
openshift  65774  65776              root   21r      REG              202,3  56708073   25167127 /var/log/messages-20160405 (deleted)
openshift  65774  65777              root   21r      REG              202,3  56708073   25167127 /var/log/messages-20160405 (deleted)
openshift  65774  65778              root   21r      REG              202,3  56708073   25167127 /var/log/messages-20160405 (deleted)
openshift  65774  65779              root   21r      REG              202,3  56708073   25167127 /var/log/messages-20160405 (deleted)
openshift  65774  65780              root   21r      REG              202,3  56708073   25167127 /var/log/messages-20160405 (deleted)
openshift  65774  65783              root   21r      REG              202,3  56708073   25167127 /var/log/messages-20160405 (deleted)
openshift  65774  65831              root   21r      REG              202,3  56708073   25167127 /var/log/messages-20160405 (deleted)
openshift  65774  65835              root   21r      REG              202,3  56708073   25167127 /var/log/messages-20160405 (deleted)
openshift  65774  65898              root   21r      REG              202,3  56708073   25167127 /var/log/messages-20160405 (deleted)
openshift  65774  66653              root   21r      REG              202,3  56708073   25167127 /var/log/messages-20160405 (deleted)
openshift  65774   7534              root   21r      REG              202,3  56708073   25167127 /var/log/messages-20160405 (deleted)
tuned        993                     root    6u      REG              202,2      4096   16818306 /tmp/ffie1MOIC (deleted)
monitor    22534                     root    3w      REG              202,3    407452        271 /var/log/openvswitch/ovs-vswitchd.log-20160303 (deleted)
openshift  65774                     root   21r      REG              202,3  56708073   25167127 /var/log/messages-20160405 (deleted)

[root@vm1 ~]# cd /var
[root@vm1 var]# df -h .
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda3      8.0G  1.2G  6.9G  15% /var
[root@vm1 var]# du -sh 
1.1G	.
[root@vm1 var]# 
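
The lsof listing above repeats the same deleted file once per thread, so the raw line count overstates how many files are pinned. The following is a minimal sketch, not part of the original report, of one way to summarize such output: it keeps one entry per (device, inode) pair and totals the sizes. The function names are hypothetical. The parser reads columns from the right because the TID column is present on some lines and absent on others, and it assumes file names contain no spaces.

```python
def deleted_open_files(lsof_text):
    """Map (device, inode) -> (size_bytes, path) for deleted files still held open.

    Assumes the lsof column layout shown in this report:
    COMMAND PID [TID] USER FD TYPE DEVICE SIZE/OFF NODE NAME (deleted)
    """
    files = {}
    for line in lsof_text.splitlines():
        fields = line.split()
        # Shortest valid line (no TID column) has 10 whitespace-separated fields.
        if len(fields) < 10 or fields[-1] != "(deleted)":
            continue
        # Index from the right so the optional TID column does not matter.
        device, size, inode, path = fields[-5], fields[-4], fields[-3], fields[-2]
        if not size.isdigit():
            continue  # skip header lines or entries without a byte size
        files[(device, inode)] = (int(size), path)
    return files


def total_bytes(files):
    """Sum the sizes of the unique deleted-but-open files."""
    return sum(size for size, _path in files.values())
```

Applied to the listing above, this would report three distinct files (the tuned temp file, the ovs-vswitchd log, and messages-20160405), with the rotated messages file accounting for nearly all of the roughly 54 MiB still held.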


Expected results:


Additional info:
Comment 1 Andy Goldstein 2016-04-28 08:21:07 EDT
Maybe the cadvisor code isn't handling rotated logs well when it's checking for OOM events. We'll take a look.
Comment 2 Seth Jennings 2016-05-18 17:57:44 EDT
While we haven't definitively proven that the cadvisor code is the one holding the files open, I posted a PR upstream that makes cadvisor handle log rotation properly, closing and reopening the file so that it can be freed.

https://github.com/google/cadvisor/pull/1264
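
To illustrate the "close and reopen on rotation" idea described in this comment, here is a minimal Python sketch. It is not the cadvisor code (cadvisor is written in Go, and the actual change is in the PR above); the class and method names are hypothetical. It assumes a POSIX filesystem where rotation renames or deletes the old file and creates a new one at the same path, so the inode seen at the path differs from the inode of the handle still held.

```python
import os


class RotationAwareReader:
    """Follow a log file and reopen it once the path points at a new file."""

    def __init__(self, path):
        self.path = path
        self._file = open(path, "r")

    def _rotated(self):
        """True if the path now refers to a different file than the open handle."""
        try:
            on_disk = os.stat(self.path)
        except FileNotFoundError:
            # The old file was moved or deleted and no new one exists yet.
            # This sketch keeps the old handle until a replacement appears.
            return False
        held = os.fstat(self._file.fileno())
        return (on_disk.st_dev, on_disk.st_ino) != (held.st_dev, held.st_ino)

    def read_new_lines(self):
        """Return lines written since the last call, following rotation."""
        rotated = self._rotated()
        # Drain whatever is left in the currently held file first.
        lines = self._file.readlines()
        if rotated:
            # Closing the old handle lets the kernel free a deleted file's blocks.
            self._file.close()
            self._file = open(self.path, "r")
            lines += self._file.readlines()
        return lines

    def close(self):
        self._file.close()
```

A reader that never performs the close step keeps the rotated file's inode alive after logrotate deletes it, which matches the symptom in the lsof output in the description.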
Comment 3 Andy Goldstein 2016-05-27 11:17:08 EDT
PR to kube has merged: https://github.com/kubernetes/kubernetes/pull/25914. OpenShift will get this in whatever rebase contains that PR.
Comment 6 Seth Jennings 2016-06-16 12:46:30 EDT
*** Bug 1333663 has been marked as a duplicate of this bug. ***
Comment 7 Seth Jennings 2016-06-16 18:05:57 EDT
Origin rebase is complete:
https://github.com/openshift/origin/pull/8856

I verified it contains the upstream fix from comment 3.
Comment 8 Seth Jennings 2016-06-16 18:11:05 EDT
Sorry, looking at two bugs at once.  It actually looks like this doesn't come in as part of the rebase and that the origin cadvisor dep needs to be bumped.  Moving back to ASSIGNED.
Comment 12 Andy Goldstein 2016-07-22 14:33:24 EDT
This should be in the 3.3 builds now.
Comment 13 DeShuai Ma 2016-07-25 02:41:39 EDT
Tested on openshift v3.3.0.9.
The error no longer occurs. Verifying this bug.
On node:
[root@ip-172-18-8-37 ~]# lsof 2>/dev/null  | grep deleted
[root@ip-172-18-8-37 ~]#
Comment 15 errata-xmlrpc 2016-09-27 05:31:32 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1933
