Bug 1632350

Summary: [starter-ca-central-1] NodeDiskRunningFull reports wrong mount?
Product: OpenShift Container Platform
Reporter: Justin Pierce <jupierce>
Component: Monitoring
Assignee: Frederic Branczyk <fbranczy>
Status: CLOSED ERRATA
QA Contact: Junqi Zhao <juzhao>
Severity: unspecified
Priority: unspecified
Version: 3.11.0
Target Milestone: ---
Target Release: 4.1.0
Hardware: Unspecified
OS: Unspecified
Last Closed: 2019-06-04 10:40:35 UTC
Type: Bug
Attachments:
alert in UI
[6d] listing for mountpoint
listing showing actual < 0 mountpoint (flags: none)

Description Justin Pierce 2018-09-24 15:42:32 UTC
Created attachment 1486444
alert in UI

Description of problem:

Receiving alerts on this cluster:
`device tmpfs on node is running full within the next 24 hours (mounted at /host/root/run/user/0)`

However, when I check this mount on the node (/run/user/0 on the host), it is completely empty:
[root@ip-172-31-17-171 ~]# df -h | grep /run/user
tmpfs                                   3.2G     0  3.2G   0% /run/user/0

Version-Release number of selected component (if applicable):
oc v3.11.0-0.21.0
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

How reproducible:
Steady state on cluster at present

Actual results:
Alert is presently firing.

Expected results:
No alert should fire: there is no danger of this partition filling, so the prediction appears to be inaccurate.

Additional info:

Comment 1 Justin Pierce 2018-09-24 15:45:08 UTC
Created attachment 1486445
[6d] listing for mountpoint

Comment 2 Justin Pierce 2018-09-24 15:48:09 UTC
Created attachment 1486459
listing showing actual < 0 mountpoint

The mountpoint whose predicted value is actually below 0 seems to be:

{device="/dev/mapper/rootvg-var_log",endpoint="https",fstype="xfs",instance="",job="node-exporter",mountpoint="/host/root/var/log",namespace="openshift-monitoring",pod="node-exporter-gs6rv",service="node-exporter"}	-145849006.69107854
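
The negative value above is consistent with a linear extrapolation of free space that crosses zero. The following is a minimal Python sketch of that kind of prediction, modeled loosely on PromQL's `predict_linear`. The actual alert expression is not quoted in this report, so the function below, the hourly samples, and the 24-hour horizon are assumptions for illustration only.

```python
from typing import Sequence, Tuple

GIB = 1024 ** 3


def predict_linear(samples: Sequence[Tuple[float, float]], horizon_s: float) -> float:
    """Fit a least-squares line through (timestamp_s, value) samples and
    evaluate it horizon_s seconds after the most recent sample."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    # Shift timestamps so the most recent sample sits at x = 0.
    t_last = samples[-1][0]
    xs = [t - t_last for t, _ in samples]
    ys = [v for _, v in samples]
    n = len(samples)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("samples must span more than one timestamp")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * horizon_s


# Hypothetical data: free space shrinking by 1 GiB per hour (e.g. a log volume).
shrinking = [(h * 3600.0, (10 - h) * GIB) for h in range(6)]
# Hypothetical data: an empty tmpfs whose free space never changes.
flat = [(h * 3600.0, 3.2 * GIB) for h in range(6)]

shrinking_prediction = predict_linear(shrinking, 24 * 3600)
flat_prediction = predict_linear(flat, 24 * 3600)

print("shrinking volume predicted below zero:", shrinking_prediction < 0)
print("flat tmpfs predicted below zero:", flat_prediction < 0)
```

Under these assumptions only the shrinking series extrapolates below zero. A constant-size, empty tmpfs does not, which matches the observation that the alert text names a mount that is in no danger of filling.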

Comment 3 Frederic Branczyk 2019-02-22 16:48:08 UTC
https://github.com/openshift/cluster-monitoring-operator/pull/173 pulled in the changes to appropriately ignore a device such as the one reported here. This will land in 4.0.
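
For illustration only, the idea of ignoring such devices can be sketched in Python as a label filter applied before any alert is evaluated. The pull request's actual rule change is not reproduced in this report. The ignored filesystem types, the mountpoint pattern, and the function name below are hypothetical.

```python
import re
from typing import Dict, Iterable, List

# Assumption: which filesystem types and mountpoints to skip is a guess
# made for this sketch, not the rule shipped by the referenced pull request.
IGNORED_FSTYPES = {"tmpfs"}
IGNORED_MOUNTPOINT_RE = re.compile(r"^/host/root/run/user/")


def alertable_series(series: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return only the label sets that should still be considered for a
    disk-full alert after dropping ignored filesystems and mountpoints."""
    kept = []
    for labels in series:
        if labels.get("fstype") in IGNORED_FSTYPES:
            continue
        if IGNORED_MOUNTPOINT_RE.match(labels.get("mountpoint", "")):
            continue
        kept.append(labels)
    return kept


# Label sets taken from this report (trimmed to the relevant labels).
reported = [
    {"device": "tmpfs", "fstype": "tmpfs", "mountpoint": "/host/root/run/user/0"},
    {
        "device": "/dev/mapper/rootvg-var_log",
        "fstype": "xfs",
        "mountpoint": "/host/root/var/log",
    },
]

remaining = alertable_series(reported)
print([labels["mountpoint"] for labels in remaining])
```

With the two series from this report, only the xfs volume mounted at /host/root/var/log remains eligible for the alert.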

Comment 5 Junqi Zhao 2019-03-08 07:33:51 UTC
Issue is fixed with

Comment 8 errata-xmlrpc 2019-06-04 10:40:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.