Bug 1632350 - [starter-ca-central-1] NodeDiskRunningFull reports wrong mount?
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Monitoring
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 4.1.0
Assignee: Frederic Branczyk
QA Contact: Junqi Zhao
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-09-24 15:42 UTC by Justin Pierce
Modified: 2019-06-04 10:40 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-06-04 10:40:35 UTC
Target Upstream Version:


Attachments
alert in UI (65.14 KB, image/png)
2018-09-24 15:42 UTC, Justin Pierce
[6d] listing for mountpoint (16.64 KB, text/plain)
2018-09-24 15:45 UTC, Justin Pierce
listing showing actual < 0 mountpoint (4.87 KB, text/plain)
2018-09-24 15:48 UTC, Justin Pierce


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2019:0758 None None None 2019-06-04 10:40:45 UTC

Description Justin Pierce 2018-09-24 15:42:32 UTC
Created attachment 1486444 [details]
alert in UI

Description of problem:

Receiving alerts on this cluster:
`device tmpfs on node 172.31.17.171:9100 is running full within the next 24 hours (mounted at /host/root/run/user/0)`

But checking this mount on the node shows it is completely empty:
[root@ip-172-31-17-171 ~]# df -h | grep /run/user
tmpfs                                   3.2G     0  3.2G   0% /run/user/0
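For context, NodeDiskRunningFull is a linear-extrapolation alert. The exact expression shipped with 3.11 is not quoted in this report, but the rule has roughly this shape (metric name and label selectors below are assumptions, not the verbatim shipped rule):

```promql
# Sketch of the alert's general shape: fire if a linear fit over
# recent samples predicts free space dropping below zero within 24h.
predict_linear(node_filesystem_free{job="node-exporter"}[6h], 24 * 60 * 60) < 0
```

With the tmpfs mount sitting at 0% usage, a below-zero prediction for it would be surprising, which is why the alert text naming this mount looks wrong.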


Version-Release number of selected component (if applicable):
oc v3.11.0-0.21.0
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

How reproducible:
The alert fires continuously in the cluster's current steady state.

Actual results:
Alert is presently firing.

Expected results:
The alert should not fire: there is no danger of this partition filling, so the prediction appears inaccurate.

Additional info:

Comment 1 Justin Pierce 2018-09-24 15:45:08 UTC
Created attachment 1486445 [details]
[6d] listing for mountpoint

Comment 2 Justin Pierce 2018-09-24 15:48:09 UTC
Created attachment 1486459 [details]
listing showing actual < 0 mountpoint

The mountpoint actually predicted to go below zero seems to be:

{device="/dev/mapper/rootvg-var_log",endpoint="https",fstype="xfs",instance="172.31.17.171:9100",job="node-exporter",mountpoint="/host/root/var/log",namespace="openshift-monitoring",pod="node-exporter-gs6rv",service="node-exporter"}	-145849006.69107854
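A negative value here means the linear fit projects that filesystem dropping below zero free bytes within the prediction window, i.e. /host/root/var/log is the series driving the alert, not the tmpfs mount named in the alert text. A listing like the attached one can be reproduced with a query along these lines (the 6d range window is an assumption taken from the "[6d]" attachment title, and the metric may be named node_filesystem_free or node_filesystem_free_bytes depending on the node-exporter version):

```promql
# Per-mountpoint 24h free-space prediction for this node.
predict_linear(node_filesystem_free{instance="172.31.17.171:9100"}[6d], 24 * 60 * 60)
```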

Comment 3 Frederic Branczyk 2019-02-22 16:48:08 UTC
https://github.com/openshift/cluster-monitoring-operator/pull/173 pulled in the changes to appropriately ignore a device such as the one reported here. This will land in 4.0.
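That change brings the rules in line with later behavior, where ephemeral filesystems are excluded from the disk-full prediction. The exact exclusion list is an assumption here, but the corrected rule is of roughly this form:

```promql
# Exclude tmpfs-style ephemeral filesystems from the prediction
# (selector values are illustrative, not the exact shipped rule).
predict_linear(node_filesystem_free{job="node-exporter",fstype!~"tmpfs|rootfs"}[6h], 24 * 60 * 60) < 0
```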

Comment 5 Junqi Zhao 2019-03-08 07:33:51 UTC
Issue is fixed in 4.0.0-0.nightly-2019-03-06-074438.

Comment 8 errata-xmlrpc 2019-06-04 10:40:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758

