Bug 1812529

Summary: console dashboard shows wrong filesystem size on s390x
Product: OpenShift Container Platform Reporter: Alexander Klein <alklein>
Component: Management Console Assignee: Rastislav Wagner <rawagner>
Status: CLOSED DUPLICATE QA Contact: Yadan Pei <yapei>
Severity: medium Docs Contact:
Priority: medium    
Version: 4.3.z CC: alegrand, anpicker, aos-bugs, danili, erooth, harpatil, Holger.Wolf, hwolf, jokerman, kakkoyun, lcosic, mloibl, nagrawal, nbziouec, nthomas, pkrupa, rawagner, rdave, rphillips, surbania, tjelinek, vjaypurk, yzamir
Target Milestone: ---   
Target Release: 4.7.0   
Hardware: All   
OS: Unspecified   
Whiteboard: multi-arch
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-11-18 10:48:45 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Alexander Klein 2020-03-11 14:05:15 UTC
Description of problem:
For a cluster (4.3.0-0.nightly-s390x-2020-03-09-183623) with 5 nodes (3x master, 2x worker), each installed to a 120Gi SCSI (FCP) disk, the dashboard on the console displays a wrong total filesystem size of 2.22Ti instead of 0.6Ti.

The same is true for another cluster (4.2.20) with 53 nodes (3x master, 50x worker), each installed to disks of the same size: the dashboard on the console displays 23.27Ti instead of 6.63Ti.

Comment 1 Yaacov Zamir 2020-05-12 04:43:17 UTC
Hi, it looks like the offending Prometheus query is:

`sum(node_filesystem_size_bytes)` [1]

[1] https://github.com/openshift/console/blob/master/frontend/public/components/dashboard/dashboards-page/cluster-dashboard/queries.ts#L78

a - Moving to Management Console because the offending code is in the public code tree.

b - This may be a monitoring problem; it looks like the data in Prometheus may already be wrong?

----

@Alexander hi,

I moved this to Management Console, but maybe this should go to Monitoring instead, because it looks like the Prometheus data is wrong?
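
For context on what the query in Comment 1 computes: `sum(node_filesystem_size_bytes)` adds up every series of that metric, and node_exporter emits one series per mounted filesystem (with labels such as `instance`, `device`, `mountpoint` and `fstype`). The sketch below uses made-up sample values, not data from any cluster in this report, to show how such a sum can exceed the physical disk size if one device is reported under several mountpoints. Whether that is what happens here is an assumption; this report does not confirm the cause.

```python
from collections import defaultdict

GIB = 2**30

# Hypothetical samples of node_filesystem_size_bytes for ONE node.
# Devices, mountpoints and sizes are illustrative only, not taken from this bug.
samples = [
    {"instance": "node-0", "device": "/dev/sda4", "mountpoint": "/sysroot", "value": 119 * GIB},
    {"instance": "node-0", "device": "/dev/sda4", "mountpoint": "/var", "value": 119 * GIB},
    {"instance": "node-0", "device": "/dev/sda3", "mountpoint": "/boot", "value": 1 * GIB},
]


def naive_sum(samples):
    """Mimic sum(metric): add every series, one per mounted filesystem."""
    return sum(s["value"] for s in samples)


def sum_per_device(samples):
    """Count each (instance, device) pair once, using its largest reported size."""
    sizes = defaultdict(int)
    for s in samples:
        key = (s["instance"], s["device"])
        sizes[key] = max(sizes[key], s["value"])
    return sum(sizes.values())


print(naive_sum(samples) / GIB)       # 239.0
print(sum_per_device(samples) / GIB)  # 120.0
```

A PromQL analogue of the second function would be `sum(max by (instance, device) (node_filesystem_size_bytes))`. It is shown only to illustrate the idea and is not necessarily the fix adopted for this bug.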

Comment 2 Yaacov Zamir 2020-05-12 04:50:41 UTC
@Tomas FYI

Comment 3 Tomas Jelinek 2020-05-12 05:32:10 UTC
@Rastislav: PTAL

Comment 4 Rastislav Wagner 2020-05-12 06:43:26 UTC
@Alexander, this is probably a monitoring issue, but to make sure, can you give us the output of the `sum(node_filesystem_size_bytes)` query directly from Prometheus?
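
One way to run the query directly is through the Prometheus HTTP API, whose instant-query endpoint is `/api/v1/query`. The following is a minimal sketch, not the procedure used in this report: the base URL and the bearer token are assumptions that depend on how Prometheus is exposed in the cluster, and the helper names are illustrative.

```python
import json
import urllib.parse
import urllib.request

QUERY = "sum(node_filesystem_size_bytes)"


def build_query_url(base_url, query=QUERY):
    """Build the instant-query URL for a Prometheus server."""
    params = urllib.parse.urlencode({"query": query})
    return f"{base_url.rstrip('/')}/api/v1/query?{params}"


def extract_value(payload):
    """Return the single sample value from an instant-query JSON response."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload}")
    result = payload["data"]["result"]
    if len(result) != 1:
        raise ValueError(f"expected exactly one series, got {len(result)}")
    _timestamp, value = result[0]["value"]  # the value arrives as a string
    return float(value)


def run_query(base_url, token=None, query=QUERY):
    """Send the query and return its value in bytes.

    base_url and token are assumptions: use whatever route and
    credentials your cluster's Prometheus requires.
    """
    request = urllib.request.Request(build_query_url(base_url, query))
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request) as response:
        return extract_value(json.load(response))
```

Calling `run_query("<prometheus-url>", token="<token>")` with real values returns the raw byte count, which can then be compared with the figure on the dashboard.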

Comment 5 Rastislav Wagner 2020-05-21 06:41:28 UTC
Moving to Monitoring; we don't do anything fancy on the UI side, we just query Prometheus.

Comment 7 Alexander Klein 2020-05-27 14:20:46 UTC
Tried this on another cluster with 21 nodes, each with a 120GiB FCP LUN.
`sum(node_filesystem_size_bytes)` shows 9281955530752, which equals the 8.44TiB shown on the dashboard but not the actual disk space.
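
The figures in Comment 7 can be checked with a few lines of arithmetic. This sketch assumes binary units (1 GiB = 2^30 bytes, 1 TiB = 2^40 bytes) and that the 21 LUNs are the only storage expected to count towards the total; the variable names are illustrative.

```python
GIB = 2**30
TIB = 2**40

reported_bytes = 9281955530752  # sum(node_filesystem_size_bytes) from Comment 7
nodes = 21
lun_gib = 120

reported_tib = reported_bytes / TIB
expected_tib = nodes * lun_gib * GIB / TIB

print(f"reported: {reported_tib:.2f} TiB")             # reported: 8.44 TiB
print(f"expected: {expected_tib:.2f} TiB")             # expected: 2.46 TiB
print(f"ratio:    {reported_tib / expected_tib:.2f}")  # ratio:    3.43
```

So the dashboard faithfully shows what Prometheus returns, and that value is roughly 3.4 times the capacity of the attached disks.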

Comment 10 Dan Li 2020-07-28 20:55:21 UTC
Adding "UpcomingSprint" label as this bug is unlikely to be fixed during this sprint.

Comment 16 Rastislav Wagner 2020-10-01 07:09:29 UTC
The same issue occurs on x86, on a cluster deployed with the Assisted Installer service (baremetal, cloud.redhat.com).
Thus changing platforms to all and severity to high.

Comment 17 Dan Li 2020-10-01 12:13:37 UTC
If this is all platforms, should we change the component of this bug to its corresponding team instead of multi-arch only?

Comment 18 Rastislav Wagner 2020-10-01 16:15:17 UTC
right, reassigning to Node team.

Comment 30 Rastislav Wagner 2020-11-18 10:48:45 UTC
Marking as a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1893601, which already has an open PR.

*** This bug has been marked as a duplicate of bug 1893601 ***