Bug 1909004

Summary: "No datapoints found" for RHEL node's filesystem graph
Product: OpenShift Container Platform
Reporter: Junqi Zhao <juzhao>
Component: Management Console
Assignee: Seth Jennings <sjenning>
Status: CLOSED ERRATA
QA Contact: Yadan Pei <yapei>
Severity: medium
Priority: medium
Version: 4.7
CC: aos-bugs, jhadvig, jokerman, sjenning, yapei
Keywords: UpcomingSprint
Target Release: 4.7.0
Hardware: Unspecified
OS: Unspecified
Doc Type: No Doc Update
Type: Bug
Last Closed: 2021-02-24 15:46:32 UTC
Bug Blocks: 1914342    
Attachments:
  attachment 1740161: Filesystem graph shows error "No datapoints found" for rhel worker

Description Junqi Zhao 2020-12-18 07:06:25 UTC
Created attachment 1740161 [details]
Filesystem graph shows error "No datapoints found" for rhel worker

Description of problem:
On a UPI-on-baremetal OpenStack cluster with two RHEL 7.9 workers added, log in to the cluster console, go to "Compute -> Nodes", and select one of the RHEL workers: the Filesystem graph shows the error "No datapoints found". The issue only happens on the RHEL workers.

This is the total filesystem size query used for the RHEL worker 'piqin-1217-fjnpq-rhel-0':
sum(node_filesystem_size_bytes{instance='piqin-1217-fjnpq-rhel-0',mountpoint="/var"})
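
For reference, the same query can be run by hand against the in-cluster Prometheus; a minimal sketch, assuming the prometheus-k8s route in openshift-monitoring and a token with monitoring access (the route/token handling here is illustrative, not the console's code path):

  # query the cluster Prometheus directly (illustrative)
  TOKEN=$(oc whoami -t)
  HOST=$(oc -n openshift-monitoring get route prometheus-k8s -o jsonpath='{.spec.host}')
  curl -sk -H "Authorization: Bearer $TOKEN" "https://$HOST/api/v1/query" \
    --data-urlencode "query=sum(node_filesystem_size_bytes{instance='piqin-1217-fjnpq-rhel-0',mountpoint='/var'})"
  # returns an empty result vector on the RHEL worker, matching the console error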

but there is no mountpoint="/var" series for the RHEL worker:
sum(node_filesystem_size_bytes{instance='piqin-1217-fjnpq-rhel-0'}) by (mountpoint)
{mountpoint="/"}     64412954624
{mountpoint="/run"}  4100419584

There is a mountpoint="/var" series for all nodes except the RHEL workers; for example:
sum(node_filesystem_size_bytes{instance='piqin-1217-fjnpq-compute-0'}) by (mountpoint)
{mountpoint="/boot/efi"} 132888576
{mountpoint="/"}         85350920192
{mountpoint="/etc"}      85350920192
{mountpoint="/sysroot"}  85350920192
{mountpoint="/usr"}      85350920192
{mountpoint="/var"}      85350920192
{mountpoint="/run"}      8400769024
{mountpoint="/boot"}     381549568
{mountpoint="/tmp"}      8400769024

# oc get infrastructures/cluster -o jsonpath="{..status.platform}"
None

# oc get node --show-labels
NAME                               STATUS   ROLES    AGE   VERSION           LABELS
piqin-1217-fjnpq-compute-0         Ready    worker   28h   v1.19.2+e386040   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=piqin-1217-fjnpq-compute-0,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.openshift.io/os_id=rhcos
piqin-1217-fjnpq-compute-1         Ready    worker   28h   v1.19.2+e386040   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=piqin-1217-fjnpq-compute-1,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.openshift.io/os_id=rhcos
piqin-1217-fjnpq-control-plane-0   Ready    master   28h   v1.19.2+e386040   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=piqin-1217-fjnpq-control-plane-0,kubernetes.io/os=linux,node-role.kubernetes.io/master=,node.openshift.io/os_id=rhcos
piqin-1217-fjnpq-control-plane-1   Ready    master   28h   v1.19.2+e386040   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=piqin-1217-fjnpq-control-plane-1,kubernetes.io/os=linux,node-role.kubernetes.io/master=,node.openshift.io/os_id=rhcos
piqin-1217-fjnpq-control-plane-2   Ready    master   28h   v1.19.2+e386040   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=piqin-1217-fjnpq-control-plane-2,kubernetes.io/os=linux,node-role.kubernetes.io/master=,node.openshift.io/os_id=rhcos
piqin-1217-fjnpq-rhel-0            Ready    worker   26h   v1.20.0+87544c5   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=piqin-1217-fjnpq-rhel-0,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.openshift.io/os_id=rhel
piqin-1217-fjnpq-rhel-1            Ready    worker   26h   v1.20.0+87544c5   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=piqin-1217-fjnpq-rhel-1,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.openshift.io/os_id=rhel

# oc describe node piqin-1217-fjnpq-rhel-0
Name:               piqin-1217-fjnpq-rhel-0
Roles:              worker
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=piqin-1217-fjnpq-rhel-0
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/worker=
                    node.openshift.io/os_id=rhel
Annotations:        machineconfiguration.openshift.io/currentConfig: rendered-worker-e0400e0f533353886ce817a5fbe31542
                    machineconfiguration.openshift.io/desiredConfig: rendered-worker-e0400e0f533353886ce817a5fbe31542
                    machineconfiguration.openshift.io/ssh: accessed
                    machineconfiguration.openshift.io/state: Done
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Wed, 16 Dec 2020 22:44:12 -0500
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  piqin-1217-fjnpq-rhel-0
  AcquireTime:     <unset>
  RenewTime:       Fri, 18 Dec 2020 02:04:23 -0500
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Fri, 18 Dec 2020 02:03:03 -0500   Wed, 16 Dec 2020 22:44:12 -0500   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Fri, 18 Dec 2020 02:03:03 -0500   Wed, 16 Dec 2020 22:44:12 -0500   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Fri, 18 Dec 2020 02:03:03 -0500   Wed, 16 Dec 2020 22:44:12 -0500   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            True    Fri, 18 Dec 2020 02:03:03 -0500   Wed, 16 Dec 2020 22:44:42 -0500   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:  10.0.98.83
  Hostname:    piqin-1217-fjnpq-rhel-0
Capacity:
  cpu:                4
  ephemeral-storage:  62903276Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             8008632Ki
  pods:               250
Allocatable:
  cpu:                3500m
  ephemeral-storage:  56897917242
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             6857656Ki
  pods:               250
System Info:
  Machine ID:                             49e60791ac874413b11b229248695db5
  System UUID:                            49E60791-AC87-4413-B11B-229248695DB5
  Boot ID:                                17162f44-3297-4ad8-94d3-632db132f357
  Kernel Version:                         3.10.0-1160.11.1.el7.x86_64
  OS Image:                               Red Hat Enterprise Linux Server 7.9 (Maipo)
  Operating System:                       linux
  Architecture:                           amd64
  Container Runtime Version:              cri-o://1.20.0-0.rhaos4.7.gitf3390f3.el7.19-dev
  Kubelet Version:                        v1.20.0+87544c5
  Kube-Proxy Version:                     v1.20.0+87544c5



Version-Release number of selected component (if applicable):
4.7.0-0.nightly-2020-12-14-165231

How reproducible:
Reproducible on a UPI-on-baremetal OpenStack cluster with two RHEL 7.9 workers added.

Steps to Reproduce:
1. Set up a UPI-on-baremetal OpenStack cluster and add two RHEL 7.9 workers.
2. In the console, go to "Compute -> Nodes" and select a RHEL worker.
3. Check the Filesystem graph on the node's overview page.

Actual results:
The Filesystem graph shows the error "No datapoints found" for the RHEL workers.

Expected results:
The Filesystem graph shows filesystem usage data for the RHEL workers, as it does for the RHCOS nodes.

Additional info:

Comment 1 Jakub Hadvig 2020-12-18 10:59:22 UTC
Looks like an aftermath of https://github.com/openshift/console/pull/7201, which affects bare metal.

Please select the Target Release once we are confident that the issue will be fixed in selected release. Thanks.

Comment 3 Jakub Hadvig 2020-12-21 12:06:32 UTC
Not really sure what the solution should be, but it sounds odd that a user can't see stats for a RHEL node, which I thought was a primary node type.

Comment 7 Yadan Pei 2021-01-14 05:56:41 UTC
1. Create 4.7 cluster and add RHEL worker nodes
2. Check the filesystem usage query for RHCOS and RHEL worker nodes


Filesystem usage for the RHEL node is also shown correctly; the query now used is:
sum(node_filesystem_size_bytes{instance="xxx-rhel-0",mountpoint="/"} - node_filesystem_avail_bytes{instance="xxx-rhel-0",mountpoint="/"})
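
For completeness, the verified query can also be run by hand; a sketch assuming the same route/token setup as in the description, with 'xxx-rhel-0' kept as the placeholder instance label from this comment:

  # run the fixed used-space query against Prometheus (illustrative)
  TOKEN=$(oc whoami -t)
  HOST=$(oc -n openshift-monitoring get route prometheus-k8s -o jsonpath='{.spec.host}')
  curl -sk -H "Authorization: Bearer $TOKEN" "https://$HOST/api/v1/query" \
    --data-urlencode "query=sum(node_filesystem_size_bytes{instance='xxx-rhel-0',mountpoint='/'} - node_filesystem_avail_bytes{instance='xxx-rhel-0',mountpoint='/'})"
  # mountpoint='/' exists on both RHEL and RHCOS nodes, so this returns data everywhere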

Verified on 4.7.0-0.nightly-2021-01-13-054018

Comment 8 Yadan Pei 2021-01-14 05:57:41 UTC
The query sum(node_filesystem_size_bytes{instance="xxxx",mountpoint="/"} - node_filesystem_avail_bytes{instance="xxxx",mountpoint="/"}) is also applicable to RHCOS nodes.

Comment 11 errata-xmlrpc 2021-02-24 15:46:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633