Created attachment 1740161 [details]
Filesystem graph shows error "No datapoints found" for rhel worker

Description of problem:
On a upi-on-baremetal OpenStack cluster with two RHEL 7.9 workers added, log in to the cluster console, go to "Compute -> Nodes", and select one of the RHEL workers: the Filesystem graph shows the error "No datapoints found". The issue only happens with the RHEL workers.

This is the total filesystem size query for the RHEL worker 'piqin-1217-fjnpq-rhel-0':

sum(node_filesystem_size_bytes{instance='piqin-1217-fjnpq-rhel-0',mountpoint="/var"})

but there is no mountpoint="/var" on the RHEL worker:

sum(node_filesystem_size_bytes{instance='piqin-1217-fjnpq-rhel-0'}) by (mountpoint)

{mountpoint="/"}     64412954624
{mountpoint="/run"}  4100419584

mountpoint="/var" does exist on all nodes except the RHEL workers, for example:

sum(node_filesystem_size_bytes{instance='piqin-1217-fjnpq-compute-0'}) by (mountpoint)

{mountpoint="/boot/efi"}  132888576
{mountpoint="/"}          85350920192
{mountpoint="/etc"}       85350920192
{mountpoint="/sysroot"}   85350920192
{mountpoint="/usr"}       85350920192
{mountpoint="/var"}       85350920192
{mountpoint="/run"}       8400769024
{mountpoint="/boot"}      381549568
{mountpoint="/tmp"}       8400769024

# oc get infrastructures/cluster -o jsonpath="{..status.platform}"
None

# oc get node --show-labels
NAME                               STATUS   ROLES    AGE   VERSION           LABELS
piqin-1217-fjnpq-compute-0         Ready    worker   28h   v1.19.2+e386040   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=piqin-1217-fjnpq-compute-0,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.openshift.io/os_id=rhcos
piqin-1217-fjnpq-compute-1         Ready    worker   28h   v1.19.2+e386040   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=piqin-1217-fjnpq-compute-1,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.openshift.io/os_id=rhcos
piqin-1217-fjnpq-control-plane-0   Ready    master   28h   v1.19.2+e386040   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=piqin-1217-fjnpq-control-plane-0,kubernetes.io/os=linux,node-role.kubernetes.io/master=,node.openshift.io/os_id=rhcos
piqin-1217-fjnpq-control-plane-1   Ready    master   28h   v1.19.2+e386040   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=piqin-1217-fjnpq-control-plane-1,kubernetes.io/os=linux,node-role.kubernetes.io/master=,node.openshift.io/os_id=rhcos
piqin-1217-fjnpq-control-plane-2   Ready    master   28h   v1.19.2+e386040   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=piqin-1217-fjnpq-control-plane-2,kubernetes.io/os=linux,node-role.kubernetes.io/master=,node.openshift.io/os_id=rhcos
piqin-1217-fjnpq-rhel-0            Ready    worker   26h   v1.20.0+87544c5   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=piqin-1217-fjnpq-rhel-0,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.openshift.io/os_id=rhel
piqin-1217-fjnpq-rhel-1            Ready    worker   26h   v1.20.0+87544c5   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=piqin-1217-fjnpq-rhel-1,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.openshift.io/os_id=rhel

# oc describe node piqin-1217-fjnpq-rhel-0
Name:               piqin-1217-fjnpq-rhel-0
Roles:              worker
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=piqin-1217-fjnpq-rhel-0
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/worker=
                    node.openshift.io/os_id=rhel
Annotations:        machineconfiguration.openshift.io/currentConfig: rendered-worker-e0400e0f533353886ce817a5fbe31542
                    machineconfiguration.openshift.io/desiredConfig: rendered-worker-e0400e0f533353886ce817a5fbe31542
                    machineconfiguration.openshift.io/ssh: accessed
                    machineconfiguration.openshift.io/state: Done
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Wed, 16 Dec 2020 22:44:12 -0500
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  piqin-1217-fjnpq-rhel-0
  AcquireTime:     <unset>
  RenewTime:       Fri, 18 Dec 2020 02:04:23 -0500
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Fri, 18 Dec 2020 02:03:03 -0500   Wed, 16 Dec 2020 22:44:12 -0500   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Fri, 18 Dec 2020 02:03:03 -0500   Wed, 16 Dec 2020 22:44:12 -0500   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Fri, 18 Dec 2020 02:03:03 -0500   Wed, 16 Dec 2020 22:44:12 -0500   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            True    Fri, 18 Dec 2020 02:03:03 -0500   Wed, 16 Dec 2020 22:44:42 -0500   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:  10.0.98.83
  Hostname:    piqin-1217-fjnpq-rhel-0
Capacity:
  cpu:                4
  ephemeral-storage:  62903276Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             8008632Ki
  pods:               250
Allocatable:
  cpu:                3500m
  ephemeral-storage:  56897917242
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             6857656Ki
  pods:               250
System Info:
  Machine ID:                 49e60791ac874413b11b229248695db5
  System UUID:                49E60791-AC87-4413-B11B-229248695DB5
  Boot ID:                    17162f44-3297-4ad8-94d3-632db132f357
  Kernel Version:             3.10.0-1160.11.1.el7.x86_64
  OS Image:                   Red Hat Enterprise Linux Server 7.9 (Maipo)
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  cri-o://1.20.0-0.rhaos4.7.gitf3390f3.el7.19-dev
  Kubelet Version:            v1.20.0+87544c5
  Kube-Proxy Version:         v1.20.0+87544c5

Version-Release number of selected component (if applicable):
4.7.0-0.nightly-2020-12-14-165231

How reproducible:
upi-on-baremetal OpenStack cluster with two RHEL 7.9 workers added

Steps to Reproduce:
1. Set up a upi-on-baremetal OpenStack cluster and add two RHEL 7.9 workers
2. Log in to the cluster console, go to "Compute -> Nodes", and select one of the RHEL workers
3. Check the Filesystem graph on the node's details page

Actual results:
The Filesystem graph shows the error "No datapoints found" for the RHEL workers.

Expected results:
The Filesystem graph shows data for the RHEL workers, as it does for the RHCOS nodes.

Additional info:
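For reference, the mismatch can be confirmed with a minimal pair of queries (a sketch; run them from the console's Metrics page or the Prometheus UI, substituting your own instance name):

# Returns no series on the RHEL 7.9 worker, since /var is not a separate filesystem there:
sum(node_filesystem_size_bytes{instance='piqin-1217-fjnpq-rhel-0',mountpoint="/var"})

# Returns data on every node type, since / is always mounted:
sum(node_filesystem_size_bytes{instance='piqin-1217-fjnpq-rhel-0',mountpoint="/"})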
Looks like an aftermath of https://github.com/openshift/console/pull/7201, which is affecting bare metal.

Please select the Target Release once we are confident that the issue will be fixed in the selected release. Thanks.
Not really sure what the solution should be, but it sounds odd that the user can't see stats for a RHEL node, which I thought was the primary node type.
1. Create a 4.7 cluster and add RHEL worker nodes
2. Check the filesystem usage query for RHCOS and RHEL worker nodes

Filesystem usage for the RHEL node is also shown correctly:

sum(node_filesystem_size_bytes{instance="xxx-rhel-0",mountpoint="/"} - node_filesystem_avail_bytes{instance="xxx-rhel-0",mountpoint="/"})

Verified on 4.7.0-0.nightly-2021-01-13-054018
The query

sum(node_filesystem_size_bytes{instance="xxxx",mountpoint="/"} - node_filesystem_avail_bytes{instance="xxxx",mountpoint="/"})

is also applicable to RHCOS nodes.
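For completeness, a single expression along the same lines that reports used space on / for all nodes at once, RHCOS and RHEL alike (an illustrative sketch, not necessarily the exact query the console issues):

# Used bytes on the root filesystem, grouped per node:
sum by (instance) (
    node_filesystem_size_bytes{mountpoint="/"}
  - node_filesystem_avail_bytes{mountpoint="/"}
)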
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5633