Bug 1845201 - Regression: Some critical cadvisor disk metrics are not reported in 4.4
Summary: Regression: Some critical cadvisor disk metrics are not reported in 4.4
Keywords:
Status: CLOSED DUPLICATE of bug 1849269
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: low
Target Milestone: ---
Target Release: 4.5.z
Assignee: Harshal Patil
QA Contact: Sunil Choudhary
URL:
Whiteboard:
Depends On: 1831908
Blocks:
 
Reported: 2020-06-08 16:00 UTC by Neelesh Agrawal
Modified: 2020-07-27 16:16 UTC
CC List: 9 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1831908
: 1856258
Environment:
Last Closed: 2020-06-22 15:22:26 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github openshift origin pull 25140 0 None closed Bug 1831908: bump cadvisor for incorrect diskstats fix 2020-11-03 16:55:50 UTC

Description Neelesh Agrawal 2020-06-08 16:00:45 UTC
+++ This bug was initially created as a clone of Bug #1831908 +++

While debugging a high priority CI failure, we realized that a number of key host level IO metrics returned by cAdvisor were set to zero.

container_fs_io_time_seconds_total
container_fs_read_seconds_total
container_fs_reads_merged_total
container_fs_io_current

These are zero for all containers AND the root device (which is usually special). This worked in 3.11 and is a key tool for debugging IO; it has regressed sometime in recent releases.

Needs to be fixed and backported to 4.4 at minimum.
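A quick way to confirm the symptom on a node is to filter the cadvisor series for zero values. The sketch below runs against a captured sample so it is self-contained; on a live node the sample would instead come from scraping the kubelet's /metrics/cadvisor endpoint (e.g. with curl and a suitable bearer token) — the device name and values here are illustrative.

```shell
# Hypothetical excerpt of a kubelet /metrics/cadvisor scrape from an affected
# node; the zeroed series mirror the symptom described above.
sample='container_fs_io_time_seconds_total{device="/dev/mapper/coreos-luks-root-nocrypt",id="/"} 0
container_fs_read_seconds_total{device="/dev/mapper/coreos-luks-root-nocrypt",id="/"} 0
container_fs_reads_total{device="/dev/mapper/coreos-luks-root-nocrypt",id="/"} 4183'

# Print only the series whose reported value is exactly 0.
zeroed=$(printf '%s\n' "$sample" | awk '$NF == 0 {print $1}')
printf '%s\n' "$zeroed"
```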

--- Additional comment from Clayton Coleman on 2020-05-05 18:15:09 EDT ---

The disk stats from one of the boxes reporting 0

sh-4.4# cat /proc/diskstats 
 259       0 nvme0n1 3890442 1503 208168451 2536364 155312820 84262481 13280576806 152625519 0 17300601 58405818 0 0 0 0
 259       1 nvme0n1p1 144 0 8802 46 14 0 40 8 0 50 6 0 0 0 0
 259       2 nvme0n1p2 443 0 11953 227 1 0 1 0 0 54 13 0 0 0 0
 259       3 nvme0n1p3 26 0 208 5 0 0 0 0 0 7 0 0 0 0 0
 259       4 nvme0n1p4 3889753 1503 208143872 2536064 155312805 84262481 13280576765 152625510 0 17300508 58405798 0 0 0 0
 253       0 dm-0 3891145 0 208140230 2457857 239575286 0 13280576765 249604868 0 17302451 252062725 0 0 0 0
 259       5 nvme1n1 465396 19 3752698 378287 402540 531025 9171280 423786 0 373694 353150 0 0 0 0
 259       6 nvme2n1 176 0 6634 87 1081 1380 20384 620 0 680 39 0 0 0 0

Also, we should note that the metrics for the LUKS device are returned, but not the root device.  Is devicemapper messing up our metrics after the introduction of the encryption stuff?

--- Additional comment from Harshal Patil on 2020-05-20 09:28:58 EDT ---

I wasn't able to spend an adequate amount of time on this during the sprint due to other higher-priority work; I will look into it next sprint.

--- Additional comment from Clayton Coleman on 2020-06-02 10:08:38 EDT ---

These metrics are fundamental to debugging. You may not move this to 4.6.

--- Additional comment from Harshal Patil on 2020-06-04 08:37:30 EDT ---

On 4.1.1

# uname -a
Linux ip-10-0-136-166 4.18.0-80.1.2.el8_0.x86_64 #1 SMP Sun Apr 28 09:21:22 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

# lsblk -fs 
NAME   FSTYPE LABEL UUID                                 MOUNTPOINT
xvda1                                                    
└─xvda                                                   
xvda2  ext4   boot  51b1d71e-c3f9-4a6a-b22e-407ff7412d06 /boot
└─xvda                                                   
xvda3  xfs    root  77a6c49a-0403-474d-b727-80e256f7c564 /sysroot
└─xvda                                                   


# cat /proc/diskstats 
 202       0 xvda 5296 20 622167 7711 152379 60718 8599959 574473 0 53334 501001 0 0 0 0
 202       1 xvda1 28 0 224 5 0 0 0 0 0 11 0 0 0 0 0
 202       2 xvda2 165 0 9466 65 7 1 64 3 0 71 12 0 0 0 0
 202       3 xvda3 5042 20 609413 7607 152372 60717 8599895 574469 0 53310 500976 0 0 0 0

For the Prometheus query container_fs_io_time_seconds_total{id="/"} (backed by column 13 in /proc/diskstats), I can see the corresponding metrics.

container_fs_io_time_seconds_total{device="/dev/xvda2",endpoint="https-metrics",id="/",instance="10.0.128.87:10250",job="kubelet",node="ip-10-0-128-87.ap-south-1.compute.internal",service="kubelet"}	0.000000063
container_fs_io_time_seconds_total{device="/dev/xvda2",endpoint="https-metrics",id="/",instance="10.0.136.166:10250",job="kubelet",node="ip-10-0-136-166.ap-south-1.compute.internal",service="kubelet"}	0.000000071
container_fs_io_time_seconds_total{device="/dev/xvda2",endpoint="https-metrics",id="/",instance="10.0.146.253:10250",job="kubelet",node="ip-10-0-146-253.ap-south-1.compute.internal",service="kubelet"}	0.000000072
container_fs_io_time_seconds_total{device="/dev/xvda2",endpoint="https-metrics",id="/",instance="10.0.151.178:10250",job="kubelet",node="ip-10-0-151-178.ap-south-1.compute.internal",service="kubelet"}	0.000000071
container_fs_io_time_seconds_total{device="/dev/xvda2",endpoint="https-metrics",id="/",instance="10.0.165.81:10250",job="kubelet",node="ip-10-0-165-81.ap-south-1.compute.internal",service="kubelet"}	0.000000063
container_fs_io_time_seconds_total{device="/dev/xvda2",endpoint="https-metrics",id="/",instance="10.0.175.194:10250",job="kubelet",node="ip-10-0-175-194.ap-south-1.compute.internal",service="kubelet"}	0.000000065
container_fs_io_time_seconds_total{device="/dev/xvda3",endpoint="https-metrics",id="/",instance="10.0.128.87:10250",job="kubelet",node="ip-10-0-128-87.ap-south-1.compute.internal",service="kubelet"}	0.000181785
container_fs_io_time_seconds_total{device="/dev/xvda3",endpoint="https-metrics",id="/",instance="10.0.136.166:10250",job="kubelet",node="ip-10-0-136-166.ap-south-1.compute.internal",service="kubelet"}	0.000049739
container_fs_io_time_seconds_total{device="/dev/xvda3",endpoint="https-metrics",id="/",instance="10.0.146.253:10250",job="kubelet",node="ip-10-0-146-253.ap-south-1.compute.internal",service="kubelet"}	0.00003107
container_fs_io_time_seconds_total{device="/dev/xvda3",endpoint="https-metrics",id="/",instance="10.0.151.178:10250",job="kubelet",node="ip-10-0-151-178.ap-south-1.compute.internal",service="kubelet"}	0.000179717
container_fs_io_time_seconds_total{device="/dev/xvda3",endpoint="https-metrics",id="/",instance="10.0.165.81:10250",job="kubelet",node="ip-10-0-165-81.ap-south-1.compute.internal",service="kubelet"}	0.000171187
container_fs_io_time_seconds_total{device="/dev/xvda3",endpoint="https-metrics",id="/",instance="10.0.175.194:10250",job="kubelet",node="ip-10-0-175-194.ap-south-1.compute.internal",service="kubelet"}	0.000035655


While on 4.4.6, 

# uname -a 
Linux debug-plxnc-w-a-br6nb.c.openshift-gce-devel.internal 4.18.0-147.8.1.el8_1.x86_64 #1 SMP Wed Feb 26 03:08:15 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux


# lsblk -fs 
NAME                     FSTYPE      LABEL        UUID                                 MOUNTPOINT
sda1                     ext4        boot         fd9689d9-fbcc-488a-a70e-145fddf245e7 /boot
└─sda                                                                                  
sda2                     vfat        EFI-SYSTEM   4FC5-6A46                            /boot/efi
└─sda                                                                                  
sda3                                                                                   
└─sda                                                                                  
coreos-luks-root-nocrypt xfs         root         d2d725c8-d0aa-4f58-a6fb-811d90240135 /sysroot
└─sda4                   crypto_LUKS crypt_rootfs 00000000-0000-4000-a000-000000000002 
  └─sda            

# cat /proc/diskstats 
   8       0 sda 7512 11 827015 7436 321336 33931 10380796 582825 0 186009 438510 0 0 0 0
   8       1 sda1 144 0 8802 84 9 5 40 9 0 97 24 0 0 0 0
   8       2 sda2 443 0 11953 804 1 0 1 0 0 89 616 0 0 0 0
   8       3 sda3 26 0 208 17 0 0 0 0 0 14 8 0 0 0 0
   8       4 sda4 6823 11 802436 6487 255736 33926 10380755 573503 0 168713 436019 0 0 0 0
 253       0 dm-0 6816 0 798794 6565 355249 0 10941122 819340 0 186266 825905 0 0 0 0

# ls -l /dev/mapper/coreos-luks-root-nocrypt 
lrwxrwxrwx. 1 root root 7 Jun  4 06:43 /dev/mapper/coreos-luks-root-nocrypt -> ../dm-0

The Prometheus query container_fs_io_time_seconds_total{id="/"} looks like:

container_fs_io_time_seconds_total{device="/dev/mapper/coreos-luks-root-nocrypt",endpoint="https-metrics",id="/",instance="10.0.0.3:10250",job="kubelet",metrics_path="/metrics/cadvisor",node="debug-plxnc-m-2.c.openshift-gce-devel.internal",service="kubelet"}	0
container_fs_io_time_seconds_total{device="/dev/mapper/coreos-luks-root-nocrypt",endpoint="https-metrics",id="/",instance="10.0.0.4:10250",job="kubelet",metrics_path="/metrics/cadvisor",node="debug-plxnc-m-1.c.openshift-gce-devel.internal",service="kubelet"}	0
container_fs_io_time_seconds_total{device="/dev/mapper/coreos-luks-root-nocrypt",endpoint="https-metrics",id="/",instance="10.0.0.5:10250",job="kubelet",metrics_path="/metrics/cadvisor",node="debug-plxnc-m-0.c.openshift-gce-devel.internal",service="kubelet"}	0
container_fs_io_time_seconds_total{device="/dev/mapper/coreos-luks-root-nocrypt",endpoint="https-metrics",id="/",instance="10.0.32.2:10250",job="kubelet",metrics_path="/metrics/cadvisor",node="debug-plxnc-w-c-ntcdh.c.openshift-gce-devel.internal",service="kubelet"}	0
container_fs_io_time_seconds_total{device="/dev/mapper/coreos-luks-root-nocrypt",endpoint="https-metrics",id="/",instance="10.0.32.3:10250",job="kubelet",metrics_path="/metrics/cadvisor",node="debug-plxnc-w-a-br6nb.c.openshift-gce-devel.internal",service="kubelet"}	0
container_fs_io_time_seconds_total{device="/dev/mapper/coreos-luks-root-nocrypt",endpoint="https-metrics",id="/",instance="10.0.32.4:10250",job="kubelet",metrics_path="/metrics/cadvisor",node="debug-plxnc-w-b-w6tbg.c.openshift-gce-devel.internal",service="kubelet"}	0
container_fs_io_time_seconds_total{device="/dev/sda1",endpoint="https-metrics",id="/",instance="10.0.0.3:10250",job="kubelet",metrics_path="/metrics/cadvisor",node="debug-plxnc-m-2.c.openshift-gce-devel.internal",service="kubelet"}	0.000000131
container_fs_io_time_seconds_total{device="/dev/sda1",endpoint="https-metrics",id="/",instance="10.0.0.4:10250",job="kubelet",metrics_path="/metrics/cadvisor",node="debug-plxnc-m-1.c.openshift-gce-devel.internal",service="kubelet"}	0.000000079
container_fs_io_time_seconds_total{device="/dev/sda1",endpoint="https-metrics",id="/",instance="10.0.0.5:10250",job="kubelet",metrics_path="/metrics/cadvisor",node="debug-plxnc-m-0.c.openshift-gce-devel.internal",service="kubelet"}	0.000000103
container_fs_io_time_seconds_total{device="/dev/sda1",endpoint="https-metrics",id="/",instance="10.0.32.2:10250",job="kubelet",metrics_path="/metrics/cadvisor",node="debug-plxnc-w-c-ntcdh.c.openshift-gce-devel.internal",service="kubelet"}	0.000000106
container_fs_io_time_seconds_total{device="/dev/sda1",endpoint="https-metrics",id="/",instance="10.0.32.3:10250",job="kubelet",metrics_path="/metrics/cadvisor",node="debug-plxnc-w-a-br6nb.c.openshift-gce-devel.internal",service="kubelet"}	0.000000097
container_fs_io_time_seconds_total{device="/dev/sda1",endpoint="https-metrics",id="/",instance="10.0.32.4:10250",job="kubelet",metrics_path="/metrics/cadvisor",node="debug-plxnc-w-b-w6tbg.c.openshift-gce-devel.internal",service="kubelet"}	0.000000101


So on 4.1.1 (root device /dev/xvda3) we get the metrics, but on 4.4.6 (root device /dev/mapper/coreos-luks-root-nocrypt) we do not get the corresponding metrics.
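The broken half of that comparison comes down to name resolution: the root filesystem's source is the /dev/mapper symlink, while the /proc/diskstats row is keyed by the canonical dm-N name (see the `ls -l /dev/mapper/coreos-luks-root-nocrypt -> ../dm-0` output above), so the two must be joined via the symlink target. The sketch below uses a stand-in symlink tree so it runs anywhere; on a real node the path would be /dev/mapper/coreos-luks-root-nocrypt.

```shell
# Stand-in for the layout shown above: <mapper dir>/coreos-luks-root-nocrypt
# is a symlink to ../dm-0, and /proc/diskstats names the device "dm-0".
tmp=$(mktemp -d)
mkdir -p "$tmp/mapper"
touch "$tmp/dm-0"
ln -s ../dm-0 "$tmp/mapper/coreos-luks-root-nocrypt"

# Resolving the mapper path yields the canonical dm-N name that the
# corresponding /proc/diskstats row is keyed by.
resolved=$(readlink -f "$tmp/mapper/coreos-luks-root-nocrypt")
echo "${resolved##*/}"
rm -rf "$tmp"
```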


Looking into it.

--- Additional comment from Harshal Patil on 2020-06-04 09:00:19 EDT ---

For the record and reference, here is the mapping between each metric and the corresponding column number in /proc/diskstats:



# cat /proc/diskstats 
   8       0 sda 7512 11 827015 7436 321336 33931 10380796 582825 0 186009 438510 0 0 0 0
   8       1 sda1 144 0 8802 84 9 5 40 9 0 97 24 0 0 0 0
   8       2 sda2 443 0 11953 804 1 0 1 0 0 89 616 0 0 0 0
   8       3 sda3 26 0 208 17 0 0 0 0 0 14 8 0 0 0 0
   8       4 sda4 6823 11 802436 6487 255736 33926 10380755 573503 0 168713 436019 0 0 0 0
 253       0 dm-0 6816 0 798794 6565 355249 0 10941122 819340 0 186266 825905 0 0 0 0


container_fs_io_time_seconds_total - 13th column
container_fs_read_seconds_total - 7th column
container_fs_reads_merged_total - 5th column
container_fs_io_current - 12th column

Meaning of the individual columns:
1 - major number
2 - minor number
3 - device name
4 - reads completed successfully
5 - reads merged
6 - sectors read
7 - time spent reading (ms)
8 - writes completed
9 - writes merged
10 - sectors written
11 - time spent writing (ms)
12 - I/Os currently in progress
13 - time spent doing I/Os (ms)
14 - weighted time spent doing I/Os (ms)
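That mapping can be checked directly with awk against any /proc/diskstats row, since the fields are whitespace-separated and awk's $N matches the column numbers one-for-one. The sample line below is the sda row from the 4.4.6 node earlier in this bug.

```shell
# sda row captured from the 4.4.6 node above.
line='8 0 sda 7512 11 827015 7436 321336 33931 10380796 582825 0 186009 438510 0 0 0 0'

# Pull out the four columns behind the affected metrics.
echo "$line" | awk '{
  printf "container_fs_reads_merged_total    (col 5)  = %s\n", $5
  printf "container_fs_read_seconds_total    (col 7)  = %s ms\n", $7
  printf "container_fs_io_current            (col 12) = %s\n", $12
  printf "container_fs_io_time_seconds_total (col 13) = %s ms\n", $13
}'
```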

Comment 2 Stephen Cuppett 2020-06-22 15:22:26 UTC
BZ1849269 has the actual 4.5 PR. This just links to the existing 4.6 PR at this point.

If a new clone is needed for 4.4.z and 4.3.z, please create those.

*** This bug has been marked as a duplicate of bug 1849269 ***

