Bug 1831908 - [4.6] Regression: Some critical cadvisor disk metrics are not reported in 4.4
Summary: [4.6] Regression: Some critical cadvisor disk metrics are not reported in 4.4
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.6.0
Assignee: Harshal Patil
QA Contact: Cameron Meadors
URL:
Whiteboard:
Depends On:
Blocks: 1845201 1849269
 
Reported: 2020-05-05 22:07 UTC by Clayton Coleman
Modified: 2020-10-27 15:59 UTC
CC List: 7 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Metrics were missing for device-mapper devices.
Consequence: No disk metrics were available if the system used device mapper for the root device.
Fix: cAdvisor has been fixed to report the metrics even if the root device is backed by device mapper.
Result: Metrics are available regardless of whether device mapper is used for the root device.
Clone Of:
: 1845201 (view as bug list)
Environment:
Last Closed: 2020-10-27 15:58:53 UTC
Target Upstream Version:
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Github openshift origin pull 25140 0 None closed Bug 1831908: bump cadvisor for incorrect diskstats fix 2020-10-29 08:08:41 UTC
Red Hat Product Errata RHBA-2020:4196 0 None None None 2020-10-27 15:59:14 UTC

Description Clayton Coleman 2020-05-05 22:07:13 UTC
While debugging a high priority CI failure, we realized that a number of key host level IO metrics returned by cAdvisor were set to zero.

container_fs_io_time_seconds_total
container_fs_read_seconds_total
container_fs_reads_merged_total
container_fs_io_current

These are zero for all containers AND the root device (which is usually special).  This worked in 3.11 and is a key tool for debugging IO; it has regressed sometime in recent releases.

Needs to be fixed and backported to 4.4 at minimum.

Comment 1 Clayton Coleman 2020-05-05 22:15:09 UTC
The disk stats from one of the boxes reporting 0:

sh-4.4# cat /proc/diskstats 
 259       0 nvme0n1 3890442 1503 208168451 2536364 155312820 84262481 13280576806 152625519 0 17300601 58405818 0 0 0 0
 259       1 nvme0n1p1 144 0 8802 46 14 0 40 8 0 50 6 0 0 0 0
 259       2 nvme0n1p2 443 0 11953 227 1 0 1 0 0 54 13 0 0 0 0
 259       3 nvme0n1p3 26 0 208 5 0 0 0 0 0 7 0 0 0 0 0
 259       4 nvme0n1p4 3889753 1503 208143872 2536064 155312805 84262481 13280576765 152625510 0 17300508 58405798 0 0 0 0
 253       0 dm-0 3891145 0 208140230 2457857 239575286 0 13280576765 249604868 0 17302451 252062725 0 0 0 0
 259       5 nvme1n1 465396 19 3752698 378287 402540 531025 9171280 423786 0 373694 353150 0 0 0 0
 259       6 nvme2n1 176 0 6634 87 1081 1380 20384 620 0 680 39 0 0 0 0

Also, we should note that the metrics for the LUKS device are returned, but not for the root device.  Is devicemapper messing up our metrics after the introduction of the encryption stuff?
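
To confirm the kernel is exposing the underlying data, a minimal sketch (Go; not cAdvisor's actual code) that prints column 13 of /proc/diskstats, "time spent doing I/Os (ms)", for every device. On the box above it is non-zero for nvme0n1 and dm-0, so the raw counters exist even though the exported metric is 0.

package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

func main() {
	f, err := os.Open("/proc/diskstats")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	sc := bufio.NewScanner(f)
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) < 13 {
			continue // truncated line
		}
		// fields[2] is the device name; fields[12] is column 13,
		// "time spent doing I/Os (ms)", the source of
		// container_fs_io_time_seconds_total.
		fmt.Printf("%-12s io_time_ms=%s\n", fields[2], fields[12])
	}
}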

Comment 2 Harshal Patil 2020-05-20 13:28:58 UTC
I wasn't able to spend an adequate amount of time on this during the current sprint due to other higher-priority work; I will look into it next sprint.

Comment 3 Clayton Coleman 2020-06-02 14:08:38 UTC
These metrics are fundamental to debugging.  You may not move this to 4.6.

Comment 4 Harshal Patil 2020-06-04 12:37:30 UTC
On 4.1.1

# uname -a
Linux ip-10-0-136-166 4.18.0-80.1.2.el8_0.x86_64 #1 SMP Sun Apr 28 09:21:22 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

# lsblk -fs 
NAME   FSTYPE LABEL UUID                                 MOUNTPOINT
xvda1                                                    
└─xvda                                                   
xvda2  ext4   boot  51b1d71e-c3f9-4a6a-b22e-407ff7412d06 /boot
└─xvda                                                   
xvda3  xfs    root  77a6c49a-0403-474d-b727-80e256f7c564 /sysroot
└─xvda                                                   


# cat /proc/diskstats 
 202       0 xvda 5296 20 622167 7711 152379 60718 8599959 574473 0 53334 501001 0 0 0 0
 202       1 xvda1 28 0 224 5 0 0 0 0 0 11 0 0 0 0 0
 202       2 xvda2 165 0 9466 65 7 1 64 3 0 71 12 0 0 0 0
 202       3 xvda3 5042 20 609413 7607 152372 60717 8599895 574469 0 53310 500976 0 0 0 0

For the Prometheus query container_fs_io_time_seconds_total{id="/"} (column 13 in /proc/diskstats), I can see the corresponding metrics:

container_fs_io_time_seconds_total{device="/dev/xvda2",endpoint="https-metrics",id="/",instance="10.0.128.87:10250",job="kubelet",node="ip-10-0-128-87.ap-south-1.compute.internal",service="kubelet"}	0.000000063
container_fs_io_time_seconds_total{device="/dev/xvda2",endpoint="https-metrics",id="/",instance="10.0.136.166:10250",job="kubelet",node="ip-10-0-136-166.ap-south-1.compute.internal",service="kubelet"}	0.000000071
container_fs_io_time_seconds_total{device="/dev/xvda2",endpoint="https-metrics",id="/",instance="10.0.146.253:10250",job="kubelet",node="ip-10-0-146-253.ap-south-1.compute.internal",service="kubelet"}	0.000000072
container_fs_io_time_seconds_total{device="/dev/xvda2",endpoint="https-metrics",id="/",instance="10.0.151.178:10250",job="kubelet",node="ip-10-0-151-178.ap-south-1.compute.internal",service="kubelet"}	0.000000071
container_fs_io_time_seconds_total{device="/dev/xvda2",endpoint="https-metrics",id="/",instance="10.0.165.81:10250",job="kubelet",node="ip-10-0-165-81.ap-south-1.compute.internal",service="kubelet"}	0.000000063
container_fs_io_time_seconds_total{device="/dev/xvda2",endpoint="https-metrics",id="/",instance="10.0.175.194:10250",job="kubelet",node="ip-10-0-175-194.ap-south-1.compute.internal",service="kubelet"}	0.000000065
container_fs_io_time_seconds_total{device="/dev/xvda3",endpoint="https-metrics",id="/",instance="10.0.128.87:10250",job="kubelet",node="ip-10-0-128-87.ap-south-1.compute.internal",service="kubelet"}	0.000181785
container_fs_io_time_seconds_total{device="/dev/xvda3",endpoint="https-metrics",id="/",instance="10.0.136.166:10250",job="kubelet",node="ip-10-0-136-166.ap-south-1.compute.internal",service="kubelet"}	0.000049739
container_fs_io_time_seconds_total{device="/dev/xvda3",endpoint="https-metrics",id="/",instance="10.0.146.253:10250",job="kubelet",node="ip-10-0-146-253.ap-south-1.compute.internal",service="kubelet"}	0.00003107
container_fs_io_time_seconds_total{device="/dev/xvda3",endpoint="https-metrics",id="/",instance="10.0.151.178:10250",job="kubelet",node="ip-10-0-151-178.ap-south-1.compute.internal",service="kubelet"}	0.000179717
container_fs_io_time_seconds_total{device="/dev/xvda3",endpoint="https-metrics",id="/",instance="10.0.165.81:10250",job="kubelet",node="ip-10-0-165-81.ap-south-1.compute.internal",service="kubelet"}	0.000171187
container_fs_io_time_seconds_total{device="/dev/xvda3",endpoint="https-metrics",id="/",instance="10.0.175.194:10250",job="kubelet",node="ip-10-0-175-194.ap-south-1.compute.internal",service="kubelet"}	0.000035655


While on 4.4.6, 

# uname -a 
Linux debug-plxnc-w-a-br6nb.c.openshift-gce-devel.internal 4.18.0-147.8.1.el8_1.x86_64 #1 SMP Wed Feb 26 03:08:15 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux


# lsblk -fs 
NAME                     FSTYPE      LABEL        UUID                                 MOUNTPOINT
sda1                     ext4        boot         fd9689d9-fbcc-488a-a70e-145fddf245e7 /boot
└─sda                                                                                  
sda2                     vfat        EFI-SYSTEM   4FC5-6A46                            /boot/efi
└─sda                                                                                  
sda3                                                                                   
└─sda                                                                                  
coreos-luks-root-nocrypt xfs         root         d2d725c8-d0aa-4f58-a6fb-811d90240135 /sysroot
└─sda4                   crypto_LUKS crypt_rootfs 00000000-0000-4000-a000-000000000002 
  └─sda            

# cat /proc/diskstats 
   8       0 sda 7512 11 827015 7436 321336 33931 10380796 582825 0 186009 438510 0 0 0 0
   8       1 sda1 144 0 8802 84 9 5 40 9 0 97 24 0 0 0 0
   8       2 sda2 443 0 11953 804 1 0 1 0 0 89 616 0 0 0 0
   8       3 sda3 26 0 208 17 0 0 0 0 0 14 8 0 0 0 0
   8       4 sda4 6823 11 802436 6487 255736 33926 10380755 573503 0 168713 436019 0 0 0 0
 253       0 dm-0 6816 0 798794 6565 355249 0 10941122 819340 0 186266 825905 0 0 0 0

# ls -l /dev/mapper/coreos-luks-root-nocrypt 
lrwxrwxrwx. 1 root root 7 Jun  4 06:43 /dev/mapper/coreos-luks-root-nocrypt -> ../dm-0

The Prometheus query container_fs_io_time_seconds_total{id="/"} returns:

container_fs_io_time_seconds_total{device="/dev/mapper/coreos-luks-root-nocrypt",endpoint="https-metrics",id="/",instance="10.0.0.3:10250",job="kubelet",metrics_path="/metrics/cadvisor",node="debug-plxnc-m-2.c.openshift-gce-devel.internal",service="kubelet"}	0
container_fs_io_time_seconds_total{device="/dev/mapper/coreos-luks-root-nocrypt",endpoint="https-metrics",id="/",instance="10.0.0.4:10250",job="kubelet",metrics_path="/metrics/cadvisor",node="debug-plxnc-m-1.c.openshift-gce-devel.internal",service="kubelet"}	0
container_fs_io_time_seconds_total{device="/dev/mapper/coreos-luks-root-nocrypt",endpoint="https-metrics",id="/",instance="10.0.0.5:10250",job="kubelet",metrics_path="/metrics/cadvisor",node="debug-plxnc-m-0.c.openshift-gce-devel.internal",service="kubelet"}	0
container_fs_io_time_seconds_total{device="/dev/mapper/coreos-luks-root-nocrypt",endpoint="https-metrics",id="/",instance="10.0.32.2:10250",job="kubelet",metrics_path="/metrics/cadvisor",node="debug-plxnc-w-c-ntcdh.c.openshift-gce-devel.internal",service="kubelet"}	0
container_fs_io_time_seconds_total{device="/dev/mapper/coreos-luks-root-nocrypt",endpoint="https-metrics",id="/",instance="10.0.32.3:10250",job="kubelet",metrics_path="/metrics/cadvisor",node="debug-plxnc-w-a-br6nb.c.openshift-gce-devel.internal",service="kubelet"}	0
container_fs_io_time_seconds_total{device="/dev/mapper/coreos-luks-root-nocrypt",endpoint="https-metrics",id="/",instance="10.0.32.4:10250",job="kubelet",metrics_path="/metrics/cadvisor",node="debug-plxnc-w-b-w6tbg.c.openshift-gce-devel.internal",service="kubelet"}	0
container_fs_io_time_seconds_total{device="/dev/sda1",endpoint="https-metrics",id="/",instance="10.0.0.3:10250",job="kubelet",metrics_path="/metrics/cadvisor",node="debug-plxnc-m-2.c.openshift-gce-devel.internal",service="kubelet"}	0.000000131
container_fs_io_time_seconds_total{device="/dev/sda1",endpoint="https-metrics",id="/",instance="10.0.0.4:10250",job="kubelet",metrics_path="/metrics/cadvisor",node="debug-plxnc-m-1.c.openshift-gce-devel.internal",service="kubelet"}	0.000000079
container_fs_io_time_seconds_total{device="/dev/sda1",endpoint="https-metrics",id="/",instance="10.0.0.5:10250",job="kubelet",metrics_path="/metrics/cadvisor",node="debug-plxnc-m-0.c.openshift-gce-devel.internal",service="kubelet"}	0.000000103
container_fs_io_time_seconds_total{device="/dev/sda1",endpoint="https-metrics",id="/",instance="10.0.32.2:10250",job="kubelet",metrics_path="/metrics/cadvisor",node="debug-plxnc-w-c-ntcdh.c.openshift-gce-devel.internal",service="kubelet"}	0.000000106
container_fs_io_time_seconds_total{device="/dev/sda1",endpoint="https-metrics",id="/",instance="10.0.32.3:10250",job="kubelet",metrics_path="/metrics/cadvisor",node="debug-plxnc-w-a-br6nb.c.openshift-gce-devel.internal",service="kubelet"}	0.000000097
container_fs_io_time_seconds_total{device="/dev/sda1",endpoint="https-metrics",id="/",instance="10.0.32.4:10250",job="kubelet",metrics_path="/metrics/cadvisor",node="debug-plxnc-w-b-w6tbg.c.openshift-gce-devel.internal",service="kubelet"}	0.000000101


              
So in the case of 4.1.1 (root device /dev/xvda3) we get the metrics, but in the case of 4.4.6 (root device /dev/mapper/coreos-luks-root-nocrypt) we do not get the corresponding metrics.


Looking into it.
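
As a reference for the lookup that has to happen here, a minimal sketch (Go; hypothetical, not cAdvisor's actual code): /proc/diskstats names the device dm-0, so the stats for /dev/mapper/coreos-luks-root-nocrypt can only be found after resolving the symlink to its dm-N name. A lookup keyed on the mapper path itself finds nothing, which would be consistent with the zeros above.

package main

import (
	"bufio"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// dmName resolves a /dev/mapper/* symlink to the kernel device name,
// e.g. /dev/mapper/coreos-luks-root-nocrypt -> ../dm-0 -> "dm-0".
func dmName(mapperPath string) (string, error) {
	target, err := os.Readlink(mapperPath)
	if err != nil {
		return "", err
	}
	return filepath.Base(target), nil
}

// diskstatsLine returns the /proc/diskstats line whose device name
// matches devName (e.g. "dm-0"); the mapper path never appears there.
func diskstatsLine(devName string) (string, error) {
	f, err := os.Open("/proc/diskstats")
	if err != nil {
		return "", err
	}
	defer f.Close()
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) > 2 && fields[2] == devName {
			return sc.Text(), nil
		}
	}
	return "", fmt.Errorf("no diskstats entry for %q", devName)
}

func main() {
	name, err := dmName("/dev/mapper/coreos-luks-root-nocrypt")
	if err != nil {
		panic(err)
	}
	line, err := diskstatsLine(name)
	if err != nil {
		panic(err)
	}
	fmt.Println(line)
}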

Comment 5 Harshal Patil 2020-06-04 13:00:19 UTC
For the record and reference, here is the mapping between each metric and the corresponding column number in /proc/diskstats:



# cat /proc/diskstats 
   8       0 sda 7512 11 827015 7436 321336 33931 10380796 582825 0 186009 438510 0 0 0 0
   8       1 sda1 144 0 8802 84 9 5 40 9 0 97 24 0 0 0 0
   8       2 sda2 443 0 11953 804 1 0 1 0 0 89 616 0 0 0 0
   8       3 sda3 26 0 208 17 0 0 0 0 0 14 8 0 0 0 0
   8       4 sda4 6823 11 802436 6487 255736 33926 10380755 573503 0 168713 436019 0 0 0 0
 253       0 dm-0 6816 0 798794 6565 355249 0 10941122 819340 0 186266 825905 0 0 0 0


container_fs_io_time_seconds_total - 13th column 
container_fs_read_seconds_total - 7th column
container_fs_reads_merged_total - 5th column
container_fs_io_current - 12th column

Meaning of the individual columns:
1 - major number
2 - minor number
3 - device name
4 - reads completed successfully
5 - reads merged
6 - sectors read
7 - time spent reading (ms)
8 - writes completed
9 - writes merged
10 - sectors written
11 - time spent writing (ms)
12 - I/Os currently in progress
13 - time spent doing I/Os (ms)
14 - weighted time spent doing I/Os (ms)
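
A minimal sketch (Go; hypothetical field names, not cAdvisor's parser) of reading one /proc/diskstats line into the columns above and pulling out the four fields that back the affected metrics:

package main

import (
	"fmt"
	"strconv"
	"strings"
)

// diskStats holds one /proc/diskstats line; the column numbers in the
// comments follow the 1-based list above.
type diskStats struct {
	Major, Minor    uint64
	Name            string // col 3
	ReadsCompleted  uint64 // col 4
	ReadsMerged     uint64 // col 5  -> container_fs_reads_merged_total
	SectorsRead     uint64 // col 6
	ReadTimeMs      uint64 // col 7  -> container_fs_read_seconds_total (ms converted to s)
	WritesCompleted uint64 // col 8
	WritesMerged    uint64 // col 9
	SectorsWritten  uint64 // col 10
	WriteTimeMs     uint64 // col 11
	IOInProgress    uint64 // col 12 -> container_fs_io_current
	IOTimeMs        uint64 // col 13 -> container_fs_io_time_seconds_total (ms converted to s)
}

func parseLine(line string) (diskStats, error) {
	f := strings.Fields(line)
	if len(f) < 13 {
		return diskStats{}, fmt.Errorf("short diskstats line: %q", line)
	}
	n := make([]uint64, 13)
	for i := 0; i < 13; i++ {
		if i == 2 {
			continue // device name is not numeric
		}
		v, err := strconv.ParseUint(f[i], 10, 64)
		if err != nil {
			return diskStats{}, err
		}
		n[i] = v
	}
	return diskStats{
		Major: n[0], Minor: n[1], Name: f[2],
		ReadsCompleted: n[3], ReadsMerged: n[4], SectorsRead: n[5], ReadTimeMs: n[6],
		WritesCompleted: n[7], WritesMerged: n[8], SectorsWritten: n[9], WriteTimeMs: n[10],
		IOInProgress: n[11], IOTimeMs: n[12],
	}, nil
}

func main() {
	// The dm-0 line from the 4.4.6 dump above.
	s, err := parseLine("253 0 dm-0 6816 0 798794 6565 355249 0 10941122 819340 0 186266 825905 0 0 0 0")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s: reads_merged=%d read_time_ms=%d io_in_progress=%d io_time_ms=%d\n",
		s.Name, s.ReadsMerged, s.ReadTimeMs, s.IOInProgress, s.IOTimeMs)
}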

Comment 8 Seth Jennings 2020-06-09 15:45:14 UTC
Still needs a bump in Origin.

Comment 9 Seth Jennings 2020-06-09 15:46:02 UTC
https://github.com/openshift/google-cadvisor/pull/10

Comment 13 errata-xmlrpc 2020-10-27 15:58:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

