Bug 1997926
| Summary: | container_runtime_crio_operations_latency_microseconds does not have quantile anymore | |||
|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Alan Chan <alchan> | |
| Component: | Node | Assignee: | Sascha Grunert <sgrunert> | |
| Node sub component: | CRI-O | QA Contact: | Weinan Liu <weinliu> | |
| Status: | CLOSED ERRATA | Docs Contact: | ||
| Severity: | low | |||
| Priority: | low | CC: | aos-bugs | |
| Version: | 4.8 | |||
| Target Milestone: | --- | |||
| Target Release: | 4.8.z | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | No Doc Update | ||
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 2010831 2010841 | Environment: | |
| Last Closed: | 2022-02-16 06:51:40 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 2010831, 2010841 | |||
Sascha, can you PTAL?

Confirmed, fix is incoming in https://github.com/cri-o/cri-o/pull/5258. I'll backport the fix if merged.

Upstream PR merged, cherry-picking now.

The upstream PR got merged into release-1.21 (https://github.com/cri-o/cri-o/pull/5266), which means that the next package build of CRI-O should contain the fix in 4.8.

    sh-4.4# oc get --raw /metrics --server http://${NODEIP}:9537 | grep "container_runtime_crio_operations_latency" | grep -c quantile
    57

Verified to be fixed on:

    $ oc get clusterversions
    NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
    version   4.8.0-0.nightly-2022-02-09-031830   True        False         43m     Cluster version is 4.8.0-0.nightly-2022-02-09-031830

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.8.31 bug fix update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:0484
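The verification above pipes the scrape through `grep -c quantile`. For anyone who wants to run the same check from a script, here is a rough Python equivalent that works on a saved copy of the `/metrics` output. It is only a sketch: the function name `count_quantile_samples` and the sample text are illustrative (the sample lines are copied from the 4.5 output in this report), and unlike the grep pipeline it skips `# HELP` / `# TYPE` comment lines.

```python
import re

# Matches a Prometheus text-format sample line, for example:
#   some_metric{operation_type="Attach",quantile="0.5"} NaN
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)

def count_quantile_samples(metrics_text,
                           prefix="container_runtime_crio_operations_latency"):
    """Count sample lines for `prefix` metrics that carry a quantile label."""
    count = 0
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if not match or not match.group("name").startswith(prefix):
            continue
        if 'quantile="' in (match.group("labels") or ""):
            count += 1
    return count

# Sample lines copied from the 4.5 output in this report.
SAMPLE_4_5 = """\
# TYPE container_runtime_crio_operations_latency_microseconds summary
container_runtime_crio_operations_latency_microseconds{operation_type="Attach",quantile="0.5"} NaN
container_runtime_crio_operations_latency_microseconds{operation_type="Attach",quantile="0.9"} NaN
container_runtime_crio_operations_latency_microseconds{operation_type="Attach",quantile="0.99"} NaN
container_runtime_crio_operations_latency_microseconds_sum{operation_type="Attach"} 19510
container_runtime_crio_operations_latency_microseconds_count{operation_type="Attach"} 17
container_runtime_crio_operations_latency_microseconds{operation_type="ContainerStatus",quantile="0.5"} 32
container_runtime_crio_operations_latency_microseconds{operation_type="ContainerStatus",quantile="0.9"} 54
container_runtime_crio_operations_latency_microseconds{operation_type="ContainerStatus",quantile="0.99"} 92
"""

print(count_quantile_samples(SAMPLE_4_5))  # 6
```

A result of 0 on a full scrape corresponds to the broken behaviour reported here; any non-zero count means the quantile series are being exported again.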
Description of problem:
-----------------------
Starting with 4.6 (and also in 4.7 and 4.8), the container_runtime_crio_operations_latency_microseconds metric no longer exposes quantile values. Only container_runtime_crio_operations_latency_microseconds_sum and container_runtime_crio_operations_latency_microseconds_count are available.

Some customers monitor these quantile metrics on 4.5. With only _sum and _count, it is not possible to calculate quantile information.

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
OCP 4.6 (cri-o 1.19), 4.7 (cri-o 1.20), 4.8 (cri-o 1.21)

How reproducible:
-----------------
Always

On 4.5:
-------
The quantile info is still there...

    [quicklab@upi-0 ~]$ oc version
    Client Version: 4.5.41
    Server Version: 4.5.41
    Kubernetes Version: v1.18.3+d8ef5ad

    [quicklab@upi-0 ~]$ NODEIP=$(oc get nodes -o jsonpath='{.items[0].status.addresses[?(@.type=="InternalIP")].address}')
    [quicklab@upi-0 ~]$ oc get --raw /metrics --server http://${NODEIP}:9537 | grep "container_runtime_crio_operations_latency"
    # HELP container_runtime_crio_operations_latency_microseconds Latency in microseconds of CRI-O operations. Broken down by operation type.
    # TYPE container_runtime_crio_operations_latency_microseconds summary
    container_runtime_crio_operations_latency_microseconds{operation_type="Attach",quantile="0.5"} NaN
    container_runtime_crio_operations_latency_microseconds{operation_type="Attach",quantile="0.9"} NaN
    container_runtime_crio_operations_latency_microseconds{operation_type="Attach",quantile="0.99"} NaN
    container_runtime_crio_operations_latency_microseconds_sum{operation_type="Attach"} 19510
    container_runtime_crio_operations_latency_microseconds_count{operation_type="Attach"} 17
    container_runtime_crio_operations_latency_microseconds{operation_type="ContainerStatus",quantile="0.5"} 32
    container_runtime_crio_operations_latency_microseconds{operation_type="ContainerStatus",quantile="0.9"} 54
    container_runtime_crio_operations_latency_microseconds{operation_type="ContainerStatus",quantile="0.99"} 92
    .
    .
    .

On 4.8 (same for 4.6, 4.7):
---------------------------
There is no quantile info, just _total_sum & _total_count metrics...

    [quicklab@upi-0 ~]$ oc version
    Client Version: 4.8.5
    Server Version: 4.8.5
    Kubernetes Version: v1.21.1+9807387

    [quicklab@upi-0 ~]$ NODEIP=$(oc get nodes -o jsonpath='{.items[0].status.addresses[?(@.type=="InternalIP")].address}')
    [quicklab@upi-0 ~]$ oc get --raw /metrics --server http://${NODEIP}:9537 | grep "container_runtime_crio_operations_latency" | grep -c quantile
    0

Expected results:
-----------------
Quantile numbers should be provided as in 4.5.
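To illustrate why _sum and _count alone are not enough, here is a small Python sketch. It takes the Attach values from the 4.5 output (sum 19510, count 17) and builds two made-up sets of 17 latencies with that same sum. Both give the same mean, but their 0.99 quantiles differ widely. The `steady` and `spiky` data sets are hypothetical, and the nearest-rank quantile used here is only for illustration; it is not the algorithm the Prometheus client library uses for summaries.

```python
import math

def mean_latency(latency_sum, latency_count):
    """The only statistic recoverable from _sum and _count: the mean."""
    if latency_count == 0:
        return math.nan
    return latency_sum / latency_count

def nearest_rank_quantile(samples, q):
    """Nearest-rank quantile over raw samples (illustrative only)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]

# Values reported for operation_type="Attach" in the 4.5 output.
attach_mean = mean_latency(19510, 17)

# Two hypothetical sets of 17 latencies (microseconds), both summing to 19510.
steady = [1147] * 16 + [1158]
spiky = [100] * 16 + [17910]

print(f"mean from _sum/_count: {attach_mean:.2f} us")
for name, samples in (("steady", steady), ("spiky", spiky)):
    print(name,
          "sum:", sum(samples),
          "count:", len(samples),
          "p50:", nearest_rank_quantile(samples, 0.5),
          "p99:", nearest_rank_quantile(samples, 0.99))
```

Both data sets would export identical _sum and _count series, so a dashboard built only on those two series could not tell them apart.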