Bug 1694766
Summary: | No CPU metrics for non-pod services on a node | |
---|---|---|---
Product: | OpenShift Container Platform | Reporter: | Clayton Coleman <ccoleman>
Component: | Monitoring | Assignee: | Frederic Branczyk <fbranczy>
Status: | CLOSED ERRATA | QA Contact: | Junqi Zhao <juzhao>
Severity: | high | Priority: | unspecified
Version: | 4.1.0 | Target Release: | 4.1.0
Hardware: | Unspecified | OS: | Unspecified
Last Closed: | 2019-06-04 10:46:44 UTC | Type: | Bug
CC: | anpicker, aos-bugs, erooth, fbranczy, jokerman, mloibl, mmccomas, pkrupa, surbania | |

Description
Clayton Coleman
2019-04-01 15:24:24 UTC
Might be partially fixed by https://github.com/openshift/machine-config-operator/pull/581, but some services that already have accounting enabled aren't showing up.

As part of any fix, please add an origin e2e test, included in conformance, that verifies that node-level non-pod CPU metrics (and pod CPU metrics) show up. Adding it to one of the existing "prometheus metrics should be retrieved" e2e tests would keep this from regressing in the future.

Confirmed that container_cpu_usage_seconds_total{id!~"/kubepods.slice/.*"} returns no datapoints on a 4.1 cluster launched today.

@Lucas Would you be able to take a look?

BTW, I see that https://github.com/openshift/machine-config-operator/pull/581 has now merged.

The first PR, which removes the overly aggressive dropping, is out: https://github.com/coreos/prometheus-operator/pull/2545
This now needs to trickle down into the cluster-monitoring-operator. Moving to POST.

The final PR to have this trickle down into the cluster-monitoring stack has been opened: https://github.com/openshift/cluster-monitoring-operator/pull/319

The above PR was merged, and I've verified again that it works on new clusters. Adding an e2e test now.

A PR adding the test that guards against this regression in the future has been opened: https://github.com/openshift/origin/pull/22575

Both the fix and the e2e test to catch regressions have been merged. Moving to MODIFIED.

Confirmed that container_cpu_usage_seconds_total{id!~"/kubepods.slice/.*"} returns datapoints now.
payload: 4.0.0-0.nightly-2019-04-20-175518

Created attachment 1557483: container_cpu_usage_seconds_total{id!~"/kubepods.slice/.*"} returns datapoints
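
The check used in this bug can be scripted against the Prometheus HTTP API (`/api/v1/query`). The following is only a rough sketch, not the e2e test that was added to origin: the helper names, the `prometheus.example` host, and the sample response are hypothetical, and authentication and the actual HTTP request are left out. It relies on two Prometheus behaviours: label regexes are anchored at both ends, and a missing label is treated as an empty string.

```python
import json
import re
from urllib.parse import urlencode

# Query from this bug: CPU usage series whose cgroup id is outside kubepods.slice.
QUERY = 'container_cpu_usage_seconds_total{id!~"/kubepods.slice/.*"}'

# Prometheus anchors label regexes at both ends, so fullmatch mirrors `!~`.
POD_CGROUP = re.compile(r"/kubepods.slice/.*")


def query_url(base_url, query=QUERY):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return base_url.rstrip("/") + "/api/v1/query?" + urlencode({"query": query})


def non_pod_series(response_body):
    """Return the series in an instant-query response that are not pod cgroups."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError("query failed: %s" % payload.get("error", "unknown error"))
    return [
        series
        for series in payload["data"]["result"]
        # A missing `id` label is treated as the empty string, as Prometheus does.
        if not POD_CGROUP.fullmatch(series["metric"].get("id", ""))
    ]


# Hypothetical response body, shaped like a Prometheus instant-vector result.
SAMPLE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"id": "/system.slice/kubelet.service"},
             "value": [1555800000, "1234.5"]},
            {"metric": {"id": "/kubepods.slice/kubepods-burstable.slice"},
             "value": [1555800000, "99.1"]},
        ],
    },
})

if __name__ == "__main__":
    print(query_url("https://prometheus.example"))
    for series in non_pod_series(SAMPLE):
        print(series["metric"]["id"])
```

An empty list from `non_pod_series` corresponds to the original symptom (no datapoints for non-pod services); a non-empty list corresponds to the verified fix.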

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758