Bug 1670330
Summary: | node:node_net_utilisation:sum_irate recording errors due to locked interface | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Kim Borup <kborup> |
Component: | Monitoring | Assignee: | Matthias Loibl <mloibl> |
Status: | CLOSED ERRATA | QA Contact: | Junqi Zhao <juzhao> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 3.11.0 | CC: | adeshpan, calfonso, dcaldwel, grodrigu, jkaur, lserven, mloibl, mluther, mmariyan, rdiazgav, romank, sauchter, sponnaga, surbania |
Target Milestone: | --- | ||
Target Release: | 4.1.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2019-06-04 10:42:19 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Kim Borup
2019-01-29 09:56:04 UTC
The network interface selector is already configurable in the kubernetes-mixin project [1], defaulting to `eth0` [2]. This gives us the possibility of adjusting the interface selector, but at the wrong stage: at cluster monitoring operator compile time, not at run time. Maybe mloibl knows of a case where we have templated rule values at run time before?

[1] https://github.com/kubernetes-monitoring/kubernetes-mixin/blob/master/rules/rules.libsonnet#L328
[2] https://github.com/kubernetes-monitoring/kubernetes-mixin/blob/master/config.libsonnet#L15

After talking to Matthias, Casey and Frederic, we can change the default value 'device="eth0"' to a regex ignoring the interfaces that we don't want. In the long term the network operator could expose the names for us, which could then be templated into the rules manifest by the cluster monitoring operator.

Assigning to Matthias for now. Let me know if you want me to look into this further.

https://github.com/openshift/cluster-monitoring-operator/pull/226 merged, hence this fix will soon be available in OpenShift 4.0. Thanks for the report and thanks Matthias for looking into this.

It seems it is the same bug as bug 1654907.

*** Bug 1654907 has been marked as a duplicate of this bug. ***

Tested with 4.0.0-0.nightly-2019-03-06-074438: the device name is not restricted to eth0, "veth.+" devices are excluded, and network stats are shown in Grafana / Prometheus.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days.
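For readers following the thread, here is a minimal Python sketch of how the approach discussed above behaves: instead of selecting only `device="eth0"`, a negative regex drops the interfaces that are not wanted. The pattern `veth.+` is taken from the verification comment. The function name `selected_devices` and the sample interface names are hypothetical, and the exact selector merged in the pull request may differ. The sketch uses a full match because Prometheus anchors label regexes at both ends.

```python
import re

# Pattern taken from the verification comment ("veth.+" devices are excluded).
# Assumption: the rule uses a negative matcher along the lines of device!~"veth.+".
EXCLUDED_DEVICE_PATTERN = r"veth.+"


def selected_devices(devices, excluded_pattern=EXCLUDED_DEVICE_PATTERN):
    """Return the device names that a negative regex matcher would keep.

    Prometheus label regexes are fully anchored, so re.fullmatch is used
    rather than re.search to mimic that behaviour.
    """
    excluded = re.compile(excluded_pattern)
    return [name for name in devices if excluded.fullmatch(name) is None]


if __name__ == "__main__":
    # Hypothetical interface names, for illustration only.
    sample = ["eth0", "ens3", "veth1a2b3c", "lo"]
    print(selected_devices(sample))
```

With this kind of selector, nodes whose primary interface is not named `eth0` (for example `ens3`) still contribute to the recording rule, while per-pod `veth` pairs are left out.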