Bug 1573717 - No prometheus labels set for pod slices, which means we can't query by label selector for CPU attributed to a pod
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Pod
Version: 3.10.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.11.0
Assigned To: Derek Carr
QA Contact: DeShuai Ma
Keywords: TestCaseNeeded
Depends On:
Blocks:
Reported: 2018-05-02 02:28 EDT by Clayton Coleman
Modified: 2018-10-11 03:19 EDT (History)

See Also:
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-10-11 03:19:10 EDT
Type: Bug


External Trackers
Tracker ID                             Priority  Status  Summary  Last Updated
Red Hat Product Errata RHBA-2018:2652  None      None    None     2018-10-11 03:19 EDT

Description Clayton Coleman 2018-05-02 02:28:28 EDT
The container scopes have pod_name/namespace labels, but the pod slice doesn't, which I think means that we can't see build CPU metrics. I noticed this at https://prometheus-openshift-monitoring.svc.ci.openshift.org/graph?g0.range_input=6h&g0.expr=rate(haproxy_server_bytes_in_total%5B5m%5D)&g0.tab=0&g1.range_input=1h&g1.expr=sort_desc(container_cpu_usage_rate%7Bnode_role_kubernetes_io_master%3D%22true%22%7D)&g1.tab=0 while looking at the master static pod metrics.

Appears to have been broken since at least 3.9, maybe earlier.

Without these labels we cannot query for CPU or memory by pod.
Comment 1 Derek Carr 2018-05-02 13:20:40 EDT
Pod usage stats collected from the pod-level cgroup have been reported in the kubelet stats summary API since 1.9.

https://github.com/kubernetes/kubernetes/pull/55969
Comment 2 Derek Carr 2018-05-02 23:24:30 EDT
After further discussion, it is clear that the issue is the /metrics/cadvisor endpoint missing the required pod name and namespace labels when called from a Kubernetes context. We may be able to wrap the label decorator func so that it can do an efficient sub-container lookup and decorate pod cgroups with pod_name and pod_namespace.
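
To make the wrapping idea concrete, here is a minimal Python sketch. It is not the kubelet or cAdvisor code (that is Go, and the real change is in the PR linked in the next comment); the names pod_uid_from_cgroup, wrap_labeler and the pods lookup table are hypothetical. It assumes systemd-style cgroup paths like the one in the verification output later in this bug, where the pod UID appears in the slice name with '-' written as '_', and it uses the label names pod_name and namespace as they appear in that output.

```python
import re
from typing import Callable, Dict, Optional, Tuple

Labels = Dict[str, str]
# Hypothetical lookup table: pod UID -> (pod_name, namespace).
PodIndex = Dict[str, Tuple[str, str]]

# Matches a path that ends in a pod-level slice, e.g.
# ".../kubepods-besteffort-podfcaf46e6_afd8_11e8_a6a2_fa163ed26999.slice".
# Container scopes end in ".scope" and therefore do not match.
POD_SLICE_RE = re.compile(r"-pod([0-9a-f_]+)\.slice$")


def pod_uid_from_cgroup(cgroup_id: str) -> Optional[str]:
    """Return the pod UID if cgroup_id is a pod-level slice, else None."""
    match = POD_SLICE_RE.search(cgroup_id)
    if match is None:
        return None
    # The slice name carries the UID with '-' written as '_'; undo that.
    return match.group(1).replace("_", "-")


def wrap_labeler(base: Callable[[str], Labels], pods: PodIndex) -> Callable[[str], Labels]:
    """Wrap a label function so pod-level cgroups also get pod labels."""

    def labeler(cgroup_id: str) -> Labels:
        labels = dict(base(cgroup_id))
        uid = pod_uid_from_cgroup(cgroup_id)
        if uid is not None and uid in pods:
            name, namespace = pods[uid]
            # Only fill in labels the base labeler did not already set.
            labels.setdefault("pod_name", name)
            labels.setdefault("namespace", namespace)
        return labels

    return labeler
```

With a pods table populated from the running containers, the wrapped labeler leaves container scopes and non-pod cgroups untouched and only adds labels to pod slices.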
Comment 3 Seth Jennings 2018-05-06 20:39:06 EDT
Kube PR:
https://github.com/kubernetes/kubernetes/pull/63406
Comment 4 Seth Jennings 2018-05-11 01:26:30 EDT
Deferred to 3.11.  PR merged upstream and will come in on the kube 1.11 rebase.
Wait until 3.11 rebases on 1.11 before moving to ON_QA.
Comment 6 DeShuai Ma 2018-09-04 03:21:23 EDT
Verified on OpenShift v3.11.0-0.25.0.
The master static pod metrics are now available in the Prometheus console.

Query:
sum without (cpu) (rate(container_cpu_usage_seconds_total{node_role_kubernetes_io_master="true"}[5m]))

The Prometheus console returns data such as:
{beta_kubernetes_io_arch="amd64",beta_kubernetes_io_instance_type="431ac1fb-1463-4527-b3d1-79245dd698e1",beta_kubernetes_io_os="linux",container_name="c",failure_domain_beta_kubernetes_io_region="regionOne",failure_domain_beta_kubernetes_io_zone="nova",id="/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-podfcaf46e6_afd8_11e8_a6a2_fa163ed26999.slice/docker-9f5182efa3260a7741a3592719dc0742c2a8df05535ecf786b50c4874f567056.scope",image="registry.dev.redhat.io/openshift3/ose-template-service-broker@sha256:f3f805a08103267155f3459885adc884985d77f3c56d11620433d50d16baa24c",instance="qe-juzhao-311-qeos-1-master-etcd-1",job="kubernetes-cadvisor",kubernetes_io_hostname="qe-juzhao-311-qeos-1-master-etcd-1",name="k8s_c_apiserver-rkqrw_openshift-template-service-broker_fcaf46e6-afd8-11e8-a6a2-fa163ed26999_0",namespace="openshift-template-service-broker",node_role_kubernetes_io_master="true",pod_name="apiserver-rkqrw"}	0.001801434737499985
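
With pod_name and namespace present on the series, the same query shape can be narrowed to a single pod. The Python sketch below only builds the query string and a URL for the Prometheus instant-query endpoint (/api/v1/query); it makes no network call. The helper names and the base URL used in any call are hypothetical, and the label values are expected to be plain names that need no escaping.

```python
from urllib.parse import urlencode


def pod_cpu_query(namespace: str, pod_name: str, window: str = "5m") -> str:
    """Build a per-pod CPU rate query in the same shape as the query above."""
    selector = f'namespace="{namespace}",pod_name="{pod_name}"'
    return (
        "sum without (cpu) "
        f"(rate(container_cpu_usage_seconds_total{{{selector}}}[{window}]))"
    )


def instant_query_url(base_url: str, query: str) -> str:
    """Return the URL of a Prometheus instant query for the given expression."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': query})}"
```

Because the expression aggregates only over the cpu label, each cgroup keeps its own id, so any pod-level series and the container-level series for the pod stay separate in the result.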
Comment 9 errata-xmlrpc 2018-10-11 03:19:10 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2652
