Bug 1305100

Summary: Metrics do not show for Resource/Quota projects but work for unbound projects
Product:          OpenShift Container Platform
Component:        Hawkular
Version:          3.1.0
Status:           CLOSED DUPLICATE
Severity:         high
Priority:         high
Reporter:         Boris Kurktchiev <kurktchiev>
Assignee:         Matt Wringe <mwringe>
QA Contact:       chunchen <chunchen>
CC:               aos-bugs, erjones, hrosnet, mwringe, wsun
Target Milestone: ---
Target Release:   ---
Hardware:         Unspecified
OS:               Unspecified
Doc Type:         Bug Fix
Type:             Bug
Last Closed:      2016-02-08 14:54:40 UTC

Attachments:
  Screenshot (no flags)

Description Boris Kurktchiev 2016-02-05 15:33:42 UTC
Description of problem:
The pie charts for limited projects are empty, while for unlimited projects the pod metrics are shown and work just fine.

Version-Release number of selected component (if applicable):
OSE 3.1.1

How reproducible:
See steps below.

Steps to Reproduce:
1. Set limits on a project (the template used is shown under Additional info).
2. Fire up the project and deploy a pod.
3. Click on the Metrics tab.

Actual results:
No metrics are shown for projects with limits.

Expected results:
Populated pie charts with metrics for projects with limits.

Additional info:
Here is what my project template looks like:
apiVersion: v1
kind: Template
metadata:
  creationTimestamp: null
  name: project-limited
objects:
- apiVersion: v1
  kind: Project
  metadata:
    annotations:
      openshift.io/description: ${PROJECT_DESCRIPTION}
      openshift.io/display-name: ${PROJECT_DISPLAYNAME}
    creationTimestamp: null
    name: ${PROJECT_NAME}
  spec: {}
  status: {}
- apiVersion: v1
  groupNames: []
  kind: RoleBinding
  metadata:
    creationTimestamp: null
    name: admins
    namespace: ${PROJECT_NAME}
  roleRef:
    name: admin
  subjects:
  - kind: User
    name: ${PROJECT_ADMIN_USER}
  userNames:
  - ${PROJECT_ADMIN_USER}
- apiVersion: v1
  groupNames:
  - system:serviceaccounts:${PROJECT_NAME}
  kind: RoleBinding
  metadata:
    creationTimestamp: null
    name: system:image-pullers
    namespace: ${PROJECT_NAME}
  roleRef:
    name: system:image-puller
  subjects:
  - kind: SystemGroup
    name: system:serviceaccounts:${PROJECT_NAME}
  userNames: []
- apiVersion: v1
  groupNames: []
  kind: RoleBinding
  metadata:
    creationTimestamp: null
    name: system:image-builders
    namespace: ${PROJECT_NAME}
  roleRef:
    name: system:image-builder
  subjects:
  - kind: ServiceAccount
    name: builder
  userNames:
  - system:serviceaccount:${PROJECT_NAME}:builder
- apiVersion: v1
  groupNames: []
  kind: RoleBinding
  metadata:
    creationTimestamp: null
    name: system:deployers
    namespace: ${PROJECT_NAME}
  roleRef:
    name: system:deployer
  subjects:
  - kind: ServiceAccount
    name: deployer
  userNames:
  - system:serviceaccount:${PROJECT_NAME}:deployer
- apiVersion: "v1"
  kind: "ResourceQuota"
  metadata:
    name: "default-quota"
  spec:
    hard:
      memory: "2560Mi"
      cpu: "10"
      pods: "5"
      services: "5"
      replicationcontrollers: "5"
      resourcequotas: "1"
      persistentvolumeclaims: "5"
- apiVersion: "v1"
  kind: "LimitRange"
  metadata:
    name: "default-limits"
  spec:
    limits:
      -
        type: "Pod"
        max:
          cpu: "2"
          memory: "512Mi"
        min:
          cpu: "200m"
          memory: "6Mi"
      -
        type: "Container"
        max:
          cpu: "2"
          memory: "512Mi"
        min:
          cpu: "200m"
          memory: "6Mi"
        default:
          cpu: "500m"
          memory: "512Mi"
        defaultRequest:
          cpu: "500m"
          memory: "512Mi"
        maxLimitRequestRatio:
          cpu: "10"
parameters:
- name: PROJECT_NAME
- name: PROJECT_DISPLAYNAME
- name: PROJECT_DESCRIPTION
- name: PROJECT_ADMIN_USER
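
(For reference, a template like this is typically instantiated with 'oc process' piped into 'oc create'; the file name and parameter values below are placeholders, not the exact invocation used in this environment:)

# placeholder file name and parameter values
oc process -f project-limited.yaml \
  -v PROJECT_NAME=boris,PROJECT_ADMIN_USER=boris \
  | oc create -f -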

Comment 1 Matt Wringe 2016-02-05 15:40:59 UTC
"The pie charts of limited projects are empty, while on unlimited ones metrics for pods are shown and work just fine"

Just to clarify, it's only the pie chart that is missing, right? Are you still seeing the normal chart displayed for limited resources?

Can you please post the output of 'oc get -o yaml pod <PODNAME>' for the pod which is limited?

Comment 2 Boris Kurktchiev 2016-02-05 15:44:26 UTC
Created attachment 1121423 [details]
Screenshot

Comment 3 Boris Kurktchiev 2016-02-05 15:46:12 UTC
No, there is nothing displayed for limited projects; I have attached a screenshot above. Here is the output:

apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubernetes.io/created-by: |
      {"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicationController","namespace":"boris","name":"cakephp-example-1","uid":"20fe3798-cc1b-11e5-9c5a-005056a6874f","apiVersion":"v1","resourceVersion":"5685"}}
    kubernetes.io/limit-ranger: 'LimitRanger plugin set: cpu, memory request for container
      cakephp-example; cpu, memory limit for container cakephp-example'
    openshift.io/deployment-config.latest-version: "1"
    openshift.io/deployment-config.name: cakephp-example
    openshift.io/deployment.name: cakephp-example-1
    openshift.io/scc: restricted
  creationTimestamp: 2016-02-05T15:14:25Z
  generateName: cakephp-example-1-
  labels:
    deployment: cakephp-example-1
    deploymentconfig: cakephp-example
    name: cakephp-example
  name: cakephp-example-1-pi5ik
  namespace: boris
  resourceVersion: "5711"
  selfLink: /api/v1/namespaces/boris/pods/cakephp-example-1-pi5ik
  uid: 24c52a25-cc1b-11e5-9c5a-005056a6874f
spec:
  containers:
  - env:
    - name: DATABASE_SERVICE_NAME
    - name: DATABASE_ENGINE
    - name: DATABASE_NAME
    - name: DATABASE_USER
    - name: DATABASE_PASSWORD
    - name: CAKEPHP_SECRET_TOKEN
      value: pTR5m6cfn_Kf8XVynWuw7qS2CRTU5q0UVMhND8fI_geWwGtoLb
    - name: CAKEPHP_SECURITY_SALT
      value: 7gUCEuvRbMpJG6U3gGT4URJl8JMKs8U6h3tbEfju
    - name: CAKEPHP_SECURITY_CIPHER_SEED
      value: "082603343523286654421652166200"
    - name: OPCACHE_REVALIDATE_FREQ
      value: "2"
    image: 172.30.134.24:5000/boris/cakephp-example@sha256:5577abab95db9618f1b2c6c6e995dbe2e781ce8cd4f3914e37c9c56dcbcdc17e
    imagePullPolicy: IfNotPresent
    name: cakephp-example
    ports:
    - containerPort: 8080
      protocol: TCP
    resources:
      limits:
        cpu: 500m
        memory: 512Mi
      requests:
        cpu: 500m
        memory: 512Mi
    securityContext:
      privileged: false
      runAsUser: 1000050000
      seLinuxOptions:
        level: s0:c7,c4
    terminationMessagePath: /dev/termination-log
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-vb8d3
      readOnly: true
  dnsPolicy: ClusterFirst
  host: osnode0s.devapps.unc.edu
  imagePullSecrets:
  - name: default-dockercfg-v5904
  nodeName: osnode0s.devapps.unc.edu
  nodeSelector:
    region: primary
    zone: cloudapps
  restartPolicy: Always
  securityContext:
    seLinuxOptions:
      level: s0:c7,c4
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  volumes:
  - name: default-token-vb8d3
    secret:
      secretName: default-token-vb8d3
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: 2016-02-05T15:14:27Z
    status: "True"
    type: Ready
  containerStatuses:
  - containerID: docker://eaae029afa9e0a0d2baa8f098a1e1672444ec47213a8d5fad47089bcddd97718
    image: 172.30.134.24:5000/boris/cakephp-example@sha256:5577abab95db9618f1b2c6c6e995dbe2e781ce8cd4f3914e37c9c56dcbcdc17e
    imageID: docker://96b3704b3ef0c4528108cfeda2679bde2d5c4564522c6a79b0862e7216c95d39
    lastState: {}
    name: cakephp-example
    ready: true
    restartCount: 0
    state:
      running:
        startedAt: 2016-02-05T15:14:27Z
  hostIP: 152.19.229.207
  phase: Running
  podIP: 10.1.3.8
  startTime: 2016-02-05T15:14:25Z

Comment 4 Matt Wringe 2016-02-05 15:50:24 UTC
Ok, I understand your issue a bit better now. It's not that the charts are missing; it's that they show zero values.

A couple of things could be causing this:

1) issue with certificates: https://github.com/openshift/origin-metrics/blob/master/docs/troubleshooting.adoc#empty-charts

2) issue with permissions. If you run 'oadm policy who-can get pods --all-namespaces', does the Heapster user appear in the list?

The heapster logs may also help indicate what is going on here.
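
For example, something like the following would pull them (the pod name is a placeholder; list the pods in the openshift-infra project to find the real one):

oc get pods -n openshift-infra          # find the heapster-xxxxx pod name
oc logs heapster-xxxxx -n openshift-infra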

Comment 5 Boris Kurktchiev 2016-02-05 16:03:43 UTC
#1 doesn't seem relevant: as you can see in my screenshot, I am not getting the error message, and as I already said, I see metrics for non-limited projects from the same account and same browser.

#2 here is the output
----> oadm policy who-can get pods --all-namespaces
Namespace: <all>
Verb:      get
Resource:  pods

Users:  boris
        system:serviceaccount:management-infra:management-admin
        system:serviceaccount:openshift-infra:build-controller
        system:serviceaccount:openshift-infra:deployment-controller
        system:serviceaccount:openshift-infra:heapster
        system:serviceaccount:openshift-infra:pv-recycler-controller

Groups: system:cluster-admins
        system:cluster-readers
        system:masters
        system:nodes

Comment 6 Matt Wringe 2016-02-05 16:20:21 UTC
#1 can be relevant if you have a multi-node setup where your limited pods are being deployed to one node and your non-limited pods to another. In that case you could have a situation where one node is not configured properly while the other nodes are.

'oc get -o yaml pod <PODNAME>' will have a nodeName parameter. If you could verify that it's not a node issue (e.g. that other pods on the same node are displaying metrics), that would help rule it out.
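
For example (pod name is a placeholder):

oc get -o yaml pod <PODNAME> | grep nodeName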

Ok, good, from #2 it looks like your user has the right permissions.

Are there any errors in your Heapster logs?

Comment 7 Boris Kurktchiev 2016-02-05 16:26:59 UTC
I definitely do have a multi-node setup, and my limited pods are running in a separate zone from my non-limited ones.

Heapster just has this in its logs:
Starting Heapster with the following arguments: --source=kubernetes:https://kubernetes.default.svc:443?useServiceAccount=true&kubeletHttps=true&kubeletPort=10250 --sink=hawkular:https://hawkular-metrics:443?tenant=_system&labelToTenant=pod_namespace&caCert=/hawkular-cert/hawkular-metrics-ca.certificate&user=hawkular&pass=EGKWHny8x-2vv_0&filter=label(container_name:^/system.slice.*|^/user.slice) --logtostderr=true --tls_cert=/secrets/heapster.cert --tls_key=/secrets/heapster.key --tls_client_ca=/secrets/heapster.client-ca --allowed_users=system:master-proxy
I0205 10:11:54.826953       1 heapster.go:60] heapster --source=kubernetes:https://kubernetes.default.svc:443?useServiceAccount=true&kubeletHttps=true&kubeletPort=10250 --sink=hawkular:https://hawkular-metrics:443?tenant=_system&labelToTenant=pod_namespace&caCert=/hawkular-cert/hawkular-metrics-ca.certificate&user=hawkular&pass=EGKWHny8x-2vv_0&filter=label(container_name:^/system.slice.*|^/user.slice) --logtostderr=true --tls_cert=/secrets/heapster.cert --tls_key=/secrets/heapster.key --tls_client_ca=/secrets/heapster.client-ca --allowed_users=system:master-proxy
I0205 10:11:54.830374       1 heapster.go:61] Heapster version 0.18.0
I0205 10:11:54.830930       1 kube_factory.go:168] Using Kubernetes client with master "https://kubernetes.default.svc:443" and version "v1"
I0205 10:11:54.830941       1 kube_factory.go:169] Using kubelet port 10250
I0205 10:11:54.831237       1 driver.go:491] Initialised Hawkular Sink with parameters {_system https://hawkular-metrics:443?tenant=_system&labelToTenant=pod_namespace&caCert=/hawkular-cert/hawkular-metrics-ca.certificate&user=hawkular&pass=EGKWHny8x-2vv_0&filter=label(container_name:^/system.slice.*|^/user.slice) 0xc20818c5a0 }
I0205 10:11:55.015882       1 heapster.go:71] Starting heapster on port 8082
W0205 11:11:54.953527       1 reflector.go:224] /builddir/build/BUILD/heapster-af4752e8f366cdce38d5b768c2f6ec36248ec9e3/_build/src/k8s.io/heapster/sources/pods.go:173: watch of *api.Pod ended with: very short watch

Comment 8 Boris Kurktchiev 2016-02-05 16:29:45 UTC
Ok, so it looks like the installer did not set NodeIPs in node.yaml, but their certs do contain the right IPs. I can try setting all that and bouncing.

Comment 9 Boris Kurktchiev 2016-02-05 16:39:10 UTC
OK, so it looks like it will not work right without nodeIP: being defined. It would probably be a good idea to update both the metrics documentation and the Ansible installer example files to mention that, if you want metrics to work, you should enable the openshift_node_set_node_ip=True and openshift_set_hostname=True options during install.
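
For reference, a rough sketch of the settings involved (the config path may vary by install; the IP shown is the hostIP from the pod output above):

# node config (e.g. /etc/origin/node/node-config.yaml)
nodeIP: 152.19.229.207

# Ansible inventory, [OSEv3:vars] section
openshift_node_set_node_ip=True
openshift_set_hostname=True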

Comment 10 Matt Wringe 2016-02-08 14:54:40 UTC

*** This bug has been marked as a duplicate of bug 1305131 ***