Bug 1612813 - hawkular-metrics pod failed to start up due to unsuccessful version check
Summary: hawkular-metrics pod failed to start up due to unsuccessful version check
Keywords:
Status: CLOSED DUPLICATE of bug 1612648
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Hawkular
Version: 3.7.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.6.z
Assignee: John Sanda
QA Contact: Junqi Zhao
URL:
Whiteboard:
Depends On: 1611941 1619497
Blocks: 1612648 1613095
Reported: 2018-08-06 10:50 UTC by Anping Li
Modified: 2018-08-21 04:15 UTC
CC List: 6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1611941
Environment:
Last Closed: 2018-08-06 18:33:36 UTC
Target Upstream Version:



Comment 1 Anping Li 2018-08-06 10:51:13 UTC
I am seeing a similar bug with v3.7. The hawkular-metrics pod keeps restarting and never becomes ready:
[anli@upg_slave_qeos10 37]$ oc describe pod hawkular-metrics-mwpsh
Name:           hawkular-metrics-mwpsh
Namespace:      openshift-infra
Node:           172.16.120.107/172.16.120.107
Start Time:     Mon, 06 Aug 2018 06:34:20 -0400
Labels:         metrics-infra=hawkular-metrics
                name=hawkular-metrics
Annotations:    kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicationController","namespace":"openshift-infra","name":"hawkular-metrics","uid":"2bd612cb-9964-11e8-a...
                openshift.io/scc=restricted
Status:         Running
IP:             10.129.0.98
Controlled By:  ReplicationController/hawkular-metrics
Containers:
  hawkular-metrics:
    Container ID:  docker://2af10703f80eef65e27ba45c6c35447cddeb634c509d431ccfd75c59344aa113
    Image:         registry.access.stage.redhat.com/openshift3/metrics-hawkular-metrics:v3.7
    Image ID:      docker-pullable://registry.access.stage.redhat.com/openshift3/metrics-hawkular-metrics@sha256:0c804ff20cb23c1d340d8b8d50fa3e4808a7f46e91d603960c0a2087f545da75
    Ports:         8080/TCP, 8443/TCP, 8888/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP
    Command:
      /opt/hawkular/scripts/hawkular-metrics-wrapper.sh
      -b
      0.0.0.0
      -Dhawkular.metrics.cassandra.nodes=hawkular-cassandra
      -Dhawkular.metrics.cassandra.use-ssl
      -Dhawkular.metrics.openshift.auth-methods=openshift-oauth,htpasswd
      -Dhawkular.metrics.openshift.htpasswd-file=/hawkular-account/hawkular-metrics.htpasswd
      -Dhawkular.metrics.allowed-cors-access-control-allow-headers=authorization
      -Dhawkular.metrics.default-ttl=7
      -Dhawkular.metrics.admin-tenant=_hawkular_admin
      -Dhawkular-alerts.cassandra-nodes=hawkular-cassandra
      -Dhawkular-alerts.cassandra-use-ssl
      -Dhawkular.alerts.openshift.auth-methods=openshift-oauth,htpasswd
      -Dhawkular.alerts.openshift.htpasswd-file=/hawkular-account/hawkular-metrics.htpasswd
      -Dhawkular.alerts.allowed-cors-access-control-allow-headers=authorization
      -Dorg.apache.tomcat.util.buf.UDecoder.ALLOW_ENCODED_SLASH=true
      -Dorg.apache.catalina.connector.CoyoteAdapter.ALLOW_BACKSLASH=true
      -Dcom.datastax.driver.FORCE_NIO=true
      -DKUBERNETES_MASTER_URL=https://kubernetes.default.svc
      -DUSER_WRITE_ACCESS=False
      -Dhawkular.metrics.jmx-reporting-enabled
    State:          Running
      Started:      Mon, 06 Aug 2018 06:50:31 -0400
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Mon, 06 Aug 2018 06:44:31 -0400
      Finished:     Mon, 06 Aug 2018 06:50:31 -0400
    Ready:          False
    Restart Count:  2
    Limits:
      memory:  2500M
    Requests:
      memory:   1500M
    Liveness:   exec [/opt/hawkular/scripts/hawkular-metrics-liveness.py] delay=0s timeout=1s period=10s #success=1 #failure=3
    Readiness:  exec [/opt/hawkular/scripts/hawkular-metrics-readiness.py] delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      POD_NAMESPACE:                  openshift-infra (v1:metadata.namespace)
      MASTER_URL:                     https://kubernetes.default.svc
      JGROUPS_PASSWORD:               QZEXRDtDeIQE2PBmi
      TRUSTSTORE_AUTHORITIES:         /hawkular-metrics-certs/tls.truststore.crt
      ENABLE_PROMETHEUS_ENDPOINT:     True
      OPENSHIFT_KUBE_PING_NAMESPACE:  openshift-infra (v1:metadata.namespace)
      OPENSHIFT_KUBE_PING_LABELS:     metrics-infra=hawkular-metrics,name=hawkular-metrics
      STARTUP_TIMEOUT:                500
    Mounts:
      /hawkular-account from hawkular-metrics-account (rw)
      /hawkular-metrics-certs from hawkular-metrics-certs (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from hawkular-token-skj24 (ro)
Conditions:
  Type           Status
  Initialized    True 
  Ready          False 
  PodScheduled   True 
Volumes:
  hawkular-metrics-certs:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  hawkular-metrics-certs
    Optional:    false
  hawkular-metrics-account:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  hawkular-metrics-account
    Optional:    false
  hawkular-token-skj24:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  hawkular-token-skj24
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     <none>
Events:
  Type     Reason                 Age   From                     Message
  ----     ------                 ----  ----                     -------
  Normal   Scheduled              16m   default-scheduler        Successfully assigned hawkular-metrics-mwpsh to 172.16.120.107
  Normal   SuccessfulMountVolume  16m   kubelet, 172.16.120.107  MountVolume.SetUp succeeded for volume "hawkular-metrics-certs"
  Normal   SuccessfulMountVolume  16m   kubelet, 172.16.120.107  MountVolume.SetUp succeeded for volume "hawkular-token-skj24"
  Normal   SuccessfulMountVolume  16m   kubelet, 172.16.120.107  MountVolume.SetUp succeeded for volume "hawkular-metrics-account"
  Normal   Pulling                16m   kubelet, 172.16.120.107  pulling image "registry.access.stage.redhat.com/openshift3/metrics-hawkular-metrics:v3.7"
  Normal   Pulled                 12m   kubelet, 172.16.120.107  Successfully pulled image "registry.access.stage.redhat.com/openshift3/metrics-hawkular-metrics:v3.7"
  Normal   Started                12m   kubelet, 172.16.120.107  Started container
  Normal   Created                12m   kubelet, 172.16.120.107  Created container
  Warning  Unhealthy              12m   kubelet, 172.16.120.107  Liveness probe failed: Failed to access the status endpoint : <urlopen error [Errno 111] Connection refused>.
Traceback (most recent call last):
  File "/opt/hawkular/scripts/hawkular-metrics-liveness.py", line 48, in <module>
    if int(uptime) < int(timeout):
ValueError: invalid literal for int() with base 10: ''
  Warning  Unhealthy  11m (x2 over 12m)  kubelet, 172.16.120.107  Readiness probe failed: Failed to access the status endpoint : <urlopen error [Errno 111] Connection refused>. This may be due to Hawkular Metrics not being ready yet. Will try again.
  Warning  Unhealthy  11m                kubelet, 172.16.120.107  Readiness probe failed: Failed to access the status endpoint : HTTP Error 404: Not Found. This may be due to Hawkular Metrics not being ready yet. Will try again.
  Warning  Unhealthy  6m (x3 over 6m)    kubelet, 172.16.120.107  Liveness probe failed: The MetricsService is in a FAILED state. Aborting
  Warning  Unhealthy  1m (x57 over 11m)  kubelet, 172.16.120.107  Readiness probe failed: The MetricService is not yet in the STARTED state [STARTING]. We need to wait until its in the STARTED state.
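
The traceback in the first liveness failure above shows int(uptime) being called on an empty string, so the probe dies with a ValueError instead of reporting a state. A minimal sketch of a defensive parse follows. It assumes the uptime value reaches the check as a string that can be empty while the container is still starting. The names parse_seconds and within_startup_window are hypothetical and are not taken from hawkular-metrics-liveness.py, and treating an unknown uptime as "still starting" is an assumed policy, not the script's actual behavior.

```python
def parse_seconds(raw, default=None):
    """Return raw as an int, or default when raw is empty or not numeric."""
    try:
        return int(str(raw).strip())
    except (TypeError, ValueError):
        return default


def within_startup_window(uptime_raw, timeout_raw):
    """Return True if the container should still be given time to start.

    An unknown uptime (for example an empty string) is treated as
    "still starting". That policy is an assumption made for this sketch.
    """
    uptime = parse_seconds(uptime_raw)
    timeout = parse_seconds(timeout_raw, default=0)
    if uptime is None:
        return True
    return uptime < timeout
```

With the STARTUP_TIMEOUT of 500 shown in the pod environment, within_startup_window('', '500') returns True rather than raising, and within_startup_window('600', '500') returns False.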

Comment 2 John Sanda 2018-08-06 18:33:36 UTC

*** This bug has been marked as a duplicate of bug 1612648 ***

