Bug 1650020 - hawkular-metrics pod failed to become ready, infinispan configuration is not right
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Hawkular
Version: 3.6.1
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 3.6.z
Assignee: Jan Martiska
QA Contact: Junqi Zhao
Keywords: Regression, TestBlocker
Reported: 2018-11-15 07:19 UTC by Junqi Zhao
Modified: 2018-12-03 17:34 UTC (History)
Last Closed: 2018-12-03 17:33:57 UTC


Attachments
metrics 36 logs (17.76 KB, application/x-gzip), 2018-11-15 07:19 UTC, Junqi Zhao
v3.6.173.0.138_standalone.xml (20.69 KB, text/plain), 2018-11-15 07:27 UTC, Junqi Zhao
v3.6.173.0.130-3_standalone.xml (27.29 KB, application/xml), 2018-11-15 07:28 UTC, Junqi Zhao
patch file (13.23 KB, patch), 2018-11-16 06:21 UTC, Jan Martiska
metrics pods log - v3.6.173.0.139 (18.22 KB, application/x-gzip), 2018-11-19 13:30 UTC, Junqi Zhao
metrics logs with metrics-hawkular-metrics-v3.6.173.0.139-2 (20.11 KB, application/x-gzip), 2018-11-21 08:29 UTC, Junqi Zhao


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2018:3598 None None None 2018-12-03 17:34 UTC

Description Junqi Zhao 2018-11-15 07:19:46 UTC
Created attachment 1505961 [details]
metrics 36 logs

Description of problem:
After deploying metrics v3.6.173.0.138, the hawkular-metrics pod does not become READY.
# oc -n openshift-infra get pod
NAME                         READY     STATUS    RESTARTS   AGE
hawkular-cassandra-1-ctc07   1/1       Running   0          38m
hawkular-metrics-t9tz5       0/1       Running   2          18m
heapster-zbz8g               0/1       Running   4          37m

# oc -n openshift-infra logs hawkular-metrics-t9tz5
2018-11-15 06:40:32,319 INFO  [org.wildfly.security] (MSC service thread 1-2) ELY00001: WildFly Elytron version 1.1.10.Final-redhat-1
2018-11-15 06:40:32,427 ERROR [org.jboss.as.controller.management-operation] (Controller Boot Thread) WFLYCTL0013: Operation ("deploy") failed - address: ([("deployment" => "hawkular-metrics.war")]) - failure description: {
    "WFLYCTL0412: Required services that are not installed:" => [
        "jboss.naming.context.java.jboss.infinispan.container.hawkular-metrics",
        "jboss.naming.context.java.jboss.infinispan.cache.hawkular-metrics.locks"
    ],
    "WFLYCTL0180: Services with missing/unavailable dependencies" => [
        "jboss.naming.context.java.module.hawkular-metrics.hawkular-metrics.env.container.hawkular-metrics is missing [jboss.naming.context.java.jboss.infinispan.container.hawkular-metrics]",
        "jboss.naming.context.java.module.hawkular-metrics.hawkular-metrics.env.cache.locks is missing [jboss.naming.context.java.jboss.infinispan.cache.hawkular-metrics.locks]"
    ]
}
2018-11-15 06:40:32,437 INFO  [org.jboss.as.server] (ServerService Thread Pool -- 34) WFLYSRV0010: Deployed "hawkular-metrics.war" (runtime-name : "hawkular-metrics.war")
2018-11-15 06:40:32,437 INFO  [org.jboss.as.server] (ServerService Thread Pool -- 34) WFLYSRV0010: Deployed "activemq-rar.rar" (runtime-name : "activemq-rar.rar")
2018-11-15 06:40:32,440 INFO  [org.jboss.as.controller] (Controller Boot Thread) WFLYCTL0183: Service status report
WFLYCTL0184:    New missing/unsatisfied dependencies:
      service jboss.naming.context.java.jboss.infinispan.cache.hawkular-metrics.locks (missing) dependents: [service jboss.naming.context.java.module.hawkular-metrics.hawkular-metrics.env.cache.locks] 
      service jboss.naming.context.java.jboss.infinispan.container.hawkular-metrics (missing) dependents: [service jboss.naming.context.java.module.hawkular-metrics.hawkular-metrics.env.container.hawkular-metrics] 
2018-11-15 06:40:32,497 INFO  [org.jboss.as.server] (Controller Boot Thread) WFLYSRV0212: Resuming server
2018-11-15 06:40:32,499 INFO  [org.jboss.as] (Controller Boot Thread) WFLYSRV0060: Http management interface listening on http://127.0.0.1:9990/management
2018-11-15 06:40:32,499 INFO  [org.jboss.as] (Controller Boot Thread) WFLYSRV0051: Admin console listening on http://127.0.0.1:9990
2018-11-15 06:40:32,499 ERROR [org.jboss.as] (Controller Boot Thread) WFLYSRV0026: JBoss EAP 7.1.4.GA (WildFly Core 3.0.17.Final-redhat-1) started (with errors) in 8689ms - Started 424 of 709 services (22 services failed or missing dependencies, 391 services are lazy, passive or on-demand)
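
The missing services reported in the log above can be pulled out of a saved pod log mechanically. The following is a minimal sketch, assuming the log was first saved to a file (for example by redirecting the `oc -n openshift-infra logs` output shown above). The function name `missing_services` is made up for illustration and is not part of any OpenShift or WildFly tooling.

```shell
# Hypothetical helper (not part of oc or WildFly): print the unique service
# names listed under "WFLYCTL0412: Required services that are not installed"
# in a saved WildFly/EAP log file.
missing_services() {
    # $1: path to a saved pod log
    awk '
        /WFLYCTL0412/ { in_block = 1; next }   # header line opens the list
        in_block && /\]/ { in_block = 0 }      # closing bracket ends the list
        in_block {
            gsub(/[ \t",]/, "")                # strip indentation, quotes, commas
            if ($0 != "") print
        }
    ' "$1" | sort -u
}
```

Run against the log in this report, `missing_services pod.log` would list the two `jboss.naming.context.java.jboss.infinispan.*` names, which makes it easy to compare the failure between image builds.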

Version-Release number of selected component (if applicable):
# openshift version
openshift v3.6.173.0.138
kubernetes v1.6.1+5115d708d7
etcd 3.2.1

metrics images version: v3.6.173.0.138

How reproducible:
Always

Steps to Reproduce:
1. Deploy metrics v3.6.173.0.138

Actual results:
hawkular-metrics pod is not READY

Expected results:
hawkular-metrics should be READY

Additional info:

Comment 3 Junqi Zhao 2018-11-15 07:27 UTC
Created attachment 1505962 [details]
v3.6.173.0.138_standalone.xml

Comment 4 Junqi Zhao 2018-11-15 07:28 UTC
Created attachment 1505963 [details]
v3.6.173.0.130-3_standalone.xml
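
The two attached standalone.xml files can be compared for the Infinispan definition that the failing deployment depends on. The sketch below rests on an assumption that this report does not confirm: that the missing JNDI name `jboss.infinispan.container.hawkular-metrics` corresponds to a `<cache-container name="hawkular-metrics">` element in the Infinispan subsystem configuration. The helper name `has_cache_container` is hypothetical, and the check assumes the `name` attribute sits on the same line as the opening tag.

```shell
# Hypothetical helper: succeed (exit 0) if the given standalone.xml contains a
# <cache-container> element with the given name attribute.
# Assumption: the name attribute is on the same line as the opening tag.
has_cache_container() {
    # $1: path to standalone.xml, $2: cache container name
    grep -q "<cache-container[^>]* name=\"$2\"" "$1"
}
```

For example, `has_cache_container v3.6.173.0.138_standalone.xml hawkular-metrics && echo defined || echo missing`, repeated for v3.6.173.0.130-3_standalone.xml, would show whether the container definition differs between the two builds.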

Comment 11 Jan Martiska 2018-11-16 06:21 UTC
Created attachment 1506308 [details]
patch file

Comment 13 Junqi Zhao 2018-11-19 13:22:25 UTC
The latest metrics-hawkular-metrics image is metrics-hawkular-metrics:v3.6.173.0.139-1, and the issue is still not fixed.

The same error still appears in the metrics-hawkular-metrics pod logs:
2018-11-19 13:10:24,926 ERROR [org.jboss.as.controller.management-operation] (Controller Boot Thread) WFLYCTL0013: Operation ("deploy") failed - address: ([("deployment" => "hawkular-metrics.war")]) - failure description: {
    "WFLYCTL0412: Required services that are not installed:" => [
        "jboss.naming.context.java.jboss.infinispan.container.hawkular-metrics",
        "jboss.naming.context.java.jboss.infinispan.cache.hawkular-metrics.locks"
    ],
    "WFLYCTL0180: Services with missing/unavailable dependencies" => [
        "jboss.naming.context.java.module.hawkular-metrics.hawkular-metrics.env.container.hawkular-metrics is missing [jboss.naming.context.java.jboss.infinispan.container.hawkular-metrics]",
        "jboss.naming.context.java.module.hawkular-metrics.hawkular-metrics.env.cache.locks is missing [jboss.naming.context.java.jboss.infinispan.cache.hawkular-metrics.locks]"
    ]

BTW, since we want to release OpenShift v3.6.173.0.138, the metrics-hawkular-metrics image should use v3.6.173.0.138 too. The latest metrics-hawkular-metrics:v3.6.173.0.138 points to v3.6.173.0.138-2, so the image version that fixes this error should be v3.6.173.0.138-3.

Comment 15 Junqi Zhao 2018-11-19 13:30 UTC
Created attachment 1507223 [details]
metrics pods log - v3.6.173.0.139

Comment 19 Junqi Zhao 2018-11-21 08:26:42 UTC
Tested with metrics-hawkular-metrics-v3.6.173.0.139-2. The issue is not fixed, and hawkular-metrics throws the error: "org.jboss.weld.exceptions.DeploymentException: WELD-001408: Unsatisfied dependencies for type MetricsService with qualifiers @Default"

For more information, please see the attached log file.

# oc -n openshift-infra get pod
NAME                         READY     STATUS    RESTARTS   AGE
hawkular-cassandra-1-kmrtw   1/1       Running   0          22m
hawkular-metrics-rnjpw       0/1       Running   2          22m
heapster-kvpvh               0/1       Running   2          22m
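
The readiness check above can be scripted rather than read by eye. This is a minimal sketch that filters `oc get pod` output for pods whose READY column shows fewer ready containers than expected. The function name `not_ready_pods` is hypothetical, and the sketch assumes the default column layout shown in this report (NAME first, READY second).

```shell
# Hypothetical helper: read `oc get pod` output on stdin and print the name of
# every pod whose READY column (ready/total) is not fully ready.
# Assumes the default layout: header row, NAME in column 1, READY in column 2.
not_ready_pods() {
    awk 'NR > 1 { split($2, r, "/"); if (r[1] != r[2]) print $1 }'
}
```

Usage: `oc -n openshift-infra get pod | not_ready_pods`. For the listing above this prints hawkular-metrics-rnjpw and heapster-kvpvh.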

Comment 20 Junqi Zhao 2018-11-21 08:29 UTC
Created attachment 1507572 [details]
metrics logs with metrics-hawkular-metrics-v3.6.173.0.139-2

Comment 26 Junqi Zhao 2018-11-23 02:18:12 UTC
Tested metrics-hawkular-metrics:v3.6.173.0.139-3; the issue is fixed.

The most important point: we want to release the errata https://errata.devel.redhat.com/advisory/37911, and the OpenShift version is v3.6.173.0.138, so metrics-hawkular-metrics should use v3.6.173.0.138 too. The latest metrics-hawkular-metrics:v3.6.173.0.138 links to v3.6.173.0.138-2, so the image version that fixes this error should be v3.6.173.0.138-3. We should re-tag metrics-hawkular-metrics:v3.6.173.0.139-3 as metrics-hawkular-metrics:v3.6.173.0.138-3.

I have also tested that the following metrics v3.6.173.0.138 images work with metrics-hawkular-metrics-v3.6.173.0.139-3:

metrics-cassandra-v3.6.173.0.138-2
metrics-heapster-v3.6.173.0.138-2
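
The re-tag requested in this comment can be illustrated with plain docker commands. This is only a sketch: the report does not say which tooling performs the re-tag in the release pipeline, so the use of docker here is an assumption, `retag_commands` is a hypothetical helper, and `registry.example.com/openshift3` is a placeholder rather than a real registry path. The helper prints the commands instead of running them.

```shell
# Hypothetical dry-run helper: print the docker commands that would publish an
# existing image build under a second tag. Nothing is executed.
retag_commands() {
    # $1: image repository (placeholder), $2: existing tag, $3: new tag
    printf 'docker pull %s:%s\n' "$1" "$2"
    printf 'docker tag %s:%s %s:%s\n' "$1" "$2" "$1" "$3"
    printf 'docker push %s:%s\n' "$1" "$3"
}

# registry.example.com/openshift3 is a placeholder, not a real registry path.
retag_commands registry.example.com/openshift3/metrics-hawkular-metrics \
    v3.6.173.0.139-3 v3.6.173.0.138-3
```

Printing the commands first keeps the sketch safe to run anywhere and lets the source and target tags be reviewed before anything is pushed.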

Comment 31 Junqi Zhao 2018-11-27 00:39:57 UTC
Issue is fixed with images:
metrics-hawkular-metrics:v3.6.173.0.139-3
metrics-cassandra:v3.6.173.0.139-1
metrics-heapster:v3.6.173.0.139-1

Set to VERIFIED

Comment 33 Eric Christensen 2018-11-28 18:28:30 UTC
Can someone please add doc_text to this bug so I can add it to an RHSA that will be going out soon?  Thanks!

Comment 34 Jan Martiska 2018-12-03 06:43:03 UTC
I would say this doesn't need any doc update because this is not a bug that could be seen in any publicly released images, only in internal ones. I'll set the doc type to 'no doc update', let me know if that's wrong.

Comment 36 errata-xmlrpc 2018-12-03 17:33:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:3598

