Bug 1612648
Summary: | [3.7]hawkular-metrics pod failed to start up due to unsuccessful version check | |||
---|---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Junqi Zhao <juzhao> | |
Component: | Hawkular | Assignee: | Ruben Vargas Palma <rvargasp> | |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Junqi Zhao <juzhao> | |
Severity: | high | Docs Contact: | ||
Priority: | high | |||
Version: | 3.7.1 | CC: | ahaile, anli, aos-bugs, rvargasp, wsun | |
Target Milestone: | --- | Keywords: | Regression, TestBlocker | |
Target Release: | 3.7.z | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | If docs needed, set a value | ||
Doc Text: | Story Points: | --- | ||
Clone Of: | 1611941 | |||
: | 1613095 | Environment: | ||
Last Closed: | 2019-11-21 18:37:52 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | 1611941, 1612813, 1619497 | |||
Bug Blocks: |
Comment 1
Junqi Zhao
2018-08-06 03:47:23 UTC
*** Bug 1612813 has been marked as a duplicate of this bug. ***

Tested with
metrics-cassandra-v3.7.61-10
metrics-hawkular-metrics-v3.7.61-10
metrics-heapster-v3.7.61-10
It is still "Version check unsuccessful after 30 attempts".

# oc get po -n openshift-infra
NAME                         READY     STATUS    RESTARTS   AGE
hawkular-cassandra-1-5lbzv   1/1       Running   0          23m
hawkular-metrics-wp7wz       0/1       Running   3          23m
heapster-wkbl7               0/1       Running   2          23m

Logs in the hawkular-metrics pod:
******************************************************************
2018-08-07 05:17:00,807 INFO [org.hawkular.metrics.api.jaxrs.util.SchemaVersionChecker] (metricsservice-lifecycle-thread) Version check failed: Keyspace hawkular_metrics does not exist
2018-08-07 05:17:00,808 INFO [org.hawkular.metrics.api.jaxrs.util.SchemaVersionChecker] (metricsservice-lifecycle-thread) Trying again in 10000 ms
2018-08-07 05:17:10,812 INFO [org.hawkular.metrics.api.jaxrs.util.SchemaVersionChecker] (metricsservice-lifecycle-thread) Version check failed: Keyspace hawkular_metrics does not exist
2018-08-07 05:17:10,813 INFO [org.hawkular.metrics.api.jaxrs.util.SchemaVersionChecker] (metricsservice-lifecycle-thread) Trying again in 10000 ms
2018-08-07 05:17:20,818 INFO [org.hawkular.metrics.api.jaxrs.util.SchemaVersionChecker] (metricsservice-lifecycle-thread) Version check failed: Keyspace hawkular_metrics does not exist
2018-08-07 05:17:20,818 INFO [org.hawkular.metrics.api.jaxrs.util.SchemaVersionChecker] (metricsservice-lifecycle-thread) Trying again in 10000 ms
2018-08-07 05:17:30,819 FATAL [org.hawkular.metrics.api.jaxrs.MetricsServiceLifecycle] (metricsservice-lifecycle-thread) The schema version check failed. Start up cannot proceed.: org.hawkular.metrics.api.jaxrs.util.SchemaVersionCheckException: Version check unsuccessful after 30 attempts
    at org.hawkular.metrics.api.jaxrs.util.SchemaVersionChecker.waitForSchemaUpdates(SchemaVersionChecker.java:73)
    at org.hawkular.metrics.api.jaxrs.MetricsServiceLifecycle.doSchemaVersionCheck(MetricsServiceLifecycle.java:531)
    at org.hawkular.metrics.api.jaxrs.MetricsServiceLifecycle.startMetricsService(MetricsServiceLifecycle.java:368)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
*** JBossAS process (340) received TERM signal ***
2018-08-07 05:17:58,938 INFO [org.jboss.as.server] (Thread-5) WFLYSRV0220: Server shutdown has been requested via an OS signal
2018-08-07 05:17:59,011 INFO [org.wildfly.extension.undertow] (ServerService Thread Pool -- 62) WFLYUT0022: Unregistered web context: /hawkular/metrics
2018-08-07 05:17:59,042 INFO [org.wildfly.extension.undertow] (MSC service thread 1-6) WFLYUT0019: Host default-host stopping
2018-08-07 05:17:59,047 INFO [org.jboss.weld.deployer] (MSC service thread 1-2) WFLYWELD0010: Stopping weld service for deployment hawkular-metrics.war
2018-08-07 05:17:59,049 ERROR [org.hawkular.metrics.api.jaxrs.MetricsServiceLifecycle] (MSC service thread 1-2) HAWKMETRICS200009: Unexcepted exception while shutting down: com.google.common.util.concurrent.UncheckedExecutionException: java.lang.NullPointerException
    at com.google.common.util.concurrent.Futures.wrapAndThrowExceptionOrError(Futures.java:1464)
    at com.google.common.util.concurrent.Futures.get(Futures.java:1453)
    at org.hawkular.metrics.api.jaxrs.MetricsServiceLifecycle.destroy(MetricsServiceLifecycle.java:670)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.jboss.weld.injection.producer.DefaultLifecycleCallbackInvoker.invokeMethods(DefaultLifecycleCallbackInvoker.java:98)
    at org.jboss.weld.injection.producer.DefaultLifecycleCallbackInvoker.preDestroy(DefaultLifecycleCallbackInvoker.java:91)
    at org.jboss.weld.injection.producer.BasicInjectionTarget.preDestroy(BasicInjectionTarget.java:131)
    at org.jboss.weld.bean.ManagedBean.destroy(ManagedBean.java:189)
    at org.jboss.weld.util.bean.IsolatedForwardingBean.destroy(IsolatedForwardingBean.java:50)
    at org.jboss.weld.context.AbstractContext.destroyContextualInstance(AbstractContext.java:147)
    at org.jboss.weld.context.AbstractContext.destroy(AbstractContext.java:161)
    at org.jboss.weld.context.AbstractSharedContext.destroy(AbstractSharedContext.java:61)
    at org.jboss.weld.context.AbstractSharedContext.invalidate(AbstractSharedContext.java:56)
    at org.jboss.weld.bootstrap.WeldRuntime.shutdown(WeldRuntime.java:56)
    at org.jboss.weld.bootstrap.WeldBootstrap.shutdown(WeldBootstrap.java:113)
    at org.jboss.as.weld.WeldStartService.stop(WeldStartService.java:128)
    at org.jboss.msc.service.ServiceControllerImpl$StopTask.stopService(ServiceControllerImpl.java:2150)
    at org.jboss.msc.service.ServiceControllerImpl$StopTask.run(ServiceControllerImpl.java:2101)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
    at org.hawkular.metrics.api.jaxrs.MetricsServiceLifecycle.stopServices(MetricsServiceLifecycle.java:682)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    ... 3 more
2018-08-07 05:17:59,084 INFO [org.jboss.as.server.deployment] (MSC service thread 1-3) WFLYSRV0028: Stopped deployment activemq-rar.rar (runtime-name: activemq-rar.rar) in 126ms
2018-08-07 05:17:59,096 INFO [org.jboss.as.connector.subsystems.datasources] (MSC service thread 1-2) WFLYJCA0010: Unbound data source [java:jboss/datasources/ExampleDS]
2018-08-07 05:17:59,114 INFO [org.wildfly.extension.undertow] (MSC service thread 1-8) WFLYUT0008: Undertow HTTP listener default suspending
2018-08-07 05:17:59,114 INFO [org.wildfly.extension.undertow] (MSC service thread 1-1) WFLYUT0008: Undertow HTTPS listener https suspending
2018-08-07 05:17:59,114 INFO [org.wildfly.extension.undertow] (MSC service thread 1-8) WFLYUT0007: Undertow HTTP listener default stopped, was bound to 0.0.0.0:8080
2018-08-07 05:17:59,114 INFO [org.wildfly.extension.undertow] (MSC service thread 1-1) WFLYUT0007: Undertow HTTPS listener https stopped, was bound to 0.0.0.0:8443
2018-08-07 05:17:59,115 INFO [org.wildfly.extension.undertow] (MSC service thread 1-1) WFLYUT0004: Undertow 1.3.28.Final-redhat-4 stopping
2018-08-07 05:17:59,110 INFO [org.jboss.as.connector.deployers.jdbc] (MSC service thread 1-3) WFLYJCA0019: Stopped Driver service with driver-name = h2
2018-08-07 05:17:59,164 INFO [org.jboss.as.server.deployment] (MSC service thread 1-2) WFLYSRV0028: Stopped deployment hawkular-metrics.war (runtime-name: hawkular-metrics.war) in 197ms
2018-08-07 05:17:59,168 INFO [org.jboss.as] (MSC service thread 1-5) WFLYSRV0050: JBoss EAP 7.0.6.GA (WildFly Core 2.1.15.Final-redhat-1) stopped in 191ms
**************************************************************

# oc exec hawkular-cassandra-1-5lbzv -n openshift-infra -- cqlsh --ssl -e "select table_name from system_schema.tables where keyspace_name = 'hawkular_metrics'"
table_name
------------
(0 rows)

Tested with
metrics-cassandra-v3.7.61-11
metrics-hawkular-metrics-v3.7.61-11
metrics-heapster-v3.7.61-11
metrics-hawkular-openshift-agent-v3.7.61-11
The issue is fixed, and metrics works well.

Adding TestBlocker back; the issue is reproduced with
metrics-cassandra-v3.7.62-2
metrics-hawkular-metrics-v3.7.62-2
metrics-heapster-v3.7.62-2

# oc get pod -n openshift-infra
NAME                         READY     STATUS    RESTARTS   AGE
hawkular-cassandra-1-nzrmw   1/1       Running   0          9m
hawkular-metrics-k2wtz       0/1       Running   1          9m
heapster-7gxvk               0/1       Running   1          9m

# oc logs -f hawkular-metrics-k2wtz -n openshift-infra
2018-08-22 10:45:25,405 INFO [org.hawkular.metrics.api.jaxrs.util.SchemaVersionChecker] (metricsservice-lifecycle-thread) Trying again in 10000 ms
2018-08-22 10:45:35,409 INFO [org.hawkular.metrics.api.jaxrs.util.SchemaVersionChecker] (metricsservice-lifecycle-thread) Version check failed: Keyspace hawkular_metrics does not exist
2018-08-22 10:45:35,409 INFO [org.hawkular.metrics.api.jaxrs.util.SchemaVersionChecker] (metricsservice-lifecycle-thread) Trying again in 10000 ms
2018-08-22 10:45:45,410 FATAL [org.hawkular.metrics.api.jaxrs.MetricsServiceLifecycle] (metricsservice-lifecycle-thread) The schema version check failed. Start up cannot proceed.: org.hawkular.metrics.api.jaxrs.util.SchemaVersionCheckException: Version check unsuccessful after 30 attempts
    at org.hawkular.metrics.api.jaxrs.util.SchemaVersionChecker.waitForSchemaUpdates(SchemaVersionChecker.java:73)
    at org.hawkular.metrics.api.jaxrs.MetricsServiceLifecycle.doSchemaVersionCheck(MetricsServiceLifecycle.java:531)
    at org.hawkular.metrics.api.jaxrs.MetricsServiceLifecycle.startMetricsService(MetricsServiceLifecycle.java:368)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
2018-08-22 10:46:09,906 INFO [org.jboss.as.server] (Thread-5) WFLYSRV0220: Server shutdown has been requested via an OS signal
*** JBossAS process (318) received TERM signal ***
2018-08-22 10:46:09,958 INFO [org.wildfly.extension.undertow] (ServerService Thread Pool -- 70) WFLYUT0022: Unregistered web context: /hawkular/metrics
2018-08-22 10:46:10,010 INFO [org.wildfly.extension.undertow] (MSC service thread 1-7) WFLYUT0019: Host default-host stopping
2018-08-22 10:46:10,022 INFO [org.jboss.as.server.deployment] (MSC service thread 1-7) WFLYSRV0028: Stopped deployment activemq-rar.rar (runtime-name: activemq-rar.rar) in 108ms
2018-08-22 10:46:10,024 INFO [org.jboss.weld.deployer] (MSC service thread 1-2) WFLYWELD0010: Stopping weld service for deployment hawkular-metrics.war
2018-08-22 10:46:10,029 ERROR [org.hawkular.metrics.api.jaxrs.MetricsServiceLifecycle] (MSC service thread 1-2) HAWKMETRICS200009: Unexcepted exception while shutting down: com.google.common.util.concurrent.UncheckedExecutionException: java.lang.NullPointerException
    at com.google.common.util.concurrent.Futures.wrapAndThrowExceptionOrError(Futures.java:1464)
    at com.google.common.util.concurrent.Futures.get(Futures.java:1453)
    at org.hawkular.metrics.api.jaxrs.MetricsServiceLifecycle.destroy(MetricsServiceLifecycle.java:670)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.jboss.weld.injection.producer.DefaultLifecycleCallbackInvoker.invokeMethods(DefaultLifecycleCallbackInvoker.java:98)
    at org.jboss.weld.injection.producer.DefaultLifecycleCallbackInvoker.preDestroy(DefaultLifecycleCallbackInvoker.java:91)
    at org.jboss.weld.injection.producer.BasicInjectionTarget.preDestroy(BasicInjectionTarget.java:131)
    at org.jboss.weld.bean.ManagedBean.destroy(ManagedBean.java:189)
    at org.jboss.weld.util.bean.IsolatedForwardingBean.destroy(IsolatedForwardingBean.java:50)
    at org.jboss.weld.context.AbstractContext.destroyContextualInstance(AbstractContext.java:147)
    at org.jboss.weld.context.AbstractContext.destroy(AbstractContext.java:161)
    at org.jboss.weld.context.AbstractSharedContext.destroy(AbstractSharedContext.java:61)
    at org.jboss.weld.context.AbstractSharedContext.invalidate(AbstractSharedContext.java:56)
    at org.jboss.weld.bootstrap.WeldRuntime.shutdown(WeldRuntime.java:56)
    at org.jboss.weld.bootstrap.WeldBootstrap.shutdown(WeldBootstrap.java:113)
    at org.jboss.as.weld.WeldStartService.stop(WeldStartService.java:128)
    at org.jboss.msc.service.ServiceControllerImpl$StopTask.stopService(ServiceControllerImpl.java:2150)
    at org.jboss.msc.service.ServiceControllerImpl$StopTask.run(ServiceControllerImpl.java:2101)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
    at org.hawkular.metrics.api.jaxrs.MetricsServiceLifecycle.stopServices(MetricsServiceLifecycle.java:682)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    ... 3 more
2018-08-22 10:46:10,063 INFO [org.jboss.as.connector.subsystems.datasources] (MSC service thread 1-2) WFLYJCA0010: Unbound data source [java:jboss/datasources/ExampleDS]
2018-08-22 10:46:10,070 INFO [org.jboss.as.connector.deployers.jdbc] (MSC service thread 1-7) WFLYJCA0019: Stopped Driver service with driver-name = h2
2018-08-22 10:46:10,093 INFO [org.wildfly.extension.undertow] (MSC service thread 1-5) WFLYUT0008: Undertow HTTPS listener https suspending
2018-08-22 10:46:10,094 INFO [org.wildfly.extension.undertow] (MSC service thread 1-5) WFLYUT0007: Undertow HTTPS listener https stopped, was bound to 0.0.0.0:8443
2018-08-22 10:46:10,094 INFO [org.wildfly.extension.undertow] (MSC service thread 1-8) WFLYUT0008: Undertow HTTP listener default suspending
2018-08-22 10:46:10,094 INFO [org.wildfly.extension.undertow] (MSC service thread 1-8) WFLYUT0007: Undertow HTTP listener default stopped, was bound to 0.0.0.0:8080
2018-08-22 10:46:10,095 INFO [org.wildfly.extension.undertow] (MSC service thread 1-8) WFLYUT0004: Undertow 1.3.28.Final-redhat-4 stopping
2018-08-22 10:46:10,128 INFO [org.jboss.as.server.deployment] (MSC service thread 1-8) WFLYSRV0028: Stopped deployment hawkular-metrics.war (runtime-name: hawkular-metrics.war) in 213ms
2018-08-22 10:46:10,135 INFO [org.jboss.as] (MSC service thread 1-4) WFLYSRV0050: JBoss EAP 7.0.6.GA (WildFly Core 2.1.15.Final-redhat-1) stopped in 211ms

Issue is not fixed with images:
metrics-cassandra:v3.7.64-2
metrics-hawkular-metrics:v3.7.64-2
metrics-heapster:v3.7.64-2

The latest 3.7 image is 3.7.64 and the problem still exists because that image uses Hawkular Metrics 0.28.5.

The schema installer was introduced in OCP 3.10. We wanted to back port it to 3.7. The changes were introduced upstream in Hawkular Metrics 0.29.5. Because of the ongoing problems we have had with trying to back port, we are going to update the 3.7 image to use Hawkular Metrics 0.28.4.

Fixed with metrics-hawkular-metrics-v3.7.64-5
Other images:
metrics-cassandra-v3.7.64-3
metrics-heapster-v3.7.64-3

# oc get pod -n openshift-infra
NAME                         READY     STATUS    RESTARTS   AGE
hawkular-cassandra-1-f8mwb   1/1       Running   0          30m
hawkular-metrics-l9sl7       1/1       Running   0          30m
heapster-z4cn5               1/1       Running   0          30m

Please change to ON_QA, so we can close it.

Per Comment 9, set it to VERIFIED.
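
The manual cqlsh check used earlier in this report (the query against system_schema.tables that returned "(0 rows)") could be scripted roughly as below. This is only a sketch: the function names cqlsh_row_count and schema_is_installed are hypothetical, and it assumes cqlsh ends its output with an "(N rows)" summary line, as in the output pasted above. It reads the cqlsh output on stdin and succeeds only when at least one table is reported for the keyspace.

```shell
#!/bin/sh
# Hypothetical helpers, not part of OpenShift or Hawkular.

# Print N from the last "(N rows)" summary line found on stdin.
# Prints nothing if no such line is present.
cqlsh_row_count() {
    sed -n 's/^[[:space:]]*(\([0-9][0-9]*\) row[s]*)[[:space:]]*$/\1/p' | tail -n 1
}

# Succeed (exit 0) only if the cqlsh output on stdin reports at least one row.
# An empty or unparsable output is treated as "schema not installed".
schema_is_installed() {
    count=$(cqlsh_row_count)
    [ -n "$count" ] && [ "$count" -gt 0 ]
}

# Example use against a live cluster (pod name is a placeholder):
#   oc exec "$CASSANDRA_POD" -n openshift-infra -- cqlsh --ssl \
#     -e "select table_name from system_schema.tables where keyspace_name = 'hawkular_metrics'" \
#     | schema_is_installed && echo "schema present" || echo "schema missing"
```

With the output shown in this report, schema_is_installed would fail, which matches the "Keyspace hawkular_metrics does not exist" messages in the hawkular-metrics log.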