Description of problem:
Metrics install leads to a non-working metrics stack, with the hawkular-cassandra pod showing the logs below:

# oc logs hawkular-cassandra-1-xxxxx |grep "hawkular_metrics"
2018-09-06 06:28:36,583 INFO [org.hawkular.metrics.api.jaxrs.util.SchemaVersionChecker] (metricsservice-lifecycle-thread) Version check failed: Keyspace hawkular_metrics does not exist
2018-09-06 06:28:36,583 INFO [org.hawkular.metrics.api.jaxrs.util.SchemaVersionChecker] (metricsservice-lifecycle-thread) Trying again in 10000 ms
2018-09-06 06:28:46,597 INFO [org.hawkular.metrics.api.jaxrs.util.SchemaVersionChecker] (metricsservice-lifecycle-thread) Version check failed: Keyspace hawkular_metrics does not exist
2018-09-06 06:28:46,597 INFO [org.hawkular.metrics.api.jaxrs.util.SchemaVersionChecker] (metricsservice-lifecycle-thread) Trying again in 10000 ms
2018-09-06 06:28:56,599 INFO [org.hawkular.metrics.api.jaxrs.util.SchemaVersionChecker] (metricsservice-lifecycle-thread) Version check failed: Keyspace hawkular_metrics does not exist
2018-09-06 06:28:56,599 INFO [org.hawkular.metrics.api.jaxrs.util.SchemaVersionChecker] (metricsservice-lifecycle-thread) Trying again in 10000 ms

Additional info:
openshift-ansible ==> v3.10.41

# oc version
oc v3.10.34
kubernetes v1.10.0+b81c8f8
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://masters.lab.example.com:8443
openshift v3.10.34
kubernetes v1.10.0+b81c8f8

image:
registry.access.redhat.com/openshift3/metrics-cassandra:v3.10
registry.access.redhat.com/openshift3/metrics-hawkular-metrics:v3.10
registry.access.redhat.com/openshift3/metrics-heapster:v3.10
openshift3/metrics-cassandra:v3.10.14-12
openshift3/metrics-hawkular-metrics:v3.10.14-12
openshift3/metrics-heapster:v3.10.14-13
This looks very similar to bug 1625417. See https://access.redhat.com/solutions/3606401 for a workaround.
The problem appears to be that the schema installer job (see https://goo.gl/Ry6vPr) failed or did not run.

Prior to 3.10, Hawkular Metrics would install or update the schema in Cassandra at startup. That changed in 3.10: the hawkular-metrics pod no longer applies any schema changes. Instead, the schema installer k8s job installs or updates the schema, and the hawkular-metrics pod polls Cassandra, waiting for the schema to be updated (if necessary).

You can check whether there is a pod for the schema installer job. I suspect that it was never deployed. We will need to investigate further to figure out what caused the regression.
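One way to look for the schema installer pod is to filter the pod listing by name. The helper below is only a sketch: it assumes the job's pods have "schema" somewhere in their name and that the metrics components run in the openshift-infra project, neither of which is confirmed in this bug. The function name find_schema_pods is made up for illustration.

```shell
# Hypothetical helper: read a pod listing (as printed by `oc get pods`) on stdin
# and print the name and status of any pod whose name contains "schema".
# Assumption: the schema installer job's pods carry "schema" in their name.
find_schema_pods() {
  # NR > 1 skips the header row; $1 is the NAME column, $3 is STATUS.
  awk 'NR > 1 && $1 ~ /schema/ { print $1, $3 }'
}

# Usage on a live cluster (project name is an assumption):
# oc get pods -n openshift-infra | find_schema_pods
```

If this prints nothing, that would be consistent with the job never having been deployed. If it prints a pod in a failed state, the logs of that pod are the next thing to look at.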
Do you have the output from running the playbook? If not, can you run the playbook again and share the output?
A new playbook to run the schema job on demand was introduced to solve this BZ.

The PR for 3.11 was already merged: https://github.com/openshift/openshift-ansible/pull/10340
The PR for 3.10 is still in review: https://github.com/openshift/openshift-ansible/pull/10340

You will be able to run the schema installer job by running the following playbook:

ansible-playbook ./openshift-ansible/playbooks/openshift-metrics/schema.yml -i <inventory_file>
Both PRs are now merged; those changes allow you to re-run the schema installer. Can we close this BZ, or is there still something pending here?
Hello,

The customer is using a persistent volume for the cassandra pod but is still getting the error, and has to apply the workaround from the KCS below after every pod patch update.

https://access.redhat.com/solutions/3645682

The customer is currently using openshift-ansible version 3.11.374-1. I see that this issue has been fixed in version openshift-ansible-3.11.23-1.

Thanks,
Vijay
The customer has confirmed that the issue is with their PV, hence closing the bug.

*** This bug has been marked as a duplicate of bug 1632870 ***