Bug 1544546 - Use deployment config for hawkular-metrics to improve deployment process
Summary: Use deployment config for hawkular-metrics to improve deployment process
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Hawkular
Version: 3.9.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 3.9.z
Assignee: John Sanda
QA Contact: Junqi Zhao
URL:
Whiteboard:
Depends On:
Blocks: 1540413
 
Reported: 2018-02-12 20:12 UTC by John Sanda
Modified: 2021-12-10 15:40 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-03-26 19:13:07 UTC
Target Upstream Version:
Embargoed:



Description John Sanda 2018-02-12 20:12:02 UTC
Description of problem:
At startup, hawkular-metrics applies schema updates to Cassandra if necessary. Schema updates should generally be applied serially in Cassandra to avoid inconsistencies between Cassandra nodes. In theory, concurrent schema updates to a Cassandra cluster should not be a problem. In practice, they are often a source of problems.

If the replica count for hawkular-metrics is greater than one, concurrent schema updates are possible. We use an Infinispan cache at startup in hawkular-metrics to coordinate schema updates. Introducing Infinispan just for this one small use case seems like overkill. At the time, though, it seemed like a reasonable approach because it could be used in environments other than OpenShift.

As it turns out now, OpenShift is the only environment we need to worry about for hawkular-metrics. The Infinispan integration has been a source of some problems (see bug 1469423).

With a deployment config, we can use a lifecycle hook to run a single container that applies schema updates before any hawkular-metrics pods are started. With the lifecycle hook, we do not have to worry about concurrent schema updates. Coupled with bug 1543647, we can completely remove all dependencies on Infinispan, which will further simplify things.
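
A rough sketch of what such a deployment config might look like, built as a Python dict and printed as JSON. It is only an illustration of the idea, not the actual hawkular-metrics deployment: the image name and the schema update command are hypothetical placeholders, and the Rolling strategy is an assumption. The `pre` hook with `execNewPod` is the standard DeploymentConfig mechanism for running a one-off pod before new pods are started.

```python
import json


def build_deployment_config(replicas: int) -> dict:
    """Return a DeploymentConfig manifest with a pre lifecycle hook."""
    # Hypothetical placeholders: neither the image nor the command below
    # is taken from the real hawkular-metrics deployment.
    container = {
        "name": "hawkular-metrics",
        "image": "example/hawkular-metrics:latest",
    }
    pre_hook = {
        # Abort the rollout if the hook (the schema update) fails.
        "failurePolicy": "Abort",
        "execNewPod": {
            # The hook pod runs the image of this container from the template.
            "containerName": container["name"],
            "command": ["/opt/example/update-schema.sh"],
        },
    }
    return {
        "apiVersion": "apps.openshift.io/v1",
        "kind": "DeploymentConfig",
        "metadata": {"name": "hawkular-metrics"},
        "spec": {
            "replicas": replicas,
            "selector": {"name": "hawkular-metrics"},
            # Assumed strategy; the hook is attached under rollingParams.pre.
            "strategy": {"type": "Rolling", "rollingParams": {"pre": pre_hook}},
            "template": {
                "metadata": {"labels": {"name": "hawkular-metrics"}},
                "spec": {"containers": [container]},
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_deployment_config(replicas=2), indent=2))
```

The intent is that the hook runs in a single pod regardless of the replica count, so only one process would apply schema updates per deployment.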

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 3 Borja Aranda 2018-02-15 15:01:17 UTC
I'm adding case 02022261 to this BZ. The errors reported seem to be related to Infinispan/JGroups, and they always appear when hawkular-metrics is scaled to 2 replicas.

Errors:
2018-01-29 08:21:58,577 ERROR [org.jgroups.protocols.ASYM_ENCRYPT] (thread-1,ee,hawkular-metrics-l1l9x) null: key server is currently not set

JOIN fails:
2018-01-29 08:22:26,995 ERROR [org.jgroups.protocols.ASYM_ENCRYPT] (thread-1,ee,hawkular-metrics-l1l9x) null: key server is currently not set
2018-01-29 08:22:29,894 WARN  [org.jgroups.protocols.pbcast.GMS] (MSC service thread 1-6) hawkular-metrics-l1l9x: JOIN(hawkular-metrics-l1l9x) sent to hawkular-metrics-rz9qc timed out (after 3000 ms), on try 10
2018-01-29 08:22:29,894 WARN  [org.jgroups.protocols.pbcast.GMS] (MSC service thread 1-6) hawkular-metrics-l1l9x: too many JOIN attempts (10): becoming singleton

2018-01-29 08:22:34,518 WARN  [org.jgroups.protocols.ASYM_ENCRYPT] (thread-1,ee,hawkular-metrics-l1l9x) hawkular-metrics-l1l9x: unrecognized cipher; discarding message from hawkular-metrics-rz9qc

Comment 6 John Sanda 2018-03-26 19:13:07 UTC
The lifecycle hooks in the deployment config do not work the way I thought and will not be a suitable solution. I am closing this ticket. I have created bug 1560695 to address the deployment issues.

