Red Hat Bugzilla – Bug 1306678
handle Cluster Metrics updates
Last modified: 2017-07-24 10:11 EDT
Description of problem:
Currently the cluster metrics image version is hard-coded in the template, so deployment creates a specific version (currently 3.1.0, see bz#1306665) and the deployment stays at that version, since there is no related imageStream, deploymentConfig, etc.
Version-Release number of selected component (if applicable):
OpenShift Enterprise 3.1.1
Steps to Reproduce:
1. Install OpenShift 3.1.0
2. Install Cluster Metrics following the official documentation:
3. version 3.1.0 is deployed
4. Upgrade to 3.1.1
Actual results:
Metrics stay at version 3.1.0
Expected results:
Metrics updated just like Registry and Router
Documentation suggests using image version "latest" as a workaround (and that's the default in Origin), yet this may lead to inconsistent results: after node restarts, etc., different pods from the Metrics deployment can end up running different versions at the same time. So we need to handle this systematically.
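The mixed-version risk described here can be checked mechanically. The following is a minimal sketch, assuming the pod list has been exported with `oc get pods -o json` and that images are referenced by tag rather than by digest. The helper name `image_tags_by_pod`, the pod names, and the registry host in the sample data are hypothetical.

```python
import json
from collections import defaultdict


def image_tags_by_pod(pod_list_json):
    """Map each image tag to the sorted names of the pods running it.

    Expects the JSON printed by `oc get pods -o json`, i.e. a list
    whose items carry metadata.name and spec.containers[].image.
    Assumes images are referenced by tag, not by digest.
    """
    tags = defaultdict(set)
    for pod in json.loads(pod_list_json).get("items", []):
        name = pod["metadata"]["name"]
        for container in pod["spec"]["containers"]:
            # Only the last path segment can carry the tag, so a
            # registry port (host:5000/...) is not mistaken for one.
            last_segment = container["image"].rsplit("/", 1)[-1]
            if ":" in last_segment:
                tag = last_segment.split(":", 1)[1]
            else:
                tag = "latest"  # an untagged reference resolves to "latest"
            tags[tag].add(name)
    return {tag: sorted(pods) for tag, pods in tags.items()}


# Hypothetical sample: two pods of one deployment on different tags.
sample = json.dumps({
    "items": [
        {"metadata": {"name": "pod-a"},
         "spec": {"containers": [
             {"image": "registry.example.com:5000/openshift3/metrics-heapster:3.1.0"}]}},
        {"metadata": {"name": "pod-b"},
         "spec": {"containers": [
             {"image": "registry.example.com:5000/openshift3/metrics-heapster:3.1.1"}]}},
    ]
})

result = image_tags_by_pod(sample)
mixed = len(result) > 1  # more than one tag means the pods disagree
print(result, mixed)
```

If `mixed` is true, pods in the project are running more than one image tag, which is the inconsistency the "latest" workaround can produce.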
For now the plan is to have the default for all logging and metrics deployments be the 3.1.1 images. We've verified it works in development and we'll have QE make sure there aren't any regressions for 3.1.0 environments.
In the medium-term we've added a task to https://trello.com/c/pQ2cmWhG/123-8-openshift-ansible-playbook-for-logging-metrics-installation to solve this in a cleaner way in ansible. That's one of our highest priority backlog items.
Actually, I was referring to installing 3.1.1 and getting the 3.1.0 logging/metrics images. I'm going to move this back to assigned and mark it upcoming release since it's technically the same thing as the card I mentioned.
Assuming 3.4 updates the template based on:
I would expect metrics to be upgraded to the correct version. Additionally, in 3.5, using the openshift_metrics ansible role will allow you to pass the right value from your host inventory file.
@jcantril, I tested the scenario below; I'm not sure it's sufficient to verify this bug.
1. Deploy a previous version (3.4.1) of Metrics using the deployer.
2. Use ansible to deploy 3.5.0 Metrics.
$MASTER ansible_user=root ansible_ssh_user=root ansible_ssh_private_key_file="~/.ssh/libra.pem" openshift_public_hostname=MASTER
3. Check that the pods are updated to 3.5.0, the pvc is there, and the previous metrics data (metrics gathered by 3.4.1 Metrics) is there.
# oc get pod
NAME                         READY     STATUS      RESTARTS   AGE
hawkular-cassandra-1-tnk3t   1/1       Running     0          1m
hawkular-metrics-g5svn       1/1       Running     0          1m
heapster-m2nng               1/1       Running     0          1m
metrics-deployer-djdbc       0/1       Completed   0          13m
# oc get pvc
NAME                  STATUS    VOLUME    CAPACITY   ACCESSMODES   AGE
metrics-cassandra-1   Bound     pv1       10Gi       RWO           13m
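The pod status part of step 3 can also be scripted. This is a minimal sketch, assuming the whitespace-separated column layout of the `oc get pod` output shown above. `unhealthy_pods` is a hypothetical helper, and treating `Completed` as acceptable is an assumption made here for the one-shot deployer pod. It does not inspect the READY column or the image version.

```python
def unhealthy_pods(listing, ok_statuses=("Running", "Completed")):
    """Return the names of pods whose STATUS is not in ok_statuses.

    `listing` is the plain-text output of `oc get pod`, header included.
    """
    lines = [line.split() for line in listing.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    # Locate columns by header name rather than by fixed position.
    name_col = header.index("NAME")
    status_col = header.index("STATUS")
    return [row[name_col] for row in rows
            if row[status_col] not in ok_statuses]


# The listing from the verification comment above.
listing = """\
NAME READY STATUS RESTARTS AGE
hawkular-cassandra-1-tnk3t 1/1 Running 0 1m
hawkular-metrics-g5svn 1/1 Running 0 1m
heapster-m2nng 1/1 Running 0 1m
metrics-deployer-djdbc 0/1 Completed 0 13m
"""

print(unhealthy_pods(listing))  # prints []
```

An empty list means every pod is either running or has completed, which matches the state reported in this comment.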
@Wei this seems a reasonable test to me.
Set to verified based on comment #12 and comment #14.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.