Description of problem:
Currently cluster metrics image version is hard-coded in the template, so during deployment a specific version (currently 3.1.0, see bz#1306665) is created and stays at this version since there's no related imageStream/deploymentConfig/etc.

Version-Release number of selected component (if applicable):
OpenShift Enterprise 3.1.1

How reproducible:
always

Steps to Reproduce:
1. Install OpenShift 3.1.0
2. Install Cluster Metrics following the official documentation: https://access.redhat.com/documentation/en/openshift-enterprise/3.1/installation-and-configuration/chapter-18-enabling-cluster-metrics#creating-the-deployer-template
3. version 3.1.0 is deployed
4. Upgrade to 3.1.1

Actual results:
Metrics stay at version 3.1.0

Expected results:
Metrics updated just like Registry and Router

Additional info:
Documentation suggests using image version "latest" as a workaround (and that's default in Origin) yet this may lead to inconsistent results due to node restarts, etc. and different pods from the Metrics deployment running various versions at the same time. So we need to handle this systematically.
For now the plan is to have the default for all logging and metrics deployments be the 3.1.1 images. We've verified it works in development and we'll have QE make sure there aren't any regressions for 3.1.0 environments. In the medium-term we've added a task to https://trello.com/c/pQ2cmWhG/123-8-openshift-ansible-playbook-for-logging-metrics-installation to solve this in a cleaner way in ansible. That's one of our highest priority backlog items.
Actually, I was referring to installing 3.1.1 and getting the 3.1.0 logging/metrics images. I'm going to move this back to assigned and mark it upcoming release since it's technically the same thing as the card I mentioned.
Assuming 3.4 updates the template based on: https://github.com/openshift/openshift-ansible/blob/release-1.4/roles/openshift_hosted_templates/files/v1.4/enterprise/metrics-deployer.yaml#L108 and/or https://github.com/openshift/openshift-ansible/blob/openshift-ansible-3.4.58-1/roles/openshift_hosted_templates/files/v1.4/enterprise/metrics-deployer.yaml I would expect metrics to be upgraded to the correct version. Additionally, in 3.5, using the openshift_metrics ansible role will allow you to pass the right value from your host inventory file.
@jcantril, I tested the scenario below; I'm not sure it's sufficient to verify this bug.

1. Deploy a previous version (3.4.1) of Metrics using the deployer.

2. Use ansible to deploy 3.5.0 Metrics.

[oo_first_master]
$MASTER ansible_user=root ansible_ssh_user=root ansible_ssh_private_key_file="~/.ssh/libra.pem" openshift_public_hostname=MASTER

[oo_first_master:vars]
deployment_type=openshift-enterprise
openshift_release=v3.5.0
openshift_metrics_install_metrics=true
openshift_metrics_hawkular_hostname=hawkular-metrics.$SUBDOMAIN
openshift_metrics_project=openshift-infra
openshift_metrics_image_prefix=registry.ops.openshift.com/openshift3/
openshift_metrics_image_version=3.5.0
openshift_metrics_cassandra_storage_type=pv
openshift_metrics_cassandra_pvc_size=10Gi

3. Check that the pods are updated to 3.5.0, the pvc is there, and the previous metrics data (metrics gathered by 3.4.1 Metrics) is there.

# oc get pod
NAME                         READY     STATUS      RESTARTS   AGE
hawkular-cassandra-1-tnk3t   1/1       Running     0          1m
hawkular-metrics-g5svn       1/1       Running     0          1m
heapster-m2nng               1/1       Running     0          1m
metrics-deployer-djdbc       0/1       Completed   0          13m

# oc get pvc
NAME                  STATUS    VOLUME    CAPACITY   ACCESSMODES   AGE
metrics-cassandra-1   Bound     pv1       10Gi       RWO           13m
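The pod listing in the test above shows that the pods are running, but not which image tag each one uses. A minimal sketch of a tag check follows, assuming the image references have already been collected by some means (for example with something like `oc get pods -o jsonpath='{.items[*].spec.containers[*].image}'`). The helper names `image_tag` and `mismatched_images` and the sample image names are hypothetical and only for illustration; they are not part of openshift-ansible or the deployer.

```python
from typing import Iterable, List


def image_tag(image: str) -> str:
    """Return the tag of an image reference, or "latest" if no tag is given."""
    # Drop a digest suffix such as "@sha256:..." if one is present.
    name = image.split("@", 1)[0]
    # Only the last path component can carry a tag; this avoids mistaking
    # a registry port ("registry:5000/...") for a tag.
    last = name.rsplit("/", 1)[-1]
    if ":" in last:
        return last.rsplit(":", 1)[1]
    return "latest"


def mismatched_images(images: Iterable[str], expected: str) -> List[str]:
    """Return the image references whose tag differs from the expected one."""
    return [img for img in images if image_tag(img) != expected]


if __name__ == "__main__":
    # Illustrative input only; the image names are assumptions.
    prefix = "registry.ops.openshift.com/openshift3/"
    images = [
        prefix + "metrics-cassandra:3.5.0",
        prefix + "metrics-hawkular-metrics:3.5.0",
        prefix + "metrics-heapster:3.4.1",
    ]
    for img in mismatched_images(images, "3.5.0"):
        print("unexpected version:", img)
```

An empty result from `mismatched_images` would indicate that every listed image carries the expected tag. An untagged reference is treated as "latest", which is the case the original description warns can leave pods on different versions.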
@Wei this seems a reasonable test to me.
Set to VERIFIED based on comment #12 and comment #14.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:0903