Bug 1420229

Summary: [IntService_public_295] upgrade from 3.4 metrics failed, no object is refreshed
Product: OpenShift Container Platform
Reporter: Peng Li <penli>
Component: Installer
Assignee: Jeff Cantrill <jcantril>
Status: CLOSED ERRATA
QA Contact: Johnny Liu <jialiu>
Severity: medium
Docs Contact:
Priority: medium
Version: 3.5.0
CC: aos-bugs, bbarcaro, jokerman, mmccomas
Target Milestone: ---   
Target Release: 3.5.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-12-14 21:01:20 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1306678    

Description Peng Li 2017-02-08 09:11:59 UTC
Description of problem:
Users should be able to upgrade Metrics from 3.4 to 3.5. The upgrade should do what the deployer used to do when MODE=refresh was set: the PV and route should be kept, but the pods should be updated with the new version of the images. As we discussed in the card, this should be done by just running the playbook.
However, after running the playbook, the previous version is still there.

Version-Release number of selected component (if applicable):
OCP 3.5
openshift-ansible master branch

How reproducible:
always

Steps to Reproduce:
1. Deploy 3.4 Metrics and check

oc project openshift-infra

oc create -f - <<API
apiVersion: v1
kind: ServiceAccount
metadata:
  name: metrics-deployer
secrets:
- name: metrics-deployer
API

oadm policy add-role-to-user edit system:serviceaccount:openshift-infra:metrics-deployer

oc secrets new metrics-deployer nothing=/dev/null

oadm policy add-cluster-role-to-user cluster-reader system:serviceaccount:openshift-infra:heapster

oadm policy add-role-to-user view system:serviceaccount:openshift-infra:hawkular -n openshift-infra

oc new-app -f metrics.yaml --as=system:serviceaccount:openshift-infra:metrics-deployer \
-p IMAGE_PREFIX=$PREFIX \
-p IMAGE_VERSION=3.4.1 \
-p HAWKULAR_METRICS_HOSTNAME=hawkular-metrics.$SUBDOMAIN \
-p MODE=deploy \
-p USE_PERSISTENT_STORAGE=true \
-p MASTER_URL=$MASTERURL \
-p DYNAMICALLY_PROVISION_STORAGE=true \
-p CASSANDRA_NODES=1 \
-p CASSANDRA_PV_SIZE=5Gi \
-p USER_WRITE_ACCESS=false

# oc get pod
NAME                         READY     STATUS      RESTARTS   AGE
hawkular-cassandra-1-z5b7w   1/1       Running     0          1h
hawkular-metrics-n7jkp       1/1       Running     0          1h
heapster-jnkv6               1/1       Running     0          1h
metrics-deployer-vvzbl       0/1       Completed   0          1h

# oc get pvc
NAME                  STATUS    VOLUME                                     CAPACITY   ACCESSMODES   AGE
metrics-cassandra-1   Bound     pvc-550a6728-edca-11e6-bb92-0e9aebfd1b9e   5Gi        RWO           1h

2. Prepare the inventory file

[oo_first_master]
$MASTER ansible_user=root ansible_ssh_user=root ansible_ssh_private_key_file="~/.ssh/libra.pem" openshift_public_hostname=$MASTER

[oo_first_master:vars]
deployment_type=openshift-enterprise
openshift_release=v3.5.0

openshift_metrics_install_metrics=true

openshift_metrics_hawkular_hostname=hawkular-metrics.$SUBDOMAIN
openshift_metrics_project=openshift-infra

openshift_metrics_image_prefix=registry.ops.openshift.com/openshift3/
openshift_metrics_image_version=3.5.0

openshift_metrics_cassandra_storage_type=dynamic
openshift_metrics_cassandra_pv_size=5Gi

3. Run the playbook
git clone https://github.com/openshift/openshift-ansible
cd openshift-ansible
ansible-playbook -vvv -i ~/inventory playbooks/common/openshift-cluster/openshift_metrics.yml

Actual results:
# oc get pod
NAME                         READY     STATUS      RESTARTS   AGE
hawkular-cassandra-1-z5b7w   1/1       Running     0          2h
hawkular-metrics-n7jkp       1/1       Running     0          2h
heapster-jnkv6               1/1       Running     0          2h
metrics-deployer-vvzbl       0/1       Completed   0          2h


Expected results:
The existing Metrics deployment should be updated to the new version, and the metrics data and route should still be there.
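
One way to check these expectations from the command line is sketched below. It assumes the openshift-infra project used in the steps above. The function names (image_tag, show_pod_versions, show_kept_objects) are illustrative and not part of oc, and the tag parsing is a simplification that expects an image reference ending in ":tag".

```shell
# Illustrative helper (not part of oc): print the tag of an image reference
# such as registry.example.com/openshift3/metrics-cassandra:3.5.0.
# Simplifying assumption: the reference ends in ":tag".
image_tag() {
    printf '%s\n' "${1##*:}"
}

# Print each pod in the project with the tag of its first container's image.
# After a successful upgrade the running components should report the new tag.
show_pod_versions() {
    oc get pods -n openshift-infra \
        -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.spec.containers[0].image}{"\n"}{end}' |
    while read -r name image; do
        printf '%s\t%s\n' "$name" "$(image_tag "$image")"
    done
}

# List the PVC and route so they can be compared with the pre-upgrade output;
# both are expected to survive the upgrade.
show_kept_objects() {
    oc get pvc,route -n openshift-infra
}
```

Running show_pod_versions before and after the playbook makes it easy to see whether any pod actually moved from 3.4.1 to 3.5.0.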

Additional info:
Ansible log is attached.

Comment 2 Peng Li 2017-02-08 09:18:55 UTC
Check the pod names: no pod was deleted and recreated with the newer version.

Comment 3 Jeff Cantrill 2017-02-09 01:33:49 UTC
I reran this scenario and found the following:

RC has image like: registry.ops.openshift.com/openshift3/metrics-cassandra:3.5.0
Pod has image: registry.ops.openshift.com/openshift3/metrics-cassandra:3.4.1

It appears the issue is the pods need to be bounced to pick up the new image since they do not automatically redeploy.  Looking into what can be done to cycle the RC.

Lowering severity as it's not a blocker.
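
For anyone who wants to confirm the mismatch and bounce a component by hand, here is a minimal sketch. It is an assumption about one workable approach (scale the RC to zero and back), not necessarily what the eventual fix does. The function names are illustrative, and the RC name in the usage note is inferred from the pod names above.

```shell
# Illustrative helper: succeed when the image a pod is running ($2) differs
# from the image in the RC's pod template ($1).
image_is_stale() {
    [ "$1" != "$2" ]
}

# Image of the first container in the RC's pod template.
# $1 = RC name, $2 = project
rc_image() {
    oc get rc "$1" -n "$2" -o jsonpath='{.spec.template.spec.containers[0].image}'
}

# Image of the first container in a running pod.
# $1 = pod name, $2 = project
pod_image() {
    oc get pod "$1" -n "$2" -o jsonpath='{.spec.containers[0].image}'
}

# Assumed workaround: scale the RC to zero and back to its previous size so
# that replacement pods are created from the updated pod template.
# $1 = RC name, $2 = project
bounce_rc() {
    replicas=$(oc get rc "$1" -n "$2" -o jsonpath='{.spec.replicas}')
    oc scale rc "$1" -n "$2" --replicas=0
    oc scale rc "$1" -n "$2" --replicas="$replicas"
}
```

Example usage, assuming the Cassandra RC is named hawkular-cassandra-1:

image_is_stale "$(rc_image hawkular-cassandra-1 openshift-infra)" "$(pod_image hawkular-cassandra-1-z5b7w openshift-infra)" && bounce_rc hawkular-cassandra-1 openshift-infra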

Comment 4 Jeff Cantrill 2017-02-09 15:10:46 UTC
Fixed in https://github.com/openshift/openshift-ansible/pull/3309

Comment 5 openshift-github-bot 2017-02-10 22:00:26 UTC
Commit pushed to master at https://github.com/openshift/openshift-ansible

https://github.com/openshift/openshift-ansible/commit/ac23d6fc37dd98be8ad4ecc5a924d482e6e74957
bug 1420229. Bounce metrics components to recognize changes on updates or upgrades

Comment 6 Peng Li 2017-02-20 06:12:05 UTC
@jcantril, this PR is merged:
https://github.com/openshift/openshift-ansible/pull/3309
The 3.4.1 to 3.5.0 upgrade test scenario passed. Please feel free to set the bug to ON_QA so that I can close it.

Comment 7 Peng Li 2017-02-21 00:09:39 UTC
Set status to VERIFIED based on comment #6.

Comment 11 errata-xmlrpc 2017-12-14 21:01:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3438