Bug 1536839

Summary: Upgrade against containerized ocp failed at Verify masters are already upgraded
Product: OpenShift Container Platform Reporter: Weihua Meng <wmeng>
Component: Cluster Version Operator Assignee: Michael Gugino <mgugino>
Status: CLOSED ERRATA QA Contact: Weihua Meng <wmeng>
Severity: high Docs Contact:
Priority: high    
Version: 3.9.0 CC: aos-bugs, jiajliu, jokerman, mgugino, mmccomas, wmeng, wsun, xtian
Target Milestone: --- Keywords: TestBlocker
Target Release: 3.9.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-03-28 14:21:18 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1540537, 1551388    
Bug Blocks:    

Comment 1 liujia 2018-01-23 02:11:52 UTC
This blocks the containerized upgrade.

Comment 2 Michael Gugino 2018-01-23 21:29:59 UTC
PR Created: https://github.com/openshift/openshift-ansible/pull/6842

Comment 3 Weihua Meng 2018-01-25 11:55:12 UTC
Fixed.
openshift-ansible-3.9.0-0.24.0.git.0.735690f.el7.noarch

PLAY [Verify masters are already upgraded] ************************************************************************************************************************************************************************

TASK [fail] *******************************************************************************************************************************************************************************************************
task path: /usr/share/ansible/openshift-ansible/playbooks/common/openshift-cluster/upgrades/pre/config.yml:60
skipping: [host-8-244-181.host.centralci.eng.rdu2.redhat.com] => {
    "changed": false,
    "skip_reason": "Conditional result was False"
}
META: ran handlers
META: ran handlers
PLAY [Validate configuration for rolling restart] *****************************************************************************************************************************************************************

Comment 4 Weihua Meng 2018-01-25 12:09:44 UTC
Sorry, I used the wrong steps above.

The error still occurs.
openshift-ansible-3.9.0-0.24.0.git.0.735690f.el7.noarch

TASK [fail] *******************************************************************************************************************************************************************************************************
task path: /usr/share/ansible/openshift-ansible/playbooks/common/openshift-cluster/upgrades/pre/config.yml:60
fatal: [host-8-244-181.host.centralci.eng.rdu2.redhat.com]: FAILED! => {
    "changed": false, 
    "msg": "Master running 3.9 must be upgraded to 3.9.0 before node upgrade can be run."
}
	to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/upgrades/v3_9/upgrade_nodes.retry

Comment 5 Michael Gugino 2018-01-25 14:20:07 UTC
Please include full verbose output of the play run.

Also, how are the masters being upgraded?  Is that happening during the same test?  I need to see inventory, procedure steps, and output for that as well.

Comment 9 Weihua Meng 2018-01-26 08:11:14 UTC
1. set up OCP 3.7 on Atomic Host
2. ansible-playbook -vvv -i ah3726.inv /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/upgrades/v3_9/upgrade_control_plane.yml | tee up39024ah_ctrl_01.log
3. ansible-playbook -vvv -i ah3726.inv /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/upgrades/v3_9/upgrade_nodes.yml | tee up39024ah_nodes.log

Comment 10 Michael Gugino 2018-01-26 17:32:41 UTC
Weihua,

Thank you for the attachments; they are very helpful.

It seems this is an edge case when upgrading from 3.8 to 3.9 instead of 3.7 to 3.9.

I have created additional logic to account for both scenarios regarding the usage of openshift_image_tag: https://github.com/openshift/openshift-ansible/pull/6896
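
For illustration only: the message in comment 4 ("Master running 3.9 must be upgraded to 3.9.0") suggests that a short version string and a full version or image tag were being compared directly. The Python sketch below shows one way such a comparison could be normalized to major.minor first. It is not taken from either PR and is not how openshift-ansible implements the check; the function names `major_minor` and `master_is_upgraded` are hypothetical.

```python
from typing import Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Reduce a version or image tag such as 'v3.9.0-0.41.0' to (major, minor).

    Assumption: the string starts with an optional 'v' followed by at least
    'MAJOR.MINOR'; anything after the second component is ignored.
    """
    parts = version.lstrip("v").split(".")
    return int(parts[0]), int(parts[1])


def master_is_upgraded(running: str, target: str) -> bool:
    """Return True when the running master is at or beyond the target release.

    Comparing only (major, minor) means '3.9' and '3.9.0' are treated as the
    same release instead of as a mismatch.
    """
    return major_minor(running) >= major_minor(target)


if __name__ == "__main__":
    # The pair reported in comment 4: short running version, full target.
    print(master_is_upgraded("3.9", "3.9.0"))
    # A master still on 3.7 checked against a 3.9 image tag.
    print(master_is_upgraded("3.7", "v3.9.0-0.41.0"))
```

Comparing integer tuples rather than raw strings also avoids lexical ordering problems, such as "3.10" sorting before "3.9".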

Comment 11 liujia 2018-01-29 09:12:47 UTC
Still hitting the issue on the latest build, openshift-ansible-3.9.0-0.31.0.git.0.e0a0ad8.el7.noarch.

Comment 12 liujia 2018-01-29 09:17:35 UTC
This blocks the containerized OCP upgrade test.

Comment 13 Scott Dodson 2018-01-29 15:28:30 UTC
The PR from comment 10 has been merged but is not in a tagged build yet; moving to MODIFIED.

Comment 15 Weihua Meng 2018-02-01 06:42:17 UTC
Not in the latest build.
Waiting for a new build.

Comment 16 Xiaoli Tian 2018-02-02 03:29:20 UTC
The PR has been merged since openshift-ansible-3.9.0-0.32.

Comment 17 Weihua Meng 2018-02-08 13:24:46 UTC
There is still something wrong. Please check it. Thanks.

openshift-ansible-3.9.0-0.41.0.git.0.8290c01.el7.noarch

Containerized installation on Atomic Host, with openshift_image_tag=v3.9.0-0.41.0 in the inventory file.

Other parameters are the same as before.

TASK [fail] *******************************************************************************************************************************************************************************************************
task path: /usr/share/ansible/openshift-ansible/playbooks/common/openshift-cluster/upgrades/pre/config.yml:60
fatal: [host-xxx.redhat.com]: FAILED! => {
    "changed": false, 
    "msg": "Master running 3.9.0 must be upgraded to 3.6.173.0.96 before node upgrade can be run."


Failure summary:


  1. Hosts:    host-xxx.redhat.com
     Play:     Verify masters are already upgraded
     Task:     fail
     Message:  Master running 3.9.0 must be upgraded to 3.6.173.0.96 before node upgrade can be run.

Comment 18 Michael Gugino 2018-02-15 15:50:20 UTC
This should be fixed by: https://github.com/openshift/openshift-ansible/pull/7124

PR Merged.

Comment 19 Weihua Meng 2018-02-16 03:22:39 UTC
Fixed.
openshift-ansible-3.9.0-0.45.0.git.0.05f6826.el7.noarch

Upgrade succeeded without error.
Thanks.

Comment 22 errata-xmlrpc 2018-03-28 14:21:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0489