Bug 1567857

Summary: [3.6] Upgrade failed for validate_etcd_conf.yml not found
Product: OpenShift Container Platform Reporter: Gaoyun Pei <gpei>
Component: Cluster Version Operator Assignee: Vadim Rutkovsky <vrutkovs>
Status: CLOSED ERRATA QA Contact: Gaoyun Pei <gpei>
Severity: high Docs Contact:
Priority: high    
Version: 3.6.1 CC: aos-bugs, jokerman, mifiedle, mmccomas, wmeng
Target Milestone: ---   
Target Release: 3.6.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-05-07 20:20:14 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1563375    

Description Gaoyun Pei 2018-04-16 10:06:22 UTC
Description of problem:

During an upgrade from 3.6.173.0.21 to 3.6.173.0.113, the upgrade playbook failed as shown below:

TASK [etcd_upgrade : Failt if r_etcd_upgrade_mechanism is not set during upgrade] *******************************************************************************************
skipping: [ec2-34-207-75-197.compute-1.amazonaws.com] => {
    "changed": false, 
    "skip_reason": "Conditional result was False", 
    "skipped": true
}

TASK [etcd_upgrade : Upgrade rpm based etcd] ********************************************************************************************************************************
fatal: [ec2-34-207-75-197.compute-1.amazonaws.com]: FAILED! => {
    "failed": true, 
    "reason": "Unable to retrieve file contents\nCould not find or access '/usr/share/ansible/openshift-ansible/playbooks/common/openshift-cluster/upgrades/etcd/validate_etcd_conf.yml'"
}
	to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/upgrades/v3_6/upgrade.retry


This appears to be a regression introduced by https://github.com/openshift/openshift-ansible/pull/7781/commits/44ecd904b493cdee8b7673b9da4bae714df50236
validate_etcd_conf.yml is included from the etcd_upgrade role, but that commit added the file under roles/etcd/tasks/upgrade/, so the include cannot be resolved.
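
A minimal sketch of how this kind of mismatch could be checked before running the playbook, assuming a standard roles/<role>/tasks/ layout. The helper names find_task_file and missing_from_role are hypothetical and are not part of openshift-ansible; the script only looks at where a task file lives on disk and does not reproduce Ansible's own include lookup.

```python
from pathlib import Path


def find_task_file(roles_dir, filename):
    """Return role-relative paths of every file named `filename`
    found under any role's tasks/ directory (searched recursively)."""
    roles = Path(roles_dir)
    hits = []
    for role in sorted(p for p in roles.iterdir() if p.is_dir()):
        tasks = role / "tasks"
        if not tasks.is_dir():
            continue  # role has no tasks/ directory, nothing to search
        for match in sorted(tasks.rglob(filename)):
            hits.append(match.relative_to(roles).as_posix())
    return hits


def missing_from_role(roles_dir, role, filename):
    """Return True if `role` has no file named `filename` under its tasks/ tree."""
    prefix = role + "/"
    return not any(hit.startswith(prefix) for hit in find_task_file(roles_dir, filename))
```

Called as missing_from_role("/usr/share/ansible/openshift-ansible/roles", "etcd_upgrade", "validate_etcd_conf.yml"), this would be expected to return True on an affected install, while find_task_file would list the copy under the etcd role instead.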
 

Version-Release number of the following components:
openshift-ansible-3.6.173.0.113-1.git.0.8a42ef5.el7.noarch

How reproducible:
Always

Steps to Reproduce:
1. Run the upgrade playbook against a cluster running the older 3.6 version:
ansible-playbook -i host/host /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/upgrades/v3_6/upgrade.yml


Actual results:

Expected results:

Additional info:

Comment 1 Vadim Rutkovsky 2018-04-17 09:25:15 UTC
Created https://github.com/openshift/openshift-ansible/pull/7988

Comment 4 Mike Fiedler 2018-04-26 20:36:12 UTC
Confirmed that the PR in comment 1 fixes the issue. Tested from the release-3.6 branch with the latest commits and no longer saw the problem.

Comment 5 Gaoyun Pei 2018-04-27 07:09:53 UTC
Verified this bug with openshift-ansible-3.6.173.0.113-1.git.1.8eaab14.el7.noarch

During an upgrade from 3.5 to 3.6, which also upgraded etcd from etcd-3.1.9-2.el7.x86_64 to etcd-3.2.15-2.el7.x86_64, this step passed.

TASK [etcd_upgrade : Upgrade rpm based etcd] ********************************************************************************************************************************
included: /usr/share/ansible/openshift-ansible/roles/etcd_upgrade/tasks/upgrade_rpm.yml for ec2-54-227-169-85.compute-1.amazonaws.com

Comment 9 errata-xmlrpc 2018-05-07 20:20:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1335