Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1625568

Summary: [3.10] Running upgrade_nodes.yml fails with 'first_master_client_binary' is undefined
Product: OpenShift Container Platform
Reporter: Kenjiro Nakayama <knakayam>
Component: Cluster Version Operator
Assignee: Michael Gugino <mgugino>
Status: CLOSED ERRATA
QA Contact: Wenkai Shi <weshi>
Severity: medium
Docs Contact:
Priority: medium
Version: 3.10.0
CC: aos-bugs, bleanhar, dmoessne, jokerman, mmccomas, pdwyer, scott.c.worthington
Target Milestone: ---
Target Release: 3.10.z
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-10-01 15:35:10 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1636018

Description Kenjiro Nakayama 2018-09-05 08:53:33 UTC
Description of problem:

- Running /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/upgrades/v3_10/upgrade_nodes.yml fails with the following error:

  1. Hosts:    abc03.example.com
     Play:     Drain and upgrade nodes
     Task:     Check for GlusterFS cluster health
     Message:  The task includes an option with an undefined variable. The error was: 'first_master_client_binary' is undefined

               The error appears to have been in '/usr/share/ansible/openshift-ansible/roles/openshift_storage_glusterfs/tasks/cluster_health.yml': line 4, column 3, but may
               be elsewhere in the file depending on the exact syntax problem.

               The offending line appears to be:

               # lib_utils/library/glusterfs_check_containerized.py
               - name: Check for GlusterFS cluster health
                 ^ here

               exception type: <class 'ansible.errors.AnsibleUndefinedVariable'>
               exception: 'first_master_client_binary' is undefined

Version-Release number of the following components:
- OCP 3.10.14 to 3.10.34
- ansible version 2.4.6.0
- playbook version (confirming it right now)

How reproducible: Not 100%, but the customer hit the issue in 2 of 2 attempts.

Steps to Reproduce:
1. Run /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/upgrades/v3_10/upgrade_nodes.yml

Actual results:
- Please refer to the error above and the privately attached logs.

Expected results:
- The "Check for GlusterFS cluster health" task passes.

Additional info:
- Proposed patch: https://github.com/openshift/openshift-ansible/pull/9917
- The playbook version and inventory file will be uploaded later.
- The entire Ansible log is attached privately.

Comment 2 Kenjiro Nakayama 2018-09-05 12:27:18 UTC
Here is the playbook version:

$ rpm -qa | grep ansible
openshift-ansible-3.10.41-1.git.0.fd15dd7.el7.noarch
openshift-ansible-roles-3.10.41-1.git.0.fd15dd7.el7.noarch
ansible-2.4.6.0-1.el7ae.noarch
ansible-lint-3.4.15-1.el7.noarch
openshift-ansible-docs-3.10.41-1.git.0.fd15dd7.el7.noarch
openshift-ansible-playbooks-3.10.41-1.git.0.fd15dd7.el7.noarch

Comment 4 Michael Gugino 2018-09-05 13:54:05 UTC
PR Created in master: https://github.com/openshift/openshift-ansible/pull/9925

Will backport to 3.10 after merge.

Comment 5 Michael Gugino 2018-09-05 14:20:41 UTC
We will merge and backport https://github.com/openshift/openshift-ansible/pull/9924 instead of 9925.

Comment 6 Michael Gugino 2018-09-05 23:48:04 UTC
Master PR merged; Backport to 3.10 created: https://github.com/openshift/openshift-ansible/pull/9933

Comment 7 Kenjiro Nakayama 2018-09-05 23:59:46 UTC
Thank you Michael & I'm sorry for bothering you.

Comment 8 Michael Gugino 2018-09-06 00:41:38 UTC
(In reply to Kenjiro Nakayama from comment #7)
> Thank you Michael & I'm sorry for bothering you.

It's no bother, thanks for the patch, nice work.  BZ's and patches are always welcome!

Comment 10 Wenkai Shi 2018-09-21 07:00:43 UTC
Verified with version openshift-ansible-3.10.47-1.git.0.95bc2d2.el7_5.noarch. The code has been merged, and the upgrade no longer hits this issue.

# rpm -q openshift-ansible
openshift-ansible-3.10.47-1.git.0.95bc2d2.el7_5.noarch
# grep -nir "oc_bin" /usr/share/ansible/openshift-ansible/roles/openshift_storage_glusterfs/tasks/cluster_health.yml
6:    oc_bin: "{{ hostvars[groups.oo_first_master.0]['first_master_client_binary'] }}"
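
For readers unfamiliar with the lookup on line 6 above, the following is a minimal sketch of the pattern, not the openshift-ansible code. It assumes the failure came from 'first_master_client_binary' being set as a fact on the first master only, so that a node referencing it directly saw it as undefined. The host names demo-master and demo-node, the group demo_nodes, the file name, and the value "oc" are hypothetical; only the oo_first_master group name and the hostvars expression come from this bug.

```yaml
# hostvars_demo.yml: hypothetical illustration, not the openshift-ansible code.
# Run with: ansible-playbook hostvars_demo.yml
- hosts: localhost
  gather_facts: false
  tasks:
    - name: Register an in-memory stand-in for the first master
      add_host:
        name: demo-master            # hypothetical host name
        groups: oo_first_master
        ansible_connection: local

    - name: Register an in-memory stand-in for a node
      add_host:
        name: demo-node              # hypothetical host name
        groups: demo_nodes           # hypothetical group name
        ansible_connection: local

- hosts: oo_first_master
  gather_facts: false
  tasks:
    - name: Set the fact on the first master only (assumed cause of the bug)
      set_fact:
        first_master_client_binary: oc   # assumed value, for illustration

- hosts: demo_nodes
  gather_facts: false
  tasks:
    - name: Show whether the fact is defined in the node's own scope
      debug:
        msg: "defined on node: {{ first_master_client_binary is defined }}"

    - name: Read the fact through the first master's hostvars instead
      debug:
        msg: "{{ hostvars[groups.oo_first_master.0]['first_master_client_binary'] }}"
```

Facts created with set_fact belong to the host that ran the task, so the second debug task reaches the value through hostvars rather than relying on the node having its own copy. This mirrors the expression that the grep output shows in cluster_health.yml.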

Comment 11 Scott Dodson 2018-10-01 15:35:10 UTC
The fix for this is included in openshift-ansible-3.10.47-1.git.0.95bc2d2.el7_5 which is the latest 3.10 errata.