Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1536317

Summary: upgrade etcd failed at TASK [Run variable sanity checks]
Product: OpenShift Container Platform
Reporter: Weihua Meng <wmeng>
Component: Cluster Version Operator
Assignee: Michael Gugino <mgugino>
Status: CLOSED CURRENTRELEASE
QA Contact: Weihua Meng <wmeng>
Severity: medium
Docs Contact:
Priority: medium
Version: 3.9.0
CC: aos-bugs, bleanhar, ccallega, jokerman, mmccomas, wmeng, wsun
Target Milestone: ---
Target Release: 3.9.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-06-18 16:10:07 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments: ccallegar-inventory (flags: none)

Description Weihua Meng 2018-01-19 06:39:33 UTC
Description of problem:
upgrade etcd failed at TASK [Run variable sanity checks]

Version-Release number of the following components:
openshift-ansible-3.9.0-0.21.0.git.0.296d767.el7.noarch
ansible-2.4.1.0-1.el7.noarch
ansible 2.4.1.0
  config file = /etc/ansible/ansible.cfg
  configured module search path = [u'/root/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /usr/bin/ansible
  python version = 2.7.5 (default, May  3 2017, 07:55:04) [GCC 4.8.5 20150623 (Red Hat 4.8.5-14)]

How reproducible:
Always

Steps to Reproduce:
1. upgrade etcd
# ansible-playbook -i hosts /usr/share/ansible/openshift-ansible/playbooks/openshift-etcd/upgrade.yml -vvv

Actual results:
TASK [Run variable sanity checks] *********************************************************************************************************************************************************************************
task path: /usr/share/ansible/openshift-ansible/playbooks/init/sanity_checks.yml:12
The full traceback is:
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ansible/executor/task_executor.py", line 125, in run
    res = self._execute()
  File "/usr/lib/python2.7/site-packages/ansible/executor/task_executor.py", line 521, in _execute
    result = self._handler.run(task_vars=variables)
  File "/usr/share/ansible/openshift-ansible/roles/lib_utils/action_plugins/sanity_checks.py", line 175, in run
    self.run_checks(hostvars, host)
  File "/usr/share/ansible/openshift-ansible/roles/lib_utils/action_plugins/sanity_checks.py", line 145, in run_checks
    self.check_python_version(hostvars, host, distro)
  File "/usr/share/ansible/openshift-ansible/roles/lib_utils/action_plugins/sanity_checks.py", line 82, in check_python_version
    if ansible_python['version']['major'] != 2:
TypeError: 'NoneType' object has no attribute '__getitem__'
fatal: [host-8-241-12.host.centralci.eng.rdu2.redhat.com]: FAILED! => {
    "failed": true, 
    "msg": "Unexpected failure during module execution.", 
    "stdout": ""
}

NO MORE HOSTS LEFT ************************************************************************************************************************************************************************************************
	to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/openshift-etcd/upgrade.retry
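The traceback above boils down to the plugin subscripting a missing fact: for a host whose facts were never gathered, `hostvars[host]` has no `ansible_python` key. A minimal, self-contained sketch of that failure mode (simplified names; this is an illustration, not the plugin's exact code):

```python
# Sketch of the failure mode in sanity_checks.py's check_python_version.
# Names and structure are simplified for illustration.

def check_python_version(hostvars, host):
    # For a host whose facts were never gathered, 'ansible_python' is
    # absent, so .get() returns None ...
    ansible_python = hostvars[host].get('ansible_python')
    # ... and subscripting None raises the reported error. On Python 2:
    # TypeError: 'NoneType' object has no attribute '__getitem__'
    # (Python 3 words it "'NoneType' object is not subscriptable").
    if ansible_python['version']['major'] != 2:
        raise Exception("openshift-ansible requires Python 2")

# Host with gathered facts: the check passes.
check_python_version(
    {'master1': {'ansible_python': {'version': {'major': 2}}}}, 'master1')

# Host without gathered facts: reproduces the reported TypeError.
try:
    check_python_version({'etcd1': {}}, 'etcd1')
except TypeError as exc:
    print('reproduced:', exc)
```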

Expected results:
Upgrade etcd succeeds

Comment 1 Michael Gugino 2018-01-19 14:37:45 UTC
Please provide inventory and full output.

Comment 2 Michael Gugino 2018-01-19 14:41:48 UTC
This is caused by sanity_checks checking values for hosts that have not had facts gathered. In the latest patch I limited the scope of these checks in the node scaleup play, but it is probably easier to patch the sanity_checks plugin to skip hosts that have not had facts gathered, since only a limited number of items (such as the Ansible version) are checked.

Comment 3 Michael Gugino 2018-01-19 22:25:07 UTC
PR Created: https://github.com/openshift/openshift-ansible/pull/6796
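The approach described in comment 2 amounts to skipping hosts whose facts were never gathered rather than crashing on them. A hedged sketch of such a guard (an illustration of the idea only; the actual change is in the PR above):

```python
# Sketch of the workaround: return early for hosts with no gathered facts
# instead of indexing into a missing 'ansible_python' fact. This mirrors
# the idea described in comment 2, not necessarily the PR's exact diff.

def check_python_version(hostvars, host):
    ansible_python = hostvars[host].get('ansible_python')
    if ansible_python is None:
        # Facts were not gathered for this host (e.g. unreachable, or
        # outside the play's scope); nothing to validate, so skip it.
        return
    if ansible_python['version']['major'] != 2:
        raise Exception("openshift-ansible requires Python 2 on remote hosts")

# A host without facts is now skipped instead of raising TypeError:
check_python_version({'etcd1': {}}, 'etcd1')
# A host with facts is still validated:
check_python_version(
    {'master1': {'ansible_python': {'version': {'major': 2}}}}, 'master1')
```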

Comment 6 Weihua Meng 2018-01-24 02:15:03 UTC
Fixed.
openshift-ansible-3.9.0-0.23.0.git.0.d53d7ed.el7.noarch

TASK [Run variable sanity checks] *********************************************************************************************************************************************************************************
task path: /usr/share/ansible/openshift-ansible/playbooks/init/sanity_checks.yml:13
ok: [hostxxxx.redhat.com] => {
    "changed": false,
    "failed": false,
    "msg": "Sanity Checks passed"
}
META: ran handlers
META: ran handlers

Comment 7 Weihua Meng 2018-01-25 00:37:51 UTC
Fixed.
openshift-ansible-3.9.0-0.23.0.git.0.d53d7ed.el7.noarch

Comment 8 Chris Callegari 2018-05-18 19:17:23 UTC
Looks like this problem is back

TASK [Run variable sanity checks] *************************************************************************************************************************
fatal: [ip-172-16-24-252.ec2.internal]: FAILED! => {"failed": true, "msg": "last_checked_host: ip-172-16-21-186.ec2.internal, last_checked_var: ansible_python;'NoneType' object has no attribute '__getitem__'"}

NO MORE HOSTS LEFT ****************************************************************************************************************************************
 [WARNING]: Could not create retry file '/usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.retry'.         [Errno 13] Permission denied:
u'/usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.retry'


PLAY RECAP ************************************************************************************************************************************************
ip-172-16-16-57.ec2.internal : ok=26   changed=0    unreachable=0    failed=0
ip-172-16-21-186.ec2.internal : ok=0    changed=0    unreachable=1    failed=0
ip-172-16-24-252.ec2.internal : ok=39   changed=0    unreachable=0    failed=1
ip-172-16-37-168.ec2.internal : ok=27   changed=0    unreachable=0    failed=0
ip-172-16-39-90.ec2.internal : ok=26   changed=0    unreachable=0    failed=0
ip-172-16-42-68.ec2.internal : ok=0    changed=0    unreachable=1    failed=0
ip-172-16-48-103.ec2.internal : ok=27   changed=0    unreachable=0    failed=0
ip-172-16-56-232.ec2.internal : ok=26   changed=0    unreachable=0    failed=0
ip-172-16-58-143.ec2.internal : ok=0    changed=0    unreachable=1    failed=0
localhost                  : ok=11   changed=0    unreachable=0    failed=0


INSTALLER STATUS ******************************************************************************************************************************************
Initialization             : In Progress (0:01:11)



Failure summary:


  1. Hosts:    ip-172-16-24-252.ec2.internal
     Play:     Verify Requirements
     Task:     Run variable sanity checks
     Message:  last_checked_host: ip-172-16-21-186.ec2.internal, last_checked_var: ansible_python;'NoneType' object has no attribute '__getitem__'

Comment 9 Chris Callegari 2018-05-18 19:17:53 UTC
$ rpm -qa | grep openshift
openshift-ansible-3.9.14-1.git.3.c62bc34.el7.noarch
openshift-ansible-docs-3.9.14-1.git.3.c62bc34.el7.noarch
openshift-ansible-roles-3.9.14-1.git.3.c62bc34.el7.noarch
atomic-openshift-utils-3.9.14-1.git.3.c62bc34.el7.noarch
openshift-ansible-playbooks-3.9.14-1.git.3.c62bc34.el7.noarch

Inventory is attached

Comment 10 Chris Callegari 2018-05-18 19:19:54 UTC
Created attachment 1438781 [details]
ccallegar-inventory

Comment 11 Michael Gugino 2018-05-18 20:31:47 UTC
(In reply to Chris Callegari from comment #8)
> Looks like this problem is back

This is not the same issue. You have 3 unreachable hosts; no facts could be gathered for them, so the sanity check fails.

Comment 12 Chris Callegari 2018-05-19 21:38:51 UTC
Ansible ping worked for all my hosts.  I'm not sure why the installer threw this error.  I haven't seen it again.

Comment 13 Michael Gugino 2018-05-21 13:29:28 UTC
(In reply to Chris Callegari from comment #12)
> Ansible ping worked for all my hosts.  I'm not sure why the installer threw
> this error.  I haven't seen it again.

Must have been a temporary network condition.  You can see 3 hosts marked unreachable in your output.
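As comments 12 and 13 suggest, verifying reachability before running the playbook surfaces this condition early. A sketch of such a pre-flight check, assuming the inventory file `hosts` from the reproduction steps above:

```shell
# Confirm every inventory host is reachable before running the playbook;
# unreachable hosts get no facts, which is what trips the sanity check.
ansible -i hosts all -m ping

# Optionally confirm the ansible_python fact itself can be gathered:
ansible -i hosts all -m setup -a 'filter=ansible_python'
```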