Bug 1431077 - fatal excluder error when upgrade atomic OCP
Summary: fatal excluder error when upgrade atomic OCP
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cluster Version Operator
Version: 3.5.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Jan Chaloupka
QA Contact: Anping Li
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-03-10 10:19 UTC by Anping Li
Modified: 2017-07-24 14:11 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The upgrade plays tried to upgrade the excluders on Atomic Host (AH).
Consequence: The plays failed because excluders are not supported on AH.
Fix: Skip the excluders on AH.
Result: The excluders are skipped on AH and the plays no longer fail.
Clone Of:
Environment:
Last Closed: 2017-04-12 19:03:45 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2017:0903 | 0 | normal | SHIPPED_LIVE | OpenShift Container Platform atomic-openshift-utils bug fix and enhancement | 2017-04-12 22:45:42 UTC

Description Anping Li 2017-03-10 10:19:08 UTC
Description of problem:
Fatal messages are reported by the excluder tasks:

TASK [Docker excluder version detected] ****************************************
fatal: [10.8.175.65]: FAILED! => {
    "failed": true
}

MSG:

the field 'args' has an invalid value, which appears to include a variable that is undefined. The error was: 'dict object' has no attribute 'stdout'

The error appears to have been in '/usr/share/ansible/openshift-ansible/playbooks/common/openshift-cluster/upgrades/pre/validate_excluder.yml': line 13, column 3, but may
be elsewhere in the file depending on the exact syntax problem.

Version-Release number of selected component (if applicable):
atomic-openshift-utils-3.5.28-1.git.0.103513e.el7.noarch

How reproducible:
always

Steps to Reproduce:
1. Install OCP 3.4
2. Upgrade to v3.5
   ansible-playbook -i hosts /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/upgrades/v3_5/upgrade.yml

Actual results:
The upgrade failed with fatal messages from the excluder tasks:

TASK [Docker excluder version detected] ****************************************
fatal: [10.8.175.65]: FAILED! => {
    "failed": true
}

MSG:

the field 'args' has an invalid value, which appears to include a variable that is undefined. The error was: 'dict object' has no attribute 'stdout'

The error appears to have been in '/usr/share/ansible/openshift-ansible/playbooks/common/openshift-cluster/upgrades/pre/validate_excluder.yml': line 13, column 3, but may
be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:


- name: Docker excluder version detected
  ^ here

fatal: [10.8.173.61]: FAILED! => {
    "failed": true
}

MSG:

the field 'args' has an invalid value, which appears to include a variable that is undefined. The error was: 'dict object' has no attribute 'stdout'

The error appears to have been in '/usr/share/ansible/openshift-ansible/playbooks/common/openshift-cluster/upgrades/pre/validate_excluder.yml': line 13, column 3, but may
be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

<--snip-->

TASK [fail] ********************************************************************
fatal: [localhost]: FAILED! => {
    "changed": false,
    "failed": true
}

MSG:

Upgrade cannot continue. The following hosts did not complete etcd backup: 10.8.173.61,10.8.173.40,10.8.175.65
        to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/upgrades/v3_5/upgrade.retry

PLAY RECAP *********************************************************************
10.8.173.147               : ok=30   changed=1    unreachable=0    failed=1
10.8.173.25                : ok=30   changed=1    unreachable=0    failed=1
10.8.173.40                : ok=31   changed=1    unreachable=0    failed=1
10.8.173.44                : ok=30   changed=1    unreachable=0    failed=1
10.8.173.61                : ok=34   changed=1    unreachable=0    failed=1
10.8.175.236               : ok=30   changed=1    unreachable=0    failed=1
10.8.175.4                 : ok=115  changed=11   unreachable=0    failed=1
10.8.175.65                : ok=31   changed=1    unreachable=0    failed=1
10.8.175.82                : ok=30   changed=1    unreachable=0    failed=1
localhost                  : ok=30   changed=0    unreachable=0    failed=1

Expected results:


Additional info:

Comment 1 Jan Chaloupka 2017-03-10 14:37:46 UTC
Anping Li, can you provide steps to reproduce and the entire ansible tasks log?

Comment 2 Jan Chaloupka 2017-03-10 16:15:24 UTC
I am not able to reproduce it with the latest commit on the master branch with [1] applied. I will run another installation and upgrade scenario once [1] and its dependent PR [2] are merged.

[1] https://github.com/openshift/openshift-ansible/pull/3620
[2] https://github.com/openshift/openshift-ansible/pull/3610

Comment 4 Jan Chaloupka 2017-03-10 16:45:57 UTC
Can you check whether the excluders are available in your repositories at all? Just before you run the upgrade, can you run:

# yum update atomic-openshift-docker-excluder atomic-openshift-excluder

without the -y option? This is just to verify that the excluders can be updated to 3.5.

Comment 6 Jan Chaloupka 2017-03-13 11:56:56 UTC
Upstream PR: https://github.com/openshift/openshift-ansible/pull/3631

Excluders are not supported on Atomic Host (AH) yet.
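
For illustration only, and not taken from the upstream PR: a minimal shell sketch of a pre-upgrade check that skips excluder validation on Atomic Host. It assumes the host can be recognized by the marker file /run/ostree-booted, which ostree-booted systems create. The function names is_atomic_host and excluder_action are hypothetical, as is the optional root-prefix argument, which exists only so the check can be tried against a scratch directory.

```shell
#!/bin/sh
# Hypothetical helpers; these are NOT part of openshift-ansible.

# Succeeds when the system under ROOT looks ostree-booted (Atomic Host).
# ROOT defaults to the empty string, so the real /run/ostree-booted is checked.
# Assumption: the marker file is a sufficient signal for "this is Atomic Host".
is_atomic_host() {
    root="${1:-}"
    [ -e "${root}/run/ostree-booted" ]
}

# Prints the action a pre-upgrade check should take for the excluders:
#   skip      on Atomic Host, where excluders are not supported
#   validate  on every other host
excluder_action() {
    if is_atomic_host "${1:-}"; then
        echo "skip"
    else
        echo "validate"
    fi
}
```

On a host where excluder_action prints "validate", the manual yum check from Comment 4 still applies; where it prints "skip", there are no excluder packages to verify.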

Comment 7 Anping Li 2017-03-14 07:04:36 UTC
Verified; the upgrade passes with openshift-ansible-3.5.32.

Comment 9 errata-xmlrpc 2017-04-12 19:03:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0903

