Bug 1557492 - Task "Wait for master to restart" will break upgrade/install if working through bastion
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.7.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.7.z
Assigned To: Fabian von Feilitzsch
QA Contact: Johnny Liu
Depends On: 1541946
Blocks:
Reported: 2018-03-16 13:01 EDT by Fabian von Feilitzsch
Modified: 2018-04-29 10:37 EDT

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1541946
Environment:
Last Closed: 2018-04-29 10:36:36 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


External Trackers
Tracker                  ID              Priority  Status  Summary  Last Updated
Red Hat Product Errata   RHSA-2018:1231  None      None    None     2018-04-29 10:37 EDT

Comment 1 Fabian von Feilitzsch 2018-03-16 13:10:18 EDT
https://github.com/openshift/openshift-ansible/pull/7557
Comment 2 Vladislav Walek 2018-03-17 05:27:06 EDT
Thank you Fabian.
Comment 4 Johnny Liu 2018-04-20 07:38:37 EDT
The test environment needed to reproduce this bug is not easy to create, but QE built a dummy environment, reproduced the bug successfully, and tested the PR against a host behind a bastion.

1. Configure a target host that is reached through a bastion in .ssh/config.
Host 35.192.5.114
    User root
    HostName 35.192.5.114
    IdentityFile ~/libra-new.pem
    VerifyHostKeyDNS yes
    StrictHostKeyChecking no
    PasswordAuthentication no
    UserKnownHostsFile /dev/null
    ProxyCommand ssh root@jialiu-pc2 -W %h:%p

2. On the Ansible host, use iptables to block direct connections to the target host.
# iptables -A OUTPUT -p tcp -m tcp -d 35.192.5.114 -j REJECT

3. Create a dummy playbook that includes the restart tasks changed by the PR.
$ cat test-playbook.yaml 
- name: Restart masters
  hosts: testhost
  serial: 1
  tasks:
  - include: playbooks/common/openshift-master/restart_hosts.yml

4. Add the following lines to the inventory file.
[testhost]
35.192.5.114

5. Run the test playbook.
$ pwd
/usr/share/ansible/openshift-ansible
$ ansible-playbook -i /tmp/qe-inventory-host-file test-playbook.yaml -v

Reproduced with openshift-ansible-3.7.23-1.git.0.bc406aa.el7.noarch.
After 600 seconds the playbook failed as shown below, even though the host had already come back up.
PLAY [Restart masters] *********************************************************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************************************************
ok: [35.192.5.114]

TASK [Restart master system] ***************************************************************************************************************************************************
changed: [35.192.5.114] => {"ansible_job_id": "885248510524.11362", "changed": true, "finished": 0, "results_file": "/root/.ansible_async/885248510524.11362", "started": 1}

TASK [set_fact] ****************************************************************************************************************************************************************
ok: [35.192.5.114] => {"ansible_facts": {"wait_for_host": "35.192.5.114"}, "changed": false}

TASK [Wait for master to restart] **********************************************************************************************************************************************


fatal: [35.192.5.114 -> localhost]: FAILED! => {"changed": false, "elapsed": 601, "msg": "Timeout when waiting for 35.192.5.114:22"}
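
The `-> localhost` in the failure above indicates that the wait task was delegated to the Ansible control host, which then tried to reach port 22 on the target directly. With the iptables rule from step 2 in place (or on a real bastion-only network), that probe cannot succeed even though SSH through the ProxyCommand works. A minimal sketch of a task that would behave this way follows. It is reconstructed from the log output rather than copied from restart_hosts.yml, and the parameter values are assumptions.

```yaml
# Sketch only: approximates the failing pattern seen in the log;
# not the actual contents of restart_hosts.yml.
- name: Wait for master to restart
  wait_for:
    host: "{{ wait_for_host }}"   # fact set by the preceding set_fact task
    port: 22                      # port named in the timeout message
    state: started
    timeout: 600                  # assumed from "elapsed": 601
  delegate_to: localhost          # probe runs on the control host, bypassing ProxyCommand
```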


Verified this bug with openshift-ansible-3.7.44-1.git.0.dbb912c.el7.noarch: PASS.

PLAY [Restart masters] *********************************************************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************************************************
ok: [35.192.5.114]

TASK [Restart master system] ***************************************************************************************************************************************************
changed: [35.192.5.114] => {"ansible_job_id": "624323622282.5673", "changed": true, "finished": 0, "results_file": "/root/.ansible_async/624323622282.5673", "started": 1}

TASK [Wait for master to restart] **********************************************************************************************************************************************
ok: [35.192.5.114] => {"changed": false, "elapsed": 259}
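
The passing run no longer shows the `-> localhost` delegation or the set_fact task, which is consistent with waiting over Ansible's own connection to the host instead of probing the port from the control host. One way to get that behavior is the `wait_for_connection` module (available since Ansible 2.3), which retries through the configured transport and so honors the ProxyCommand in .ssh/config. The task below is a hedged sketch: the delay and timeout values are assumptions, and the actual change is in the pull request linked in comment 1.

```yaml
# Sketch only: one possible bastion-safe wait, not necessarily the merged fix.
- name: Wait for master to restart
  wait_for_connection:
    delay: 10      # assumed: give the reboot time to begin before the first check
    timeout: 600   # assumed: same upper bound as the failing run
```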
Comment 6 Johnny Liu 2018-04-22 22:58:39 EDT
Per comment 4, moving this bug to VERIFIED again.
Comment 10 errata-xmlrpc 2018-04-29 10:36:36 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:1231
