Description of problem:
When installing to AWS through a bastion host, the playbook fails on the following task:

    - name: Wait for master to restart
      local_action:
        module: wait_for
          host="{{ wait_for_host }}"
          state=started
          delay=10
          timeout=600
          port="{{ ansible_port | default(ansible_ssh_port | default(22,boolean=True),boolean=True) }}"
      become: no

The task fails because the playbook does not expect a bastion to be in place; it explicitly requires direct SSH connectivity from the Ansible host to the target hosts. The playbook logs are not available (the customer has already fixed it with a workaround).

Version-Release number of the following components:
openshift-ansible-3.7.14-1.git.0.4b35b2d.el7.noarch
ansible-2.4.1.0-1.el7.noarch
ansible 2.4.1.0

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:
Please include the entire output from the last TASK line through the end of output if an error is generated

Expected results:

Additional info:
Please attach logs from ansible-playbook with the -vvv flag
Just adding an update: the customer confirmed that the SSH port is correct. The task checks the connection from the Ansible installation host directly to the master and ignores the bastion host.
Yeah, the problem makes sense. There is a suggested solution [1] for this problem, and I wonder whether it would work here. We would need them to tell us about their bastion host.

    - name: Wait for master to restart
      wait_for:
        host: "{{ wait_for_host }}"
        state: started
        delay: 10
        timeout: 600
        port: "{{ ansible_port | default(ansible_ssh_port | default(22,boolean=True),boolean=True) }}"
      become: no
      delegate_to: "{{ openshift_bastion_host if openshift_bastion_host is defined else 'localhost' }}"

[1] https://groups.google.com/d/msg/ansible-project/BLgN_mAWh3E/8n6JCqo_AQAJ
https://github.com/openshift/openshift-ansible/pull/7080
Commits pushed to master at https://github.com/openshift/openshift-ansible

https://github.com/openshift/openshift-ansible/commit/0c6a4b400fb560515b4dcfc7ea764572b1e2dbd1
Bug 1541946- waiting for master reboot now works behind bastion

https://github.com/openshift/openshift-ansible/commit/f4293d34754468ca85c80177d2566c7b45afceb4
Merge pull request #7080 from fabianvf/1541946

Automatic merge from submit-queue.

Bug 1541946- waiting for master reboot now works behind bastion

https://bugzilla.redhat.com/show_bug.cgi?id=1541946

I made this change because [this ansible PR](https://github.com/ansible/ansible/pull/28450) makes it seem like if we switch to the `wait_for_connection` module we can avoid a lot of the jankiness referenced in the removed code. If I'm interpreting the ansible change properly, this should make it use the full ssh config, proxy jumps and all, without any workarounds. I've marked it WIP because I'm still trying to test and make sure that this works.
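For readers following along, here is a minimal sketch of what a `wait_for_connection`-based version of the task might look like. It is not copied from PR #7080; the `delay` and `timeout` values are simply carried over from the original `wait_for` task as an assumption. Because `wait_for_connection` runs against the play's current target host through that host's own connection settings, the sketch needs no `local_action`, `delegate_to`, or explicit `host`/`port`.

```yaml
# Sketch only, not the actual change from the PR.
# Assumes the play targets the master that was just restarted.
- name: Wait for master to restart
  wait_for_connection:
    delay: 10      # seconds to wait before the first connection attempt
    timeout: 600   # give up after 10 minutes
  become: no
```

With this shape, any ProxyCommand or jump host configured for the target in the SSH config should apply to the check as well, which is the behaviour the commit message is hoping for.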
Verified this bug with openshift-ansible-playbooks-3.9.0-0.51.0.git.0.e26400f.el7.noarch: PASS.

An environment that reproduces this bug is not easy to create, so QE built a dummy environment, reproduced the bug in it successfully, and tested the PR against a host behind a bastion.

1. Configure a target host behind a bastion in .ssh/config.

    Host 10.8.244.223
        User root
        HostName 10.8.244.223
        IdentityFile ~/libra-new.pem
        VerifyHostKeyDNS yes
        StrictHostKeyChecking no
        PasswordAuthentication no
        UserKnownHostsFile /dev/null
        ProxyCommand ssh root@jialiu-pc2 -W %h:%p

2. Use iptables on the Ansible host to reject direct connections to the target host.

    # iptables -A OUTPUT -p tcp -m tcp -d 10.8.244.223 -j REJECT

3. Create a dummy playbook that calls the code from the PR.

    $ cat test-playbook.yaml
    - name: Restart masters
      hosts: testhost
      serial: 1
      post_tasks:
      - include_tasks: tasks/restart_hosts.yml

Ran the test: PASS.
I have submitted another bug for 3.7.z here: https://bugzilla.redhat.com/show_bug.cgi?id=1557492 and submitted a PR to backport the change: https://github.com/openshift/openshift-ansible/pull/7557
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:0489