| Summary: | [3.3] Ansible upgrade from 3.2 to 3.3 fails | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Brendan Mchugh <bmchugh> |
| Component: | Cluster Version Operator | Assignee: | Andrew Butcher <abutcher> |
| Status: | CLOSED ERRATA | QA Contact: | liujia <jiajliu> |
| Severity: | high | Docs Contact: | |
| Priority: | medium | | |
| Version: | 3.3.1 | CC: | abutcher, anli, aos-bugs, bleanhar, bmchugh, javier.ramirez, jiajliu, jokerman, mmccomas, saime, sdodson |
| Target Milestone: | --- | | |
| Target Release: | 3.3.1 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | openshift-ansible-3.3.62-1.git.0.b7473e7.el7 | Doc Type: | Bug Fix |
| Doc Text: | Previously, API verification during upgrades was performed from the Ansible control host, which may not have network access to each API server in some network topologies. Now, API server verification happens from the master hosts, avoiding problems with network access. | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-03-06 16:37:11 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
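The remote-verification approach described in the Doc Text can be sketched as a task that runs `wait_for` on the master host itself rather than on the control host. This is a minimal illustration of the pattern, not the exact task from the shipped fix; the variable names are taken from the customer's workaround quoted below.

```yaml
# Hedged sketch: verify the API from the master host, not the Ansible
# control host. Because the task is NOT a local_action, the port check
# runs on the target machine, so it succeeds even when the control host
# has no network path to the master's API port.
- name: Wait for master API to come back online
  wait_for:
    host: "{{ inventory_hostname }}"
    port: "{{ openshift.master.api_port }}"
    state: started
    delay: 10
    timeout: 300
```

The design trade-off: the control host must still reach each node over SSH for Ansible to work at all, so running the check remotely only assumes connectivity that the playbook already requires, instead of additionally assuming the control host can reach every API endpoint.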
Can you confirm that this is happening only when the Ansible host does not have access to the API endpoint? That seems like a really odd configuration; is that expected in this environment? That said, I agree with the proposed fix. I think the chances of being able to access the API endpoint from the remote host rather than the local host are probably higher.

Proposed fix: https://github.com/openshift/openshift-ansible/pull/3032

Verified with atomic-openshift-utils-3.3.64-1.git.0.43bfb06.el7.noarch:

1. RPM install OCP 3.2.
2. Upgrade 3.2 to 3.3.

Result: upgrade succeeded. The upgrade of the containerized environment passed as well.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:0448
Description of problem:
Ansible upgrade from 3.2 to 3.3 fails with "Timeout when waiting for NODE".

Version-Release number of selected component (if applicable):
openshift-ansible-playbooks-3.3.22-1.git.0.6c888c2.el7.noarch

How reproducible:
Always, but different nodes may fail.

Steps to Reproduce:
1. Install 3.2.
2. Ansible upgrade to 3.3.

Actual results:

```
2016-10-11 14:23:40,286 p=15651 u=wnhadm | PLAY [Restart masters] *********************************************************
2016-10-11 14:23:40,296 p=15651 u=wnhadm | TASK [Restart master system] ***************************************************
2016-10-11 14:23:40,333 p=15651 u=wnhadm | TASK [Wait for master API to come back online] *********************************
2016-10-11 14:23:40,370 p=15651 u=wnhadm | TASK [Wait for master to start] ************************************************
2016-10-11 14:23:40,404 p=15651 u=wnhadm | TASK [Wait for master to become available] *************************************
2016-10-11 14:23:40,438 p=15651 u=wnhadm | TASK [fail] ********************************************************************
2016-10-11 14:23:40,473 p=15651 u=wnhadm | TASK [Restart master] **********************************************************
2016-10-11 14:23:40,513 p=15651 u=wnhadm | TASK [Restart master API] ******************************************************
2016-10-11 14:23:49,069 p=15651 u=wnhadm | TASK [Wait for master API to come back online] *********************************
task path: /usr/share/ansible/openshift-ansible/playbooks/common/openshift-master/restart_services.yml:11
Using module file /usr/lib/python2.7/site-packages/ansible/modules/core/utilities/logic/wait_for.py
<localhost> ESTABLISH LOCAL CONNECTION FOR USER: wnhadm
<localhost> EXEC /bin/sh -c '/usr/bin/python2 && sleep 0'
fatal: [SIY05E97 -> localhost]: FAILED! => {
    "changed": false,
    "elapsed": 301,
    "failed": true,
    "invocation": {
        "module_args": {
            "connect_timeout": 5,
            "delay": 10,
            "exclude_hosts": null,
            "host": "SIY05E97",
            "path": null,
            "port": 8443,
            "search_regex": null,
            "state": "started",
            "timeout": 300
        },
        "module_name": "wait_for"
    },
    "msg": "Timeout when waiting for SIY05E97:8443"
}

NO MORE HOSTS LEFT *************************************************************
	to retry, use: --limit @/home/wnhadm/.ansible-retry/upgrade.retry

PLAY RECAP *********************************************************************
SIY05E85  : ok=86   changed=5    unreachable=0    failed=0
SIY05E86  : ok=86   changed=5    unreachable=0    failed=0
SIY05E87  : ok=86   changed=5    unreachable=0    failed=0
SIY05E88  : ok=86   changed=5    unreachable=0    failed=0
SIY05E89  : ok=86   changed=5    unreachable=0    failed=0
SIY05E90  : ok=86   changed=5    unreachable=0    failed=0
SIY05E91  : ok=86   changed=5    unreachable=0    failed=0
SIY05E92  : ok=86   changed=5    unreachable=0    failed=0
SIY05E93  : ok=86   changed=5    unreachable=0    failed=0
SIY05E94  : ok=86   changed=5    unreachable=0    failed=0
SIY05E95  : ok=86   changed=5    unreachable=0    failed=0
SIY05E96  : ok=86   changed=5    unreachable=0    failed=0
SIY05E97  : ok=195  changed=14   unreachable=0    failed=1
SIY05E98  : ok=189  changed=10   unreachable=0    failed=0
SIY05E99  : ok=189  changed=10   unreachable=0    failed=0
localhost : ok=30   changed=17   unreachable=0    failed=0
```

Expected results:
Successful upgrade.

Additional info:
The issue appears to be in /usr/share/ansible/openshift-ansible/playbooks/common/openshift-master/restart_services.yml. The customer found a workaround by commenting out the `local_action` lines as follows:

```yaml
- name: Wait for master API to come back online
  become: no
#  local_action:
#    module: wait_for
  wait_for: host="{{ inventory_hostname }}" state=started delay=10 port="{{ openshift.master.api_port }}"
  when: openshift_master_ha | bool and openshift.master.cluster_method != 'pacemaker'
```

With this change, the wait_for module is executed on the remote side.
The same fix can be applied to:

- playbooks/common/openshift-master/restart_hosts.yml
- playbooks/common/openshift-master/restart_hosts_pacemaker.yml
- playbooks/common/openshift-master/restart_services.yml
- playbooks/common/openshift-master/restart_services_pacemaker.yml
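For contrast, the failing pattern these files share is that the port check runs as a `local_action` on the Ansible control host. The following is a reconstruction from the commented-out lines in the workaround above; the exact indentation and surrounding tasks in the shipped playbooks may differ.

```yaml
# Reconstructed failing pattern (formatting is an assumption).
# local_action runs wait_for on the Ansible control host, so the task
# times out whenever the control host has no network path to the
# master's API port, even though the API itself came back up.
- name: Wait for master API to come back online
  become: no
  local_action:
    module: wait_for
    host: "{{ inventory_hostname }}"
    state: started
    delay: 10
    port: "{{ openshift.master.api_port }}"
  when: openshift_master_ha | bool and openshift.master.cluster_method != 'pacemaker'
```

This explains why different nodes fail on different runs: whichever master the control host happens to be unable to reach at that moment is the one whose check times out.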