Created attachment 1491564 [details]
installation log with inventory file embedded

Description of problem:
This is related to https://github.com/openshift/openshift-ansible/pull/10039. The newly introduced "Wait for sync DS to set annotations on all nodes" task appears to time out, which causes the whole installation to exit.

Version-Release number of the following components:
openshift-ansible-3.11.16-1.git.0.4ac6f81.el7.noarch

How reproducible:
2 of 3 tries

Steps to Reproduce:
1. Trigger an installation against the stage registry.

Actual results:
The "Wait for sync DS to set annotations on all nodes" task times out. Several minutes after the failure, I logged in to all nodes and confirmed that the md5sum annotation had been set successfully. Does the task need more retries?

Expected results:
Installation passes.

Additional info:
Please attach logs from ansible-playbook with the -vvv flag
`preserve-jialiustg-node-1` doesn't have this annotation set. Since it was set later on, we might add more attempts for this task. This seems to happen fairly often, right? I think increasing the delay between the checks would do the trick.
(In reply to Vadim Rutkovsky from comment #1)
> `preserve-jialiustg-node-1` doesn't have this annotation set. Since its been
> set later on we might add more attempts for this.

Yeah.

> This seems to happen fairly often, right? I think increasing the delay
> between the checks would do the trick

When installing against the stage registry, this happens often, at least today. I am not sure whether it is caused by poor stage registry performance.
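For illustration only, the retry/delay trade-off discussed above can be sketched in Python. This is not the openshift-ansible task itself. The annotation key `node.openshift.io/md5sum`, the 10-second default delay, and the function names are assumptions made for this sketch. The default of 180 retries mirrors the "180 retries left" message seen in the install log, and `fetch` is a hypothetical callable standing in for parsing the output of `oc get node -o json`.

```python
import time
from typing import Callable, Dict, List

# Assumed annotation key; the report only mentions "md5sum annotations".
ANNOTATION = "node.openshift.io/md5sum"


def nodes_missing_annotation(node_list: Dict, key: str = ANNOTATION) -> List[str]:
    """Return the names of nodes whose metadata.annotations lacks `key`."""
    missing = []
    for item in node_list.get("items", []):
        metadata = item.get("metadata", {})
        annotations = metadata.get("annotations") or {}
        if key not in annotations:
            missing.append(metadata.get("name", "<unnamed>"))
    return missing


def wait_for_annotations(
    fetch: Callable[[], Dict],
    retries: int = 180,      # matches the retry count shown in the log
    delay: float = 10.0,     # assumed value, not taken from the report
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll `fetch` until every node carries the annotation.

    Returns the number of attempts used, or raises TimeoutError once
    `retries` attempts have been made without success.
    """
    missing: List[str] = []
    for attempt in range(1, retries + 1):
        missing = nodes_missing_annotation(fetch())
        if not missing:
            return attempt
        if attempt < retries:
            sleep(delay)  # wait before the next check
    raise TimeoutError(f"annotation still missing on: {', '.join(missing)}")
```

The total wait budget is roughly `retries * delay`, so raising either value gives a slow node such as `preserve-jialiustg-node-1` more time to pick up the annotation. A longer delay also means fewer API calls for the same budget.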
release-3.11 PR - https://bugzilla.redhat.com/show_bug.cgi?id=1636914
https://github.com/openshift/openshift-ansible/pull/10363 is the backport to release-3.11
Fix is available in openshift-ansible-3.11.23-1
When testing this bug with openshift-ansible-3.11.23-1.git.0.19cbe21.el7.noarch, the install failed at the "openshift_control_plane : Wait for control plane pods to appear" task when the crio runtime is enabled, so it never reached the "Wait for sync DS to set annotations on all nodes" task. I will re-run the test once the fix PR for BZ#1639201 is merged.
Because of comment 6, I disabled the crio runtime and used the docker runtime for testing.

Verified this bug with openshift-ansible-3.11.23-1.git.0.19cbe21.el7.noarch, and PASS.

TASK [openshift_manage_node : Wait for sync DS to set annotations on all nodes] ***
Thursday 18 October 2018 19:07:11 +0800 (0:00:00.694) 0:31:18.013 ******
FAILED - RETRYING: Wait for sync DS to set annotations on all nodes (180 retries left).
<--snip-->
<--snip-->
<--snip-->
FAILED - RETRYING: Wait for sync DS to set annotations on all nodes (23 retries left).
ok: [host-8-249-20.host.centralci.eng.rdu2.redhat.com -> host-8-249-20.host.centralci.eng.rdu2.redhat.com] => {"attempts": 159, "changed": false, "results": {"cmd": "/usr/bin/oc get node --selector= -o json -n default"
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0024