Description of problem: TASK [openshift_master : Wait for API to become available] *FAILED - RETRYING: Wait for API to become available (xx retries left).* However, re-running the playbook allows it to continue. Version-Release number of the following components: openshift-ansible-3.7.42-1.git.2.9ee4e71.el7.noarch Tried with: ansible 2.5.0-2, 2.4.1.0, and 2.3.1.0 How reproducible: Unconfirmed Actual results: TASK [openshift_master : Wait for API to become available] ************************************************************ ************************************************************ ********************* *FAILED - RETRYING: Wait for API to become available (120 retries left).* . . . *FAILED - RETRYING: Wait for API to become available (1 retries left).* * [WARNING]: Consider using get_url or uri module rather than running curl* fatal: [ec2-54-242-126-28.compute-1.amazonaws.com]: FAILED! => { "attempts": 120, "changed": false, "cmd": [ "curl", "--silent", "--tlsv1.2", "--cacert", "/etc/origin/master/ca-bundle.crt", "https://console.sandbox.example.com/healthz/ready" ], "delta": "0:00:00.104755", "end": "2018-03-29 22:11:12.608324", "rc": 35, "start": "2018-03-29 22:11:12.503569" } MSG: non-zero return code Expected results: Install completes Additional info: I know this is a somewhat common error, however the common explanations I've found for the issue do not apply. And, it is suspicious given that the second runthrough passes this step normally and install completes. I will upload logs etc
How long does the load balancer take to insert hosts into the pool? I assume
I don't understand the question -- was "I assume" cut off?
Yes, sorry about that. I'm assuming that clusterHostname = 'console.sandbox.example.com' ? And that it's some sort of load balancer. Can you verify that the load balancer is routing traffic as expected to the hosts at the time of the failure? The fact that simply re-running things fixes it leads me to believe that there's a delay in the load balancer routing traffic to the API servers.
Hi, A little input from the customers end: I tried to hit the loadbalancer host name. I'm getting nothing back at the current stage. [ec2-user@ip-xx-xx-10-xxx ~]$ curl -k http://console.x.sandbox.x.com/ curl: (7) Failed connect to console.x.sandbox.x.com:80; Connection refused Please let me know if more information is required for the same.
(In reply to Madhusudan Upadhyay from comment #6) > Hi, > A little input from the customers end: > > I tried to hit the loadbalancer host name. I'm getting nothing back at the > current stage. > > [ec2-user@ip-xx-xx-10-xxx ~]$ curl -k http://console.x.sandbox.x.com/ > curl: (7) Failed connect to console.x.sandbox.x.com:80; Connection refused > > > Please let me know if more information is required for the same. Then this sounds as if we need to debug why their load balancer isn't routing traffic to the api servers. Can you please work on that with the customer? BTW, that curl is against http not https, where as the ansible scripts are checking https so that may be the reason. Anyway, we need to understand where this is failing. Can they curl the api servers directly?