Description of problem:
Upgrading ocp with external etcd fails when service catalog is deployed. The failure occurs at task [openshift_service_catalog : wait for api server to be ready].

fatal: [x.x.x.x]: FAILED! => {"attempts": 1, "changed": false, "connection": "close", "content": "[+]ping ok\n[+]poststarthook/generic-apiserver-start-informers ok\n[+]poststarthook/start-service-catalog-apiserver-informers ok\n[-]etcd failed: reason withheld\nhealthz check failed\n", "content_length": "180", "content_type": "text/plain; charset=utf-8", "date": "Thu, 15 Mar 2018 06:22:25 GMT", "msg": "Status code was not [200]: HTTP Error 500: Internal Server Error", "redirected": false, "status": 500, "url": "https://apiserver.kube-service-catalog.svc/healthz", "x_content_type_options": "nosniff"}

# curl -k https://apiserver.kube-service-catalog.svc/healthz
[+]ping ok
[+]poststarthook/generic-apiserver-start-informers ok
[+]poststarthook/start-service-catalog-apiserver-informers ok
[-]etcd failed: reason withheld
healthz check failed

# oc describe pod apiserver-flplx -n kube-service-catalog | grep etcd-servers -A 1
      --etcd-servers
      https://qe-jliu-t2-master-1:2379

The --etcd-servers value should be the etcd host name, not the master host name.

Version-Release number of the following components:
ansible-2.4.3.0-1.el7ae.noarch
openshift-ansible-3.9.9-1.git.0.1a1f7d8.el7.noarch

How reproducible:
always

Steps to Reproduce:
1. Container install ocp v3.7 with external etcd (dedicated etcd, not on master hosts)
2. Upgrade ocp v3.7 to v3.9

Actual results:
Upgrade failed.

Expected results:
Upgrade succeeds.

Additional info:
Please attach logs from ansible-playbook with the -vvv flag
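For anyone triaging similar failures, a minimal Python sketch for pulling the failing check names out of a /healthz response body like the one above. The helper name failed_checks is hypothetical and is not part of openshift-ansible. It assumes only the "[+]name ok" / "[-]name failed: ..." line format shown in the curl output.

```python
from typing import List


def failed_checks(healthz_body: str) -> List[str]:
    """Return the names of checks reported as failed in a /healthz body.

    Assumes each check is on its own line, with failed checks prefixed
    by "[-]" and followed by " failed", as in the output above.
    """
    failed = []
    for line in healthz_body.splitlines():
        line = line.strip()
        if line.startswith("[-]"):
            # "[-]etcd failed: reason withheld" -> "etcd"
            failed.append(line[3:].split(" failed", 1)[0])
    return failed


# Response body copied from the "content" field of the failed task.
body = (
    "[+]ping ok\n"
    "[+]poststarthook/generic-apiserver-start-informers ok\n"
    "[+]poststarthook/start-service-catalog-apiserver-informers ok\n"
    "[-]etcd failed: reason withheld\n"
    "healthz check failed\n"
)
print(failed_checks(body))
```

Here only the etcd check is reported as failed, which is consistent with the apiserver pod being pointed at the master host instead of the external etcd host.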
The proposed fix is tracked in 1557036. *** This bug has been marked as a duplicate of bug 1557036 ***
I don't think this should be marked as a duplicate. The root cause may be the same, but in this scenario service catalog was deployed before the upgrade, so the upgrade fails and cannot continue. In bug 1557036, the upgrade was not blocked; what failed was deploying service_catalog on v3.9.
Mike, can you evaluate if your proposed fix in 1557036 would apply to this scenario as well?
My update was not designed to fix this as I didn't believe this was broken. However, I do believe my fix will also apply to this scenario.
PR https://github.com/openshift/openshift-ansible/pull/7542 merged, which probably also addresses this bug.
blocked by bz1566238
Verified on openshift-ansible-3.9.24-1.git.0.d0289ea.el7.noarch
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:1566