I upgraded from 3.10 to 3.11. Correlating logs and pod status by timestamp, I see that the Service Catalog installation succeeded, but the task "Verify that the catalog api server is running" in openshift-ansible/roles/openshift_service_catalog/tasks/start.yml actually ran its check against the 3.10 pods. That is, the task checks the /healthz endpoint, but when it did so, the OLD 3.10 pods were still running. We should ensure the DaemonSet rollout has completed before checking health. Perhaps add:

  oc rollout status ds/apiserver -n kube-service-catalog

with an expected response of 'daemon set "apiserver" successfully rolled out', and:

  oc rollout status ds/controller-manager -n kube-service-catalog

with an expected response of 'daemon set "controller-manager" successfully rolled out'.

I would also advise checking the endpoints to be certain at least one pod is available.
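A minimal sketch of what such wait tasks in start.yml could look like. This is illustrative only, not the actual PR content: the task names, retry counts, and the endpoint-check task are my assumptions.

```yaml
# Sketch for roles/openshift_service_catalog/tasks/start.yml (illustrative, not the merged fix).
- name: Wait for API Server rollout success
  command: >
    oc rollout status ds/apiserver
    -n kube-service-catalog
    --config=/etc/origin/master/admin.kubeconfig
  register: apiserver_rollout
  until: "'successfully rolled out' in apiserver_rollout.stdout"
  retries: 5
  delay: 30
  changed_when: false

- name: Wait for Controller Manager rollout success
  command: >
    oc rollout status ds/controller-manager
    -n kube-service-catalog
    --config=/etc/origin/master/admin.kubeconfig
  register: cm_rollout
  until: "'successfully rolled out' in cm_rollout.stdout"
  retries: 5
  delay: 30
  changed_when: false

# Assumed extra safety check: confirm at least one ready endpoint address exists.
- name: Verify at least one apiserver endpoint is available
  command: >
    oc get endpoints apiserver
    -n kube-service-catalog
    --config=/etc/origin/master/admin.kubeconfig
    -o jsonpath='{.subsets[*].addresses[*].ip}'
  register: apiserver_endpoints
  until: apiserver_endpoints.stdout | length > 0
  retries: 10
  delay: 10
  changed_when: false
```

Note that `oc rollout status` already blocks until the rollout finishes or fails, so the `until`/`retries` loop is mainly a guard against transient API errors.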
pending fix in 3.11: https://github.com/openshift/openshift-ansible/pull/10658
Set target release to 3.11.z
LGTM, verified. Details below:

1. Install OCP 3.10:

[root@ip-172-18-3-150 ~]# oc version
oc v3.10.101
kubernetes v1.10.0+b81c8f8
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://ip-172-18-3-150.ec2.internal:8443
openshift v3.10.101
kubernetes v1.10.0+b81c8f8

[root@ip-172-18-3-150 ~]# oc get pods -n kube-service-catalog
NAME                       READY     STATUS    RESTARTS   AGE
apiserver-64r5m            1/1       Running   0          16m
controller-manager-49zwq   1/1       Running   0          16m

2. Upgrade it to OCP 3.11:

[root@ip-172-18-3-150 ~]# oc version
oc v3.11.69
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://ip-172-18-3-150.ec2.internal:8443
openshift v3.11.69
kubernetes v1.11.0+d4cacc0

[root@ip-172-18-3-150 ~]# oc get pods -n kube-service-catalog
NAME                       READY     STATUS    RESTARTS   AGE
apiserver-wvh8q            1/1       Running   0          1h
controller-manager-q74tf   1/1       Running   2          1h

[root@ip-172-18-3-150 ~]# oc get pods -n kube-service-catalog apiserver-wvh8q -o yaml | grep image
    image: registry.reg-aws.openshift.com:443/openshift3/ose-service-catalog:v3.11

Correlating logs:

TASK [openshift_service_catalog : Wait for API Server rollout success] *********
task path: /usr/share/ansible/openshift-ansible/roles/openshift_service_catalog/tasks/start.yml:2
Thursday 17 January 2019  11:41:28 +0000 (0:00:00.155)       0:13:41.861 ******
ok: [ec2-54-81-218-203.compute-1.amazonaws.com] => {"attempts": 1, "changed": false, "cmd": ["oc", "rollout", "status", "--config=/etc/origin/master/admin.kubeconfig", "-n", "kube-service-catalog", "ds/apiserver"], "delta": "0:00:33.772827", "end": "2019-01-17 06:42:35.027440", "rc": 0, "start": "2019-01-17 06:42:01.254613", "stderr": "", "stderr_lines": [], "stdout": "Waiting for daemon set \"apiserver\" rollout to finish: 0 of 1 updated pods are available...\ndaemon set \"apiserver\" successfully rolled out", "stdout_lines": ["Waiting for daemon set \"apiserver\" rollout to finish: 0 of 1 updated pods are available...", "daemon set \"apiserver\" successfully rolled out"]}
TASK [openshift_service_catalog : Wait for Controller Manager rollout success] ***
task path: /usr/share/ansible/openshift-ansible/roles/openshift_service_catalog/tasks/start.yml:14
Thursday 17 January 2019  11:42:02 +0000 (0:00:34.394)       0:14:16.256 ******
ok: [ec2-54-81-218-203.compute-1.amazonaws.com] => {"attempts": 1, "changed": false, "cmd": ["oc", "rollout", "status", "--config=/etc/origin/master/admin.kubeconfig", "-n", "kube-service-catalog", "ds/controller-manager"], "delta": "0:00:50.687188", "end": "2019-01-17 06:43:26.095907", "rc": 0, "start": "2019-01-17 06:42:35.408719", "stderr": "", "stderr_lines": [], "stdout": "Waiting for daemon set \"controller-manager\" rollout to finish: 0 of 1 updated pods are available...\ndaemon set \"controller-manager\" successfully rolled out", "stdout_lines": ["Waiting for daemon set \"controller-manager\" rollout to finish: 0 of 1 updated pods are available...", "daemon set \"controller-manager\" successfully rolled out"]}
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0096