Description of problem:
Can't get the master pods during the installation in proxy environment

Version-Release number of the following components:
openshift-ansible-3.10.7-1.git.220.50204c4.el7.noarch.rpm

How reproducible:
always

Steps to Reproduce:
1. Trigger HA installation with haproxy LB set behind proxy

Actual results:
Installation failed at
TASK [openshift_control_plane : Wait for all control plane pods to become ready] ***
<--snip-->
FAILED - RETRYING: Wait for all control plane pods to become ready (1 retries left).
FAILED - RETRYING: Wait for all control plane pods to become ready (1 retries left).
failed: [host-8-249-254.host.centralci.eng.rdu2.redhat.com] (item=controllers) => {"attempts": 60, "changed": false, "failed": true, "item": "controllers", "results": {"cmd": "/usr/bin/oc get pod master-controllers-ghuang-bug-master-etcd-2 -o json -n kube-system", "results": [{}], "returncode": 0, "stderr": "Error from server (NotFound): pods \"master-controllers-ghuang-bug-master-etcd-2\" not found\n", "stdout": ""}, "state": "list"}
failed: [host-8-240-239.host.centralci.eng.rdu2.redhat.com] (item=controllers) => {"attempts": 60, "changed": false, "failed": true, "item": "controllers", "results": {"cmd": "/usr/bin/oc get pod master-controllers-ghuang-bug-master-etcd-1 -o json -n kube-system", "results": [{}], "returncode": 0, "stderr": "Error from server (NotFound): pods \"master-controllers-ghuang-bug-master-etcd-1\" not found\n", "stdout": ""}, "state": "list"}
failed: [host-8-250-249.host.centralci.eng.rdu2.redhat.com] (item=controllers) => {"attempts": 60, "changed": false, "failed": true, "item": "controllers", "results": {"cmd": "/usr/bin/oc get pod master-controllers-ghuang-bug-master-etcd-3 -o json -n kube-system", "results": [{}], "returncode": 0, "stderr": "Error from server (NotFound): pods \"master-controllers-ghuang-bug-master-etcd-3\" not found\n", "stdout": ""}, "state": "list"}

Expected results:

Additional info:
Please attach logs from ansible-playbook with the -vvv flag
Created https://github.com/openshift/openshift-ansible/pull/8962. I'll reuse the same inventory to verify that this fix is sufficient.
3.10 cherrypick - https://github.com/openshift/openshift-ansible/pull/8979
The previous PR was insufficient; https://github.com/openshift/openshift-ansible/pull/8984 (against master) did the trick with the provided inventory.
However, the installation seems to get stuck later on, while approving the nodes. All CSRs are in the 'Approved,Issued' state, so this might be a misconfiguration.
Tested against the latest release-3.10 including the two fixes. No issues found.
Fix is available in openshift-ansible-3.10.10-1
Verified in openshift-ansible-3.10.10-1.git.248.0bb6b58.el7.noarch.rpm

[root@qe-ghuang-bug-master-etcd-1 ~]# grep -A 3 "NO_PROXY" /etc/origin/master/master-config.yaml
- name: NO_PROXY
  value: .xxxx,.cluster.local,.xxxxx,.svc,10.14.89.4,169.254.169.254,172.16.120.104,172.16.120.17,172.16.120.67,172.31.0.1,qe-ghuang-bug-lb-nfs-1,qe-ghuang-bug-master-etcd-1,qe-ghuang-bug-master-etcd-2,qe-ghuang-bug-master-etcd-3,qe-ghuang-bug-node-1,qe-ghuang-bug-node-2,qe-ghuang-bug-node-registry-router-1
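
For anyone repeating this verification, the NO_PROXY check can be scripted rather than read by eye. The following is only a rough sketch: the helper names `no_proxy_covers` and `missing_hosts` are hypothetical, and the matching rule (exact match, or suffix match for entries that start with a dot) is a simplifying assumption. Real clients interpret NO_PROXY in slightly different ways, so treat the result as a sanity check and not as proof of what the master actually does.

```python
def no_proxy_covers(host, no_proxy):
    """Return True if `host` matches an entry of a comma-separated NO_PROXY value.

    Assumed (simplified) rule: an entry starting with "." matches any host
    ending in it; any other entry must equal the host exactly.
    """
    host = host.strip().lower()
    for entry in no_proxy.split(","):
        entry = entry.strip().lower()
        if not entry:
            continue  # ignore empty items such as a trailing comma
        if entry.startswith("."):
            if host.endswith(entry):
                return True
        elif host == entry:
            return True
    return False


def missing_hosts(hosts, no_proxy):
    """Return the hosts that no NO_PROXY entry covers, in input order."""
    return [h for h in hosts if not no_proxy_covers(h, no_proxy)]


if __name__ == "__main__":
    # Excerpt of the value reported above (redacted domain entries left out).
    value = (".cluster.local,.svc,172.31.0.1,qe-ghuang-bug-lb-nfs-1,"
             "qe-ghuang-bug-master-etcd-1,qe-ghuang-bug-master-etcd-2,"
             "qe-ghuang-bug-master-etcd-3")
    masters = ["qe-ghuang-bug-master-etcd-%d" % i for i in (1, 2, 3)]
    print(missing_hosts(masters + ["qe-ghuang-bug-lb-nfs-1"], value))
```

An empty list means every master and the LB host is listed; any name printed would still be sent through the proxy under the assumed matching rule.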