Created attachment 1500279
openshift.tar.gz

Description of problem:
Director deployed OCP 3.11: scaling out with an additional master node fails during TASK [openshift_control_plane : Wait for all control plane pods to become ready]:

TASK [openshift_control_plane : Wait for all control plane pods to become ready] ***
FAILED - RETRYING: Wait for all control plane pods to become ready (2 retries left).
FAILED - RETRYING: Wait for all control plane pods to become ready (1 retries left).
failed: [openshift-master-3] (item=etcd) => {"attempts": 60, "changed": false, "item": "etcd", "results": {"cmd": "/bin/oc get pod master-etcd-openshift-master-3 -o json -n kube-system", "results": [{}], "returncode": 0, "stderr": "Error from server (NotFound): pods \"master-etcd-openshift-master-3\" not found\n", "stdout": ""}, "state": "list"}
ok: [openshift-master-3] => (item=api)
ok: [openshift-master-3] => (item=controllers)

NO MORE HOSTS LEFT *************************************************************

PLAY RECAP *********************************************************************
localhost          : ok=36   changed=0   unreachable=0 failed=0
openshift-infra-0  : ok=26   changed=5   unreachable=0 failed=0
openshift-infra-1  : ok=26   changed=5   unreachable=0 failed=0
openshift-master-0 : ok=52   changed=7   unreachable=0 failed=0
openshift-master-1 : ok=52   changed=7   unreachable=0 failed=0
openshift-master-2 : ok=92   changed=7   unreachable=0 failed=0
openshift-master-3 : ok=321  changed=126 unreachable=0 failed=1
openshift-worker-0 : ok=26   changed=5   unreachable=0 failed=0
openshift-worker-1 : ok=26   changed=5   unreachable=0 failed=0

INSTALLER STATUS ***************************************************************
Initialization             : Complete (0:01:57)
Node Bootstrap Preparation : Complete (0:03:45)
Master Install             : In Progress (0:08:52)
        This phase can be restarted by running: playbooks/openshift-master/config.yml

Failure summary:

  1. Hosts:    openshift-master-3
     Play:     Configure masters
     Task:     Wait for all control plane pods to become ready
     Message:  All items completed

Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-9.0.1-0.20181013060867.ffbe879.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy environment with 3 x masters + 2 x infra + 2 x worker nodes
2. Add an additional master node and re-run overcloud deploy command

Actual results:
Deployment fails.

Expected results:
No failures.

Additional info:
Attaching /var/lib/mistral.
I've tried to reproduce this issue twice, and both times it failed earlier for me with a different error:

TASK [etcd : Ensure CA certificate exists on etcd_ca_host] *********************
ok: [openshift-openshiftmaster-1 -> 192.168.24.24]

TASK [etcd : fail] *************************************************************
fatal: [openshift-openshiftmaster-1]: FAILED! => {"changed": false, "msg": "CA certificate /etc/etcd/ca/ca.crt doesn't exist on CA host openshift-openshiftmaster-1. Apply 'etcd_ca' action from `etcd` role to openshift-openshiftmaster-1.\n"}

NO MORE HOSTS LEFT *************************************************************

PLAY RECAP *********************************************************************
localhost                   : ok=39   changed=0   unreachable=0 failed=0
openshift-openshiftinfra-0  : ok=27   changed=5   unreachable=0 failed=0
openshift-openshiftinfra-1  : ok=27   changed=5   unreachable=0 failed=0
openshift-openshiftinfra-2  : ok=27   changed=5   unreachable=0 failed=0
openshift-openshiftmaster-0 : ok=53   changed=7   unreachable=0 failed=0
openshift-openshiftmaster-1 : ok=242  changed=71  unreachable=0 failed=1
openshift-openshiftworker-0 : ok=27   changed=5   unreachable=0 failed=0
openshift-openshiftworker-1 : ok=27   changed=5   unreachable=0 failed=0
openshift-openshiftworker-2 : ok=27   changed=5   unreachable=0 failed=0

INSTALLER STATUS ***************************************************************
Initialization             : Complete (0:01:14)
Node Bootstrap Preparation : Complete (0:04:51)

Failure summary:

  1. Hosts:    openshift-openshiftmaster-1
     Play:     Create etcd client certificates for master hosts
     Task:     etcd : fail
     Message:  CA certificate /etc/etcd/ca/ca.crt doesn't exist on CA host openshift-openshiftmaster-1. Apply 'etcd_ca' action from `etcd` role to openshift-openshiftmaster-1.
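When comparing runs like the two above, it can help to pull the failing hosts out of the PLAY RECAP automatically instead of reading the log by eye. The following is only a minimal sketch: `parse_recap` and `failed_hosts` are hypothetical helper names, not part of openshift-ansible or TripleO, and the regular expression assumes recap lines of the form `host : ok=N changed=N unreachable=N failed=N` as shown in this report.

```python
import re

# Matches recap lines such as:
#   openshift-master-3 : ok=321 changed=126 unreachable=0 failed=1
# Assumption: the host name contains no whitespace and every counter is key=integer.
RECAP_LINE = re.compile(r"^(?P<host>\S+)\s*:\s*(?P<counters>(?:\w+=\d+\s*)+)$")


def parse_recap(text):
    """Return {host: {counter_name: int}} for every recap-style line in text."""
    results = {}
    for line in text.splitlines():
        match = RECAP_LINE.match(line.strip())
        if not match:
            continue  # skip task output, banners, installer status lines
        pairs = (item.split("=") for item in match.group("counters").split())
        results[match.group("host")] = {key: int(value) for key, value in pairs}
    return results


def failed_hosts(text):
    """Return hosts whose failed or unreachable counter is non-zero, in log order."""
    return [
        host
        for host, counters in parse_recap(text).items()
        if counters.get("failed", 0) > 0 or counters.get("unreachable", 0) > 0
    ]


# Recap copied from the original report in this bug.
SAMPLE = """\
PLAY RECAP *********************************************************************
localhost : ok=36 changed=0 unreachable=0 failed=0
openshift-master-2 : ok=92 changed=7 unreachable=0 failed=0
openshift-master-3 : ok=321 changed=126 unreachable=0 failed=1
openshift-worker-0 : ok=26 changed=5 unreachable=0 failed=0
INSTALLER STATUS ***************************************************************
Master Install : In Progress (0:08:52)
"""

if __name__ == "__main__":
    print(failed_hosts(SAMPLE))
```

Run against the recap from the original report, this flags only the newly added master, openshift-master-3; the same function applied to the second log would flag openshift-openshiftmaster-1.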
The upstream patch at https://review.openstack.org/616584 should fix the issue.
No doc text required.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2019:0045