When deploying a multi-master setup with kuryr enabled, the OSEv3 vars get overwritten by the kuryr vars in the dynamic inventory, so important information is lost. For instance, the value of 'openshift_master_cluster_hostname' is removed, which makes all the master nodes point to one of the masters instead of to the load balancer in front of them. Thus, if that master is shut down, the whole cluster goes down because the other masters keep trying to reach it.
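The failure mode is easy to picture in isolation: if the inventory assigns the kuryr vars dict over the already collected OSEv3 vars instead of merging into it, previously set keys such as 'openshift_master_cluster_hostname' disappear. A minimal sketch of the difference (the dict names are illustrative, not the actual inventory.py code):

    # Sketch only: shows how assignment drops previously collected OSEv3 vars,
    # while merging keeps them. Variable names here are hypothetical.
    osev3_vars = {
        "openshift_master_cluster_hostname": "console.openshift.example.com",
    }
    kuryr_vars = {
        "kuryr_openstack_api_lb_ip": "172.30.0.1",
    }

    # Buggy behaviour: the whole dict is replaced, losing the cluster hostname.
    inventory_vars = kuryr_vars

    # Expected behaviour: merge the kuryr vars into the existing OSEv3 vars.
    inventory_vars = dict(osev3_vars)
    inventory_vars.update(kuryr_vars)

    assert "openshift_master_cluster_hostname" in inventory_vars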
Verified in openshift-ansible-3.11.59-1.git.0.ba8e948.el7.noarch on top of the OSP 13 2018-12-13.4 puddle.

Verification steps:
- Deploy OCP 3.11 on OSP 13 (with ansible 2.5).
- Run the dynamic inventory:
  /usr/share/ansible/openshift-ansible/playbooks/openstack/inventory.py
- Check that the output contains "openshift_master_cluster_hostname" inside inventory['OSEv3']['vars']:

  "OSEv3": {
      "hosts": [
          "app-node-0.openshift.example.com",
          "app-node-1.openshift.example.com",
          "master-0.openshift.example.com",
          "infra-node-0.openshift.example.com"
      ],
      "vars": {
          "kuryr_openstack_api_lb_ip": "172.30.0.1",
          "kuryr_openstack_auth_url": "http://10.46.22.28:5000//v3",
          "kuryr_openstack_password": "redhat",
          "kuryr_openstack_pod_project_id": "fca7a9346b8947c6a20e3b3e974243f5",
          "kuryr_openstack_pod_router_id": "c1c4f31d-bc92-441b-953b-b73c0a77ad2f",
          "kuryr_openstack_pod_sg_id": "33262e49-2872-4162-9e6c-8d4593772a36",
          "kuryr_openstack_pod_subnet_id": "46a96b20-c79a-462d-98cb-43f517f4c76f",
          "kuryr_openstack_project_domain_name": "Default",
          "kuryr_openstack_project_id": "fca7a9346b8947c6a20e3b3e974243f5",
          "kuryr_openstack_service_subnet_id": "d981544f-8801-4d07-9651-80912f35348b",
          "kuryr_openstack_user_domain_name": "Default",
          "kuryr_openstack_username": "shiftstack_user",
          "kuryr_openstack_worker_nodes_subnet_id": "a5fa1e0d-15d9-4769-8175-13177ea8e924",
          "openshift_master_cluster_hostname": "172.30.0.1"   <<---------
      }
  }

With previous OCP versions, the "openshift_master_cluster_hostname" parameter was being deleted from the inventory.
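The same check can be scripted. A minimal sketch, assuming the inventory script at the path above prints the JSON inventory to stdout when invoked directly, as in the step above:

    # Run the dynamic inventory and confirm that
    # 'openshift_master_cluster_hostname' survives in inventory['OSEv3']['vars'].
    # The script path comes from the step above; the check itself is illustrative.
    import json
    import subprocess

    INVENTORY = "/usr/share/ansible/openshift-ansible/playbooks/openstack/inventory.py"

    output = subprocess.check_output([INVENTORY])
    inventory = json.loads(output)

    osev3_vars = inventory["OSEv3"]["vars"]
    if "openshift_master_cluster_hostname" in osev3_vars:
        print("OK:", osev3_vars["openshift_master_cluster_hostname"])
    else:
        raise SystemExit("openshift_master_cluster_hostname missing from OSEv3 vars")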
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0024