Description of problem:
When a heat stack is created with an external loadbalancer, the whole cluster goes down if the first master goes down. The root cause is that the certificate data is created against the first master by default when openshift_master_cluster_hostname and openshift_master_cluster_public_hostname are not specified in the inventory hosts. As a result, the cluster stops working once the first master (either the whole instance or the atomic-openshift-master-api service) is down.

Version-Release number of selected component (if applicable):
v0.9.4

How reproducible:
always

Steps to Reproduce:
1. Create a heat stack with HA masters + external loadbalancer
2. Shut down the first master (usually named "*master-0")
3.

Actual results:
# oc get po
Unable to connect to the server: dial tcp 192.168.10.7:8443: i/o timeout

Note: 192.168.10.7 is the IP of the first master.

On the nodes, the client-certificate-data is created with the first master:
[root@ghuang-test6-ha-ocp-node-42q26tq1 ~]# cat /etc/origin/node/system\:node\:ghuang-test6-ha-ocp-node-42q26tq1.test.com.kubeconfig
apiVersion: v1 clusters: - cluster: certificate-authority-data: 
LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUM1akNDQWRDZ0F3SUJBZ0lCQVRBTEJna3Foa2lHOXcwQkFRc3dKakVrTUNJR0ExVUVBd3diYjNCbGJuTm8KYVdaMExYTnBaMjVsY2tBeE5EYzNNRFEwTURreE1CNFhEVEUyTVRBeU1URXdNREV6TWxvWERUSXhNVEF5TURFdwpNREV6TTFvd0pqRWtNQ0lHQTFVRUF3d2JiM0JsYm5Ob2FXWjBMWE5wWjI1bGNrQXhORGMzTURRME1Ea3hNSUlCCklqQU5CZ2txaGtpRzl3MEJBUUVGQUFPQ0FROEFNSUlCQ2dLQ0FRRUFzWDlKbVdpWDVMcUVkRXRyZjVVU3R3VzMKWTBEa2hhZGFPbC9FWmRjRTVKbFdveGJCRGJweWQzOC9XNW1EWTl0cGgvTUlpdkxJU054b1ZtVTQwSXBMK0EwQgoyNEcrR0RQK25lMlR3Y1JkSS83RXk1bVhETER3cWlvN1RXMlBXY2lGRWdyQTNheUJhbWh3N1BiazVhZ21ESWE2ClFvaU1rVHlNOHViZy9XSy9TS0crMHkzQklGelE4Rk8rWlVtK25QNmMwVnl0NEVXdW5WS0RKc3RIZlp0eVJjUWgKTFlSWXI3UDl6bFp3NlRtN0tTTHVnOXk5WVJGKzA3L0VaMFU1a254MEhnVFFTK2ZERHk1S2I1eUVnbWtGRit4ZgpwT2tuWFhIb1FVSXlUT1I0T1BSSG5yQ25DQWZmb3I4K05LWElUcWJlQkNOQ2lqc3Z2Q3M5YzNxRGw0MkRTUUlECkFRQUJveU13SVRBT0JnTlZIUThCQWY4RUJBTUNBS1F3RHdZRFZSMFRBUUgvQkFVd0F3RUIvekFMQmdrcWhraUcKOXcwQkFRc0RnZ0VCQUNyZ3RXS3hmR25Da3g0RlVvM2xnc0doRnpwUCtKUzY4aU03cVJOWC83eXcwRUp0RWYrcgpxZXRJVjNRempJTi92dnlLNWhLUTU3OUd0TjYrcWdBMHNMa1J1TGIwSnJpTUh3bGdTN3dJY29IQUhxUHd2eG1xCjRTbWlidzZEY3J3YkZ1QWtqRTdRS0gwRGM1NWxzbWkrWEZnRHkzVlJhbm16NW1HbzJXaWZvblNDcjJCL2JQLysKMTVVcG5HOHlLMS9uYVJyS2tvL0xnTEpXcm9pb0QvbE11eTdPNFJOQlp0eFB6S29rY1MvNGpUdUxkK0Qya1Q2TgpkWFZGM3RidE1HMzBVVkRmZHMwS0ZFRDVScHYycjFGVE1WZ2NMaHJteFJzR0YzRHpZS0IwN2lycHZ4anRDb2VECjhsdmJyR29ROTl6RHNkYTdnSTZQZTlIcEdqUWVpZjBYeWRVPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg== server: https://ghuang-test6-ha-ocp-master-0.test.com:8443 name: ghuang-test6-ha-ocp-master-0-test-com:8443 contexts: - context: cluster: ghuang-test6-ha-ocp-master-0-test-com:8443 namespace: default user: system:node:ghuang-test6-ha-ocp-node-42q26tq1.test.com/ghuang-test6-ha-ocp-master-0-test-com:8443 name: default/ghuang-test6-ha-ocp-master-0-test-com:8443/system:node:ghuang-test6-ha-ocp-node-42q26tq1.test.com current-context: default/ghuang-test6-ha-ocp-master-0-test-com:8443/system:node:ghuang-test6-ha-ocp-node-42q26tq1.test.com kind: Config preferences: {} users: 
- name: system:node:ghuang-test6-ha-ocp-node-42q26tq1.test.com/ghuang-test6-ha-ocp-master-0-test-com:8443 user: client-certificate-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURLakNDQWhTZ0F3SUJBZ0lCRlRBTEJna3Foa2lHOXcwQkFRc3dKakVrTUNJR0ExVUVBd3diYjNCbGJuTm8KYVdaMExYTnBaMjVsY2tBeE5EYzNNRFEwTURreE1CNFhEVEUyTVRBeU1URXdNRFl6TlZvWERURTRNVEF5TVRFdwpNRFl6Tmxvd1dERVZNQk1HQTFVRUNoTU1jM2x6ZEdWdE9tNXZaR1Z6TVQ4d1BRWURWUVFERXpaemVYTjBaVzA2CmJtOWtaVHBuYUhWaGJtY3RkR1Z6ZERZdGFHRXRiMk53TFc1dlpHVXROREp4TWpaMGNURXVkR1Z6ZEM1amIyMHcKZ2dFaU1BMEdDU3FHU0liM0RRRUJBUVVBQTRJQkR3QXdnZ0VLQW9JQkFRQ3JMN3k5eEQ5OHhVT3FrNDZIQlhSMgp3S1VyN25meHAwWXJZellKdVBWV1FYT25jS0lKNnpubWxDTStKSzZ4QkFXMWF2U2lwNWxoN1FtVFcyNXZQK25TCldEanJ5TmpJaUJEbXI4eDE1RUtFSDQ0bVIwYVQ1YkRlSzh0TTNicE1jTjBEWllnM2RockF2d0xXdmJBZU9vakEKTjRWRUcxVmtUWFE3L2JsMHJYQ3kvNnpHT1ZWNHFxRTZ4aE8vcXBweFkwUHZORDM0MEdYQmtGUVM5QW9lSzk2eApPMG1zOUllL3liTjExSjFGVGhha01IS2FGNGNLdHN1b1FhSjNHSWZvZjV4NWVXcWNaN1o3RGVsS2tIYVlkYk92CjhON3RkdS9BNUtwQUpTRVI0YW5nVTZvcHhaSHJMNk4yOC9GUWxZZGRVQmV6YjI0bkhQdmJSWVVUK0dOaUdxdjUKQWdNQkFBR2pOVEF6TUE0R0ExVWREd0VCL3dRRUF3SUFvREFUQmdOVkhTVUVEREFLQmdnckJnRUZCUWNEQWpBTQpCZ05WSFJNQkFmOEVBakFBTUFzR0NTcUdTSWIzRFFFQkN3T0NBUUVBVndQRHg4dWljQUZEeUNpTEhwNG5WMUZnCnBwczFRSTFvSGZiM1lrL1VsYzFESWtGQzVRUStyR0RjWEs1TXJlZDdrMEFDdjJlZ0djM0t1OWdpKytRcC9UTHEKcmd3bTh0dllMZ1lQblNtUjNLMkJhbW9EaHBBS2ZFbmZ4bU1KaUpWTzcraFpZL3R6RElmMTBOOTVmcWpEZzBXZQo0MjJVL1Z5UmJ2cTlkaXVjTGpVNjYzU3dqb3J4YkdwUnU5bXlwTEtQYmdoSHFzQlpoSHlzZVF5Vk42UUF5dmJ0CjdRNkdIYXplUTZObnl0U09VNTFoUVVTaElwZ0tUcmEvMU9ZOVhCRkFyaGtaTEFlQ2hwM0tpRlo5L1FHYlZNRlUKeUpZL3ViYU1IUE53MENTaTB3VG5CZUlCZ0thaFJVREEzZFNwQ1BudG1DZi9XREgwREJHc241YkIxaVRhVVE9PQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg== client-key-data: 
LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFb2dJQkFBS0NBUUVBcXkrOHZjUS9mTVZEcXBPT2h3VjBkc0NsSys1MzhhZEdLMk0yQ2JqMVZrRnpwM0NpCkNlczU1cFFqUGlTdXNRUUZ0V3Iwb3FlWlllMEprMXR1YnovcDBsZzQ2OGpZeUlnUTVxL01kZVJDaEIrT0prZEcKaytXdzNpdkxUTjI2VEhEZEEyV0lOM1lhd0w4QzFyMndIanFJd0RlRlJCdFZaRTEwTy8yNWRLMXdzditzeGpsVgplS3FoT3NZVHY2cWFjV05EN3pROStOQmx3WkJVRXZRS0hpdmVzVHRKclBTSHY4bXpkZFNkUlU0V3BEQnltaGVICkNyYkxxRUdpZHhpSDZIK2NlWGxxbkdlMmV3M3BTcEIybUhXenIvRGU3WGJ2d09TcVFDVWhFZUdwNEZPcUtjV1IKNnkramR2UHhVSldIWFZBWHMyOXVKeHo3MjBXRkUvaGpZaHFyK1FJREFRQUJBb0lCQUNrN09VR1h5QmJjU0gwSQpSMWI4R0Y0VjduS1RZRzVpOU1LMGhhcDMweGV3Y2hQTlRDb0pid3U3ZUhXYVRqMHlrOUZyYm5yUzFWM0J3d0dzCkR3QmFxNDNQVS81dWhOQmYvWG9pczZOZGxDdlFrZU5rWFhwMzQwN1B5NHE3Q1FrcVVnRmtiaGUxcWFIdEg5anIKSFVWYW9kOXlQL1gwZzIvQ1BCSEsvZVU5ZFJ5WGxTeUpFMXJmdndDUSs2RHZOOFNmNmpKcDQ5aHFpNkZOSWJncQpHeGpPN1FVZUFOZ3dMTFZnd1FETDE1eTJLeTg2ZnJMRmdlYXpiSTBPWFVRaW9QWU1pMDhVS1BOeVpvMm5IL1VxCmg5QjROR3FGUUZXWXNWcHljMCtFUjJOa2RwcU9oOUw0dEVmWllnTk9wWlNrblZIbDI4QlpYOUNZZGJRRnN4SksKb3RiQ0xCVUNnWUVBd0wzV1J3eWtNNzVVclpBSkVSZUZVN2JsWnFMOXpHQkpZcURQaHFSSWVad1RmM3hJRjc2RQpKS2FHL1VCcWwyY2ZhUFBNUjRCUjdEeHZtNGNWNVJ1TmZseW1SamVSdWdVckE3RzRWTjcvbkdTTzhPazh5SW5uCmJYeDNpR0MrSVhIcTRnNTdwODNqRlo5UmRtc2E1OW5RNjhvU2VvUllwZzhBSlJrbWtIQ3dRd2NDZ1lFQTQxN1gKTmY2WjRoSmNtc2c2WE9aM0hwMDI2bDdBbk5wbGYxVHJ0dExQa0NnaFNYdEdsekhhWmxwMG4yQ3JZbnBMWC9EYwpVcWg5UG9QcjhDalN5cDd0WkhTenZtZmM2VVlZNkt4NTdBOHdYem0xM0NGdmFlYVJVOE1OWVk1ckt1YjBlZ3FqCi9CdDg4MXI1RmVzN3NyazlkbERwbTIxTzVYTFNsVGt1cTJXMjJQOENnWUEwMzE2Nm10TW9ocHZBQ1BVVHhUb0QKM3ZaTEU0Yy8yMklHTmtyM2luVi9OcnQ2aTJOVGNDWGJ6L3JUMmluallweVJNOS9qOVdXRHdvaHpSN2xQNGlFTQpldW41OVNCNndSUXRyVUQ5dHphemRqcG9CL051cDdYZXFQZzVaeUNCR0Rqd3pqeEpxZ2NUVldNSmN4UXNhZW9QCjVKenhFd0VtZkpMem1sU2o1dVhUWFFLQmdDS1B4aEwxRXBza3cyTGIwTk5TVFFVZ1RMcXZrSVBIUnVwbUZEYUUKTVB6dXZMQ1l4cEF4Q2N2Sk1EVVIwcnR6YjRXejdTbTdadDViMno5MFZTWnJwaFpCRHhtQVhEb3haNVBtczluSQpMVWdzVTVLVW1vVDBnVjdFSllLUXpZV0YrZCtiUW5ZT0Q1NUdVOXFiR1VYL2xuSW50bnJqMEx4Y0NkcVpDSmtSCkt3d3RBb0dBYWZ1dXQra0dYOHE1d09wMUdJVjIva3RwTXk1ekFl
NUZJQUhsSjlMTERqZTl2Z3FDZHJPaHpxTHQKWFhLc3g0Um9mSlViRmovL3E2dTZldUFXWHVmQi9SazUrLzVLVkVTa2lrb3dYTUVOeUJwWlNqcmlmZVM3TjRvNQpIUUVnN2ZGM0F4QUptWDNzT0pJeWJQWDlJMjlEUGxyMHphU1ZmdjRsU25SdmtGaktBdnc9Ci0tLS0tRU5EIFJTQSBQUklWQVRFIEtFWS0tLS0tCg==

After the first master is recovered, OpenShift continues working.

Expected results:

Additional info:
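One way to check whether a node is affected is to look at the `server:` URL in its kubeconfig and see whether it names an individual master instead of the loadbalancer. The following is only a rough sketch of that check, not part of the reported fix: the function names are made up for illustration, the sample text is a hand-written excerpt modelled on the kubeconfig above, and the `-lb` hostname in the usage note is hypothetical.

```python
import re
from urllib.parse import urlparse


def kubeconfig_server_hosts(kubeconfig_text):
    """Return the hostname of every `server:` URL found in the kubeconfig text."""
    urls = re.findall(r"\bserver:\s*(\S+)", kubeconfig_text)
    return [urlparse(url).hostname for url in urls]


def pinned_to_single_master(kubeconfig_text, master_hostnames):
    """True if any `server:` URL points directly at one of the given masters."""
    hosts = kubeconfig_server_hosts(kubeconfig_text)
    return any(host in master_hostnames for host in hosts)


# Hand-written excerpt modelled on the node kubeconfig in this report.
sample = """
clusters:
- cluster:
    server: https://ghuang-test6-ha-ocp-master-0.test.com:8443
  name: ghuang-test6-ha-ocp-master-0-test-com:8443
"""

masters = {"ghuang-test6-ha-ocp-master-0.test.com"}
print(kubeconfig_server_hosts(sample))
print(pinned_to_single_master(sample, masters))
```

With the sample above, the check reports that the node talks to master-0 directly. A kubeconfig whose `server:` URL names the loadbalancer (for example a hypothetical `ghuang-test6-ha-ocp-lb.test.com`) would not be flagged.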
Fixed in this PR: https://github.com/redhat-openstack/openshift-on-openstack/pull/294
Fixed in 0.9.5
Verified with v0.9.5:
1. Create a stack that uses an external loadbalancer (the hostname can be resolved via the DNS nameserver).
2. An app can be created successfully, and its route can be accessed.
3. Shut down the first master after creating the stack.
4. An app can be created successfully, and its route can be accessed.
5. Scale up a node.
6. An app can be created successfully, and its route can be accessed.
7. Scale down a node.
8. An app can be created successfully, and its route can be accessed.
9. Recover the first master.
10. An app can be created successfully, and its route can be accessed.