| Summary: | [ocp-on-osp] OpenShift would not work if the first master is down when selecting External loadbalancer for the stack | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Gan Huang <ghuang> |
| Component: | Installer | Assignee: | Jan Provaznik <jprovazn> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Gan Huang <ghuang> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 3.3.0 | CC: | aos-bugs, jokerman, jprovazn, mmccomas |
| Target Milestone: | --- | Keywords: | Reopened |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-03-20 08:37:42 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |

Fixed in this PR: https://github.com/redhat-openstack/openshift-on-openstack/pull/294

Fixed in 0.9.5

Verified with v0.9.5:
1. Create a stack that uses an external loadbalancer (the hostname can be resolved via the DNS nameserver)
2. App can be created successfully, and the route can be accessed
3. Shut down the first master after creating the stack
4. App can be created successfully, and the route can be accessed
5. Scale up a node
6. App can be created successfully, and the route can be accessed
7. Scale down a node
8. App can be created successfully, and the route can be accessed
9. Recover the first master
10. App can be created successfully, and the route can be accessed

Description of problem:
When a heat stack is created with an external loadbalancer, the whole cluster goes down if the first master goes down. The root cause is that the certificate data is created with the first master by default if openshift_master_cluster_hostname and openshift_master_cluster_public_hostname are not specified in the inventory hosts file. As a result, the cluster stops working once the first master (either the whole instance or the atomic-openshift-master-api service) is down.

Version-Release number of selected component (if applicable):
v0.9.4

How reproducible:
always

Steps to Reproduce:
1. Create a heat stack with HA master + external loadbalancer
2. Shut down the first master (usually named "*master-0")

Actual results:
# oc get po
Unable to connect to the server: dial tcp 192.168.10.7:8443: i/o timeout

Note: 192.168.10.7 is the first master's IP.

On the nodes, the client-certificate-data is created with the first master:
[root@ghuang-test6-ha-ocp-node-42q26tq1 ~]# cat /etc/origin/node/system\:node\:ghuang-test6-ha-ocp-node-42q26tq1.test.com.kubeconfig
apiVersion: v1 clusters: - cluster: certificate-authority-data: 
LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUM1akNDQWRDZ0F3SUJBZ0lCQVRBTEJna3Foa2lHOXcwQkFRc3dKakVrTUNJR0ExVUVBd3diYjNCbGJuTm8KYVdaMExYTnBaMjVsY2tBeE5EYzNNRFEwTURreE1CNFhEVEUyTVRBeU1URXdNREV6TWxvWERUSXhNVEF5TURFdwpNREV6TTFvd0pqRWtNQ0lHQTFVRUF3d2JiM0JsYm5Ob2FXWjBMWE5wWjI1bGNrQXhORGMzTURRME1Ea3hNSUlCCklqQU5CZ2txaGtpRzl3MEJBUUVGQUFPQ0FROEFNSUlCQ2dLQ0FRRUFzWDlKbVdpWDVMcUVkRXRyZjVVU3R3VzMKWTBEa2hhZGFPbC9FWmRjRTVKbFdveGJCRGJweWQzOC9XNW1EWTl0cGgvTUlpdkxJU054b1ZtVTQwSXBMK0EwQgoyNEcrR0RQK25lMlR3Y1JkSS83RXk1bVhETER3cWlvN1RXMlBXY2lGRWdyQTNheUJhbWh3N1BiazVhZ21ESWE2ClFvaU1rVHlNOHViZy9XSy9TS0crMHkzQklGelE4Rk8rWlVtK25QNmMwVnl0NEVXdW5WS0RKc3RIZlp0eVJjUWgKTFlSWXI3UDl6bFp3NlRtN0tTTHVnOXk5WVJGKzA3L0VaMFU1a254MEhnVFFTK2ZERHk1S2I1eUVnbWtGRit4ZgpwT2tuWFhIb1FVSXlUT1I0T1BSSG5yQ25DQWZmb3I4K05LWElUcWJlQkNOQ2lqc3Z2Q3M5YzNxRGw0MkRTUUlECkFRQUJveU13SVRBT0JnTlZIUThCQWY4RUJBTUNBS1F3RHdZRFZSMFRBUUgvQkFVd0F3RUIvekFMQmdrcWhraUcKOXcwQkFRc0RnZ0VCQUNyZ3RXS3hmR25Da3g0RlVvM2xnc0doRnpwUCtKUzY4aU03cVJOWC83eXcwRUp0RWYrcgpxZXRJVjNRempJTi92dnlLNWhLUTU3OUd0TjYrcWdBMHNMa1J1TGIwSnJpTUh3bGdTN3dJY29IQUhxUHd2eG1xCjRTbWlidzZEY3J3YkZ1QWtqRTdRS0gwRGM1NWxzbWkrWEZnRHkzVlJhbm16NW1HbzJXaWZvblNDcjJCL2JQLysKMTVVcG5HOHlLMS9uYVJyS2tvL0xnTEpXcm9pb0QvbE11eTdPNFJOQlp0eFB6S29rY1MvNGpUdUxkK0Qya1Q2TgpkWFZGM3RidE1HMzBVVkRmZHMwS0ZFRDVScHYycjFGVE1WZ2NMaHJteFJzR0YzRHpZS0IwN2lycHZ4anRDb2VECjhsdmJyR29ROTl6RHNkYTdnSTZQZTlIcEdqUWVpZjBYeWRVPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg== server: https://ghuang-test6-ha-ocp-master-0.test.com:8443 name: ghuang-test6-ha-ocp-master-0-test-com:8443 contexts: - context: cluster: ghuang-test6-ha-ocp-master-0-test-com:8443 namespace: default user: system:node:ghuang-test6-ha-ocp-node-42q26tq1.test.com/ghuang-test6-ha-ocp-master-0-test-com:8443 name: default/ghuang-test6-ha-ocp-master-0-test-com:8443/system:node:ghuang-test6-ha-ocp-node-42q26tq1.test.com current-context: default/ghuang-test6-ha-ocp-master-0-test-com:8443/system:node:ghuang-test6-ha-ocp-node-42q26tq1.test.com kind: Config preferences: {} users: 
- name: system:node:ghuang-test6-ha-ocp-node-42q26tq1.test.com/ghuang-test6-ha-ocp-master-0-test-com:8443 user: client-certificate-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURLakNDQWhTZ0F3SUJBZ0lCRlRBTEJna3Foa2lHOXcwQkFRc3dKakVrTUNJR0ExVUVBd3diYjNCbGJuTm8KYVdaMExYTnBaMjVsY2tBeE5EYzNNRFEwTURreE1CNFhEVEUyTVRBeU1URXdNRFl6TlZvWERURTRNVEF5TVRFdwpNRFl6Tmxvd1dERVZNQk1HQTFVRUNoTU1jM2x6ZEdWdE9tNXZaR1Z6TVQ4d1BRWURWUVFERXpaemVYTjBaVzA2CmJtOWtaVHBuYUhWaGJtY3RkR1Z6ZERZdGFHRXRiMk53TFc1dlpHVXROREp4TWpaMGNURXVkR1Z6ZEM1amIyMHcKZ2dFaU1BMEdDU3FHU0liM0RRRUJBUVVBQTRJQkR3QXdnZ0VLQW9JQkFRQ3JMN3k5eEQ5OHhVT3FrNDZIQlhSMgp3S1VyN25meHAwWXJZellKdVBWV1FYT25jS0lKNnpubWxDTStKSzZ4QkFXMWF2U2lwNWxoN1FtVFcyNXZQK25TCldEanJ5TmpJaUJEbXI4eDE1RUtFSDQ0bVIwYVQ1YkRlSzh0TTNicE1jTjBEWllnM2RockF2d0xXdmJBZU9vakEKTjRWRUcxVmtUWFE3L2JsMHJYQ3kvNnpHT1ZWNHFxRTZ4aE8vcXBweFkwUHZORDM0MEdYQmtGUVM5QW9lSzk2eApPMG1zOUllL3liTjExSjFGVGhha01IS2FGNGNLdHN1b1FhSjNHSWZvZjV4NWVXcWNaN1o3RGVsS2tIYVlkYk92CjhON3RkdS9BNUtwQUpTRVI0YW5nVTZvcHhaSHJMNk4yOC9GUWxZZGRVQmV6YjI0bkhQdmJSWVVUK0dOaUdxdjUKQWdNQkFBR2pOVEF6TUE0R0ExVWREd0VCL3dRRUF3SUFvREFUQmdOVkhTVUVEREFLQmdnckJnRUZCUWNEQWpBTQpCZ05WSFJNQkFmOEVBakFBTUFzR0NTcUdTSWIzRFFFQkN3T0NBUUVBVndQRHg4dWljQUZEeUNpTEhwNG5WMUZnCnBwczFRSTFvSGZiM1lrL1VsYzFESWtGQzVRUStyR0RjWEs1TXJlZDdrMEFDdjJlZ0djM0t1OWdpKytRcC9UTHEKcmd3bTh0dllMZ1lQblNtUjNLMkJhbW9EaHBBS2ZFbmZ4bU1KaUpWTzcraFpZL3R6RElmMTBOOTVmcWpEZzBXZQo0MjJVL1Z5UmJ2cTlkaXVjTGpVNjYzU3dqb3J4YkdwUnU5bXlwTEtQYmdoSHFzQlpoSHlzZVF5Vk42UUF5dmJ0CjdRNkdIYXplUTZObnl0U09VNTFoUVVTaElwZ0tUcmEvMU9ZOVhCRkFyaGtaTEFlQ2hwM0tpRlo5L1FHYlZNRlUKeUpZL3ViYU1IUE53MENTaTB3VG5CZUlCZ0thaFJVREEzZFNwQ1BudG1DZi9XREgwREJHc241YkIxaVRhVVE9PQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg== client-key-data: 
LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFb2dJQkFBS0NBUUVBcXkrOHZjUS9mTVZEcXBPT2h3VjBkc0NsSys1MzhhZEdLMk0yQ2JqMVZrRnpwM0NpCkNlczU1cFFqUGlTdXNRUUZ0V3Iwb3FlWlllMEprMXR1YnovcDBsZzQ2OGpZeUlnUTVxL01kZVJDaEIrT0prZEcKaytXdzNpdkxUTjI2VEhEZEEyV0lOM1lhd0w4QzFyMndIanFJd0RlRlJCdFZaRTEwTy8yNWRLMXdzditzeGpsVgplS3FoT3NZVHY2cWFjV05EN3pROStOQmx3WkJVRXZRS0hpdmVzVHRKclBTSHY4bXpkZFNkUlU0V3BEQnltaGVICkNyYkxxRUdpZHhpSDZIK2NlWGxxbkdlMmV3M3BTcEIybUhXenIvRGU3WGJ2d09TcVFDVWhFZUdwNEZPcUtjV1IKNnkramR2UHhVSldIWFZBWHMyOXVKeHo3MjBXRkUvaGpZaHFyK1FJREFRQUJBb0lCQUNrN09VR1h5QmJjU0gwSQpSMWI4R0Y0VjduS1RZRzVpOU1LMGhhcDMweGV3Y2hQTlRDb0pid3U3ZUhXYVRqMHlrOUZyYm5yUzFWM0J3d0dzCkR3QmFxNDNQVS81dWhOQmYvWG9pczZOZGxDdlFrZU5rWFhwMzQwN1B5NHE3Q1FrcVVnRmtiaGUxcWFIdEg5anIKSFVWYW9kOXlQL1gwZzIvQ1BCSEsvZVU5ZFJ5WGxTeUpFMXJmdndDUSs2RHZOOFNmNmpKcDQ5aHFpNkZOSWJncQpHeGpPN1FVZUFOZ3dMTFZnd1FETDE1eTJLeTg2ZnJMRmdlYXpiSTBPWFVRaW9QWU1pMDhVS1BOeVpvMm5IL1VxCmg5QjROR3FGUUZXWXNWcHljMCtFUjJOa2RwcU9oOUw0dEVmWllnTk9wWlNrblZIbDI4QlpYOUNZZGJRRnN4SksKb3RiQ0xCVUNnWUVBd0wzV1J3eWtNNzVVclpBSkVSZUZVN2JsWnFMOXpHQkpZcURQaHFSSWVad1RmM3hJRjc2RQpKS2FHL1VCcWwyY2ZhUFBNUjRCUjdEeHZtNGNWNVJ1TmZseW1SamVSdWdVckE3RzRWTjcvbkdTTzhPazh5SW5uCmJYeDNpR0MrSVhIcTRnNTdwODNqRlo5UmRtc2E1OW5RNjhvU2VvUllwZzhBSlJrbWtIQ3dRd2NDZ1lFQTQxN1gKTmY2WjRoSmNtc2c2WE9aM0hwMDI2bDdBbk5wbGYxVHJ0dExQa0NnaFNYdEdsekhhWmxwMG4yQ3JZbnBMWC9EYwpVcWg5UG9QcjhDalN5cDd0WkhTenZtZmM2VVlZNkt4NTdBOHdYem0xM0NGdmFlYVJVOE1OWVk1ckt1YjBlZ3FqCi9CdDg4MXI1RmVzN3NyazlkbERwbTIxTzVYTFNsVGt1cTJXMjJQOENnWUEwMzE2Nm10TW9ocHZBQ1BVVHhUb0QKM3ZaTEU0Yy8yMklHTmtyM2luVi9OcnQ2aTJOVGNDWGJ6L3JUMmluallweVJNOS9qOVdXRHdvaHpSN2xQNGlFTQpldW41OVNCNndSUXRyVUQ5dHphemRqcG9CL051cDdYZXFQZzVaeUNCR0Rqd3pqeEpxZ2NUVldNSmN4UXNhZW9QCjVKenhFd0VtZkpMem1sU2o1dVhUWFFLQmdDS1B4aEwxRXBza3cyTGIwTk5TVFFVZ1RMcXZrSVBIUnVwbUZEYUUKTVB6dXZMQ1l4cEF4Q2N2Sk1EVVIwcnR6YjRXejdTbTdadDViMno5MFZTWnJwaFpCRHhtQVhEb3haNVBtczluSQpMVWdzVTVLVW1vVDBnVjdFSllLUXpZV0YrZCtiUW5ZT0Q1NUdVOXFiR1VYL2xuSW50bnJqMEx4Y0NkcVpDSmtSCkt3d3RBb0dBYWZ1dXQra0dYOHE1d09wMUdJVjIva3RwTXk1ekFl
NUZJQUhsSjlMTERqZTl2Z3FDZHJPaHpxTHQKWFhLc3g0Um9mSlViRmovL3E2dTZldUFXWHVmQi9SazUrLzVLVkVTa2lrb3dYTUVOeUJwWlNqcmlmZVM3TjRvNQpIUUVnN2ZGM0F4QUptWDNzT0pJeWJQWDlJMjlEUGxyMHphU1ZmdjRsU25SdmtGaktBdnc9Ci0tLS0tRU5EIFJTQSBQUklWQVRFIEtFWS0tLS0tCg==

After the first master is recovered, OpenShift continues working.

Expected results:

Additional info:
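
The symptom reported here (a node kubeconfig whose `server:` entry names the first master rather than the load balancer) can be checked mechanically. The following is a minimal sketch, not part of the installer: the function names `kubeconfig_servers` and `pinned_masters` are hypothetical, and it assumes the kubeconfig text is available as a string, either as proper YAML or as the flattened paste shown in this report.

```python
import re
from urllib.parse import urlparse


def kubeconfig_servers(text):
    """Return the hostname of every `server:` URL found in kubeconfig text.

    A regex is used instead of a YAML parser so that the check also works on
    a flattened, single-line paste of the file.
    """
    urls = re.findall(r"server:\s*(https?://\S+)", text)
    return [urlparse(url).hostname for url in urls]


def pinned_masters(text, master_hosts):
    """Return the server hostnames that are individual masters.

    A non-empty result means the kubeconfig talks to one specific master,
    so the node loses the API when that master goes down.
    """
    return [host for host in kubeconfig_servers(text) if host in master_hosts]


# Excerpt of the kubeconfig from this report, in its flattened form.
sample = (
    "clusters: - cluster: server: "
    "https://ghuang-test6-ha-ocp-master-0.test.com:8443 "
    "name: ghuang-test6-ha-ocp-master-0-test-com:8443"
)
masters = {"ghuang-test6-ha-ocp-master-0.test.com"}

print(pinned_masters(sample, masters))
```

On a node, the same check could be run against the contents of the file under /etc/origin/node/. An empty result would indicate that the kubeconfig does not point at any of the listed master hostnames.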