Description of problem:

TASK [set_fact] ****************************************************************
Friday 27 October 2017 15:11:38 +0000 (0:00:00.069) 0:10:28.609 ********
ok: [qe-jialiu-jijm-master-etcd-1.1027-qtx.qe.rhcloud.com] => {"ansible_facts": {"openshift_master_etcd_hosts": ["qe-jialiu-jijm-master-etcd-1", "qe-jialiu-jijm-master-etcd-1", "qe-jialiu-jijm-master-etcd-1"], "openshift_master_etcd_port": "2379"}, "changed": false, "failed": false}
ok: [qe-jialiu-jijm-master-etcd-2.1027-qtx.qe.rhcloud.com] => {"ansible_facts": {"openshift_master_etcd_hosts": ["qe-jialiu-jijm-master-etcd-1", "qe-jialiu-jijm-master-etcd-1", "qe-jialiu-jijm-master-etcd-1"], "openshift_master_etcd_port": "2379"}, "changed": false, "failed": false}
ok: [qe-jialiu-jijm-master-etcd-3.1027-qtx.qe.rhcloud.com] => {"ansible_facts": {"openshift_master_etcd_hosts": ["qe-jialiu-jijm-master-etcd-1", "qe-jialiu-jijm-master-etcd-1", "qe-jialiu-jijm-master-etcd-1"], "openshift_master_etcd_port": "2379"}, "changed": false, "failed": false}

The fact lists the first etcd host three times instead of the three etcd members. As a result, the master is configured to connect to the wrong etcd cluster:

etcdClientInfo:
  ca: master.etcd-ca.crt
  certFile: master.etcd-client.crt
  keyFile: master.etcd-client.key
  urls:
  - https://qe-jialiu-jijm-master-etcd-1:2379
  - https://qe-jialiu-jijm-master-etcd-1:2379
  - https://qe-jialiu-jijm-master-etcd-1:2379

I tried another install on AWS and did not see this issue. Is this because the etcd hostnames are similar?

Version-Release number of the following components:
openshift-ansible-3.7.0-0.178.1.git.0.43f8486.el7.noarch

How reproducible:
Always

Steps to Reproduce:
1.
2.
3.

Actual results:
Please include the entire output from the last TASK line through the end of output if an error is generated

Expected results:

Additional info:
Please attach logs from ansible-playbook with the -vvv flag
Please get the inventory host file and installation log from the attachment.
Created attachment 1344385 installation log
OK, it seems I can reproduce this with another rpm install on AWS, so it appears to be unrelated to the hostnames.

The containerized install passes:

ok: [ec2-34-227-98-143.compute-1.amazonaws.com] => {"ansible_facts": {"openshift_master_etcd_hosts": ["ip-172-18-9-222.ec2.internal", "ip-172-18-4-165.ec2.internal", "ip-172-18-12-109.ec2.internal"], "openshift_master_etcd_port": "2379"}, "changed": false, "failed": false}
ok: [ec2-52-90-152-31.compute-1.amazonaws.com] => {"ansible_facts": {"openshift_master_etcd_hosts": ["ip-172-18-9-222.ec2.internal", "ip-172-18-4-165.ec2.internal", "ip-172-18-12-109.ec2.internal"], "openshift_master_etcd_port": "2379"}, "changed": false, "failed": false}
ok: [ec2-52-86-178-3.compute-1.amazonaws.com] => {"ansible_facts": {"openshift_master_etcd_hosts": ["ip-172-18-9-222.ec2.internal", "ip-172-18-4-165.ec2.internal", "ip-172-18-12-109.ec2.internal"], "openshift_master_etcd_port": "2379"}, "changed": false, "failed": false}

The rpm install fails:

TASK [set_fact] ****************************************************************
Friday 27 October 2017 15:53:42 +0000 (0:00:00.076) 0:08:26.101 ********
ok: [ec2-52-202-232-150.compute-1.amazonaws.com] => {"ansible_facts": {"openshift_master_etcd_hosts": ["ip-172-18-14-244.ec2.internal", "ip-172-18-14-244.ec2.internal", "ip-172-18-14-244.ec2.internal"], "openshift_master_etcd_port": "2379"}, "changed": false, "failed": false}
ok: [ec2-34-229-115-245.compute-1.amazonaws.com] => {"ansible_facts": {"openshift_master_etcd_hosts": ["ip-172-18-14-244.ec2.internal", "ip-172-18-14-244.ec2.internal", "ip-172-18-14-244.ec2.internal"], "openshift_master_etcd_port": "2379"}, "changed": false, "failed": false}
ok: [ec2-52-90-116-202.compute-1.amazonaws.com] => {"ansible_facts": {"openshift_master_etcd_hosts": ["ip-172-18-14-244.ec2.internal", "ip-172-18-14-244.ec2.internal", "ip-172-18-14-244.ec2.internal"], "openshift_master_etcd_port": "2379"}, "changed": false, "failed": false}
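For anyone triaging similar logs, the symptom can be spotted mechanically by looking for repeated entries in openshift_master_etcd_hosts. The following is only a minimal sketch, not part of openshift-ansible: the function name duplicated_etcd_hosts is hypothetical, and it assumes the JSON object printed after "=>" in the set_fact output has been copied out as a string.

```python
import json
from collections import Counter


def duplicated_etcd_hosts(fact_json):
    """Return the hosts listed more than once in openshift_master_etcd_hosts.

    fact_json is the JSON object printed after "=>" in the set_fact output.
    (Hypothetical helper for log triage, not an openshift-ansible API.)
    """
    facts = json.loads(fact_json)["ansible_facts"]
    counts = Counter(facts["openshift_master_etcd_hosts"])
    return sorted(host for host, n in counts.items() if n > 1)


# Fact from the failing rpm install above.
failing = (
    '{"ansible_facts": {"openshift_master_etcd_hosts": '
    '["ip-172-18-14-244.ec2.internal", "ip-172-18-14-244.ec2.internal", '
    '"ip-172-18-14-244.ec2.internal"], "openshift_master_etcd_port": "2379"}, '
    '"changed": false, "failed": false}'
)

# Fact from the passing containerized install above.
passing = (
    '{"ansible_facts": {"openshift_master_etcd_hosts": '
    '["ip-172-18-9-222.ec2.internal", "ip-172-18-4-165.ec2.internal", '
    '"ip-172-18-12-109.ec2.internal"], "openshift_master_etcd_port": "2379"}, '
    '"changed": false, "failed": false}'
)

print(duplicated_etcd_hosts(failing))
print(duplicated_etcd_hosts(passing))
```

An empty result means every etcd host appears once, as in the containerized run; a non-empty result matches the failing rpm run.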
Once this issue occurs, the API service on the other masters (all but the first) fails to start.

Oct 28 04:09:09 qe-jialiu-xlxf-master-etcd-3 atomic-openshift-master-api[5136]: F1028 04:09:09.247495 5136 hooks.go:133] PostStartHook "oauth.openshift.io-EnsureBootstrapOAuthClients" failed: Post https://qe-jialiu-xlxf-master-etcd-3:8443/apis/oauth.openshift.io/v1/oauthclients: x509: certificate is valid for kubernetes, kubernetes.default, kubernetes.default.svc, kubernetes.default.svc.cluster.local, openshift, openshift.default, openshift.default.svc, openshift.default.svc.cluster.local, qe-jialiu-xlxf-lb-1.1028-v-k.qe.rhcloud.com, qe-jialiu-xlxf-master-etcd-1, qe-jialiu-xlxf-master-etcd-1.1028-v-k.qe.rhcloud.com, 10.240.0.2, 172.30.0.1, 35.202.242.152, not qe-jialiu-xlxf-master-etcd-3

That means the whole multi-master environment setup fails. This is blocking rpm multi-master testing.
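To make the x509 error above easier to read, the names the certificate covers can be split out and compared with the name that was rejected. This is a rough sketch under assumptions: the helper parse_x509_name_error is hypothetical, and the regular expression only assumes the "certificate is valid for ..., not ..." wording seen in the log line above.

```python
import re


def parse_x509_name_error(message):
    """Split an x509 hostname-mismatch error into (valid_names, rejected_name).

    Returns None if the message does not contain the expected wording.
    (Hypothetical helper; the pattern is based only on the log line above.)
    """
    match = re.search(r"certificate is valid for (.+), not (\S+)", message)
    if match is None:
        return None
    valid_names = [name.strip() for name in match.group(1).split(",")]
    return valid_names, match.group(2)


# The x509 portion of the atomic-openshift-master-api log line above.
error = (
    "x509: certificate is valid for kubernetes, kubernetes.default, "
    "kubernetes.default.svc, kubernetes.default.svc.cluster.local, openshift, "
    "openshift.default, openshift.default.svc, "
    "openshift.default.svc.cluster.local, "
    "qe-jialiu-xlxf-lb-1.1028-v-k.qe.rhcloud.com, "
    "qe-jialiu-xlxf-master-etcd-1, "
    "qe-jialiu-xlxf-master-etcd-1.1028-v-k.qe.rhcloud.com, 10.240.0.2, "
    "172.30.0.1, 35.202.242.152, not qe-jialiu-xlxf-master-etcd-3"
)

valid_names, rejected = parse_x509_name_error(error)
# Show which master hostnames the certificate actually covers.
master_names = [name for name in valid_names if "master-etcd" in name]
print("rejected:", rejected)
print("master names in certificate:", master_names)
```

In this log, the only master names in the certificate belong to master-etcd-1, while the API server on master-etcd-3 is being addressed by its own hostname.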
(In reply to Johnny Liu from comment #1)
> Pls get inventory host file and installation log from attachment.

Yeah, can we get the inventory?
What info is needed from me? The inventory host file? As I said, the inventory host file is included in the attachment (search for the keyword "openshift-ansible-inventory-start" in the attachment).

> yeah can we get the inventory

I guess Scott made a typo.
https://github.com/openshift/openshift-ansible/pull/5978
Verified this bug with openshift-ansible-3.7.0-0.190.0.git.0.129e91a.el7.noarch; it passes.

TASK [set_fact] ****************************************************************
Friday 03 November 2017 02:44:58 +0000 (0:00:00.068) 0:08:07.903 *******
ok: [ec2-54-242-50-70.compute-1.amazonaws.com] => {"ansible_facts": {"openshift_master_etcd_hosts": ["ip-172-18-8-57.ec2.internal", "ip-172-18-12-243.ec2.internal", "ip-172-18-14-135.ec2.internal"], "openshift_master_etcd_port": "2379"}, "changed": false, "failed": false}
ok: [ec2-52-206-149-174.compute-1.amazonaws.com] => {"ansible_facts": {"openshift_master_etcd_hosts": ["ip-172-18-8-57.ec2.internal", "ip-172-18-12-243.ec2.internal", "ip-172-18-14-135.ec2.internal"], "openshift_master_etcd_port": "2379"}, "changed": false, "failed": false}
ok: [ec2-52-91-66-4.compute-1.amazonaws.com] => {"ansible_facts": {"openshift_master_etcd_hosts": ["ip-172-18-8-57.ec2.internal", "ip-172-18-12-243.ec2.internal", "ip-172-18-14-135.ec2.internal"], "openshift_master_etcd_port": "2379"}, "changed": false, "failed": false}
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2017:3188