Bug 1507083 - openshift_master_etcd_hosts list get wrong in rpm install.
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.7.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: ---
Target Release: 3.7.0
Assigned To: Andrew Butcher
QA Contact: Johnny Liu
Keywords: TestBlocker
Reported: 2017-10-27 12:00 EDT by Johnny Liu
Modified: 2017-11-28 17:20 EST (History)
Last Closed: 2017-11-28 17:20:01 EST
Type: Bug
Attachments:
installation log (4.28 MB, text/plain), 2017-10-27 12:08 EDT, Johnny Liu


External Trackers:
Tracker: Red Hat Product Errata
ID: RHSA-2017:3188
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: Red Hat OpenShift Container Platform 3.7 security, bug, and enhancement update
Last Updated: 2017-11-28 21:34:54 EST

Description Johnny Liu 2017-10-27 12:00:25 EDT
Description of problem:
TASK [set_fact] ****************************************************************
Friday 27 October 2017  15:11:38 +0000 (0:00:00.069)       0:10:28.609 ******** 
ok: [qe-jialiu-jijm-master-etcd-1.1027-qtx.qe.rhcloud.com] => {"ansible_facts": {"openshift_master_etcd_hosts": ["qe-jialiu-jijm-master-etcd-1", "qe-jialiu-jijm-master-etcd-1", "qe-jialiu-jijm-master-etcd-1"], "openshift_master_etcd_port": "2379"}, "changed": false, "failed": false}
ok: [qe-jialiu-jijm-master-etcd-2.1027-qtx.qe.rhcloud.com] => {"ansible_facts": {"openshift_master_etcd_hosts": ["qe-jialiu-jijm-master-etcd-1", "qe-jialiu-jijm-master-etcd-1", "qe-jialiu-jijm-master-etcd-1"], "openshift_master_etcd_port": "2379"}, "changed": false, "failed": false}
ok: [qe-jialiu-jijm-master-etcd-3.1027-qtx.qe.rhcloud.com] => {"ansible_facts": {"openshift_master_etcd_hosts": ["qe-jialiu-jijm-master-etcd-1", "qe-jialiu-jijm-master-etcd-1", "qe-jialiu-jijm-master-etcd-1"], "openshift_master_etcd_port": "2379"}, "changed": false, "failed": false}

As a result, the master connects to the wrong etcd endpoints: every URL points to the first etcd host.
etcdClientInfo:
  ca: master.etcd-ca.crt
  certFile: master.etcd-client.crt
  keyFile: master.etcd-client.key
  urls:
  - https://qe-jialiu-jijm-master-etcd-1:2379
  - https://qe-jialiu-jijm-master-etcd-1:2379
  - https://qe-jialiu-jijm-master-etcd-1:2379
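
The symptom above can be checked mechanically. The following is a minimal Python sketch, not part of openshift-ansible: it parses one `ok: [...] => {...}` line of the set_fact output, reports hosts that appear more than once, and rebuilds the URL list on the assumption that each URL is `https://<host>:<port>`, as the etcdClientInfo block suggests. All function names are hypothetical.

```python
import json


def etcd_facts_from_line(line):
    """Return the ansible_facts dict from an 'ok: [host] => {...}' log line."""
    _, sep, payload = line.partition(" => ")
    if not sep:
        raise ValueError("not an Ansible result line")
    return json.loads(payload)["ansible_facts"]


def duplicated(hosts):
    """Return the sorted host names that occur more than once."""
    return sorted({h for h in hosts if hosts.count(h) > 1})


def etcd_urls(facts):
    """Assumed URL layout: https://<host>:<port> for each listed host."""
    port = facts["openshift_master_etcd_port"]
    return [f"https://{host}:{port}" for host in facts["openshift_master_etcd_hosts"]]


# Sample line copied from the description above.
line = (
    'ok: [qe-jialiu-jijm-master-etcd-1.1027-qtx.qe.rhcloud.com] => '
    '{"ansible_facts": {"openshift_master_etcd_hosts": '
    '["qe-jialiu-jijm-master-etcd-1", "qe-jialiu-jijm-master-etcd-1", '
    '"qe-jialiu-jijm-master-etcd-1"], "openshift_master_etcd_port": "2379"}, '
    '"changed": false, "failed": false}'
)

facts = etcd_facts_from_line(line)
dupes = duplicated(facts["openshift_master_etcd_hosts"])
urls = etcd_urls(facts)
print("duplicated hosts:", dupes)
print("urls:", urls)
```

A non-empty `dupes` list is the failure signature reported here; a healthy run would list each etcd host once.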

I tried another install on AWS and did not see this issue.

Is this because the etcd hostnames are similar?

Version-Release number of the following components:
openshift-ansible-3.7.0-0.178.1.git.0.43f8486.el7.noarch

How reproducible:
Always

Comment 1 Johnny Liu 2017-10-27 12:07:10 EDT
Please get the inventory host file and the installation log from the attachment.
Comment 2 Johnny Liu 2017-10-27 12:08 EDT
Created attachment 1344385
installation log
Comment 3 Johnny Liu 2017-10-27 12:13:49 EDT
Okay, it seems I can reproduce it on another rpm install on AWS, so it appears to be unrelated to the hostnames.

The containerized install passes:
ok: [ec2-34-227-98-143.compute-1.amazonaws.com] => {"ansible_facts": {"openshift_master_etcd_hosts": ["ip-172-18-9-222.ec2.internal", "ip-172-18-4-165.ec2.internal", "ip-172-18-12-109.ec2.internal"], "openshift_master_etcd_port": "2379"}, "changed": false, "failed": false}
ok: [ec2-52-90-152-31.compute-1.amazonaws.com] => {"ansible_facts": {"openshift_master_etcd_hosts": ["ip-172-18-9-222.ec2.internal", "ip-172-18-4-165.ec2.internal", "ip-172-18-12-109.ec2.internal"], "openshift_master_etcd_port": "2379"}, "changed": false, "failed": false}
ok: [ec2-52-86-178-3.compute-1.amazonaws.com] => {"ansible_facts": {"openshift_master_etcd_hosts": ["ip-172-18-9-222.ec2.internal", "ip-172-18-4-165.ec2.internal", "ip-172-18-12-109.ec2.internal"], "openshift_master_etcd_port": "2379"}, "changed": false, "failed": false}


The rpm install fails:
TASK [set_fact] ****************************************************************
Friday 27 October 2017  15:53:42 +0000 (0:00:00.076)       0:08:26.101 ******** 
ok: [ec2-52-202-232-150.compute-1.amazonaws.com] => {"ansible_facts": {"openshift_master_etcd_hosts": ["ip-172-18-14-244.ec2.internal", "ip-172-18-14-244.ec2.internal", "ip-172-18-14-244.ec2.internal"], "openshift_master_etcd_port": "2379"}, "changed": false, "failed": false}
ok: [ec2-34-229-115-245.compute-1.amazonaws.com] => {"ansible_facts": {"openshift_master_etcd_hosts": ["ip-172-18-14-244.ec2.internal", "ip-172-18-14-244.ec2.internal", "ip-172-18-14-244.ec2.internal"], "openshift_master_etcd_port": "2379"}, "changed": false, "failed": false}
ok: [ec2-52-90-116-202.compute-1.amazonaws.com] => {"ansible_facts": {"openshift_master_etcd_hosts": ["ip-172-18-14-244.ec2.internal", "ip-172-18-14-244.ec2.internal", "ip-172-18-14-244.ec2.internal"], "openshift_master_etcd_port": "2379"}, "changed": false, "failed": false}
Comment 4 Johnny Liu 2017-10-28 04:15:40 EDT
Once this issue occurs, the API service on the other masters (all but the first) fails to start.

Oct 28 04:09:09 qe-jialiu-xlxf-master-etcd-3 atomic-openshift-master-api[5136]: F1028 04:09:09.247495    5136 hooks.go:133] PostStartHook "oauth.openshift.io-EnsureBootstrapOAuthClients" failed: Post https://qe-jialiu-xlxf-master-etcd-3:8443/apis/oauth.openshift.io/v1/oauthclients: x509: certificate is valid for kubernetes, kubernetes.default, kubernetes.default.svc, kubernetes.default.svc.cluster.local, openshift, openshift.default, openshift.default.svc, openshift.default.svc.cluster.local, qe-jialiu-xlxf-lb-1.1028-v-k.qe.rhcloud.com, qe-jialiu-xlxf-master-etcd-1, qe-jialiu-xlxf-master-etcd-1.1028-v-k.qe.rhcloud.com, 10.240.0.2, 172.30.0.1, 35.202.242.152, not qe-jialiu-xlxf-master-etcd-3


That means the whole multi-master environment setup fails.


This is blocking rpm multi-master testing.
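
The fatal log line in this comment already names the mismatch: the certificate presented at qe-jialiu-xlxf-master-etcd-3:8443 lists the master-etcd-1 names but not master-etcd-3. The following is a minimal Python sketch for pulling that out of such a message. The helper name is hypothetical and the regular expression is written only for the "certificate is valid for ..., not ..." wording seen here; it is not a general x509 parser.

```python
import re

# Matches the wording of the x509 hostname error shown in the log above.
MISMATCH_RE = re.compile(r"certificate is valid for (?P<names>.+), not (?P<requested>\S+)")


def parse_x509_mismatch(message):
    """Return (valid_names, requested_name), or None if the text does not match."""
    match = MISMATCH_RE.search(message)
    if match is None:
        return None
    names = [name.strip() for name in match.group("names").split(",")]
    return names, match.group("requested")


# Error text copied from the atomic-openshift-master-api log line.
message = (
    "x509: certificate is valid for kubernetes, kubernetes.default, "
    "kubernetes.default.svc, kubernetes.default.svc.cluster.local, "
    "openshift, openshift.default, openshift.default.svc, "
    "openshift.default.svc.cluster.local, "
    "qe-jialiu-xlxf-lb-1.1028-v-k.qe.rhcloud.com, "
    "qe-jialiu-xlxf-master-etcd-1, "
    "qe-jialiu-xlxf-master-etcd-1.1028-v-k.qe.rhcloud.com, "
    "10.240.0.2, 172.30.0.1, 35.202.242.152, "
    "not qe-jialiu-xlxf-master-etcd-3"
)

names, requested = parse_x509_mismatch(message)
print("requested:", requested)
print("covered by certificate:", requested in names)
```

Listing `names` next to `requested` makes it easy to see which master's names ended up in the certificate.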
Comment 5 Scott Dodson 2017-10-30 10:11:02 EDT
(In reply to Johnny Liu from comment #1)
> Please get the inventory host file and the installation log from the attachment.

yeah can we get the inventory
Comment 6 Johnny Liu 2017-10-30 22:48:47 EDT
What info do you need from me? The inventory host file? As I said, the inventory host file is included in the attachment (search for the "openshift-ansible-inventory-start" keyword in the attachment).

> yeah can we get the inventory
I guess Scott made a typo.
Comment 10 Johnny Liu 2017-11-02 23:39:54 EDT
Verified this bug with openshift-ansible-3.7.0-0.190.0.git.0.129e91a.el7.noarch; it passes.


TASK [set_fact] ****************************************************************
Friday 03 November 2017  02:44:58 +0000 (0:00:00.068)       0:08:07.903 ******* 
ok: [ec2-54-242-50-70.compute-1.amazonaws.com] => {"ansible_facts": {"openshift_master_etcd_hosts": ["ip-172-18-8-57.ec2.internal", "ip-172-18-12-243.ec2.internal", "ip-172-18-14-135.ec2.internal"], "openshift_master_etcd_port": "2379"}, "changed": false, "failed": false}
ok: [ec2-52-206-149-174.compute-1.amazonaws.com] => {"ansible_facts": {"openshift_master_etcd_hosts": ["ip-172-18-8-57.ec2.internal", "ip-172-18-12-243.ec2.internal", "ip-172-18-14-135.ec2.internal"], "openshift_master_etcd_port": "2379"}, "changed": false, "failed": false}
ok: [ec2-52-91-66-4.compute-1.amazonaws.com] => {"ansible_facts": {"openshift_master_etcd_hosts": ["ip-172-18-8-57.ec2.internal", "ip-172-18-12-243.ec2.internal", "ip-172-18-14-135.ec2.internal"], "openshift_master_etcd_port": "2379"}, "changed": false, "failed": false}
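
The same verification can be run over a whole log. This is a minimal Python sketch under stated assumptions: the result lines keep the `ok: [host] => {json}` shape shown in this report, and "healthy" is taken to mean that every master reports the same etcd host list with no repeated entries. The function names and the two sample logs (abridged from the outputs in this report) are for illustration only.

```python
import json
import re

# One Ansible result line: ok: [inventory-host] => {json payload}
LINE_RE = re.compile(r"^ok: \[(?P<host>[^\]]+)\] => (?P<payload>\{.*\})$")


def collect_etcd_hosts(log_text):
    """Map each inventory host to the etcd host list it reported."""
    per_host = {}
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if match is None:
            continue  # skip TASK banners, timestamps and blank lines
        facts = json.loads(match.group("payload")).get("ansible_facts", {})
        if "openshift_master_etcd_hosts" in facts:
            per_host[match.group("host")] = facts["openshift_master_etcd_hosts"]
    return per_host


def looks_healthy(per_host):
    """True if all masters agree on one list and that list has no repeats."""
    lists = list(per_host.values())
    if not lists:
        return False
    first = lists[0]
    return all(item == first for item in lists) and len(set(first)) == len(first)


GOOD_LOG = """
TASK [set_fact] ****
ok: [ec2-54-242-50-70.compute-1.amazonaws.com] => {"ansible_facts": {"openshift_master_etcd_hosts": ["ip-172-18-8-57.ec2.internal", "ip-172-18-12-243.ec2.internal", "ip-172-18-14-135.ec2.internal"], "openshift_master_etcd_port": "2379"}, "changed": false, "failed": false}
ok: [ec2-52-206-149-174.compute-1.amazonaws.com] => {"ansible_facts": {"openshift_master_etcd_hosts": ["ip-172-18-8-57.ec2.internal", "ip-172-18-12-243.ec2.internal", "ip-172-18-14-135.ec2.internal"], "openshift_master_etcd_port": "2379"}, "changed": false, "failed": false}
"""

BAD_LOG = """
ok: [ec2-52-202-232-150.compute-1.amazonaws.com] => {"ansible_facts": {"openshift_master_etcd_hosts": ["ip-172-18-14-244.ec2.internal", "ip-172-18-14-244.ec2.internal", "ip-172-18-14-244.ec2.internal"], "openshift_master_etcd_port": "2379"}, "changed": false, "failed": false}
"""

good = collect_etcd_hosts(GOOD_LOG)
bad = collect_etcd_hosts(BAD_LOG)
print("fixed build healthy:", looks_healthy(good))
print("failing build healthy:", looks_healthy(bad))
```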
Comment 13 errata-xmlrpc 2017-11-28 17:20:01 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:3188
