| Summary: | [etcd3]Failed to install ose-3.1 with etcd3 | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Wenkai Shi <weshi> |
| Component: | Installer | Assignee: | Scott Dodson <sdodson> |
| Status: | CLOSED WONTFIX | QA Contact: | Wenkai Shi <weshi> |
| Severity: | high | Docs Contact: | |
| Priority: | medium | ||
| Version: | 3.1.0 | CC: | aos-bugs, bleanhar, jchaloup, jokerman, mbarrett, mmccomas, weshi, wmeng, xtian |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2017-06-22 21:10:20 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Additional info:

Sorry for the typo. The corrected inventory is:

    [root@ansible ~]# cat hosts
    ...
    [masters]
    master1.example.com
    master2.example.com
    [nodes]
    master1.example.com
    master2.example.com
    node1.example.com
    node2.example.com
    [etcd]
    etcd1.example.com
    etcd2.example.com
    etcd3.example.com
    [lb]
    lb.example.com
    [nfs]
    nfs.example.com

This shouldn't be an issue with the packaging updates in RHEL 7.3.2, because etcd-3.0.x obsoletes etcd3. Can you please test this again? If this is no longer an issue, we should close it as NOTABUG.

Hi Wenkai,

it does not make much sense to run OpenShift 3.1 with etcd-3.*. OpenShift 3.1 is derived from Kubernetes 1.2-alpha-7, which at the time did not know anything about etcd v3. Kubernetes 1.2-alpha-7 uses 2.2.2-4, which is internally (and possibly externally) a lot different from v3.

Do we support deployment of ose-1.3 with etcd > 3 at all?

(In reply to Jan Chaloupka from comment #6)
> Hi Wenkai,
>
> it does not make much sense to run OpenShift 3.1 with etcd-3.*. OpenShift
> 3.1 is derived from Kubernetes 1.2-alpha-7, which at the time did not know
> anything about etcd v3. Kubernetes 1.2-alpha-7 uses 2.2.2-4, which is
> internally (and possibly externally) a lot different from v3.
>
> Do we support deployment of ose-1.3 with etcd > 3 at all?

Hi,

We did not officially support deploying OpenShift 3.1 with etcd3, because etcd3 did not exist when OpenShift 3.1 was released. But consider this: if a customer wants to deploy OpenShift 3.1 today, the etcd version will be 3.*.

Checked with openshift-ansible-3.0.101-1.git.0.4d5c0f5.el7aos.noarch. Installation failed; it seems that oo_first_etcd is missing.

    # ansible-playbook -i hosts -v /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml
    ...
    PLAY [Configure etcd certificates] ********************************************

    GATHERING FACTS ***************************************************************
    FATAL: no hosts matched or all hosts have already failed -- aborting

    TASK: [openshift_facts | Verify Ansible version is greater than or equal to 1.9.4] ***
    FATAL: no hosts matched or all hosts have already failed -- aborting
    ...

Wenkai,

Are you using ansible-1.9.4 or ansible 2.x? This version of the installer is not compatible with 2.x.

(In reply to Scott Dodson from comment #13)
> Wenkai,
>
> Are you using ansible-1.9.4 or ansible 2.x? This version of the installer is
> not compatible with 2.x.

Hi, I'm using ansible-1.9.4 to check this.

    # rpm -q ansible
    ansible-1.9.4-1.el7aos.noarch

Wenkai, have you been able to verify the fix with Ansible 2.x?

(In reply to Jan Chaloupka from comment #15)
> Wenkai, have you been able to verify the fix with Ansible 2.x?

I tried with Ansible 2.x; it doesn't work.

    # rpm -q ansible
    ansible-2.2.3.0-1.el7.noarch
    # ansible-playbook -i hosts -v /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml
    ...
    TASK [add_host] ****************************************************************
    Tuesday 20 June 2017 04:44:43 +0000 (0:00:00.015) 0:00:00.052 **********
    An exception occurred during task execution. To see the full traceback, use -vvv. The error was: ImportError: cannot import name bool
    fatal: [localhost]: FAILED! => {"failed": true, "msg": "Unexpected failure during module execution.", "stdout": ""}
    ...

How did you install Ansible? It looks like a Python module is missing.

(In reply to Jan Chaloupka from comment #17)
> How did you install Ansible? It looks like a Python module is missing.

I installed it with the yum install command. With this version of Ansible, I can install an OCP 3.6 environment without problems.

Moving this to WONTFIX until we get a customer case associated with it. The likelihood of anyone installing 3.1 right now is very low.
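The Ansible version constraint discussed in the comments above (the playbook's own check for 1.9.4 or newer, plus the statement that this installer release does not work with 2.x) can be sketched as a small pre-flight check. This is only an illustration: the function names `version_from_rpm` and `installer_supports` are hypothetical and are not part of openshift-ansible, and the upper bound is inferred from the comment thread rather than from the installer code.

```python
import re

# Lower bound taken from the task name
# "Verify Ansible version is greater than or equal to 1.9.4".
MIN_SUPPORTED = (1, 9, 4)
# Upper bound is an assumption based on the comment
# "This version of the installer is not compatible with 2.x".
FIRST_UNSUPPORTED_MAJOR = 2


def version_from_rpm(package):
    """Extract a version tuple from an `rpm -q ansible` style package name."""
    match = re.match(r"ansible-(\d+(?:\.\d+)*)-", package)
    if match is None:
        raise ValueError("unrecognized package name: %r" % package)
    return tuple(int(part) for part in match.group(1).split("."))


def installer_supports(version):
    """Return True if the version falls inside the range discussed in the thread."""
    return version >= MIN_SUPPORTED and version[0] < FIRST_UNSUPPORTED_MAJOR


if __name__ == "__main__":
    # The two package names reported in the comments.
    for pkg in ("ansible-1.9.4-1.el7aos.noarch", "ansible-2.2.3.0-1.el7.noarch"):
        verdict = "supported" if installer_supports(version_from_rpm(pkg)) else "not supported"
        print(pkg, verdict)
```

Running such a check before the playbook would separate the Ansible 2.x failure (`ImportError: cannot import name bool`) from the etcd3 startup problem that this report is actually about.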
Description of problem:

So far, the version 3 etcd package is named etcd3. To test etcd3, we modified the openshift-ansible code, changing the package name from "etcd" to "etcd3" so that it installs etcd3 by default. We prepared a new environment with 3 etcd nodes; installation fails when starting etcd.service.

In the v3.1 openshift-ansible playbook, the etcd cluster is deployed by deploying the first etcd server, waiting for it to start, and then deploying the others. With etcd3, the first etcd node cannot find the other two etcd servers before they are deployed, so the first etcd server never starts. With etcd2 in the same situation, the first etcd server starts successfully even if the other two etcd nodes are not installed yet.

This issue does not happen in v3.2, v3.3, or v3.4, because the openshift-ansible playbook behaves differently in those versions: it deploys all etcd servers one by one and only then waits for the etcd servers to start.

Version-Release number of selected component (if applicable):
openshift-ansible-3.0.98-1
openshift v3.1.1.8
etcd3-3.0.3-1.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Modify the openshift-ansible code as follows:

    [root@ansible ~]# vim /usr/share/ansible/openshift-ansible/roles/etcd/tasks/main.yml
    ...
    - name: Install etcd
      action: "{{ ansible_pkg_mgr }} name=etcd3 state=present"
      when: not etcd_is_containerized | bool
    ...

2. Prepare an environment with an etcd cluster composed of 3 etcd servers.
3.

Actual results:
Installation failed.

    [root@ansible ~]# ansible-playbook -i hosts -v /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml
    ...
    PLAY [Configure first etcd host] **********************************************
    ...
    TASK: [etcd | Enable etcd] ****************************************************
    failed: [etcd1.example.com] => {"failed": true}
    msg: Job for etcd.service failed because a timeout was exceeded. See "systemctl status etcd.service" and "journalctl -xe" for details.
    ...
Expected results:
Installation succeeds.

Additional info:

    [root@ansible ~]# cat hosts
    ...
    [masters]
    master.example.com
    node.example.com
    [nodes]
    master1.example.com
    master2.example.com
    node1.example.com
    node2.example.com
    [etcd]
    etcd1.example.com
    etcd2.example.com
    etcd3.example.com
    [lb]
    lb.example.com
    [nfs]
    nfs.example.com
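The ordering difference described under "Description of problem" can be illustrated with a toy model. This is a sketch under stated assumptions, not the playbook's logic: the function names are hypothetical, and the rule that an etcd3 member only finishes starting once a majority of the cluster has been launched is a simplification of the report's observation that the first etcd3 server cannot start before it finds its peers.

```python
def quorum(cluster_size):
    """Majority of the cluster: 2 of 3, 3 of 5, and so on."""
    return cluster_size // 2 + 1


def start_then_wait_each(hosts, needs_quorum_to_start):
    """3.1-style flow from the report: start one member and wait for it
    before the next member is deployed."""
    launched = []
    for host in hosts:
        launched.append(host)
        # Assumption: a member that needs quorum cannot finish starting
        # while fewer than a majority of members have been launched.
        if needs_quorum_to_start and len(launched) < quorum(len(hosts)):
            return "timeout starting " + host
    return "all members started"


def launch_all_then_wait(hosts, needs_quorum_to_start):
    """3.2+-style flow from the report: deploy every member first,
    then wait for the servers to start."""
    launched = list(hosts)
    if needs_quorum_to_start and len(launched) < quorum(len(hosts)):
        return "timeout"
    return "all members started"


if __name__ == "__main__":
    etcd_hosts = ["etcd1.example.com", "etcd2.example.com", "etcd3.example.com"]
    print("3.1 flow, etcd2:", start_then_wait_each(etcd_hosts, False))
    print("3.1 flow, etcd3:", start_then_wait_each(etcd_hosts, True))
    print("3.2+ flow, etcd3:", launch_all_then_wait(etcd_hosts, True))
```

In this model only the combination of the 3.1 flow with etcd3 stalls on etcd1.example.com, which matches the timeout on etcd.service shown in the actual results.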