Bug 1530403
| Summary: | Installer fails noting no etcd group despite etcd hosts group that IS defined | |||
|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Eric Jones <erjones> | |
| Component: | Installer | Assignee: | Russell Teague <rteague> | |
| Status: | CLOSED ERRATA | QA Contact: | Gaoyun Pei <gpei> | |
| Severity: | medium | Docs Contact: | ||
| Priority: | medium | |||
| Version: | 3.9.0 | CC: | aos-bugs, dmoessne, erjones, jokerman, mmccomas, rteague, sdodson, wmeng, xtian | |
| Target Milestone: | --- | |||
| Target Release: | 3.9.0 | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | Bug Fix | ||
| Doc Text: | Error message on etcd group validation updated to reflect the required configurations to better inform the user of the failure state. | Story Points: | --- | |
| Clone Of: | ||||
| : | 1538795 (view as bug list) | Environment: | ||
| Last Closed: | 2018-03-28 14:17:25 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
|
Description
Eric Jones
2018-01-02 23:02:59 UTC
It's complaining because there are two etcd hosts, which is not a valid number of etcd hosts. To provide an HA etcd environment you need three etcd hosts; as it is now, if a single etcd host were to fail, the entire cluster would fail. If you don't care about HA then you can specify one. If this is a new environment, let's leave it at that. If this is an existing environment, we should try to scale it up to three etcd hosts so it has a proper HA environment. To do that, add a host to [new_etcd] and run playbooks/byo/openshift-etcd/scaleup.yml.

I understand that one could look at the hosts file and work that out from the number of etcd hosts required for HA, but did you see anything in the ansible output that would indicate that? If not, then I think that is the bug here, as we should note that as the error instead of "Running etcd as an embedded service is no longer supported. If this is a new install please define an 'etcd' group with either one or three hosts. These hosts may be the same hosts as your masters. If this is an upgrade you may set openshift_master_unsupported_embedded_etcd=true until a migration playbook becomes available.\n"

Yes, the error message you copied and pasted says as much: "If this is a new install please define an 'etcd' group with either one or three hosts." I agree the error should be updated, so we'll use this bug to track that.

Thanks Scott!

Merged

The proposed PR is not merged in openshift-ansible-3.9.0-0.24.0.git.0.735690f.el7.noarch yet; waiting for the next build to verify the bug.

Verified this bug with openshift-ansible-3.9.0-0.31.0.git.0.e0a0ad8.el7.noarch
Prepare an Ansible inventory file that has two etcd hosts, then run playbooks/prerequisites.yml.
# ansible-playbook -i host /usr/share/ansible/openshift-ansible/playbooks/prerequisites.yml
...
TASK [Evaluate groups - Fail if no etcd hosts group is defined] *************************************************************************************************************
fatal: [localhost]: FAILED! => {"changed": false, "msg": "Running etcd as an embedded service is no longer supported. If this is a new install please define an 'etcd' group with either one, three or five hosts. These hosts may be the same hosts as your masters. If this is an upgrade please see https://docs.openshift.com/container-platform/latest/install_config/upgrading/migrating_embedded_etcd.html for documentation on how to migrate from embedded to external etcd.\n"}
to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/prerequisites.retry
PLAY RECAP ******************************************************************************************************************************************************************
ec2-52-200-181-35.compute-1.amazonaws.com : ok=1 changed=0 unreachable=0 failed=0
localhost : ok=1 changed=0 unreachable=0 failed=1
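
For readers wondering why two etcd hosts are rejected, the arithmetic can be sketched as below. This is a minimal illustration only, not the check that openshift-ansible performs: the names `fault_tolerance`, `check_etcd_group`, and `VALID_ETCD_GROUP_SIZES` are hypothetical, and the set of accepted sizes (1, 3, 5) is taken from the updated error message above rather than from the installer's source. The quorum rule used is the usual majority rule, quorum = floor(n/2) + 1.

```python
from typing import Optional, Sequence

# Assumption: accepted sizes as named in the updated error message above.
VALID_ETCD_GROUP_SIZES = (1, 3, 5)


def fault_tolerance(members: int) -> int:
    """Return how many members can fail while a majority quorum survives."""
    quorum = members // 2 + 1  # majority of the cluster
    return members - quorum


def check_etcd_group(hosts: Sequence[str]) -> Optional[str]:
    """Hypothetical validator: None if the group size is accepted,
    otherwise a message that names the actual problem."""
    count = len(hosts)
    if count in VALID_ETCD_GROUP_SIZES:
        return None
    sizes = ", ".join(str(size) for size in VALID_ETCD_GROUP_SIZES)
    return (
        f"The 'etcd' group has {count} host(s); expected one of: {sizes}. "
        f"A group of {count} tolerates {fault_tolerance(count)} failure(s)."
    )


if __name__ == "__main__":
    # Placeholder host names, for illustration only.
    for group in (["etcd1"], ["etcd1", "etcd2"], ["etcd1", "etcd2", "etcd3"]):
        print(len(group), check_etcd_group(group) or "ok")
```

With two members the quorum is also two, so losing either host stops the cluster; two hosts therefore offer no more fault tolerance than one, which is why the installer asks for an odd group size.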
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:0489