Bug 1506177 - Upgrade will fail if the number of etcd hosts is more than 3
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Upgrade
Version: 3.7.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.9.0
Assigned To: Scott Dodson
QA Contact: liujia
Depends On:
Blocks:
Reported: 2017-10-25 06:31 EDT by liujia
Modified: 2018-03-28 10:08 EDT (History)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
The etcd host validation now accepts 1 or more etcd hosts, allowing greater flexibility in the number of etcd hosts configured. The recommended number of etcd hosts is still 3.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-03-28 10:08:09 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

External Trackers
Tracker                 ID              Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHBA-2018:0489  None      None    None     2018-03-28 10:08 EDT

Description liujia 2017-10-25 06:31:49 EDT
Description of problem:
When an upgrade is run against a cluster with 4 etcd hosts, it fails at the task [Evaluate groups - Fail if no etcd hosts group is defined].

fatal: [localhost]: FAILED! => {"changed": false, "failed": true, "msg": "Running etcd as an embedded service is no longer supported. If this is a new install please define an 'etcd' group with either one or three hosts. These hosts may be the same hosts as your masters. If this is an upgrade you may set openshift_master_unsupported_embedded_etcd=true until a migration playbook becomes available.\n"}

===============
The g_etcd_hosts length check should not be limited to [3,1].

# vim playbooks/common/openshift-cluster/evaluate_groups.yml
  - name: Evaluate groups - Fail if no etcd hosts group is defined
    fail:
      msg: >
        Running etcd as an embedded service is no longer supported. If this is a
        new install please define an 'etcd' group with either one or three
        hosts. These hosts may be the same hosts as your masters. If this is an
        upgrade you may set openshift_master_unsupported_embedded_etcd=true
        until a migration playbook becomes available.
    when:
    - g_etcd_hosts | default([]) | length not in [3,1]
    - not openshift_master_unsupported_embedded_etcd | default(False)
    - not (openshift_node_bootstrap | default(False))
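
For illustration only, the three `when` conditions above can be modelled in a few lines of Python. This is a rough sketch, not code from openshift-ansible: the function name, its parameters, and the example host names are hypothetical, and it assumes only that Ansible combines the items of a `when` list with a logical AND.

```python
def etcd_group_check_fails(etcd_hosts=None,
                           unsupported_embedded_etcd=False,
                           node_bootstrap=False,
                           allowed_sizes=(3, 1)):
    """Return True when the 'fail' task would fire.

    Hypothetical model of the task's `when` list; Ansible ANDs the items.
    """
    hosts = etcd_hosts or []  # stands in for `| default([])`
    return (
        len(hosts) not in allowed_sizes      # length not in [3,1]
        and not unsupported_embedded_etcd    # override variable not set
        and not node_bootstrap               # not a node bootstrap run
    )


if __name__ == "__main__":
    for n in range(6):
        # Placeholder host names, used only to build a list of length n.
        hosts = [f"etcd{i}.example.com" for i in range(n)]
        verdict = "fails" if etcd_group_check_fails(hosts) else "passes"
        print(f"{n} etcd host(s): check {verdict}")
```

Under this model a group of 4 hosts trips the check exactly like an empty group does, which is why the message about embedded etcd appears even though an etcd group is defined.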


Version-Release number of the following components:
openshift-ansible-docs-3.7.0-0.178.0.git.0.27a1039.el7.noarch

How reproducible:
always

Steps to Reproduce:
1. Run an upgrade against an OCP cluster with more than 3 etcd hosts.

Actual results:
Upgrade failed.

Expected results:
Upgrade succeeds.

Additional info:
Please attach logs from ansible-playbook with the -vvv flag
Comment 1 Jan Chaloupka 2017-10-25 08:05:48 EDT
The number of etcd members determines the failure tolerance of the cluster [1], so a cluster of size 4 is not a huge improvement over size 3. I believe the size of the etcd cluster has been kept within bounds because 1-member and 3-member etcd deployments are known and thoroughly tested.

IINM, it is preferable to deploy a cluster with 3 etcd members and then scale etcd up with playbooks/common/openshift-etcd/scaleup.yml. One can deploy a basic cluster, see how it behaves, and then scale etcd up if the number of etcd CRUD requests goes over a reasonable limit.

[1] https://coreos.com/etcd/docs/latest/v2/admin_guide.html#optimal-cluster-size
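
To make the failure-tolerance point concrete, here is a small worked example. It assumes the usual majority-quorum rule for etcd (a cluster of n members needs n // 2 + 1 of them available); the function names are illustrative and not taken from any etcd or openshift-ansible code.

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with the given member count."""
    return members // 2 + 1


def failure_tolerance(members: int) -> int:
    """Number of members that can be lost while a majority remains."""
    return members - quorum(members)


if __name__ == "__main__":
    for n in range(1, 6):
        print(f"{n} member(s): quorum {quorum(n)}, "
              f"tolerates {failure_tolerance(n)} failure(s)")
```

Under that rule, sizes 3 and 4 both tolerate a single member failure, and tolerance only rises to 2 at size 5.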
Comment 2 Scott Dodson 2017-10-25 09:05:45 EDT
Discussed with the master team (Michal Fojtik and Stefan Schimanski): we should accept 1, 3, or 5 nodes as an acceptable cluster size, and we should recommend 3 nodes. Let's update the error message to make that clearer.
Comment 4 Scott Dodson 2018-01-24 10:49:26 EST
https://github.com/openshift/openshift-ansible/pull/6749 updates the rules to accept 1, 3, or 5 etcd hosts. We're not going to support any other configurations.
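
As a rough sketch of the rule described in this comment, and not the actual change made in the pull request, the accepted sizes can be expressed as a simple membership test. The constant and function names below are hypothetical.

```python
# Sizes named in the comment above; any other count is unsupported.
ACCEPTED_ETCD_SIZES = (1, 3, 5)


def is_supported_etcd_size(host_count: int) -> bool:
    """True if the etcd group size is one of the accepted values."""
    return host_count in ACCEPTED_ETCD_SIZES


if __name__ == "__main__":
    rejected = [n for n in range(7) if not is_supported_etcd_size(n)]
    print("rejected sizes between 0 and 6:", rejected)
```

Note that a 4-host cluster, the case originally reported, is still rejected under this rule; only the 5-host case becomes valid.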
Comment 6 liujia 2018-01-30 03:03:47 EST
Verified on openshift-ansible-3.9.0-0.31.0.git.0.e0a0ad8.el7.noarch.
The failure message no longer blocks the upgrade playbook when upgrading with 5 etcd hosts, but the playbook still fails when the number of etcd hosts is not in [1,3,5].
Comment 10 errata-xmlrpc 2018-03-28 10:08:09 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0489
