+++ This bug was initially created as a clone of Bug #1755557 +++

Scenario:
1. Perform the DR steps (remove two masters, wait until quorum is lost, physically delete the machines)
2. Recover following the instructions
3. Repeat step 1, but pick a new master (one that was created in step 2)

Expected: Able to start a new etcd quorum on the first master with only itself as a member.

Actual: Running

  /usr/local/bin/etcd-snapshot-restore.sh /root/assets/backup/snapshot.db etcd-member-ip-10-0-149-142.ec2.internal=https://etcd-2.ci-ln-shp2psk-d5d6b.origin-ci-int-aws.dev.rhcloud.com:2380

restored a cluster containing the previous (etcd-0, etcd-1) members, preventing further progress.

Root cause: the etcd-snapshot-restore script accepts ETCD_INITIAL_CLUSTER as an argument, but then sources /run/etcd/environment, which may also set ETCD_INITIAL_CLUSTER. The user's intent was to start a new cluster with one member, but the restore started with 3 members (including the two that are permanently gone). The script needs to preserve the ETCD_INITIAL_CLUSTER argument even if the variable is also set in /run/etcd/environment.

This needs to be backported to all releases.
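A minimal sketch of the kind of fix described above, assuming the script parses its arguments before sourcing the environment file; the helper variable INITIAL_CLUSTER_ARG and the argument order are illustrative, not taken from the actual etcd-snapshot-restore.sh:

  #!/usr/bin/env bash
  # Hypothetical sketch, not the shipped etcd-snapshot-restore.sh.
  SNAPSHOT_FILE="$1"
  ETCD_INITIAL_CLUSTER="$2"   # the member list the user asked for

  # Remember the caller's intent before sourcing the environment file.
  INITIAL_CLUSTER_ARG="${ETCD_INITIAL_CLUSTER}"

  # /run/etcd/environment may also define ETCD_INITIAL_CLUSTER (the old
  # 3-member cluster), which previously clobbered the argument.
  source /run/etcd/environment

  # Restore the argument so the new cluster starts with only the
  # requested members.
  ETCD_INITIAL_CLUSTER="${INITIAL_CLUSTER_ARG}"

  echo "Restoring ${SNAPSHOT_FILE} with ETCD_INITIAL_CLUSTER=${ETCD_INITIAL_CLUSTER}"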
I checked the latest payload (4.1.0-0.nightly-2019-11-20-192514); the code has not been merged into it yet.
Verified with 4.1.25.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:3913