Verified in OCP 4.5.0-0.nightly-2020-04-24-091134 on top of RHOS-16.1-RHEL-8-20210311.n.1 with OVN-Octavia.

New master is successfully created with different port name:

$ openstack port list -c Name -f value | grep master
ostest-pz7zk-master-port-1
ostest-pz7zk-master-3
ostest-pz7zk-master-port-0

Procedure: Replacing an unhealthy etcd member whose machine is not running or whose node is not ready:

1. Remove master-2

$ oc -n openshift-machine-api get machines
NAME                        PHASE     TYPE        REGION      ZONE   AGE
ostest-pz7zk-master-0       Running   m4.xlarge   regionOne   nova   33m
ostest-pz7zk-master-1       Running   m4.xlarge   regionOne   nova   33m
ostest-pz7zk-master-2       Running   m4.xlarge   regionOne   nova   33m
ostest-pz7zk-worker-bmpb2   Running   m4.xlarge   regionOne   nova   18m
ostest-pz7zk-worker-swgx5   Running   m4.xlarge   regionOne   nova   18m
ostest-pz7zk-worker-vfbqn   Running   m4.xlarge   regionOne   nova   18m

$ oc -n openshift-machine-api delete machine ostest-pz7zk-master-2

2. Remove etcd member (ostest-pz7zk-master-2):

$ oc rsh -n openshift-etcd etcd-ostest-pz7zk-master-0
Defaulting container name to etcdctl.
Use 'oc describe pod/etcd-ostest-pz7zk-master-0 -n openshift-etcd' to see all of the containers in this pod.
sh-4.4# etcdctl member list -w table
+------------------+---------+-----------------------+---------------------------+---------------------------+------------+
|        ID        | STATUS  |         NAME          |        PEER ADDRS         |       CLIENT ADDRS        | IS LEARNER |
+------------------+---------+-----------------------+---------------------------+---------------------------+------------+
| 5c13780e866b473b | started | ostest-pz7zk-master-1 |   https://10.196.1.4:2380 |   https://10.196.1.4:2379 |      false |
| 794465c1dc67a32b | started | ostest-pz7zk-master-0 | https://10.196.1.227:2380 | https://10.196.1.227:2379 |      false |
| 8f899f0814a46849 | started | ostest-pz7zk-master-2 | https://10.196.2.247:2380 | https://10.196.2.247:2379 |      false |
+------------------+---------+-----------------------+---------------------------+---------------------------+------------+
sh-4.4# etcdctl member remove 8f899f0814a46849
sh-4.4# exit

3. Remove secrets from master:

$ oc -n openshift-etcd get secrets | grep ostest-pz7zk-master-2
etcd-peer-ostest-pz7zk-master-2              kubernetes.io/tls   2   31m
etcd-serving-metrics-ostest-pz7zk-master-2   kubernetes.io/tls   2   31m
etcd-serving-ostest-pz7zk-master-2           kubernetes.io/tls   2   31m

$ oc -n openshift-etcd delete secret etcd-peer-ostest-pz7zk-master-2 etcd-serving-metrics-ostest-pz7zk-master-2 etcd-serving-ostest-pz7zk-master-2

4. Create new master machine:

$ oc get machine ostest-pz7zk-master-0 -n openshift-machine-api -o yaml > new-master-machine.yaml

Edit the new-master-machine.yaml: Remove the entire status section and the annotations and change the name field to a new name (ostest-pz7zk-master-3).

$ oc apply -f new-master-machine.yaml
machine.machine.openshift.io/ostest-pz7zk-master-3 created

$ oc get machines -A
NAMESPACE               NAME                        PHASE     TYPE        REGION      ZONE   AGE
openshift-machine-api   ostest-pz7zk-master-0       Running   m4.xlarge   regionOne   nova   158m
openshift-machine-api   ostest-pz7zk-master-1       Running   m4.xlarge   regionOne   nova   158m
openshift-machine-api   ostest-pz7zk-master-3       Running   m4.xlarge   regionOne   nova   13m
openshift-machine-api   ostest-pz7zk-worker-bmpb2   Running   m4.xlarge   regionOne   nova   143m
openshift-machine-api   ostest-pz7zk-worker-swgx5   Running   m4.xlarge   regionOne   nova   143m
openshift-machine-api   ostest-pz7zk-worker-vfbqn   Running   m4.xlarge   regionOne   nova   143m

$ openstack port list | grep master
| a9d7e427-6c05-45cd-b7bf-b619afa04611 | ostest-pz7zk-master-port-1 | fa:16:3e:7d:5d:a9 | ip_address='10.196.1.4', subnet_id='0bc1dbe7-88d2-497d-8d9f-e4d55417c349'   | ACTIVE |
| af2ca250-0bd7-4c16-bc6c-72d5f5710f37 | ostest-pz7zk-master-3      | fa:16:3e:42:32:02 | ip_address='10.196.1.208', subnet_id='0bc1dbe7-88d2-497d-8d9f-e4d55417c349' | ACTIVE |
| e90c91fb-32d4-49ad-a260-b5029baa342a | ostest-pz7zk-master-port-0 | fa:16:3e:f5:6d:ee | ip_address='10.196.1.227', subnet_id='0bc1dbe7-88d2-497d-8d9f-e4d55417c349' | ACTIVE |

(shiftstack) [stack@undercloud-0 ~]$ oc get nodes
NAME                        STATUS   ROLES    AGE     VERSION
ostest-pz7zk-master-0       Ready    master   157m    v1.18.3+cdb0358
ostest-pz7zk-master-1       Ready    master   157m    v1.18.3+cdb0358
ostest-pz7zk-master-3       Ready    master   5m22s   v1.18.3+cdb0358
ostest-pz7zk-worker-bmpb2   Ready    worker   134m    v1.18.3+cdb0358
ostest-pz7zk-worker-swgx5   Ready    worker   134m    v1.18.3+cdb0358
ostest-pz7zk-worker-vfbqn   Ready    worker   135m    v1.18.3+cdb0358

$ oc rsh -n openshift-etcd etcd-ostest-pz7zk-master-0
sh-4.4# etcdctl member list -w table
+------------------+---------+-----------------------+---------------------------+---------------------------+------------+
|        ID        | STATUS  |         NAME          |        PEER ADDRS         |       CLIENT ADDRS        | IS LEARNER |
+------------------+---------+-----------------------+---------------------------+---------------------------+------------+
| 463e8f41186ae6db | started | ostest-pz7zk-master-3 | https://10.196.1.208:2380 | https://10.196.1.208:2379 |      false |
| 5c13780e866b473b | started | ostest-pz7zk-master-1 |   https://10.196.1.4:2380 |   https://10.196.1.4:2379 |      false |
| 794465c1dc67a32b | started | ostest-pz7zk-master-0 | https://10.196.1.227:2380 | https://10.196.1.227:2379 |      false |
+------------------+---------+-----------------------+---------------------------+---------------------------+------------+
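
Steps 2 and 3 of the procedure above involve reading a member ID out of the etcdctl table by eye and typing three secret names by hand. The following is a minimal sketch of how those two lookups might be scripted. The helper names etcd_member_id and etcd_secret_names are hypothetical (they are not part of oc or etcdctl), and the sketch assumes the table layout and the etcd-peer / etcd-serving-metrics / etcd-serving secret naming shown in the output above.

```shell
#!/bin/sh
# Hypothetical helpers for steps 2 and 3; not part of oc or etcdctl.

# Print the ID of the etcd member whose NAME column equals $1.
# Reads the output of `etcdctl member list -w table` on stdin.
# Assumes the column order shown above: ID | STATUS | NAME | ...
etcd_member_id() {
    awk -F'|' -v name="$1" '
        NF >= 4 {
            id = $2
            member = $4
            # Trim the padding that the table writer adds around each cell.
            gsub(/^[ \t]+|[ \t]+$/, "", id)
            gsub(/^[ \t]+|[ \t]+$/, "", member)
            if (member == name) print id
        }'
}

# Print the names of the three per-node etcd secrets for node $1,
# following the naming seen in `oc -n openshift-etcd get secrets` above.
etcd_secret_names() {
    for prefix in etcd-peer etcd-serving-metrics etcd-serving; do
        printf '%s-%s\n' "$prefix" "$1"
    done
}

# Possible usage against a live cluster (assumes `oc` is already logged in):
#   oc rsh -n openshift-etcd etcd-ostest-pz7zk-master-0 etcdctl member list -w table \
#       | etcd_member_id ostest-pz7zk-master-2
#   oc -n openshift-etcd delete secret $(etcd_secret_names ostest-pz7zk-master-2)
```

The helpers only read text and print names, so the destructive commands (etcdctl member remove, oc delete secret) remain explicit and can be reviewed before they are run.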
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.5.37 bug fix update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:1015