Description of problem:
When running etcd-member-recover.sh, the script fails with the following message:

[core@master2 ~]$ sudo -E /usr/local/bin/etcd-member-recover.sh 192.168.50.61 etcd-member-master2
Trying to pull quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:978e7aaf2d1b14ac9335044576dfc3f9621ffa02cfbaf6e8a72b5155be975b49...
unauthorized: access to the requested resource is not authorized
Error: unable to pull quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:978e7aaf2d1b14ac9335044576dfc3f9621ffa02cfbaf6e8a72b5155be975b49: unable to pull image: Error initializing source docker://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:978e7aaf2d1b14ac9335044576dfc3f9621ffa02cfbaf6e8a72b5155be975b49: Error reading manifest sha256:978e7aaf2d1b14ac9335044576dfc3f9621ffa02cfbaf6e8a72b5155be975b49 in quay.io/openshift-release-dev/ocp-v4.0-art-dev: unauthorized: access to the requested resource is not authorized
Error: unable to find container : name or ID cannot be empty
cp: cannot stat '/bin/etcdctl': No such file or directory

Version-Release number of selected component (if applicable):
4.3

How reproducible:
Every time

Steps to Reproduce:
1. Deploy a cluster with 3 masters.
2. Back up etcd, then remove master2/3 from the cluster.
3. Follow the instructions at https://docs.openshift.com/container-platform/4.3/backup_and_restore/disaster_recovery/scenario-1-infra-recovery.html to restore etcd.
4. Redeploy master2/3.
5. Follow step 5 to "Grow etcd to full membership".
6. Run sudo -E /usr/local/bin/etcd-member-recover.sh 192.168.50.61 etcd-member-master2

Actual results:
[core@master2 ~]$ sudo -E /usr/local/bin/etcd-member-recover.sh 192.168.50.61 etcd-member-master2
Trying to pull quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:978e7aaf2d1b14ac9335044576dfc3f9621ffa02cfbaf6e8a72b5155be975b49...
unauthorized: access to the requested resource is not authorized

Expected results:
podman pulls the image and creates the container.
Additional info:
etcd-member-recover.sh sources /usr/local/bin/openshift-recovery-tools, which contains the following function:

dl_etcdctl() {
  local etcdimg="quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:978e7aaf2d1b14ac9335044576dfc3f9621ffa02cfbaf6e8a72b5155be975b49"
  local etcdctr=$(podman create "${etcdimg}")
  local etcdmnt=$(podman mount "${etcdctr}")
  cp ${etcdmnt}/bin/etcdctl $ASSET_DIR/bin
  umount "${etcdmnt}"
  podman rm "${etcdctr}"
  $ASSET_DIR/bin/etcdctl version
}

The podman create call should reference /var/lib/kubelet/config.json as the authfile. A workaround for this issue is to manually pull the image before running the etcd-member-recover.sh script:

[core@master2 ~]$ sudo podman pull quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:978e7aaf2d1b14ac9335044576dfc3f9621ffa02cfbaf6e8a72b5155be975b49 --authfile=/var/lib/kubelet/config.json
Trying to pull quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:978e7aaf2d1b14ac9335044576dfc3f9621ffa02cfbaf6e8a72b5155be975b49...
...
Writing manifest to image destination
Storing signatures
162799682a6859c4365f8c2e21682457f9d98eaf06d4ea496767e0ef2add55a7
[core@master2 ~]$ sudo -E /usr/local/bin/etcd-member-recover.sh 192.168.50.61 etcd-member-master2
68720b4dbf4d0ecabad7e4bd5976ed04b3e949d0c7e8a5a4e4483b53f5b950ad
etcdctl version: 3.3.17
API version: 3.3
...
Member 2c2b7a2883a7b796 added to cluster 6d3f57bade6e16da
ETCD_NAME="etcd-member-master2"
ETCD_INITIAL_CLUSTER="etcd-member-master2=https://etcd-1.openshift.lab.int:2380,etcd-member-master1=https://etcd-0.openshift.lab.int:2380"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://etcd-1.openshift.lab.int:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"
Starting etcd..
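A minimal sketch of the suggested fix, not the shipped patch: pass the node's pull secret to podman create via --authfile, which both podman pull and podman create accept. The quoting of $ASSET_DIR is also tightened; otherwise the function is as quoted above.

```shell
# Sketch of a fixed dl_etcdctl (assumption: the node's pull secret lives at
# /var/lib/kubelet/config.json, as stated in the report above).
dl_etcdctl() {
  local etcdimg="quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:978e7aaf2d1b14ac9335044576dfc3f9621ffa02cfbaf6e8a72b5155be975b49"
  local authfile="/var/lib/kubelet/config.json"
  # Authenticate against the registry so the implicit pull succeeds.
  local etcdctr=$(podman create --authfile "${authfile}" "${etcdimg}")
  local etcdmnt=$(podman mount "${etcdctr}")
  cp "${etcdmnt}/bin/etcdctl" "$ASSET_DIR/bin"
  umount "${etcdmnt}"
  podman rm "${etcdctr}"
  "$ASSET_DIR/bin/etcdctl" version
}
```

Running this requires a node with podman, root privileges, and a valid pull secret; it is not meant to be executed outside a cluster host.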
*** Bug 1824094 has been marked as a duplicate of this bug. ***
Verified with 4.5.0-0.nightly-2020-05-04-113741: the file etcd-common-tools now includes the correct authfile path, and the DR scripts work as expected.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2409