Bug 1825236

Summary: [DR] etcd-member-recover.sh fails to pull image with unauthorized: access to the requested resource is not authorized
Product: OpenShift Container Platform
Reporter: Sam Batschelet <sbatsche>
Component: Etcd
Assignee: Sam Batschelet <sbatsche>
Status: CLOSED ERRATA
QA Contact: ge liu <geliu>
Severity: high
Docs Contact:
Priority: high
Version: 4.3.0
CC: geliu, kgarriso, mfojtik, morgan.peterman, sttts
Target Milestone: ---
Target Release: 4.4.z
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1823931
Environment:
Last Closed: 2020-05-26 16:50:32 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Bug Depends On: 1823931
Bug Blocks: 1825221

Description Sam Batschelet 2020-04-17 13:07:41 UTC
+++ This bug was initially created as a clone of Bug #1823931 +++

Description of problem: When running etcd-member-recover.sh the script fails with the following message:

[core@master2 ~]$ sudo -E /usr/local/bin/etcd-member-recover.sh 192.168.50.61 etcd-member-master2
Trying to pull quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:978e7aaf2d1b14ac9335044576dfc3f9621ffa02cfbaf6e8a72b5155be975b49...
  unauthorized: access to the requested resource is not authorized
Error: unable to pull quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:978e7aaf2d1b14ac9335044576dfc3f9621ffa02cfbaf6e8a72b5155be975b49: unable to pull image: Error initializing source docker://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:978e7aaf2d1b14ac9335044576dfc3f9621ffa02cfbaf6e8a72b5155be975b49: Error reading manifest sha256:978e7aaf2d1b14ac9335044576dfc3f9621ffa02cfbaf6e8a72b5155be975b49 in quay.io/openshift-release-dev/ocp-v4.0-art-dev: unauthorized: access to the requested resource is not authorized
Error: unable to find container : name or ID cannot be empty
cp: cannot stat '/bin/etcdctl': No such file or directory

Version-Release number of selected component (if applicable): 4.3

How reproducible: Every time

Steps to Reproduce:
1. Deploy cluster with 3 masters
2. Backup etcd, remove master2/3 from cluster
3. Follow https://docs.openshift.com/container-platform/4.3/backup_and_restore/disaster_recovery/scenario-1-infra-recovery.html instructions to restore etcd
4. Redeploy master2/3
5. Follow step 5 to 'Grow etcd to full membership'
6. Run sudo -E /usr/local/bin/etcd-member-recover.sh 192.168.50.61 etcd-member-master2

Actual results:

[core@master2 ~]$ sudo -E /usr/local/bin/etcd-member-recover.sh 192.168.50.61 etcd-member-master2
Trying to pull quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:978e7aaf2d1b14ac9335044576dfc3f9621ffa02cfbaf6e8a72b5155be975b49...
  unauthorized: access to the requested resource is not authorized

Expected results:

podman to pull the image and create the container.

Additional info:

etcd-member-recover.sh sources /usr/local/bin/openshift-recovery-tools, which contains the following function:

dl_etcdctl() {
  local etcdimg="quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:978e7aaf2d1b14ac9335044576dfc3f9621ffa02cfbaf6e8a72b5155be975b49"
  local etcdctr=$(podman create "${etcdimg}")
  local etcdmnt=$(podman mount "${etcdctr}")
  cp ${etcdmnt}/bin/etcdctl $ASSET_DIR/bin
  umount "${etcdmnt}"
  podman rm "${etcdctr}"
  $ASSET_DIR/bin/etcdctl version
}

podman create should reference /var/lib/kubelet/config.json as the authfile.
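
The authfile suggestion could be applied to dl_etcdctl roughly as follows. This is a minimal sketch, not the patch that shipped in the erratum: AUTHFILE is a hypothetical variable introduced here for illustration, defaulting to the kubelet pull secret at /var/lib/kubelet/config.json, and ASSET_DIR is assumed to be set by the recovery tooling as in the original function.

```shell
# Hypothetical override; defaults to the kubelet pull secret on the node.
AUTHFILE="${AUTHFILE:-/var/lib/kubelet/config.json}"

dl_etcdctl() {
  local etcdimg="quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:978e7aaf2d1b14ac9335044576dfc3f9621ffa02cfbaf6e8a72b5155be975b49"
  local etcdctr etcdmnt
  # Declare and assign separately so a failed pull is not masked by `local`.
  etcdctr=$(podman create --authfile="${AUTHFILE}" "${etcdimg}") || return 1
  etcdmnt=$(podman mount "${etcdctr}") || return 1
  # Copy etcdctl out of the mounted container filesystem, then clean up.
  cp "${etcdmnt}/bin/etcdctl" "${ASSET_DIR}/bin"
  umount "${etcdmnt}"
  podman rm "${etcdctr}"
  "${ASSET_DIR}/bin/etcdctl" version
}
```

With the pull secret passed explicitly, a separate manual podman pull should not be needed before running the script. Returning early on a failed create also avoids the follow-on "name or ID cannot be empty" and "cannot stat" errors seen in the output above.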

A workaround is to manually pull the image before running the etcd-member-recover.sh script:

[core@master2 ~]$ sudo podman pull quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:978e7aaf2d1b14ac9335044576dfc3f9621ffa02cfbaf6e8a72b5155be975b49 --authfile=/var/lib/kubelet/config.json
Trying to pull quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:978e7aaf2d1b14ac9335044576dfc3f9621ffa02cfbaf6e8a72b5155be975b49...
...
Writing manifest to image destination
Storing signatures
162799682a6859c4365f8c2e21682457f9d98eaf06d4ea496767e0ef2add55a7
[core@master2 ~]$ sudo -E /usr/local/bin/etcd-member-recover.sh 192.168.50.61 etcd-member-master2
68720b4dbf4d0ecabad7e4bd5976ed04b3e949d0c7e8a5a4e4483b53f5b950ad
etcdctl version: 3.3.17
API version: 3.3
...
Member 2c2b7a2883a7b796 added to cluster 6d3f57bade6e16da

ETCD_NAME="etcd-member-master2"
ETCD_INITIAL_CLUSTER="etcd-member-master2=https://etcd-1.openshift.lab.int:2380,etcd-member-master1=https://etcd-0.openshift.lab.int:2380"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://etcd-1.openshift.lab.int:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"
Starting etcd..

--- Additional comment from Michal Fojtik on 2020-04-17 11:15:13 UTC ---

Comment 1 Kirsten Garrison 2020-04-23 18:15:57 UTC
This bug seems to be an exact dupe of: https://bugzilla.redhat.com/show_bug.cgi?id=1823931 even down to the target version?

Comment 6 errata-xmlrpc 2020-05-26 16:50:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2180