Bug 1848752 - OCP 4.4 disconnected installation with static ip fails to install and tried to pull images from quay.io instead of internal registry
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.4
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.6.0
Assignee: Urvashi Mohnani
QA Contact: Sunil Choudhary
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-06-18 21:25 UTC by Novonil Choudhuri
Modified: 2023-10-06 20:42 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-08-07 16:46:38 UTC
Target Upstream Version:
Embargoed:



Description Novonil Choudhuri 2020-06-18 21:25:13 UTC
Description of problem: 

The customer is attempting a disconnected install of OCP 4.4 on VMware with static IPs and is getting the following install errors. Why is the installer referring to quay.io when it is supposed to pull images from the internal mirror registry?

Error: error getting image "quay.io/openshift-release-dev/ocp-release@sha256:039a4ef7c128a049ccf916a1d68ce93e8f5494b44d5a75df60c85e9e7191dacc": unable to find 'quay.io/openshift-release-dev/ocp-release@sha256:039a4ef7c128a049ccf916a1d68ce93e8f5494b44d5a75df60c85e9e7191dacc' in local storage: no such image
Warning: Could not resolve release image to pull by digest

The customer (CU) confirmed that he can reach the mirror registry from the bootstrap node:
$ curl -u ocpuser:<redacted> -k https://ptl01ocpjmphst101:5000/v2/_catalog
{"repositories":["ocp4/openshift4"]}
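
Note that `curl -u` builds the Basic authorization header itself from the user and password, so a successful catalog query shows that the registry is reachable and the credentials are valid, but it does not exercise the base64 `auth` string stored in the pullSecret. A minimal sketch of a check that sends the stored string without re-encoding it follows. `basic_auth_header` is a hypothetical helper, the `AUTH` value is a placeholder, and treating this as an approximation of what the container tooling presents to the registry is an assumption, not a description of CRI-O internals.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a Basic authorization header from the stored
# "auth" value without re-encoding it, so a malformed value is not silently
# corrected the way "curl -u user:password" would correct it.
basic_auth_header() {
  printf 'Authorization: Basic %s' "$1"
}

# Placeholder: paste the "auth" value from the pullSecret here.
AUTH='dXNlcmlkOnBhc3N3b3Jk'

basic_auth_header "$AUTH"; echo

# On the bootstrap host (needs network access to the mirror, so left commented):
# curl -k -H "$(basic_auth_header "$AUTH")" https://ptl01ocpjmphst101:5000/v2/_catalog
```

If the commented `curl` call is rejected while the `-u` form succeeds, the stored `auth` value itself is the likely suspect.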

Version-Release number of the following components:
OCP 4.4 GA

How reproducible: Following https://docs.openshift.com/container-platform/4.4/installing/installing_vsphere/installing-restricted-networks-vsphere.html#installation-vsphere-machines_installing-restricted-networks-vsphere

Install-config.yaml file :

apiVersion: v1
baseDomain: dc.containers.xxxx.com
compute:
- hyperthreading: Enabled   
  name: worker
  replicas: 0 
controlPlane:
  hyperthreading: Enabled   
  name: master
  replicas: 3 
metadata:
  name: test2
platform:
  vsphere:
    vcenter: pda01vcntr002.na.xxxx.com
    username: na\bby-s-dotcominfraops
    password: <redacted>
    datacenter: 'Bloomington DC - NonProduction'
    defaultDatastore: BN01C-NTNX-LCP-LUN01
fips: false 
pullSecret: '{"auths":{"ptl01ocpjmphst101.na.xxxx.com:5000": {"auth": "<redacted>","email": "dotcomcontainerops"}}}' 
sshKey: 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDyWgtLIAzTnn1vy21XukG1vZLTv6M+iNCF5+JYsks1WjXlfenD3sV011vMl2JxAvJlMxCoC80hzkFNWR+OCRpEoNaZfOtWRImKIHRqNRQQbhOaxYkXlIyX6Cbfbd4lC+F8mP6hVvL4akN8OAh/3ri8/GjjheEFuG68l/p9clTr2Jy13h7B3BY/o3iyQ0BHm9yzCrT90XtcdZaQulFRKLjrJuIGBKbmJILpikApveRfoNu69kWCFt82U2z0kJtaaMQC9kvRJi/vwwbFAj4mLrqmxHz0PxA8+pDVbXdvaxBtt4KVMgzu4i2nwj1ZH2Q6VtHMiACcaDSvOmVM6HghW6iksp2zvgkhnLjctV224MEsFDh/ywr7byF1Afu6YS7OD2gllH/DylpGJiDP25sN7bEjKVMB1RQ9j/2Ue/AKBj5JJB8yhAtSSnPbNmTyFDRTdG7wqyl+saC8ir+PG6p88mi/HbtNWd1erzgoq7zlIczwVfH1rChH4xMH2N8tfVf0W0dqzcyQAoFt05T9o4ZwxOJ5hrXmsGCYMH1u4IrWNlw28cEKSnoLmw5APSVG6i41/M0Un8YFJa6plaWApOQoqwyew7wKL/nmcnq5/1a9OQeeYj8wHmChAunXLnQ8bSS37LfhgHsj4CZh2IRQpz0vV5iHAYtOTsDJPTcclE8dO8b4Pw== u1202294.xxxxx.com'

additionalTrustBundle: | 


imageContentSources:
- mirrors:
  - ptl01ocpjmphst101.na.xxxx.com:5000/ocp4/openshift4
  source: quay.io/openshift-release-dev/ocp-release
- mirrors:
  - ptl01ocpjmphst101.na.xxxx.com:5000/ocp4/openshift4
  source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
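
These `imageContentSources` entries are expected to be rendered into the container registries configuration on the bootstrap and cluster nodes, with the mirror tried before the `source`. A minimal sketch for confirming on the host that the mirror actually made it into that file follows. `has_mirror` is a hypothetical helper, and the default path `/etc/containers/registries.conf` is an assumption to verify on the host.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the given mirror location is mentioned in a
# registries.conf-style file. The default path is an assumption.
has_mirror() {
  local mirror="$1"
  local conf="${2:-/etc/containers/registries.conf}"
  grep -qF -- "$mirror" "$conf" 2>/dev/null
}

# Example, to be run on the bootstrap host:
if has_mirror 'ptl01ocpjmphst101.na.xxxx.com:5000/ocp4/openshift4'; then
  echo "mirror is configured"
else
  echo "mirror not found in registries.conf"
fi
```

A missing entry would point at the install-config or the generated ignition files rather than at the registry or its credentials.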


Actual results: OCP install trying to pull images from quay.io instead of pulling from internal registry


Expected results: OCP install should refer to internal registry instead of going to the internet.

Additional info:
Please attach logs from ansible-playbook with the -vvv flag

Comment 3 Abhinav Dahiya 2020-06-19 14:15:22 UTC
The pull from the external registry is done by crio, so moving to the container team to triage or help fix the error.

Comment 5 pfruth 2020-07-06 23:09:51 UTC
I found this BZ because I was having a very similar problem during an airgap install of OCP 4.4.10 on vSphere UPI.
I am using an install-config.yaml structured very much like the one depicted in the opening post of this BZ.
I thought I'd provide some feedback here for a couple of reasons:
1) As a testimonial that a disconnected install (aka airgap) does in fact work with the OCP 4.4.10 level of code
2) As a suggestion to check your pullSecret if you see the error messages (as seen above, and as shown in the example below)

The TL;DR is, in my case, the problem was ultimately due to an error in the way I created the credentials used in the pullSecret.

The longer story

During startup of the bootstrap server, I was seeing these messages being repeatedly issued in the output of 'journalctl -b -f -u bootkube.service'

...
....
Jul 05 23:55:57 bootstrap bootkube.sh[3467]: Starting etcd certificate signer...
Jul 05 23:55:58 bootstrap bootkube.sh[3467]: Error: unable to pull quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e9c82181e91ed172383760a970053304ba850608cfcaa6e4749661811132e454: unable to pull image: Error initializing source docker://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e9c82181e91ed172383760a970053304ba850608cfcaa6e4749661811132e454: (Mirrors also failed: [installer.internal.net:5000/x86_64/4.4.10@sha256:e9c82181e91ed172383760a970053304ba850608cfcaa6e4749661811132e454: Error reading manifest sha256:e9c82181e91ed172383760a970053304ba850608cfcaa6e4749661811132e454 in installer.internal.net:5000/x86_64/4.4.10: unauthorized: authentication required]): quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e9c82181e91ed172383760a970053304ba850608cfcaa6e4749661811132e454: Error reading manifest sha256:e9c82181e91ed172383760a970053304ba850608cfcaa6e4749661811132e454 in quay.io/openshift-release-dev/ocp-v4.0-art-dev: unauthorized: access to the requested resource is not authorized
Jul 05 23:55:58 bootstrap bootkube.sh[3467]: Error: Failed to evict container: "": Failed to find container "etcd-signer" in state: no container with name or ID etcd-signer found: no such container
Jul 05 23:55:58 bootstrap systemd[1]: bootkube.service: Main process exited, code=exited, status=1/FAILURE
Jul 05 23:55:58 bootstrap systemd[1]: bootkube.service: Failed with result 'exit-code'.
Jul 05 23:56:03 bootstrap systemd[1]: bootkube.service: Service RestartSec=5s expired, scheduling restart.
Jul 05 23:56:03 bootstrap systemd[1]: bootkube.service: Scheduled restart job, restart counter is at 2.
Jul 05 23:56:03 bootstrap systemd[1]: Stopped Bootstrap a Kubernetes cluster.
Jul 05 23:56:03 bootstrap systemd[1]: Started Bootstrap a Kubernetes cluster.
....
...
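
The long error line above embeds the mirror's own failure inside the `Mirrors also failed: [...]` clause, which is easy to miss next to the quay.io message that follows it. A small sketch for pulling just that clause out of the journal is shown here. `mirror_errors` is a hypothetical helper name, and the sample line is abbreviated from the log above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read log text on stdin and print only the
# "Mirrors also failed: [...]" clauses, one per line, de-duplicated.
mirror_errors() {
  grep -o 'Mirrors also failed: \[[^]]*]' | sort -u
}

# Abbreviated sample taken from the bootkube log above.
sample='Error: unable to pull quay.io/openshift-release-dev/ocp-v4.0-art-dev: (Mirrors also failed: [installer.internal.net:5000/x86_64/4.4.10: unauthorized: authentication required]): unauthorized'
printf '%s\n' "$sample" | mirror_errors

# On the bootstrap host:
# journalctl -b -u bootkube.service | mirror_errors
```

An `unauthorized: authentication required` reason from the mirror, as in this log, points at credentials rather than at connectivity or a missing image.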


The fundamental cause of my problem was the way I generated the base64-encoded userid:password portion of the pullSecret.
I incorrectly used:
echo "userid:password" | base64 -w0

when I should have used:
echo -n "userid:password" | base64 -w0

Note that the "-n" option must be used so that echo does not append a newline character, which would otherwise be included in the base64 encoding.

After I created a pullSecret with the correctly generated credential, I was able to perform an airgap install successfully, using a private image registry, on OCP 4.4.10.
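
The difference between the two commands can be made visible with a short sketch. It uses the same placeholder `userid:password` string, and `printf '%s'` as a portable equivalent of `echo -n`. The `-w0` flag is omitted only because the sample is too short to be line-wrapped.

```shell
#!/usr/bin/env bash
# Placeholder credentials, as in the comment above; substitute real ones.
creds='userid:password'

# Wrong: echo appends a newline, which becomes part of the encoded value.
bad=$(echo "$creds" | base64)

# Right: printf '%s' (like echo -n) emits the string without a trailing newline.
good=$(printf '%s' "$creds" | base64)

echo "with newline:    $bad"
echo "without newline: $good"

# Decoding and counting bytes exposes the extra character (16 vs. 15 here).
bad_len=$(printf '%s' "$bad" | base64 -d | wc -c)
good_len=$(printf '%s' "$good" | base64 -d | wc -c)
echo "decoded lengths: $bad_len vs $good_len"
```

The same decode-and-count step can be applied to the `auth` value of an existing pullSecret to see whether it carries a stray trailing newline.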

