Description of problem:
Installed a v4.0 cluster with the new installer. After the install completed, some basic images on the master/worker nodes were found to be pulled from neither registry.svc.ci.openshift.org nor the quay.io registry. They are listed below:

# crictl images
IMAGE                                          TAG       IMAGE ID        SIZE
docker.io/grafana/grafana                      5.2.4     920eb69ade2a2   250MB
docker.io/openshift/oauth-proxy                v1.1.0    90c45954eb03e   242MB
docker.io/openshift/prometheus-alertmanager    v0.15.2   68bbd00063784   242MB
docker.io/openshift/prometheus-node-exporter   v0.16.0   f9f775bf6d0ef   225MB
docker.io/openshift/prometheus                 v2.5.0    59c3f2318e23a   273MB
k8s.gcr.io/pause                               3.1       da86e6ba6ca19   747kB

Version-Release number of the following components:
# bin/openshift-install version
bin/openshift-install v0.5.0-master-9-g98e2d2b9b015a88d980503a025824bfa2b713906
Terraform v0.11.8

How reproducible:
always

Steps to Reproduce:
1. Set env variables for installation.
2. Run "bin/openshift-install create cluster --dir demo" to deploy OCP on AWS.

Actual results:
See the list above.

Expected results:
As confirmed with dev, all images (excluding image streams and samples) should be prefixed with the nightly registry (registry.svc.ci.openshift.org), and later with quay.io. This bug tracks that issue.

Additional info:
Please attach logs from ansible-playbook with the -vvv flag
These images are not referenced anywhere in the installer. The grafana, oauth-proxy, and prometheus images are the responsibility of the monitoring team. You'll need to assign this bug to them.
OK. Since the images belong to several different components, I am assigning the bug to the monitoring team first.
The new installer uses origin images when installing a v4.0 cluster. The monitoring pods are controlled by the cluster-monitoring-operator pod, which uses the registry.svc.ci.openshift.org prefix; the other images use the docker.io or quay.io prefix.

# oc -n openshift-monitoring get pod cluster-monitoring-operator-67bd456668-6zkjc -oyaml | grep image:
image: registry.svc.ci.openshift.org/openshift/origin-v4.0-2018-12-06-021016@sha256:398213ab383af6c260e36b414e8847e855e690e94a1f913fd16bfd803e80cc1b
docker.io/grafana/grafana:5.2.4
docker.io/openshift/oauth-proxy:v1.1.0
docker.io/openshift/prometheus-alertmanager:v0.15.2
docker.io/openshift/prometheus-node-exporter:v0.16.0
docker.io/openshift/prometheus:v2.5.0
quay.io/coreos/configmap-reload:v0.0.1
quay.io/coreos/kube-rbac-proxy:v0.4.0
quay.io/coreos/kube-state-metrics:v1.4.0
quay.io/coreos/prom-label-proxy:v0.1.0
quay.io/coreos/prometheus-config-reloader:v0.26.0
quay.io/coreos/prometheus-operator:v0.26.0
registry.svc.ci.openshift.org/openshift/origin-v4.0-2018-12-06-021016@sha256:398213ab383af6c260e36b414e8847e855e690e94a1f913fd16bfd803e80cc1b

@Frederic Should the image prefix be registry.svc.ci.openshift.org or quay.io?
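To see at a glance which registries a list of image references comes from, the references can be grouped by registry host. The following is only a sketch: the function name registry_summary is made up for illustration, and it assumes one fully qualified reference per line (host before the first "/"), optionally prefixed with "image:".

```shell
#!/bin/sh
# Hypothetical helper (not part of oc or the installer): count image
# references per registry host. Reads one reference per line on stdin.
registry_summary() {
    # Drop leading whitespace and an optional "image:" label, then take the
    # text before the first "/" as the registry host. Lines without a "/"
    # are skipped because they carry no explicit host.
    sed -e 's/^[[:space:]]*//' -e 's/^image:[[:space:]]*//' \
        | awk -F/ 'NF > 1 { count[$1]++ } END { for (h in count) print count[h], h }' \
        | LC_ALL=C sort -k2
}

# Possible usage, assuming the references have been saved one per line:
#   registry_summary < image-refs.txt
```

Any host other than registry.svc.ci.openshift.org or quay.io in the summary points at a component that still needs to be moved.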
This was fixed about a month ago by https://github.com/openshift/cluster-monitoring-operator/pull/193. Please verify with an up-to-date payload.
# ./openshift-install version
./openshift-install v0.12.0

1. Pull the latest installer binary from https://github.com/openshift/installer/releases.
2. Trigger the installation with the binary.

The installation runs into a dead loop between the resource limit and bz1673185, so I will verify this bug in the next release build, which should fix bz1673185.
Images now use the quay.io registry:

configmap-reloader: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:24eb3125b5fec17e2db68b7fcd406d5aecba67ebe6da18fbd9c2c7e884ce00f8
cluster-monitoring-operator: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:2d0d8d43b79fb970a7a090a759da06aebb1dec7e31fffd2d3ed455f92a998522
prometheus-config-reloader: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:31905d24b331859b99852c6f4ef916539508bfb61f443c94e0f46a83093f7dc0
kube-state-metrics: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:3f0b3aa9c8923c95233f2872a6d4842796ab202a91faa8595518ad6a154f1d87
kube-rbac-proxy: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:451274b24916b97e5ba2116dd0775cdb7e1de98d034ac8874b81c1a3b22cf6b1
k8s-prometheus-adapter: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:580e5a5cd057e2c09ea132fed5c75b59423228587631dcd47f9471b0d1f9a872
prometheus-operator: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:5b4ba55ab5ec5bb1b4c024a7b99bc67fe108a28e564288734f9884bc1055d4ed
prometheus-node-exporter: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:5de207bf1cdbdcbe54fe97684d6b3aaf9d362a46f7d0a7af1e989cdf57b59599
prometheus-alertmanager: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:9d8b88bd937ccf01b9cb2584ceb45b829406ebc3b35201f73eead00605b4fdfc
prometheus: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:b50f38e8f288fdba31527bfcb631d0a15bb2c9409631ef30275f5483946aba6f
telemeter: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c6cbfe8c7034edf8d0df1df4208543fe5f37a8ad306eaf736bcd7c1cbb999ffc
prom-label-proxy: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:efe301356a6f40679e27e6e8287ed6d8316e54410415f4f3744f3182c1d4e07e
grafana: quay.io/openshift/origin-grafana:latest
oauth-proxy: quay.io/openshift/origin-oauth-proxy:latest

oauth-proxy and grafana still use origin images; Bug 1677232 tracks that.

RHCOS build: 47.318

# oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.nightly-2019-02-18-224151   True        False         57m     Cluster version is 4.0.0-0.nightly-2019-02-18-224151
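The component list in this comment can also be checked mechanically for entries that are not served from the release payload repository. This is a rough sketch only: non_payload_components is a made-up name, and it assumes the list is available as "component: image-reference" pairs, one per line. The prefix quay.io/openshift-release-dev/ is taken from the list itself.

```shell
#!/bin/sh
# Hypothetical helper: print the components whose image reference does not
# live under quay.io/openshift-release-dev/.
# Input on stdin: "component: image-reference", one pair per line.
non_payload_components() {
    # Split on ": " so that the ":" inside "@sha256:<digest>" or ":latest"
    # (never followed by a space) stays part of the reference.
    awk -F': ' 'NF >= 2 && $2 !~ /^quay\.io\/openshift-release-dev\// { print $1 }'
}

# Possible usage, assuming the pairs were saved to a file:
#   non_payload_components < monitoring-images.txt
```

For the list above this would be expected to report grafana and oauth-proxy, the two entries that still point at origin images.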
More images still need to be checked, so I am changing the status back.
Version:
# ./openshift-install version
./openshift-install v0.12.0

# oc get clusterversion
NAME      VERSION     AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.3   True        False         1m      Cluster version is 4.0.0-0.3

Checked all basic images on the master/worker nodes after a fresh install; one image is still not pulled from the quay or svc registry.

sudo crictl images|grep -vE 'quay|svc'
IMAGE              TAG   IMAGE ID        SIZE
k8s.gcr.io/pause   3.1   da86e6ba6ca19   747kB
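Note that grep -vE 'quay|svc' matches those strings anywhere on the line, so an image from another registry whose path merely contains "quay" or "svc" would be hidden as well. A stricter variant anchors the match to the registry host at the start of the IMAGE column. This is a sketch under assumptions: the function name unexpected_registry_images is invented here, and the input is assumed to be unfiltered crictl images output with its header row in place.

```shell
#!/bin/sh
# Hypothetical helper: list images whose repository does not start with one
# of the expected registry hosts.
# Input on stdin: "crictl images" output, header row included.
unexpected_registry_images() {
    # NR > 1 skips the header. The regex is anchored at the start of the
    # first column and requires the "/" right after the host, so only the
    # registry host itself is compared.
    awk 'NR > 1 && $1 !~ /^(quay\.io|registry\.svc\.ci\.openshift\.org)\// { print $1 ":" $2 }'
}

# Possible usage on a node:
#   sudo crictl images | unexpected_registry_images
```

Images pulled by digest show "<none>" in the TAG column, so they would be printed as "repository:<none>".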
According to comment9 and comment7, monitoring related images are landed in correct place now. Only pause image is still wrong, so change component to containers team for further fix first. Please feel free to correct it if the component is wrong.

# cat /etc/crio/crio.conf |grep pause_image
pause_image = "k8s.gcr.io/pause:3.1"
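The pause_image check can be scripted so that it reports only the configured value and whether it comes from an expected registry. A minimal sketch, assuming the setting appears as an uncommented pause_image = "..." line as in the output above; the function names pause_image_of and pause_image_ok are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the value of pause_image from a crio.conf-style
# file. $1 is the path to the file.
pause_image_of() {
    # Match only an uncommented "pause_image = ..." line; keys such as
    # pause_command do not match because "=" must follow the key name.
    sed -n 's/^[[:space:]]*pause_image[[:space:]]*=[[:space:]]*"\(.*\)".*$/\1/p' "$1"
}

# Hypothetical helper: succeed only if the pause image comes from quay.io or
# registry.svc.ci.openshift.org.
pause_image_ok() {
    case "$(pause_image_of "$1")" in
        quay.io/*|registry.svc.ci.openshift.org/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Possible usage on a node:
#   pause_image_ok /etc/crio/crio.conf || echo "unexpected: $(pause_image_of /etc/crio/crio.conf)"
```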
(In reply to liujia from comment #10)
> According to comment9 and comment7, monitoring related images are landed in
> correct place now. Only pause image is still wrong, so change component to
> containers team for further fix first. Please feel free to correct it if the
> component is wrong.
>
> # cat /etc/crio/crio.conf |grep pause_image
> pause_image = "k8s.gcr.io/pause:3.1"

For the pause_image part, refer to https://github.com/openshift/machine-config-operator/pull/471
Fixed in https://github.com/openshift/machine-config-operator/pull/471 and https://github.com/openshift/installer/pull/1292
Checked and this issue has been fixed, so move to verified.

# oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.nightly-2019-04-03-202419   True        False         131m    Cluster version is 4.0.0-0.nightly-2019-04-03-202419

# oc get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
ip-10-0-133-19.ap-northeast-1.compute.internal Ready worker 135m v1.12.4+509916ce1 10.0.133.19 <none> Red Hat Enterprise Linux CoreOS 410.8.20190329.0 (Ootpa) 4.18.0-80.el8.x86_64 cri-o://1.12.10-2.rhaos4.0.git2c94bb7.el7
ip-10-0-142-81.ap-northeast-1.compute.internal Ready master 144m v1.12.4+509916ce1 10.0.142.81 <none> Red Hat Enterprise Linux CoreOS 410.8.20190329.0 (Ootpa) 4.18.0-80.el8.x86_64 cri-o://1.12.10-2.rhaos4.0.git2c94bb7.el7
ip-10-0-145-160.ap-northeast-1.compute.internal Ready worker 135m v1.12.4+509916ce1 10.0.145.160 <none> Red Hat Enterprise Linux CoreOS 410.8.20190329.0 (Ootpa) 4.18.0-80.el8.x86_64 cri-o://1.12.10-2.rhaos4.0.git2c94bb7.el7
ip-10-0-155-245.ap-northeast-1.compute.internal Ready master 144m v1.12.4+509916ce1 10.0.155.245 <none> Red Hat Enterprise Linux CoreOS 410.8.20190329.0 (Ootpa) 4.18.0-80.el8.x86_64 cri-o://1.12.10-2.rhaos4.0.git2c94bb7.el7
ip-10-0-164-166.ap-northeast-1.compute.internal Ready master 144m v1.12.4+509916ce1 10.0.164.166 <none> Red Hat Enterprise Linux CoreOS 410.8.20190329.0 (Ootpa) 4.18.0-80.el8.x86_64 cri-o://1.12.10-2.rhaos4.0.git2c94bb7.el7
ip-10-0-167-245.ap-northeast-1.compute.internal Ready worker 136m v1.12.4+509916ce1 10.0.167.245 <none> Red Hat Enterprise Linux CoreOS 410.8.20190329.0 (Ootpa) 4.18.0-80.el8.x86_64 cri-o://1.12.10-2.rhaos4.0.git2c94bb7.el7

# oc get images --no-headers| awk '{print $2}'| awk -F/ '{print $1,$2}'| sort -u
quay.io openshift-release-dev
registry.access.redhat.com jboss-amq-6
registry.redhat.io 3scale-amp21
registry.redhat.io 3scale-amp22
registry.redhat.io 3scale-amp23
registry.redhat.io 3scale-amp24
registry.redhat.io dotnet
registry.redhat.io fuse7
registry.redhat.io jboss-datagrid-6
registry.redhat.io jboss-datagrid-7
registry.redhat.io jboss-datavirt-6
registry.redhat.io jboss-decisionserver-6
registry.redhat.io jboss-eap-6
registry.redhat.io jboss-eap-7
registry.redhat.io jboss-eap-7-tech-preview
registry.redhat.io jboss-fuse-6
registry.redhat.io jboss-processserver-6
registry.redhat.io jboss-webserver-3
registry.redhat.io openshift3
registry.redhat.io redhat-openjdk-18
registry.redhat.io redhat-sso-7
registry.redhat.io rhdm-7
registry.redhat.io rhdm-7-tech-preview
registry.redhat.io rhoar-nodejs
registry.redhat.io rhpam-7
registry.redhat.io rhpam-7-tech-preview
registry.redhat.io rhscl

# for i in `oc get nodes -o wide --no-headers | awk '{print $6}'`; do echo "========================= $i ======================="; ssh -i ~/.ssh/private_key.pem -o StrictHostKeyChecking=no -o ProxyCommand='ssh -i ~/.ssh/private_key.pem -A -o StrictHostKeyChecking=no -o ServerAliveInterval=30 -W %h:%p core@$(oc get service -n openshift-ssh-bastion ssh-bastion -o jsonpath="{.status.loadBalancer.ingress[0].hostname}")' core@$i cat /etc/crio/crio.conf|grep -i "pause_image = "; done
========================= 10.0.133.19 =======================
pause_image = "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0f4767e691bd6b984691dd48a13313c13fece8442d0bd43756f8e9d0145861d4"
Killed by signal 1.
========================= 10.0.142.81 =======================
Killed by signal 1.
pause_image = "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0f4767e691bd6b984691dd48a13313c13fece8442d0bd43756f8e9d0145861d4"
========================= 10.0.145.160 =======================
Killed by signal 1.
pause_image = "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0f4767e691bd6b984691dd48a13313c13fece8442d0bd43756f8e9d0145861d4"
========================= 10.0.155.245 =======================
pause_image = "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0f4767e691bd6b984691dd48a13313c13fece8442d0bd43756f8e9d0145861d4"
Killed by signal 1.
========================= 10.0.164.166 =======================
Killed by signal 1.
pause_image = "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0f4767e691bd6b984691dd48a13313c13fece8442d0bd43756f8e9d0145861d4"
========================= 10.0.167.245 =======================
pause_image = "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0f4767e691bd6b984691dd48a13313c13fece8442d0bd43756f8e9d0145861d4"
Killed by signal 1.
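With six nodes the per-node output is easy to compare by eye, but on larger clusters it may help to collapse it to the distinct pause_image values. A small sketch, assuming the loop output above has been captured to a file or a pipe; distinct_pause_images is an illustrative name, not an existing tool.

```shell
#!/bin/sh
# Hypothetical helper: reduce the per-node loop output to the set of
# distinct pause_image values. Separator lines and ssh noise such as
# "Killed by signal 1." are ignored.
distinct_pause_images() {
    # Keep only the value part of "pause_image = ..." lines, then
    # de-duplicate. The surrounding double quotes are left in place.
    sed -n 's/^[[:space:]]*pause_image[[:space:]]*=[[:space:]]*//p' | LC_ALL=C sort -u
}

# Possible usage, assuming the loop output was saved to a file:
#   distinct_pause_images < pause-image-check.log
```

A single line of output would mean every node reports the same pause image; more than one line would point at nodes that differ.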
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0758